I fixed the problem. See the attached patch.
For the case of writing 3 bytes to memory, first I tried using a
simple 3-byte array, but that didn't work quite right. Then I wrapped
the 3-byte array in a struct and added some operator overloads to it
so that things like assignment to 0 will work as expected.
Unfortunately some of the old format definitions (like StoreExecute)
in mem.isa were unsuitable for the unaligned stores because they
assumed a fixed size; the type to be written was explicitly cast in
the call to xc->write(). With the unaligned accesses, the number of
bytes to write will vary at runtime and depend on the byte offset
within the memory word. Thus, I created some new format definitions
(like StoreUnalignedExecute) that switch on the size of the write to
make the appropriate call to xc->write().
All the write() functions were templated to accept writes of different
sizes, but none of them had declarations to handle 3-byte writes (and,
as you probably know, the C++ templating system requires this if a
templated function is used in a different file that is compiled
separately), so I ended up having to add new declarations to write()
everywhere to handle the 3-byte writes.
I've tested the changes and they appear to work. But I've only tested
on little-endian host machines (x86 and x86_64) since I don't have a
big-endian machine, and only tested MIPS in its default little-endian
mode since I don't have a cross-compiler to produce MIPS big-endian
binaries. While I tried to code carefully to handle these other
endian configurations, I cannot guarantee that they will work without
some slight modifications.
Anyway, thanks for all your help and advice.
On Tue, Jan 19, 2010 at 7:03 PM, Korey Sewell <[email protected]> wrote:
> Good points about potential padding.
>
> The structure could be just used to obtain the width, where as the
> assignment of data could be just be in a void* or whatever arbitrary type as
> I'm guessing that once the request is made, it will simply read/write as
> many bytes as the width says it should.
>
> Matt would you be willing to give one of these a try in the interim?
>
> On Tue, Jan 19, 2010 at 9:48 PM, Gabriel Michael Black
> <[email protected]> wrote:
>>
>> A structure might not work because of padding (I'm just guessing
>> though), but what about uint8_t [3]? That might get wierd if something
>> tries to dereference a pointer to that type and do a direct
>> assignment, something like:
>>
>> template <class T>
>> bar(T *foo)
>> {
>> ...
>> T *bar = (T *)baz;
>> *bar = *foo;
>> }
>>
>> But that might work too. I'm not sure.
>>
>> Gabe
>>
>> Quoting Korey Sewell <[email protected]>:
>>
>> > Hi Matt/Steve,
>> > sorry I'm late to the conversation.
>> >
>> > In short, the reason it's implemented currently as a read-modify-write
>> > was
>> > because what Matt found out to be true: the 3-byte store problem. At the
>> > time of implementation, M5 was at the crossroads of another memory
>> > system
>> > change (remember mergedmem anyone?) and there wasnt any really good,
>> > unobtrusive solution to integrate this in.
>> >
>> > Like Steve mentioned, the ISA/CPU model "dance" was just not working out
>> > at
>> > the time but getting them to work in the AtomicSimpleCPU was at least a
>> > good
>> > functionality start. I had a "readFunctional" there which was able to
>> > read
>> > from memory without adversely affecting the LSQ and other dependent
>> > structures. Reading the data out first before storing would allow you to
>> > get
>> > the right data and *hopefully not upset the timing too much depending on
>> > what you're doing. Performing a regular read there will definitely not
>> > work
>> > there though as Matt found out.
>> >
>> > I believe the only addition that we previously made was to add a
>> > "unaligned"
>> > flag or something of that effect to the request structure so that we
>> > wouldnt
>> > fault when the swl/swr pairs were used in MIPS.
>> >
>> > I dont know why this never came up then (or maybe it did), but a
>> > proposed
>> > solution might be to create a new struct that is exactly 3 bytes long
>> > (say
>> > an uint16_t and a uint8_t), then when a 3-byte write is necessary, we
>> > can
>> > use that structure to generate the right width for the request.
>> >
>> > Thoughts?
>> >
>> > On Tue, Jan 19, 2010 at 8:31 PM, Steve Reinhardt <[email protected]>
>> > wrote:
>> >
>> >> On Tue, Jan 19, 2010 at 5:13 PM, Matt DeVuyst <[email protected]>
>> >> wrote:
>> >> > I don't see how 3 byte stores will work. The write() functions don't
>> >> > take a size argument. They just template the data argument (and can
>> >> > figure out the size by with sizeof(T)). So, for example, if you want
>> >> > to write 2 bytes to memory you just call write((uint16_t)data,
>> >> > address, flags, NULL). But I'm not aware of any data type that's 3
>> >> > bytes wide. Is changing all the write() functions to accept a size
>> >> > argument the best thing to do here?
>> >>
>> >> Good point... I hadn't thought that through. I think the memory
>> >> system at the lower level will handle this (i.e., I believe you can
>> >> put together a Request object that has size=3 and send that to a
>> >> memory object and it will work; it's really just the interface between
>> >> the ISA and the CPU model that gets in the way.
>> >>
>> >> >
>> >> > Also, I'm trying to picture how these instructions would be
>> >> > implemented in a real MIPS processor. Would you really be able to
>> >> > write 3 bytes to memory or would you have to read a full word first
>> >> > and then modify the appropriate bytes and write back?
>> >>
>> >> In hardware it's pretty easy to do per-byte enables if you need to;
>> >> much simpler than doing a read-modify-write sequence. The only reason
>> >> I can imagine that they wouldn't do it that way is if they already had
>> >> to do an RMW for some other instructions and so they piggybacked on
>> >> that. I don't know of any MIPS instructions that would require RMW
>> >> though (particularly since MIPS uses LL/SC for atomic ops, which is
>> >> the main reason you'd want RMW).
>> >>
>> >> Have we discussed doing 3-byte memory ops on the list before? It's
>> >> clear it would require some new functionality at the read()/write()
>> >> ExecContext interface, but that seems like a worthwhile addition to
>> >> me.
>> >>
>> >> Steve
>> >>
>> >> >
>> >> > On Tue, Jan 19, 2010 at 3:08 PM, Steve Reinhardt <[email protected]>
>> >> wrote:
>> >> >> My guess is that the instruction definition should be rewritten to
>> >> >> do
>> >> >> a store of the correct number of bytes and not a read-modify-write.
>> >> >> I
>> >> >> don't think there's a reason the store function shouldn't handle a
>> >> >> 3-byte store; somebody correct me if I'm wrong.
>> >> >>
>> >> >> Steve
>> >> >>
>> >> >> On Tue, Jan 19, 2010 at 2:38 PM, Matt DeVuyst <[email protected]>
>> >> wrote:
>> >> >>> Hi,
>> >> >>>
>> >> >>> I'm getting an an assertion failure when executing MIPS code
>> >> >>> in the o3 cpu model. The same program executes correctly in the
>> >> >>> atomic cpu model. The assertion that is failing is
>> >> >>> src/cpu/o3/lsq_unit.hh line 489. It appears that the MIPS
>> >> >>> instructions SWL and SWR don't work in o3.
>> >> >>>
>> >> >>> Here is the relevant portion of the backtrace:
>> >> >>>
>> >> >>> #3 0xb7a9e5ce in __assert_fail () from
>> >> >>> /lib/tls/i686/cmov/libc.so.6
>> >> >>>
>> >> >>> #4 0x08400914 in LSQUnit<O3CPUImpl>::read<unsigned int>
>> >> >>> (this=0xa695f68, req=0xa7a3a98, da...@0xbffbbb3c, load_idx=8) at
>> >> >>> build-smt03/build/MIPS_SE/cpu/o3/lsq_unit.hh:489
>> >> >>>
>> >> >>> #5 0x08403745 in LSQ<O3CPUImpl>::read<unsigned int>
>> >> >>> (this=0xa695f40,
>> >> >>> req=0xa7a3a98, da...@0xbffbbb3c, load_idx=8) at
>> >> >>> build-smt03/build/MIPS_SE/cpu/o3/lsq.hh:376
>> >> >>>
>> >> >>> #6 0x0840378d in FullO3CPU<O3CPUImpl>::read<unsigned int>
>> >> >>> (this=0xa694bc0, r...@0xbffbbad4, da...@0xbffbbb3c, load_idx=8) at
>> >> >>> build-smt03/build/MIPS_SE/cpu/o3/cpu.hh:708
>> >> >>>
>> >> >>> #7 0x0840399a in BaseDynInst<O3CPUImpl>::read<unsigned int>
>> >> >>> (this=0xa7dd9a0, addr=5068452, da...@0xbffbbb3c, flags=0) at
>> >> >>> build-smt03/build/MIPS_SE/cpu/base_dyn_inst.hh:884
>> >> >>>
>> >> >>> #8 0x083bb699 in MipsISAInst::Swr::initiateAcc (this=0xa8d0428,
>> >> >>> xc=0xa7dd9a0, traceData=0x0) at
>> >> >>> build-smt03/build/MIPS_SE/arch/mips/o3_cpu_exec.cc:25017
>> >> >>>
>> >> >>> #9 0x08228a27 in BaseO3DynInst<O3CPUImpl>::initiateAcc
>> >> >>> (this=0xa7dd9a0) at
>> >> >>> build-smt03/build/MIPS_SE/cpu/o3/dyn_inst_impl.hh:111
>> >> >>>
>> >> >>> #10 0x0837b69f in LSQUnit<O3CPUImpl>::executeStore (this=0xa695f68,
>> >> >>> #store_in...@0xbffbc1f0) at
>> >> >>> #build-smt03/build/MIPS_SE/cpu/o3/lsq_unit_impl.hh:495
>> >> >>>
>> >> >>> #11 0x0836747d in LSQ<O3CPUImpl>::executeStore (this=0xa695f40,
>> >> >>> in...@0xbffbc1f0) at
>> >> >>> build-smt03/build/MIPS_SE/cpu/o3/lsq_impl.hh:327
>> >> >>>
>> >> >>> #12 0x0828c0fa in DefaultIEW<O3CPUImpl>::executeInsts
>> >> >>> (this=0xa6956a0)
>> >> >>> #at build-smt03/build/MIPS_SE/cpu/o3/iew_impl.hh:1232
>> >> >>>
>> >> >>> #13 0x0829436f in DefaultIEW<O3CPUImpl>::tick (this=0xa6956a0) at
>> >> >>> #build-smt03/build/MIPS_SE/cpu/o3/iew_impl.hh:1458
>> >> >>>
>> >> >>> #14 0x081b5ae4 in FullO3CPU<O3CPUImpl>::tick (this=0xa694bc0) at
>> >> >>> #build-smt03/build/MIPS_SE/cpu/o3/cpu.cc:514
>> >> >>>
>> >> >>> #15 0x081b7254 in FullO3CPU<O3CPUImpl>::TickEvent::process
>> >> >>> #(this=0xa694d0c) at build-smt03/build/MIPS_SE/cpu/o3/cpu.cc:86
>> >> >>>
>> >> >>> #16 0x0822feda in EventQueue::serviceOne (this=0x8834008) at
>> >> >>> #build-smt03/build/MIPS_SE/sim/eventq.cc:202
>> >> >>>
>> >> >>> #17 0x085c238c in simulate (num_cycles=9223372036854775807) at
>> >> >>> build-smt03/build/MIPS_SE/sim/simulate.cc:73
>> >> >>>
>> >> >>> So the MIPS instruction is an SWR instruction that stores anywhere
>> >> >>> from 1 to 4 bytes into an aligned memory word. The format function
>> >> >>> that handles this instruction is StoreUnalignedMemory (in
>> >> >>> src/arch/mips/isa/formats/mem.isa). This specifies that the read()
>> >> >>> function of the dynamic instruction
>> >> >>> (BaseDynInst<O3CPUImpl>::read())
>> >> >>> be called to get the whole aligned word from memory. Then the
>> >> >>> appropriate bytes can be changed and the whole word written back to
>> >> >>> memory.
>> >> >>>
>> >> >>> The problem seems to be that this store instructions (and other
>> >> >>> unaligned instructions like it) read from memory as well as write
>> >> >>> memory, but only a single StoreQueue (and not LoadQueue) LSQ entry
>> >> >>> was
>> >> >>> created for it.
>> >> >>>
>> >> >>> Should an entry be made on the LoadQueue as well? Or should a
>> >> >>> separate lightweight memory read function be called instead? It
>> >> >>> looks
>> >> >>> like the latter may have been what was done in the past (if this
>> >> >>> instruction ever used to work correctly)...because there's a
>> >> >>> commented out line in the format definition of StoreUnalignedMemory
>> >> >>> that calls xc->readFunctional().
>> >> >>>
>> >> >>>
>> >> >>> --
>> >> >>> Cheers,
>> >> >>> Matt
>> >> >>> _______________________________________________
>> >> >>> m5-dev mailing list
>> >> >>> [email protected]
>> >> >>> http://m5sim.org/mailman/listinfo/m5-dev
>> >> >>>
>> >> >> _______________________________________________
>> >> >> m5-dev mailing list
>> >> >> [email protected]
>> >> >> http://m5sim.org/mailman/listinfo/m5-dev
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Cheers,
>> >> > Matt
>> >> > _______________________________________________
>> >> > m5-dev mailing list
>> >> > [email protected]
>> >> > http://m5sim.org/mailman/listinfo/m5-dev
>> >> >
>> >> _______________________________________________
>> >> m5-dev mailing list
>> >> [email protected]
>> >> http://m5sim.org/mailman/listinfo/m5-dev
>> >>
>> >
>> >
>> >
>> > --
>> > - Korey
>> >
>>
>>
>> _______________________________________________
>> m5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/m5-dev
>
>
>
> --
> - Korey
>
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev
>
>
--
Cheers,
Matt
diff --git a/src/arch/mips/isa/decoder.isa b/src/arch/mips/isa/decoder.isa
index 36533e0..afc2ad0 100644
--- a/src/arch/mips/isa/decoder.isa
+++ b/src/arch/mips/isa/decoder.isa
@@ -2522,15 +2522,24 @@ decode OPCODE_HI default Unknown::unknown() {
format StoreUnalignedMemory {
0x2: swl({{
+ size = byte_offset + 1;
+ #if BYTE_ORDER == BIG_ENDIAN
+ uint32_t reg_shift = 8 * byte_offset;
+ EA += byte_offset;
+ #else
uint32_t reg_shift = 24 - (8 * byte_offset);
- uint32_t mem_shift = 32 - reg_shift;
- mem_word = (mem_word & (mask(reg_shift) << mem_shift)) |
- (Rt.uw >> reg_shift);
+ #endif
+ Mem = Rt.uw >> reg_shift;
}});
0x6: swr({{
+ size = 4 - byte_offset;
+ #if BYTE_ORDER == BIG_ENDIAN
+ uint32_t reg_shift = 24 - (8 * byte_offset);
+ #else
uint32_t reg_shift = 8 * byte_offset;
- mem_word = Rt.uw << reg_shift |
- (mem_word & (mask(reg_shift)));
+ EA += byte_offset;
+ #endif
+ Mem = (Rt.uw << reg_shift) >> reg_shift;
}});
}
format CP0Control {
diff --git a/src/arch/mips/isa/formats/mem.isa b/src/arch/mips/isa/formats/mem.isa
index 161a52b..56ea483 100644
--- a/src/arch/mips/isa/formats/mem.isa
+++ b/src/arch/mips/isa/formats/mem.isa
@@ -321,6 +321,80 @@ def template StoreExecute {{
}};
+def template StoreUnalignedExecute {{
+ Fault %(class_name)s::execute(%(CPU_exec_context)s *xc,
+ Trace::InstRecord *traceData) const
+ {
+ Addr EA;
+ Fault fault = NoFault;
+ unsigned int size = 0;
+
+ %(fp_enable_check)s;
+ %(op_decl)s;
+ %(op_rd)s;
+ %(ea_code)s;
+
+ if (fault == NoFault) {
+ %(memacc_code)s;
+ }
+
+ if (fault == NoFault) {
+ switch (size) {
+ case 0:
+ panic ("Store of size 0 incorrect\n");
+ break;
+ case 1:
+ {
+ uint8_t M = Mem;
+ M = htog(M);
+ fault = xc->write(M, EA,
+ memAccessFlags, NULL);
+ }
+ break;
+ case 2:
+ {
+ uint16_t M = Mem;
+ M = htog(M);
+ fault = xc->write(M, EA,
+ memAccessFlags, NULL);
+ }
+ break;
+ case 3:
+ {
+ uint24_t M = Mem;
+ M = htog(M);
+ fault = xc->write(M, EA,
+ memAccessFlags, NULL);
+ }
+ break;
+ case 4:
+ {
+ uint32_t M = Mem;
+ M = htog(M);
+ fault = xc->write(M, EA,
+ memAccessFlags, NULL);
+ }
+ break;
+ default:
+ panic ("Store of size %d unsupported\n", size);
+ break;
+ }
+ if (traceData) { traceData->setData(Mem); }
+ }
+
+ if (fault == NoFault) {
+ %(postacc_code)s;
+ }
+
+ if (fault == NoFault) {
+ %(op_wb)s;
+ }
+
+ return fault;
+ }
+}};
+
+
def template StoreFPExecute {{
Fault %(class_name)s::execute(%(CPU_exec_context)s *xc,
Trace::InstRecord *traceData) const
@@ -392,6 +466,7 @@ def template StoreCondExecute {{
}
}};
+
def template StoreInitiateAcc {{
Fault %(class_name)s::initiateAcc(%(CPU_exec_context)s *xc,
Trace::InstRecord *traceData) const
@@ -419,6 +494,72 @@ def template StoreInitiateAcc {{
}};
+def template StoreUnalignedInitiateAcc {{
+ Fault %(class_name)s::initiateAcc(%(CPU_exec_context)s *xc,
+ Trace::InstRecord *traceData) const
+ {
+ Addr EA;
+ Fault fault = NoFault;
+ unsigned int size = 0;
+
+ %(fp_enable_check)s;
+ %(op_decl)s;
+ %(op_rd)s;
+ %(ea_code)s;
+
+ if (fault == NoFault) {
+ %(memacc_code)s;
+ }
+
+ if (fault == NoFault) {
+ switch (size) {
+ case 0:
+ panic ("Store of size 0 incorrect\n");
+ break;
+ case 1:
+ {
+ uint8_t M = Mem;
+ M = htog(M);
+ fault = xc->write(M, EA,
+ memAccessFlags, NULL);
+ }
+ break;
+ case 2:
+ {
+ uint16_t M = Mem;
+ M = htog(M);
+ fault = xc->write(M, EA,
+ memAccessFlags, NULL);
+ }
+ break;
+ case 3:
+ {
+ uint24_t M = Mem;
+ M = htog(M);
+ fault = xc->write(M, EA,
+ memAccessFlags, NULL);
+ }
+ break;
+ case 4:
+ {
+ uint32_t M = Mem;
+ M = htog(M);
+ fault = xc->write(M, EA,
+ memAccessFlags, NULL);
+ }
+ break;
+ default:
+ panic ("Store of size %d unsupported\n", size);
+ break;
+ }
+ if (traceData) { traceData->setData(Mem); }
+ }
+
+ return fault;
+ }
+}};
+
+
def template StoreCompleteAcc {{
Fault %(class_name)s::completeAcc(Packet *pkt,
%(CPU_exec_context)s *xc,
@@ -444,6 +585,31 @@ def template StoreCompleteAcc {{
}};
+def template StoreUnalignedCompleteAcc {{
+ Fault %(class_name)s::completeAcc(Packet *pkt,
+ %(CPU_exec_context)s *xc,
+ Trace::InstRecord *traceData) const
+ {
+ Fault fault = NoFault;
+
+ %(fp_enable_check)s;
+ %(op_dest_decl)s;
+
+ if (fault == NoFault) {
+ %(postacc_code)s;
+ }
+
+ if (fault == NoFault) {
+ %(op_wb)s;
+
+ if (traceData) { traceData->setData(getMemData(xc, pkt)); }
+ }
+
+ return fault;
+ }
+}};
+
+
def template StoreCompleteAcc {{
Fault %(class_name)s::completeAcc(Packet *pkt,
%(CPU_exec_context)s *xc,
@@ -602,19 +768,13 @@ def format LoadUnalignedMemory(memacc_code, ea_code = {{ EA = (Rs + disp) & ~3;
def format StoreUnalignedMemory(memacc_code, ea_code = {{ EA = (Rs + disp) & ~3; }},
mem_flags = [], inst_flags = []) {{
- decl_code = 'uint32_t mem_word = 0;\n'
- decl_code += 'uint32_t unaligned_addr = Rs + disp;\n'
+ decl_code = 'uint32_t unaligned_addr = Rs + disp;\n'
decl_code += 'uint32_t byte_offset = unaligned_addr & 3;\n'
- decl_code += '#if BYTE_ORDER == BIG_ENDIAN\n'
- decl_code += '\tbyte_offset ^= 3;\n'
- decl_code += '#endif\n'
- decl_code += 'fault = xc->read(EA, (uint32_t&)mem_word, memAccessFlags);\n'
- #decl_code += 'xc->readFunctional(EA,(uint32_t&)mem_word);'
- memacc_code = decl_code + memacc_code + '\nMem = mem_word;\n'
+ memacc_code = decl_code + memacc_code
(header_output, decoder_output, decode_block, exec_output) = \
LoadStoreBase(name, Name, ea_code, memacc_code, mem_flags, inst_flags,
- exec_template_base = 'Store')
+ exec_template_base = 'StoreUnaligned')
}};
def format Prefetch(ea_code = {{ EA = Rs + disp; }},
diff --git a/src/base/bigint.cc b/src/base/bigint.cc
index d741e1f..5930e2f 100644
--- a/src/base/bigint.cc
+++ b/src/base/bigint.cc
@@ -45,3 +45,12 @@ ostream & operator << (ostream & os, const Twin32_t & t)
os << t.a << ", " << t.b;
return os;
}
+
+ostream & operator << (ostream & os, const uint24_t & t)
+{
+ uint32_t x = 0;
+ for (int i = 2; i >= 0; i++)
+ x |= (uint32_t)(t.val[i]) << (i * 8);
+ os << x;
+ return os;
+}
diff --git a/src/base/bigint.hh b/src/base/bigint.hh
index a4e8738..7454ce1 100644
--- a/src/base/bigint.hh
+++ b/src/base/bigint.hh
@@ -84,15 +84,52 @@ struct m5_twin32_t {
}
};
+struct m5_uint24_t {
+ uint8_t val[3];
+
+ m5_uint24_t()
+ {}
+
+ m5_uint24_t(const uint32_t x)
+ {
+ for (int i = 0; i < 3; i++)
+ val[i] = (x >> (8*i)) & 0xff;
+ }
+
+ inline m5_uint24_t& operator=(const uint32_t x)
+ {
+ for (int i = 0; i < 3; i++)
+ val[i] = (x >> (8*i)) & 0xff;
+ return *this;
+ }
+
+ operator uint32_t()
+ {
+ panic("Tried to cram a uint24_t into an integer!\n");
+ uint32_t x = 0;
+ for (int i = 2; i >= 0; i++)
+ x |= (uint32_t)(val[i]) << (i * 8);
+ return x;
+ }
+
+ void swap_bytes()
+ {
+ uint8_t old_0 = val[0];
+ val[0] = val[2];
+ val[2] = old_0;
+ }
+};
// This is for twin loads (two 64 bit values), not 1 128 bit value (as far as
// endian conversion is concerned!
typedef m5_twin64_t Twin64_t;
typedef m5_twin32_t Twin32_t;
+typedef m5_uint24_t uint24_t;
// Output operator overloads
std::ostream & operator << (std::ostream & os, const Twin64_t & t);
std::ostream & operator << (std::ostream & os, const Twin32_t & t);
+std::ostream & operator << (std::ostream & os, const uint24_t & t);
#endif // __BASE_BIGINT_HH__
diff --git a/src/cpu/inorder/cpu.cc b/src/cpu/inorder/cpu.cc
index 1e3fdc4..3498d0a 100644
--- a/src/cpu/inorder/cpu.cc
+++ b/src/cpu/inorder/cpu.cc
@@ -1479,6 +1479,11 @@ Fault
InOrderCPU::write(DynInstPtr inst, uint8_t data, Addr addr,
unsigned flags, uint64_t *res);
+template
+Fault
+InOrderCPU::write(DynInstPtr inst, uint24_t data, Addr addr,
+ unsigned flags, uint64_t *res);
+
#endif //DOXYGEN_SHOULD_SKIP_THIS
template<>
diff --git a/src/cpu/inorder/inorder_dyn_inst.cc b/src/cpu/inorder/inorder_dyn_inst.cc
index 5ab8396..a289f6b 100644
--- a/src/cpu/inorder/inorder_dyn_inst.cc
+++ b/src/cpu/inorder/inorder_dyn_inst.cc
@@ -688,6 +688,11 @@ Fault
InOrderDynInst::write(uint8_t data, Addr addr,
unsigned flags, uint64_t *res);
+template
+Fault
+InOrderDynInst::write(uint24_t data, Addr addr,
+ unsigned flags, uint64_t *res);
+
#endif //DOXYGEN_SHOULD_SKIP_THIS
template<>
diff --git a/src/cpu/inorder/resources/cache_unit.cc b/src/cpu/inorder/resources/cache_unit.cc
index eb66e10..f8583c2 100644
--- a/src/cpu/inorder/resources/cache_unit.cc
+++ b/src/cpu/inorder/resources/cache_unit.cc
@@ -923,6 +923,11 @@ Fault
CacheUnit::write(DynInstPtr inst, uint8_t data, Addr addr,
unsigned flags, uint64_t *res);
+template
+Fault
+CacheUnit::write(DynInstPtr inst, uint24_t data, Addr addr,
+ unsigned flags, uint64_t *res);
+
#endif //DOXYGEN_SHOULD_SKIP_THIS
template<>
diff --git a/src/cpu/simple/atomic.cc b/src/cpu/simple/atomic.cc
index 05b4ca3..f0f2284 100644
--- a/src/cpu/simple/atomic.cc
+++ b/src/cpu/simple/atomic.cc
@@ -593,6 +593,11 @@ Fault
AtomicSimpleCPU::write(uint8_t data, Addr addr,
unsigned flags, uint64_t *res);
+template
+Fault
+AtomicSimpleCPU::write(uint24_t data, Addr addr,
+ unsigned flags, uint64_t *res);
+
#endif //DOXYGEN_SHOULD_SKIP_THIS
template<>
diff --git a/src/cpu/simple/timing.cc b/src/cpu/simple/timing.cc
index 6b22d2f..b9da975 100644
--- a/src/cpu/simple/timing.cc
+++ b/src/cpu/simple/timing.cc
@@ -642,6 +642,11 @@ Fault
TimingSimpleCPU::write(uint8_t data, Addr addr,
unsigned flags, uint64_t *res);
+template
+Fault
+TimingSimpleCPU::write(uint24_t data, Addr addr,
+ unsigned flags, uint64_t *res);
+
#endif //DOXYGEN_SHOULD_SKIP_THIS
template<>
diff --git a/src/sim/byteswap.hh b/src/sim/byteswap.hh
index fc36bdf..ba27263 100644
--- a/src/sim/byteswap.hh
+++ b/src/sim/byteswap.hh
@@ -108,6 +108,13 @@ swap_byte16(uint16_t x)
#endif
}
+inline uint24_t
+swap_byte24(uint24_t x)
+{
+ x.swap_bytes();
+ return x;
+}
+
// This function lets the compiler figure out how to call the
// swap_byte functions above for different data types. Since the
// sizeof() values are known at compile time, it should inline to a
@@ -118,6 +125,8 @@ inline T swap_byte(T x) {
return swap_byte64((uint64_t)x);
else if (sizeof(T) == 4)
return swap_byte32((uint32_t)x);
+ else if (sizeof(T) == 3)
+ return swap_byte24((uint24_t)x);
else if (sizeof(T) == 2)
return swap_byte16((uint16_t)x);
else if (sizeof(T) == 1)
diff --git a/src/sim/insttracer.hh b/src/sim/insttracer.hh
index a8cdff6..2c18444 100644
--- a/src/sim/insttracer.hh
+++ b/src/sim/insttracer.hh
@@ -114,6 +114,7 @@ class InstRecord
void setData(uint32_t d) { data.as_int = d; data_status = DataInt32; }
void setData(uint16_t d) { data.as_int = d; data_status = DataInt16; }
void setData(uint8_t d) { data.as_int = d; data_status = DataInt8; }
+ void setData(uint24_t d) { data.as_int = d; data_status = DataInt32; }
void setData(int64_t d) { setData((uint64_t)d); }
void setData(int32_t d) { setData((uint32_t)d); }
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev