A structure might not work because of padding (I'm just guessing
though), but what about uint8_t [3]? That might get wierd if something
tries to dereference a pointer to that type and do a direct
assignment, something like:
template <class T>
bar(T *foo)
{
...
T *bar = (T *)baz;
*bar = *foo;
}
But that might work too. I'm not sure.
Gabe
Quoting Korey Sewell <[email protected]>:
> Hi Matt/Steve,
> sorry I'm late to the conversation.
>
> In short, the reason it's implemented currently as a read-modify-write was
> because what Matt found out to be true: the 3-byte store problem. At the
> time of implementation, M5 was at the crossroads of another memory system
> change (remember mergedmem anyone?) and there wasnt any really good,
> unobtrusive solution to integrate this in.
>
> Like Steve mentioned, the ISA/CPU model "dance" was just not working out at
> the time but getting them to work in the AtomicSimpleCPU was at least a good
> functionality start. I had a "readFunctional" there which was able to read
> from memory without adversely affecting the LSQ and other dependent
> structures. Reading the data out first before storing would allow you to get
> the right data and *hopefully not upset the timing too much depending on
> what you're doing. Performing a regular read there will definitely not work
> there though as Matt found out.
>
> I believe the only addition that we previously made was to add a "unaligned"
> flag or something of that effect to the request structure so that we wouldnt
> fault when the swl/swr pairs were used in MIPS.
>
> I dont know why this never came up then (or maybe it did), but a proposed
> solution might be to create a new struct that is exactly 3 bytes long (say
> an uint16_t and a uint8_t), then when a 3-byte write is necessary, we can
> use that structure to generate the right width for the request.
>
> Thoughts?
>
> On Tue, Jan 19, 2010 at 8:31 PM, Steve Reinhardt <[email protected]> wrote:
>
>> On Tue, Jan 19, 2010 at 5:13 PM, Matt DeVuyst <[email protected]> wrote:
>> > I don't see how 3 byte stores will work. The write() functions don't
>> > take a size argument. They just template the data argument (and can
>> > figure out the size by with sizeof(T)). So, for example, if you want
>> > to write 2 bytes to memory you just call write((uint16_t)data,
>> > address, flags, NULL). But I'm not aware of any data type that's 3
>> > bytes wide. Is changing all the write() functions to accept a size
>> > argument the best thing to do here?
>>
>> Good point... I hadn't thought that through. I think the memory
>> system at the lower level will handle this (i.e., I believe you can
>> put together a Request object that has size=3 and send that to a
>> memory object and it will work; it's really just the interface between
>> the ISA and the CPU model that gets in the way.
>>
>> >
>> > Also, I'm trying to picture how these instructions would be
>> > implemented in a real MIPS processor. Would you really be able to
>> > write 3 bytes to memory or would you have to read a full word first
>> > and then modify the appropriate bytes and write back?
>>
>> In hardware it's pretty easy to do per-byte enables if you need to;
>> much simpler than doing a read-modify-write sequence. The only reason
>> I can imagine that they wouldn't do it that way is if they already had
>> to do an RMW for some other instructions and so they piggybacked on
>> that. I don't know of any MIPS instructions that would require RMW
>> though (particularly since MIPS uses LL/SC for atomic ops, which is
>> the main reason you'd want RMW).
>>
>> Have we discussed doing 3-byte memory ops on the list before? It's
>> clear it would require some new functionality at the read()/write()
>> ExecContext interface, but that seems like a worthwhile addition to
>> me.
>>
>> Steve
>>
>> >
>> > On Tue, Jan 19, 2010 at 3:08 PM, Steve Reinhardt <[email protected]>
>> wrote:
>> >> My guess is that the instruction definition should be rewritten to do
>> >> a store of the correct number of bytes and not a read-modify-write. I
>> >> don't think there's a reason the store function shouldn't handle a
>> >> 3-byte store; somebody correct me if I'm wrong.
>> >>
>> >> Steve
>> >>
>> >> On Tue, Jan 19, 2010 at 2:38 PM, Matt DeVuyst <[email protected]>
>> wrote:
>> >>> Hi,
>> >>>
>> >>> I'm getting an an assertion failure when executing MIPS code
>> >>> in the o3 cpu model. The same program executes correctly in the
>> >>> atomic cpu model. The assertion that is failing is
>> >>> src/cpu/o3/lsq_unit.hh line 489. It appears that the MIPS
>> >>> instructions SWL and SWR don't work in o3.
>> >>>
>> >>> Here is the relevant portion of the backtrace:
>> >>>
>> >>> #3 0xb7a9e5ce in __assert_fail () from /lib/tls/i686/cmov/libc.so.6
>> >>>
>> >>> #4 0x08400914 in LSQUnit<O3CPUImpl>::read<unsigned int>
>> >>> (this=0xa695f68, req=0xa7a3a98, da...@0xbffbbb3c, load_idx=8) at
>> >>> build-smt03/build/MIPS_SE/cpu/o3/lsq_unit.hh:489
>> >>>
>> >>> #5 0x08403745 in LSQ<O3CPUImpl>::read<unsigned int> (this=0xa695f40,
>> >>> req=0xa7a3a98, da...@0xbffbbb3c, load_idx=8) at
>> >>> build-smt03/build/MIPS_SE/cpu/o3/lsq.hh:376
>> >>>
>> >>> #6 0x0840378d in FullO3CPU<O3CPUImpl>::read<unsigned int>
>> >>> (this=0xa694bc0, r...@0xbffbbad4, da...@0xbffbbb3c, load_idx=8) at
>> >>> build-smt03/build/MIPS_SE/cpu/o3/cpu.hh:708
>> >>>
>> >>> #7 0x0840399a in BaseDynInst<O3CPUImpl>::read<unsigned int>
>> >>> (this=0xa7dd9a0, addr=5068452, da...@0xbffbbb3c, flags=0) at
>> >>> build-smt03/build/MIPS_SE/cpu/base_dyn_inst.hh:884
>> >>>
>> >>> #8 0x083bb699 in MipsISAInst::Swr::initiateAcc (this=0xa8d0428,
>> >>> xc=0xa7dd9a0, traceData=0x0) at
>> >>> build-smt03/build/MIPS_SE/arch/mips/o3_cpu_exec.cc:25017
>> >>>
>> >>> #9 0x08228a27 in BaseO3DynInst<O3CPUImpl>::initiateAcc
>> >>> (this=0xa7dd9a0) at
>> >>> build-smt03/build/MIPS_SE/cpu/o3/dyn_inst_impl.hh:111
>> >>>
>> >>> #10 0x0837b69f in LSQUnit<O3CPUImpl>::executeStore (this=0xa695f68,
>> >>> #store_in...@0xbffbc1f0) at
>> >>> #build-smt03/build/MIPS_SE/cpu/o3/lsq_unit_impl.hh:495
>> >>>
>> >>> #11 0x0836747d in LSQ<O3CPUImpl>::executeStore (this=0xa695f40,
>> >>> in...@0xbffbc1f0) at build-smt03/build/MIPS_SE/cpu/o3/lsq_impl.hh:327
>> >>>
>> >>> #12 0x0828c0fa in DefaultIEW<O3CPUImpl>::executeInsts (this=0xa6956a0)
>> >>> #at build-smt03/build/MIPS_SE/cpu/o3/iew_impl.hh:1232
>> >>>
>> >>> #13 0x0829436f in DefaultIEW<O3CPUImpl>::tick (this=0xa6956a0) at
>> >>> #build-smt03/build/MIPS_SE/cpu/o3/iew_impl.hh:1458
>> >>>
>> >>> #14 0x081b5ae4 in FullO3CPU<O3CPUImpl>::tick (this=0xa694bc0) at
>> >>> #build-smt03/build/MIPS_SE/cpu/o3/cpu.cc:514
>> >>>
>> >>> #15 0x081b7254 in FullO3CPU<O3CPUImpl>::TickEvent::process
>> >>> #(this=0xa694d0c) at build-smt03/build/MIPS_SE/cpu/o3/cpu.cc:86
>> >>>
>> >>> #16 0x0822feda in EventQueue::serviceOne (this=0x8834008) at
>> >>> #build-smt03/build/MIPS_SE/sim/eventq.cc:202
>> >>>
>> >>> #17 0x085c238c in simulate (num_cycles=9223372036854775807) at
>> >>> build-smt03/build/MIPS_SE/sim/simulate.cc:73
>> >>>
>> >>> So the MIPS instruction is an SWR instruction that stores anywhere
>> >>> from 1 to 4 bytes into an aligned memory word. The format function
>> >>> that handles this instruction is StoreUnalignedMemory (in
>> >>> src/arch/mips/isa/formats/mem.isa). This specifies that the read()
>> >>> function of the dynamic instruction (BaseDynInst<O3CPUImpl>::read())
>> >>> be called to get the whole aligned word from memory. Then the
>> >>> appropriate bytes can be changed and the whole word written back to
>> >>> memory.
>> >>>
>> >>> The problem seems to be that this store instructions (and other
>> >>> unaligned instructions like it) read from memory as well as write
>> >>> memory, but only a single StoreQueue (and not LoadQueue) LSQ entry was
>> >>> created for it.
>> >>>
>> >>> Should an entry be made on the LoadQueue as well? Or should a
>> >>> separate lightweight memory read function be called instead? It looks
>> >>> like the latter may have been what was done in the past (if this
>> >>> instruction ever used to work correctly)...because there's a
>> >>> commented out line in the format definition of StoreUnalignedMemory
>> >>> that calls xc->readFunctional().
>> >>>
>> >>>
>> >>> --
>> >>> Cheers,
>> >>> Matt
>> >>> _______________________________________________
>> >>> m5-dev mailing list
>> >>> [email protected]
>> >>> http://m5sim.org/mailman/listinfo/m5-dev
>> >>>
>> >> _______________________________________________
>> >> m5-dev mailing list
>> >> [email protected]
>> >> http://m5sim.org/mailman/listinfo/m5-dev
>> >>
>> >
>> >
>> >
>> > --
>> > Cheers,
>> > Matt
>> > _______________________________________________
>> > m5-dev mailing list
>> > [email protected]
>> > http://m5sim.org/mailman/listinfo/m5-dev
>> >
>> _______________________________________________
>> m5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/m5-dev
>>
>
>
>
> --
> - Korey
>
_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev