Hi Matt. Sorry no one has taken a look at this yet, at least that I'm
aware of. We haven't forgotten you, but it might take a little while
before we can get this taken care of.

Gabe

Matt DeVuyst wrote:
> I fixed the problem.  See the attached patch.
>
> For the case of writing 3 bytes to memory, first I tried using a
> simple 3-byte array, but that didn't work quite right.  Then I wrapped
> the 3-byte array in a struct and added some operator overloads to it
> so that things like assignment to 0 will work as expected.
>
> Unfortunately some of the old format definitions (like StoreExecute)
> in mem.isa were unsuitable for the unaligned stores because they
> assumed a fixed size; the type to be written was explicitly cast in
> the call to xc->write().  With the unaligned accesses, the number of
> bytes to write will vary at runtime and depend on the byte offset
> within the memory word.  Thus, I created some new format definitions
> (like StoreUnalignedExecute) that switch on the size of the write to
> make the appropriate call to xc->write().
>
> All the write() functions were templated to accept writes of different
> sizes, but none of them had declarations to handle 3-byte writes (and,
> as you probably know, the C++ templating system requires this if a
> templated function is used in a different file that is compiled
> separately), so I ended up having to add new declarations to write()
> everywhere to handle the 3-byte writes.
>
> I've tested the changes and they appear to work.  But I've only tested
> on little-endian host machines (x86 and x86_64) since I don't have a
> big-endian machine, and only tested MIPS in its default little-endian
> mode since I don't have a cross-compiler to produce MIPS big-endian
> binaries.  While I tried to code carefully to handle these other
> endian configurations, I cannot guarantee that they will work without
> some slight modifications.
>
> Anyway, thanks for all your help and advice.
>
> On Tue, Jan 19, 2010 at 7:03 PM, Korey Sewell <[email protected]> wrote:
>   
>> Good points about potential padding.
>>
>> The structure could be just used to obtain the width, where as the
>> assignment of data could be just be in a void* or whatever arbitrary type as
>> I'm guessing that once the request is made, it will simply read/write as
>> many bytes as the width says it should.
>>
>> Matt would you be willing to give one of these a try in the interim?
>>
>> On Tue, Jan 19, 2010 at 9:48 PM, Gabriel Michael Black
>> <[email protected]> wrote:
>>     
>>> A structure might not work because of padding (I'm just guessing
>>> though), but what about uint8_t [3]? That might get wierd if something
>>> tries to dereference a pointer to that type and do a direct
>>> assignment, something like:
>>>
>>> template <class T>
>>> bar(T *foo)
>>> {
>>>     ...
>>>     T *bar = (T *)baz;
>>>     *bar = *foo;
>>> }
>>>
>>> But that might work too. I'm not sure.
>>>
>>> Gabe
>>>
>>> Quoting Korey Sewell <[email protected]>:
>>>
>>>       
>>>> Hi Matt/Steve,
>>>> sorry I'm late to the conversation.
>>>>
>>>> In short, the reason it's implemented currently as a read-modify-write
>>>> was
>>>> because what Matt found out to be true: the 3-byte store problem. At the
>>>> time of implementation, M5 was at the crossroads of another memory
>>>> system
>>>> change (remember mergedmem anyone?) and there wasnt any really good,
>>>> unobtrusive solution to integrate this in.
>>>>
>>>> Like Steve mentioned, the ISA/CPU model "dance" was just not working out
>>>> at
>>>> the time but getting them to work in the AtomicSimpleCPU was at least a
>>>> good
>>>> functionality start. I had a "readFunctional" there which was able to
>>>> read
>>>> from memory without adversely affecting the LSQ and other dependent
>>>> structures. Reading the data out first before storing would allow you to
>>>> get
>>>> the right data and *hopefully not upset the timing too much depending on
>>>> what you're doing. Performing a regular read there will definitely not
>>>> work
>>>> there though as Matt found out.
>>>>
>>>> I believe the only addition that we previously made was to add a
>>>> "unaligned"
>>>> flag or something of that effect to the request structure so that we
>>>> wouldnt
>>>> fault when the swl/swr pairs were used in MIPS.
>>>>
>>>> I dont know why this never came up then (or maybe it did), but a
>>>> proposed
>>>> solution might be to create a new struct that is exactly 3 bytes long
>>>> (say
>>>> an uint16_t and a uint8_t), then when a 3-byte write is necessary, we
>>>> can
>>>> use that structure to generate the right width for the request.
>>>>
>>>> Thoughts?
>>>>
>>>> On Tue, Jan 19, 2010 at 8:31 PM, Steve Reinhardt <[email protected]>
>>>> wrote:
>>>>
>>>>         
>>>>> On Tue, Jan 19, 2010 at 5:13 PM, Matt DeVuyst <[email protected]>
>>>>> wrote:
>>>>>           
>>>>>> I don't see how 3 byte stores will work.  The write() functions don't
>>>>>> take a size argument.  They just template the data argument (and can
>>>>>> figure out the size by with sizeof(T)).  So, for example, if you want
>>>>>> to write 2 bytes to memory you just call write((uint16_t)data,
>>>>>> address, flags, NULL).  But I'm not aware of any data type that's 3
>>>>>> bytes wide.  Is changing all the write() functions to accept a size
>>>>>> argument the best thing to do here?
>>>>>>             
>>>>> Good point... I hadn't thought that through.  I think the memory
>>>>> system at the lower level will handle this (i.e., I believe you can
>>>>> put together a Request object that has size=3 and send that to a
>>>>> memory object and it will work; it's really just the interface between
>>>>> the ISA and the CPU model that gets in the way.
>>>>>
>>>>>           
>>>>>> Also, I'm trying to picture how these instructions would be
>>>>>> implemented in a real MIPS processor.  Would you really be able to
>>>>>> write 3 bytes to memory or would you have to read a full word first
>>>>>> and then modify the appropriate bytes and write back?
>>>>>>             
>>>>> In hardware it's pretty easy to do per-byte enables if you need to;
>>>>> much simpler than doing a read-modify-write sequence.  The only reason
>>>>> I can imagine that they wouldn't do it that way is if they already had
>>>>> to do an RMW for some other instructions and so they piggybacked on
>>>>> that.  I don't know of any MIPS instructions that would require RMW
>>>>> though (particularly since MIPS uses LL/SC for atomic ops, which is
>>>>> the main reason you'd want RMW).
>>>>>
>>>>> Have we discussed doing 3-byte memory ops on the list before?  It's
>>>>> clear it would require some new functionality at the read()/write()
>>>>> ExecContext interface, but that seems like a worthwhile addition to
>>>>> me.
>>>>>
>>>>> Steve
>>>>>
>>>>>           
>>>>>> On Tue, Jan 19, 2010 at 3:08 PM, Steve Reinhardt <[email protected]>
>>>>>>             
>>>>> wrote:
>>>>>           
>>>>>>> My guess is that the instruction definition should be rewritten to
>>>>>>> do
>>>>>>> a store of the correct number of bytes and not a read-modify-write.
>>>>>>>  I
>>>>>>> don't think there's a reason the store function shouldn't handle a
>>>>>>> 3-byte store; somebody correct me if I'm wrong.
>>>>>>>
>>>>>>> Steve
>>>>>>>
>>>>>>> On Tue, Jan 19, 2010 at 2:38 PM, Matt DeVuyst <[email protected]>
>>>>>>>               
>>>>> wrote:
>>>>>           
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm getting an an assertion failure when executing MIPS code
>>>>>>>> in the o3 cpu model.  The same program executes correctly in the
>>>>>>>> atomic cpu model.  The assertion that is failing is
>>>>>>>> src/cpu/o3/lsq_unit.hh line 489.  It appears that the MIPS
>>>>>>>> instructions SWL and SWR don't work in o3.
>>>>>>>>
>>>>>>>> Here is the relevant portion of the backtrace:
>>>>>>>>
>>>>>>>> #3  0xb7a9e5ce in __assert_fail () from
>>>>>>>> /lib/tls/i686/cmov/libc.so.6
>>>>>>>>
>>>>>>>> #4  0x08400914 in LSQUnit<O3CPUImpl>::read<unsigned int>
>>>>>>>>  (this=0xa695f68, req=0xa7a3a98, da...@0xbffbbb3c, load_idx=8) at
>>>>>>>>  build-smt03/build/MIPS_SE/cpu/o3/lsq_unit.hh:489
>>>>>>>>
>>>>>>>> #5  0x08403745 in LSQ<O3CPUImpl>::read<unsigned int>
>>>>>>>> (this=0xa695f40,
>>>>>>>>  req=0xa7a3a98, da...@0xbffbbb3c, load_idx=8) at
>>>>>>>>  build-smt03/build/MIPS_SE/cpu/o3/lsq.hh:376
>>>>>>>>
>>>>>>>> #6  0x0840378d in FullO3CPU<O3CPUImpl>::read<unsigned int>
>>>>>>>>  (this=0xa694bc0, r...@0xbffbbad4, da...@0xbffbbb3c, load_idx=8) at
>>>>>>>>  build-smt03/build/MIPS_SE/cpu/o3/cpu.hh:708
>>>>>>>>
>>>>>>>> #7  0x0840399a in BaseDynInst<O3CPUImpl>::read<unsigned int>
>>>>>>>>  (this=0xa7dd9a0, addr=5068452, da...@0xbffbbb3c, flags=0) at
>>>>>>>>  build-smt03/build/MIPS_SE/cpu/base_dyn_inst.hh:884
>>>>>>>>
>>>>>>>> #8  0x083bb699 in MipsISAInst::Swr::initiateAcc (this=0xa8d0428,
>>>>>>>>  xc=0xa7dd9a0, traceData=0x0) at
>>>>>>>>  build-smt03/build/MIPS_SE/arch/mips/o3_cpu_exec.cc:25017
>>>>>>>>
>>>>>>>> #9  0x08228a27 in BaseO3DynInst<O3CPUImpl>::initiateAcc
>>>>>>>>  (this=0xa7dd9a0) at
>>>>>>>>  build-smt03/build/MIPS_SE/cpu/o3/dyn_inst_impl.hh:111
>>>>>>>>
>>>>>>>> #10 0x0837b69f in LSQUnit<O3CPUImpl>::executeStore (this=0xa695f68,
>>>>>>>> #store_in...@0xbffbc1f0) at
>>>>>>>> #build-smt03/build/MIPS_SE/cpu/o3/lsq_unit_impl.hh:495
>>>>>>>>
>>>>>>>> #11 0x0836747d in LSQ<O3CPUImpl>::executeStore (this=0xa695f40,
>>>>>>>>  in...@0xbffbc1f0) at
>>>>>>>> build-smt03/build/MIPS_SE/cpu/o3/lsq_impl.hh:327
>>>>>>>>
>>>>>>>> #12 0x0828c0fa in DefaultIEW<O3CPUImpl>::executeInsts
>>>>>>>> (this=0xa6956a0)
>>>>>>>> #at build-smt03/build/MIPS_SE/cpu/o3/iew_impl.hh:1232
>>>>>>>>
>>>>>>>> #13 0x0829436f in DefaultIEW<O3CPUImpl>::tick (this=0xa6956a0) at
>>>>>>>> #build-smt03/build/MIPS_SE/cpu/o3/iew_impl.hh:1458
>>>>>>>>
>>>>>>>> #14 0x081b5ae4 in FullO3CPU<O3CPUImpl>::tick (this=0xa694bc0) at
>>>>>>>> #build-smt03/build/MIPS_SE/cpu/o3/cpu.cc:514
>>>>>>>>
>>>>>>>> #15 0x081b7254 in FullO3CPU<O3CPUImpl>::TickEvent::process
>>>>>>>> #(this=0xa694d0c) at build-smt03/build/MIPS_SE/cpu/o3/cpu.cc:86
>>>>>>>>
>>>>>>>> #16 0x0822feda in EventQueue::serviceOne (this=0x8834008) at
>>>>>>>> #build-smt03/build/MIPS_SE/sim/eventq.cc:202
>>>>>>>>
>>>>>>>> #17 0x085c238c in simulate (num_cycles=9223372036854775807) at
>>>>>>>>  build-smt03/build/MIPS_SE/sim/simulate.cc:73
>>>>>>>>
>>>>>>>> So the MIPS instruction is an SWR instruction that stores anywhere
>>>>>>>> from 1 to 4 bytes into an aligned memory word.  The format function
>>>>>>>> that handles this instruction is StoreUnalignedMemory (in
>>>>>>>> src/arch/mips/isa/formats/mem.isa).  This specifies that the read()
>>>>>>>> function of the dynamic instruction
>>>>>>>> (BaseDynInst<O3CPUImpl>::read())
>>>>>>>> be called to get the whole aligned word from memory.  Then the
>>>>>>>> appropriate bytes can be changed and the whole word written back to
>>>>>>>> memory.
>>>>>>>>
>>>>>>>> The problem seems to be that this store instructions (and other
>>>>>>>> unaligned instructions like it) read from memory as well as write
>>>>>>>> memory, but only a single StoreQueue (and not LoadQueue) LSQ entry
>>>>>>>> was
>>>>>>>> created for it.
>>>>>>>>
>>>>>>>> Should an entry be made on the LoadQueue as well?  Or should a
>>>>>>>> separate lightweight memory read function be called instead?  It
>>>>>>>> looks
>>>>>>>> like the latter may have been what was done in the past (if this
>>>>>>>> instruction ever used to work correctly)...because there's a
>>>>>>>> commented out line in the format definition of StoreUnalignedMemory
>>>>>>>> that calls xc->readFunctional().
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Cheers,
>>>>>>>> Matt
>>>>>>>> _______________________________________________
>>>>>>>> m5-dev mailing list
>>>>>>>> [email protected]
>>>>>>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>>>>>>
>>>>>>>>                 
>>>>>>> _______________________________________________
>>>>>>> m5-dev mailing list
>>>>>>> [email protected]
>>>>>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>>>>>
>>>>>>>               
>>>>>>
>>>>>> --
>>>>>> Cheers,
>>>>>> Matt
>>>>>> _______________________________________________
>>>>>> m5-dev mailing list
>>>>>> [email protected]
>>>>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>>>>
>>>>>>             
>>>>> _______________________________________________
>>>>> m5-dev mailing list
>>>>> [email protected]
>>>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>>>
>>>>>           
>>>>
>>>> --
>>>> - Korey
>>>>
>>>>         
>>> _______________________________________________
>>> m5-dev mailing list
>>> [email protected]
>>> http://m5sim.org/mailman/listinfo/m5-dev
>>>       
>>
>> --
>> - Korey
>>
>> _______________________________________________
>> m5-dev mailing list
>> [email protected]
>> http://m5sim.org/mailman/listinfo/m5-dev
>>
>>
>>     
>
>
>
>   
> ------------------------------------------------------------------------
>
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

Reply via email to