I think I've found the problem; it has less to do with split memory
requests and more to do with integer overflow (register a1's value of -1
should have tipped me off earlier).  Basically, splitRequest() decides
whether a request crosses a cache-line boundary by testing whether the
address of its last byte is greater than the address of its first byte
(and lies on a different line).  If the request's address is -1, the
address computation overflows and the last byte's address wraps around
to before the first one's.  For an ld instruction with address -1
(0xFFFFFFFFFFFFFFFF), the last byte is at 0xFFFFFFFFFFFFFFFF + 7, which
wraps to 0x6; that is less than 0xFFFFFFFFFFFFFFFF, so no split occurs.
I think the basic fix is for the TLB to return a fault before even
bothering with translation if this overflow occurs.
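
To make the wraparound concrete, here is a minimal standalone sketch.  It
is not gem5's actual splitRequest() or TLB code: lineAlign(),
naiveNeedsSplit(), wrapsAddressSpace() and the 64-byte line size are
names and values I made up for illustration, and the naive check is only
modeled on the behavior described above.

```cpp
#include <cstdint>

using Addr = std::uint64_t;  // 64-bit virtual address

// Hypothetical cache-line size, chosen only for this example.
constexpr Addr kLineBytes = 64;

// Round an address down to the start of its cache line.
constexpr Addr lineAlign(Addr a) { return a & ~(kLineBytes - 1); }

// Naive check modeled on the description above: report a split only if
// the last byte lies after the first byte and on a different line.
constexpr bool naiveNeedsSplit(Addr addr, unsigned size)
{
    const Addr last = addr + size - 1;  // unsigned math wraps modulo 2^64
    return last > addr && lineAlign(last) != lineAlign(addr);
}

// Guard a TLB could apply before translating: true if the byte range
// [addr, addr + size) runs past the top of the address space.
constexpr bool wrapsAddressSpace(Addr addr, unsigned size)
{
    return size != 0 && addr + (size - 1) < addr;
}
```

With an 8-byte access at address -1 (all bits set), naiveNeedsSplit()
returns false because the last byte's address has wrapped to a small
value, while wrapsAddressSpace() returns true, which is the case where I
would have the TLB return a fault.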

Thanks for all your help!

P.S. For anyone who's using MIPS with the O3 model (if there is anyone),
RISC-V's TLB is copied from MIPS's, so this will likely happen there as
well.

On Fri, Mar 10, 2017 at 6:28 PM, Gabe Black <[email protected]> wrote:

> If your ISA doesn't support unaligned accesses, the page table code should
> return a fault (architectural or made up). If it does and you're not
> expecting one, that's a different story, since it indicates some other bit
> of code isn't doing its job, for instance not splitting up an access.
>
> If there's no architectural fault which makes sense and you're just
> reporting something you don't handle yet, etc., you can make up a fault
> which just calls panic() when it's invoked. I think M5DebugFault will help,
> in src/arch/generic/debugfaults.hh. You could also take this approach if
> this is in SE mode for instance, and there's nowhere for the full blown
> architectural exception mechanism to send simulated execution to handle
> things.
>
> Gabe
>
> On Fri, Mar 10, 2017 at 3:13 PM, Steve Reinhardt <[email protected]> wrote:
>
> > I haven't looked at the source directly to see how much of the unaligned
> > access code is in x86-specific code vs. generic code, and I don't remember
> > off the top of my head. I'm glad to hear that the splitRequest() call is in
> > the generic part of the code.
> >
> > The symptom that you're getting implies that the TLB is being accessed with
> > a non-split unaligned request though, so there must be some code path where
> > this is not handled automatically by setting HasUnalignedMemAcc. I'd
> > suggest getting a stack trace from where you hit this assertion to see
> > where this call is coming from, and why it's not using the split version of
> > the requests.
> >
> > Steve
> >
> >
> > On Fri, Mar 10, 2017 at 3:06 PM, Alec Roelke <[email protected]> wrote:
> >
> > > There used to be a problem that looked similar to this one where an ld
> > > instruction would cross a cache line, and to fix that I was advised to look
> > > at the splitRequest() function for help.  In doing so, I found that
> > > initiateMemRead() in BaseDynInst already calls it if the ISA supports
> > > unaligned memory accesses, so I set RISC-V to do that and that problem was
> > > fixed.  I haven't looked at x86's implementation of anything yet due to its
> > > complexity; does x86 have additional code beyond this to handle unaligned
> > > accesses?
> > >
> > > On Fri, Mar 10, 2017 at 5:14 PM, Steve Reinhardt <[email protected]> wrote:
> > >
> > > > Have you looked at how x86 supports unaligned accesses? The gem5 memory
> > > > system does not support them natively; you have to check if your access is
> > > > unaligned, and issue two requests to the cache for the two halves to
> > > > guarantee there are no cache line or page crossings within a single
> > > > request.
> > > >
> > > > Steve
> > > >
> > > >
> > > > On Fri, Mar 10, 2017 at 2:06 PM, Alec Roelke <[email protected]> wrote:
> > > >
> > > > > I get an assertion failure:
> > > > >
> > > > > build/RISCV/mem/page_table.cc:190: Fault
> > > > > PageTableBase::translate(RequestPtr): Assertion
> > > > > `pageAlign(req->getVaddr() + req->getSize() - 1) ==
> > > > > pageAlign(req->getVaddr())' failed.
> > > > >
> > > > > I can't tell from my debug trace (with Exec and Commit flags on) if this is
> > > > > happening during exec or commit.  I can see that there's an ld instruction
> > > > > preceding the ret instruction that is supposed to load the correct address
> > > > > into the return-address register, but for some reason is unable to commit.
> > > > > A more complete trace looks like this:
> > > > >
> > > > > ...
> > > > > ret <------------ this ret appears to execute properly
> > > > > ld ra,24(sp) <--- this never commits
> > > > > li a0,0
> > > > > ld s0,16(sp)
> > > > > ld s1,8(sp)
> > > > > addi sp,sp,32
> > > > > ret
> > > > > ld s1,0(a1) <---- this is executed speculatively, but a1 isn't a valid
> > > > >                   address
> > > > > ...
> > > > >
> > > > > Register a1 contains -1 at this point.
> > > > >
> > > > > The commit log doesn't show me any information about that last instruction
> > > > > except for when it is inserted into the ROB, which is why I originally
> > > > > thought the error was happening during speculation.
> > > > >
> > > > > On Fri, Mar 10, 2017 at 4:45 PM, Steve Reinhardt <[email protected]> wrote:
> > > > >
> > > > > > The instruction should be marked as causing a fault, but the fault action
> > > > > > should not be invoked until the instruction is committed. Because it's a
> > > > > > mis-speculated instruction, it will never be committed, so the fault should
> > > > > > never be observed. I'm not sure what you mean by "crashes", so I'm not sure
> > > > > > what part of this process is not operating properly for you.
> > > > > >
> > > > > > Steve
> > > > > >
> > > > > >
> > > > > > On Fri, Mar 10, 2017 at 1:40 PM, Alec Roelke <[email protected]> wrote:
> > > > > >
> > > > > > > Hello Everyone,
> > > > > > >
> > > > > > > I'm trying to debug RISC-V running on the O3 model, and I've encountered a
> > > > > > > problem where the CPU tries to speculatively execute a load instruction
> > > > > > > (which is actually along a branch that ends up not being taken) in which
> > > > > > > the data crosses a page boundary and causes a fault.
> > > > > > >
> > > > > > > The specific section of code looks like this:
> > > > > > >
> > > > > > > ...
> > > > > > > ret
> > > > > > > ld s1,0(a1)
> > > > > > > ...
> > > > > > >
> > > > > > > If you're not familiar with RISC-V, ret is a pseudo-instruction that just
> > > > > > > jumps to whatever address is stored in the return-address register, and ld
> > > > > > > loads a 64-bit value from memory.
> > > > > > >
> > > > > > > The problem I'm encountering is that the value stored in register a1 is not
> > > > > > > a valid address as that instruction is not supposed to be executed, and it
> > > > > > > just so happens that the word it points to crosses a page boundary.  When
> > > > > > > gem5 speculatively executes this, it crashes.
> > > > > > >
> > > > > > > How can I prevent gem5 from doing this?  I know I could flag load and store
> > > > > > > instructions as being nonspeculative, but that will slow down execution and
> > > > > > > affect output stats.  I'm working on top of these four patches:
> > > > > > >
> > > > > > >    - https://gem5-review.googlesource.com/c/2304/
> > > > > > >    - https://gem5-review.googlesource.com/c/2305/5
> > > > > > >    - https://gem5-review.googlesource.com/c/2340/2
> > > > > > >    - https://gem5-review.googlesource.com/c/2341/2
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Alec Roelke
> > > > > > > _______________________________________________
> > > > > > > gem5-dev mailing list
> > > > > > > [email protected]
> > > > > > > http://m5sim.org/mailman/listinfo/gem5-dev
