Re: [gem5-dev] Bug in x86 stack instructions

Jason Lowe-Power Wed, 25 Jan 2017 13:29:06 -0800

I'm leaning towards deleting the AddrSizeFlagBit. Gabe, can you shed some
light on the purpose of this flag? It's only ever used for masking
addresses in the TLB:
http://grok.gem5.org/xref/gem5/src/arch/x86/tlb.cc#318
http://grok.gem5.org/xref/gem5/src/arch/x86/tlb.cc#330


From what I can understand, it should be unnecessary to mask the address
here. In the instruction implementation, the address is already masked if
it is a 32-bit mode instruction:
http://grok.gem5.org/xref/gem5/src/arch/x86/isa/microops/ldstop.isa#433

It seems straightforward to add addressSize=ssz to the st instructions
where necessary. I'm confused as to why the TLB ever needs to know what the
address mode of the instruction is. Why/how would the TLB ever receive a
request with an address > 4GB for a 32-bit mode instruction?

Thanks again for the help!

Cheers,
Jason

On Wed, Jan 25, 2017 at 10:37 AM Steve Reinhardt <ste...@gmail.com> wrote:

> Thanks, Gabe! That's all really helpful. I appreciate you taking the time
> to respond.
>
> One concern I have is that the portion of the manual Jason quoted says the
> rule holds "For instructions that
> *implicitly* access the stack segment (SS)" (emphasis mine)... would you
> say then that instructions that *explicitly* choose SS as the segment
> (e.g., with a segment override prefix) would not follow this rule? If so,
> then it may not be correct to do things inside the instruction based solely
> on the fact that segment==SS. Or maybe that's a case that just can't occur
> so we don't have to worry about it?
>
> Also, would there be any harm in setting the AddrSizeFlagBit based on
> whether addressSize==32 rather than looking at the legacy.addr bit? If not,
> this seems like a simpler solution to me.
>
> Thanks again!
>
> Steve
>
>
> On Wed, Jan 25, 2017 at 12:41 AM, Gabe Black <gabebl...@google.com> wrote:
>
> > Forgive me if I'm repeating a bit of what you said, Steve, but I looked a
> > little more and there's more to fix.
> >
> > Also, if you look at x86/isa/microops/ldstop.isa, you'll find the python
> > classes which receive the parameters of the load and store microops and
> > translate them into C++ instantiations of the underlying microop classes.
> > In their __init__ methods, you'll see at the end where the
> AddrSizeFlagBit
> > is set based on whether or not the machInst.legacy.addr bit is set,
> > basically whether the address size prefix had been found when the
> > instruction was decoded. For the two classes involved (LdStOp and
> > BigLdStOp), I think you'll want to add a check which doesn't set
> > AddrSizeFlagBit if the segment is the stack segment.
> >
> > Additionally, you'll want to make sure the TLB knows to use the stack
> > segment size (as opposed to either the default or alternate address
> sizes)
> > when manipulating the segment offset. You could add a new ISA specific
> flag
> > for x86 memory requests (there seems to be room for exactly one more in
> > arch/x86/ldstflags.hh) and set it if the segment is the stack in those ld
> > and st ops, similar to how AddrSizeFlagBit is being set. Then in the TLB,
> > specifically where it's calculating logSize, you'll want to make it
> > recognize your new flag and use the stack size from the m5Reg (called
> just
> > m5Reg.stack, I think) instead of either of the address size values.
> >
> > This is a little gross because it means there's some calculation going on
> > when address translation is being done (frequently) when it could have
> been
> > done earlier during decode (less frequently). One reason I did things
> this
> > way (all speculation, I've long since forgot) is that it makes the
> decoded
> > instructions hold less state. The same bytes may not always refer to the
> > same sizes depending on the segmentation state in the machine, so if
> those
> > particular bytes are being decoded, the gem5 decode cache needs to either
> > consider that segmentation state when deciding what instruction to spit
> > out. If it didn't, it might return an instruction which statically says
> to
> > use 32 bit addresses (for instance), when now segmentation state says it
> > the same bytes should mean it should use 16 bit addresses. Instead, it
> canh
> > be factored in dynamically using external state (the m5Reg) to an
> > instruction that says which version of the sizes to use but doesn't care
> > what values those actually map to at that moment.
> >
> > Gabe
> >
> > On Wed, Jan 25, 2017 at 12:03 AM, Gabe Black <gabebl...@google.com>
> wrote:
> >
> > > Hi. I think you guys are generally interpreting this code correctly,
> but
> > > I'll throw in my two cents. If you look at the pseudo code here:
> > >
> > > http://www.felixcloutier.com/x86/CALL.html
> > >
> > > then it appears that the size of the return address you push onto the
> > > stack should be based on the operand size, ie. the data size. That
> would
> > > mean that the displacement for the store should be -env.dataSize like
> it
> > > is. By the same logic, the subtraction below it should also be
> > env.dataSize
> > > (symbolically dsz) and not ssz.
> > >
> > > You are also correct, I think, that since the instruction references
> the
> > > stack segment, it's address size should be set to the stack size when
> > doing
> > > memory operations. Adding addressSize=ssz to the st microop should do
> > that.
> > >
> > > According to that pseudo code, the size of the target IP is also the
> > > operand size, but that's the size the wrip microop will use by default
> > and
> > > so should be fine as written.
> > >
> > > I think if you make those two changes (change ssz to dsz in the sub,
> add
> > > addressSize=ssz to the st), then this macroop definition will be
> correct.
> > > It would be worthwhile to check the other definitions of CALL and make
> > sure
> > > they aren't similarly broken.
> > >
> > > Gabe
> > >
> > > On Tue, Jan 24, 2017 at 3:28 PM, Jason Lowe-Power <ja...@lowepower.com
> >
> > > wrote:
> > >
> > >> Thanks for helping me work this out.
> > >>
> > >> I got the binary working by completely removing the AddrSizeFlagBit.
> The
> > >> only place this bit is used is in the LdStOp (it's set to true if
> > >> legacy.addr is true) and then, in the TLB, if the AddrSizeFlagBit is
> set
> > >> it
> > >> truncates the address to 32-bits removing the upper-order bits.
> > >>
> > >> By removing this flag bit, and adding "addressSize=ssz" to the st
> > >> micro-op,
> > >> the binary is now working.
> > >>
> > >> My next question: How can I have any confidence that this doesn't
> break
> > >> other things? Especially when we're talking about 32-bit compatibility
> > >> mode. Other than the hello32 binary, are there other things I should
> > run?
> > >>
> > >> My intuition says this is an OK change. The addr field in the Request
> > >> should never have the upper-order 32 bits set if the instruction is a
> > >> 32-bit-mode instruction. The instruction implementation already takes
> > care
> > >> of this, as I found with this bug.
> > >>
> > >> Other thoughts? Does anyone know if it is definitely required to use
> > >> AddrSizeFlagBit to truncate the address in the TLB?
> > >>
> > >> If not, I'll post a patch for this tomorrow.
> > >>
> > >> Thanks,
> > >> Jason
> > >>
> > >> On Tue, Jan 24, 2017 at 12:19 PM Steve Reinhardt <ste...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hmm, seems like there's a little bit of an inconsistency in that the
> > >> > request is using the legacy.addr bit (which is set by the decoder
> when
> > >> it
> > >> > sees the address size override prefix [1]) directly, while the
> > >> legacy.addr
> > >> > bit is also used to calculate the addressSize value [2] but can be
> > >> > overridden (as we are doing). So if addressSize is overridden then
> > >> > AddrSizeFlagBit should no longer be set based on legacy.addr.
> > >> >
> > >> > Or another way of looking at it is that the same process of checking
> > >> > legacy.addr to override the address size is done in two places, once
> > in
> > >> the
> > >> > decoder [2] and again in the TLB [3] and it's not clear how to
> > suppress
> > >> the
> > >> > latter one.
> > >> >
> > >> > I suppose one gnarly way of doing it would be to go into the
> micro-op
> > >> > definition somewhere and clear the AddrSizeFlagBit out of memFlags
> if
> > >> > addressSize != env.addressSize (i.e., the address size was
> > overridden).
> > >> > There's probably a cleaner way, but that's the easiest path I can
> > think
> > >> of
> > >> > (if it even works).
> > >> >
> > >> > [1] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#195
> > >> > [2] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#390
> > >> > [3] http://grok.gem5.org/source/xref/gem5/src/arch/x86/tlb.cc#315
> > >> >
> > >> >
> > >> > On Tue, Jan 24, 2017 at 8:37 AM, Jason Lowe-Power <
> > ja...@lowepower.com>
> > >> > wrote:
> > >> >
> > >> > > Hi Steve,
> > >> > >
> > >> > > That was super helpful. I'm now a step closer to solving this!
> > >> > >
> > >> > > Your suggestion of =ssz, lead me to search for the uses of that in
> > >> x86,
> > >> > and
> > >> > > it turns out that almost all of other stack instructions have
> > >> > dataSize=ssz.
> > >> > > So, I added both dataSize=ssz and addressSize=ssz to the call
> > >> > instruction,
> > >> > > though I think only the addressSize is actually needed, but I'm
> not
> > >> > > certain.
> > >> > >
> > >> > > Now, the address is passed to the Request object correctly, but
> the
> > >> > program
> > >> > > still fails. The problem now is that the request object is getting
> > >> > > the AddrSizeFlagBit set to true, because machInst.legacy.addr is
> > true.
> > >> > > Thus, the TLB only uses the lower 32 bits of the 64-bit address.
> > >> > >
> > >> > > Any ideas on how to change the instruction's memFlags from the
> > >> macro-op
> > >> > > definition? They are set on
> > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> > >> > > x86/isa/microops/ldstop.isa#l334
> > >> > > .
> > >> > >
> > >> > > It would be nice if we could fix this in the decoder. I think the
> > >> logic
> > >> > > should be "if the address prefix is set and this is an implicit
> > stack
> > >> > > operation, ignore the address prefix". However, I'm not sure how
> to
> > >> tell
> > >> > if
> > >> > > the instruction is "an implicit stack operation" from the decoder.
> > >> > >
> > >> > > Thanks,
> > >> > > Jason
> > >> > >
> > >> > > On Tue, Jan 24, 2017 at 9:05 AM Steve Reinhardt <ste...@gmail.com
> >
> > >> > wrote:
> > >> > >
> > >> > > My recollection of how all this works is that the arguments to the
> > >> 'st'
> > >> > > micro-op get turned into arguments to a call to the StoreOp
> > >> constructor:
> > >> > >
> > >> > >    597 <http://repo.gem5.o, and that address doesn't
> > >> > > existrg/gem5/file/cd7f3a1dbf55/src/arch/x86/isa/microops/
> > >> ldstop.isa#l597
> > >> > > <http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> > >> > > x86/isa/microops/ldstop.isa#l597>
> > >> > > >
> > >> > >         class StoreOp(LdStOp):
> > >> > >    598 <
> > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> > >> > > x86/isa/microops/ldstop.isa#l598
> > >> > > >
> > >> > >             def __init__(self, data, segment, addr, disp = 0,
> > >> > >    599 <
> > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> > >> > > x86/isa/microops/ldstop.isa#l599
> > >> > > >
> > >> > >                     dataSize="env.dataSize",
> > >> > >    600 <
> > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> > >> > > x86/isa/microops/ldstop.isa#l600
> > >> > > >
> > >> > >                     addressSize="env.addressSize",
> > >> > >    601 <
> > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> > >> > > x86/isa/microops/ldstop.isa#l601
> > >> > > >
> > >> > >                     atCPL0=False, nonSpec=False):
> > >> > >
> > >> > > so the "-env.dataSize" you see is actually the displacement for
> the
> > >> > store,
> > >> > > not the dataSize or addressSize.  I think the problem is that the
> > >> > > addressSize is using the env,addressSize that's calculated the
> > >> "normal"
> > >> > > way, i.e., including the effects of the override prefix.
> > >> > >
> > >> > > Poking around some more, there's an 'ssz' symbol that maps to
> > >> > env.stackSize
> > >> > > [1] which gets used a lot in stack operations; for example, right
> > >> after
> > >> > the
> > >> > > store in CALL_NEAR_I, 'ssz' is subtracted off of the stack
> pointer.
> > >> The
> > >> > > calculation of env.stackSize seems to follow the rule you
> mentioned
> > >> about
> > >> > > being fixed at 64 bits in 64-bit mode [2, 3].
> > >> > >
> > >> > > So it might be sufficient to add ', addressSize=ssz' to the end of
> > the
> > >> > 'st'
> > >> > > micro-op. Oddly though I don't see addressSize being set on any
> > other
> > >> > stack
> > >> > > ops (just dataSize), so I wonder if this is a problem with other
> > stack
> > >> > > instructions or not. If so, it might be useful to have an override
> > >> > > hardwired in at some lower level to check if the segment is ss and
> > >> force
> > >> > > the address size to be stackSize (rather than adding this extra
> > >> parameter
> > >> > > in a dozen different places). I wouldn't know where best to do
> that
> > >> > though,
> > >> > > so the manual override seems best for now.
> > >> > >
> > >> > > Steve
> > >> > >
> > >> > > [1] http://grok.gem5.org/xref/gem5/src/arch/x86/isa/
> > microasm.isa#107
> > >> > > [2] http://grok.gem5.org/xref/gem5/src/arch/x86/isa.cc#91
> > >> > > [3] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#400
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Mon, Jan 23, 2017 at 4:04 PM, Jason Lowe-Power <
> > >> ja...@lowepower.com>
> > >> > > wrote:
> > >> > >
> > >> > > > To those of you with more x86 ISA implementation knowledge than
> I
> > >> have:
> > >> > > >
> > >> > > > I've been working through a bug one of our users found (thanks
> > >> > > Sanchayan!).
> > >> > > > It looks like current versions of ld use the 0x67 instruction
> > prefix
> > >> > > > (address size override) as an optimization instead of using a
> nop.
> > >> See
> > >> > > > https://www.sourceware.org/ml/binutils/2016-05/msg00323.html.
> > >> > > >
> > >> > > > This causes the call instruction to be decoded with with the
> > >> "address
> > >> > > size
> > >> > > > override prefix", which is correct, in a sense. From what I can
> > >> tell,
> > >> > > this
> > >> > > > is passed to the call instruction via "-env.dataSize" (see call
> > >> > > instruction
> > >> > > > implementation below).
> > >> > > >
> > >> > > > def macroop CALL_NEAR_I
> > >> > > > {
> > >> > > >     # Make the default data size of calls 64 bits in 64 bit mode
> > >> > > >     .adjust_env oszIn64Override
> > >> > > >     .function_call
> > >> > > >
> > >> > > >     limm t1, imm
> > >> > > >     rdip t7
> > >> > > >     # Check target of call
> > >> > > >     st t7, ss, [0, t0, rsp], "-env.dataSize"
> > >> > > >     subi rsp, rsp, ssz
> > >> > > >     wrip t7, t1
> > >> > > > };
> > >> > > >
> > >> > > > Now, the bug is, according to the x86 manual, "For instructions
> > that
> > >> > > > implicitly access the stack segment (SS), the address size for
> > stack
> > >> > > > accesses is determined by the D (default) bit in the
> stack-segment
> > >> > > > descriptor. In 64-bit mode, the D bit is ignored, and all stack
> > >> > > references
> > >> > > > have a 64-bit address size." See
> > >> > > > https://support.amd.com/TechDocs/24594.pdf page
> > >> > > > 9.
> > >> > > >
> > >> > > > Thoughts on how to fix this?
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Jason
> > >> > > > _______________________________________________
> > >> > > > gem5-dev mailing list
> > >> > > > gem5-dev@gem5.org
> > >> > > > http://m5sim.org/mailman/listinfo/gem5-dev
> > >> > > >
> > >> > > _______________________________________________
> > >> > > gem5-dev mailing list
> > >> > > gem5-dev@gem5.org
> > >> > > http://m5sim.org/mailman/listinfo/gem5-dev
> > >> > >
> > >> > > --
> > >> > >
> > >> > > Jason
> > >> > > _______________________________________________
> > >> > > gem5-dev mailing list
> > >> > > gem5-dev@gem5.org
> > >> > > http://m5sim.org/mailman/listinfo/gem5-dev
> > >> > >
> > >> > _______________________________________________
> > >> > gem5-dev mailing list
> > >> > gem5-dev@gem5.org
> > >> > http://m5sim.org/mailman/listinfo/gem5-dev
> > >> >
> > >> _______________________________________________
> > >> gem5-dev mailing list
> > >> gem5-dev@gem5.org
> > >> http://m5sim.org/mailman/listinfo/gem5-dev
> > >>
> > >
> > >
> > _______________________________________________
> > gem5-dev mailing list
> > gem5-dev@gem5.org
> > http://m5sim.org/mailman/listinfo/gem5-dev
> >
> _______________________________________________
> gem5-dev mailing list
> gem5-dev@gem5.org
> http://m5sim.org/mailman/listinfo/gem5-dev
>
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

Re: [gem5-dev] Bug in x86 stack instructions

Reply via email to