I'm leaning towards deleting the AddrSizeFlagBit. Gabe, can you shed some light on the purpose of this flag? It's only ever used for masking addresses in the TLB: http://grok.gem5.org/xref/gem5/src/arch/x86/tlb.cc#318 http://grok.gem5.org/xref/gem5/src/arch/x86/tlb.cc#330
From what I can understand, it should be unnecessary to mask the address here. In the instruction implementation, the address is already masked if it is a 32-bit mode instruction: http://grok.gem5.org/xref/gem5/src/arch/x86/isa/microops/ldstop.isa#433 It seems straightforward to add addressSize=ssz to the st instructions where necessary. I'm confused as to why the TLB ever needs to know what the address mode of the instruction is. Why/how would the TLB ever receive a request with an address > 4GB for a 32-bit mode instruction? Thanks again for the help! Cheers, Jason On Wed, Jan 25, 2017 at 10:37 AM Steve Reinhardt <ste...@gmail.com> wrote: > Thanks, Gabe! That's all really helpful. I appreciate you taking the time > to respond. > > One concern I have is that the portion of the manual Jason quoted says the > rule holds "For instructions that > *implicitly* access the stack segment (SS)" (emphasis mine)... would you > say then that instructions that *explicitly* choose SS as the segment > (e.g., with a segment override prefix) would not follow this rule? If so, > then it may not be correct to do things inside the instruction based solely > on the fact that segment==SS. Or maybe that's a case that just can't occur > so we don't have to worry about it? > > Also, would there be any harm in setting the AddrSizeFlagBit based on > whether addressSize==32 rather than looking at the legacy.addr bit? If not, > this seems like a simpler solution to me. > > Thanks again! > > Steve > > > On Wed, Jan 25, 2017 at 12:41 AM, Gabe Black <gabebl...@google.com> wrote: > > > Forgive me if I'm repeating a bit of what you said, Steve, but I looked a > > little more and there's more to fix. > > > > Also, if you look at x86/isa/microops/ldstop.isa, you'll find the python > > classes which receive the parameters of the load and store microops and > > translate them into C++ instantiations of the underlying microop classes. > > In their __init__ methods, you'll see at the end where the > AddrSizeFlagBit > > is set based on whether or not the machInst.legacy.addr bit is set, > > basically whether the address size prefix had been found when the > > instruction was decoded. For the two classes involved (LdStOp and > > BigLdStOp), I think you'll want to add a check which doesn't set > > AddrSizeFlagBit if the segment is the stack segment. > > > > Additionally, you'll want to make sure the TLB knows to use the stack > > segment size (as opposed to either the default or alternate address > sizes) > > when manipulating the segment offset. You could add a new ISA specific > flag > > for x86 memory requests (there seems to be room for exactly one more in > > arch/x86/ldstflags.hh) and set it if the segment is the stack in those ld > > and st ops, similar to how AddrSizeFlagBit is being set. Then in the TLB, > > specifically where it's calculating logSize, you'll want to make it > > recognize your new flag and use the stack size from the m5Reg (called > just > > m5Reg.stack, I think) instead of either of the address size values. > > > > This is a little gross because it means there's some calculation going on > > when address translation is being done (frequently) when it could have > been > > done earlier during decode (less frequently). One reason I did things > this > > way (all speculation, I've long since forgot) is that it makes the > decoded > > instructions hold less state. The same bytes may not always refer to the > > same sizes depending on the segmentation state in the machine, so if > those > > particular bytes are being decoded, the gem5 decode cache needs to either > > consider that segmentation state when deciding what instruction to spit > > out. If it didn't, it might return an instruction which statically says > to > > use 32 bit addresses (for instance), when now segmentation state says it > > the same bytes should mean it should use 16 bit addresses. Instead, it > canh > > be factored in dynamically using external state (the m5Reg) to an > > instruction that says which version of the sizes to use but doesn't care > > what values those actually map to at that moment. > > > > Gabe > > > > On Wed, Jan 25, 2017 at 12:03 AM, Gabe Black <gabebl...@google.com> > wrote: > > > > > Hi. I think you guys are generally interpreting this code correctly, > but > > > I'll throw in my two cents. If you look at the pseudo code here: > > > > > > http://www.felixcloutier.com/x86/CALL.html > > > > > > then it appears that the size of the return address you push onto the > > > stack should be based on the operand size, ie. the data size. That > would > > > mean that the displacement for the store should be -env.dataSize like > it > > > is. By the same logic, the subtraction below it should also be > > env.dataSize > > > (symbolically dsz) and not ssz. > > > > > > You are also correct, I think, that since the instruction references > the > > > stack segment, it's address size should be set to the stack size when > > doing > > > memory operations. Adding addressSize=ssz to the st microop should do > > that. > > > > > > According to that pseudo code, the size of the target IP is also the > > > operand size, but that's the size the wrip microop will use by default > > and > > > so should be fine as written. > > > > > > I think if you make those two changes (change ssz to dsz in the sub, > add > > > addressSize=ssz to the st), then this macroop definition will be > correct. > > > It would be worthwhile to check the other definitions of CALL and make > > sure > > > they aren't similarly broken. > > > > > > Gabe > > > > > > On Tue, Jan 24, 2017 at 3:28 PM, Jason Lowe-Power <ja...@lowepower.com > > > > > wrote: > > > > > >> Thanks for helping me work this out. > > >> > > >> I got the binary working by completely removing the AddrSizeFlagBit. > The > > >> only place this bit is used is in the LdStOp (it's set to true if > > >> legacy.addr is true) and then, in the TLB, if the AddrSizeFlagBit is > set > > >> it > > >> truncates the address to 32-bits removing the upper-order bits. > > >> > > >> By removing this flag bit, and adding "addressSize=ssz" to the st > > >> micro-op, > > >> the binary is now working. > > >> > > >> My next question: How can I have any confidence that this doesn't > break > > >> other things? Especially when we're talking about 32-bit compatibility > > >> mode. Other than the hello32 binary, are there other things I should > > run? > > >> > > >> My intuition says this is an OK change. The addr field in the Request > > >> should never have the upper-order 32 bits set if the instruction is a > > >> 32-bit-mode instruction. The instruction implementation already takes > > care > > >> of this, as I found with this bug. > > >> > > >> Other thoughts? Does anyone know if it is definitely required to use > > >> AddrSizeFlagBit to truncate the address in the TLB? > > >> > > >> If not, I'll post a patch for this tomorrow. > > >> > > >> Thanks, > > >> Jason > > >> > > >> On Tue, Jan 24, 2017 at 12:19 PM Steve Reinhardt <ste...@gmail.com> > > >> wrote: > > >> > > >> > Hmm, seems like there's a little bit of an inconsistency in that the > > >> > request is using the legacy.addr bit (which is set by the decoder > when > > >> it > > >> > sees the address size override prefix [1]) directly, while the > > >> legacy.addr > > >> > bit is also used to calculate the addressSize value [2] but can be > > >> > overridden (as we are doing). So if addressSize is overridden then > > >> > AddrSizeFlagBit should no longer be set based on legacy.addr. > > >> > > > >> > Or another way of looking at it is that the same process of checking > > >> > legacy.addr to override the address size is done in two places, once > > in > > >> the > > >> > decoder [2] and again in the TLB [3] and it's not clear how to > > suppress > > >> the > > >> > latter one. > > >> > > > >> > I suppose one gnarly way of doing it would be to go into the > micro-op > > >> > definition somewhere and clear the AddrSizeFlagBit out of memFlags > if > > >> > addressSize != env.addressSize (i.e., the address size was > > overridden). > > >> > There's probably a cleaner way, but that's the easiest path I can > > think > > >> of > > >> > (if it even works). > > >> > > > >> > [1] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#195 > > >> > [2] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#390 > > >> > [3] http://grok.gem5.org/source/xref/gem5/src/arch/x86/tlb.cc#315 > > >> > > > >> > > > >> > On Tue, Jan 24, 2017 at 8:37 AM, Jason Lowe-Power < > > ja...@lowepower.com> > > >> > wrote: > > >> > > > >> > > Hi Steve, > > >> > > > > >> > > That was super helpful. I'm now a step closer to solving this! > > >> > > > > >> > > Your suggestion of =ssz, lead me to search for the uses of that in > > >> x86, > > >> > and > > >> > > it turns out that almost all of other stack instructions have > > >> > dataSize=ssz. > > >> > > So, I added both dataSize=ssz and addressSize=ssz to the call > > >> > instruction, > > >> > > though I think only the addressSize is actually needed, but I'm > not > > >> > > certain. > > >> > > > > >> > > Now, the address is passed to the Request object correctly, but > the > > >> > program > > >> > > still fails. The problem now is that the request object is getting > > >> > > the AddrSizeFlagBit set to true, because machInst.legacy.addr is > > true. > > >> > > Thus, the TLB only uses the lower 32 bits of the 64-bit address. > > >> > > > > >> > > Any ideas on how to change the instruction's memFlags from the > > >> macro-op > > >> > > definition? They are set on > > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > > >> > > x86/isa/microops/ldstop.isa#l334 > > >> > > . > > >> > > > > >> > > It would be nice if we could fix this in the decoder. I think the > > >> logic > > >> > > should be "if the address prefix is set and this is an implicit > > stack > > >> > > operation, ignore the address prefix". However, I'm not sure how > to > > >> tell > > >> > if > > >> > > the instruction is "an implicit stack operation" from the decoder. > > >> > > > > >> > > Thanks, > > >> > > Jason > > >> > > > > >> > > On Tue, Jan 24, 2017 at 9:05 AM Steve Reinhardt <ste...@gmail.com > > > > >> > wrote: > > >> > > > > >> > > My recollection of how all this works is that the arguments to the > > >> 'st' > > >> > > micro-op get turned into arguments to a call to the StoreOp > > >> constructor: > > >> > > > > >> > > 597 <http://repo.gem5.o, and that address doesn't > > >> > > existrg/gem5/file/cd7f3a1dbf55/src/arch/x86/isa/microops/ > > >> ldstop.isa#l597 > > >> > > <http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > > >> > > x86/isa/microops/ldstop.isa#l597> > > >> > > > > > >> > > class StoreOp(LdStOp): > > >> > > 598 < > > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > > >> > > x86/isa/microops/ldstop.isa#l598 > > >> > > > > > >> > > def __init__(self, data, segment, addr, disp = 0, > > >> > > 599 < > > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > > >> > > x86/isa/microops/ldstop.isa#l599 > > >> > > > > > >> > > dataSize="env.dataSize", > > >> > > 600 < > > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > > >> > > x86/isa/microops/ldstop.isa#l600 > > >> > > > > > >> > > addressSize="env.addressSize", > > >> > > 601 < > > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > > >> > > x86/isa/microops/ldstop.isa#l601 > > >> > > > > > >> > > atCPL0=False, nonSpec=False): > > >> > > > > >> > > so the "-env.dataSize" you see is actually the displacement for > the > > >> > store, > > >> > > not the dataSize or addressSize. I think the problem is that the > > >> > > addressSize is using the env,addressSize that's calculated the > > >> "normal" > > >> > > way, i.e., including the effects of the override prefix. > > >> > > > > >> > > Poking around some more, there's an 'ssz' symbol that maps to > > >> > env.stackSize > > >> > > [1] which gets used a lot in stack operations; for example, right > > >> after > > >> > the > > >> > > store in CALL_NEAR_I, 'ssz' is subtracted off of the stack > pointer. > > >> The > > >> > > calculation of env.stackSize seems to follow the rule you > mentioned > > >> about > > >> > > being fixed at 64 bits in 64-bit mode [2, 3]. > > >> > > > > >> > > So it might be sufficient to add ', addressSize=ssz' to the end of > > the > > >> > 'st' > > >> > > micro-op. Oddly though I don't see addressSize being set on any > > other > > >> > stack > > >> > > ops (just dataSize), so I wonder if this is a problem with other > > stack > > >> > > instructions or not. If so, it might be useful to have an override > > >> > > hardwired in at some lower level to check if the segment is ss and > > >> force > > >> > > the address size to be stackSize (rather than adding this extra > > >> parameter > > >> > > in a dozen different places). I wouldn't know where best to do > that > > >> > though, > > >> > > so the manual override seems best for now. > > >> > > > > >> > > Steve > > >> > > > > >> > > [1] http://grok.gem5.org/xref/gem5/src/arch/x86/isa/ > > microasm.isa#107 > > >> > > [2] http://grok.gem5.org/xref/gem5/src/arch/x86/isa.cc#91 > > >> > > [3] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#400 > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > On Mon, Jan 23, 2017 at 4:04 PM, Jason Lowe-Power < > > >> ja...@lowepower.com> > > >> > > wrote: > > >> > > > > >> > > > To those of you with more x86 ISA implementation knowledge than > I > > >> have: > > >> > > > > > >> > > > I've been working through a bug one of our users found (thanks > > >> > > Sanchayan!). > > >> > > > It looks like current versions of ld use the 0x67 instruction > > prefix > > >> > > > (address size override) as an optimization instead of using a > nop. > > >> See > > >> > > > https://www.sourceware.org/ml/binutils/2016-05/msg00323.html. > > >> > > > > > >> > > > This causes the call instruction to be decoded with with the > > >> "address > > >> > > size > > >> > > > override prefix", which is correct, in a sense. From what I can > > >> tell, > > >> > > this > > >> > > > is passed to the call instruction via "-env.dataSize" (see call > > >> > > instruction > > >> > > > implementation below). > > >> > > > > > >> > > > def macroop CALL_NEAR_I > > >> > > > { > > >> > > > # Make the default data size of calls 64 bits in 64 bit mode > > >> > > > .adjust_env oszIn64Override > > >> > > > .function_call > > >> > > > > > >> > > > limm t1, imm > > >> > > > rdip t7 > > >> > > > # Check target of call > > >> > > > st t7, ss, [0, t0, rsp], "-env.dataSize" > > >> > > > subi rsp, rsp, ssz > > >> > > > wrip t7, t1 > > >> > > > }; > > >> > > > > > >> > > > Now, the bug is, according to the x86 manual, "For instructions > > that > > >> > > > implicitly access the stack segment (SS), the address size for > > stack > > >> > > > accesses is determined by the D (default) bit in the > stack-segment > > >> > > > descriptor. In 64-bit mode, the D bit is ignored, and all stack > > >> > > references > > >> > > > have a 64-bit address size." See > > >> > > > https://support.amd.com/TechDocs/24594.pdf page > > >> > > > 9. > > >> > > > > > >> > > > Thoughts on how to fix this? > > >> > > > > > >> > > > Thanks, > > >> > > > Jason > > >> > > > _______________________________________________ > > >> > > > gem5-dev mailing list > > >> > > > gem5-dev@gem5.org > > >> > > > http://m5sim.org/mailman/listinfo/gem5-dev > > >> > > > > > >> > > _______________________________________________ > > >> > > gem5-dev mailing list > > >> > > gem5-dev@gem5.org > > >> > > http://m5sim.org/mailman/listinfo/gem5-dev > > >> > > > > >> > > -- > > >> > > > > >> > > Jason > > >> > > _______________________________________________ > > >> > > gem5-dev mailing list > > >> > > gem5-dev@gem5.org > > >> > > http://m5sim.org/mailman/listinfo/gem5-dev > > >> > > > > >> > _______________________________________________ > > >> > gem5-dev mailing list > > >> > gem5-dev@gem5.org > > >> > http://m5sim.org/mailman/listinfo/gem5-dev > > >> > > > >> _______________________________________________ > > >> gem5-dev mailing list > > >> gem5-dev@gem5.org > > >> http://m5sim.org/mailman/listinfo/gem5-dev > > >> > > > > > > > > _______________________________________________ > > gem5-dev mailing list > > gem5-dev@gem5.org > > http://m5sim.org/mailman/listinfo/gem5-dev > > > _______________________________________________ > gem5-dev mailing list > gem5-dev@gem5.org > http://m5sim.org/mailman/listinfo/gem5-dev > _______________________________________________ gem5-dev mailing list gem5-dev@gem5.org http://m5sim.org/mailman/listinfo/gem5-dev