On 10/25/11 07:46, Steve Reinhardt wrote:
> On Tue, Oct 25, 2011 at 2:30 AM, Gabe Black <[email protected]> wrote:
>
>> Ah, ok, I was just being dumb. All the stdf-s and lddf-s are just moving
>> memory around, I think. That way you can load/store 64 bits at a time
>> and get it done with fewer instructions. I think those instructions
>> themselves can be ignored.
>
> If what you mean is that the actual problem is induced in an FP operation
> and not in the stdf/lddf itself, then yes, it looks like you're right. Note
> that in the detailed tracediff below, the original divergence is on the
> result of an fsubd. I think there are quite a few FP ops that are giving
> slightly different results before one shows up in the exec trace, and the
> reason appears to be that the data field output on FP op exec tracing is
> broken... maybe we're only properly reading one register from the register
> pair? So I think the only reason the error first shows up in a stdf in the
> exec trace is because that's the first instruction where the trace output
> isn't broken.

It's not that it's broken, it's that the instruction sets more than one
register. One or two will be the FP result, and one is for the FP
condition codes. Registers are ordered the way they are so the
disassembly can figure out (sort of) which registers to use for which
purpose, and as a result the condition codes tend to be the thing picked
for both integer and FP instructions. That's frequently not very useful,
but even if it were an FP dest reg, it wouldn't be both of them for
double precision instructions. If you want the real story on how
registers are being read/written, use the "Registers" trace flag. If
that's not the exact name, it's something similar. That will print out
which registers are being accessed, how, and what value is being passed
around.

> I created a /tmp/sparc-error directory on zizzer, moved the original
> tracediff in there, and also copied two new files: pre-error-trace.out and
> detailed-tracediff.out. Hope the names are self-explanatory. Now you have
> access to all the traces I generated.

Ok, thanks.

>> I'm also surprised that there would be much
>> floating point.
>
> Yea, and it's really weird stuff too...
> almost like they're running tests on the FPU or something:
>
> 931697674: system.cpu T0 : 0xff1aa4b0 : faddd %f21,%f20,%f20 : FloatAdd : D=0x00000000c0000000
> 931697675: system.cpu T0 : 0xff1aa4b4 : fsubd %f17,%f16,%f28 : FloatAdd : D=0x00000000c0000000
> 931697676: system.cpu T0 : 0xff1aa4b8 : faddd %f19,%f18,%f4 : FloatAdd : D=0x00000000c0000000
> 931697677: system.cpu T0 : 0xff1aa4bc : fsubd %f3,%f2,%f0 : FloatAdd : D=0x00000000c0000000
> 931697678: system.cpu T0 : 0xff1aa4c0 : faddd %f7,%f6,%f14 : FloatAdd : D=0x00000000c0000000
> 931697679: system.cpu T0 : 0xff1aa4c4 : fsubd %f5,%f4,%f30 : FloatAdd : D=0x00000000c0000000
> 931697680: system.cpu T0 : 0xff1aa4c8 : faddd %f11,%f10,%f6 : FloatAdd : D=0x00000000c0000000
> 931697681: system.cpu T0 : 0xff1aa4cc : fcmpd %f21,%f20,%fsr : FloatAdd : D=0x00000000c0000000
> 931697682: system.cpu T0 : 0xff1aa4d0 : faddd %f7,%f6,%f18 : FloatAdd : D=0x00000000c0000000
>
> Note also how the data field in the trace output is always the same, even
> though the detailed tracediff shows that these instructions aren't always
> producing the same values.

I think they're not setting any FP condition codes differently, really.
It actually could be some sort of boot time self test now that you
mention it.

>> I'm currently building binutils for SPARC, so hopefully I can
>> disassemble some things and get a better idea of what's going on. It's
>> probably going to be really annoying to figure it out.
>
> If it's really just an FP rounding error, it might not be that hard... just
> look at the examples from the trace of where it's going wrong, figure out
> what the right answer is, and focus on those few instructions. FP is pretty
> thoroughly specified by IEEE, so if it's not an outright compiler bug, maybe
> it's just some change in the default rounding settings or something.

Yeah, I think ISAs treat IEEE as a really good suggestion rather than a
standard. ARM isn't strictly conformant, and neither is x86.
The default rounding mode *is* standard, though, and I don't think it is
adjusted in SPARC as a result of execution. If it changed somehow (unless
I'm forgetting where SPARC does that), it's a fairly significant problem.
Whether instructions generate +/- 0 in various situations may depend on,
for instance, what order gcc decides to put the operands in. I'm not sure
that it does, but there are all kinds of weird, subtle behaviors with FP,
and you can't just fix how add works if x86 picked the wrong thing. Then
you have to replace add, or semi-replace it by faking it out with other
FP operations.

If we're running real x87 instructions (in 64 bit mode we shouldn't be,
but we still could), then those use 80 bit operands internally. Where and
when rounding takes place depends on when those are moved in/out of the
FPU, and will be different than with true 64 bit operands. SSE based FP
uses real 64 bit doubles, so that should behave better. It should also be
the default in 64 bit mode since the compiler can assume some basic SSE
support is present.

> Even if the FP rounding error isn't the source of the problem, it might be
> easiest to fix that and get it out of the way so we can see what the actual
> problem is.
>
> If you really want to know *why* the kernel is doing all this FP, then yes,
> you probably need to look at the source code.
>
> Steve
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev
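
To make the x87 vs. SSE point above concrete, here is a minimal Python
sketch. It is not gem5 code; it models rounding with exact fractions
rather than touching the host FPU. It assumes a 64 bit significand for
the x87 extended format (precision control left at extended) and a 53 bit
significand for a double. The helper round_to_bits and the operand values
are made up for illustration. The same exact sum ends up as a different
double depending on whether it passes through the 80 bit format first.

```python
from fractions import Fraction


def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round a positive exact value to `bits` significant binary digits.

    Ties go to even, matching the IEEE 754 default rounding mode.
    Hypothetical helper for illustration; ignores exponent range limits.
    """
    if x <= 0:
        raise ValueError("this sketch only handles positive values")
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    ulp = Fraction(2) ** (e - bits + 1)  # spacing of representable values near x
    return round(x / ulp) * ulp  # round() on a Fraction rounds half to even


# Illustrative operands, both exactly representable as 64 bit doubles.
a = Fraction(1)
b = Fraction(1, 2**53) + Fraction(1, 2**65)
exact_sum = a + b

# SSE style: the sum is rounded once, straight to a 53 bit significand.
direct = round_to_bits(exact_sum, 53)

# x87 style: rounded to a 64 bit significand inside the FPU, then rounded
# again to 53 bits when the value is stored to memory as a double.
double_rounded = round_to_bits(round_to_bits(exact_sum, 64), 53)

print(direct == double_rounded)  # False: the two results are one ulp apart
print(float(direct - double_rounded))
```

In this sketch the single rounding gives 1 + 2**-52, while the two-step
rounding lands on an exact tie in the second step and falls back to 1.0.
A difference of that size is the kind of last-bit divergence a tracediff
would show if the host ran the simulated FP ops through x87 instead of
SSE.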
