Hard to tell... there are larger and larger differences after that point that seem to be cascading from this one, but it takes a while before they diverge completely. I put the trace in /tmp/tracediff-8625.out on zizzer if you want to take a look for yourself.
It seems odd that the solaris boot would be doing that much FP in any case, but there does seem to be quite a bit of it.

Steve

On Tue, Oct 25, 2011 at 12:17 AM, Gabe Black <[email protected]> wrote:

> An FP rounding error seems very plausible, but I'm not sure how +/- zero
> would make any difference. I'm skeptical that our FP implementation in
> SPARC is accurate enough to care much about such a small difference,
> although it is, of course, entirely possible it cascades from there into
> a larger difference which breaks things.
>
> I've gone back and improved the SPARC disassembly in the past, but it's
> still not perfect. The problem is the hierarchy that works for getting
> instructions to work doesn't necessarily mirror the one you need to get
> accurate disassembly. I think I went with operand position too (src 0 is
> for this, dest 0 is for that) and that doesn't always work very well.
> That's probably what's going wrong here.
>
> Is there a point after this where things diverge significantly? This
> could be just a blip of noise and the real problem happens a lot later.
> It's a *major* pain in the butt to write code that theoretically handles
> all the little FP weird cases and gets all the bits right when the host
> ISA has different rules for FP than the guest, and it's even harder to
> actually get the compiler to generate that code without moving things
> around and messing it all up. And glibc's FP support is wrong sometimes!
> What fun. I largely think it's farther on, and also partially am holding
> out hope we don't have to wade into FP soup.
>
> Gabe
>
> On 10/24/11 09:19, Steve Reinhardt wrote:
> > Great, thanks a lot. I was able to build with
> > 'CC=/usr/bin/gcc-4.4 CXX=/usr/bin/g++-4.4' and get a binary that passes this
> > test on the head, so it's definitely the compiler.
> > I also ran tracediff and
> > it looks like it's an off-by-one thing with %fp; here's the first error:
> >
> > -931697720: system.cpu T0 : 0xff1aa5b8 : stdf %fp, [%f29 + -0x20] : MemWrite : D=0x423000000000197a A=0xfeffa280
> > +931697720: system.cpu T0 : 0xff1aa5b8 : stdf %fp, [%f29 + -0x20] : MemWrite : D=0x4230000000001979 A=0xfeffa280
> >
> > (The good gcc-4.4 version is second, so the '1979' is the correct value
> > here.)
> >
> > I ran one more tracediff with '--debug-flag=All --trace-start=931600000' to
> > see if anything else turns up sooner, and got this:
> >
> > @@ -1380553 +1380553 @@
> > 931697014: system.cpu.[tid:0]: Reading float reg 3 (3) bits as 0, 0.
> > 931697014: system.cpu.[tid:0]: Reading float reg 2 (2) bits as 0x3e300000, 0.171875.
> > 931697014: global: FSR read as: 0xc0000000
> > -931697014: system.cpu.[tid:0]: Setting float reg 12 (12) bits to 0, 0.
> > +931697014: system.cpu.[tid:0]: Setting float reg 12 (12) bits to 0x80000000, -0.
> > 931697014: system.cpu.[tid:0]: Setting float reg 13 (13) bits to 0, 0.
> > 931697014: global: FSR written with: 0xc0000000
> > 931697014: system.cpu + A16 T0 : 0xff1aa434 : fsubd %f31,%f30,%f12 : FloatAdd : D=0x00000000c0000000
> > @@ -1380951 +1380951 @@
> > 931697038: system.cpu.[tid:0]: Reading float reg 5 (5) bits as 0, 0.
> > 931697038: system.cpu.[tid:0]: Reading float reg 4 (4) bits as 0, 0.
> > 931697038: system.cpu.[tid:0]: Reading float reg 13 (13) bits as 0, 0.
> > -931697038: system.cpu.[tid:0]: Reading float reg 12 (12) bits as 0, 0.
> > +931697038: system.cpu.[tid:0]: Reading float reg 12 (12) bits as 0x80000000, -0.
> > 931697038: global: FSR read as: 0xc0000000
> > 931697038: system.cpu.[tid:0]: Setting float reg 18 (18) bits to 0, 0.
> > 931697038: system.cpu.[tid:0]: Setting float reg 19 (19) bits to 0, 0.
> > @@ -1381022 +1381022 @@
> > 931697042: system.cpu.[tid:0]: Reading float reg 10 (10) bits as 0x41300000, 11.
> > 931697042: global: FSR read as: 0xc0000000
> > 931697042: system.cpu.[tid:0]: Setting float reg 16 (16) bits to 0x41300000, 11.
> > -931697042: system.cpu.[tid:0]: Setting float reg 17 (17) bits to 0xe685, 8.26948e-41.
> > +931697042: system.cpu.[tid:0]: Setting float reg 17 (17) bits to 0xe684, 8.26934e-41.
> > 931697042: global: FSR written with: 0xc0000000
> > 931697042: system.cpu + A16 T0 : 0xff1aa4a4 : faddd %f3,%f2,%f16 : FloatAdd : D=0x00000000c0000000
> > 931697042: Event_18: AtomicSimpleCPU tick event scheduled @ 931697043
> >
> > Could it be some kind of FP rounding error? It's not clear how that would
> > end up affecting %fp though. (Actually, looking at this a little closer,
> > are we even disassembling that correctly? Seems to me it should be 'stdf
> > %f29, [%fp + -0x20]'.)
> >
> > I won't have time to look into this further anytime soon, but I hope this
> > will give someone else (Gabe?) enough to go on to get this figured out.
> >
> > Thanks,
> >
> > Steve
> >
> >
> > On Sun, Oct 23, 2011 at 7:50 PM, Ali Saidi <[email protected]> wrote:
> >
> >> I've installed it.
> >>
> >> Ali
> >>
> >> On Oct 23, 2011, at 7:18 PM, Steve Reinhardt wrote:
> >>
> >>> This makes sense, since the time the regression started failing is
> >>> consistent with when gcc was upgraded on zizzer.
> >>>
> >>> I see there is a gcc-4.4 package available for ubuntu 11.04 (which zizzer is
> >>> running)... is there more to it than installing that package and recompiling
> >>> to get a workable binary to run tracediff with?
> >>>
> >>> I'd try myself but I've forgotten my zizzer password (again!) so I can't
> >>> sudo. It's tough when you've had the same password for ten years then you
> >>> change it but don't use the new one much...
> >>>
> >>> Steve
> >>>
> >>> On Sun, Sep 25, 2011 at 1:14 PM, Ali Saidi <[email protected]> wrote:
> >>>
> >>>> Yes.. What Gabe said.
> >>>> With gcc 4.5 (version zizzer now runs) I cannot find
> >>>> a version of the repository that passes sparc boot. I'm pretty sure it's an
> >>>> annoying compiler issue, but there are some annoyances in figuring out where
> >>>> to look, as Gabe points out. If your stats changes work on everything else,
> >>>> I'm happy to see them committed while this issue goes on in the background.
> >>>> Thanks,
> >>>>
> >>>> Ali
> >>>>
> >>>> Sent from my ARM powered device
> >>>>
> >>>> On Sep 25, 2011, at 3:06 PM, Gabe Black <[email protected]> wrote:
> >>>>
> >>>>> We (Ali and I) have each looked at that before, and we think it depends
> >>>>> on the compiler version. Something changes when you have a new enough
> >>>>> gcc and then the behavior of SPARC changes. I think the new behavior is
> >>>>> broken and the old behavior is correct, but I'd have to look at it
> >>>>> again. I haven't looked into it farther than that yet because I'd want
> >>>>> to tracediff between versions built with different compilers. Since they
> >>>>> would need to find different versions of libraries and can't just run
> >>>>> from the same command line, it's logistically annoying.
> >>>>>
> >>>>> Gabe
> >>>>>
> >>>>> On 09/25/11 09:52, nathan binkert wrote:
> >>>>>> I'm trying to get my python stats changes into the tree, but it
> >>>>>> appears that one of the regression tests no longer works (zizzer
> >>>>>> agrees with me):
> >>>>>>
> >>>>>> SPARC_FS/tests/opt/long/80.solaris-boot/sparc/solaris/t1000-simple-atomic
> >>>>>>
> >>>>>> Gabe, I think you're the only one that's been messing with SPARC. Can
> >>>>>> you take a look?
> >>>>>>
> >>>>>> Nate
> >>>>>> _______________________________________________
> >>>>>> gem5-dev mailing list
> >>>>>> [email protected]
> >>>>>> http://m5sim.org/mailman/listinfo/gem5-dev
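
For anyone who wants to decode the bit patterns quoted in the traces above, here is a rough Python sketch. It only reinterprets values that appear in the thread; the helper names are made up for illustration. It assumes the FSR rounding-direction field (RD) sits in bits 31:30 with 3 meaning round toward -infinity, as in SPARC V9, and that the '+' lines in the second tracediff come from the gcc-4.4 build, as in the first. IEEE 754 makes an exact-zero difference of equal operands -0 only when rounding toward -infinity, so the -0 in float reg 12 is at least consistent with one build honoring RD on the fsubd and the other not. That is a reading of the trace, not a confirmed diagnosis.

```python
import math
import struct


def f32_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def f64_from_bits(bits):
    """Reinterpret a 64-bit pattern as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def fsr_rounding_direction(fsr):
    """Extract RD, assumed here to be FSR bits 31:30 (SPARC V9 layout)."""
    return (fsr >> 30) & 0x3


# Assumed SPARC V9 encoding of the RD field.
RD_NAMES = {0: "nearest", 1: "toward zero", 2: "toward +inf", 3: "toward -inf"}

# Float reg 12 after the fsubd in the two builds.
plus_line = f32_from_bits(0x80000000)   # '+' line in the tracediff
minus_line = f32_from_bits(0x00000000)  # '-' line in the tracediff
print("zeros compare equal:", plus_line == minus_line)
print("signs:", math.copysign(1.0, plus_line), math.copysign(1.0, minus_line))

# Rounding direction in effect according to "FSR read as: 0xc0000000".
print("RD:", RD_NAMES[fsr_rounding_direction(0xC0000000)])

# Data values of the stdf in the first reported mismatch.
good = f64_from_bits(0x4230000000001979)
bad = f64_from_bits(0x423000000000197A)
print("difference between stored doubles:", bad - good)
```

The two stored doubles differ only in the last mantissa bit, which fits the picture of a small rounding difference accumulating rather than a corrupted %fp.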
