Hard to tell... there are larger and larger differences after that point
that seem to be cascading from this one, but it takes a while before they
diverge completely.  I put the trace in /tmp/tracediff-8625.out on zizzer if
you want to take a look for yourself.
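
For anyone who wants to find the first mismatch between two saved traces without rerunning tracediff, here is a minimal Python sketch. It is not gem5's tracediff script: `first_divergence` is a hypothetical helper, and the two short lists are excerpts copied from the trace quoted further down in this thread.

```python
from itertools import zip_longest


def first_divergence(trace_a, trace_b):
    """Return (index, line_a, line_b) for the first differing line, or None.

    Hypothetical helper: compares two traces line by line. A missing line
    (one trace shorter than the other) counts as a difference.
    """
    for index, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return index, a, b
    return None


# Excerpts from the two runs shown in the quoted tracediff output.
first_run = [
    "931697014: global: FSR read as: 0xc0000000",
    "931697014: system.cpu.[tid:0]: Setting float reg 12 (12) bits to 0, 0.",
    "931697014: system.cpu.[tid:0]: Setting float reg 13 (13) bits to 0, 0.",
]
second_run = [
    "931697014: global: FSR read as: 0xc0000000",
    "931697014: system.cpu.[tid:0]: Setting float reg 12 (12) bits to 0x80000000, -0.",
    "931697014: system.cpu.[tid:0]: Setting float reg 13 (13) bits to 0, 0.",
]

hit = first_divergence(first_run, second_run)
if hit is not None:
    index, a, b = hit
    print(f"first difference at line {index}")
    print(f"  - {a}")
    print(f"  + {b}")
```

This only reports the earliest mismatch; later differences may or may not cascade from it, which is the hard part to judge from the trace alone.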

It seems odd that the solaris boot would be doing that much FP in any case,
but there does seem to be quite a bit of it.
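
Some background that may help when reading the quoted trace below, offered as a sketch rather than a diagnosis. The trace shows the FSR as 0xc0000000. In SPARC V9 the RD (rounding direction) field is FSR bits 31:30, so that value selects round toward -infinity. IEEE 754 says an exactly-zero sum or difference of operands with opposite effective signs is +0 in every rounding mode except round toward -infinity, where it is -0. That would make -0 a plausible fsubd result if the guest rounding mode is honored, and a one-ulp drift is the kind of difference seen in the later stdf data. Whether this is what actually goes wrong in the gcc 4.5 build is an assumption, not something established in this thread. The helper names are hypothetical, and the last step uses Python's decimal module only because it follows the same zero-sign rule without touching the host FPU's rounding mode.

```python
import math
import struct
from decimal import Decimal, ROUND_FLOOR, localcontext

# SPARC V9 FSR.RD encoding (bits 31:30).
ROUNDING_MODES = {0: "nearest", 1: "toward zero", 2: "toward +inf", 3: "toward -inf"}


def fsr_rounding_mode(fsr):
    """Extract and name the RD field from an FSR value (hypothetical helper)."""
    return ROUNDING_MODES[(fsr >> 30) & 0x3]


def double_from_bits(bits):
    """Reinterpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


print(fsr_rounding_mode(0xC0000000))  # toward -inf

# +0 and -0 compare equal, but the sign bit differs and can leak into
# later results (here made visible with copysign).
pos_zero = double_from_bits(0x0000000000000000)
neg_zero = double_from_bits(0x8000000000000000)
print(pos_zero == neg_zero)            # True
print(math.copysign(1.0, pos_zero), math.copysign(1.0, neg_zero))

# The two stdf data values from the first tracediff differ by exactly
# one unit in the last place (2**-16 at this magnitude).
stored_first = double_from_bits(0x423000000000197A)
stored_second = double_from_bits(0x4230000000001979)
ulp_gap = stored_first - stored_second
print(ulp_gap == 2.0 ** -16)           # True

# Zero-sign rule illustrated with decimal arithmetic (an analogy, not the
# binary FPU path): x - x is -0 only when rounding toward -infinity.
with localcontext() as ctx:
    ctx.rounding = ROUND_FLOOR
    floor_zero = Decimal(1) - Decimal(1)
default_zero = Decimal(1) - Decimal(1)
print(floor_zero.is_signed(), default_zero.is_signed())
```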

Steve


On Tue, Oct 25, 2011 at 12:17 AM, Gabe Black <[email protected]> wrote:

> An FP rounding error seems very plausible, but I'm not sure how +/- zero
> would make any difference. I'm skeptical that our FP implementation in
> SPARC is accurate enough to care much about such a small difference,
> although it is, of course, entirely possible it cascades from there into
> a larger difference which breaks things.
>
> I've gone back and improved the SPARC disassembly in the past, but it's
> still not perfect. The problem is the hierarchy that works for getting
> instructions to work doesn't necessarily mirror the one you need to get
> accurate disassembly. I think I went with operand position too (src 0 is
> for this, dest 0 is for that) and that doesn't always work very well.
> That's probably what's going wrong here.
>
> Is there a point after this where things diverge significantly? This
> could be just a blip of noise and the real problem happens a lot later.
> It's a *major* pain in the butt to write code that theoretically handles
> all the little FP weird cases and gets all the bits right when the host
> ISA has different rules for FP than the guest, and it's even harder to
> actually get the compiler to generate that code without moving things
> around and messing it all up. And glibc's FP support is wrong sometimes!
> What fun. I largely think it's farther on, and also partially am holding
> out hope we don't have to wade into FP soup.
>
> Gabe
>
> On 10/24/11 09:19, Steve Reinhardt wrote:
> > Great, thanks a lot.  I was able to build with
> > 'CC=/usr/bin/gcc-4.4 CXX=/usr/bin/g++-4.4' and get a binary that passes this
> > test on the head, so it's definitely the compiler.  I also ran tracediff and
> > it looks like it's an off-by-one thing with %fp; here's the first error:
> >
> > -931697720: system.cpu T0 : 0xff1aa5b8    :     stdf   %fp, [%f29 + -0x20] : MemWrite :  D=0x423000000000197a A=0xfeffa280
> > +931697720: system.cpu T0 : 0xff1aa5b8    :     stdf   %fp, [%f29 + -0x20] : MemWrite :  D=0x4230000000001979 A=0xfeffa280
> >
> > (The good gcc-4.4 version is second, so the '1979' is the correct value
> > here.)
> >
> > I ran one more tracediff with '--debug-flag=All --trace-start=931600000' to
> > see if anything else turns up sooner, and got this:
> >
> > @@ -1380553 +1380553 @@
> >  931697014: system.cpu.[tid:0]: Reading float reg 3 (3) bits as 0, 0.
> >  931697014: system.cpu.[tid:0]: Reading float reg 2 (2) bits as 0x3e300000, 0.171875.
> >  931697014: global: FSR read as: 0xc0000000
> > -931697014: system.cpu.[tid:0]: Setting float reg 12 (12) bits to 0, 0.
> > +931697014: system.cpu.[tid:0]: Setting float reg 12 (12) bits to 0x80000000, -0.
> >  931697014: system.cpu.[tid:0]: Setting float reg 13 (13) bits to 0, 0.
> >  931697014: global: FSR written with: 0xc0000000
> >  931697014: system.cpu + A16 T0 : 0xff1aa434    :       fsubd %f31,%f30,%f12    : FloatAdd :  D=0x00000000c0000000
> > @@ -1380951 +1380951 @@
> >  931697038: system.cpu.[tid:0]: Reading float reg 5 (5) bits as 0, 0.
> >  931697038: system.cpu.[tid:0]: Reading float reg 4 (4) bits as 0, 0.
> >  931697038: system.cpu.[tid:0]: Reading float reg 13 (13) bits as 0, 0.
> > -931697038: system.cpu.[tid:0]: Reading float reg 12 (12) bits as 0, 0.
> > +931697038: system.cpu.[tid:0]: Reading float reg 12 (12) bits as 0x80000000, -0.
> >  931697038: global: FSR read as: 0xc0000000
> >  931697038: system.cpu.[tid:0]: Setting float reg 18 (18) bits to 0, 0.
> >  931697038: system.cpu.[tid:0]: Setting float reg 19 (19) bits to 0, 0.
> > @@ -1381022 +1381022 @@
> >  931697042: system.cpu.[tid:0]: Reading float reg 10 (10) bits as 0x41300000, 11.
> >  931697042: global: FSR read as: 0xc0000000
> >  931697042: system.cpu.[tid:0]: Setting float reg 16 (16) bits to 0x41300000, 11.
> > -931697042: system.cpu.[tid:0]: Setting float reg 17 (17) bits to 0xe685, 8.26948e-41.
> > +931697042: system.cpu.[tid:0]: Setting float reg 17 (17) bits to 0xe684, 8.26934e-41.
> >  931697042: global: FSR written with: 0xc0000000
> >  931697042: system.cpu + A16 T0 : 0xff1aa4a4    :       faddd %f3,%f2,%f16     : FloatAdd :  D=0x00000000c0000000
> >  931697042: Event_18: AtomicSimpleCPU tick event scheduled @ 931697043
> >
> > Could it be some kind of FP rounding error?  It's not clear how that would
> > end up affecting %fp though.  (Actually, looking at this a little closer,
> > are we even disassembling that correctly?  Seems to me it should be 'stdf
> > %f29, [%fp + -0x20]'.)
> >
> > I won't have time to look into this further anytime soon, but I hope this
> > will give someone else (Gabe?) enough to go on to get this figured out.
> >
> > Thanks,
> >
> > Steve
> >
> >
> > On Sun, Oct 23, 2011 at 7:50 PM, Ali Saidi <[email protected]> wrote:
> >
> >> I've installed it.
> >>
> >> Ali
> >>
> >> On Oct 23, 2011, at 7:18 PM, Steve Reinhardt wrote:
> >>
> >>> This makes sense, since the time the regression started failing is
> >>> consistent with when gcc was upgraded on zizzer.
> >>>
> >>> I see there is a gcc-4.4 package available for ubuntu 11.04 (which zizzer is
> >>> running)... is there more to it than installing that package and recompiling
> >>> to get a workable binary to run tracediff with?
> >>>
> >>> I'd try myself but I've forgotten my zizzer password (again!) so I can't
> >>> sudo.  It's tough when you've had the same password for ten years then you
> >>> change it but don't use the new one much...
> >>>
> >>> Steve
> >>>
> >>> On Sun, Sep 25, 2011 at 1:14 PM, Ali Saidi <[email protected]> wrote:
> >>>
> >>>> Yes.. What Gabe said. With gcc 4.5 (the version zizzer now runs) I cannot find
> >>>> a version of the repository that passes sparc boot.  I'm pretty sure it's an
> >>>> annoying compiler issue, but there are some annoyances in figuring out where
> >>>> to look, as Gabe points out. If your stats changes work on everything else,
> >>>> I'm happy to see them committed while this issue goes on in the background.
> >>>> Thanks,
> >>>>
> >>>> Ali
> >>>>
> >>>> Sent from my ARM powered device
> >>>>
> >>>> On Sep 25, 2011, at 3:06 PM, Gabe Black <[email protected]> wrote:
> >>>>
> >>>>> We (Ali and I) have each looked at that before, and we think it depends
> >>>>> on the compiler version. Something changes when you have a new enough
> >>>>> gcc and then the behavior of SPARC changes. I think the new behavior is
> >>>>> broken and the old behavior is correct, but I'd have to look at it
> >>>>> again. I haven't looked into it farther than that yet because I'd want
> >>>>> to tracediff between versions built with different compilers. Since they
> >>>>> would need to find different versions of libraries and can't just run
> >>>>> from the same command line, it's logistically annoying.
> >>>>>
> >>>>> Gabe
> >>>>>
> >>>>> On 09/25/11 09:52, nathan binkert wrote:
> >>>>>> I'm trying to get my python stats changes into the tree, but it
> >>>>>> appears that one of the regression tests no longer works (zizzer
> >>>>>> agrees with me):
> >>>>>>
> >>>>>>
> >>>>>> SPARC_FS/tests/opt/long/80.solaris-boot/sparc/solaris/t1000-simple-atomic
> >>>>>>
> >>>>>> Gabe, I think you're the only one that's been messing with SPARC.  Can
> >>>>>> you take a look?
> >>>>>>
> >>>>>> Nate
> >>>>>> _______________________________________________
> >>>>>> gem5-dev mailing list
> >>>>>> [email protected]
> >>>>>> http://m5sim.org/mailman/listinfo/gem5-dev