I believe what you say, it's what you hadn't said that I was wondering about
;-).

Steve

On Thu, Oct 27, 2011 at 11:23 PM, Gabe Black <[email protected]> wrote:

> I forgot to mention that while working on ARM, I did actually look at
> the assembly that was generated and gcc was moving things around in less
> than helpful ways. You're welcome to look at the assembly if you don't
> believe me :-). SPARC is pretty straightforward ISA description wise, so
> it shouldn't be too difficult to find the responsible code.
>
> Gabe
>
> On 10/27/11 11:30, Gabe Black wrote:
> > Exactly this was happening on ARM, and that's why there are all the
> > weird __asm__ statements in its instructions. There I had to also
> > specify variables as inputs and outputs from the __asm__ statements so
> > that other instructions would have to produce their value before or
> > consume their value after that point. Here I can't do that since (quite
> > reasonably) the code that sets the rounding mode is factored out into a
> > common blob, and I don't know what the necessary variables are. There's
> > supposed to be some way to prevent this sort of problem, but for gcc
> > it's not implemented. I forget exactly how that's supposed to work.
> >
> > Gabe
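
A minimal sketch of the operand trick described above for ARM, assuming GCC or Clang extended asm and a host whose <cfenv> defines FE_DOWNWARD. The names orderingPoint and addRoundedDown are hypothetical and are not taken from gem5. Listing a variable as both input and output of an empty asm makes the compiler treat it as modified at that point, so arithmetic on it cannot be hoisted above or sunk below the rounding-mode calls. This is not claimed to fix the SPARC case, where the affected variables are not known to the common code.

```cpp
#include <cassert>
#include <cfenv>

// Hypothetical helper: an empty asm that claims to read and write `value`.
// Anything that produces `value` must finish before this point, and anything
// that consumes it must start after this point.
static inline void orderingPoint(double &value)
{
    __asm__ __volatile__("" : "+m"(value) : : "memory");
}

// Hypothetical example: add two doubles with the host rounding mode
// temporarily set to round toward negative infinity.
double addRoundedDown(double a, double b)
{
    const int oldrnd = std::fegetround();

    const int rc = std::fesetround(FE_DOWNWARD);
    assert(rc == 0);  // fesetround returns 0 on success
    (void)rc;

    // The inputs are "rewritten" after the mode change, so the add below
    // depends on asm statements that follow the fesetround() call.
    orderingPoint(a);
    orderingPoint(b);

    double sum = a + b;

    // The result is consumed here, before the old mode is restored.
    orderingPoint(sum);

    std::fesetround(oldrnd);
    return sum;
}
```

With this in place, addRoundedDown(1.0, -1.0) yields -0 rather than +0, which is one quick way to check that the add really ran under the requested mode.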
> >
> > On 10/27/11 08:32, Steve Reinhardt wrote:
> >> Are you positive this is it?  It does sound very likely that this is the
> >> issue, but is there indisputable evidence, like you looked at the
> >> disassembly and you can see that things are scheduled in the wrong order?
> >> I'm asking because even though I agree that this seems likely to be the
> >> issue, it seems equally unlikely that gcc would reorder operations around
> >> function calls like m5_fesetround() (unless they're inlined), and the fact
> >> that the asm statements didn't help seems like further evidence that maybe
> >> we're not focusing on exactly the right place.
> >>
> >> Steve
> >>
> >> On Thu, Oct 27, 2011 at 12:35 AM, Gabe Black <[email protected]> wrote:
> >>
> >>> I'm convinced we've successfully identified the problem, but
> >>> unfortunately I added barriers liberally and it still failed.
> >>>
> >>> Gabe
> >>>
> >>>    int newrnd = M5_FE_TONEAREST;
> >>>    switch (Fsr<31:30>) {
> >>>      case 0: newrnd = M5_FE_TONEAREST; break;
> >>>      case 1: newrnd = M5_FE_TOWARDZERO; break;
> >>>      case 2: newrnd = M5_FE_UPWARD; break;
> >>>      case 3: newrnd = M5_FE_DOWNWARD; break;
> >>>    }
> >>>    __asm__ __volatile__ ("" ::: "memory");
> >>>    int oldrnd = m5_fegetround();
> >>>    __asm__ __volatile__ ("" ::: "memory");
> >>>    m5_fesetround(newrnd);
> >>>    __asm__ __volatile__ ("" ::: "memory");
> >>> """
> >>>
> >>>        fp_code += code
> >>>
> >>>
> >>>        fp_code += """
> >>>    __asm__ __volatile__ ("" ::: "memory");
> >>>    m5_fesetround(oldrnd);
> >>>    __asm__ __volatile__ ("" ::: "memory");
> >>> """
> >>>        fp_code = filterDoubles(fp_code)
> >>>        iop = InstObjParams(name, Name, 'SparcStaticInst', fp_code, flags)
> >>>        header_output = BasicDeclare.subst(iop)
> >>>        decoder_output = BasicConstructor.subst(iop)
> >>>        decode_block = BasicDecode.subst(iop)
> >>>        exec_output = BasicExecute.subst(iop)
> >>> }};
> >>>
> >>>
> >>> On 10/26/11 07:10, Steve Reinhardt wrote:
> >>>> I forgot to mention that I fired off a gem5.debug run before I went to bed
> >>>> last night, and it completed successfully.  So it does appear to be the
> >>>> optimizer.
> >>>>
> >>>> Steve
> >>>>
> >>>> On Wed, Oct 26, 2011 at 12:55 AM, Gabe Black <[email protected]> wrote:
> >>>>> On 10/25/11 22:28, Ali Saidi wrote:
> >>>>>> On Tue, 25 Oct 2011 11:53:29 -0700, Gabe Black <[email protected]>
> >>>>>> wrote:
> >>>>>>> On 10/25/11 07:46, Steve Reinhardt wrote:
> >>>>>>>> On Tue, Oct 25, 2011 at 2:30 AM, Gabe Black <[email protected]>
> >>>>>>>> wrote:
> >>>>>>>>> I'm currently building binutils for SPARC, so hopefully I can
> >>>>>>>>> disassemble some things and get a better idea of what's going on. It's
> >>>>>>>>> probably going to be really annoying to figure it out.
> >>>>>>>> If it's really just an FP rounding error, it might not be that
> >>>>>>>> hard... just look at the examples from the trace of where it's going
> >>>>>>>> wrong, figure out what the right answer is, and focus on those few
> >>>>>>>> instructions.  FP is pretty thoroughly specified by IEEE, so if it's
> >>>>>>>> not an outright compiler bug, maybe it's just some change in the
> >>>>>>>> default rounding settings or something.
> >>>>>>> Yeah, I think ISAs treat IEEE as a really good suggestion rather than a
> >>>>>>> standard. ARM isn't strictly conformant, and neither is x86. The default
> >>>>>>> rounding mode *is* standard, though, and I don't think it is adjusted in
> >>>>>>> SPARC as a result of execution. If it changed somehow (unless I'm
> >>>>>>> forgetting where SPARC does that) it's a fairly significant problem.
> >>>>>>> Whether instructions generate +/- 0 in various situations may depend on,
> >>>>>>> for instance, what order gcc decides to put the operands. I'm not sure
> >>>>>>> that it does, but there are all kinds of weird, subtle behaviors with
> >>>>>>> FP, and you can't just fix how add works if x86 picked the wrong thing.
> >>>>>>> Then you have to replace add, or semi-replace it by faking it out with
> >>>>>>> other FP operations. If we're running real x87 instructions (we
> >>>>>>> shouldn't be in 64 bit mode, but we still could) then those use 80 bit
> >>>>>>> operands internally. Where and when rounding takes place depends on when
> >>>>>>> those are moved in/out of the FPU, and will be different than true 64
> >>>>>>> bit operands. SSE based FP uses real 64 bit doubles, so that should
> >>>>>>> behave better. It should also be the default in 64 bit mode since the
> >>>>>>> compiler can assume some basic SSE support is present.
> >>>>>> The rounding mode in SPARC is controlled by bits 31:30 of the FSR. My
> >>>>>> guess is that this is actually the problem and gcc 4.5+ is doing some
> >>>>>> code motion that is moving the actual fp code around our setting of
> >>>>>> the rounding mode. Using one of the asm tricks to prevent code
> >>>>>> movement (an empty asm() is supposed to be a code barrier in
> >>>>>> gcc) might fix the problem. I don't have time to try it, but
> >>>>>> src/arch/sparc/isa/formats/basic.isa:145 looks like the right place.
> >>>>>> Also, trying to run the regression with m5.debug might show whether the
> >>>>>> optimizer is at fault.
> >>>>>>
> >>>>>> Ali
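
As a rough illustration of the mechanism described above, here is a sketch that maps FSR bits 31:30 to a host rounding mode and performs one add under it. It uses the same case mapping as the switch quoted earlier in the thread, but with the standard <cfenv> calls instead of the m5_ wrappers. The names roundingModeFromFsr and addUnderFsr are hypothetical, and the volatile temporaries are an assumption of this sketch (one way to keep the add between the two mode changes), not what the gem5 template does.

```cpp
#include <cassert>
#include <cfenv>
#include <cstdint>

// Hypothetical helper: decode the rounding direction field, FSR<31:30>.
int roundingModeFromFsr(uint64_t fsr)
{
    switch ((fsr >> 30) & 0x3) {
      case 0: return FE_TONEAREST;
      case 1: return FE_TOWARDZERO;
      case 2: return FE_UPWARD;
      default: return FE_DOWNWARD;  // case 3
    }
}

// Hypothetical example: add two doubles under the mode selected by `fsr`.
double addUnderFsr(uint64_t fsr, double a, double b)
{
    // Volatile accesses may not be moved across calls to opaque functions,
    // so the loads happen after the first fesetround() and the store
    // happens before the second one. The add is pinned in between.
    volatile double va = a;
    volatile double vb = b;
    volatile double vsum;

    const int oldrnd = std::fegetround();
    const int rc = std::fesetround(roundingModeFromFsr(fsr));
    assert(rc == 0);  // fesetround returns 0 on success
    (void)rc;

    vsum = va + vb;

    std::fesetround(oldrnd);
    return vsum;
}
```

Calling addUnderFsr(3ULL << 30, 1.0, -1.0) should give -0, while the same add with an FSR of 0 gives +0, so a sign check on the zero result shows whether the add ran under the intended mode.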
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> gem5-dev mailing list
> >>>>>> [email protected]
> >>>>>> http://m5sim.org/mailman/listinfo/gem5-dev
> >>>>> Ah, ok, so we do set the mode apparently. I'll try gem5.debug and also
> >>>>> look at that template and see what I can see. Thanks Ali!
> >>>>>
> >>>>> Gabe