Exactly this was happening on ARM, and that's why there are all those
weird __asm__ statements in its instructions. There I also had to
list variables as inputs and outputs of the __asm__ statements, so
that other instructions have to produce their values before that point
or consume them after it. Here I can't do that since (quite
reasonably) the code that sets the rounding mode is factored out into a
common blob, and I don't know what the necessary variables are. There's
supposed to be some way to prevent this sort of problem, but gcc
doesn't implement it. I forget exactly how that's supposed to work.

Gabe

On 10/27/11 08:32, Steve Reinhardt wrote:
> Are you positive this is it?  It does sound very likely that this is the
> issue, but is there indisputable evidence, like you looked at the
> disassembly and you can see that things are scheduled in the wrong order?
>  I'm asking because even though I agree that this seems likely to be the
> issue, it seems equally unlikely that gcc would reorder operations around
> function calls like m5_fesetround() (unless they're inlined), and the fact
> that the asm statements didn't help seems like further evidence that maybe
> we're not focusing on exactly the right place.
>
> Steve
>
> On Thu, Oct 27, 2011 at 12:35 AM, Gabe Black <[email protected]> wrote:
>
>> I'm convinced we've successfully identified the problem, but
>> unfortunately I added barriers liberally and it still failed.
>>
>> Gabe
>>
>>    int newrnd = M5_FE_TONEAREST;
>>    switch (Fsr<31:30>) {
>>      case 0: newrnd = M5_FE_TONEAREST; break;
>>      case 1: newrnd = M5_FE_TOWARDZERO; break;
>>      case 2: newrnd = M5_FE_UPWARD; break;
>>      case 3: newrnd = M5_FE_DOWNWARD; break;
>>    }
>>    __asm__ __volatile__ ("" ::: "memory");
>>    int oldrnd = m5_fegetround();
>>    __asm__ __volatile__ ("" ::: "memory");
>>    m5_fesetround(newrnd);
>>    __asm__ __volatile__ ("" ::: "memory");
>> """
>>
>>        fp_code += code
>>
>>
>>        fp_code += """
>>    __asm__ __volatile__ ("" ::: "memory");
>>    m5_fesetround(oldrnd);
>>    __asm__ __volatile__ ("" ::: "memory");
>> """
>>        fp_code = filterDoubles(fp_code)
>>        iop = InstObjParams(name, Name, 'SparcStaticInst', fp_code, flags)
>>        header_output = BasicDeclare.subst(iop)
>>        decoder_output = BasicConstructor.subst(iop)
>>        decode_block = BasicDecode.subst(iop)
>>        exec_output = BasicExecute.subst(iop)
>> }};
>>
>>
>> On 10/26/11 07:10, Steve Reinhardt wrote:
>>> I forgot to mention that I fired off a gem5.debug run before I went to
>>> bed last night, and it completed successfully.  So it does appear to be
>>> the optimizer.
>>>
>>> Steve
>>>
>>> On Wed, Oct 26, 2011 at 12:55 AM, Gabe Black <[email protected]> wrote:
>>>> On 10/25/11 22:28, Ali Saidi wrote:
>>>>> On Tue, 25 Oct 2011 11:53:29 -0700, Gabe Black <[email protected]> wrote:
>>>>>> On 10/25/11 07:46, Steve Reinhardt wrote:
>>>>>>> On Tue, Oct 25, 2011 at 2:30 AM, Gabe Black <[email protected]> wrote:
>>>>>>>> I'm currently building binutils for SPARC, so hopefully I can
>>>>>>>> disassemble some things and get a better idea of what's going on.
>>>>>>>> It's probably going to be really annoying to figure it out.
>>>>>>> If it's really just an FP rounding error, it might not be that
>>>>>>> hard... just look at the examples from the trace of where it's going
>>>>>>> wrong, figure out what the right answer is, and focus on those few
>>>>>>> instructions.  FP is pretty thoroughly specified by IEEE, so if it's
>>>>>>> not an outright compiler bug, maybe it's just some change in the
>>>>>>> default rounding settings or something.
>>>>>> Yeah, I think ISAs treat IEEE as a really good suggestion rather than
>>>>>> a standard. ARM isn't strictly conformant, and neither is x86. The
>>>>>> default rounding mode *is* standard, though, and I don't think it is
>>>>>> adjusted in SPARC as a result of execution. If it changed somehow
>>>>>> (unless I'm forgetting where SPARC does that) it's a fairly
>>>>>> significant problem. Whether instructions generate +/- 0 in various
>>>>>> situations may depend on, for instance, what order gcc decides to put
>>>>>> the operands in. I'm not sure that it does, but there are all kinds of
>>>>>> weird, subtle behaviors with FP, and you can't just fix how add works
>>>>>> if x86 picked the wrong thing. Then you have to replace add, or
>>>>>> semi-replace it by faking it out with other FP operations. If we're
>>>>>> running real x87 instructions (we shouldn't be in 64 bit mode, but we
>>>>>> still could) then those use 80 bit operands internally. Where and when
>>>>>> rounding takes place depends on when those are moved in/out of the
>>>>>> FPU, and will be different from true 64 bit operands. SSE based FP
>>>>>> uses real 64 bit doubles, so that should behave better. It should also
>>>>>> be the default in 64 bit mode since the compiler can assume some basic
>>>>>> SSE support is present.
>>>>> The rounding mode in SPARC is controlled by bits 31:30 of the FSR. My
>>>>> guess is that this is actually the problem and gcc 4.5+ is doing some
>>>>> code motion that is moving the actual fp code around our setting of
>>>>> the rounding mode. Using one of the asm tricks to prevent code
>>>>> movement (an empty asm() is supposed to act as a code barrier in
>>>>> gcc) might fix the problem. I don't have time to try it, but
>>>>> src/arch/sparc/isa/formats/basic.isa:145 looks like the right place.
>>>>> Also, running the regression with m5.debug might show whether the
>>>>> optimizer is at fault.
>>>>>
>>>>> Ali
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> gem5-dev mailing list
>>>>> [email protected]
>>>>> http://m5sim.org/mailman/listinfo/gem5-dev
>>>> Ah, ok, so we do set the mode apparently. I'll try gem5.debug and also
>>>> look at that template and see what I can see. Thanks Ali!
>>>>
>>>> Gabe
>>>>
>>
