Exactly this was happening on ARM, and that's why there are all the weird __asm__ statements in its instructions. There I also had to specify variables as inputs and outputs of the __asm__ statements, so that other instructions would have to produce their values before that point or consume them after it. Here I can't do that since (quite reasonably) the code that sets the rounding mode is factored out into a common blob, and I don't know what the necessary variables are. There's supposed to be some way to prevent this sort of problem, but gcc doesn't implement it. I forget exactly how that's supposed to work.
Gabe

On 10/27/11 08:32, Steve Reinhardt wrote:
> Are you positive this is it? It does sound very likely that this is the
> issue, but is there indisputable evidence, like you looked at the
> disassembly and you can see that things are scheduled in the wrong order?
> I'm asking because even though I agree that this seems likely to be the
> issue, it seems equally unlikely that gcc would reorder operations around
> function calls like m5_fesetround() (unless they're inlined), and the fact
> that the asm statements didn't help seems like further evidence that maybe
> we're not focusing on exactly the right place.
>
> Steve
>
> On Thu, Oct 27, 2011 at 12:35 AM, Gabe Black <[email protected]> wrote:
>
>> I'm convinced we've successfully identified the problem, but
>> unfortunately I added barriers liberally and it still failed.
>>
>> Gabe
>>
>> int newrnd = M5_FE_TONEAREST;
>> switch (Fsr<31:30>) {
>>     case 0: newrnd = M5_FE_TONEAREST; break;
>>     case 1: newrnd = M5_FE_TOWARDZERO; break;
>>     case 2: newrnd = M5_FE_UPWARD; break;
>>     case 3: newrnd = M5_FE_DOWNWARD; break;
>> }
>> __asm__ __volatile__ ("" ::: "memory");
>> int oldrnd = m5_fegetround();
>> __asm__ __volatile__ ("" ::: "memory");
>> m5_fesetround(newrnd);
>> __asm__ __volatile__ ("" ::: "memory");
>> """
>>
>> fp_code += code
>>
>>
>> fp_code += """
>> __asm__ __volatile__ ("" ::: "memory");
>> m5_fesetround(oldrnd);
>> __asm__ __volatile__ ("" ::: "memory");
>> """
>> fp_code = filterDoubles(fp_code)
>> iop = InstObjParams(name, Name, 'SparcStaticInst', fp_code, flags)
>> header_output = BasicDeclare.subst(iop)
>> decoder_output = BasicConstructor.subst(iop)
>> decode_block = BasicDecode.subst(iop)
>> exec_output = BasicExecute.subst(iop)
>> }};
>>
>>
>> On 10/26/11 07:10, Steve Reinhardt wrote:
>>> I forgot to mention that I fired off a gem5.debug run before I went to bed
>>> last night, and it completed successfully. So it does appear to be the
>>> optimizer.
>>>
>>> Steve
>>>
>>> On Wed, Oct 26, 2011 at 12:55 AM, Gabe Black <[email protected]> wrote:
>>>> On 10/25/11 22:28, Ali Saidi wrote:
>>>>> On Tue, 25 Oct 2011 11:53:29 -0700, Gabe Black <[email protected]>
>>>>> wrote:
>>>>>> On 10/25/11 07:46, Steve Reinhardt wrote:
>>>>>>> On Tue, Oct 25, 2011 at 2:30 AM, Gabe Black <[email protected]>
>>>>>>> wrote:
>>>>>>>> I'm currently building binutils for SPARC, so hopefully I can
>>>>>>>> disassemble some things and get a better idea of what's going on. It's
>>>>>>>> probably going to be really annoying to figure it out.
>>>>>>> If it's really just an FP rounding error, it might not be that
>>>>>>> hard... just look at the examples from the trace of where it's going
>>>>>>> wrong, figure out what the right answer is, and focus on those few
>>>>>>> instructions. FP is pretty thoroughly specified by IEEE, so if it's
>>>>>>> not an outright compiler bug, maybe it's just some change in the
>>>>>>> default rounding settings or something.
>>>>>> Yeah, I think ISAs treat IEEE as a really good suggestion rather than a
>>>>>> standard. ARM isn't strictly conformant, and neither is x86. The default
>>>>>> rounding mode *is* standard, though, and I don't think is adjusted in
>>>>>> SPARC as a result of execution. If it changed somehow (unless I'm
>>>>>> forgetting where SPARC does that) it's a fairly significant problem.
>>>>>> Whether instructions generate +/- 0 in various situations may depend on,
>>>>>> for instance, what order gcc decides to put the operands. I'm not sure
>>>>>> that it does, but there are all kinds of weird, subtle behaviors with
>>>>>> FP, and you can't just fix how add works if x86 picked the wrong thing.
>>>>>> Then you have to replace add, or semi-replace it by faking it out with
>>>>>> other FP operations. If we're running real x87 instructions (we
>>>>>> shouldn't be in 64 bit mode, but we still could) then those use 80 bit
>>>>>> operands internally.
>>>>>> Where and when rounding takes place depends on when
>>>>>> those are moved in/out of the FPU, and will be different than true 64
>>>>>> bit operands. SSE based FP uses real 64 bit doubles, so that should
>>>>>> behave better. It should also be the default in 64 bit mode since the
>>>>>> compiler can assume some basic SSE support is present.
>>>>> The rounding mode in SPARC is controlled by bits 31:30 of the FSR. My
>>>>> guess is that this is actually the problem and gcc 4.5+ is doing some
>>>>> code motion that is moving the actual fp code around our setting of
>>>>> the rounding mode. Using one of the asm tricks to prevent code
>>>>> movement (supposedly an empty asm() is supposed to be code barrier in
>>>>> gcc), might fix the problem. I don't have time to try it, but
>>>>> src/arch/sparc/isa/formats/basic.isa:145 looks like the right place.
>>>>> Also, trying to run the regression with m5.debug might see if the
>>>>> optimizer is at fault.
>>>>>
>>>>> Ali
>>>>>
>>>>> _______________________________________________
>>>>> gem5-dev mailing list
>>>>> [email protected]
>>>>> http://m5sim.org/mailman/listinfo/gem5-dev
>>>> Ah, ok, so we do set the mode apparently. I'll try gem5.debug and also
>>>> look at that template and see what I can see. Thanks Ali!
>>>>
>>>> Gabe
