On Sat, Oct 29, 2011 at 7:09 PM, Gabriel Michael Black <
[email protected]> wrote:

> That sounds a bit like what I did with the asm blocks in ARM, except that
> it would modify that function and actually do that call. We'd also need to
> have single and double versions even though the parameter isn't used.


I'm not sure you would need two versions... it seems like if you defined
only a double version, but passed in a single arg and assigned the result
to a float var, the value would get silently converted to double and back
without any side effects (since logically it would just append a bunch of
zeros to the mantissa and then strip them back off, and depending on the
host ISA, it might not do anything at all).
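
A minimal sketch of that round trip, in case it helps check the claim. The names `pass_through`, `round_trip_is_exact`, and `check_round_trips` are hypothetical and not part of gem5. The sketch assumes the host uses IEEE 754 single and double formats. Signaling NaNs are left out on purpose: a format conversion may quiet them, so that is the one case worth checking separately.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a double-only helper that hands its FP
 * argument back unchanged. */
static double pass_through(double val)
{
    return val;
}

/* Returns 1 if a float survives float -> double -> float bit for bit. */
int round_trip_is_exact(float in)
{
    /* The argument is widened at the call, the result narrowed here. */
    float out = (float)pass_through(in);
    return memcmp(&in, &out, sizeof in) == 0;
}

/* Checks a few representative values: ones with no short binary
 * representation, a signed zero, and values at the edges of the range. */
void check_round_trips(void)
{
    const float samples[] = {
        0.1f, 1.0f / 3.0f, -0.0f,
        3.4028235e38f,   /* largest finite float */
        1.17549435e-38f, /* smallest normal float */
        1.4e-45f         /* smallest subnormal float */
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(round_trip_is_exact(samples[i]));
}
```

Calling `check_round_trips()` from a `main` is enough to exercise it. The comparison uses `memcmp` rather than `==` so that the sign of zero is checked as well.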


> This is the sort of thing I'm talking about from ARM.
>
> m5_fesetround(newrnd);
> __asm__ __volatile__ ("" : "=m" (Frs1s) : "m" (Frs1s));
> __asm__ __volatile__ ("" : "=m" (Frs2s) : "m" (Frs2s));
>
> Frds = Frs1s + Frs2s;
> __asm__ __volatile__ ("" : "=m" (Frds) : "m" (Frds));
> m5_fesetround(oldrnd);
>
> gcc is obligated to use the values of Frs1s and Frs2s "returned" by
> the first two asm blocks, and it's obligated to pass the result as a
> parameter to the third asm block. Those constraints pinch it in the middle
> and force the operation to fall inside the m5_fesetround calls.
>

It's a similar idea, but it seems to me that since it doesn't involve the
m5_fesetround call directly, and the compiler might be able to figure out
that there's no way m5_fesetround could legally get a pointer to the memory
locations holding the register values, it still wouldn't be obligated to
order the function calls with respect to the FP operation.  In contrast, as
long as the compiler doesn't have any visibility into the body of the
m5_fesetround calls to realize they're not doing anything to the FP arg, it
will have no choice but to strictly order those calls wrt the FP operation.


> One thing I just thought of is that I'm not completely sure that gcc will
> leave those variables in the same place when they're used for inputs and
> outputs. Maybe it expects the asm to move the input to the output even if
> it doesn't do anything else? Note that while they *look* like the same
> variable, there's nothing (that I know of) that requires gcc to make that
> name refer to the same storage all the time, just the same value. There's
> syntax to tell it to use particular output as an input too (or the other
> way around?), and that may make this sort of thing less of an issue. I have
> no good reason to think there's actually a problem here, but hypothetically
> it could be yet another problem with playing these sorts of games.
>

Yeah, I can see where the asm statement might need to be a "mov" from the
source to the dest to cover the case where they're not the same location.

Of course, if this asm approach fundamentally doesn't solve the problem,
then the question is moot.
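
For the same-storage concern specifically, here is a small sketch using the "+" constraint modifier from GCC's extended asm, which declares a single read-write operand instead of a separate input and output. `barrier_float` and `add_pinned` are hypothetical names. This only addresses whether input and output share a location; it says nothing about whether the asm approach orders the operation relative to the rounding-mode calls.

```c
#include <assert.h>

/* Empty asm that claims to read and modify x in memory. With "+m" there
 * is one operand, so no "mov" from an input to an output is needed. */
float barrier_float(float x)
{
    __asm__ __volatile__("" : "+m"(x));
    return x;
}

/* The ARM sequence from the thread, rewritten with "+m" operands and
 * without the rounding-mode calls. */
float add_pinned(float a, float b)
{
    __asm__ __volatile__("" : "+m"(a));
    __asm__ __volatile__("" : "+m"(b));

    float sum = a + b;
    __asm__ __volatile__("" : "+m"(sum));
    return sum;
}
```

Since the asm bodies are empty, both functions return the ordinary result, for example `add_pinned(1.5f, 2.25f)` gives `3.75f`. This needs a compiler that accepts GCC-style extended asm.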

On Sat, Oct 29, 2011 at 7:18 PM, Gabriel Michael Black <
[email protected]> wrote:

> Even this isn't foolproof, though. Theoretically there's nothing to
> prevent gcc from doing the same calculation twice, once inside the asm
> blobs and once outside. It may assume the values are the same and use the
> wrong one as the result.


I don't understand what you're saying here, but if we're just agreeing that
the asm thing probably isn't sufficient, there's no need to explain further
:-).


> That -frounding-math would prevent that, I think, since it prevents gcc
> from assuming the rounding mode is always the same.


My interpretation is that -frounding-math would solve all our problems if
only it were implemented correctly.  Given that it's not implemented
correctly, can we count on it to do anything for us?  Or am I
misinterpreting that bugzilla link you sent?

Steve
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
