Quoting "Joseph S. Myers" <[email protected]>:
> That diff does not appear to relate to undefined behavior.  GCC considers
> these out-of-range conversions to yield an unspecified value, possibly
> raising an exception, as per Annex F, and does not take the liberty of
> optimizing on the basis of them being undefined when not in an IEEE mode.
Well, still, the test is wrong in possibly raising an exception there,
with no provisions to ignore the exception or catch any signal raised.
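To illustrate what I mean, here is a minimal sketch of one way a test could avoid the problem: only check conversions whose result is fully specified. This is not the test under discussion; fits_uint32 and check_conversion are hypothetical names made up for the example, and skipping out-of-range inputs is just one possible provision (ignoring the exception or catching the signal would be others).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: true when (uint32_t)x has a fully specified result,
   i.e. x lies in the open interval (-1, 2^32).  False for NaN, because
   every comparison involving NaN is false.  */
static int fits_uint32(double x)
{
    return x > -1.0 && x < 4294967296.0;
}

/* Hypothetical test helper: checks the conversion only for in-range inputs.
   Out-of-range inputs are skipped, so no exception or signal can be
   triggered by the test itself.  */
static void check_conversion(double x, uint32_t expected)
{
    if (!fits_uint32(x))
        return;                 /* unspecified result: nothing to check */
    assert((uint32_t)x == expected);
}
```

With this, check_conversion(1e10, 0) silently does nothing, while check_conversion(1.5, 1) performs a real check.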
For the ARCompact, in order to test the floating point emulation better,
I had (they are still there in #if 0 /* DEBUG */ blocks) small wrappers
for each function to evaluate it once with the hand-optimized version,
and once with fp-bit.c, and abort on getting different values.
Now, fp-bit generally tries to yield some value that the programmer thought
might mean something, whereas the hand-optimized version treats computations
of unspecified values as irrelevant.
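Roughly, such a comparison wrapper can be sketched in C as below. This is not the actual ARCompact debug code: fixunsdfsi_reference and fixunsdfsi_fast are hypothetical stand-ins written for this example (the first saturates, the second just lets out-of-range inputs fall through the bit manipulation), and the sketch assumes a 64-bit IEEE double.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the generic routine: yields a "meaningful"
   value (0 or UINT32_MAX) for inputs outside the uint32_t range.  */
static uint32_t fixunsdfsi_reference(double x)
{
    if (!(x >= 1.0))
        return 0;                       /* negatives, small values, NaN */
    if (x >= 4294967296.0)
        return UINT32_MAX;              /* too large: saturate */
    return (uint32_t)x;
}

/* Hypothetical stand-in for the hand-optimized routine: correct for
   in-range inputs, arbitrary for anything else (the sign is ignored).  */
static uint32_t fixunsdfsi_fast(double x)
{
    uint64_t bits;
    assert(sizeof bits == sizeof x);    /* assumption: 64-bit double */
    memcpy(&bits, &x, sizeof bits);
    int exp = (int)((bits >> 52) & 0x7ff) - 1023;          /* unbiased */
    uint64_t mant = (bits & ((1ULL << 52) - 1)) | (1ULL << 52);
    if (exp < 0)
        return 0;                       /* |x| < 1 */
    return (uint32_t)(mant >> ((52 - exp) & 63));
}

/* The wrapper: evaluate both versions, abort if they disagree.  */
static uint32_t fixunsdfsi_checked(double x)
{
    uint32_t fast = fixunsdfsi_fast(x);
    uint32_t ref = fixunsdfsi_reference(x);
    if (fast != ref)
        abort();
    return fast;
}
```

For in-range inputs the two stand-ins agree; for something like -2.0 they do not, so the wrapper aborts even though neither result is wrong.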
Considering:
GLOBAL(fixunsdfsi):
mov.w LOCAL(x413),r1 ! bias + 20
mov DBL0H,r0
shll DBL0H
mov.l LOCAL(mask),r3
mov #-21,r2
shld r2,DBL0H ! SH4-200 will start this insn in a new cycle
bt/s LOCAL(ret0)
sub r1,DBL0H
cmp/pl DBL0H ! SH4-200 will start this insn in a new cycle
and r3,r0
bf/s LOCAL(ignore_low)
addc r3,r0 ! uses T == 1; sets implicit 1
mov #11,r2
shld DBL0H,r0 ! SH4-200 will start this insn in a new cycle
cmp/gt r2,DBL0H
add #-32,DBL0H
bt LOCAL(retmax)
shld DBL0H,DBL0L
rts
or DBL0L,r0
and:
__fixunsdfsi:
bbit0 DBL0H,30,.Lret0or1
lsr r2,DBL0H,20
bmsk_s DBL0H,DBL0H,19
sub_s r2,r2,19; 0x3ff+20-0x400
neg_s r3,r2
btst_s r3,10
bset_s DBL0H,DBL0H,20
#ifdef __LITTLE_ENDIAN__
mov.ne DBL0L,DBL0H
asl DBL0H,DBL0H,r2
#else
asl.eq DBL0H,DBL0H,r2
lsr.ne DBL0H,DBL0H,r3
#endif
lsr DBL0L,DBL0L,r3
j_s.d [blink]
add.eq r0,r0,r1
.Lret0:
j_s.d [blink]
mov_l r0,0
.Lret0or1:
add_s DBL0H,DBL0H,0x100000
lsr_s DBL0H,DBL0H,30
j_s.d [blink]
bmsk_l r0,DBL0H,0
You can see that an SH4-300 can perform software floating point
fixunsdfsi in ten cycles, and the SH4-400 (SH4-200 sans FPU)
and ARC700 in twelve.
Adding any code in order to compute nice, fluffy values for
unspecified results would cause a significant performance degradation.