The bug is tagged "future" upstream. AFAIK it was already known at the
time I reported it, though it mostly affected Windows users.

The meta bug is:
http://bugzilla.mozilla.org/show_bug.cgi?id=109286

A special case where it happened:
https://bugzilla.mozilla.org/show_bug.cgi?id=224487

This is also related to the kernel: the big security announcement of
this summer was about signal handler interruptions preventing the
kernel from restoring the correct FPU state before closing the process.
Another fix, for threads this time, is coming in 2.6.11 (it is in
2.6.10-ac and 11-mm), but it still does not fix my crash.
https://bugzilla.mozilla.org/show_bug.cgi?id=136100
makes me think it could still be a kernel bug. I also found a patch for
an FPU problem in 2.4 which at first sight is not in 2.6 (it may be the
one that fixed the above bug). I'll test it soon.


I don't think it is a gcc bug in any way. Quite the opposite: gcc has a
"future" bug too, asking it to stop reinitializing the FPU control
register between each float operation (which is pretty slow).
And according to the IEEE standard document the Debian gcc maintainer
pointed me to (URL in the last paragraph about the general problem),
even those FPU reinitialisations, be it from gcc asm or from Mozilla
checking its state whenever it can, are only hacks. The worst case I
found is that even if the variable is declared as float, if the
intermediate operations on it do not require it to pass through the
stack, the computation is done at the FPU's precision (double) and only
the result is cast to a float. The "fix" used for the Windows build
forces this by setting the optimisation level to 0 in the jsdtsrc
Makefile. But it may not be set in the Makefiles of all the libs using
this algorithm (which may still explain a few random segfaults).
The gcc parameter to fix it is -ffloat-store: it forces floats to pass
through the stack between operations, thus rounding the value each
time.
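
To illustrate the idea of rounding every intermediate result by pushing
it through memory, here is a minimal sketch. It is not the Mozilla code
nor my patch: the names force_float and sum_rounded are made up, and
using a volatile temporary is only one assumed way to force the store.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: push a value through memory so that it is
 * rounded to float precision, whatever the FPU registers hold. */
float force_float(float x)
{
    volatile float tmp = x;  /* volatile forces a real store and reload */
    return tmp;
}

/* Sum an array, rounding every intermediate result to float. */
float sum_rounded(const float *v, size_t n)
{
    float acc = 0.0f;

    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        acc = force_float(acc + v[i]);
    return acc;
}
```

With the input {16777216.0f, 1.0f, 1.0f} and the default round-to-nearest
mode, the sum stays at 16777216.0f, because 2^24 + 1 does not fit in a
float mantissa. A register-only computation at higher precision could
give 16777218 instead.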

There is a new ifdef in this algorithm:
http://lxr.mozilla.org/seamonkey/source/js/src/jsdtoa.c#2379
#ifdef Check_FLT_ROUNDS
            /* If FLT_ROUNDS == 2, L will usually be high by 1 */
This should avoid the rounding problem (if the precision is too high, it
more or less rounds the value).
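
For reference, FLT_ROUNDS is a standard macro from <float.h>. The sketch
below only shows what its values mean according to the C standard; it is
not taken from jsdtoa.c, and the helper names are hypothetical. Note
that some compilers report a constant value regardless of the mode
actually set at run time.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: describe a FLT_ROUNDS value. */
const char *rounding_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "toward zero";
    case 1:  return "to nearest";
    case 2:  return "toward +infinity";
    case 3:  return "toward -infinity";
    default: return "indeterminable";  /* the standard uses -1 for this */
    }
}

/* Print the rounding mode reported by the compiler. */
void print_rounding_mode(void)
{
    printf("FLT_ROUNDS = %d (%s)\n",
           (int)FLT_ROUNDS, rounding_mode_name((int)FLT_ROUNDS));
}
```

FLT_ROUNDS == 2 is the "toward +infinity" case, which matches the
comment saying L will usually be high by 1.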

I would not have called it Check*, as it rounds the value without any
warning, but it does the job I did in my patch with a temp variable
(which I now know fixes the issue by forcing the operands to pass
through the stack between computations, thus preventing register-only
computation at high precision, cf. the bug).

Can this bug be tagged upstream? I won't post updates on Debian, as it
is an upstream bug.

There is a hack used for Windows users:
https://bugzilla.mozilla.org/show_bug.cgi?id=140852
On my Linux box, the Firefox 1+ non-installer build from w.m.o
segfaults too, but Mozilla trunk does not. biesi from #mozilla on IRC
told me it could be related to Firefox being compiled with more
optimisation; maybe the Debian packages are too.
That bug report also says that on Linux the gcc -ffloat-store trick
does the same as -O0. (There does not seem to be such a flag for
Windows, so they used /O0 for the jsdtsrc lib.)



To finish with this: it does not only affect mozilla-browser but also
nspr (the algorithm I extracted in my previous post is there too). It
affects all Gecko-based browsers (galeon, firefox, ...) and evolution2
via the libnspr4 library. That's all, as far as I know.

The final patch seems far off. There are patches that reset the FPU
before each float operation, others that hardcode optimisation level 0
in Makefiles, and some assumptions are far-fetched ("switching the FPU
to a 53 bit mantesa aka IEEE compliant.", which it is not: the IEEE
standard defines several mantissa widths). The default seems to be 64
bits, judging from:
http://webster.cs.ucr.edu/AoA/Linux/HTML/RealArithmetic.html
http://www.srware.com/linux_numerics.txt
and the doc pointed to by the Debian gcc maintainers:
http://bugs.debian.org/216616

and:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14708
To end the story: maybe nobody has the full picture :/
Luckily Google and all those open BTSes are there. This may be a gcc
bug, a libc bug, a proc/mb bug, a kernel bug or a Mozilla bug ... maybe
it's just IEEE 754 that has been f*** up from day one :)


AFAIK all those patches avoid fixing the float operations so that they
cope with any precision, and instead play tricks to keep the precision
at a 53-bit mantissa, be it on OS/2, Windows or Linux (but I don't
understand why it fails only on my proc/mb/kernel...; Linux uses a
64-bit mantissa by default, so it should always fail!). I guess some
hacks somewhere avoid the error in most cases, but may fail in corner
cases.
There is a new C function in C99, lrint, which rounds a float
efficiently; I hope it may put an end to the mess.
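
To see which mantissa widths a given compiler and platform actually use,
the standard <float.h> macros can be printed. This is only a sketch to
help compare machines; the function names are made up, and it reports
what the compiler declares, not the run-time state of the FPU control
register.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: how many mantissa bits long double has beyond
 * double (11 when long double is the x87 64-bit mantissa format). */
int extra_mantissa_bits(void)
{
    return LDBL_MANT_DIG - DBL_MANT_DIG;
}

/* Print the mantissa widths and the evaluation method. */
void print_fp_precision(void)
{
    printf("float mantissa:       %d bits\n", FLT_MANT_DIG);
    printf("double mantissa:      %d bits\n", DBL_MANT_DIG);
    printf("long double mantissa: %d bits\n", LDBL_MANT_DIG);
#ifdef FLT_EVAL_METHOD
    /* 0: evaluate in the declared type, 1: at least double,
     * 2: everything in long double, -1: indeterminable */
    printf("FLT_EVAL_METHOD:      %d\n", (int)FLT_EVAL_METHOD);
#endif
}
```

A FLT_EVAL_METHOD of 2 means intermediate results are kept in long
double, which is the register-only, higher-precision situation
described above.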


Ciao
Alban


PS: Java VMs are affected too :) maybe th
sablevm.org/bugs/1
gcc.gnu.org/ml/java-prs/2004-q2/msg00227.html 


PS2: sorry for any typos, I am still new to Mozilla. It's not really a
piece of cake .)
That's a lot of things to test. This was mostly to keep you informed
that the issue is resolvable, though hard to reproduce (maybe someone
with an Athlon Thunderbird 1.4 and a VIA motherboard can check), unless
optimisations were introduced in Mozilla a few days before my first
report (another thing to check).
I will also try compiling with gcc 3.4, but I have only found float
issues in previous gcc versions for ARM or Alpha. Maybe it can fix some
Mozilla bugs on those platforms.


