On Thu, Apr 02, 2009 at 03:24:32PM +0200, Nils Faerber wrote:
> I am not that sure about the threads - if I remember correctly also the
> Debian Etch rootfs which did not use the TLS extension also sigilled. It
> may be that threads simply trigger the bug but are not the real cause.

Hmmm, yes.

> > 2) We have variant that works (cache turned off). Great, we have
> > to detect that, log it and reboot.

I've been thinking about that, and I think we cannot take for granted
that we have a variant that certainly works. Let me explain:

Say the problem is a certain codepath that's interruptible, but is unsafe
to interrupt in some way. With caches turned off, the system might be so
slow that the codepath is never interrupted, because other software is not
running yet. In that case the bug would only be masked, not gone.

So I've got two kinds of problems to think of:

1) A codepath that gets interrupted somehow. The duration of that
   codepath (the race window) makes it susceptible to something unclean
   which causes a SIGILL. The duration is influenced either by IRQs or
   by software scheduling.

2) A certain piece of code that somehow, at some time, does not flush
   caches when it should have.

Somehow 1) seems to be the logical one. It doesn't happen very often, but
it always happens to certain applications.

On a sidenote: Maybe I should compare the initialization registers between
the 2.4.20 kernel and the 2.6.24 kernel. We could be tracing a kernel bug,
while all the time the memory refresh timing is off, which would only
introduce errors when memory is accessed at full speed, something like
that.

--
.signature not found
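
A rough sketch of the register comparison mentioned in the sidenote, not a tested tool: it assumes the initialization registers of each kernel have somehow been dumped as text lines of the form `name hexvalue`. The dump format, the register names (REG_A and so on) and the values are all made up for illustration; nothing here comes from real hardware.

```python
def parse_dump(text):
    """Parse lines of the form '<name> <hex value>' into a dict."""
    regs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, value = line.split()
        regs[name] = int(value, 16)
    return regs


def diff_regs(old, new):
    """Return {name: (old_value, new_value)} for registers that differ.

    A register missing from one dump is reported with None on that side.
    """
    diffs = {}
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name), new.get(name)
        if a != b:
            diffs[name] = (a, b)
    return diffs


def fmt(value):
    """Format a register value, or mark it as missing from the dump."""
    return "missing" if value is None else f"0x{value:08x}"


# Hypothetical dumps: names and values are placeholders, not real data.
DUMP_2_4_20 = """
REG_A 0x00000010
REG_B 0x000000ff
"""
DUMP_2_6_24 = """
REG_A 0x00000010
REG_B 0x0000007f
REG_C 0x00000001
"""

if __name__ == "__main__":
    changes = diff_regs(parse_dump(DUMP_2_4_20), parse_dump(DUMP_2_6_24))
    for name, (a, b) in changes.items():
        print(f"{name}: 2.4.20={fmt(a)} 2.6.24={fmt(b)}")
```

Only registers that differ, or that exist in just one of the dumps, are printed, so any timing-related setting that changed between the two kernels would stand out.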