Hi.

On 5 November 2013 02:32, David Kuehling <[email protected]> wrote:

> >>>>> "Graham" == Graham Whaley <[email protected]> writes:
>
> > On 4 November 2013 07:18, David Kuehling <[email protected]> wrote:
> >>>>>> "Andreas" == Andreas Barth <[email protected]> writes:
>
> >> * David Kuehling ([email protected]) [131103 16:00]:
> >>> Since then I've encountered system deadlocks every one-two days (the
> >>> system in question is running continuously 24/7).  Deadlock meaning,
> >>> that the system seems completely dead, even the num-lock LED cannot
> >>> be toggled any more (but fan is still spinning etc.).
> >>>
> >>> I never had stability problems on kernel 2.6.39.  I did have a
> >>> single deadlock when testing the debian-backports kernel package for
> >>> kernel 3.2.0 on debian squeeze (but I ran that kernel only for about
> >>> 2 days before upgrading to Wheezy).
>
> >> Can you try the old kernel, to see if it happens with the old kernel and new
> >> userland?
> [..]
> > As to the instabilities - it occurs to me that this may be connected to
> > the Loongson 2f 'issues', as documented at [1]. I believe (and btw, I
> > would love it if somebody could confirm and point me at any archive
> > links) that Debian-mips moved from MIPSI to MIPSII ISA when it went
> > from Squeeze to Wheezy. I'm wondering if maybe that change in code
> > layout may have brought one of these issues to the surface? Or maybe
> > that your kernel or RFS needs to be built with the options listed in
> > the link, and you've been "lucky" so far?  As far as I can find out,
> > there is no easy way (apart from maybe looking at the top of the chip
> > :-( ) to tell if you have a 2F01, 2F02 or 2F03 version of the 2F SoC,
> > and only the 2F03 is 'fixed' :-(  Anybody know for sure? I'm sure this
> > has probably been discussed before in the past.
>
> > Please feel free to educate me on whether any of these 2F fixes are
> > turned on by default for upstream Debian. I doubt they are? And sorry
> > if I've missed some subtlety here.
>
> Hi Graham,
>
> I had the same thought - the lockups certainly look similar to what I
> experienced when running a Linux kernel compiled from source without the
> Loongson2f instruction fixes enabled in the kernel config (that would
> indicate I have one of the older SoCs).
>
> Looking at the output of objdump -D libc.so, it looks to me like the
> correct "fixed" NOP sequence is used (shown by the disassembler as "move
> at,at", which is synonymous with "or at,at,zero").  So Debian userspace
> looks like it's loongson compatible.
>
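
As a rough way to check this over a whole disassembly rather than by eye,
here is a minimal Python sketch that counts the plain NOP word (0x00000000)
against the fixed "or at,at,zero" word (0x00200825) in objdump output. The
function name count_nops, the sample text, and the assumed
"address: word mnemonic" line layout are illustrative assumptions, not part
of any Debian tooling.

```python
import re
from collections import Counter

PLAIN_NOP = 0x00000000  # sll zero,zero,0: the ordinary MIPS NOP
FIXED_NOP = 0x00200825  # or at,at,zero: shown by objdump as "move at,at"

# Assumed line layout: "<address>:<ws><8 hex digits><ws><mnemonic>..."
LINE_RE = re.compile(r"^\s*[0-9a-fA-F]+:\s+([0-9a-fA-F]{8})\s+\S")


def count_nops(disassembly):
    """Count plain and fixed NOP words in objdump-style text."""
    counts = Counter()
    for line in disassembly.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # section headers, labels, blank lines
        word = int(match.group(1), 16)
        if word == PLAIN_NOP:
            counts["plain"] += 1
        elif word == FIXED_NOP:
            counts["fixed"] += 1
    return counts


# Hypothetical sample, only to show the expected input shape.
SAMPLE = """\
  400100:\t00200825 \tmove\tat,at
  400104:\t03e00008 \tjr\tra
  400108:\t00000000 \tnop
"""

if __name__ == "__main__":
    print(count_nops(SAMPLE))
```

Note that with objdump -D, zero words in data sections are counted as plain
NOPs as well, so a non-zero "plain" count is only a hint to look closer, not
proof of unfixed code.
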
> Running objdump -D on the 3.2 kernel image (that's a gzip compressed
> image, so I guess the code I see is only a small bootstrap sequence for
> ungzipping the rest) I can see the right NOP sequence plus the extra
> code in front of indirect jumps (e.g. function return statements).  This
> looks like it was compiled with -mfix-loongson2f-jump plus
> -mfix-loongson2f-nop.
>
> So far this looks good.  Hopefully we're not hitting new, undocumented
> CPU bugs here.
>
> After finally getting update-initrd to build a working image for my
> 2.6.39.4 kernel, I'm now back to running the same kernel I used for a long
> time with squeeze.  If no lockups happen by the end of the week,
> we'll have another data point.
>
Any luck with this, David? Did it lock up, or is it still running?

 Graham


>
> With 2.6.39.4 BTW I'm not using the loongson2f optimized libc from
> package libc6-loongson2f (only newer kernels seem to supply the right
> hwcap info for ld.so to choose the loongson2f optimized versions of
> libraries).  I'll have to run another test to see whether the loongson2f
> libc has anything to do WRT lockups.
>
> I noticed that going from linux 2.6.39 to 3.2, the process scheduling
> improved dramatically.  On 2.6.39 'nice' values seem to be ignored, and
> output of 'top' often looks wrong, like multiple processes using exactly
> the same CPU amount, without any variation.  Maybe newer loongson2f
> kernels changed to using a more accurate clock source for process CPU
> usage accounting.  These changes could also be a source of deadlocks.
> Hopefully I'll not have to bisect all linux versions before 3.2 to
> finally solve the issue.
>
> cheers,
>
> David
>
> --
> GnuPG public key: http://dvdkhlng.users.sourceforge.net/dk2.gpg
> Fingerprint: B63B 6AF2 4EEB F033 46F7  7F1D 935E 6F08 E457 205F
>
