Re: 2.6.22-rc6 spurious hangs

2007-07-01 Thread Thomas Sattler

>>> Thomas, any chance you could try the patch below?
>> I'm still testing but I couldn't break it until now.
> Great, thanks a lot Thomas!
The box is still running without a problem,
it seems the bug is fixed.

Thanks a lot,
Thomas

-- 
keep mailinglists in english, feel free to send PM in german
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 2.6.22-rc6 spurious hangs

2007-07-01 Thread Thomas Sattler

> Thomas, any chance you could try the patch below? It is very, very stupid,
> it was done without any understanding of this code, and of course it is
> completely untested. I doubt very much it is correct, and even if it is
> correct it is definitely not good. It would be great if Dmitry can take a 
> look.

I'm still testing but I couldn't break it until now.
And I didn't find any drawbacks yet.

Thomas

-- 
keep mailinglists in english, feel free to send PM in german

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Thomas Sattler

>> Jun 28 19:23:03 pearl cinergyt2_query_rc+0x0/0x2e9 [cinergyT2]
> 
> cinergyt2_query_rc() hangs. I'll try to look tomorrow, but I know nothing
> about drivers/media/dvb/.

Does this mean the problem is in the cinergyt2 driver? I'm having similar
problems with another box but with different hardware. While my laptop is
used as a test system, the other one is used as a production TV recorder.
I hoped we could trace the bug on the test system and fix the production
one at the same time. :-/

The other box ("silver") is a desktop, which has two Hauppauge Nova-T DVB-T
PCI cards and one (analog) Hauppauge WinTV PVR-350. Silver only hangs if
the (digital) recording process has too much priority. (Silver is running
2.6.21.5-cfs-v17 +squashfs +ivtv.)

As I wanted to give as much priority to the recording process as possible,
I first ran dvbd as SCHED_RR. This hung the box quite often, sometimes
after an uptime of several minutes, sometimes after two weeks.

I switched to -ck and ran dvbd as SCHED_ISO, which worked without *any*
problem for about 18 months. As -ck is discontinued, I switched to CFS and
the box hung again (twice until I understood why) when dvbd was running at
nice -15.

ATM dvbd runs with nice -12, but yesterday, during an rsync transfer of
several >4G files, a recording was broken. 29 seconds of the recorded
stream are lost because the system load was at 5 for about three hours.

Perhaps the 29 missing seconds are caused not by too little CPU time but by
the heavy IO of rsync. But on the other hand, dvbd is also running at IO
realtime prio 4 (ionice) while rsync runs as IO normal.
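
For illustration only, the priority setup described above (a nice level for
CPU time plus an ionice realtime class for IO) could be assembled into one
launch command as sketched here. The helper name build_launch_command is
hypothetical, and the sketch assumes the standard nice and ionice command-line
tools; the mail does not say how dvbd was actually started.

```python
import shlex


def build_launch_command(program, nice_level, io_class, io_prio):
    """Return an argv list that starts `program` under ionice and nice.

    io_class uses ionice's numbering: 1 = realtime, 2 = best-effort.
    (The idle class takes no priority level, so it is left out here.)
    io_prio is the level within the class: 0 is highest, 7 is lowest.
    """
    if not -20 <= nice_level <= 19:
        raise ValueError("nice level must be between -20 and 19")
    if io_class not in (1, 2):
        raise ValueError("ionice class must be 1 (realtime) or 2 (best-effort)")
    if not 0 <= io_prio <= 7:
        raise ValueError("ionice priority must be between 0 and 7")
    return ["ionice", "-c", str(io_class), "-n", str(io_prio),
            "nice", "-n", str(nice_level), program]


# Values taken from the mail: nice -12, IO realtime class, priority 4.
dvbd_cmd = build_launch_command("dvbd", -12, 1, 4)
print(shlex.join(dvbd_cmd))
```

Negative nice levels and the realtime IO class normally require root, so such
a command would have to be run with the matching privileges.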

Any hints?
Thomas

-- 
keep mailinglists in english, feel free to send PM in german

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler

> Could you also show the result of sysrq-T ?
I was so happy that I could trigger it that fast ...
... that I forgot to press Alt-Sysrq-t before reboot.
:-(

But, I could trigger it again. :-)

This time I can offer:

 - Debug output from Oleg's patch (11x, every 30s)
 - Alt-Sysrq-t (3x, about 30s between them)

There is no lockdep stuff but lockdep must have
been running. It's enabled and did not fire
before the bug was triggered.

The logfile is attached.
(yes it is, I checked twice)

Thomas



messages.gz
Description: application/gzip

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler

Here is the logfile.

Thomas

-- 
keep mailinglists in english, feel free to send PM in german


messages.gz
Description: application/gzip

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler

>>>> As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The
>>>> corresponding part of my syslogs is attached, as well as my kernel config.
>>> Could you try the patch below? It dumps some info when flush_workqueue()
>>> hangs.
>> I'm compiling a patched kernel right now. As I wrote in my former mail the
>> whole thing is not easy to trigger. So it can take some time to get the info.
> 
> Forgot to say, if you manage to trigger the hang, please wait a couple of
> minutes to collect more info from flush_wait().

Seems today is my lucky day: I triggered it in just a few minutes.

The logfile is attached.

Thomas

-- 
keep mailinglists in english, feel free to send PM in german

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler

>> As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The
>> corresponding part of my syslogs is attached, as well as my kernel config.
> 
> Could you try the patch below? It dumps some info when flush_workqueue()
> hangs.

I'm compiling a patched kernel right now. As I wrote in my former mail, the
whole thing is not easy to trigger. So it can take some time to get the info.

Thanks so far,
Thomas

-- 
keep mailinglists in english, feel free to send PM in german

2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler

Hi there ...

I'm observing rare hangs with linux 2.6. I can't tell when exactly it
happened the first time, I think somewhere around 2.6.16 or 2.6.17. I
see it about once or twice a month, with absolutely nothing in the logs.
So far I asked for help:

- in the -ck list

Mon Sep 4 10:22:06 EST 2006, [ck] ck-patches seem to break DVB-T drivers
(see http://bhhdoa.org.au/pipermail/ck/2006-September/thread.html#6385)

- in the linux-dvb list

Wed Sep 6 19:02:29 CEST 2006, [linux-dvb] driver problems when using
ck-patchset
(http://www.linuxtv.org/pipermail/linux-dvb/2006-September/thread.html#12649)

- in the DaLUG (German, currently no archive) 14.09.2006

But nobody could help me so far.

Here is what I do:

I was running different kernels with different patchsets. It happened in
the past on -ck kernels (staircase), vanilla scheduler and cfs. As far
as I can remember the following patches were always applied: squashfs
and vesa-tng.

Currently I'm running 2.6.22rc6 with cfs-v18, vesa-tng and an
XFS-lockdep patch:

http://people.redhat.com/mingo/cfs-scheduler/sched-cfs-v2.6.22-rc6-v18.patch
http://dev.gentoo.org/~spock/projects/vesafb-tng/archive/vesafb-tng-1.0-rc2-2.6.20-rc2.patch
see http://marc.info/?l=linux-kernel&m=118286232709378&w=2

I also installed these kernel modules via gentoo portage:

ati-drivers-8.37.6-r1
fuse-2.6.4-r1
kqemu-1.3.0_pre11
truecrypt-4.3

kqemu and truecrypt weren't loaded, but ati-drivers and fuse were.

The box I talk about is an IBM T41p with 1.7GHz Pentium M and 512MB RAM.
The distribution in use is gentoo, quite up to date. Attached to the box
is a USB 2.0 DVB-T receiver (Cinergy T², Terratec).

In rare cases the keyboard stops working when the T² stops streaming DVB
to the box. It happens when I record the stream to disk as well as when
I stream it to mplayer.

If end of streaming is caused by a keypress, 'q' or 'enter' on mplayer,
that key gets stuck. It's repeated until I reboot the box.

If the recording was scheduled and stops by itself, no more keys are
recognized. The keyboard is dead, both the laptop's own and the attached
USB keyboard. Magic-Sys-Keys are still working.

I can still use the mouse to move windows around, start new xterms via
icewm's panel or copy and paste single characters from an xterm to other
xterms.

I can also close most of the open windows, for example firefox and most
xterms. I cannot close an xterm which is started as 'xterm -e top' by
icewm or a vncviewer. Both windows stay open but lose their content.

If a root shell is open I can enter 'reboot' or 'halt' but most of the
time this doesn't reboot or halt. I get the message for an upcoming
shutdown in all xterms but the box doesn't come down.

The system load continuously increases, but top shows no reason why.

Ingo Molnar told me to enable CONFIG_PROVE_LOCKING but xfs triggers it
long before the box hangs. I tested the patch mentioned above but it was
triggered by xfs again, see [1], and I didn't reboot between this and the
last hang.
[1] http://marc.info/?l=linux-kernel&m=118295294529681&w=2

As Ingo told me, I ran 'echo t > /proc/sysrq-trigger' this time. The
corresponding part of my syslogs is attached, as well as my kernel config.
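
As a rough sketch, the same task dump can be requested from a script instead
of the shell. The function name request_task_dump is hypothetical; the only
thing taken from the mail is that the character 't' is written to
/proc/sysrq-trigger. The path is a parameter so the function can be tried
against an ordinary file.

```python
def request_task_dump(trigger_path="/proc/sysrq-trigger"):
    """Write 't' to the sysrq trigger file.

    On the real /proc file this asks the kernel to log the state of all
    tasks, the same request as pressing Alt-SysRq-t.
    """
    with open(trigger_path, "w") as trigger:
        trigger.write("t")


# Usage (needs root on the real file, so it is not called here):
#     request_task_dump()
```

The dump itself ends up in the kernel log, which is why the relevant part of
the syslog is what gets attached.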

Another thing I observed with the T² is that it doesn't work if it's
already connected when the laptop boots up. I need to power off,
disconnect and boot. If I connect the T² after bootup it works. I can
also rmmod its driver when it's not in use.

If I boot the box with the T² connected I cannot use it, the blue LED in
the T² is always off and I cannot rmmod the driver. (I don't know
whether I ever tried to rmmod the driver before I tried to use the T².)

Please CC me as I'm not subscribed to the list.

Thomas

--
keep mailinglists in english, feel free to send PM in german

messages.gz
Description: application/gzip

config.gz
Description: application/gzip