Bug#690814: [3.1-3.2.y regression] disk activity provokes lockups on VIA EPIA CL-6000

2012-11-01 Thread Frank Lenaerts
--- On Tue, 10/30/12, Jonathan Nieder jrnie...@gmail.com wrote:
Thanks!  To recap:

 3.1.8-2 works fine
 3.2.1-1 hung at Loading, please wait... once, worked fine twice
 3.2.23-1 reliably hangs (though not always right away)

Could you try 3.2~rc4-1~experimental.1, to narrow down the range a
little?  (If it doesn't hang, that would be great, but I suspect it
will hang, too.)

3.2~rc4-1~experimental.1 seems to work fine, i.e. I booted it several times and
didn't encounter a problem. I also booted 3.2.1-1 again and it booted fine too
(it has been running fine for a week or so). I also tried 3.2.23-1 again, but
it got stuck at "Activating swap...", so I think "reliably hangs" describes it
correctly.

I also tried linux-image-3.2.0-2-486_3.2.19-1_i386.deb. It seems to boot fine 
(3 tests) and I left it running for now to see if it locks up while it is in 
use.
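
A side note for readers comparing the version strings in this recap: in Debian version ordering, `~` sorts before everything, including the end of the string, which is why 3.2~rc4-1~experimental.1 falls between 3.1.8-2 and 3.2.1-1. The following is a minimal Python sketch of that comparison, written from the ordering rules described in Debian Policy rather than taken from dpkg. The function names are made up for illustration, and `dpkg --compare-versions` remains the authoritative tool.

```python
import re
from functools import cmp_to_key


def _weight(ch: str) -> int:
    """Sort weight of one character inside a non-digit run."""
    if ch == "~":
        return -1            # tilde sorts before everything, even end of string
    if ch.isalpha():
        return ord(ch)       # letters sort before other punctuation
    return ord(ch) + 256


def _compare_part(a: str, b: str) -> int:
    """Compare two upstream-version or revision strings; returns -1, 0 or 1."""
    while a or b:
        # Step 1: compare the leading non-digit runs character by character.
        run_a = re.match(r"[^0-9]*", a).group()
        run_b = re.match(r"[^0-9]*", b).group()
        a, b = a[len(run_a):], b[len(run_b):]
        for k in range(max(len(run_a), len(run_b))):
            wa = _weight(run_a[k]) if k < len(run_a) else 0
            wb = _weight(run_b[k]) if k < len(run_b) else 0
            if wa != wb:
                return -1 if wa < wb else 1
        # Step 2: compare the leading digit runs numerically (empty counts as 0).
        num_a = re.match(r"[0-9]*", a).group()
        num_b = re.match(r"[0-9]*", b).group()
        a, b = a[len(num_a):], b[len(num_b):]
        na, nb = int(num_a or 0), int(num_b or 0)
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version: str):
    """Split '[epoch:]upstream[-revision]' into (epoch, upstream, revision)."""
    epoch = "0"
    if ":" in version:
        epoch, version = version.split(":", 1)
    upstream, revision = version, ""
    if "-" in version:
        upstream, revision = version.rsplit("-", 1)
    return int(epoch), upstream, revision


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1 when a sorts before, equal to, or after b."""
    epoch_a, up_a, rev_a = _split(a)
    epoch_b, up_b, rev_b = _split(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1
    return _compare_part(up_a, up_b) or _compare_part(rev_a, rev_b)


# Package versions mentioned in this thread, deliberately out of order.
tested = ["3.2.23-1", "3.1.8-2", "3.2.1-1", "3.2~rc4-1~experimental.1", "3.2.19-1"]
print(sorted(tested, key=cmp_to_key(compare_versions)))
```

Sorting with this comparator puts the experimental 3.2~rc4 build ahead of every 3.2.x stable update, matching the order in which the kernels were released.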


Bug#690814: [3.1-3.2.y regression] disk activity provokes lockups on VIA EPIA CL-6000

2012-10-30 Thread Frank Lenaerts


--- On Sun, 10/28/12, Jonathan Nieder jrnie...@gmail.com wrote:

From: Jonathan Nieder jrnie...@gmail.com
Subject: Re: [3.1-3.2.y regression] disk activity provokes lockups on VIA EPIA 
CL-6000
To: Frank Lenaerts frank.lenae...@yahoo.com
Cc: 690...@bugs.debian.org
Date: Sunday, October 28, 2012, 2:01 AM

Frank Lenaerts wrote:
 Jonathan Nieder wrote:

 Which versions were the 3.2.0-1, 3.2.0-2, and 3.2.0-3 kernels
 you mentioned testing above?
[...]
 I mean 3.1.0-1, 3.2.0-1, 3.2.0-3 here. Sorry for the lack of clarity.

 I just used uname -r and left out '-486' (because that's always the
 same on this box).

Yes, unfortunately that only gives the ABI version (package name)
rather than the package version which is more precise.  If the
packages that you used to test are still around, you can see the full
version number in the .deb filename, or if they are installed you can
check with

    dpkg-query -W linux-image-{3.1.0-1,3.2.0-1,3.2.0-3}-486

From the bash history:

    http://snapshot.debian.org/archive/debian/20120119T160147Z/pool/main/l/linux-2.6/linux-image-3.2.0-1-486_3.2.1-1_i386.deb
    http://snapshot.debian.org/archive/debian/20120110T093300Z/pool/main/l/linux-2.6/linux-image-3.1.0-1-486_3.1.8-2_i386.deb
    http://snapshot.debian.org/archive/debian/20110724T212501Z/pool/main/l/linux-2.6/linux-image-3.0.0-1-486_3.0.0-1_i386.deb
    http://snapshot.debian.org/archive/debian/20110705T091435Z/pool/main/l/linux-2.6/linux-image-2.6.39-2-486_2.6.39-3_i386.deb
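
As an illustration of reading the full version from a .deb filename, here is a small Python sketch. It assumes the usual `package_version_architecture.deb` naming, which the snapshot.debian.org filenames quoted in this message follow. The helper name `parse_deb_filename` and the `DebFile` tuple are made up for this example.

```python
import posixpath
from typing import NamedTuple
from urllib.parse import unquote, urlparse


class DebFile(NamedTuple):
    package: str   # ABI-style package name, e.g. linux-image-3.2.0-1-486
    version: str   # full package version, e.g. 3.2.1-1
    arch: str      # architecture, e.g. i386


def parse_deb_filename(path_or_url: str) -> DebFile:
    """Split a .deb filename (or a URL ending in one) into its three fields."""
    name = posixpath.basename(unquote(urlparse(path_or_url).path))
    if not name.endswith(".deb"):
        raise ValueError(f"not a .deb filename: {name!r}")
    fields = name[: -len(".deb")].split("_")
    if len(fields) != 3:
        raise ValueError(f"expected package_version_arch.deb, got {name!r}")
    return DebFile(*fields)


url = ("http://snapshot.debian.org/archive/debian/20120119T160147Z/"
       "pool/main/l/linux-2.6/linux-image-3.2.0-1-486_3.2.1-1_i386.deb")
info = parse_deb_filename(url)
print(info.package, "is package version", info.version)
```

For the URL above, the package name carries the ABI (3.2.0-1) while the middle field carries the package version (3.2.1-1), which is the distinction made in this message.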


Bug#690814: [3.1-3.2.y regression] disk activity provokes lockups on VIA EPIA CL-6000

2012-10-26 Thread Frank Lenaerts


--- On Thu, 10/25/12, Jonathan Nieder jrnie...@gmail.com wrote:

From: Jonathan Nieder jrnie...@gmail.com
Subject: Re: [3.1-3.2.y regression] disk activity provokes lockups on VIA EPIA 
CL-6000
To: Frank Lenaerts frank.lenae...@yahoo.com
Cc: 690...@bugs.debian.org
Date: Thursday, October 25, 2012, 10:41 PM

Jonathan Nieder wrote:

 Package names like linux-image-3.2.0-1-486 describe the kernel's
 ABI, not the package version.  The package version is something like
 3.2.1-1.  See http://kernel-handbook.alioth.debian.org/ch-versions.html
 for more details.

 You can get the version of the currently running kernel by running
 cat /proc/version (it will be in parentheses).  The currently
 installed kernel's version number can be retrieved with
 dpkg-query -W linux-image-$(uname -r).

 Which versions were the 3.2.0-1, 3.2.0-2, and 3.2.0-3 kernels
 you mentioned testing above?

I mean 3.1.0-1, 3.2.0-1, 3.2.0-3 here. Sorry for the lack of clarity.
I just used uname -r and left out '-486' (because that's always the same on 
this box).
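
To make the ABI versus package version distinction concrete, the sketch below pulls both values out of a /proc/version string. The sample text is an assumption: it imitates the layout a Debian 3.2 kernel would print, with the package version in the first parenthesised "Debian ..." group, and was not taken from the reporter's machine. The function name is likewise hypothetical.

```python
import re
from typing import Optional, Tuple

# ASSUMPTION: illustrative text in the style of a Debian kernel's /proc/version;
# it is not output captured from the machine discussed in this thread.
SAMPLE = (
    "Linux version 3.2.0-3-486 (Debian 3.2.23-1) "
    "(debian-kernel@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-8) ) "
    "#1 Mon Jul 23 02:45:17 UTC 2012"
)


def parse_proc_version(text: str) -> Tuple[str, Optional[str]]:
    """Return (abi_release, package_version or None) from /proc/version text."""
    abi = re.match(r"Linux version (\S+)", text)
    if abi is None:
        raise ValueError("text does not look like /proc/version")
    # The first "(Debian X)" group is taken to be the kernel package version;
    # a later one may describe the compiler instead.
    pkg = re.search(r"\(Debian ([^)\s]+)\)", text)
    return abi.group(1), (pkg.group(1) if pkg else None)


abi_release, package_version = parse_proc_version(SAMPLE)
print("uname -r style release:", abi_release)
print("package version:", package_version)
```

On a real system the input would come from reading /proc/version; the release string is what `uname -r` reports, while the package version is what the bug log needs.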



Bug#690814: [squeeze-wheezy regression] disk activity provokes lockups on VIA EPIA CL-6000

2012-10-25 Thread Frank Lenaerts
--- On Wed, 10/17/12, Jonathan Nieder jrnie...@gmail.com wrote:

From: Jonathan Nieder jrnie...@gmail.com
Subject: Re: [squeeze-wheezy regression] disk activity provokes lockups on VIA 
EPIA CL-6000
To: Frank Lenaerts frank.lenae...@yahoo.com
Cc: 690...@bugs.debian.org
Date: Wednesday, October 17, 2012, 11:19 PM

# regression
severity 690814 important
quit

Hi Frank,

Frank Lenaerts wrote:

 I had to use the power button to restart the machine. The first time
 I rebooted, the system got stuck when trying to mount one of the
 filesystems. Rebooted again and got the login prompt. Did some more
 reboots and found out that the system hangs during more or less 50%
 of the reboots. Since the disk activity LED was always on, I tried
 to provoke some disk activity e.g. by installing some packages. This
 effectively locked up the system. I only once got the system in a
 locked up state without having the disk activity LED turned on.
[...]
 Since this machine had been running Lenny just fine, and Squeeze
 also worked fine, I decided to install a 2.6 kernel.
[...]
                          Note that it took several reboots to get
 the deb file on the system and to install it. With this kernel, the
 box runs just fine.

Thanks for reporting it.

A few suggestions for moving forward:
[...snipped...]
 * if you have time to run a bisection search through the pre-compiled
   kernels at http://snapshot.debian.org/package/linux-2.6/ to find
   the first broken one, that could help narrow down things quite a bit.

Tested some of the linux images:

- 2.6.39 seems ok

- 3.0.0-1 seems ok

- 3.1.0-1 seems ok

- 3.2.0-1 does not seem to be ok:
  - first boot: system hung at "Loading, please wait..."; note that the LED
    indicating disk activity was not lit
  - second boot: ok
  - third boot: ok

Had it running and used it for a few days. Then decided to give 3.2.0-3 a try
again.

- 3.2.0-3 is not ok:
  - first boot: crash during the boot process, i.e. it detected the external
    USB disk (which I had connected again while working with 3.2.0-1), makefile
    style concurrent boot, hotplug, udev... stacktrace (not sure if it was
    after or before the hotplug stuff; unfortunately I didn't have logging
    turned on and the keyboard was stuck)... the boot process continued a bit
    and then hung; note that the LED was not on; since I've now seen quite some
    lockups without this LED being on, I think it does not necessarily have
    anything to do with disk activity
  - second boot: stuck in "Configuring network interfaces"; note that I've
    seen this more than once already

So, it looks like we'll have to investigate the differences between -1 and -3.
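
A caution when reading "seems ok" after only a few boots: the original report estimated that 3.2.0-3 hung in roughly 50% of reboots. If a kernel hangs with probability p per boot, and boots are assumed independent (a simplification), the chance that n boots in a row all succeed is (1 - p)^n. The Python sketch below works through that arithmetic; the function names are illustrative.

```python
def chance_all_clean(hang_rate: float, boots: int) -> float:
    """Probability that `boots` consecutive boots all succeed on a bad kernel."""
    if not 0.0 <= hang_rate <= 1.0 or boots < 0:
        raise ValueError("hang_rate must be in [0, 1] and boots non-negative")
    return (1.0 - hang_rate) ** boots


def boots_needed(hang_rate: float, max_risk: float) -> int:
    """Smallest number of clean boots so that a bad kernel slips through
    with probability at most `max_risk`."""
    if not 0.0 < hang_rate <= 1.0 or not 0.0 < max_risk < 1.0:
        raise ValueError("hang_rate must be in (0, 1], max_risk in (0, 1)")
    boots, chance = 0, 1.0
    while chance > max_risk:
        chance *= 1.0 - hang_rate
        boots += 1
    return boots


# With a 50% hang rate, three clean boots still leave a 12.5% chance that the
# kernel is actually bad; five clean boots bring that below 5%.
print(chance_all_clean(0.5, 3))
print(boots_needed(0.5, 0.05))
```

If the true hang rate of an intermediate version is lower than 50%, correspondingly more clean boots are needed before calling it good.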

Question: 3.2.0-3 was the one installed during the installation. 3.2.0-1 was
installed by me for testing purposes and came from 3.2.1-1. I see that the
snapshot directory contains other 3.2.0-1 packages, e.g. under 3.2.2-1. Why is
that? I see that the 3.2.19-1 directory contains 3.2.0-2, for instance, and it
seems that 3.2.0-3 is not in the snapshot directory...

Hope that helps, and sorry I have no better ideas,
Jonathan

[1] http://www.kernel.org/doc/Documentation/networking/netconsole.txt


Bug#690814: [squeeze-wheezy regression] disk activity provokes lockups on VIA EPIA CL-6000

2012-10-18 Thread Frank Lenaerts


--- On Wed, 10/17/12, Jonathan Nieder jrnie...@gmail.com wrote:

From: Jonathan Nieder jrnie...@gmail.com
Subject: Re: [squeeze-wheezy regression] disk activity provokes lockups on VIA 
EPIA CL-6000
To: Frank Lenaerts frank.lenae...@yahoo.com
Cc: 690...@bugs.debian.org
Date: Wednesday, October 17, 2012, 11:19 PM

# regression
severity 690814 important
quit

Hi Frank,

Frank Lenaerts wrote:

 I had to use the power button to restart the machine. The first time
 I rebooted, the system got stuck when trying to mount one of the
 filesystems. Rebooted again and got the login prompt. Did some more
 reboots and found out that the system hangs during more or less 50%
 of the reboots. Since the disk activity LED was always on, I tried
 to provoke some disk activity e.g. by installing some packages. This
 effectively locked up the system. I only once got the system in a
 locked up state without having the disk activity LED turned on.
[...]
 Since this machine had been running Lenny just fine, and Squeeze
 also worked fine, I decided to install a 2.6 kernel.
[...]
                          Note that it took several reboots to get
 the deb file on the system and to install it. With this kernel, the
 box runs just fine.

Thanks for reporting it.

A few suggestions for moving forward:

 * please attach full dmesg output from a normal boot (with the
   2.6.32-based kernel)

The tarball I attached to this e-mail contains the dmesg output from 
2.6.32-5-486 and 3.2.0-3-486 (sometimes, I can login and even do some things;-))

 * could you also get a kernel log from booting the 3.2-based kernel?
   A full log including the lockup would be ideal --- netconsole[1]
   might help here.

The attached tarball contains some logs of boot attempts. Some more information:

- netconsole_1.out: netconsole only shows until 'loop: module loaded' while the 
console also showed 'Loading kernel module loop.' (after the above 'loop: 
module loaded'), 'Activating lvm and md swap... done' and 'Checking 
filesystems... fsck from util-linux 2.20.1'. I don't know why these 3 lines are 
not visible via netconsole. It looks like the box cannot write to the network 
anymore.

- netconsole_2.out: netconsole only shows until 'eth0: no IPv6 routers present' 
while the console also showed 'Activating swap...'

- netconsole_3_A-B-C.out: A was like netconsole_1.out, B was like 
netconsole_2.out, C was like netconsole_1.out

After this, I booted with 2.6 because I could not get 3.2 far enough to be able 
to login. When I booted with 2.6 it wanted to fsck an external USB disk. Since 
this would take too long, I unplugged it (so further logs won't show /dev/sdc 
anymore).

Instead of using netconsole, I used screen on ttyS0. Finally, I could get to
the login prompt of 3.2. When I issued a find /usr -ls over an ssh connection,
the system locked up (no keyboard interaction possible anymore). Note that
screenlog_3.2.0-3-486.0 doesn't show anything useful (on the linux command
line, I specified console=ttyS0 and debug).

Since the successful 3.2 boot was without (a) the USB disk and (b) netconsole,
I decided to try to boot with netconsole (and without the USB disk). It booted
a little bit further than before but I still couldn't get to the login prompt.
Rebooted several times, see netconsole_4_a-b-c.out. Some extra notes:

- a: hung at 90-second grace period

- b: hung at loop: module loaded

- c: login was possible; netconsole also ended with 90-second grace period 
like in case a; to see if the network is the showstopper, ran find /usr -ls on 
the console; this worked, just like find /var -ls; when I ran an apt-get update 
however, the system locked up (all files were downloaded, but it failed at the 
end (percentage sign stuck)).

 * if you have time to run a bisection search through the pre-compiled
   kernels at http://snapshot.debian.org/package/linux-2.6/ to find
   the first broken one, that could help narrow down things quite a bit.

I put it on my TODO list.
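
For reference, the bisection suggested above can be sketched as follows in Python. This is only an outline under stated assumptions: the version list is the set of package versions named in this thread, ordered oldest to newest; the search assumes that once a version is bad, every later one is bad too; and `recorded_bad` is a hypothetical stand-in for the manual step of installing a kernel and rebooting it several times. It treats only 3.2.23-1 as bad and ignores the single hang seen with 3.2.1-1.

```python
from typing import Callable, Optional, Sequence


def first_bad(versions: Sequence[str],
              is_bad: Callable[[str], bool]) -> Optional[str]:
    """Binary search for the first bad version in an oldest-to-newest list.

    Assumes all good versions come before all bad ones. Returns None when
    no tested version is bad.
    """
    lo, hi = 0, len(versions)          # the answer index lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid                   # first bad one is at mid or earlier
        else:
            lo = mid + 1               # first bad one is after mid
    return versions[lo] if lo < len(versions) else None


# Package versions named in this thread, oldest first.
candidates = ["2.6.39-3", "3.0.0-1", "3.1.8-2", "3.2.1-1", "3.2.19-1", "3.2.23-1"]

# HYPOTHETICAL stand-in for "install this kernel and reboot it several times".
recorded_bad = {"3.2.23-1"}
tried = []


def boot_test(version: str) -> bool:
    tried.append(version)              # record which kernels had to be tested
    return version in recorded_bad


print("first bad:", first_bad(candidates, boot_test))
print("kernels tested:", tried)
```

Each step halves the remaining range, so six candidates need at most three rounds of boot testing. With intermittent hangs, each round should consist of several boots rather than one.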

Hope that helps, and sorry I have no better ideas,
Jonathan

[1] http://www.kernel.org/doc/Documentation/networking/netconsole.txt


dbts-690814.tar.gz
Description: GNU Zip compressed data


Bug#690814: linux-image-3.2.0-3-486: vmlinuz-3.2.0-3-486 locks up on VIA EPIA CL-6000

2012-10-17 Thread Frank Lenaerts
Package: src:linux
Version: 3.2.23-1
Severity: normal

Dear Maintainer,

Installation of Wheezy on this VIA EPIA CL-6000 box went fine but when the
system booted after the installation, it locked up at different points in time.
By "locked up", I mean that nothing could be done anymore: my ssh session
stalled, the machine did not answer echo requests, I could not do anything at
the console. At this point, the LED indicating disk activity was turned on.

I had to use the power button to restart the machine. The first time I
rebooted, the system got stuck when trying to mount one of the filesystems.
Rebooted again and got the login prompt. Did some more reboots and found out
that the system hangs during more or less 50% of the reboots. Since the disk
activity LED was always on, I tried to provoke some disk activity e.g. by
installing some packages. This effectively locked up the system. I only once
got the system in a locked up state without having the disk activity LED
turned on.

Ran memtest86+ (4 runs) without any problem. Tried another (identical) harddisk 
and tried to use the other IDE controller but always ran into the same problem.

Since this machine had been running Lenny just fine, and Squeeze also worked
fine, I decided to install a 2.6 kernel. In Wheezy however, linux-image-2.6-486
depends on linux-image-486, which in turn depends on linux-image-3.2.0-3-486. I
therefore installed the 2.6 kernel from Squeeze. Note that it took several
reboots to get the deb file on the system and to install it. With this kernel,
the box runs just fine.

Note that I also have a VIA EPIA PD-6000, which is almost identical, but runs 
Wheezy with the 3.2 kernel just fine.

Note that, as this machine cannot run version 3.2.0-3-486 of the kernel, I used
reportbug when the machine was running version 2.6.32-5-486 of the kernel. This
kernel is the one from Squeeze.

-- Package-specific info:
** Kernel log: boot messages should be attached

** Model information
not available

** PCI devices:
00:00.0 Host bridge [0600]: VIA Technologies, Inc. VT8623 [Apollo CLE266] [1106:3123]
Subsystem: VIA Technologies, Inc. Device [1106:aa01]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort+ SERR- PERR- INTx-
Latency: 8
Region 0: Memory at e000 (32-bit, prefetchable) [size=64M]
Capabilities: access denied
Kernel driver in use: agpgart-via

00:01.0 PCI bridge [0604]: VIA Technologies, Inc. VT8633 [Apollo Pro266 AGP] [1106:b091] (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort+ SERR- PERR+ INTx-
Latency: 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
Memory behind bridge: e800-e9ff
Prefetchable memory behind bridge: e400-e7ff
Secondary status: 66MHz+ FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort+ SERR- PERR+
BridgeCtl: Parity- SERR- NoISA+ VGA+ MAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: access denied

00:0f.0 Ethernet controller [0200]: VIA Technologies, Inc. VT6105/VT6106S [Rhine-III] [1106:3106] (rev 8b)
Subsystem: VIA Technologies, Inc. Device [1106:0106]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort- SERR- PERR- INTx-
Latency: 32 (750ns min, 2000ns max), Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 12
Region 0: I/O ports at d000 [size=256]
Region 1: Memory at ea00 (32-bit, non-prefetchable) [size=256]
Capabilities: access denied
Kernel driver in use: via-rhine

00:10.0 USB controller [0c03]: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller [1106:3038] (rev 80) (prog-if 00 [UHCI])
Subsystem: VIA Technologies, Inc. Device [1106:aa01]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium TAbort- TAbort- MAbort- SERR- PERR- INTx-
Latency: 32, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 11
Region 4: I/O ports at d400 [size=32]
Capabilities: access denied
Kernel driver in use: uhci_hcd

00:10.1 USB controller [0c03]: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller [1106:3038] (rev 80) (prog-if 00 [UHCI])
Subsystem: VIA Technologies, Inc. Device [1106:aa01]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-