Bug#522110: BUG: soft lockup - CPU#3 stuck for 61s! (AMD Phenom 9600 Quad-Core)

2009-10-11 Thread Steve Kostecke
Ben Hutchings said:

Does this problem still occur in the current Debian kernel version
(2.6.30)?

$ uname -a
Linux stasis 2.6.24-etchnhalf.1-amd64 #1 SMP Tue Dec 2 17:21:26 UTC 2008 x86_64 
GNU/Linux

I downgraded to 2.6.24 quite a while ago. Some time after that the
lock-ups stopped (I don't recall the exact time-line).

I've not tried a more recent kernel since then.

-- 
Steve Kostecke st...@debian.org
Public Key at gopher://kostecke.net or `finger st...@kostecke.net`

-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.




-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#522110: BUG: soft lockup - CPU#3 stuck for 61s! (AMD Phenom 9600 Quad-Core)

2009-03-31 Thread Steve Kostecke
Package: linux-image-2.6.28
Version: 2.6.28-10.00.Custom
Severity: important

I'm having sporadic problems with a runaway processor core on an AMD
Phenom 9600 Quad-Core system. The system, which runs Lenny, will
sometimes stay up for almost a week and other times has to be rebooted
several times in one day.

When the system locks up, the load on processor core 4 (cpu#3) slowly
climbs to 100% and everything running on that core freezes.

The kernel usually responds to the Magic SysRq keys.

This package was locally compiled on a Lenny system using the source
package from Sid.

Here's a typical syslog extract:

Mar 31 14:19:54 stasis kernel: [11166.917503] BUG: soft lockup - CPU#3
  stuck for 61s! [events/3:18]
Mar 31 14:19:54 stasis kernel: [11166.917507] Modules linked in:
  tcp_diag inet_diag ppdev parport_pc lp parport autofs4 ipv6 nfsd
  exportfs nfs lockd nfs_acl fuse dm_snapshot dm_mirror dm_region_hash
  dm_log dm_mod rpcsec_gss_krb5 auth_rpcgss sunrpc it87 hwmon_vid eeprom
  loop sg snd_usb_audio snd_usb_lib snd_hwdep snd_seq_dummy snd_hda_intel
  snd_seq_oss snd_pcm_oss snd_mixer_oss snd_seq_midi psmouse snd_rawmidi
  pcspkr serio_raw snd_seq_midi_event snd_pcm snd_seq i2c_piix4
  snd_page_alloc i2c_core snd_timer snd_seq_device snd pwc evdev
  compat_ioctl32 soundcore usblp videodev v4l1_compat wmi button ext3 jbd
  mbcache raid10 raid1 md_mod usb_storage usbhid hid atiixp sd_mod
  crc_t10dif ide_pci_generic ide_core floppy aic7xxx scsi_transport_spi
  ata_generic ahci ohci_hcd ehci_hcd libata scsi_mod atl1 mii thermal
  processor fan thermal_sys
Mar 31 14:19:54 stasis kernel: [11166.917511] CPU 3:
Mar 31 14:19:54 stasis kernel: [11166.917511] Modules linked in:
  tcp_diag inet_diag ppdev parport_pc lp parport autofs4 ipv6 nfsd
  exportfs nfs lockd nfs_acl fuse dm_snapshot dm_mirror dm_region_hash
  dm_log dm_mod rpcsec_gss_krb5 auth_rpcgss sunrpc it87 hwmon_vid eeprom
  loop sg snd_usb_audio snd_usb_lib snd_hwdep snd_seq_dummy snd_hda_intel
  snd_seq_oss snd_pcm_oss snd_mixer_oss snd_seq_midi psmouse snd_rawmidi
  pcspkr serio_raw snd_seq_midi_event snd_pcm snd_seq i2c_piix4
  snd_page_alloc i2c_core snd_timer snd_seq_device snd pwc evdev
  compat_ioctl32 soundcore usblp videodev v4l1_compat wmi button ext3 jbd
  mbcache raid10 raid1 md_mod usb_storage usbhid hid atiixp sd_mod
  crc_t10dif ide_pci_generic ide_core floppy aic7xxx scsi_transport_spi
  ata_generic ahci ohci_hcd ehci_hcd libata scsi_mod atl1 mii thermal
  processor fan thermal_sys
Mar 31 14:19:54 stasis kernel: [11166.917511] Pid: 18, comm: events/3
  Not tainted 2.6.28 #1
Mar 31 14:19:54 stasis kernel: [11166.917511] RIP:
  0010:[80262fb1]  [80262fb1]
  smp_call_function_mask+0x19c/0x226
Mar 31 14:19:54 stasis kernel: [11166.917511] RSP: 0018:88012ed9bc40
  EFLAGS: 0202
Mar 31 14:19:54 stasis kernel: [11166.917511] RAX: 08fc RBX:
  0003 RCX: 
Mar 31 14:19:54 stasis kernel: [11166.917511] RDX: 08fc RSI:
  88012ed9bb90 RDI: 0246
Mar 31 14:19:54 stasis kernel: [11166.917511] RBP: 0003 R08:
  0008 R09: 0200
Mar 31 14:19:54 stasis kernel: [11166.917511] R10: 0008 R11:
  80221011 R12: 0014
Mar 31 14:19:54 stasis kernel: [11166.917511] R13: 80220678 R14:
  0003 R15: 88012c0eb5c0
Mar 31 14:19:54 stasis kernel: [11166.917511] FS:
  7faa85df96e0() GS:88012ed3bdc0() knlGS:f7910b90
Mar 31 14:19:54 stasis kernel: [11166.917511] CS:  0010 DS: 0018 ES:
  0018 CR0: 8005003b
Mar 31 14:19:54 stasis kernel: [11166.917511] CR2: 7f8beb322650 CR3:
  00201000 CR4: 06e0
Mar 31 14:19:54 stasis kernel: [11166.917511] DR0:  DR1:
   DR2: 
Mar 31 14:19:54 stasis kernel: [11166.917511] DR3:  DR6:
  0ff0 DR7: 0400
Mar 31 14:19:54 stasis kernel: [11166.917511] Call Trace:
Mar 31 14:19:54 stasis kernel: [11166.917511]  [80234c45] ?
  update_curr+0x4d/0x112
Mar 31 14:19:54 stasis kernel: [11166.917511]  [8023692a] ?
  dequeue_entity+0x18/0x11f
Mar 31 14:19:54 stasis kernel: [11166.917511]  [8021c9a1] ?
  mcheck_check_cpu+0x0/0x28
Mar 31 14:19:54 stasis kernel: [11166.917511]  [80263064] ?
  smp_call_function+0x29/0x2e
Mar 31 14:19:54 stasis kernel: [11166.917511]  [80247de1] ?
  on_each_cpu+0x10/0x30
Mar 31 14:19:54 stasis kernel: [11166.917511]  [8021c316] ?
  mcheck_timer+0x0/0x76
Mar 31 14:19:54 stasis kernel: [11166.917511]  [8021c32e] ?
  mcheck_timer+0x18/0x76
Mar 31 14:19:54 stasis kernel: [11166.917511]  [8039d200] ?
  rekey_seq_generator+0x0/0x4b
Mar 31 14:19:54 stasis kernel: [11166.917511]  [802523f3] ?
  run_workqueue+0x96/0x130
Mar 31 14:19:54 stasis kernel: [11166.917511]  [80476b29] ?
  _spin_lock_irqsave+0x24/0x2c
Mar 31 14:19:54 stasis kernel: 

Bug#513889: No longer using this kernel version

2009-03-17 Thread Steve Kostecke
I am no longer using the kernel version which this bug report was filed
against.

I have upgraded to a 2.6.28 kernel (built from the source packages in
sid) and have almost 4 days of uptime without a system lockup.

-- 
Steve Kostecke st...@debian.org
Public Key at gopher://kostecke.net or `finger st...@kostecke.net`




Bug#513889: linux-image-2.6.24-etchnhalf.1-amd64: AMD Phenom soft lockup - CPU#3 stuck for 11s [events/3:18]

2009-02-01 Thread Steve Kostecke
Package: linux-image-2.6.24-etchnhalf.1-amd64
Version: 2.6.24-6~etchnhalf.7
Severity: critical
Justification: breaks the whole system

My amd64 system is subject to random lockups. Sometimes I can perform an
emergency sync, umount, and reboot using MagicSysRq. Other times I
cannot. Sometimes I get a couple of days of uptime. Other times I have to
reboot several times a day.

This particular case was one of the times that I had to use the reset
button to restart the system. The system had become unresponsive after
running for about 48 hours without X running or any console user logins
(total uptime was ~ 5 days).

The first lock-up was:

Jan 30 00:49:14 stasis kernel: BUG: soft lockup - CPU#3 stuck for 11s!
[events/3:18]
Jan 30 00:49:14 stasis kernel: CPU 3:
Jan 30 00:49:14 stasis kernel: Modules linked in: isofs zlib_inflate
ext2 tcp_diag inet_diag ppdev parport_pc lp parport autofs4 ipv6 tun
nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc fuse dm_snapshot
dm_mirror dm_mod it87 hwmon_vid eeprom loop snd_usb_audio snd_usb_lib
snd_hwdep snd_seq_dummy snd_seq_oss snd_hda_intel snd_seq_midi
snd_rawmidi snd_pcm_oss snd_mixer_oss snd_seq_midi_event rtc_cmos
snd_seq snd_pcm rtc_core psmouse snd_timer floppy rtc_lib pwc
snd_seq_device i2c_piix4 pcspkr snd_page_alloc serio_raw compat_ioctl32
videodev snd v4l2_common i2c_core v4l1_compat soundcore usblp atl1
button mii sg evdev ext3 jbd mbcache raid10 raid1 md_mod ide_generic
generic atiixp ide_core sd_mod usbhid hid aic7xxx scsi_transport_spi
ata_generic ahci ehci_hcd libata scsi_mod ohci_hcd thermal processor fan
Jan 30 00:49:14 stasis kernel: Pid: 18, comm: events/3 Not tainted 
2.6.24-etchnhalf.1-amd64 #1
Jan 30 00:49:14 stasis kernel: RIP: 0010:[8021bd36] 
[8021bd36] __smp_call_function_mask+0x9c/0xc0
Jan 30 00:49:14 stasis kernel: RSP: 0018:81012b763e00  EFLAGS: 0297
Jan 30 00:49:14 stasis kernel: RAX: 08fc RBX: 0003 RCX: 
0001
Jan 30 00:49:14 stasis kernel: RDX: 08fc RSI: 00fc RDI: 
0007
Jan 30 00:49:14 stasis kernel: RBP: 81000103c930 R08: 00f9 R09: 
81011401fac8
Jan 30 00:49:14 stasis kernel: R10: 81012b72abe8 R11:  R12: 

Jan 30 00:49:14 stasis kernel: R13: 00030282 R14: 81012b763ea0 R15: 
00020001
Jan 30 00:49:14 stasis kernel: FS:  2b04ef9bccc0() 
GS:81012b6b78c0() knlGS:f6dfdb90
Jan 30 00:49:14 stasis kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
Jan 30 00:49:14 stasis kernel: CR2: 2b8ba51cf000 CR3: 000114053000 CR4: 
06e0
Jan 30 00:49:14 stasis kernel: DR0:  DR1:  DR2: 

Jan 30 00:49:14 stasis kernel: DR3:  DR6: 0ff0 DR7: 
0400
Jan 30 00:49:14 stasis kernel: Call Trace:
Jan 30 00:49:14 stasis kernel:  [80216193] mcheck_check_cpu+0x0/0x37
Jan 30 00:49:14 stasis kernel:  [8023e28b] lock_timer_base+0x26/0x4c
Jan 30 00:49:14 stasis kernel:  [80216193] mcheck_check_cpu+0x0/0x37
Jan 30 00:49:14 stasis kernel:  [8021bdb8] 
smp_call_function_mask+0x5e/0x70
Jan 30 00:49:14 stasis kernel:  [80215a31] mcheck_timer+0x0/0x7c
Jan 30 00:49:14 stasis kernel:  [80216193] mcheck_check_cpu+0x0/0x37
Jan 30 00:49:14 stasis kernel:  [8023a243] on_each_cpu+0x10/0x22
Jan 30 00:49:14 stasis kernel:  [80215a4e] mcheck_timer+0x1d/0x7c
Jan 30 00:49:14 stasis kernel:  [8027d1f0] vmstat_update+0x0/0x31
Jan 30 00:49:14 stasis kernel:  [802448dd] run_workqueue+0x7f/0x10b
Jan 30 00:49:14 stasis kernel:  [802451ef] worker_thread+0x0/0xe4
Jan 30 00:49:14 stasis kernel:  [802452c9] worker_thread+0xda/0xe4
Jan 30 00:49:14 stasis kernel:  [802481fe] 
autoremove_wake_function+0x0/0x2e
Jan 30 00:49:14 stasis kernel:  [802480de] kthread+0x47/0x75
Jan 30 00:49:14 stasis kernel:  [8020cc48] child_rip+0xa/0x12
Jan 30 00:49:14 stasis kernel:  [80248097] kthread+0x0/0x75
Jan 30 00:49:14 stasis kernel:  [8020cc3e] child_rip+0x0/0x12
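
For anyone triaging a similar report, a rough way to tally how often the
soft lockup message recurs in a syslog file is sketched below. The regular
expression and the function name count_soft_lockups are illustrative
assumptions, not part of the original report, and the pattern only matches
messages that sit on a single line, as they would in an unwrapped syslog.

```python
import re
from collections import Counter

# Matches the watchdog message as it appears in the traces quoted here, e.g.
#   "... kernel: BUG: soft lockup - CPU#3 stuck for 11s! [events/3:18]"
# A "[11166.917503] " style timestamp before "BUG:" does not affect the match.
LOCKUP_RE = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!")


def count_soft_lockups(lines):
    """Return a Counter mapping CPU number to the number of lockup reports."""
    counts = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Passing an open log file (for example the object returned by
open("/var/log/syslog", errors="replace")) to count_soft_lockups gives one
total per CPU, which makes it easy to see whether a single core is affected.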

This repeated 1703 times until the system became totally unresponsive
with:

Jan 30 06:25:05 stasis kernel: BUG: soft lockup - CPU#3 stuck for 11s! 
[events/3:18]
Jan 30 06:25:05 stasis kernel: CPU 3:
Jan 30 06:25:05 stasis kernel: Modules linked in: isofs zlib_inflate ext2
tcp_diag inet_diag ppdev parport_pc lp parport autofs4 ipv6 tun nfsd
auth_rpcgss exportfs nfs lockd nfs_acl sunrpc fuse dm_snapshot dm_mirror
dm_mod it87 hwmon_vid eeprom loop snd_usb_audio snd_usb_lib snd_hwdep
snd_seq_dummy snd_seq_oss snd_hda_intel snd_seq_midi snd_rawmidi
snd_pcm_oss snd_mixer_oss snd_seq_midi_event rtc_cmos snd_seq snd_pcm
rtc_core psmouse snd_timer floppy rtc_lib pwc snd_seq_device i2c_piix4
pcspkr snd_page_alloc serio_raw compat_ioctl32 videodev snd v4l2_common
i2c_core v4l1_compat