Bug#522110: BUG: soft lockup - CPU#3 stuck for 61s! (AMD Phenom 9600 Quad-Core)
Ben Hutchings said:
Does this problem still occur in the current Debian kernel version (2.6.30)?

$ uname -a
Linux stasis 2.6.24-etchnhalf.1-amd64 #1 SMP Tue Dec 2 17:21:26 UTC 2008 x86_64 GNU/Linux

I downgraded to 2.6.24 quite a while ago. Some time after that the lock-ups stopped (I don't recall the exact timeline). I've not tried a more recent kernel since then.

--
Steve Kostecke st...@debian.org
Public Key at gopher://kostecke.net or `finger st...@kostecke.net`

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#522110: BUG: soft lockup - CPU#3 stuck for 61s! (AMD Phenom 9600 Quad-Core)
Package: linux-image-2.6.28
Version: 2.6.28-10.00.Custom
Severity: important

I'm having sporadic problems with a runaway processor core on an AMD Phenom 9600 Quad-Core system. The system, which runs Lenny, will sometimes stay up for almost a week and other times has to be rebooted several times in one day.

When the system locks up, the load on processor core 4 (cpu#3) slowly climbs to 100% and everything running on that core freezes. The kernel usually responds to the Magic SysRq keys.

This package was locally compiled on a Lenny system using the source package from Sid.

Here's a typical syslog extract:

Mar 31 14:19:54 stasis kernel: [11166.917503] BUG: soft lockup - CPU#3 stuck for 61s! [events/3:18]
Mar 31 14:19:54 stasis kernel: [11166.917507] Modules linked in: tcp_diag inet_diag ppdev parport_pc lp parport autofs4 ipv6 nfsd exportfs nfs lockd nfs_acl fuse dm_snapshot dm_mirror dm_region_hash dm_log dm_mod rpcsec_gss_krb5 auth_rpcgss sunrpc it87 hwmon_vid eeprom loop sg snd_usb_audio snd_usb_lib snd_hwdep snd_seq_dummy snd_hda_intel snd_seq_oss snd_pcm_oss snd_mixer_oss snd_seq_midi psmouse snd_rawmidi pcspkr serio_raw snd_seq_midi_event snd_pcm snd_seq i2c_piix4 snd_page_alloc i2c_core snd_timer snd_seq_device snd pwc evdev compat_ioctl32 soundcore usblp videodev v4l1_compat wmi button ext3 jbd mbcache raid10 raid1 md_mod usb_storage usbhid hid atiixp sd_mod crc_t10dif ide_pci_generic ide_core floppy aic7xxx scsi_transport_spi ata_generic ahci ohci_hcd ehci_hcd libata scsi_mod atl1 mii thermal processor fan thermal_sys
Mar 31 14:19:54 stasis kernel: [11166.917511] CPU 3:
Mar 31 14:19:54 stasis kernel: [11166.917511] Modules linked in: tcp_diag inet_diag ppdev parport_pc lp parport autofs4 ipv6 nfsd exportfs nfs lockd nfs_acl fuse dm_snapshot dm_mirror dm_region_hash dm_log dm_mod rpcsec_gss_krb5 auth_rpcgss sunrpc it87 hwmon_vid eeprom loop sg snd_usb_audio snd_usb_lib snd_hwdep snd_seq_dummy snd_hda_intel snd_seq_oss snd_pcm_oss snd_mixer_oss snd_seq_midi psmouse snd_rawmidi pcspkr serio_raw snd_seq_midi_event snd_pcm snd_seq i2c_piix4 snd_page_alloc i2c_core snd_timer snd_seq_device snd pwc evdev compat_ioctl32 soundcore usblp videodev v4l1_compat wmi button ext3 jbd mbcache raid10 raid1 md_mod usb_storage usbhid hid atiixp sd_mod crc_t10dif ide_pci_generic ide_core floppy aic7xxx scsi_transport_spi ata_generic ahci ohci_hcd ehci_hcd libata scsi_mod atl1 mii thermal processor fan thermal_sys
Mar 31 14:19:54 stasis kernel: [11166.917511] Pid: 18, comm: events/3 Not tainted 2.6.28 #1
Mar 31 14:19:54 stasis kernel: [11166.917511] RIP: 0010:[80262fb1] [80262fb1] smp_call_function_mask+0x19c/0x226
Mar 31 14:19:54 stasis kernel: [11166.917511] RSP: 0018:88012ed9bc40 EFLAGS: 0202
Mar 31 14:19:54 stasis kernel: [11166.917511] RAX: 08fc RBX: 0003 RCX:
Mar 31 14:19:54 stasis kernel: [11166.917511] RDX: 08fc RSI: 88012ed9bb90 RDI: 0246
Mar 31 14:19:54 stasis kernel: [11166.917511] RBP: 0003 R08: 0008 R09: 0200
Mar 31 14:19:54 stasis kernel: [11166.917511] R10: 0008 R11: 80221011 R12: 0014
Mar 31 14:19:54 stasis kernel: [11166.917511] R13: 80220678 R14: 0003 R15: 88012c0eb5c0
Mar 31 14:19:54 stasis kernel: [11166.917511] FS: 7faa85df96e0() GS:88012ed3bdc0() knlGS:f7910b90
Mar 31 14:19:54 stasis kernel: [11166.917511] CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
Mar 31 14:19:54 stasis kernel: [11166.917511] CR2: 7f8beb322650 CR3: 00201000 CR4: 06e0
Mar 31 14:19:54 stasis kernel: [11166.917511] DR0: DR1: DR2:
Mar 31 14:19:54 stasis kernel: [11166.917511] DR3: DR6: 0ff0 DR7: 0400
Mar 31 14:19:54 stasis kernel: [11166.917511] Call Trace:
Mar 31 14:19:54 stasis kernel: [11166.917511] [80234c45] ? update_curr+0x4d/0x112
Mar 31 14:19:54 stasis kernel: [11166.917511] [8023692a] ? dequeue_entity+0x18/0x11f
Mar 31 14:19:54 stasis kernel: [11166.917511] [8021c9a1] ? mcheck_check_cpu+0x0/0x28
Mar 31 14:19:54 stasis kernel: [11166.917511] [80263064] ? smp_call_function+0x29/0x2e
Mar 31 14:19:54 stasis kernel: [11166.917511] [80247de1] ? on_each_cpu+0x10/0x30
Mar 31 14:19:54 stasis kernel: [11166.917511] [8021c316] ? mcheck_timer+0x0/0x76
Mar 31 14:19:54 stasis kernel: [11166.917511] [8021c32e] ? mcheck_timer+0x18/0x76
Mar 31 14:19:54 stasis kernel: [11166.917511] [8039d200] ? rekey_seq_generator+0x0/0x4b
Mar 31 14:19:54 stasis kernel: [11166.917511] [802523f3] ? run_workqueue+0x96/0x130
Mar 31 14:19:54 stasis kernel: [11166.917511] [80476b29] ? _spin_lock_irqsave+0x24/0x2c
Mar 31 14:19:54 stasis kernel:
Bug#513889: No longer using this kernel version
I am no longer using the kernel version which this bug report was filed against. I have upgraded to a 2.6.28 kernel (built from the source packages in sid) and have almost 4 days of uptime without a system lockup.

--
Steve Kostecke st...@debian.org
Public Key at gopher://kostecke.net or `finger st...@kostecke.net`
Bug#513889: linux-image-2.6.24-etchnhalf.1-amd64: AMD Phenom soft lockup - CPU#3 stuck for 11s [events/3:18]
Package: linux-image-2.6.24-etchnhalf.1-amd64
Version: 2.6.24-6~etchnhalf.7
Severity: critical
Justification: breaks the whole system

My amd64 system is subject to random lockups. Sometimes I can perform an emergency sync, umount, and reboot using MagicSysRq. Other times I can not.

Sometimes I get a couple of days of uptime. Other times I have to reboot several times a day.

This particular case was one of the times that I had to use the reset button to restart the system. The system had become unresponsive after running for about 48 hours without X running or any console user logins (total uptime was ~ 5 days).

The first lock-up was:

Jan 30 00:49:14 stasis kernel: BUG: soft lockup - CPU#3 stuck for 11s! [events/3:18]
Jan 30 00:49:14 stasis kernel: CPU 3:
Jan 30 00:49:14 stasis kernel: Modules linked in: isofs zlib_inflate ext2 tcp_diag inet_diag ppdev parport_pc lp parport autofs4 ipv6 tun nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc fuse dm_snapshot dm_mirror dm_mod it87 hwmon_vid eeprom loop snd_usb_audio snd_usb_lib snd_hwdep snd_seq_dummy snd_seq_oss snd_hda_intel snd_seq_midi snd_rawmidi snd_pcm_oss snd_mixer_oss snd_seq_midi_event rtc_cmos snd_seq snd_pcm rtc_core psmouse snd_timer floppy rtc_lib pwc snd_seq_device i2c_piix4 pcspkr snd_page_alloc serio_raw compat_ioctl32 videodev snd v4l2_common i2c_core v4l1_compat soundcore usblp atl1 button mii sg evdev ext3 jbd mbcache raid10 raid1 md_mod ide_generic generic atiixp ide_core sd_mod usbhid hid aic7xxx scsi_transport_spi ata_generic ahci ehci_hcd libata scsi_mod ohci_hcd thermal processor fan
Jan 30 00:49:14 stasis kernel: Pid: 18, comm: events/3 Not tainted 2.6.24-etchnhalf.1-amd64 #1
Jan 30 00:49:14 stasis kernel: RIP: 0010:[8021bd36] [8021bd36] __smp_call_function_mask+0x9c/0xc0
Jan 30 00:49:14 stasis kernel: RSP: 0018:81012b763e00 EFLAGS: 0297
Jan 30 00:49:14 stasis kernel: RAX: 08fc RBX: 0003 RCX: 0001
Jan 30 00:49:14 stasis kernel: RDX: 08fc RSI: 00fc RDI: 0007
Jan 30 00:49:14 stasis kernel: RBP: 81000103c930 R08: 00f9 R09: 81011401fac8
Jan 30 00:49:14 stasis kernel: R10: 81012b72abe8 R11: R12:
Jan 30 00:49:14 stasis kernel: R13: 00030282 R14: 81012b763ea0 R15: 00020001
Jan 30 00:49:14 stasis kernel: FS: 2b04ef9bccc0() GS:81012b6b78c0() knlGS:f6dfdb90
Jan 30 00:49:14 stasis kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
Jan 30 00:49:14 stasis kernel: CR2: 2b8ba51cf000 CR3: 000114053000 CR4: 06e0
Jan 30 00:49:14 stasis kernel: DR0: DR1: DR2:
Jan 30 00:49:14 stasis kernel: DR3: DR6: 0ff0 DR7: 0400
Jan 30 00:49:14 stasis kernel:
Jan 30 00:49:14 stasis kernel: Call Trace:
Jan 30 00:49:14 stasis kernel: [80216193] mcheck_check_cpu+0x0/0x37
Jan 30 00:49:14 stasis kernel: [8023e28b] lock_timer_base+0x26/0x4c
Jan 30 00:49:14 stasis kernel: [80216193] mcheck_check_cpu+0x0/0x37
Jan 30 00:49:14 stasis kernel: [8021bdb8] smp_call_function_mask+0x5e/0x70
Jan 30 00:49:14 stasis kernel: [80215a31] mcheck_timer+0x0/0x7c
Jan 30 00:49:14 stasis kernel: [80216193] mcheck_check_cpu+0x0/0x37
Jan 30 00:49:14 stasis kernel: [8023a243] on_each_cpu+0x10/0x22
Jan 30 00:49:14 stasis kernel: [80215a4e] mcheck_timer+0x1d/0x7c
Jan 30 00:49:14 stasis kernel: [8027d1f0] vmstat_update+0x0/0x31
Jan 30 00:49:14 stasis kernel: [802448dd] run_workqueue+0x7f/0x10b
Jan 30 00:49:14 stasis kernel: [802451ef] worker_thread+0x0/0xe4
Jan 30 00:49:14 stasis kernel: [802452c9] worker_thread+0xda/0xe4
Jan 30 00:49:14 stasis kernel: [802481fe] autoremove_wake_function+0x0/0x2e
Jan 30 00:49:14 stasis kernel: [802480de] kthread+0x47/0x75
Jan 30 00:49:14 stasis kernel: [8020cc48] child_rip+0xa/0x12
Jan 30 00:49:14 stasis kernel: [80248097] kthread+0x0/0x75
Jan 30 00:49:14 stasis kernel: [8020cc3e] child_rip+0x0/0x12

This repeated 1703 times until the system became totally unresponsive with:

Jan 30 06:25:05 stasis kernel: BUG: soft lockup - CPU#3 stuck for 11s! [events/3:18]
Jan 30 06:25:05 stasis kernel: CPU 3:
Jan 30 06:25:05 stasis kernel: Modules linked in: isofs zlib_inflate ext2 tcp_diag inet_diag ppdev parport_pc lp parport autofs4 ipv6 tun nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc fuse dm_snapshot dm_mirror dm_mod it87 hwmon_vid eeprom loop snd_usb_audio snd_usb_lib snd_hwdep snd_seq_dummy snd_seq_oss snd_hda_intel snd_seq_midi snd_rawmidi snd_pcm_oss snd_mixer_oss snd_seq_midi_event rtc_cmos snd_seq snd_pcm rtc_core psmouse snd_timer floppy rtc_lib pwc snd_seq_device i2c_piix4 pcspkr snd_page_alloc serio_raw compat_ioctl32 videodev snd v4l2_common i2c_core v4l1_compat
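
For anyone wanting to reproduce a count like the "repeated 1703 times" figure from their own logs, here is a minimal sketch. It assumes syslog lines in the same format as the "BUG: soft lockup" messages quoted in this report; the function name `count_soft_lockups` and the regular expression are illustrative and not part of any Debian or kernel tool.

```python
import re
from collections import Counter

# Matches the kernel message format quoted in this report, e.g.
#   "BUG: soft lockup - CPU#3 stuck for 11s! [events/3:18]"
# The pattern is an assumption based on those lines only.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^\]:]+):(?P<pid>\d+)\]"
)


def count_soft_lockups(lines):
    """Return a dict mapping (cpu, task) to the number of lockup reports."""
    counts = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            counts[(int(match.group("cpu")), match.group("task"))] += 1
    return dict(counts)


# Example with two of the message lines quoted in this thread:
sample = [
    "Jan 30 00:49:14 stasis kernel: BUG: soft lockup - CPU#3 stuck for 11s! [events/3:18]",
    "Jan 30 00:49:14 stasis kernel: CPU 3:",
    "Mar 31 14:19:54 stasis kernel: [11166.917503] BUG: soft lockup - CPU#3 stuck for 61s! [events/3:18]",
]
print(count_soft_lockups(sample))
```

To run it against a real log, pass an open file object (for example the system's syslog file) instead of `sample`; the function accepts any iterable of lines.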