Bug#659169: Re: Bug#659169: [2.6.32] BUG: soft lockup - CPU#7 stuck for 17163091979s! [init:9709]
On -10/01/37 20:59, Ben Hutchings wrote:
> Version: 2.6.32-40
>
> On Wed, Feb 08, 2012 at 10:13:50PM +0100, Carlos Alberto Lopez Perez wrote:
>> Source: linux-2.6
>> Version: 2.6.32-5
>> Severity: normal
>>
>>
>> Hello,
>>
>> Today one of my servers stopped responding to some webs, ssh'ing into it was
>> impossible. The ping worked and the ssh connection was starting but the
>> shell didn't show up after waiting a long time. Finally a hard-reset was
>> needed in order to bring it back.
>>
>> After the reboot I found this in kern.log:
>>
>> # tail /var/log/kern.log
>> Feb 1 09:41:00 server-i7_920 kernel: [17383037.287331] EXT4-fs (dm-29):
>> mounted filesystem with ordered data mode
>> Feb 5 09:38:15 server-i7_920 kernel: [17727613.769052] NOHZ:
>> local_softirq_pending 100
>> Feb 7 05:56:07 server-i7_920 kernel: [17886689.887577] e1000e: eth0 NIC
>> Link is Down
>> Feb 7 05:59:05 server-i7_920 kernel: [17886867.166464] e1000e: eth0 NIC
>> Link is Up 1000 Mbps Full Duplex, Flow Control: None
>> Feb 7 05:59:41 server-i7_920 kernel: [17886903.722727] e1000e: eth0 NIC
>> Link is Down
>> Feb 7 06:00:00 server-i7_920 kernel: [17886922.309159] e1000e: eth0 NIC
>> Link is Up 1000 Mbps Full Duplex, Flow Control: None
>> Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876326] BUG: soft lockup
>> - CPU#7 stuck for 17163091979s! [init:9709]
> [...]
>
> This appears to be the bug fixed by 'sched, x86: Avoid unnecessary
> overflow in sched_clock', included in longterm update 2.6.32.50 and
> Debian package version 2.6.32-40. That bug would be triggered once
> the scheduler clock reached 18014398 seconds, which is a little after
> the last reasonable time seen in this log.
>
> Ben.
>

Wow! Really amazing, thanks for the reply.

I will be upgrading the kernel ASAP.

Regards!

-- 
~~~
Carlos Alberto Lopez Perez   http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
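The figure of 18014398 seconds in Ben's reply can be checked with a little arithmetic. The sketch below is only an illustration, not the kernel code: it assumes the pre-fix conversion multiplied a cycle count by a scale value and shifted right by 10 bits in unsigned 64-bit arithmetic, so the product would overflow once the result reached 2**54 ns. The 10-bit shift and all variable names are assumptions for this example. The two timestamps are copied from the log.

```python
# Rough check of the overflow threshold mentioned in the reply.
# ASSUMPTION: the conversion was (cycles * scale) >> 10 on unsigned 64-bit
# values, so the intermediate product overflows once the nanosecond result
# would reach 2**64 >> 10 == 2**54. The shift width is not stated in this
# thread; it is assumed here for illustration.

SCALE_SHIFT = 10           # assumed fixed-point shift
U64_RANGE = 1 << 64        # number of values in an unsigned 64-bit integer
NS_PER_S = 10**9
S_PER_DAY = 86400

overflow_ns = U64_RANGE >> SCALE_SHIFT      # 2**54 nanoseconds
overflow_s = overflow_ns / NS_PER_S
overflow_days = overflow_s / S_PER_DAY

last_sane_s = 17886922.309159    # last reasonable timestamp in the log
lockup_s = 18446744016.876326    # timestamp on the soft lockup line
wrap_s = U64_RANGE / NS_PER_S    # 2**64 ns expressed in seconds

print(f"overflow threshold: {overflow_s:.3f} s (~{overflow_days:.1f} days)")
print(f"margin after last sane timestamp: {overflow_s - last_sane_s:.0f} s")
print(f"lockup timestamp is {wrap_s - lockup_s:.1f} s below 2**64 ns")
```

Under these assumptions the threshold comes out at about 18014398.5 seconds, roughly 208.5 days of uptime, which matches the number quoted above and lies a day and a half past the last sane timestamp. The lockup timestamp itself sits just under 2**64 ns, which looks like a wrapped 64-bit value rather than a real uptime.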
Bug#659169: [2.6.32] BUG: soft lockup - CPU#7 stuck for 17163091979s! [init:9709]
Source: linux-2.6
Version: 2.6.32-5
Severity: normal

Hello,

Today one of my servers stopped responding to some webs, ssh'ing into it was impossible. The ping worked and the ssh connection was starting but the shell didn't show up after waiting a long time. Finally a hard-reset was needed in order to bring it back.

After the reboot I found this in kern.log:

# tail /var/log/kern.log
Feb 1 09:41:00 server-i7_920 kernel: [17383037.287331] EXT4-fs (dm-29): mounted filesystem with ordered data mode
Feb 5 09:38:15 server-i7_920 kernel: [17727613.769052] NOHZ: local_softirq_pending 100
Feb 7 05:56:07 server-i7_920 kernel: [17886689.887577] e1000e: eth0 NIC Link is Down
Feb 7 05:59:05 server-i7_920 kernel: [17886867.166464] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Feb 7 05:59:41 server-i7_920 kernel: [17886903.722727] e1000e: eth0 NIC Link is Down
Feb 7 06:00:00 server-i7_920 kernel: [17886922.309159] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876326] BUG: soft lockup - CPU#7 stuck for 17163091979s! [init:9709]
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876371] Modules linked in: btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs ext4 jbd2 crc16 ext2 ipt_LOG sg xt_limit xt_tcpudp xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables dummy loop snd_pcm snd_timer snd soundcore snd_page_alloc ioatdma i2c_i801 i2c_core pcspkr dca psmouse evdev button serio_raw processor ext3 jbd mbcache dm_mod sd_mod crc_t10dif ahci libata uhci_hcd ehci_hcd scsi_mod usbcore nls_base e1000e thermal thermal_sys [last unloaded: scsi_wait_scan]
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876405] CPU 7:
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876406] Modules linked in: btrfs zlib_deflate crc32c libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs ext4 jbd2 crc16 ext2 ipt_LOG sg xt_limit xt_tcpudp xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables dummy loop snd_pcm snd_timer snd soundcore snd_page_alloc ioatdma i2c_i801 i2c_core pcspkr dca psmouse evdev button serio_raw processor ext3 jbd mbcache dm_mod sd_mod crc_t10dif ahci libata uhci_hcd ehci_hcd scsi_mod usbcore nls_base e1000e thermal thermal_sys [last unloaded: scsi_wait_scan]
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876433] Pid: 9709, comm: init Not tainted 2.6.32-5-vserver-amd64 #1 X8STi
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876435] RIP: 0023:[] [ ] 0xf76c0430
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876440] RSP: 002b:ffa6673c EFLAGS: 0296
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876442] RAX: fff6 RBX: RCX: ffa6673c
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876443] RDX: f76c0430 RSI: 000f RDI:
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876445] RBP: 8101166e R08: R09:
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876447] R10: R11: R12:
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876448] R13: R14: R15:
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876451] FS: () GS:880016bc(0063) knlGS:f74a96c0
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876453] CS: 0010 DS: 002b ES: 002b CR0: 8005003b
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876454] CR2: 7fa1566220a0 CR3: 000514e5d000 CR4: 06e0
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876456] DR0: DR1: DR2:
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876458] DR3: DR6: 0ff0 DR7: 0400
Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876460] Call Trace:
Feb 8 18:20:04 server-i7_920 kernel: [ 3003.130843] INFO: task cron:8748 blocked for more than 120 seconds.
Feb 8 18:20:04 server-i7_920 kernel: [ 3003.130874] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 8 18:20:04 server-i7_920 kernel: [ 3003.130919] cron D 0 8748 25441 0x0002
Feb 8 18:20:04 server-i7_920 kernel: [ 3003.130923] 88063ccf3250 0086
Feb 8 18:20:04 server-i7_920 kernel: [ 3003.130927] 88055c60c7e0 00d0 f9e0 88045d2bdfd8
Feb 8 18:20:04 server-i7_920 kernel: [ 3003.130931] 00015780 00015780 88055c60c7e0 88055c60cad8
Feb 8 18:20:04 server-i7_920 kernel: [ 3003.130934] Call