[Kernel-packages] [Bug 1449910] Re: System hangs apparently randomly when disconnecting iScsi volumes
It's been a full year now without any problems. The 4.1 kernel definitely works well (as does 4.0.1, which I'm running on one of the machines). Tomorrow I will start upgrading these machines to the latest LTS release (16.04.1), which runs the 4.4 kernel; I trust it will run well too.

Is there any news about this bug?

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1449910

Title:
  System hangs apparently randomly when disconnecting iScsi volumes

Status in linux package in Ubuntu:
  Triaged

Bug description:
  We have several servers here which run Ubuntu LTS Trusty Tahr 14.04. We use LVM snapshots to do backups during the night. Randomly (at a rate of one event every 10-12 days) the LVM snapshot creation runs into a problem. The server hangs and must be hard rebooted. No shell, not even any screen output, just a black screen, no ping, nothing. The Magic SysRq key doesn't help either. No logs are written (I also tried redirecting logs to another machine, just in case).

  This happens with different hardware and different backup software. The only constant is LVM snapshots. The kernel is 3.13.0-49.

  We tried the 14.10 kernel (3.16.0-34) and had the same hangs, but this time something was logged:

  Apr 21 23:02:56 server-name kernel: [654840.108023] INFO: task kswapd0:50 blocked for more than 120 seconds.
  Apr 21 23:02:56 server-name kernel: [654840.108145] Not tainted 3.16.0-34-generic #47~14.04.1-Ubuntu
  Apr 21 23:02:56 server-name kernel: [654840.108245] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  Apr 21 23:02:56 server-name kernel: [654840.108361] kswapd0 D 88007fc130c0 050 2 0x
  Apr 21 23:02:56 server-name kernel: [654840.108367] 880077bcf998 0046 880077bd 880077bcffd8
  Apr 21 23:02:56 server-name kernel: [654840.108372] 000130c0 000130c0 88007bdef010 c90010d3c000
  Apr 21 23:02:56 server-name kernel: [654840.108377] c90010d3c0d8 c90010d3c000
  Apr 21 23:02:56 server-name kernel: [654840.108382] Call Trace: /0x70
  Apr 21 23:02:56 server-name kernel: [654840.108419] [] reiserfs_wait_on_write_block+0x4d/0x80 [reiserfs]
  Apr 21 23:02:56 server-name kernel: [654840.108426] [] ? prepare_to_wait_event+0x100/0x100
  Apr 21 23:02:56 server-name kernel: [654840.108437] [] do_journal_begin_r+0xe1/0x3e0 [reiserfs]
  Apr 21 23:02:56 server-name kernel: [654840.108443] [] ? __enqueue_entity+0x78/0x80
  Apr 21 23:02:56 server-name kernel: [654840.108454] [] journal_begin+0x8a/0x160 [reiserfs]
  Apr 21 23:02:56 server-name kernel: [654840.108464] [] reiserfs_release_dquot+0x4c/0xd0 [reiserfs]
  Apr 21 23:02:56 server-name kernel: [654840.108470] [] ? __percpu_counter_add+0x51/0x70
  Apr 21 23:02:56 server-name kernel: [654840.108476] [] dqput+0x9d/0x200
  Apr 21 23:02:56 server-name kernel: [654840.108480] [] __dquot_drop+0x5d/0x70
  Apr 21 23:02:56 server-name kernel: [654840.108485] [] dquot_drop+0x2d/0x40
  Apr 21 23:02:56 server-name kernel: [654840.108494] [] reiserfs_evict_inode+0xa0/0x180 [reiserfs]
  Apr 21 23:02:56 server-name kernel: [654840.108501] [] ? inode_wait_for_writeback+0x2e/0x40
  Apr 21 23:02:56 server-name kernel: [654840.108507] [] evict+0xb4/0x180
  Apr 21 23:02:56 server-name kernel: [654840.108511] [] dispose_list+0x39/0x50
  Apr 21 23:02:56 server-name kernel: [654840.108515] [] prune_icache_sb+0x47/0x60
  Apr 21 23:02:56 server-name kernel: [654840.108520] [] super_cache_scan+0x105/0x170
  Apr 21 23:02:56 server-name kernel: [654840.108526] [] shrink_slab_node+0x138/0x290
  Apr 21 23:02:56 server-name kernel: [654840.108532] [] ? css_next_descendant_pre+0x3b/0x40
  Apr 21 23:02:56 server-name kernel: [654840.108536] [] shrink_slab+0x8b/0x160
  Apr 21 23:02:56 server-name kernel: [654840.108540] [] balance_pgdat+0x3f2/0x620
  Apr 21 23:02:56 server-name kernel: [654840.108544] [] kswapd+0x15b/0x3f0
  Apr 21 23:02:56 server-name kernel: [654840.108549] [] ? prepare_to_wait_event+0x100/0x100
  Apr 21 23:02:56 server-name kernel: [654840.108552] [] ? balance_pgdat+0x620/0x620
  Apr 21 23:02:56 server-name kernel: [654840.108558] [] kthread+0xd2/0xf0
  Apr 21 23:02:56 server-name kernel: [654840.108562] [] ? kthread_create_on_node+0x1c0/0x1c0
  Apr 21 23:02:56 server-name kernel: [654840.108567] [] ret_from_fork+0x7c/0xb0
  Apr 21 23:02:56 server-name kernel: [654840.108571] [] ? kthread_create_on_node+0x1c0/0x1c0
  Apr 21 23:02:56 server-name kernel: [654840.108595] INFO: task kworker/1:1:25606 blocked for more than 120 seconds.
  Apr 21 23:02:56 server-name kernel: [654840.108709] Not tainted 3.16.0-34-generic #47~14.04.1-Ubuntu
  Apr 21 23:02:56 server-name kernel: [654840.108809] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  Apr 21 23:02:56 server-name kernel: [654840.108923] kworker/1:1 D 88007fc530c0 0 25606 2 0x
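The hung-task reports in the log above follow a fixed pattern ("INFO: task NAME:PID blocked for more than N seconds."), so they can be pulled out of a saved log mechanically. The following is only a minimal sketch: the function name `find_hung_tasks` and the regular expression are illustrative assumptions, not part of any existing tool, and the pattern was written against the two header lines quoted in this report only.

```python
import re

# Assumed pattern, derived only from the two hung-task headers quoted in
# this bug report, e.g.:
#   "INFO: task kswapd0:50 blocked for more than 120 seconds."
# The task name may itself contain ':' (as in "kworker/1:1"), so the PID is
# taken to be the last ':'-separated number before " blocked".
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (task_name, pid, seconds) for each hung-task header."""
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "Apr 21 23:02:56 server-name kernel: [654840.108023] "
        "INFO: task kswapd0:50 blocked for more than 120 seconds.\n"
        "Apr 21 23:02:56 server-name kernel: [654840.108595] "
        "INFO: task kworker/1:1:25606 blocked for more than 120 seconds.\n"
    )
    for name, pid, secs in find_hung_tasks(sample):
        print(name, pid, secs)
```

Run over logs collected from several servers, a listing like this makes it easier to see whether the same kernel threads are blocked each time a hang is logged.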
[Kernel-packages] [Bug 1449910] Re: System hangs apparently randomly when disconnecting iScsi volumes
After another month of testing, the mainline kernel has never failed. So, is this bug fixed in some released version of the kernel?
[Kernel-packages] [Bug 1449910] Re: System hangs apparently randomly when disconnecting iScsi volumes
After I installed the mainline kernel on the remaining machine, the problem did not appear for more than a month. Unfortunately, I then had to stop testing because of other, unrelated hardware problems. I plan to start again in the next few days, but I still think the problem is solved in the mainline kernel.

So, what can we do now? Will the new release kernel be free of this bug?
[Kernel-packages] [Bug 1449910] Re: System hangs apparently randomly when disconnecting iScsi volumes
Two nights ago something happened. I have a fifth server which doesn't use LVM snapshots but is still backed up every night. I hadn't installed the mainline kernel on this machine, because we thought the problem was due to snapshots. Two nights ago it crashed in the same way the other machines crashed, without using snapshots. It crashed after finishing the backup. I think the problem lies in the disconnection of the iSCSI volume to which the machines write their backups.

On the first reboot, I had the following in the kernel log:

[ 209.312097] [ cut here ]
[ 209.312106] WARNING: CPU: 0 PID: 1808 at /build/buildd/linux-3.13.0/drivers/pci/pci.c:1444 pci_disable_device+0x9c/0xb0()
[ 209.312108] ipmi_si :01:04.6: disabling already-disabled device
[ 209.312110] Modules linked in: ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sit tunnel4 ip_tunnel dm_crypt gpio_ich coretemp kvm joydev serio_raw hpilo lpc_ich ipmi_si(-) i3200_edac shpchp edac_core mac_hid lp parport reiserfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid0 multipath linear raid1 hid_generic radeon i2c_algo_bit ttm drm_kms_helper psmouse drm pata_acpi tg3 usbhid hid ptp pps_core
[ 209.312151] CPU: 0 PID: 1808 Comm: modprobe Not tainted 3.13.0-53-generic #89-Ubuntu
[ 209.312153] Hardware name: HP ProLiant DL320 G5p, BIOS W05 04/03/2008
[ 209.312154] 0009 88007abb3d40 81722e1e 88007abb3d88
[ 209.312158] 88007abb3d78 810677fd 88007c311000 88007c2c5580
[ 209.312161] 88007c311000 7f459c5473f0 7ffd2b319338 88007abb3dd8
[ 209.312164] Call Trace:
[ 209.312170] [81722e1e] dump_stack+0x45/0x56
[ 209.312175] [810677fd] warn_slowpath_common+0x7d/0xa0
[ 209.312177] [8106786c] warn_slowpath_fmt+0x4c/0x50
[ 209.312182] [811a259d] ? kfree+0xfd/0x140
[ 209.312186] [813a9c7c] pci_disable_device+0x9c/0xb0
[ 209.312192] [a0398059] ipmi_pci_remove+0x29/0x30 [ipmi_si]
[ 209.312195] [813ac68b] pci_device_remove+0x3b/0xb0
[ 209.312200] [81498c3f] __device_release_driver+0x7f/0xf0
[ 209.312203] [81499608] driver_detach+0xb8/0xc0
[ 209.312207] [81498875] bus_remove_driver+0x55/0xd0
[ 209.312210] [81499c7c] driver_unregister+0x2c/0x50
[ 209.312213] [813ab179] pci_unregister_driver+0x29/0x90
[ 209.312218] [a03984c4] cleanup_ipmi_si+0xd4/0xf0 [ipmi_si]
[ 209.31] [810e05d2] SyS_delete_module+0x162/0x200
[ 209.312227] [81013ed7] ? do_notify_resume+0x97/0xb0
[ 209.312231] [8173391d] system_call_fastpath+0x1a/0x1f
[ 209.312233] ---[ end trace f6143eeb3c0e8dba ]---

I don't know if this is related; in any case, it happened only on this particular reboot. I have now installed the mainline kernel on this machine as well.

I changed the title of this bug to reflect the additional information I got.

** Summary changed:

- System hangs apparently randomly when creating LVM snapshots
+ System hangs apparently randomly when disconnecting iScsi volumes
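When comparing reboots across machines, it can help to reduce a WARNING header like the one in the kernel log above to its fields (CPU, PID, source file, line, function, offset). This is a rough sketch under stated assumptions: the name `parse_warning` and the regular expression are hypothetical and were written against the single WARNING line quoted in this message, so other kernel versions may format the line differently.

```python
import re

# Assumed layout, taken from the one WARNING line quoted in this report:
#   WARNING: CPU: 0 PID: 1808 at <file>:<line> <function>+0x<off>/0x<size>()
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_warning(log_line):
    """Return a dict of fields from a WARNING header, or None if no match."""
    m = WARNING_RE.search(log_line)
    if m is None:
        return None
    return {
        "cpu": int(m.group("cpu")),
        "pid": int(m.group("pid")),
        "file": m.group("file"),
        "line": int(m.group("line")),
        "function": m.group("func"),
        # Offsets are printed in hexadecimal by the kernel.
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
    }


if __name__ == "__main__":
    sample = (
        "[ 209.312106] WARNING: CPU: 0 PID: 1808 at "
        "/build/buildd/linux-3.13.0/drivers/pci/pci.c:1444 "
        "pci_disable_device+0x9c/0xb0()"
    )
    print(parse_warning(sample))
```

Grouping the parsed headers by function and source line would show quickly whether this warning recurs on later reboots or was a one-off.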