[Bug 1767992] Re: Linux md raid-10 freezes during resync
Hi Damir (or anyone else affected by this that may be reading the bug), If you get a chance, could you please test the kernel that is in -proposed to verify that it resolves the issue for you? There are instructions in comment #12 for verifying this for Bionic or if you're using Xenial, the instructions are in comment #13. Thanks, Connor -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1767992 Title: Linux md raid-10 freezes during resync To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1767992/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1767992] Re: Linux md raid-10 freezes during resync
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'. If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you! ** Tags added: verification-needed-xenial
[Bug 1767992] Re: Linux md raid-10 freezes during resync
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'. If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you! ** Tags added: verification-needed-bionic
[Bug 1767992] Re: Linux md raid-10 freezes during resync
** Tags added: cscc
[Bug 1767992] Re: Linux md raid-10 freezes during resync
** Changed in: linux (Ubuntu Bionic) Status: In Progress => Fix Committed
[Bug 1767992] Re: Linux md raid-10 freezes during resync
Thank you for testing that and providing a test case, Damir! I've gone ahead and submitted that patch to the mailing list: https://lists.ubuntu.com/archives/kernel-team/2019-July/102199.html

** Description changed:

+ [Impact]
+
+  * If regular and resync IO happen at the same time during a regular IO
+    split, the split bio will wait until resync IO finishes while at the
+    same time the resync IO is waiting for regular IO to finish. This
+    results in deadlock.
+
+  * I believe this only impacts Bionic as Disco+ already contains this
+    commit. Xenial doesn't contain the commit that this one fixes.
+
+ [Test Case]
+
+ The test kernel containing this commit received positive feedback in the
+ launchpad bug.
+
+ From the launchpad bug comment #10:
+
+ "For reproduce on 4.15.0-50-generic: Make new raid-10, add some io fio/dd,
+ unpack anaconda archives, after minute or two deadlocked"
+
+ [Regression Potential]
+
+  * This fix has been in mainline since December 2018 and I don't see any
+    fixup commits upstream referencing this one. The small number of
+    changes in this commit seem reasonable for managing the `nr_pending`
+    adjustments which preclude either regular or resync IO.
+
+
+ Original bug description follows:
+ -
+
  I'm trying to setup a few nodes with software raid-10. When array is created and resync is running i'm trying to install a few packages and frequently system stops responding, resync process stops, and I'm getting following errors in the kernel log. This looks like a deadlock for me. I had this problem in both 18.04 and 16.04. Reboot is the only way to fix the node.

[ 2659.317256] INFO: task kworker/u24:13:343 blocked for more than 120 seconds. [ 2659.317313] Not tainted 4.15.0-20-generic #21-Ubuntu [ 2659.317350] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 2659.317401] kworker/u24:13 D0 343 2 0x8000 [ 2659.317414] Workqueue: writeback wb_workfn (flush-9:1) [ 2659.317417] Call Trace: [ 2659.317430] __schedule+0x297/0x8b0 [ 2659.317435] schedule+0x2c/0x80 [ 2659.317443] wait_barrier+0x146/0x1a0 [raid10] [ 2659.317449] ? wait_woken+0x80/0x80 [ 2659.317454] raid10_write_request+0x77/0x950 [raid10] [ 2659.317459] ? r10bio_pool_alloc+0x24/0x30 [raid10] [ 2659.317465] ? mempool_alloc+0x71/0x190 [ 2659.317469] ? ___slab_alloc+0x20a/0x4b0 [ 2659.317475] ? md_write_start+0xc8/0x200 [ 2659.317480] ? mempool_alloc_slab+0x15/0x20 [ 2659.317484] raid10_make_request+0xcc/0x140 [raid10] [ 2659.317489] md_handle_request+0x126/0x1a0 [ 2659.317494] md_make_request+0x6b/0x150 [ 2659.317501] generic_make_request+0x124/0x300 [ 2659.317506] submit_bio+0x73/0x150 [ 2659.317510] ? submit_bio+0x73/0x150 [ 2659.317579] xfs_submit_ioend+0x87/0x1c0 [xfs] [ 2659.317626] xfs_do_writepage+0x377/0x6a0 [xfs] [ 2659.317632] write_cache_pages+0x20c/0x4e0 [ 2659.317674] ? xfs_vm_writepages+0xf0/0xf0 [xfs] [ 2659.317682] ? intel_pstate_update_pstate+0x40/0x40 [ 2659.317687] ? update_load_avg+0x5c5/0x6e0 [ 2659.317727] xfs_vm_writepages+0xbe/0xf0 [xfs] [ 2659.317732] do_writepages+0x4b/0xe0 [ 2659.317738] ? check_preempt_curr+0x83/0x90 [ 2659.317742] ? ttwu_do_wakeup+0x1e/0x150 [ 2659.317746] __writeback_single_inode+0x45/0x340 [ 2659.317749] ? __writeback_single_inode+0x45/0x340 [ 2659.317752] writeback_sb_inodes+0x1e1/0x510 [ 2659.317756] __writeback_inodes_wb+0x67/0xb0 [ 2659.317759] wb_writeback+0x271/0x300 [ 2659.317764] wb_workfn+0x180/0x410 [ 2659.317766] ? wb_workfn+0x180/0x410 [ 2659.317773] process_one_work+0x1de/0x410 [ 2659.317776] worker_thread+0x32/0x410 [ 2659.317781] kthread+0x121/0x140 [ 2659.317784] ? process_one_work+0x410/0x410 [ 2659.317788] ? kthread_create_worker_on_cpu+0x70/0x70 [ 2659.317793] ? do_syscall_64+0x73/0x130 [ 2659.317797] ? 
SyS_exit_group+0x14/0x20 [ 2659.317801] ret_from_fork+0x35/0x40 [ 2659.317806] INFO: task md1_resync:429 blocked for more than 120 seconds. [ 2659.317853] Not tainted 4.15.0-20-generic #21-Ubuntu [ 2659.317889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 2659.317940] md1_resync D0 429 2 0x8000 [ 2659.317943] Call Trace: [ 2659.317949] __schedule+0x297/0x8b0 [ 2659.317954] schedule+0x2c/0x80 [ 2659.317959] raise_barrier+0xa1/0x1a0 [raid10] [ 2659.317963] ? wait_woken+0x80/0x80 [ 2659.317968] raid10_sync_request+0x205/0x1f10 [raid10] [ 2659.317975] ? find_next_bit+0xb/0x10 [ 2659.317980] ? cpumask_next+0x1b/0x20 [ 2659.317985] ? is_mddev_idle+0x92/0xf4 [ 2659.317990] md_do_sync+0x8ca/0xf10 [ 2659.317994] ? wait_woken+0x80/0x80 [ 2659.318000] md_thread+0x129/0x170 [ 2659.318004] ? mddev_put+0x140/0x140 [ 2659.318007] ? md_thread+0x129/0x170 [ 2659.318012] kthread+0x121/0x140 [ 2659.318015] ?
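As a rough illustration of the circular wait described under [Impact] above, here is a toy model in Python. It is not the kernel code: `Raid10State`, `simulate`, and the `release_before_split` flag are hypothetical names, and the flag only stands in for the kind of `nr_pending` adjustment the description mentions. The model assumes that regular IO may start only while no barrier is raised, and that resync may proceed only once no regular IO is pending.

```python
from dataclasses import dataclass


@dataclass
class Raid10State:
    """Toy stand-in for the two counters named in the bug description."""
    barrier: int = 0     # raised by resync to hold back new regular IO
    nr_pending: int = 0  # regular IO requests currently in flight


def regular_io_can_start(state: Raid10State) -> bool:
    # Assumption: regular IO waits while a resync barrier is raised.
    return state.barrier == 0


def resync_can_proceed(state: Raid10State) -> bool:
    # Assumption: resync waits until no regular IO is pending.
    return state.nr_pending == 0


def simulate(release_before_split: bool) -> bool:
    """Return True if both sides end up waiting on each other."""
    state = Raid10State()

    # 1. A regular bio passes the barrier check and is counted as pending.
    assert regular_io_can_start(state)
    state.nr_pending += 1

    # 2. Resync raises the barrier and now needs nr_pending to reach 0.
    state.barrier += 1

    # 3. The regular bio is split; the remainder must pass the barrier again.
    #    Hypothetical fix: drop the pending count before resubmitting.
    if release_before_split:
        state.nr_pending -= 1

    split_blocked = not regular_io_can_start(state)
    resync_blocked = not resync_can_proceed(state)
    return split_blocked and resync_blocked


if __name__ == "__main__":
    print("without release:", simulate(release_before_split=False))
    print("with release:   ", simulate(release_before_split=True))
```

In this model the split remainder and the resync thread block each other unless the pending count is released first, which matches the shape of the two hung tasks in the trace (`wait_barrier` on one side, `raise_barrier` on the other).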
[Bug 1767992] Re: Linux md raid-10 freezes during resync
The kernel from https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1767992/comments/8 solved my issue. To reproduce on 4.15.0-50-generic: create a new raid-10, add some I/O with fio/dd, and unpack anaconda archives (lol); after a minute or two it deadlocks.

[Mon Jul 15 19:31:11 2019] Workqueue: md md_submit_flush_data
[Mon Jul 15 19:31:11 2019] Call Trace:
[Mon Jul 15 19:31:11 2019] __schedule+0x291/0x8a0
[Mon Jul 15 19:31:11 2019] ? __switch_to_asm+0x34/0x70
[Mon Jul 15 19:31:11 2019] ? __switch_to_asm+0x40/0x70
[Mon Jul 15 19:31:11 2019] schedule+0x2c/0x80
[Mon Jul 15 19:31:11 2019] wait_barrier+0x146/0x1a0 [raid10]
[Mon Jul 15 19:31:11 2019] ? wait_woken+0x80/0x80
[Mon Jul 15 19:31:11 2019] raid10_write_request+0x77/0x950 [raid10]
[Mon Jul 15 19:31:11 2019] ? r10bio_pool_alloc+0x24/0x30 [raid10]
[Mon Jul 15 19:31:11 2019] ? mempool_alloc+0x71/0x190
[Mon Jul 15 19:31:11 2019] ? md_write_start+0xf4/0x210
[Mon Jul 15 19:31:11 2019] ? default_wake_function+0x12/0x20
[Mon Jul 15 19:31:11 2019] ? autoremove_wake_function+0x12/0x40
[Mon Jul 15 19:31:11 2019] raid10_make_request+0xcc/0x140 [raid10]
[Mon Jul 15 19:31:11 2019] md_handle_request+0x126/0x1a0
[Mon Jul 15 19:31:11 2019] md_submit_flush_data+0x54/0x70
[Mon Jul 15 19:31:11 2019] process_one_work+0x1de/0x410
[Mon Jul 15 19:31:11 2019] worker_thread+0x32/0x410
[Mon Jul 15 19:31:11 2019] kthread+0x121/0x140
[Mon Jul 15 19:31:11 2019] ? process_one_work+0x410/0x410
[Mon Jul 15 19:31:11 2019] ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 15 19:31:11 2019] ret_from_fork+0x35/0x40
[Bug 1767992] Re: Linux md raid-10 freezes during resync
Hi everyone, I was just curious if anyone's had a chance to test out Kai-Heng's test kernel from comment #8?
[Bug 1767992] Re: Linux md raid-10 freezes during resync
** Also affects: linux (Ubuntu Bionic) Importance: Undecided Status: New
** Changed in: linux (Ubuntu Bionic) Status: New => In Progress
** Changed in: linux (Ubuntu Bionic) Importance: Undecided => Medium
** Changed in: linux (Ubuntu Bionic) Assignee: (unassigned) => Connor Kuehl (connork)
[Bug 1767992] Re: Linux md raid-10 freezes during resync
Please test the kernel: https://people.canonical.com/~khfeng/lp1767992/
[Bug 1767992] Re: Linux md raid-10 freezes during resync
** Package changed: ubuntu => linux (Ubuntu)
[Bug 1767992] Re: Linux md raid-10 freezes during resync
This is fixed upstream in 4.19.21: https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.19.21 The patch is at https://lkml.org/lkml/2019/2/11/1281 Maybe it could be backported?
[Bug 1767992] Re: Linux md raid-10 freezes during resync
We had the same issue after updating from Ubuntu 14.04 to 16.04 and then 18.04 (the two consecutive updates were applied on the same day). The filesystem is also XFS, and the setup uses raid10. The bug does not completely freeze the system. Many processes still ran and were able to write to disk, but new processes seemed to have trouble doing so. Some processes that did not write to disk also halted, for example htop and ps aux. Interestingly, "ps aux" actually started printing its output, halted around the middle of execution, and could not be terminated with Ctrl+C. htop, which hung a few seconds after starting, showed that the frozen processes were in D (uninterruptible sleep) state. The fix was to downgrade from kernel 4.15.0-34-generic to 4.2.0-34-generic. The fix was not perfect, though: the hang has happened once more since then (but once in 130 days is not as bad). Attached logs: kernel.log with some stack traces, but they are more or less the same as those previously posted. (The logs were OCR'ed from screenshots, so beware of some characters, like the 0, which sometimes turns into a 6.) ** Attachment added: "kernel errors" https://bugs.launchpad.net/ubuntu/+bug/1767992/+attachment/5237094/+files/errors.txt
[Bug 1767992] Re: Linux md raid-10 freezes during resync
I can confirm this affected me as well. My RAID-10 is on a separate filesystem from the operating system's SSD -- I have the RAID-10 mounted at '/raid10' on my server. My application is installed on the non-mdadm SSD at '/opt', but it reads/writes/executes files from the '/raid10' filesystem (made up of 6x8TB drives with 1 spare). During a recovery from a failed drive, it would periodically run into the same issue mentioned here while performing write operations; it was less likely to occur during read operations, but it still did happen. I noticed I'd have to reboot about every 30-60min due to this hang-up, which stopped the rebuild from continuing -- once the server was back online, it would continue. However, I got fed up with this because the rebuild/resync is supposed to take approximately 9ish hours and it was only 29% complete, so on the last reboot I stopped my application and unmounted the RAID-10 (umount /raid10). Once I did that, it continued rebuilding through the night without issue and completed the remaining 71%. ... So it seems like you shouldn't interact with the RAID during a rebuild with the latest kernel/xfs/mdadm on Ubuntu 18.04. Here are the current versions I'm running:
==
root@server:~# uname -r
4.15.0-36-generic
root@server:~# dpkg -l | awk '/mdadm/ || /xfsprog/ {print $2,$3}' | column -t
mdadm 4.1~rc1-3~ubuntu18.04.1
xfsprogs 4.9.0+nmu1ubuntu2
==
Previously, with Ubuntu 16.04, the 4.4 kernel, and the latest mdadm/xfsprogs for 16.04, I didn't have this issue.
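For anyone who wants to watch whether a rebuild is still advancing or has stalled at a fixed percentage, as in the report above, a small Python sketch follows. It assumes the usual progress line format of /proc/mdstat ("resync = 29.0% ..."); the function name `sync_progress` and the sample text are illustrative only and were not taken from an affected machine.

```python
import re
from typing import Optional, Tuple

# Matches progress lines such as "resync = 29.0%" or "recovery =  8.5%".
_PROGRESS = re.compile(r"(resync|recovery|check|reshape)\s*=\s*([0-9.]+)%")


def sync_progress(mdstat_text: str) -> Optional[Tuple[str, float]]:
    """Return (operation, percent) for the first progress line, or None."""
    match = _PROGRESS.search(mdstat_text)
    if match is None:
        return None
    return match.group(1), float(match.group(2))


# Made-up sample in the style of /proc/mdstat, for demonstration only.
SAMPLE = """\
md1 : active raid10 sdb1[0] sdc1[1] sdd1[2] sde1[3]
      1953260544 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [=====>...............]  resync = 29.0% (566445568/1953260544) finish=120.5min speed=191793K/sec
"""

if __name__ == "__main__":
    print(sync_progress(SAMPLE))
```

On a live system the text would come from reading /proc/mdstat; polling it periodically and comparing successive percentages would show whether the resync has stopped moving.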
[Bug 1767992] Re: Linux md raid-10 freezes during resync
This issue is gone after we migrated from XFS to EXT4, so it is definitely XFS-related.
[Bug 1767992] Re: Linux md raid-10 freezes during resync
Same problem here, reproducible on Ubuntu 18.04 LTS: SATA RAID 10 with an XFS filesystem. On each reboot the RAID does a resync. Any operation on the RAID before the resync is finished causes the process to hang. It might also be associated with another issue that causes the resync on every boot.
[Bug 1767992] Re: Linux md raid-10 freezes during resync
Same here with md raid10 (not sure if it's important, but the filesystem is XFS); it hung 3 times in the last 2 days. It seems that a combination of md raid10 check + I/O (maybe XFS-specific, I dunno, but both the original poster and we seem to use XFS) frequently hangs on kernels that are newer than Ubuntu 4.10.0-42.46~16.04.1-generic 4.10.17 (yes, I know it's a wide range, but everything started happening after we rebooted this machine, which upgraded us from Ubuntu 4.10.0-42.46~16.04.1-generic 4.10.17 to Ubuntu 4.13.0-43.48~16.04.1-generic 4.13.16; the check was scheduled some time later, so we didn't catch it immediately). The logs look very similar: Jun 13 19:15:42 pisces kernel: [27430.370899] INFO: task md7_resync:12982 blocked for more than 120 seconds. Jun 13 19:15:42 pisces kernel: [27430.370940] Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu Jun 13 19:15:42 pisces kernel: [27430.370966] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jun 13 19:15:42 pisces kernel: [27430.370997] md7_resync D0 12982 2 0x8000 Jun 13 19:15:42 pisces kernel: [27430.371000] Call Trace: Jun 13 19:15:42 pisces kernel: [27430.371012] __schedule+0x3d6/0x8b0 Jun 13 19:15:42 pisces kernel: [27430.371014] schedule+0x36/0x80 Jun 13 19:15:42 pisces kernel: [27430.371020] raise_barrier+0xd2/0x1a0 [raid10] Jun 13 19:15:42 pisces kernel: [27430.371024] ? wait_woken+0x80/0x80 Jun 13 19:15:42 pisces kernel: [27430.371027] raid10_sync_request+0x9bd/0x1b10 [raid10] Jun 13 19:15:42 pisces kernel: [27430.371031] ? pick_next_task_fair+0x449/0x570 Jun 13 19:15:42 pisces kernel: [27430.371035] ? __switch_to+0xb2/0x540 Jun 13 19:15:42 pisces kernel: [27430.371041] ? find_next_bit+0xb/0x10 Jun 13 19:15:42 pisces kernel: [27430.371046] ? is_mddev_idle+0xa1/0x101 Jun 13 19:15:42 pisces kernel: [27430.371048] md_do_sync+0xb81/0xfb0 Jun 13 19:15:42 pisces kernel: [27430.371050] ? 
wait_woken+0x80/0x80 Jun 13 19:15:42 pisces kernel: [27430.371054] md_thread+0x133/0x180 Jun 13 19:15:42 pisces kernel: [27430.371055] ? md_thread+0x133/0x180 Jun 13 19:15:42 pisces kernel: [27430.371060] kthread+0x10c/0x140 Jun 13 19:15:42 pisces kernel: [27430.371062] ? state_show+0x320/0x320 Jun 13 19:15:42 pisces kernel: [27430.371064] ? kthread_create_on_node+0x70/0x70 Jun 13 19:15:42 pisces kernel: [27430.371067] ret_from_fork+0x35/0x40 Jun 13 19:15:42 pisces kernel: [27430.371181] INFO: task kworker/20:1:27873 blocked for more than 120 seconds. Jun 13 19:15:42 pisces kernel: [27430.371210] Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu Jun 13 19:15:42 pisces kernel: [27430.371235] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Jun 13 19:15:42 pisces kernel: [27430.371267] kworker/20:1D0 27873 2 0x8000 Jun 13 19:15:42 pisces kernel: [27430.371333] Workqueue: xfs-sync/md7 xfs_log_worker [xfs] Jun 13 19:15:42 pisces kernel: [27430.371334] Call Trace: Jun 13 19:15:42 pisces kernel: [27430.371338] __schedule+0x3d6/0x8b0 Jun 13 19:15:42 pisces kernel: [27430.371340] schedule+0x36/0x80 Jun 13 19:15:42 pisces kernel: [27430.371342] schedule_timeout+0x1f3/0x360 Jun 13 19:15:42 pisces kernel: [27430.371347] ? scsi_init_rq+0x84/0x100 Jun 13 19:15:42 pisces kernel: [27430.371349] wait_for_completion+0xb4/0x140 Jun 13 19:15:42 pisces kernel: [27430.371351] ? wait_for_completion+0xb4/0x140 Jun 13 19:15:42 pisces kernel: [27430.371356] ? wake_up_q+0x70/0x70 Jun 13 19:15:42 pisces kernel: [27430.371360] flush_work+0x129/0x1e0 Jun 13 19:15:42 pisces kernel: [27430.371363] ? worker_detach_from_pool+0xb0/0xb0 Jun 13 19:15:42 pisces kernel: [27430.371397] xlog_cil_force_lsn+0x8b/0x220 [xfs] Jun 13 19:15:42 pisces kernel: [27430.371400] ? update_curr+0x138/0x1d0 Jun 13 19:15:42 pisces kernel: [27430.371433] _xfs_log_force+0x85/0x290 [xfs] Jun 13 19:15:42 pisces kernel: [27430.371436] ? 
pick_next_task_fair+0x131/0x570 Jun 13 19:15:42 pisces kernel: [27430.371438] ? __switch_to+0xb2/0x540 Jun 13 19:15:42 pisces kernel: [27430.371471] ? xfs_log_worker+0x36/0x100 [xfs] Jun 13 19:15:42 pisces kernel: [27430.371505] xfs_log_force+0x2c/0x80 [xfs] Jun 13 19:15:42 pisces kernel: [27430.371538] xfs_log_worker+0x36/0x100 [xfs] Jun 13 19:15:42 pisces kernel: [27430.371541] process_one_work+0x15b/0x410 Jun 13 19:15:42 pisces kernel: [27430.371544] worker_thread+0x4b/0x460 Jun 13 19:15:42 pisces kernel: [27430.371546] kthread+0x10c/0x140 Jun 13 19:15:42 pisces kernel: [27430.371548] ? process_one_work+0x410/0x410 Jun 13 19:15:42 pisces kernel: [27430.371550] ? kthread_create_on_node+0x70/0x70 Jun 13 19:15:42 pisces kernel: [27430.371552] ret_from_fork+0x35/0x40 Jun 13 19:15:42 pisces kernel: [27430.371557] INFO: task kworker/20:0:4504 blocked for more than 120 seconds. Jun 13 19:15:42 pisces kernel: [27430.371587] Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu Jun 13 19:15:42 pisces kernel: [27430.371611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
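To compare reports like the ones in this thread, a short Python sketch can pull out each hung task and note whether its trace sits in one of the raid10 barrier functions. The function name `barrier_waiters` is hypothetical, and the sample lines are a shortened excerpt of the log above; the script only does text matching and makes no claim about root cause.

```python
import re
from typing import Dict, Optional

# Header printed by the hung-task detector, e.g.
# "INFO: task md7_resync:12982 blocked for more than 120 seconds."
_HUNG = re.compile(r"INFO: task (\S+):\d+ blocked for more than \d+ seconds")

# Stack frames of interest: the raid10 barrier helpers seen in the traces.
_BARRIER = re.compile(
    r"\b(wait_barrier|raise_barrier)\+0x[0-9a-f]+/0x[0-9a-f]+ \[raid10\]"
)


def barrier_waiters(log_text: str) -> Dict[str, Optional[str]]:
    """Map each hung task name to the raid10 barrier frame in its trace, if any."""
    headers = list(_HUNG.finditer(log_text))
    result: Dict[str, Optional[str]] = {}
    for index, header in enumerate(headers):
        # A task's trace runs until the next hung-task header (or end of log).
        end = headers[index + 1].start() if index + 1 < len(headers) else len(log_text)
        frame = _BARRIER.search(log_text[header.end():end])
        result[header.group(1)] = frame.group(1) if frame else None
    return result


# Shortened excerpt of the log quoted in this thread.
SAMPLE = (
    "Jun 13 19:15:42 pisces kernel: [27430.370899] INFO: task md7_resync:12982 blocked for more than 120 seconds.\n"
    "Jun 13 19:15:42 pisces kernel: [27430.371020] raise_barrier+0xd2/0x1a0 [raid10]\n"
    "Jun 13 19:15:42 pisces kernel: [27430.371181] INFO: task kworker/20:1:27873 blocked for more than 120 seconds.\n"
    "Jun 13 19:15:42 pisces kernel: [27430.371342] schedule_timeout+0x1f3/0x360\n"
)

if __name__ == "__main__":
    for task, frame in barrier_waiters(SAMPLE).items():
        print(task, "->", frame)
```

Because it searches the text between headers rather than line by line, the same function also works on logs that have been flattened onto a single line, as in this archive.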
[Bug 1767992] Re: Linux md raid-10 freezes during resync
Status changed to 'Confirmed' because the bug affects multiple users. ** Changed in: ubuntu Status: New => Confirmed
[Bug 1767992] Re: Linux md raid-10 freezes during resync
** Description changed: - I'm tying to setup a few nodes with software raid-10. + I'm trying to setup a few nodes with software raid-10. When array is created and resync is running i'm trying to install a few packages and frequently system stops responding, resync process stops, and I'm getting following errors in the kernel log. This looks like a deadlock for me. I had this problem in both 18.04 and 16.04. Reboot is the only way to fix the node. [ 2659.317256] INFO: task kworker/u24:13:343 blocked for more than 120 seconds. [ 2659.317313] Not tainted 4.15.0-20-generic #21-Ubuntu [ 2659.317350] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 2659.317401] kworker/u24:13 D0 343 2 0x8000 [ 2659.317414] Workqueue: writeback wb_workfn (flush-9:1) [ 2659.317417] Call Trace: [ 2659.317430] __schedule+0x297/0x8b0 [ 2659.317435] schedule+0x2c/0x80 [ 2659.317443] wait_barrier+0x146/0x1a0 [raid10] [ 2659.317449] ? wait_woken+0x80/0x80 [ 2659.317454] raid10_write_request+0x77/0x950 [raid10] [ 2659.317459] ? r10bio_pool_alloc+0x24/0x30 [raid10] [ 2659.317465] ? mempool_alloc+0x71/0x190 [ 2659.317469] ? ___slab_alloc+0x20a/0x4b0 [ 2659.317475] ? md_write_start+0xc8/0x200 [ 2659.317480] ? mempool_alloc_slab+0x15/0x20 [ 2659.317484] raid10_make_request+0xcc/0x140 [raid10] [ 2659.317489] md_handle_request+0x126/0x1a0 [ 2659.317494] md_make_request+0x6b/0x150 [ 2659.317501] generic_make_request+0x124/0x300 [ 2659.317506] submit_bio+0x73/0x150 [ 2659.317510] ? submit_bio+0x73/0x150 [ 2659.317579] xfs_submit_ioend+0x87/0x1c0 [xfs] [ 2659.317626] xfs_do_writepage+0x377/0x6a0 [xfs] [ 2659.317632] write_cache_pages+0x20c/0x4e0 [ 2659.317674] ? xfs_vm_writepages+0xf0/0xf0 [xfs] [ 2659.317682] ? intel_pstate_update_pstate+0x40/0x40 [ 2659.317687] ? update_load_avg+0x5c5/0x6e0 [ 2659.317727] xfs_vm_writepages+0xbe/0xf0 [xfs] [ 2659.317732] do_writepages+0x4b/0xe0 [ 2659.317738] ? check_preempt_curr+0x83/0x90 [ 2659.317742] ? 
ttwu_do_wakeup+0x1e/0x150 [ 2659.317746] __writeback_single_inode+0x45/0x340 [ 2659.317749] ? __writeback_single_inode+0x45/0x340 [ 2659.317752] writeback_sb_inodes+0x1e1/0x510 [ 2659.317756] __writeback_inodes_wb+0x67/0xb0 [ 2659.317759] wb_writeback+0x271/0x300 [ 2659.317764] wb_workfn+0x180/0x410 [ 2659.317766] ? wb_workfn+0x180/0x410 [ 2659.317773] process_one_work+0x1de/0x410 [ 2659.317776] worker_thread+0x32/0x410 [ 2659.317781] kthread+0x121/0x140 [ 2659.317784] ? process_one_work+0x410/0x410 [ 2659.317788] ? kthread_create_worker_on_cpu+0x70/0x70 [ 2659.317793] ? do_syscall_64+0x73/0x130 [ 2659.317797] ? SyS_exit_group+0x14/0x20 [ 2659.317801] ret_from_fork+0x35/0x40 [ 2659.317806] INFO: task md1_resync:429 blocked for more than 120 seconds. [ 2659.317853] Not tainted 4.15.0-20-generic #21-Ubuntu [ 2659.317889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 2659.317940] md1_resync D0 429 2 0x8000 [ 2659.317943] Call Trace: [ 2659.317949] __schedule+0x297/0x8b0 [ 2659.317954] schedule+0x2c/0x80 [ 2659.317959] raise_barrier+0xa1/0x1a0 [raid10] [ 2659.317963] ? wait_woken+0x80/0x80 [ 2659.317968] raid10_sync_request+0x205/0x1f10 [raid10] [ 2659.317975] ? find_next_bit+0xb/0x10 [ 2659.317980] ? cpumask_next+0x1b/0x20 [ 2659.317985] ? is_mddev_idle+0x92/0xf4 [ 2659.317990] md_do_sync+0x8ca/0xf10 [ 2659.317994] ? wait_woken+0x80/0x80 [ 2659.318000] md_thread+0x129/0x170 [ 2659.318004] ? mddev_put+0x140/0x140 [ 2659.318007] ? md_thread+0x129/0x170 [ 2659.318012] kthread+0x121/0x140 [ 2659.318015] ? find_pers+0x70/0x70 [ 2659.318019] ? kthread_create_worker_on_cpu+0x70/0x70 [ 2659.318023] ret_from_fork+0x35/0x40 [ 2659.318031] INFO: task xfsaild/md1p1:701 blocked for more than 120 seconds. [ 2659.318077] Not tainted 4.15.0-20-generic #21-Ubuntu [ 2659.318113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[ 2659.318164] xfsaild/md1p1 D0 701 2 0x8000 [ 2659.318167] Call Trace: [ 2659.318172] __schedule+0x297/0x8b0 [ 2659.318177] ? mempool_alloc_slab+0x15/0x20 [ 2659.318181] schedule+0x2c/0x80 [ 2659.318186] wait_barrier+0x146/0x1a0 [raid10] [ 2659.318189] ? wait_woken+0x80/0x80 [ 2659.318194] raid10_write_request+0x77/0x950 [raid10] [ 2659.318198] ? r10bio_pool_alloc+0x24/0x30 [raid10] [ 2659.318202] ? mempool_alloc+0x71/0x190 [ 2659.318206] ? md_write_start+0xc8/0x200 [ 2659.318211] raid10_make_request+0xcc/0x140 [raid10] [ 2659.318215] md_handle_request+0x126/0x1a0 [ 2659.318220] md_make_request+0x6b/0x150 [ 2659.318225] generic_make_request+0x124/0x300 [ 2659.318230] submit_bio+0x73/0x150 [ 2659.318234] ? submit_bio+0x73/0x150 [ 2659.318283] _xfs_buf_ioapply+0x31e/0x4e0 [xfs]