[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-08-09 Thread Connor Kuehl
Hi Damir (or anyone else affected by this that may be reading the bug),

If you get a chance, could you please test the kernel that is in
-proposed to verify that it resolves the issue for you? Instructions for
verifying this on Bionic are in comment #12; if you're using Xenial,
the instructions are in comment #13.

Thanks,

Connor

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1767992

Title:
  Linux md raid-10 freezes during resync

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1767992/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-08-07 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-xenial' to 'verification-done-xenial'. If the
problem still exists, change the tag 'verification-needed-xenial' to
'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-xenial

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-07-25 Thread Ubuntu Kernel Bot
This bug is awaiting verification that the kernel in -proposed solves
the problem. Please test the kernel and update this bug with the
results. If the problem is solved, change the tag
'verification-needed-bionic' to 'verification-done-bionic'. If the
problem still exists, change the tag 'verification-needed-bionic' to
'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will
be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on
how to enable and use -proposed. Thank you!


** Tags added: verification-needed-bionic

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-07-24 Thread Brad Figg
** Tags added: cscc

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-07-17 Thread Khaled El Mously
** Changed in: linux (Ubuntu Bionic)
   Status: In Progress => Fix Committed

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-07-15 Thread Connor Kuehl
Thank you for testing that and providing a test case, Damir!

I've gone ahead and submitted that patch to the mailing list:
https://lists.ubuntu.com/archives/kernel-team/2019-July/102199.html

** Description changed:

+ [Impact]
+ 
+ * If regular and resync IO happen at the same time during a regular IO
+   split, the split bio will wait until resync IO finishes while at the
+   same time the resync IO is waiting for regular IO to finish. This
+   results in deadlock.
+ 
+ * I believe this only impacts Bionic as Disco+ already contains this
+   commit. Xenial doesn't contain the commit that this one fixes.
+ 
+ [Test Case]
+ 
+ The test kernel containing this commit received positive feedback in the
+ launchpad bug.
+ 
+ From the launchpad bug comment #10:
+ 
+ "For reproduce on 4.15.0-50-generic: Make new raid-10, add some io fio/dd,
+ unpack anaconda archives, after minute or two deadlocked"
+ 
+ [Regression Potential]
+ 
+ * This fix has been in mainline since December 2018 and I don't see any
+   fixup commits upstream referencing this one. The small number of
+   changes in this commit seem reasonable for managing the `nr_pending`
+   adjustments which preclude either regular or resync IO.
+ 
+ 
+ Original bug description follows:
+ -
+ 
  I'm trying to setup a few nodes with software raid-10.
  
  When array is created and resync is running i'm trying to install a few
  packages and frequently system stops responding, resync process stops,
  and I'm getting following errors in the kernel log.
  
  This looks like a deadlock for me.
  
  I had this problem in both 18.04 and 16.04. Reboot is the only way to
  fix the node.
  
  [ 2659.317256] INFO: task kworker/u24:13:343 blocked for more than 120 seconds.
  [ 2659.317313]   Not tainted 4.15.0-20-generic #21-Ubuntu
  [ 2659.317350] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 2659.317401] kworker/u24:13  D0   343  2 0x8000
  [ 2659.317414] Workqueue: writeback wb_workfn (flush-9:1)
  [ 2659.317417] Call Trace:
  [ 2659.317430]  __schedule+0x297/0x8b0
  [ 2659.317435]  schedule+0x2c/0x80
  [ 2659.317443]  wait_barrier+0x146/0x1a0 [raid10]
  [ 2659.317449]  ? wait_woken+0x80/0x80
  [ 2659.317454]  raid10_write_request+0x77/0x950 [raid10]
  [ 2659.317459]  ? r10bio_pool_alloc+0x24/0x30 [raid10]
  [ 2659.317465]  ? mempool_alloc+0x71/0x190
  [ 2659.317469]  ? ___slab_alloc+0x20a/0x4b0
  [ 2659.317475]  ? md_write_start+0xc8/0x200
  [ 2659.317480]  ? mempool_alloc_slab+0x15/0x20
  [ 2659.317484]  raid10_make_request+0xcc/0x140 [raid10]
  [ 2659.317489]  md_handle_request+0x126/0x1a0
  [ 2659.317494]  md_make_request+0x6b/0x150
  [ 2659.317501]  generic_make_request+0x124/0x300
  [ 2659.317506]  submit_bio+0x73/0x150
  [ 2659.317510]  ? submit_bio+0x73/0x150
  [ 2659.317579]  xfs_submit_ioend+0x87/0x1c0 [xfs]
  [ 2659.317626]  xfs_do_writepage+0x377/0x6a0 [xfs]
  [ 2659.317632]  write_cache_pages+0x20c/0x4e0
  [ 2659.317674]  ? xfs_vm_writepages+0xf0/0xf0 [xfs]
  [ 2659.317682]  ? intel_pstate_update_pstate+0x40/0x40
  [ 2659.317687]  ? update_load_avg+0x5c5/0x6e0
  [ 2659.317727]  xfs_vm_writepages+0xbe/0xf0 [xfs]
  [ 2659.317732]  do_writepages+0x4b/0xe0
  [ 2659.317738]  ? check_preempt_curr+0x83/0x90
  [ 2659.317742]  ? ttwu_do_wakeup+0x1e/0x150
  [ 2659.317746]  __writeback_single_inode+0x45/0x340
  [ 2659.317749]  ? __writeback_single_inode+0x45/0x340
  [ 2659.317752]  writeback_sb_inodes+0x1e1/0x510
  [ 2659.317756]  __writeback_inodes_wb+0x67/0xb0
  [ 2659.317759]  wb_writeback+0x271/0x300
  [ 2659.317764]  wb_workfn+0x180/0x410
  [ 2659.317766]  ? wb_workfn+0x180/0x410
  [ 2659.317773]  process_one_work+0x1de/0x410
  [ 2659.317776]  worker_thread+0x32/0x410
  [ 2659.317781]  kthread+0x121/0x140
  [ 2659.317784]  ? process_one_work+0x410/0x410
  [ 2659.317788]  ? kthread_create_worker_on_cpu+0x70/0x70
  [ 2659.317793]  ? do_syscall_64+0x73/0x130
  [ 2659.317797]  ? SyS_exit_group+0x14/0x20
  [ 2659.317801]  ret_from_fork+0x35/0x40
  [ 2659.317806] INFO: task md1_resync:429 blocked for more than 120 seconds.
  [ 2659.317853]   Not tainted 4.15.0-20-generic #21-Ubuntu
  [ 2659.317889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 2659.317940] md1_resync  D0   429  2 0x8000
  [ 2659.317943] Call Trace:
  [ 2659.317949]  __schedule+0x297/0x8b0
  [ 2659.317954]  schedule+0x2c/0x80
  [ 2659.317959]  raise_barrier+0xa1/0x1a0 [raid10]
  [ 2659.317963]  ? wait_woken+0x80/0x80
  [ 2659.317968]  raid10_sync_request+0x205/0x1f10 [raid10]
  [ 2659.317975]  ? find_next_bit+0xb/0x10
  [ 2659.317980]  ? cpumask_next+0x1b/0x20
  [ 2659.317985]  ? is_mddev_idle+0x92/0xf4
  [ 2659.317990]  md_do_sync+0x8ca/0xf10
  [ 2659.317994]  ? wait_woken+0x80/0x80
  [ 2659.318000]  md_thread+0x129/0x170
  [ 2659.318004]  ? mddev_put+0x140/0x140
  [ 2659.318007]  ? md_thread+0x129/0x170
  [ 2659.318012]  kthread+0x121/0x140
  [ 2659.318015]  ? 
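
The circular wait described under [Impact] above can be illustrated with a toy model. This is only a sketch, not the raid10.c code: the class, its methods, and `split_write` are hypothetical names, and the "fixed" path (dropping the pending count before the remainder of a split bio is resubmitted) is an assumption based on the `nr_pending` remark under [Regression Potential].

```python
# Toy model only: it mirrors the wait conditions described in [Impact],
# not the real raid10.c code. All names here are illustrative.


class Raid10BarrierModel:
    def __init__(self):
        self.barrier = 0      # raised while resync IO wants exclusive access
        self.nr_pending = 0   # regular IO currently in flight

    def regular_can_enter(self):
        # Regular IO may start only while no barrier is raised.
        return self.barrier == 0

    def resync_can_proceed(self):
        # Resync may proceed only once regular IO has drained.
        return self.nr_pending == 0


def split_write(release_before_resubmit):
    """Return True if both the split bio and the resync make progress."""
    m = Raid10BarrierModel()
    assert m.regular_can_enter()
    m.nr_pending += 1          # the original bio is now in flight
    m.barrier += 1             # resync raises the barrier and waits for drain

    if release_before_resubmit:
        m.nr_pending -= 1      # assumed fix: give up the pending count first

    # The remainder of the split bio must pass the barrier check again.
    if not m.regular_can_enter() and not m.resync_can_proceed():
        return False           # each side waits on the other: deadlock

    # Resync runs and lowers the barrier; the remainder can then enter.
    assert m.resync_can_proceed()
    m.barrier -= 1
    assert m.regular_can_enter()
    m.nr_pending += 1
    return True


if __name__ == "__main__":
    print("without release:", split_write(release_before_resubmit=False))
    print("with release:   ", split_write(release_before_resubmit=True))
```

In this model the first call reports no progress because the split bio still holds its pending count while the barrier is up, which matches the pair of blocked `wait_barrier` and `raise_barrier` tasks in the traces.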

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-07-15 Thread Damir Chanyshev
Kernel from
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1767992/comments/8
solved my issues.

To reproduce on 4.15.0-50-generic:
Create a new raid-10 array, generate some I/O with fio/dd, and unpack
anaconda archives (lol); after a minute or two it deadlocks.

[Mon Jul 15 19:31:11 2019] Workqueue: md md_submit_flush_data
[Mon Jul 15 19:31:11 2019] Call Trace:
[Mon Jul 15 19:31:11 2019]  __schedule+0x291/0x8a0
[Mon Jul 15 19:31:11 2019]  ? __switch_to_asm+0x34/0x70
[Mon Jul 15 19:31:11 2019]  ? __switch_to_asm+0x40/0x70
[Mon Jul 15 19:31:11 2019]  schedule+0x2c/0x80
[Mon Jul 15 19:31:11 2019]  wait_barrier+0x146/0x1a0 [raid10]
[Mon Jul 15 19:31:11 2019]  ? wait_woken+0x80/0x80
[Mon Jul 15 19:31:11 2019]  raid10_write_request+0x77/0x950 [raid10]
[Mon Jul 15 19:31:11 2019]  ? r10bio_pool_alloc+0x24/0x30 [raid10]
[Mon Jul 15 19:31:11 2019]  ? mempool_alloc+0x71/0x190
[Mon Jul 15 19:31:11 2019]  ? md_write_start+0xf4/0x210
[Mon Jul 15 19:31:11 2019]  ? default_wake_function+0x12/0x20
[Mon Jul 15 19:31:11 2019]  ? autoremove_wake_function+0x12/0x40
[Mon Jul 15 19:31:11 2019]  raid10_make_request+0xcc/0x140 [raid10]
[Mon Jul 15 19:31:11 2019]  md_handle_request+0x126/0x1a0
[Mon Jul 15 19:31:11 2019]  md_submit_flush_data+0x54/0x70
[Mon Jul 15 19:31:11 2019]  process_one_work+0x1de/0x410
[Mon Jul 15 19:31:11 2019]  worker_thread+0x32/0x410
[Mon Jul 15 19:31:11 2019]  kthread+0x121/0x140
[Mon Jul 15 19:31:11 2019]  ? process_one_work+0x410/0x410
[Mon Jul 15 19:31:11 2019]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Jul 15 19:31:11 2019]  ret_from_fork+0x35/0x40

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-07-10 Thread Connor Kuehl
Hi everyone,

I was just curious if anyone's had a chance to test out Kai-Heng's test
kernel from comment #8?

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-07-10 Thread Connor Kuehl
** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Bionic)
   Status: New => In Progress

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Connor Kuehl (connork)

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-06-27 Thread Kai-Heng Feng
Please test the kernel:
https://people.canonical.com/~khfeng/lp1767992/

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-06-25 Thread Damir Chanyshev
** Package changed: ubuntu => linux (Ubuntu)

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-04-02 Thread Damir Chanyshev
Fixed upstream in https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.19.21

Patch: https://lkml.org/lkml/2019/2/11/1281

Maybe this could be backported?

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2019-02-08 Thread johnny
We had the same issue after updating from Ubuntu 14.04 to 16.04 and then 18.04
(the two consecutive updates were applied on the same day).
The filesystem is also XFS, and the setup is RAID 10.
The bug does not completely freeze the system. Many processes still ran and
were able to write to disk, but new processes seemed to have trouble doing so.
Some processes that did not write to disk also halted, for example htop and
ps aux.
Interestingly, "ps aux" started printing its output and halted around the
middle of the execution, and Ctrl+C was unable to exit the program.
htop, which hung a few seconds after starting, showed that the frozen
processes were in D (uninterruptible sleep) state.

The fix was to downgrade from kernel 4.15.0-34-generic to 4.2.0-34-generic.
The fix was not perfect, though: the hang has happened once more since then
(but once in 130 days is not as bad).

Attached logs: kernel.log with some stack traces, but they are more or
less the same as those previously posted. (The logs were OCR'ed from
screenshots, so beware of some characters, like the 0, which sometimes
turns into a 6.)
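
For anyone who wants to list tasks stuck in D state without relying on htop or ps, here is a minimal sketch that reads `/proc/<pid>/stat` directly. The helper names `parse_stat_line` and `uninterruptible_tasks` are made up for this example, and it only looks at top-level PIDs, not individual threads.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    left, right = line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    return int(pid), comm, right.split()[0]


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError):
            continue  # process exited (or was unreadable) while scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```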


** Attachment added: "kernel errors"
   https://bugs.launchpad.net/ubuntu/+bug/1767992/+attachment/5237094/+files/errors.txt

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2018-10-26 Thread na
I can confirm this affected me as well. My RAID-10 is on a separate
filesystem from the Operating System's SSD -- I have the RAID-10 mounted
at '/raid10' on my server. My application is installed on the non-mdadm
SSD at '/opt' but it reads/writes/executes files from the '/raid10'
filesystem (made up of 6x8TB drives with 1 spare).

During a recovery from a failed drive, the array would periodically run
into the same issue mentioned here. It happened mostly while performing
write operations; it was less likely during read operations, but it
still did happen. I had to reboot about every 30-60 min because the
hang stopped the rebuild from continuing -- once the server was back
online, the rebuild would continue.

However, I got fed up with this because the rebuild/resync is supposed
to take approximately 9 hours and it was only 29% complete, so on the
last reboot I stopped my application and unmounted the RAID-10 (umount
/raid10). Once I did that, the rebuild continued through the night
without issue and completed the remaining 71%.

... So it seems like you shouldn't interact with the RAID while it is
rebuilding on the latest kernel/xfs/mdadm on Ubuntu 18.04. Here are the
current versions I'm running:

==
root@server:~# uname -r
4.15.0-36-generic

root@server:~# dpkg -l | awk '/mdadm/ || /xfsprog/ {print $2,$3}' | column -t
mdadm 4.1~rc1-3~ubuntu18.04.1
xfsprogs  4.9.0+nmu1ubuntu2
==

Previously with Ubuntu 16.04 with the 4.4 kernel and latest
mdadm/xfsprogs for 16.04 I didn't have this issue.
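
If you use the workaround above (leave the array unmounted until the rebuild is done), a small script can tell you when it is safe to remount. This is a sketch under assumptions: `sync_progress` is a made-up helper, and the sample text is invented in the usual /proc/mdstat layout rather than taken from my server.

```python
import re

# Matches the progress line md prints while a resync/recovery/check/reshape runs.
PROGRESS = re.compile(r"(resync|recovery|check|reshape)\s*=\s*([0-9.]+)%")


def sync_progress(mdstat_text):
    """Map array name -> (operation, percent) for arrays with a sync in flight."""
    progress = {}
    current = None
    for line in mdstat_text.splitlines():
        head = re.match(r"^(md\S*)\s*:", line)
        if head:
            current = head.group(1)   # remember which array the next lines describe
            continue
        hit = PROGRESS.search(line)
        if hit and current:
            progress[current] = (hit.group(1), float(hit.group(2)))
    return progress


# Made-up sample in the usual /proc/mdstat layout (not taken from the report).
SAMPLE = """\
Personalities : [raid10]
md0 : active raid10 sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] sda1[0]
      23441679360 blocks super 1.2 512K chunks 2 near-copies [6/5] [UUUUU_]
      [=====>...............]  recovery = 29.0% (2266028800/7813893120) finish=540.0min speed=171000K/sec

unused devices: <none>
"""

if __name__ == "__main__":
    print(sync_progress(SAMPLE))
```

On a live system you would pass the contents of /proc/mdstat instead of `SAMPLE`; an empty result means no array is currently resyncing or recovering.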

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2018-07-20 Thread Sergey Kirillov
This issue is gone after we migrated from XFS to EXT4, so it is
definitely XFS-related.

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2018-07-20 Thread Thomas Winklehner
Same problem here, reproducible on Ubuntu 18.04 LTS: SATA RAID 10 with
an XFS filesystem. On each reboot the RAID does a resync. Any operation
on the RAID before the resync is finished causes the process to hang. It
might also be associated with another issue that causes the resync on
every boot.

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2018-06-14 Thread Aristarkh Zagorodnikov
Same here: md raid10 on XFS (not sure if that's important), hung 3 times in
the last 2 days.
It seems that a combination of an md raid10 check plus I/O (maybe
XFS-specific, I don't know, but both the original poster and we use XFS)
frequently hangs on kernels newer than Ubuntu 4.10.0-42.46~16.04.1-generic
4.10.17. Yes, I know that's a wide range, but everything started happening
after we rebooted this machine, which upgraded us from Ubuntu
4.10.0-42.46~16.04.1-generic 4.10.17 to Ubuntu 4.13.0-43.48~16.04.1-generic
4.13.16. The check was scheduled some time later, so we didn't catch it
immediately.

The logs look very similar:
Jun 13 19:15:42 pisces kernel: [27430.370899] INFO: task md7_resync:12982 blocked for more than 120 seconds.
Jun 13 19:15:42 pisces kernel: [27430.370940]   Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu
Jun 13 19:15:42 pisces kernel: [27430.370966] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 13 19:15:42 pisces kernel: [27430.370997] md7_resync  D0 12982  2 0x8000
Jun 13 19:15:42 pisces kernel: [27430.371000] Call Trace:
Jun 13 19:15:42 pisces kernel: [27430.371012]  __schedule+0x3d6/0x8b0
Jun 13 19:15:42 pisces kernel: [27430.371014]  schedule+0x36/0x80
Jun 13 19:15:42 pisces kernel: [27430.371020]  raise_barrier+0xd2/0x1a0 [raid10]
Jun 13 19:15:42 pisces kernel: [27430.371024]  ? wait_woken+0x80/0x80
Jun 13 19:15:42 pisces kernel: [27430.371027]  raid10_sync_request+0x9bd/0x1b10 [raid10]
Jun 13 19:15:42 pisces kernel: [27430.371031]  ? pick_next_task_fair+0x449/0x570
Jun 13 19:15:42 pisces kernel: [27430.371035]  ? __switch_to+0xb2/0x540
Jun 13 19:15:42 pisces kernel: [27430.371041]  ? find_next_bit+0xb/0x10
Jun 13 19:15:42 pisces kernel: [27430.371046]  ? is_mddev_idle+0xa1/0x101
Jun 13 19:15:42 pisces kernel: [27430.371048]  md_do_sync+0xb81/0xfb0
Jun 13 19:15:42 pisces kernel: [27430.371050]  ? wait_woken+0x80/0x80
Jun 13 19:15:42 pisces kernel: [27430.371054]  md_thread+0x133/0x180
Jun 13 19:15:42 pisces kernel: [27430.371055]  ? md_thread+0x133/0x180
Jun 13 19:15:42 pisces kernel: [27430.371060]  kthread+0x10c/0x140
Jun 13 19:15:42 pisces kernel: [27430.371062]  ? state_show+0x320/0x320
Jun 13 19:15:42 pisces kernel: [27430.371064]  ? kthread_create_on_node+0x70/0x70
Jun 13 19:15:42 pisces kernel: [27430.371067]  ret_from_fork+0x35/0x40
Jun 13 19:15:42 pisces kernel: [27430.371181] INFO: task kworker/20:1:27873 blocked for more than 120 seconds.
Jun 13 19:15:42 pisces kernel: [27430.371210]   Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu
Jun 13 19:15:42 pisces kernel: [27430.371235] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 13 19:15:42 pisces kernel: [27430.371267] kworker/20:1D0 27873  2 0x8000
Jun 13 19:15:42 pisces kernel: [27430.371333] Workqueue: xfs-sync/md7 xfs_log_worker [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371334] Call Trace:
Jun 13 19:15:42 pisces kernel: [27430.371338]  __schedule+0x3d6/0x8b0
Jun 13 19:15:42 pisces kernel: [27430.371340]  schedule+0x36/0x80
Jun 13 19:15:42 pisces kernel: [27430.371342]  schedule_timeout+0x1f3/0x360
Jun 13 19:15:42 pisces kernel: [27430.371347]  ? scsi_init_rq+0x84/0x100
Jun 13 19:15:42 pisces kernel: [27430.371349]  wait_for_completion+0xb4/0x140
Jun 13 19:15:42 pisces kernel: [27430.371351]  ? wait_for_completion+0xb4/0x140
Jun 13 19:15:42 pisces kernel: [27430.371356]  ? wake_up_q+0x70/0x70
Jun 13 19:15:42 pisces kernel: [27430.371360]  flush_work+0x129/0x1e0
Jun 13 19:15:42 pisces kernel: [27430.371363]  ? worker_detach_from_pool+0xb0/0xb0
Jun 13 19:15:42 pisces kernel: [27430.371397]  xlog_cil_force_lsn+0x8b/0x220 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371400]  ? update_curr+0x138/0x1d0
Jun 13 19:15:42 pisces kernel: [27430.371433]  _xfs_log_force+0x85/0x290 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371436]  ? pick_next_task_fair+0x131/0x570
Jun 13 19:15:42 pisces kernel: [27430.371438]  ? __switch_to+0xb2/0x540
Jun 13 19:15:42 pisces kernel: [27430.371471]  ? xfs_log_worker+0x36/0x100 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371505]  xfs_log_force+0x2c/0x80 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371538]  xfs_log_worker+0x36/0x100 [xfs]
Jun 13 19:15:42 pisces kernel: [27430.371541]  process_one_work+0x15b/0x410
Jun 13 19:15:42 pisces kernel: [27430.371544]  worker_thread+0x4b/0x460
Jun 13 19:15:42 pisces kernel: [27430.371546]  kthread+0x10c/0x140
Jun 13 19:15:42 pisces kernel: [27430.371548]  ? process_one_work+0x410/0x410
Jun 13 19:15:42 pisces kernel: [27430.371550]  ? kthread_create_on_node+0x70/0x70
Jun 13 19:15:42 pisces kernel: [27430.371552]  ret_from_fork+0x35/0x40
Jun 13 19:15:42 pisces kernel: [27430.371557] INFO: task kworker/20:0:4504 blocked for more than 120 seconds.
Jun 13 19:15:42 pisces kernel: [27430.371587]   Not tainted 4.13.0-43-generic #48~16.04.1-Ubuntu
Jun 13 19:15:42 pisces kernel: [27430.371611] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
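
To compare reports like the ones above across machines, it can help to pull out just the hung task names and whether they are parked in one of the raid10 barrier functions. The following is a rough sketch, not a general kernel-log parser; `hung_tasks` is a made-up helper and the sample lines are copied from the traces in this bug.

```python
import re

# "INFO: task <name>:<pid>"; the greedy \S+ keeps colons inside names
# such as kworker/u24:13 and leaves only the final number as the pid.
HUNG = re.compile(r"INFO: task (\S+):(\d+)\b")
BARRIER = re.compile(r"\b(wait_barrier|raise_barrier)\+")


def hung_tasks(log_text):
    """Return [(task, pid, barrier_fn_or_None)] for each hung-task report."""
    reports = []
    for line in log_text.splitlines():
        m = HUNG.search(line)
        if m:
            reports.append([m.group(1), int(m.group(2)), None])
            continue
        b = BARRIER.search(line)
        # Ignore "? fn+..." entries, which the kernel marks as unreliable frames.
        if b and reports and reports[-1][2] is None and "? " + b.group(1) not in line:
            reports[-1][2] = b.group(1)
    return [tuple(r) for r in reports]


SAMPLE = """\
[ 2659.317256] INFO: task kworker/u24:13:343 blocked for more than 120 seconds.
[ 2659.317430]  __schedule+0x297/0x8b0
[ 2659.317443]  wait_barrier+0x146/0x1a0 [raid10]
[ 2659.317806] INFO: task md1_resync:429 blocked for more than 120 seconds.
[ 2659.317959]  raise_barrier+0xa1/0x1a0 [raid10]
"""

if __name__ == "__main__":
    for task, pid, fn in hung_tasks(SAMPLE):
        print(task, pid, fn)
```

A report that contains one task in `wait_barrier` and the resync thread in `raise_barrier` matches the pattern seen throughout this bug; tasks with `None` are blocked elsewhere (for example in the XFS log worker).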

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2018-06-14 Thread Launchpad Bug Tracker
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: ubuntu
   Status: New => Confirmed

[Bug 1767992] Re: Linux md raid-10 freezes during resync

2018-04-30 Thread Sergey Kirillov
** Description changed:

- I'm tying to setup a few nodes with software raid-10.
+ I'm trying to setup a few nodes with software raid-10.
  
  When array is created and resync is running i'm trying to install a few
  packages and frequently system stops responding, resync process stops,
  and I'm getting following errors in the kernel log.
  
  This looks like a deadlock for me.
  
  I had this problem in both 18.04 and 16.04. Reboot is the only way to
  fix the node.
  
  [ 2659.317256] INFO: task kworker/u24:13:343 blocked for more than 120 seconds.
  [ 2659.317313]   Not tainted 4.15.0-20-generic #21-Ubuntu
  [ 2659.317350] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 2659.317401] kworker/u24:13  D0   343  2 0x8000
  [ 2659.317414] Workqueue: writeback wb_workfn (flush-9:1)
  [ 2659.317417] Call Trace:
  [ 2659.317430]  __schedule+0x297/0x8b0
  [ 2659.317435]  schedule+0x2c/0x80
  [ 2659.317443]  wait_barrier+0x146/0x1a0 [raid10]
  [ 2659.317449]  ? wait_woken+0x80/0x80
  [ 2659.317454]  raid10_write_request+0x77/0x950 [raid10]
  [ 2659.317459]  ? r10bio_pool_alloc+0x24/0x30 [raid10]
  [ 2659.317465]  ? mempool_alloc+0x71/0x190
  [ 2659.317469]  ? ___slab_alloc+0x20a/0x4b0
  [ 2659.317475]  ? md_write_start+0xc8/0x200
  [ 2659.317480]  ? mempool_alloc_slab+0x15/0x20
  [ 2659.317484]  raid10_make_request+0xcc/0x140 [raid10]
  [ 2659.317489]  md_handle_request+0x126/0x1a0
  [ 2659.317494]  md_make_request+0x6b/0x150
  [ 2659.317501]  generic_make_request+0x124/0x300
  [ 2659.317506]  submit_bio+0x73/0x150
  [ 2659.317510]  ? submit_bio+0x73/0x150
  [ 2659.317579]  xfs_submit_ioend+0x87/0x1c0 [xfs]
  [ 2659.317626]  xfs_do_writepage+0x377/0x6a0 [xfs]
  [ 2659.317632]  write_cache_pages+0x20c/0x4e0
  [ 2659.317674]  ? xfs_vm_writepages+0xf0/0xf0 [xfs]
  [ 2659.317682]  ? intel_pstate_update_pstate+0x40/0x40
  [ 2659.317687]  ? update_load_avg+0x5c5/0x6e0
  [ 2659.317727]  xfs_vm_writepages+0xbe/0xf0 [xfs]
  [ 2659.317732]  do_writepages+0x4b/0xe0
  [ 2659.317738]  ? check_preempt_curr+0x83/0x90
  [ 2659.317742]  ? ttwu_do_wakeup+0x1e/0x150
  [ 2659.317746]  __writeback_single_inode+0x45/0x340
  [ 2659.317749]  ? __writeback_single_inode+0x45/0x340
  [ 2659.317752]  writeback_sb_inodes+0x1e1/0x510
  [ 2659.317756]  __writeback_inodes_wb+0x67/0xb0
  [ 2659.317759]  wb_writeback+0x271/0x300
  [ 2659.317764]  wb_workfn+0x180/0x410
  [ 2659.317766]  ? wb_workfn+0x180/0x410
  [ 2659.317773]  process_one_work+0x1de/0x410
  [ 2659.317776]  worker_thread+0x32/0x410
  [ 2659.317781]  kthread+0x121/0x140
  [ 2659.317784]  ? process_one_work+0x410/0x410
  [ 2659.317788]  ? kthread_create_worker_on_cpu+0x70/0x70
  [ 2659.317793]  ? do_syscall_64+0x73/0x130
  [ 2659.317797]  ? SyS_exit_group+0x14/0x20
  [ 2659.317801]  ret_from_fork+0x35/0x40
  [ 2659.317806] INFO: task md1_resync:429 blocked for more than 120 seconds.
  [ 2659.317853]   Not tainted 4.15.0-20-generic #21-Ubuntu
  [ 2659.317889] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 2659.317940] md1_resync  D0   429  2 0x8000
  [ 2659.317943] Call Trace:
  [ 2659.317949]  __schedule+0x297/0x8b0
  [ 2659.317954]  schedule+0x2c/0x80
  [ 2659.317959]  raise_barrier+0xa1/0x1a0 [raid10]
  [ 2659.317963]  ? wait_woken+0x80/0x80
  [ 2659.317968]  raid10_sync_request+0x205/0x1f10 [raid10]
  [ 2659.317975]  ? find_next_bit+0xb/0x10
  [ 2659.317980]  ? cpumask_next+0x1b/0x20
  [ 2659.317985]  ? is_mddev_idle+0x92/0xf4
  [ 2659.317990]  md_do_sync+0x8ca/0xf10
  [ 2659.317994]  ? wait_woken+0x80/0x80
  [ 2659.318000]  md_thread+0x129/0x170
  [ 2659.318004]  ? mddev_put+0x140/0x140
  [ 2659.318007]  ? md_thread+0x129/0x170
  [ 2659.318012]  kthread+0x121/0x140
  [ 2659.318015]  ? find_pers+0x70/0x70
  [ 2659.318019]  ? kthread_create_worker_on_cpu+0x70/0x70
  [ 2659.318023]  ret_from_fork+0x35/0x40
  [ 2659.318031] INFO: task xfsaild/md1p1:701 blocked for more than 120 seconds.
  [ 2659.318077]   Not tainted 4.15.0-20-generic #21-Ubuntu
  [ 2659.318113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [ 2659.318164] xfsaild/md1p1   D0   701  2 0x8000
  [ 2659.318167] Call Trace:
  [ 2659.318172]  __schedule+0x297/0x8b0
  [ 2659.318177]  ? mempool_alloc_slab+0x15/0x20
  [ 2659.318181]  schedule+0x2c/0x80
  [ 2659.318186]  wait_barrier+0x146/0x1a0 [raid10]
  [ 2659.318189]  ? wait_woken+0x80/0x80
  [ 2659.318194]  raid10_write_request+0x77/0x950 [raid10]
  [ 2659.318198]  ? r10bio_pool_alloc+0x24/0x30 [raid10]
  [ 2659.318202]  ? mempool_alloc+0x71/0x190
  [ 2659.318206]  ? md_write_start+0xc8/0x200
  [ 2659.318211]  raid10_make_request+0xcc/0x140 [raid10]
  [ 2659.318215]  md_handle_request+0x126/0x1a0
  [ 2659.318220]  md_make_request+0x6b/0x150
  [ 2659.318225]  generic_make_request+0x124/0x300
  [ 2659.318230]  submit_bio+0x73/0x150
  [ 2659.318234]  ? submit_bio+0x73/0x150
  [ 2659.318283]  _xfs_buf_ioapply+0x31e/0x4e0 [xfs]