On 07/06/18 10:37, Juergen Sauer wrote:
..
Moving a virtual machine from the ssd/raid1 subvolume (nocow) into the
rotational big store (noocow) fails.
After the cache memory (RAM) fills up, the data flow drops to
0 kb/sec.
As a result, the copy of a huge file hangs and does not proceed any
more; load rises without bound and iops fall to zero. In the kernel log I find:
[ 491.151952] INFO: task kworker/u16:28:1027 blocked for more than 120 seconds.
[ 491.151953] Tainted: P O 4.17.3-1-ARCH #1
[ 491.151953] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 491.151953] kworker/u16:28 D 0 1027 2 0x80000000
[ 491.151965] Workqueue: btrfs-endio-raid56 btrfs_endio_raid56_helper [btrfs]
[ 491.151965] Call Trace:
[ 491.151967] ? __schedule+0x282/0x890
[ 491.151969] schedule+0x32/0x90
[ 491.151970] io_schedule+0x12/0x40
[ 491.151971] blk_mq_get_tag+0x146/0x2a0
This has nothing to do with btrfs: the worker is stuck in blk_mq_get_tag,
i.e. waiting for a free request tag. It is simply one of the remaining
(but already fixed upstream) bugs in the blk-mq stack, probably related
to sbitmap concurrency and/or "tag starvation".
I could give you a list of patches from 4.18+ that reliably help,
but I suppose you're not into kernel patching, so the easiest way for
you would be to switch to the old block layer (e.g. by booting
with the kernel flag scsi_mod.use_blk_mq=0) and use deadline/cfq as before.
This should all be fixed and work reliably with 4.18+; it looks like by
4.19 blk-mq will also be enabled by default.
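
If you want to verify after rebooting that the flag took effect, something
like the following rough sketch might help. It assumes your kernel exposes
/sys/module/scsi_mod/parameters/use_blk_mq and the usual
/sys/block/<dev>/queue/scheduler files; the function names are made up
for illustration and are not part of any tool.

```python
from pathlib import Path


def active_scheduler(text):
    """Return the active entry of a queue/scheduler string.

    The kernel marks the active scheduler with brackets,
    e.g. "noop deadline [cfq]".
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # Some kernels print a bare "none" for blk-mq devices without a scheduler.
    return text.strip() or None


def report(sys_root="/sys"):
    """Print the use_blk_mq setting and the active scheduler per block device."""
    root = Path(sys_root)
    # Assumption: this parameter file only exists on kernels that still
    # carry the legacy SCSI path alongside blk-mq.
    param = root / "module" / "scsi_mod" / "parameters" / "use_blk_mq"
    if param.exists():
        print("scsi_mod.use_blk_mq =", param.read_text().strip())
    else:
        print("scsi_mod.use_blk_mq: parameter not present on this kernel")
    for sched in sorted(root.glob("block/*/queue/scheduler")):
        device = sched.parent.parent.name
        print(device, "->", active_scheduler(sched.read_text()))
```

Calling report() prints one line for the module parameter and one line per
block device; with the old block layer active you should see deadline or
cfq in the per-device lines.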
cheers
Holger