I/O Error Test 4
================
commit "bcache: add io_disable to struct cached_dev"
Problem: if the backing device hits I/O errors or is
disconnected, the bcache device can still accept I/O.
Original kernel: dd writes in writeback mode to the failed backing device
still complete.
Modified kernel: the bcache0 device is removed after some I/O errors on the
backing device.
Original
--------
# uname -rv
4.15.0-55-generic #60-Ubuntu SMP Tue Jul 2 18:22:20 UTC 2019
# ./setup.sh >/dev/null 2>&1
[ 24.820401] bcache: register_bdev() registered backing device dm-0
[ 24.833268] bcache: run_cache_set() invalidating existing data
[ 24.848314] bcache: register_cache() registered cache device dm-1
[ 26.824645] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set
dd465fa7-4e85-484b-89dd-353c24c6b041
# echo writeback > /sys/block/bcache0/bcache/cache_mode
# cat /sys/block/bcache0/bcache/cache_mode
writethrough [writeback] writearound none
# ./dm_fake_dev.sh /dev/loop0 bad
[ 41.439684] Buffer I/O error on dev dm-0, logical block 262128, async page
read
[ 41.445284] Buffer I/O error on dev dm-0, logical block 262128, async page
read
[ 41.451846] bcache: register_bcache() error /dev/dm-0: device already
registered (emitting change event)
[ 41.454704] Buffer I/O error on dev bcache0, logical block 262112, async
page read
[ 41.457685] Buffer I/O error on dev bcache0, logical block 262112, async
page read
[ 41.457743] Buffer I/O error on dev bcache0, logical block 1, async page read
# dd if=/dev/zero of=/dev/bcache0 bs=4k
[ 49.048036] Buffer I/O error on dev bcache0, logical block 0, lost async
page write
[ 49.051702] Buffer I/O error on dev bcache0, logical block 1, lost async
page write
[ 49.054062] Buffer I/O error on dev bcache0, logical block 2, lost async
page write
[ 49.056466] Buffer I/O error on dev bcache0, logical block 3, lost async
page write
[ 49.058867] Buffer I/O error on dev bcache0, logical block 4, lost async
page write
[ 49.072020] Buffer I/O error on dev bcache0, logical block 5, lost async
page write
[ 49.074440] Buffer I/O error on dev bcache0, logical block 6, lost async
page write
[ 49.078658] Buffer I/O error on dev bcache0, logical block 7, lost async
page write
[ 49.079008] Buffer I/O error on dev bcache0, logical block 6834, lost async
page write
[ 49.079022] Buffer I/O error on dev bcache0, logical block 6835, lost async
page write
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.58342 s, 416 MB/s
# dd if=/dev/zero of=/dev/bcache0 bs=4k
[ 62.696034] buffer_io_error: 260992 callbacks suppressed
[ 62.696037] Buffer I/O error on dev bcache0, logical block 0, lost async
page write
[ 62.701996] Buffer I/O error on dev bcache0, logical block 1, lost async
page write
[ 62.704394] Buffer I/O error on dev bcache0, logical block 2, lost async
page write
[ 62.706763] Buffer I/O error on dev bcache0, logical block 3, lost async
page write
[ 62.716025] Buffer I/O error on dev bcache0, logical block 4, lost async
page write
[ 62.718421] Buffer I/O error on dev bcache0, logical block 5, lost async
page write
[ 62.720821] Buffer I/O error on dev bcache0, logical block 6, lost async
page write
[ 62.723193] Buffer I/O error on dev bcache0, logical block 7, lost async
page write
[ 62.725584] Buffer I/O error on dev bcache0, logical block 8, lost async
page write
[ 62.725763] Buffer I/O error on dev bcache0, logical block 5405, lost async
page write
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.88915 s, 372 MB/s
# dd if=/dev/zero of=/dev/bcache0 bs=4k
[ 67.700114] buffer_io_error: 290043 callbacks suppressed
[ 67.700117] Buffer I/O error on dev bcache0, logical block 40750, lost async
page write
[ 67.706230] Buffer I/O error on dev bcache0, logical block 40751, lost async
page write
[ 67.709846] Buffer I/O error on dev bcache0, logical block 40752, lost async
page write
[ 67.713503] Buffer I/O error on dev bcache0, logical block 40753, lost async
page write
[ 67.717241] Buffer I/O error on dev bcache0, logical block 40754, lost async
page write
[ 67.720938] Buffer I/O error on dev bcache0, logical block 40755, lost async
page write
[ 67.741395] Buffer I/O error on dev bcache0, logical block 40756, lost async
page write
dd: error writing '/dev/bcache0': No space left on device
[ 67.748145] Buffer I/O error on dev bcache0, logical block 40757, lost async
page write
[ 67.752352] Buffer I/O error on dev bcache0, logical block 41038, lost async
page write
[ 67.756642] Buffer I/O error on dev bcache0, logical block 41313, lost async
page write
262142+0 records in
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.99741 s, 358 MB/s
# lsblk -e 252
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1G 0 loop
loop1 7:1 0 1G 0 loop
└─fake-loop1 253:1 0 1024M 0 dm
└─bcache0 251:0 0 1024M 0 disk
fake-loop0 253:0 0 1G 0 dm
└─bcache0 251:0 0 1024M 0 disk
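The final check of this run can be scripted so that both kernels are compared the same way. This is only a minimal sketch, assuming the presence of the block device node is the signal of interest; the function name bcache_state is hypothetical and is not part of setup.sh or dm_fake_dev.sh.

```shell
#!/bin/sh
# Hypothetical helper: report whether a bcache block device node still exists.
# Prints "present" if the path is a block device, "removed" otherwise.
bcache_state() {
    dev="${1:-/dev/bcache0}"
    if [ -b "$dev" ]; then
        echo "present"
    else
        echo "removed"
    fi
}
```

Going by the lsblk output in this test, calling bcache_state /dev/bcache0 after the dd runs would be expected to print "present" on the original kernel and "removed" on the modified one.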
Modified
--------
# uname -rv
4.15.0-55-generic #60+test20190703build1bcache1-Ubuntu SMP Wed Jul 3 21:41:37
UTC
# ./setup.sh >/dev/null 2>&1
[ 22.202972] bcache: run_cache_set() invalidating existing data
[ 22.213346] bcache: register_cache() registered cache device dm-1
[ 22.226165] bcache: register_bdev() registered backing device dm-0
[ 24.198940] bcache: bch_cached_dev_attach() Caching dm-0 as bcache0 on set
cabfdb60-4301-46e0-940a-eb96e801c816
# echo writeback > /sys/block/bcache0/bcache/cache_mode
# cat /sys/block/bcache0/bcache/cache_mode
writethrough [writeback] writearound none
# ./dm_fake_dev.sh /dev/loop0 bad
[ 40.025536] Buffer I/O error on dev dm-0, logical block 262128, async page
read
[ 40.030156] Buffer I/O error on dev dm-0, logical block 262128, async page
read
[ 40.035808] bcache: register_bcache() error /dev/dm-0: device already
registered (emitting change event)
[ 40.038534] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 40.038567] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 40.038574] Buffer I/O error on dev bcache0, logical block 262112, async
page read
[ 40.041268] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 40.041284] Buffer I/O error on dev bcache0, logical block 262112, async
page read
[ 40.041319] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 40.041341] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 40.041346] Buffer I/O error on dev bcache0, logical block 1, async page
read
# dd if=/dev/zero of=/dev/bcache0 bs=4k
[ 48.178854] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.181495] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.183988] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.186469] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.188962] Buffer I/O error on dev bcache0, logical block 0, lost async
page write
[ 48.191117] Buffer I/O error on dev bcache0, logical block 1, lost async
page write
[ 48.193295] Buffer I/O error on dev bcache0, logical block 2, lost async
page write
[ 48.195457] Buffer I/O error on dev bcache0, logical block 3, lost async
page write
[ 48.197607] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.200116] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.202597] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.205087] Buffer I/O error on dev bcache0, logical block 4, lost async
page write
[ 48.207229] Buffer I/O error on dev bcache0, logical block 5, lost async
page write
[ 48.209377] Buffer I/O error on dev bcache0, logical block 6, lost async
page write
...
[ 48.362824] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.365085] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.367294] bcache: bch_count_backing_io_errors() dm-0: IO error on backing
device, unrecoverable
[ 48.369457] bcache: bch_cached_dev_error() stop bcache0: too many IO errors
on backing device dm-0
[ 48.369457]
dd: error writing '/dev/bcache0': No space left on device
262142+0 records in
[ 48.866726] bcache: bcache_device_free() bcache0 stopped
262141+0 records out
1073729536 bytes (1.1 GB, 1.0 GiB) copied, 2.27785 s, 471 MB/s
# lsblk -e 252
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 1G 0 loop
loop1 7:1 0 1G 0 loop
└─fake-loop1 253:1 0 1024M 0 dm
fake-loop0 253:0 0 1G 0 dm
# ls /dev/bcache0
ls: cannot access '/dev/bcache0': No such file or directory
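To compare runs without reading the whole console log, the bcache messages can be counted. The following is a rough sketch, assuming the console output was saved to a file and is fed on stdin; the function name summarize_bcache_log is hypothetical, and the patterns are copied from the messages shown in this test.

```shell
#!/bin/sh
# Hypothetical helper: summarize bcache backing-device errors in a saved
# console log read from stdin. It matches the function names that prefix
# the kernel messages, so wrapped continuation lines are ignored.
summarize_bcache_log() {
    awk '
        /bch_count_backing_io_errors\(\)/ { errors++ }
        /bch_cached_dev_error\(\) stop/   { stopped = 1 }
        END {
            printf "backing_io_errors=%d stopped=%s\n", errors, (stopped ? "yes" : "no")
        }
    '
}
```

For example, summarize_bcache_log < console.log, where console.log is an assumed file name. A log from the original kernel contains neither message, so the function should report zero errors and stopped=no.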
--
https://bugs.launchpad.net/bugs/1829563
Title:
bcache: risk of data loss on I/O errors in backing or caching devices
Status in linux package in Ubuntu:
Invalid
Status in linux source package in Bionic:
In Progress
Status in linux source package in Cosmic:
In Progress
Bug description:
[Impact]
* The bcache code in Bionic lacks several fixes to handle
I/O errors in both backing devices and caching devices.
* Partial or permanent errors in backing or caching devices,
especially in writeback mode, can lead to data loss and/or
failed I/O requests going unreported to the application.
* The bcache device might remain available for I/O requests even if
the backing device is offline, so the result of writes is undefined.
[Test Case]
* Detailed test cases/steps for the behavior of almost every
patch with code logic changes are provided in bug comments.
* The patchset has been tested for regressions on each cache
mode (writethrough, writeback, writearound, none) with the
xfstests test suite (on ext4), fio (random read-write) and
iozone (several read/write tests).
[Regression Potential]
* The patchset is relatively large and touches several areas
in bcache code; however, synthetic testing of the patches
has been performed, and extensive regression/stress tests
were run (as mentioned in the Test Case section).
* Many patches in the patchset are 'Fixes' patches to other
patches, and no further 'Fixes' currently exist upstream.
[Other Info]
* Canonical Field Eng. deploys bcache+writeback extensively
(e.g., BootStack, UA cloud, except rare all-flash cases).
[Original Bug Description]
This is a request for a backport of the following upstream patch from
4.18:
"bcache: stop bcache device when backing device is offline"
https://github.com/torvalds/linux/commit/0f0709e6bfc3ce4e8e1c0e8573490c45f76cfeee
Field engineering uses bcache quite extensively and it would be good
to have this in the GA/bionic kernel.