Dear btrfs team / community,

Sometimes the kernel resets the USB subsystem (this looks like a hardware
problem), so all USB devices are detached and then attached again. After a
few hours of struggling, btrfs finally reaches the point where it forces the
filesystem read-only. During this time, when I try to access the mounted
filesystem (/mnt/backups), listing succeeds for some directories and fails
with an error for others:
root@debian:~# ll /mnt/backups/
total 14334
drwxr-xr-x 1 adm users 116 Sep 12 00:35 .
drwxrwxr-x 1 adm users 164 Sep 19 22:44 ..
-rw-r--r-- 1 adm users 79927 Feb 7 2018 contacts.zip
drwxr-xr-x 1 adm users 254 Feb 4 2018 attic
drwxr-xr-x 1 adm users 16 Feb 23 2018 recent
...
root@debian:~# ll /mnt/backups/attic/
ls: reading directory '/mnt/backups/attic/': Input/output error
total 0
drwxr-xr-x 1 adm users 254 Feb 4 2018 .
drwxr-xr-x 1 adm users 116 Sep 12 00:35 ..
It looks like this depends on whether the content is already in the cache...
What is surprising is that creating a file still succeeds:
root@debian:~# touch /mnt/backups/.mounted
root@debian:~# ll /mnt/backups/.mounted
-rw-r--r-- 1 root root 0 Sep 20 16:52 /mnt/backups/.mounted
root@debian:~# rm /mnt/backups/.mounted
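One possible explanation, which is an assumption and not verified against the btrfs code, is that touch only changes cached metadata and returns before anything is written to the drives. A probe that forces the write out with fsync should be a better health check. The sketch below is in Python; the function name probe_write and the ".probe-" file prefix are made up for this example.

```python
import os
import tempfile


def probe_write(directory):
    """Create, fsync, and delete a small file in `directory`.

    Returns None on success, or the OSError that was raised.
    """
    try:
        fd, path = tempfile.mkstemp(prefix=".probe-", dir=directory)
    except OSError as exc:
        # For example EROFS once the filesystem has been forced read-only.
        return exc
    try:
        os.write(fd, b"probe\n")
        # Ask the kernel to push the file to the underlying device(s);
        # an I/O error is expected to surface here rather than in touch.
        os.fsync(fd)
        return None
    except OSError as exc:
        return exc
    finally:
        # Best-effort cleanup; failures here are ignored on purpose.
        try:
            os.close(fd)
        except OSError:
            pass
        try:
            os.unlink(path)
        except OSError:
            pass
```

Calling probe_write("/mnt/backups") from a cron job or a monitoring script would then report an error object instead of silently succeeding, assuming fsync really does fail once both drives are gone.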
My btrfs volume consists of two identical drives combined into a RAID1 volume:
# btrfs filesystem df /mnt/backups
Data, RAID1: total=880.00GiB, used=878.96GiB
System, RAID1: total=8.00MiB, used=144.00KiB
Metadata, RAID1: total=2.00GiB, used=1.13GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
# btrfs filesystem show /mnt/backups
Label: none uuid: a657364b-36d2-4c1f-8e5d-dc3d28166190
Total devices 2 FS bytes used 880.09GiB
devid 1 size 3.64TiB used 882.01GiB path /dev/sdf
devid 2 size 3.64TiB used 882.01GiB path /dev/sde
As a workaround I can monitor the dmesg output, but:
1. It would be nice if I could tell btrfs to remount the filesystem read-only
once a certain error rate per minute is reached.
2. It would be nice if btrfs could detect that both drives are unavailable and
unmount the filesystem (a read-only mount won't help much in that case).
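The monitoring workaround could be sketched roughly as follows. This is only an illustration in Python, not an existing btrfs feature: the function names, the 60-second window, and the threshold are assumptions chosen for the example. It expects one kernel message per line (unwrapped, in chronological order), reads the cumulative per-device counters from the "errs:" lines, and uses the kernel timestamp in square brackets to measure how much each counter grew within the most recent window.

```python
import re
from collections import defaultdict

# Matches the per-device counter lines that btrfs prints, e.g.
# "[1203125.061068] BTRFS error (device sdf): bdev /dev/sdh errs: wr 0, rd 1, ..."
ERRS_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BTRFS error \(device (?P<fs>[^)]+)\): "
    r"bdev (?P<bdev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)
COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def parse_line(line):
    """Return (kernel_seconds, bdev, total_errors), or None for other lines."""
    m = ERRS_RE.search(line)
    if m is None:
        return None
    total = sum(int(m.group(name)) for name in COUNTERS)
    return float(m.group("ts")), m.group("bdev"), total


def errors_in_window(lines, window=60.0):
    """Map each bdev to the growth of its error total in the last `window` s.

    The growth is measured between the oldest and newest sample inside the
    window, so the increment reported by the oldest sample is not counted.
    """
    samples = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            ts, bdev, total = parsed
            samples[bdev].append((ts, total))
    if not samples:
        return {}
    newest = max(points[-1][0] for points in samples.values())
    growth = {}
    for bdev, points in samples.items():
        recent = [total for ts, total in points if ts >= newest - window]
        growth[bdev] = recent[-1] - recent[0] if recent else 0
    return growth


def decide(growth, threshold):
    """Suggest "ok", "remount-ro" (some devices failing) or "unmount" (all)."""
    failing = [bdev for bdev, g in growth.items() if g >= threshold]
    if growth and len(failing) == len(growth):
        return "unmount"
    if failing:
        return "remount-ro"
    return "ok"
```

A wrapper script could then run "mount -o remount,ro /mnt/backups" for the "remount-ro" result or "umount /mnt/backups" for "unmount". Which threshold is sensible, and whether unmounting is safe while files are open, is left open here.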
Kernel log for Linux v4.14.2 is attached.
--
With best regards,
Dmitry
Jun 29 18:54:56 debian kernel: [1197865.440396] usb 4-2: USB disconnect, device number 3
Jun 29 18:54:56 debian kernel: [1197865.440403] usb 4-2.2: USB disconnect, device number 5
Jun 29 18:54:56 debian kernel: [1197865.476118] usb 4-2.3: USB disconnect, device number 8
Jun 29 18:54:56 debian kernel: [1197865.549379] usb 4-2.4: USB disconnect, device number 7
...
Jun 29 18:54:58 debian kernel: [1197867.517728] usb-storage 4-2.3:1.0: USB Mass Storage device detected
Jun 29 18:54:58 debian kernel: [1197867.524021] usb-storage 4-2.3:1.0: Quirks match for vid 152d pid 0567: 5000000
Jun 29 18:54:58 debian kernel: [1197867.603859] usb 4-2.4: new full-speed USB device number 13 using ehci-pci
Jun 29 18:54:58 debian kernel: [1197867.725595] usb-storage 4-2.4:1.2: USB Mass Storage device detected
Jun 29 18:54:58 debian kernel: [1197867.728602] scsi host9: usb-storage 4-2.4:1.2
Jun 29 18:54:59 debian kernel: [1197868.528737] scsi 7:0:0:0: Direct-Access ST4000DM 004-2CV104 0125 PQ: 0 ANSI: 6
Jun 29 18:54:59 debian kernel: [1197868.529310] scsi 7:0:0:1: Direct-Access ST4000DM 004-2CV104 0125 PQ: 0 ANSI: 6
Jun 29 18:54:59 debian kernel: [1197868.530093] sd 7:0:0:0: Attached scsi generic sg5 type 0
Jun 29 18:54:59 debian kernel: [1197868.530588] sd 7:0:0:1: Attached scsi generic sg6 type 0
Jun 29 18:54:59 debian kernel: [1197868.533064] sd 7:0:0:1: [sdh] Very big device. Trying to use READ CAPACITY(16).
Jun 29 18:54:59 debian kernel: [1197868.533619] sd 7:0:0:1: [sdh] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
Jun 29 18:54:59 debian kernel: [1197868.533626] sd 7:0:0:1: [sdh] 4096-byte physical blocks
Jun 29 18:54:59 debian kernel: [1197868.534063] sd 7:0:0:1: [sdh] Write Protect is off
Jun 29 18:54:59 debian kernel: [1197868.534069] sd 7:0:0:1: [sdh] Mode Sense: 67 00 10 08
Jun 29 18:54:59 debian kernel: [1197868.534422] sd 7:0:0:1: [sdh] No Caching mode page found
Jun 29 18:54:59 debian kernel: [1197868.534542] sd 7:0:0:1: [sdh] Assuming drive cache: write through
Jun 29 18:54:59 debian kernel: [1197868.535563] sd 7:0:0:1: [sdh] Very big device. Trying to use READ CAPACITY(16).
Jun 29 18:54:59 debian kernel: [1197868.536702] sd 7:0:0:0: [sdg] Very big device. Trying to use READ CAPACITY(16).
Jun 29 18:54:59 debian kernel: [1197868.537454] sd 7:0:0:0: [sdg] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
Jun 29 18:54:59 debian kernel: [1197868.537459] sd 7:0:0:0: [sdg] 4096-byte physical blocks
Jun 29 18:54:59 debian kernel: [1197868.538327] sd 7:0:0:0: [sdg] Write Protect is off
Jun 29 18:54:59 debian kernel: [1197868.538331] sd 7:0:0:0: [sdg] Mode Sense: 67 00 10 08
...
Jun 29 20:22:35 debian kernel: [1203125.061068] BTRFS error (device sdf): bdev /dev/sdh errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
Jun 29 20:22:35 debian kernel: [1203125.061244] BTRFS error (device sdf): bdev /dev/sdg errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
Jun 29 20:22:35 debian kernel: [1203125.061412] BTRFS error (device sdf): bdev /dev/sdh errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
Jun 29 20:22:35 debian kernel: [1203125.061530] BTRFS error (device sdf): bdev /dev/sdg errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
Jun 29 20:22:35 debian kernel: [1203125.061770] BTRFS error (device sdf): bdev /dev/sdh errs: wr 0, rd 3, flush 0, corrupt 0, gen 0
Jun 29 20:22:35 debian kernel: [1203125.061894] BTRFS error (device sdf): bdev /dev/sdg errs: wr 0, rd 3, flush 0, corrupt 0, gen 0
Jun 29 20:22:40 debian kernel: [1203130.411911] btrfs_dev_stat_print_on_error: 42 callbacks suppressed
...
Jun 29 23:51:36 debian kernel: [1215666.863475] BTRFS error (device sdf): bdev /dev/sdh errs: wr 0, rd 1867, flush 0, corrupt 0, gen 0
Jun 29 23:51:36 debian kernel: [1215666.864464] BTRFS error (device sdf): bdev /dev/sdg errs: wr 0, rd 1867, flush 0, corrupt 0, gen 0
Jun 29 23:51:36 debian kernel: [1215666.865392] BTRFS: error (device sdf) in btrfs_run_delayed_refs:3089: errno=-5 IO failure
Jun 29 23:51:36 debian kernel: [1215666.866354] BTRFS info (device sdf): forced readonly
Jun 29 23:51:36 debian kernel: [1215666.866357] BTRFS warning (device sdf): Skipping commit of aborted transaction.
Jun 29 23:51:36 debian kernel: [1215666.866360] BTRFS: error (device sdf) in cleanup_transaction:1873: errno=-5 IO failure
Jun 29 23:51:36 debian kernel: [1215666.868305] BTRFS error (device sdf): commit super ret -5
Jun 29 23:51:36 debian kernel: [1215666.869849] BTRFS error (device sdf): cleaner transaction attach returned -30