Re: Disk failed while doing scrub
Dāvis Mosāns posted on Tue, 14 Jul 2015 04:54:27 +0300 as excerpted:

> 2015-07-13 11:12 GMT+03:00 Duncan 1i5t5.dun...@cox.net:
>> You say five disks, but nowhere in your post do you mention what raid
>> mode you were using, neither do you post btrfs filesystem show and
>> btrfs filesystem df, as suggested on the wiki, which list that
>> information.
>
> Sorry, I forgot. I'm running Arch Linux (kernel 4.0.7) with btrfs-progs
> v4.1, using RAID1 for metadata and single for data, with the features
> big_metadata, extended_iref, mixed_backref, no_holes, skinny_metadata,
> and mounted with noatime,compress=zlib,space_cache,autodefrag.

Thanks. FWIW, pretty similar here, but running gentoo, now with btrfs-progs v4.1.1 and the mainline 4.2-rc1+ kernel.

BTW, note that space_cache has been the default for quite some time now. I've never actually manually mounted with space_cache on any of my filesystems over several years now, yet they all report it when I check /proc/mounts, etc. So if you're adding that manually, you can kill that option and save the commandline/fstab space. =:^)

> Label: 'Data'  uuid: 1ec5b839-acc6-4f70-be9d-6f9e6118c71c
>         Total devices 5 FS bytes used 7.16TiB
>         devid 1 size 2.73TiB used 2.35TiB path /dev/sdc
>         devid 2 size 1.82TiB used 1.44TiB path /dev/sdd
>         devid 3 size 1.82TiB used 1.44TiB path /dev/sde
>         devid 4 size 1.82TiB used 1.44TiB path /dev/sdg
>         devid 5 size 931.51GiB used 539.01GiB path /dev/sdh
>
> Data, single: total=7.15TiB, used=7.15TiB
> System, RAID1: total=8.00MiB, used=784.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, RAID1: total=16.00GiB, used=14.37GiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B

And note that you can easily and quickly remove those empty single-mode system and metadata chunks, which are an artifact of the way mkfs.btrfs works, using balance filters. btrfs balance start -mprofile=single ... should do it. They're actually working on mkfs.btrfs patches right now to fix it so it doesn't do that; there are active patch and testing threads discussing it. Hopefully for btrfs-progs v4.2. (4.1.1 has the patches for single-device and prep work for multi-device, according to the changelog.)

>>> Because the filesystem still mounts, I assume I should do btrfs
>>> device delete /dev/sdd /mntpoint and then restore damaged files from
>>> backup.
>>
>> You can try a replace, but with a failing drive still connected,
>> people report mixed results. It's likely to fail as it can't read
>> certain blocks to transfer them to the new device.
>
> As I understand it, device delete will copy data from that disk and
> distribute it across the rest of the disks, while btrfs replace will
> copy to a new disk which must be at least the size of the disk I'm
> replacing.

Sorry. You wrote delete, I read replace. How'd I do that? =:^(

You are absolutely correct. Delete would be better here. I guess I had just been reading a thread discussing the problems I mentioned with replace, and saw what I expected to see, not what you actually wrote.

>> There's no such partial-file with null-fill tools shipped just yet.
>
> From the journal I have only 14 files mentioned where errors occurred.
> Now 13 of them don't throw any errors and their SHAs match my backups,
> so they're fine.

Good. I was going on the assumption that the questionable device was in much worse shape than that.

> And actually btrfs does allow copying/reading that one damaged file,
> only I get an I/O error when trying to read data from those broken
> sectors.

Good, and good to know. Thanks. =:^)

> The best and correct way to recover a file is using ddrescue.

I was just going to mention ddrescue. =:^)
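For reference, a minimal sketch of such a ddrescue run (the mapfile name is an assumption; the file paths follow the thread):

$ # first pass: copy everything readable; unreadable spots are left as
$ # holes in the output (reading back as zeros), and the mapfile
$ # records exactly which byte ranges failed:
$ ddrescue damaged_file /tmp/damaged_file /tmp/damaged_file.map
$ # optionally retry just the bad areas a few more times:
$ ddrescue -r3 damaged_file /tmp/damaged_file /tmp/damaged_file.map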
> $ du -m /tmp/damaged_file
> 6251    /tmp/damaged_file
>
> So basically only about 8 KiB is unrecoverable from this file.
> Probably a tool could be created that gets even more data back, using
> knowledge of btrfs internals.

>> There /is/, however, a command that can be used to either regenerate
>> or zero-out the checksum tree. See btrfs check --init-csum-tree.
>
> It seems you can't specify a path/file for it, and it's quite a
> destructive action if you only want data about one specific file.

Yes. It's whole-filesystem-all-or-nothing, unfortunately. =:^(

> I did a second scrub, and this time there aren't that many
> uncorrectable errors, and there are no csum_errors, so
> --init-csum-tree is useless here, I think.

Agreed.

> Most likely the previous scrub got that many errors because it still
> continued for a bit even though the disk didn't respond.

Yes.

> scrub status [...]
>         read_errors: 2
>         csum_errors: 0
>         verify_errors: 0
>         no_csum: 89600
>         csum_discards: 656214
>         super_errors: 0
>         malloc_errors: 0
>         uncorrectable_errors: 2
>         unverified_errors: 0
>         corrected_errors: 0
>         last_physical: 2590041112576

OK, that matches up with 8 KiB bad, since blocks are 4 KiB and there are two uncorrectable errors. With the scrub
Re: Disk failed while doing scrub
2015-07-13 11:12 GMT+03:00 Duncan 1i5t5.dun...@cox.net:

> You say five disks, but nowhere in your post do you mention what raid
> mode you were using, neither do you post btrfs filesystem show and
> btrfs filesystem df, as suggested on the wiki, which list that
> information.

Sorry, I forgot. I'm running Arch Linux (kernel 4.0.7) with btrfs-progs v4.1, using RAID1 for metadata and single for data, with the features big_metadata, extended_iref, mixed_backref, no_holes, skinny_metadata, and mounted with noatime,compress=zlib,space_cache,autodefrag.

Label: 'Data'  uuid: 1ec5b839-acc6-4f70-be9d-6f9e6118c71c
        Total devices 5 FS bytes used 7.16TiB
        devid 1 size 2.73TiB used 2.35TiB path /dev/sdc
        devid 2 size 1.82TiB used 1.44TiB path /dev/sdd
        devid 3 size 1.82TiB used 1.44TiB path /dev/sde
        devid 4 size 1.82TiB used 1.44TiB path /dev/sdg
        devid 5 size 931.51GiB used 539.01GiB path /dev/sdh

Data, single: total=7.15TiB, used=7.15TiB
System, RAID1: total=8.00MiB, used=784.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID1: total=16.00GiB, used=14.37GiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

>> Because the filesystem still mounts, I assume I should do btrfs
>> device delete /dev/sdd /mntpoint and then restore damaged files from
>> backup.
>
> You can try a replace, but with a failing drive still connected, people
> report mixed results. It's likely to fail as it can't read certain
> blocks to transfer them to the new device.

As I understand it, device delete will copy data from that disk and distribute it across the rest of the disks, while btrfs replace will copy to a new disk which must be at least the size of the disk I'm replacing. Assuming the other existing disks are good, why would replace be preferable over delete? Because delete could fail, but replace not?

> There's no such partial-file with null-fill tools shipped just yet.
> Those files normally simply trigger errors trying to read them, because
> btrfs won't let you at them if the checksum doesn't verify.

From the journal I have only 14 files mentioned where errors occurred. Now 13 of them don't throw any errors and their SHAs match my backups, so they're fine.

And actually btrfs does allow copying/reading that one damaged file, only I get an I/O error when trying to read data from those broken sectors:

kernel: drivers/scsi/mvsas/mv_sas.c 1863:Release slot [0] tag[0], task [88011c8c9900]:
kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 0001, slot [0].
kernel: sas: sas_ata_task_done: SAS error 8a
kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
kernel: sas: ata9: end_device-7:2: cmd error handler
kernel: sas: ata7: end_device-7:0: dev error handler
kernel: sas: ata14: end_device-7:7: dev error handler
kernel: ata9.00: exception Emask 0x0 SAct 0x4000 SErr 0x0 action 0x0
kernel: ata9.00: failed command: READ FPDMA QUEUED
kernel: ata9.00: cmd 60/00:00:00:33:a1/0f:00:ab:00:00/40 tag 14 ncq 1966080 in
        res 41/40:00:48:40:a1/00:0f:ab:00:00/00 Emask 0x409 (media error) F
kernel: ata9.00: status: { DRDY ERR }
kernel: ata9.00: error: { UNC }
kernel: ata9.00: configured for UDMA/133
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
kernel: sd 7:0:2:0: [sdd] tag#0 Sense Key : 0x3 [current] [descriptor]
kernel: sd 7:0:2:0: [sdd] tag#0 ASC=0x11 ASCQ=0x4
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 ab a1 33 00 00 0f 00 00
kernel: blk_update_request: I/O error, dev sdd, sector 2879471688
kernel: ata9: EH complete
kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1

but all other sectors can be copied fine:

$ du -m ./damaged_file
6250    ./damaged_file
$ cp ./damaged_file /tmp/
cp: error reading ‘damaged_file’: Input/output error
$ du -m /tmp/damaged_file
4335    /tmp/damaged_file

cp copies the first part of the file correctly, and I verified that the SHAs of both the start of the file (first 4336M) and the end of the file (last 1890M) match the backup:

$ head -c 4336M ./damaged_file | sha256sum
e81b20bfa7358c9f5a0ed165bffe43185abc59e35246e52a7be1d43e6b7e040d  -
$ head -c 4337M ./damaged_file | sha256sum
head: error reading ‘./damaged_file’: Input/output error
$ tail -c 1890M ./damaged_file | sha256sum
941568f4b614077858cb8c8dd262bb431bf4c45eca936af728ecffc95619cb60  -
$ tail -c 1891M ./damaged_file | sha256sum
tail: error reading ‘./damaged_file’: Input/output error

With dd I can also copy almost all of the file, but with the noerror option it excludes those regions from the target file rather than filling them with nulls, so this isn't good for recovery:

$ dd conv=noerror if=damaged_file of=/tmp/damaged_file
dd: error reading ‘damaged_file’: Input/output error
8880328+0 records in
8880328+0 records out
4546727936 bytes (4,5 GB) copied, 69,7282 s, 65,2 MB/s
dd: error reading ‘damaged_file’: Input/output error
8930824+0 records in
8930824+0 records out
4572581888 bytes (4,6 GB) copied, 113,648 s, 40,2 MB/s
12801720+0 records in
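A hedged aside: adding conv=sync changes that behaviour, padding each short or failed read out to the block size with NUL bytes so the rest of the data stays at the correct offsets. The bs=4096 here is an assumption, matching the 4 KiB block size discussed later in the thread:

$ # conv=noerror continues past read errors; conv=sync zero-pads each
$ # failed 4 KiB block, so the copy is null-filled instead of shifted:
$ dd if=damaged_file of=/tmp/damaged_file bs=4096 conv=noerror,sync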
Disk failed while doing scrub
Hello,

Short version: while doing a scrub on a 5-disk btrfs filesystem, /dev/sdd failed, and there were also some errors on another disk (/dev/sdh).

Because the filesystem still mounts, I assume I should do btrfs device delete /dev/sdd /mntpoint and then restore damaged files from backup.

Are all affected files listed in the journal? There are messages about "x callbacks suppressed", so I'm not sure, and if they aren't all listed, how do I get a full list of damaged files? Also, I wonder if there are any tools to recover partial file fragments and reconstruct the file (with the missing fragments filled with nulls)? I assume there's no point in running btrfs check --check-data-csum because scrub already checks that?
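A hedged aside: the "callbacks suppressed" messages are kernel printk rate-limiting, so the journal may well not list every affected file. One rough way to sweep what is there, assuming the scrub warnings include a "(path: ...)" part as on recent kernels (the exact message text varies by kernel version):

$ # collect the unique file paths mentioned in kernel BTRFS messages:
$ journalctl -k | grep 'BTRFS' | grep -o 'path: [^)]*' | sort -u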
From the journal:

kernel: drivers/scsi/mvsas/mv_sas.c 1863:Release slot [1] tag[1], task [88007efb8800]:
kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 0002, slot [1].
kernel: sas: sas_ata_task_done: SAS error 8a
kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
kernel: sas: ata9: end_device-7:2: cmd error handler
kernel: sas: ata7: end_device-7:0: dev error handler
kernel: sas: ata14: end_device-7:7: dev error handler
kernel: ata9.00: exception Emask 0x0 SAct 0x800 SErr 0x0 action 0x0
kernel: ata9.00: failed command: READ FPDMA QUEUED
kernel: ata9.00: cmd 60/00:00:00:3d:a1/04:00:ab:00:00/40 tag 11 ncq 524288 in
        res 41/40:00:48:40:a1/00:04:ab:00:00/00 Emask 0x409 (media error) F
kernel: ata9.00: status: { DRDY ERR }
kernel: ata9.00: error: { UNC }
kernel: ata9.00: configured for UDMA/133
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
kernel: sd 7:0:2:0: [sdd] tag#0 Sense Key : 0x3 [current] [descriptor]
kernel: sd 7:0:2:0: [sdd] tag#0 ASC=0x11 ASCQ=0x4
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 ab a1 3d 00 00 04 00 00
kernel: blk_update_request: I/O error, dev sdd, sector 2879471688
kernel: ata9: EH complete
kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
kernel: drivers/scsi/mvsas/mv_sas.c 1863:Release slot [1] tag[1], task [88007efb9a00]:
kernel: drivers/scsi/mvsas/mv_94xx.c 625:command active 0003, slot [1].
kernel: sas: sas_ata_task_done: SAS error 8a
kernel: sas: Enter sas_scsi_recover_host busy: 2 failed: 2
kernel: sas: trying to find task 0x8801e0cadb00
kernel: sas: sas_scsi_find_task: aborting task 0x8801e0cadb00
kernel: sas: sas_scsi_find_task: task 0x8801e0cadb00 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x8801e0cadb00 is aborted
kernel: sas: ata9: end_device-7:2: cmd error handler
kernel: sas: ata8: end_device-7:1: cmd error handler
kernel: sas: ata7: end_device-7:0: dev error handler
kernel: sas: ata8: end_device-7:1: dev error handler
kernel: ata8.00: exception Emask 0x0 SAct 0x4 SErr 0x0 action 0x6 frozen
kernel: ata8.00: failed command: READ FPDMA QUEUED
kernel: ata8.00: cmd 60/00:00:00:1b:36/04:00:bf:00:00/40 tag 18 ncq 524288 in
        res 40/00:08:00:58:11/00:00:a6:00:00/40 Emask 0x4 (timeout)
kernel: ata8.00: status: { DRDY }
kernel: ata8: hard resetting link
kernel: sas: ata9: end_device-7:2: dev error handler
kernel: sas: ata14: end_device-7:7: dev error handler
kernel: ata9: log page 10h reported inactive tag 26
kernel: ata9.00: exception Emask 0x1 SAct 0x40 SErr 0x0 action 0x6
kernel: ata9.00: failed command: READ FPDMA QUEUED
kernel: ata9.00: cmd 60/08:00:48:40:a1/00:00:ab:00:00/40 tag 22 ncq 4096 in
        res 01/04:a8:40:40:a1/00:00:ab:00:00/40 Emask 0x3 (HSM violation)
kernel: ata9.00: status: { ERR }
kernel: ata9.00: error: { ABRT }
kernel: ata9: hard resetting link
kernel: sas: sas_form_port: phy1 belongs to port1 already(1)!
kernel: ata9.00: both IDENTIFYs aborted, assuming NODEV
kernel: ata9.00: revalidation failed (errno=-2)
kernel: drivers/scsi/mvsas/mv_sas.c 1428:mvs_I_T_nexus_reset for device[1]:rc= 0
kernel: ata8.00: configured for UDMA/133
kernel: ata8.00: device reported invalid CHS sector 0
kernel: ata8: EH complete
kernel: ata9: hard resetting link
kernel: ata9.00: both IDENTIFYs aborted, assuming NODEV
kernel: ata9.00: revalidation failed (errno=-2)
kernel: ata9: hard resetting link
kernel: ata9.00: both IDENTIFYs aborted, assuming NODEV
kernel: ata9.00: revalidation failed (errno=-2)
kernel: ata9.00: disabled
kernel: ata9: EH complete
kernel: sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 ab a1 40 48 00 00 08 00
kernel: blk_update_request: I/O error, dev sdd, sector 2879471688
kernel: sd 7:0:2:0: [sdd] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=0x00
kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x28 28 00 ab a1 45 00 00 06 00 00
kernel: BTRFS: unable to fixup (regular) error at logical 7390602616832 on dev /dev/sdd
kernel: BTRFS: unable to fixup (regular) error at
Re: Disk failed while doing scrub
Dāvis Mosāns posted on Mon, 13 Jul 2015 09:26:05 +0300 as excerpted:

> Short version: while doing a scrub on a 5-disk btrfs filesystem,
> /dev/sdd failed, and there were also some errors on another disk
> (/dev/sdh).

You say five disks, but nowhere in your post do you mention what raid mode you were using, neither do you post btrfs filesystem show and btrfs filesystem df, as suggested on the wiki, which list that information.

FWIW, btrfs defaults for a multi-device filesystem are raid1 metadata, raid0 data. If you didn't specify a raid level at mkfs time, it's very likely that's what you're using. The scrub results seem to support this, as if the data had been raid1 or raid10, nearly all the errors should have been correctable by pulling from the second copy. And raid5/6 should have been able to recover from parity, tho this mode is new enough that it's still not recommended, as the chances of bugs, and thus of failure to work properly, are much higher.

So you really should have been using raid1/10 if you wanted device-failure tolerance, but you didn't say, and if you're using the defaults, as seems reasonably likely, your data was raid0, and thus it's likely many/most files are either gone or damaged beyond repair.

(As it happens, I have a number of btrfs raid1 data/metadata on a pair of partitioned ssds, with each btrfs on a corresponding partition on both of them, with one of the ssds developing bad sectors and basically slowly failing. But the other member of the raid1 pair is solid and I have backups, as well as a spare I can replace the failing one with when I decide it's time, so I've been letting the bad one stick around due as much as anything to morbid curiosity, watching it slowly fail. So I know exactly how scrub on btrfs raid1 behaves in a bad-sector case, pulling the copy from the good device to overwrite the bad copy, triggering the device's sector remapping in the process. Despite all the read errors, they've all been correctable, because I'm using raid1 for both data and metadata.)

> Because the filesystem still mounts, I assume I should do btrfs device
> delete /dev/sdd /mntpoint and then restore damaged files from backup.

You can try a replace, but with a failing drive still connected, people report mixed results. It's likely to fail as it can't read certain blocks to transfer them to the new device.

With raid1 or better, physically disconnecting the failing device and doing a device delete missing (or replace missing, but AFAIK this doesn't work with released versions and I'm not sure if it's even in integration yet, tho there are patches on-list that should make it work) can work. With raid0/single, you can mount with a missing device if you use degraded,ro, but obviously that'll only let you try to copy files off, and you'll likely not have a lot of luck with raid0, with files missing, but a bit more luck with single.

In the likely raid0/single case, your best bet is probably to try copying off what you can, and/or restoring from backups. See the discussion below.
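A minimal sketch of that degraded, read-only mount (the device name and mountpoint here are placeholders, not from the thread):

$ # mount with a member device missing, read-only, to copy files off:
$ mount -o degraded,ro /dev/sdc /mnt/rescue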
> Are all affected files listed in the journal? There are messages about
> "x callbacks suppressed", so I'm not sure, and if they aren't all
> listed, how do I get a full list of damaged files? Also, I wonder if
> there are any tools to recover partial file fragments and reconstruct
> the file (with the missing fragments filled with nulls)? I assume
> there's no point in running btrfs check --check-data-csum because scrub
> already checks that?

There's no such partial-file with null-fill tools shipped just yet. Those files normally simply trigger errors trying to read them, because btrfs won't let you at them if the checksum doesn't verify.

There /is/, however, a command that can be used to either regenerate or zero-out the checksum tree. See btrfs check --init-csum-tree. Current versions recalculate the csums; older versions (btrfsck, as it was before btrfs check) simply zeroed it out. Then you can read the file despite bad checksums, tho you'll still get errors if a block physically cannot be read.

There's also btrfs restore, which works on the unmounted filesystem without actually writing to it, copying the files it can read to a new location, which of course has to be a filesystem with enough room to restore the files to, altho it's possible to tell restore to do only specific subdirs, for instance.

What I'd recommend depends on how complete and how recent your backup is. If it's complete and recent enough, probably the easiest thing is to simply blow away the bad filesystem and start over, recovering from the backup to a new filesystem.

If there are files you'd like to get back that weren't backed up, or where the backup is old, then since the filesystem is mountable, I'd copy everything off it I could. Then I'd try restore, letting it restore to the same location I had copied to, but NOT using the --overwrite option, so it only wrote any files it could restore that the copy wasn't able to get.
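A minimal sketch of that recovery sequence (the device and target directory are placeholders, and the --path-regex form is only an illustration of the subdir limiting mentioned above):

$ # against the unmounted filesystem: copy out whatever restore can
$ # read; without -o/--overwrite, files the earlier copy already got
$ # are left alone:
$ btrfs restore /dev/sdc /mnt/recovery
$ # or restore just one subtree instead of the whole filesystem:
$ btrfs restore --path-regex '^/(|home(|/user(|/.*)))$' /dev/sdc /mnt/recovery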