Re: How to recover from btrfs scrub errors? (uncorrectable errors, checksum error at logical)
On 2018/10/22 2:29 PM, Otto Kekäläinen wrote:
> I never got a reply to this thread,

I replied to you but got no reply:
https://lore.kernel.org/linux-btrfs/eba5de6f-535a-0f5d-e415-9cd622d71...@gmx.com/

And your steps are just what I suggested.

Thanks,
Qu

> but I am now replying to myself in
> case somebody has the same issue and is reading the archive:
>
> The problem went away after:
> - deleted all snapshots as they seemed to slow down btrfs I/O so much
> that simple commands like rm and rsync were unusable
> - replaced the disk that had the corrupted file (just in case -
> smartctl did not indicate any disk failures) with btrfs replace
> - rsynced files from another location to this filesystem so that the
> corrupted files got overwritten
>
> Now btrfs scrub does not find any corruption anymore and the
> filesystem I/O speed is usable, though still slower than what it used
> to be in the past.
Re: How to recover from btrfs scrub errors? (uncorrectable errors, checksum error at logical)
I never got a reply to this thread, but I am now replying to myself in
case somebody has the same issue and is reading the archive:

The problem went away after I:
- deleted all snapshots, as they seemed to slow down btrfs I/O so much
  that simple commands like rm and rsync were unusable
- replaced the disk that had the corrupted file with btrfs replace
  (just in case; smartctl did not indicate any disk failures)
- rsynced files from another location to this filesystem so that the
  corrupted files got overwritten

Now btrfs scrub no longer finds any corruption and the filesystem I/O
speed is usable, though still slower than it used to be.

--
Otto Kekäläinen
CEO Seravo
+358 44 566 2204
Follow me at @ottokekalainen
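Editorial sketch: one way to confirm the end state described in this message (scrub no longer reporting uncorrectable errors) is to parse the text printed by `btrfs scrub status`. The Python below assumes the status layout shown in this thread ("corrected errors: N, uncorrectable errors: N, unverified errors: N"); other btrfs-progs versions may print a different layout, and the function names are made up for illustration.

```python
import re


def scrub_error_counts(status_text: str) -> dict:
    """Extract the error counters from `btrfs scrub status` text.

    Assumes the "<label> errors: <number>" layout quoted in this thread.
    """
    counts = {}
    for label in ("corrected", "uncorrectable", "unverified"):
        match = re.search(rf"\b{label} errors:\s*(\d+)", status_text)
        if match:
            counts[label] = int(match.group(1))
    return counts


def scrub_is_clean(status_text: str) -> bool:
    """True only if the uncorrectable counter is present and zero."""
    # A missing counter yields None, which is treated as "unknown", not clean.
    return scrub_error_counts(status_text).get("uncorrectable") == 0


if __name__ == "__main__":
    sample = (
        "total bytes scrubbed: 791.15GiB with 18 errors\n"
        "error details: csum=18\n"
        "corrected errors: 0, uncorrectable errors: 18, unverified errors: 0\n"
    )
    print(scrub_error_counts(sample))
    print(scrub_is_clean(sample))
```

The status text would typically come from running `btrfs scrub status` on the mount point and capturing its output; that step is left out here because it needs root access and a real filesystem.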
Re: How to recover from btrfs scrub errors? (uncorrectable errors, checksum error at logical)
On 2018/10/15 3:50 PM, Otto Kekäläinen wrote:
> Hello!
>
> I am trying to figure out how to recover from errors detected by btrfs scrub.
>
> Scrub status reports:
>
> scrub status for 4f4479d5-648a-45b9-bcbf-978c766aeb41
> scrub started at Mon Oct 15 10:02:28 2018, running for 00:35:39
> total bytes scrubbed: 791.15GiB with 18 errors
> error details: csum=18
> corrected errors: 0, uncorrectable errors: 18, unverified errors: 0
>
> Kernel log contains lines like
>
> BTRFS warning (device dm-8): checksum error at logical 7351706472448 on dev
> /dev/mapper/disk6tb, sector 61412648, root 12725, inode 152358265,
> offset 483328:
> path resolving failed with ret=-2
>
> I've tried so far:
> - deleting the files (when path is visible)

Please ensure that no other subvolume or snapshot contains the same
file or a reflink to it.

If the path is not visible, please use the root and inode number to
locate the culprit file. The "find" command supports searching by inode
number, and the "btrfs subvolume list" command will show the subvolume
number.

It is also recommended to sync the fs before scrubbing, in case the
culprit inode has only been orphaned but not yet deleted from disk.

> - overwriting the files with new data

If you only overwrite the culprit sector, the write could get CoWed,
leaving the original data extent in place. You need to ensure that the
old data is not referenced by any other root/inode.

Please ensure there is no reflink/snapshot first. Then delete the file
or overwrite the whole culprit file.

Thanks,
Qu

> - changed disk (with btrfs replace)
>
> The checksum errors however persist.
> How do I get rid of them?
>
>
> The files are logs and other non-vital information. I am fine with
> deleting the corrupted files. It is OK to recover so that I lose a
> few gigabytes of data, but not the entire filesystem.
>
> Setup is a multi-disk btrfs filesystem, data single, metadata RAID-1
> Mounted with:
>
> /dev/mapper/wdc3td on /data type btrfs
> (rw,noatime,compress=lzo,space_cache,subvolid=5,subvol=/)
>
> I've read lots of online sources on the topic but none of these help
> me on how to recover from the current state:
>
> https://btrfs.wiki.kernel.org/index.php/Btrfsck
> http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html
> https://wiki.archlinux.org/index.php/Identify_damaged_files#Find_damaged_files
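Editorial sketch: the advice in this reply to locate the culprit file by inode number can be illustrated in Python. The function below does roughly what `find <path> -xdev -inum <inode>` does. It assumes the subvolume whose ID matches the "root" number in the kernel warning is reachable at some path; which path that is has to be looked up with `btrfs subvolume list`. The function name and the example path are hypothetical. The device check relies on the understanding that each btrfs subvolume reports its own `st_dev`, so nested subvolumes (where the same inode number can refer to a different file) are skipped.

```python
import os
from typing import Iterator


def _same_device(path: str, dev: int) -> bool:
    """True if `path` can be stat'ed and lives on device `dev`."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False  # entry vanished or is unreadable


def find_by_inode(subvol_path: str, inode: int) -> Iterator[str]:
    """Yield every path under subvol_path whose inode number equals `inode`."""
    top_dev = os.lstat(subvol_path).st_dev
    for dirpath, dirnames, filenames in os.walk(subvol_path):
        # Prune directories on another device (other filesystems or,
        # on btrfs, nested subvolumes) so os.walk does not descend into them.
        dirnames[:] = [
            d for d in dirnames if _same_device(os.path.join(dirpath, d), top_dev)
        ]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue
            if st.st_ino == inode and st.st_dev == top_dev:
                yield path


if __name__ == "__main__":
    # Hypothetical path; 152358265 is the inode from the warning in this thread.
    for hit in find_by_inode("/data/some-subvolume", 152358265):
        print(hit)
```

A file with several hard links is reported once per link. Snapshots of the subvolume are separate roots, so each snapshot that may still hold the file has to be searched on its own.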
How to recover from btrfs scrub errors? (uncorrectable errors, checksum error at logical)
Hello!

I am trying to figure out how to recover from errors detected by btrfs scrub.

Scrub status reports:

    scrub status for 4f4479d5-648a-45b9-bcbf-978c766aeb41
    scrub started at Mon Oct 15 10:02:28 2018, running for 00:35:39
    total bytes scrubbed: 791.15GiB with 18 errors
    error details: csum=18
    corrected errors: 0, uncorrectable errors: 18, unverified errors: 0

Kernel log contains lines like

    BTRFS warning (device dm-8): checksum error at logical 7351706472448 on dev
    /dev/mapper/disk6tb, sector 61412648, root 12725, inode 152358265,
    offset 483328:
    path resolving failed with ret=-2

I've tried so far:
- deleting the files (when path is visible)
- overwriting the files with new data
- changed disk (with btrfs replace)

The checksum errors however persist.
How do I get rid of them?

The files are logs and other non-vital information. I am fine with
deleting the corrupted files. It is OK to recover so that I lose a
few gigabytes of data, but not the entire filesystem.

Setup is a multi-disk btrfs filesystem, data single, metadata RAID-1
Mounted with:

    /dev/mapper/wdc3td on /data type btrfs
    (rw,noatime,compress=lzo,space_cache,subvolid=5,subvol=/)

I've read lots of online sources on the topic but none of these help
me on how to recover from the current state:

https://btrfs.wiki.kernel.org/index.php/Btrfsck
http://marc.merlins.org/perso/btrfs/post_2014-03-19_Btrfs-Tips_-Btrfs-Scrub-and-Btrfs-Filesystem-Repair.html
https://wiki.archlinux.org/index.php/Identify_damaged_files#Find_damaged_files
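Editorial sketch: with 18 such warnings in the kernel log, it can help to pull the root (subvolume ID) and inode number out of each one mechanically. The Python below is a minimal parser modelled only on the single warning quoted in this message; other kernel versions may word the line differently, and the function name is made up for illustration.

```python
import re
from typing import Optional

# Pattern modelled on the one warning quoted in this thread.
# \s+ is used between fields because the log line may be wrapped.
CSUM_WARNING = re.compile(
    r"checksum error at logical (?P<logical>\d+) on dev\s+(?P<dev>\S+),"
    r"\s+sector (?P<sector>\d+),\s+root (?P<root>\d+),"
    r"\s+inode (?P<inode>\d+),\s+offset (?P<offset>\d+)"
)


def parse_csum_warning(line: str) -> Optional[dict]:
    """Return the fields of one checksum warning, or None if nothing matches."""
    match = CSUM_WARNING.search(line)
    if match is None:
        return None
    # Everything except the device path is a decimal number.
    return {
        key: (value if key == "dev" else int(value))
        for key, value in match.groupdict().items()
    }


if __name__ == "__main__":
    sample = (
        "BTRFS warning (device dm-8): checksum error at logical 7351706472448 on dev\n"
        "/dev/mapper/disk6tb, sector 61412648, root 12725, inode 152358265,\n"
        "offset 483328:\n"
        "path resolving failed with ret=-2"
    )
    print(parse_csum_warning(sample))
```

Collecting the distinct (root, inode) pairs from all warnings gives the list of files to look for in each affected subvolume.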