On 7/3/20 1:41 PM, Chris Murphy wrote:
> SSDs can fail in weird ways. Some spew garbage as they're failing,
> some go read-only. I've seen both. I don't have stats on how common it
> is for an SSD to go read-only as it fails, but once it happens you
> cannot fsck it. It won't accept writes. If it won't mount, your only
> chance to recover data is some kind of offline scrape tool. And Btrfs
> does have a very very good scrape tool, in terms of its success rate -
> UX is scary. But that can and will improve.

Ok, you and Josef have both recommended the btrfs restore ("scrape")
tool as a next recovery step after fsck fails, and I figured we should
check that out, to see if that alleviates the concerns about
recoverability of user data in the face of corruption.

I also realized that mkfs of an image isn't representative of an SSD
system typical of Fedora laptops, so I added "-m single" to mkfs,
because this will be the mkfs.btrfs default on SSDs (right?).  Based
on Josef's description of fsck's algorithm of throwing away any
block with a bad CRC this seemed worth testing.

I also turned fuzzing /down/ to hitting 2048 bytes out of the 1G
image, or a bit less than 1% of the filesystem blocks, at random.
This is 1/4 the fuzzing rate from the original test.

So: -m single, fuzz 2048 bytes of 1G image, run btrfsck --repair,
mount, mount w/ recovery, and then restore ("scrape") if all that
fails, see what we get.

I ran 50 loops, and got:

46 btrfsck failures
20 mount failures

So it ran btrfs restore 20 times; of those, 11 runs lost all or
substantially all of the files; 17 runs lost at least 1/3 of the 

devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 

Reply via email to