On 2019/1/15 8:28 PM, Leonard Lausen wrote:
> 
> Thanks Qu and David for your prompt attention!
> 
> Qu Wenruo <[email protected]> writes:
>>> following tree-dumps:
>>>
>>>   sudo btrfs inspect dump-tree -t root /dev/mapper/vg1-root > /tmp/btrfsdumproot
>>>   sudo btrfs inspect dump-tree -b 1350630375424 /dev/mapper/vg1-root > /tmp/btrfsdump1350630375424
>>>
>>> The root dump is at https://termbin.com/lz0l and the block dump at
>>> https://termbin.com/oev5 . The number 1350630375424 does not occur in
>>> the root dump. The root dump has 16715 lines, the block dump only 645.
>>
>> Super nice move, it shows the corruption and the cause.
>>
>>      item 66 key (1714119835648 METADATA_ITEM 0) itemoff 13325 itemsize 33
>>      item 67 key (10510212874240 METADATA_ITEM 0) itemoff 13283 itemsize 42
>>      item 68 key (1714119868416 METADATA_ITEM 0) itemoff 13250 itemsize 33
>>
>> Note that the key objectid of item 67 is far larger than those of items 66 and 68.
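
The ordering check described above can be sketched in Python. This is only an illustration, not part of btrfs-progs: the helper names (parse_items, order_breaks) are made up here, the regular expression assumes numeric objectids as in the three quoted lines, and it compares objectids only, whereas real btrfs keys are ordered by (objectid, type, offset).

```python
import re

# Matches dump-tree item lines such as:
#   item 66 key (1714119835648 METADATA_ITEM 0) itemoff 13325 itemsize 33
# Assumption: objectid and offset are printed as plain decimal numbers.
ITEM_RE = re.compile(r"item (\d+) key \((\d+) (\w+) (\d+)\)")


def parse_items(dump_text):
    """Return (item_number, objectid) pairs in the order they appear."""
    items = []
    for line in dump_text.splitlines():
        match = ITEM_RE.search(line)
        if match:
            items.append((int(match.group(1)), int(match.group(2))))
    return items


def order_breaks(items):
    """Return adjacent pairs where the objectid decreases.

    Keys inside a leaf should be sorted ascending, so any pair returned
    here marks a spot worth a closer look.
    """
    return [(cur, nxt) for cur, nxt in zip(items, items[1:]) if cur[1] > nxt[1]]


SAMPLE = """
     item 66 key (1714119835648 METADATA_ITEM 0) itemoff 13325 itemsize 33
     item 67 key (10510212874240 METADATA_ITEM 0) itemoff 13283 itemsize 42
     item 68 key (1714119868416 METADATA_ITEM 0) itemoff 13250 itemsize 33
"""

for cur, nxt in order_breaks(parse_items(SAMPLE)):
    print(f"item {cur[0]} ({cur[1]}) sorts after item {nxt[0]} ({nxt[1]})")
```

To run it against the saved block dump instead of the sample, read /tmp/btrfsdump1350630375424 into a string and pass that to parse_items.
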
>>
>> And furthermore, it indeed looks like a bit rot:
>> 0x18f19810000 (1714119835648)
>> 0x98f19814000 (10510212874240)
>> 0x18f19818000 (1714119868416)
>>
>> See one bit got flipped.
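
The single-bit-flip observation can be double-checked with a few lines of Python. This is a sketch under one assumption: the original objectid of item 67 must lie strictly between the objectids of items 66 and 68. The function name single_bit_repairs is hypothetical.

```python
def single_bit_repairs(corrupted, lower, upper):
    """Try flipping each bit of `corrupted` and keep the results that
    fall strictly between `lower` and `upper`.

    Returns a list of (bit_position, candidate_value); bit 0 is the
    least significant bit.
    """
    fixes = []
    for bit in range(corrupted.bit_length()):
        candidate = corrupted ^ (1 << bit)
        if lower < candidate < upper:
            fixes.append((bit, candidate))
    return fixes


ITEM_66 = 1714119835648    # 0x18f19810000
ITEM_67 = 10510212874240   # 0x98f19814000, the suspicious key
ITEM_68 = 1714119868416    # 0x18f19818000

for bit, value in single_bit_repairs(ITEM_67, ITEM_66, ITEM_68):
    print(f"flipping bit {bit} gives {value} ({hex(value)})")
```

For the three keys quoted above, the only candidate is bit 43, which gives 1714119852032 (0x18f19814000), a value that sits between items 66 and 68.
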
> 
> Thanks for the explanation!
> 
>> I don't know whether it was corrupted in memory or on the SSD, although
>> I tend to believe it was caused by a memory bit flip.
>> But anyway, it can be fixed by patching the corrupted leaf manually.
>>
>> I'm working on the fix.
>> Please make sure there is no write into the fs (just in case, since the
>> fs should be RO).
>>
>> And prepare a LiveUSB on which you can compile btrfs-progs (it needs
>> some dependencies).
>>
>> It shouldn't take me too long to craft the fix.
> 
> Thanks Qu! I see that ArchLinux LiveUSB is based on linux 4.20.0, but
> 4.20.1 contains some btrfs fixes. Should I make sure to be at least on
> 4.20.1 for this?

You won't even need to try mounting the fs, so the kernel version doesn't
matter here.

BTW, the Arch Linux ISO is a really nice tool as a LiveUSB; the
dependencies you need can be found by checking the PKGBUILD of btrfs-progs.

Thanks,
Qu
> 
> David Sterba <[email protected]> writes:
>> On Tue, Jan 15, 2019 at 07:48:47PM +0800, Qu Wenruo wrote:
>>> Note that the key objectid of item 67 is far larger than those of items 66 and 68.
>>>
>>> And furthermore, it indeed looks like a bit rot:
>>> 0x18f19810000 (1714119835648)
>>> 0x98f19814000 (10510212874240)
>>> 0x18f19818000 (1714119868416)
>>>
>>> See one bit got flipped.
> 
>>> I don't know whether it was corrupted in memory or on the SSD, although
>>> I tend to believe it was caused by a memory bit flip.
>>
>> Single bit flips are almost always caused by RAM, not storage (which
>> fails in larger blocks or does not return any data at all).
>>> But anyway, it can be fixed by patching the corrupted leaf manually.
>>
>> That will fix one instance of the corrupted key, but without an analysis
>> of how far the wrong key has spread, it's still risky.
> 
> How could I analyse this?
> 
