2019-04-02 02:24, Qu Wenruo:

On 2019/4/1 2:44 AM, bt...@avgustinov.eu wrote:
Dear all,


I am a big fan of btrfs and have been using it since 2013, by now on at
least four different computers. During this time, I suffered at least
four bad btrfs failures leading to unmountable, unreadable and
unrecoverable file systems. Since in three of the cases I did not manage
to recover even a single file, I am beginning to lose my confidence in
btrfs: in 35 years of working with different computers, no other file
system was so bad at recovering files!

Considering the importance of btrfs and keeping in mind the number of
similar failures, described in countless forums on the net, I have got
an idea: to donate my last two damaged filesystems for investigation
purposes and thus hopefully contribute to the improvement of btrfs. One
condition: any recovered personal data (mostly pictures and audio files)
should remain undisclosed and be deleted.

Should anybody be interested in this - feel free to contact me
personally (I am not reading the list regularly!), otherwise I am going
to reformat and reuse both systems two weeks from today.

Some more info:

   - The smaller system is 83.6GB, I could either send you an image of
this system on an unneeded hard drive or put it into a dedicated
computer and give you root rights and ssh-access to it (the network link
is 100Mb down, 50Mb up, so it should be acceptable).

I'm a little more interested in this case, as it's easier to debug.

However there is one requirement before debugging.

*NO* btrfs check --repair/--init-* run at all.
btrfs check --repair is known to cause transid errors.

Unfortunately, this file system was used as a testbed and even
"btrfs check --repair --check-data-csum --init-csum-tree --init-extent-tree ..." was attempted on it.
So I assume you are not interested.

On the larger file system only "btrfs check --repair --readonly ..." was attempted (without success; most command executions were documented, so the results can be made available); no writing commands were issued.

And, I'm afraid even with some debugging, the result would be pretty
predictable.

I do not need anything from the smaller file system and have (hopefully fresh enough) backups from the bigger one. It would be good enough if it helps to find any bugs that are still in the code.

With 90% probability it will be a transid error.
If it's a tree block from the future, then it's something barrier-related.
If it's a tree block from the past, then some tree block didn't
reach disk.
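
To illustrate the future/past distinction, here is a minimal sketch that
classifies a transid error line from the kernel log. It assumes the
message has the form "parent transid verify failed on <bytenr> wanted
<gen> found <gen>"; the function name and the sample line are
hypothetical and only serve the illustration. "Found" greater than
"wanted" is read as a block from the future, "found" smaller than
"wanted" as a block from the past.

```python
import re

# Assumed message format; adjust the pattern if your kernel prints it differently.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def classify_transid_error(line):
    """Return (bytenr, wanted, found, kind) or None if the line does not match.

    kind is "future" when the block on disk is newer than the parent expects,
    "past" when it is older, and "match" when both generations agree.
    """
    m = TRANSID_RE.search(line)
    if m is None:
        return None
    bytenr, wanted, found = (int(g) for g in m.groups())
    if found > wanted:
        kind = "future"  # block newer than expected: barrier-related suspicion
    elif found < wanted:
        kind = "past"    # stale block: a newer write apparently never reached disk
    else:
        kind = "match"
    return bytenr, wanted, found, kind


if __name__ == "__main__":
    # Hypothetical sample line, not taken from a real log.
    sample = "BTRFS error (device sdX): parent transid verify failed on 30408704 wanted 120 found 118"
    print(classify_transid_error(sample))
```

Running this over the saved dmesg output of the failed mounts would show
which of the two cases dominates, without writing anything to the file
system.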

We have been chasing this spectre for a long time and had several
assumptions, but never pinned it down.

IMHO the spectre would lead to much bigger losses - at least in my case it could have happened all four times, but it did not.

But anyway, more info is always better.

I'd like to get ssh access to this smaller image.

If you are still interested, please advise how to create the image of the file system. I can imagine that it is preferable to use the original, but in my case it is a (not mounted) partition of a bigger hard drive, and the other partitions are in use. The "btrfs-image" seems inappropriate to me, "dd" will probably screw things up?

Kind regards,

Nik.
--
Thanks,
Qu


   - The used space on the other file system is about 3 TB (4 TB
capacity) and it is distributed among 5 drives, so I can only offer
remote access to this, but I will need time to organize it.

If you need additional information - please ask, but keep in mind that I
have almost no "free time" and the answer could take a day or two.

Kind regards,

Nik.
