My Filesystem is Broken

2018-08-22 Thread Zirconium Hacker
Hi.
My primary (boot) filesystem is broken, due to an interrupted resize operation.
I'm hoping that I can get help either fixing the filesystem or
recovering some of my data, but I'd also like to know why btrfs and
its tools acted the way they did.
I think I've also found a bug in GParted.

Output of uname -a on the recovery medium: Linux ArchUSB
4.18.3-arch1-1-ARCH #1 SMP PREEMPT Sat Aug 18 09:22:54 UTC 2018 x86_64
GNU/Linux
Kernel on the affected system: Liquorix Linux 4.17.14
Output of btrfs --version: btrfs-progs v4.17.1
Relevant dmesg log is attached.

I currently use a single btrfs filesystem (fs A) with subvolumes for
root, home, and var.  It's ~140 GiB in size, with ~130GiB used.
I also have a second btrfs filesystem (fs B) from a previous
installation of Arch Linux.  It's ~40 GiB in size, with less than 30
GiB used.

First of all, how I got into this situation:

Yesterday, I wanted to reclaim some space, so I decided to shrink fs B.
I opened up GParted, and resized it.  I got an error from btrfs, like
"no space left on device".
I was a little confused.  This was seemingly solved by unmounting fs B.
I re-ran the resize.  Some process was using 100% CPU on one core, and
I didn't see much (if any) I/O activity.
After a few minutes I noticed that GParted was attempting to resize
the mount point of fs B ('/mnt/oldsys' on fs A), even though it wasn't
mounted!
I'm not sure what this does, but I figure that it's not good (even
though fs A shouldn't fit in 32 GiB).
So I press cancel... nothing.  I try force cancel, still no effect.  I
try to kill the resize process, first with SIGTERM, then with
SIGKILL... nothing.
I figure that I have to reboot at this point.  During reboot, systemd
waits really long for a stop job on some user session thing.
Then, strangely, I see output _during shutdown_ about btrfs
_beginning_ to resize fs A (referring to it by its block device,
/dev/sda2)...
I choose to reset my computer.  When I try to boot again (bad idea in
retrospect) systemd takes a long time to "re-mount the root
filesystem".
Start jobs begin to timeout and fail, so I reset my computer again to
boot into a recovery medium.

Part 2, the attempted recovery:

I run btrfsck on sda2, and seeing lots of errors (see
btrfsck.readonly.log) I choose to make an image of the entire block
device.
After that, I attempt a btrfsck --repair (btrfsck.repair.log).
It seems alright until it reaches "Deleting bad dir index" and then
hangs.  IIRC at some point it segfaulted.
Desperately, I try to run another repair... and I encounter a BUG_ON
(btrfsck.repair.2.log).  Ouch.


Well, at this point I'm stuck.  I have no backups.
I've already restored the image of fs A from before any repair attempts.
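
The imaging step above can be sketched roughly as follows.  This is a
minimal sketch, not the exact commands that were run: the function name
and the example paths are hypothetical, and it assumes the source is
unmounted and readable without I/O errors (on a failing disk a tool such
as ddrescue would be a better fit than plain dd).

```shell
#!/bin/sh
# Sketch: copy a source (block device or file) to an image file and verify
# that the copy is byte-for-byte identical before any repair is attempted.
# image_and_verify is a hypothetical helper name.
image_and_verify() {
    src=$1
    img=$2
    # conv=fsync makes dd flush the image to disk before it exits.
    dd if="$src" of="$img" bs=4M conv=fsync 2>/dev/null || return 1
    # cmp -s exits 0 only if both inputs are identical.
    cmp -s "$src" "$img"
}

# Example with hypothetical paths (run as root from the recovery medium):
#   image_and_verify /dev/sda2 /mnt/external/sda2.img && echo "image verified"
```

Restoring is the same call with the arguments swapped, which is why the
verification step matters: the image is the only thing a failed repair
can be rolled back to.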

Thanks in advance,
Jared


Some big files that I couldn't attach:
btrfsck.readonly.log:
https://drive.google.com/open?id=1CZP67uCs7zCyi1CfPv6tt9DxnIfU967u
btrfsck.repair.log:
https://drive.google.com/open?id=1l2Nj8n9CzmxRZznbbIYEMrQc6c5plMDm


P.S. Sorry if this gets sent twice -- Gmail failed to deliver it the first time.

Attached btrfsck.repair.2.log:
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
Deleting bad dir index [258,96,2] root 18446744073709551608
extent-tree.c:1423: btrfs_inc_extent_ref: BUG_ON `err` triggered, value -5
btrfs check(+0x1e612)[0x5593cc313612]
btrfs check(btrfs_inc_extent_ref+0x166)[0x5593cc3167b6]
btrfs check(+0x1f064)[0x5593cc314064]
btrfs check(__btrfs_cow_block+0x5e5)[0x5593cc309af5]
btrfs check(btrfs_cow_block+0xf9)[0x5593cc309ea9]
btrfs check(btrfs_search_slot+0x36d)[0x5593cc30cd1d]
btrfs check(btrfs_lookup_dir_index+0x58)[0x5593cc31ea58]
btrfs check(+0x5d6ae)[0x5593cc3526ae]
btrfs check(cmd_check+0x3580)[0x5593cc35d460]
btrfs check(main+0x88)[0x5593cc3080b8]
/usr/lib/libc.so.6(__libc_start_main+0xf3)[0x7f95fa1b2223]
btrfs check(_start+0x2e)[0x5593cc3081de]

Attached dmesg log:
[ 2137.123004] BTRFS info (device sda2): disk space caching is enabled
[ 2137.123005] BTRFS info (device sda2): has skinny extents
[ 2137.189323] BTRFS info (device sda2): enabling ssd optimizations
[ 3409.093491] traps: btrfsck[962] general protection ip:5572b1e0c9c2 sp:7fffbc926c80 error:0 in btrfs[5572b1dfe000+7a000]
[ 3409.093520] audit: type=1701 audit(1534813029.734:4): auid=1000 uid=0 gid=0 ses=1 pid=962 comm="btrfsck" exe="/usr/bin/btrfs" sig=11 res=1
[ 3563.069359] INFO: task btrfsck:962 blocked for more than 120 seconds.
[ 3563.069814]   Not tainted 4.18.3-arch1-1-ARCH #1
[ 3563.070239] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3563.070666] btrfsck D0   962960 0x8080
[ 3563.070671] Call Trace:
[ 3563.070682]  ? __schedule+0x29b/0x8b0
[ 3563.070686]  schedule+0x32/0x90
[ 3563.070690]  schedule_preempt_disabled+0x14/0x20
[ 3563.070695]  __mutex_lock.isra.0+0x220/0x530
[ 3563.070700]  ? file_update_time+0x5e/0x130
[ 3563.070705]  pipe_write+0x36/0x3f0
[ 3563.070710]  ? _raw_spin_unlock+0x16/0x30
[ 3563.070714]  ? follow_page_pte+0x3a7/0x5f0

Re: BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-18 Thread Zirconium Hacker
THANK YOU ALL!  I just had to truncate the first 1.5 KiB of the image
file to get the offsets right (God knows why), but I could then get
the btrfs driver to recognize it, and I could MOUNT THE FILESYSTEM!
I'm going to run btrfs check on it, free some space on it, PROPERLY
remove the extra device, and then boot the system, set up more cloud
backups, and probably throw in a 500GB disk I have lying around so I
can set up snapshots.
Everyone has been very helpful, and sorry for the real issue being my
own stupidity.  :)
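
For anyone hitting the same thing, the offset fix can be sketched like
this.  It is a rough sketch under assumptions: the helper names and file
names are made up, 1536 bytes is simply the 1.5 KiB mentioned above, and
the check relies on the btrfs primary superblock starting at 64 KiB with
its 8-byte magic "_BHRfS_M" 64 bytes in (absolute offset 65600).

```shell
#!/bin/sh
# Sketch: drop a fixed number of leading bytes from a COPY of an image, then
# check that the btrfs magic sits at its expected offset (65536 + 64 = 65600).
# Both function names are hypothetical.
strip_leading_bytes() {
    src=$1
    dst=$2
    skip=$3
    # tail -c +N starts output at byte N (1-based), so skip+1 drops `skip` bytes.
    tail -c +"$((skip + 1))" "$src" > "$dst"
}

has_btrfs_magic() {
    # Read 8 bytes at offset 65600; NUL bytes are dropped so the shell can
    # hold the result in a variable.
    magic=$(dd if="$1" bs=1 skip=65600 count=8 2>/dev/null | tr -d '\0')
    [ "$magic" = "_BHRfS_M" ]
}

# Example with hypothetical file names:
#   strip_leading_bytes extra.img extra.fixed.img 1536
#   has_btrfs_magic extra.fixed.img && echo "superblock magic found"
```

A non-destructive alternative is to leave the file alone and attach it
with an offset, e.g. "losetup --find --show --offset 1536 extra.img",
which exposes the same bytes without rewriting the image.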

On Fri, Aug 18, 2017 at 10:28 PM, Zirconium Hacker <jared.e...@gmail.com> wrote:
> The image doesn't have a valid superblock.  I'm really confused as to
> how that could've happened.
>
> On Fri, Aug 18, 2017 at 7:21 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>
>>
>> On 2017-08-19 05:52, Zirconium Hacker wrote:
>>>
>>> Ok, so since it's clear now that I need that 5 GB device to be
>>> present... I found the image file.  But how do I get BTRFS to
>>> recognize the image as a device?
>>
>>
>> # losetup -f
>> Remember the loop*, here use /dev/loop1 as example.
>>
>> # losetup /dev/loop1 
>> # partprobe /dev/loop1
>> Then you should have /dev/loop1p1
>>
>> # btrfs dev rescan
>> If nothing wrong happened, you should be good to go.
>>
>> Thanks,
>> Qu
>>
>>
>>>  I have zero experience with
>>> multi-device systems.  Setting it up as a loop device doesn't fix
>>> mounting, and wipefs doesn't detect the BTRFS magic number, but
>>> printing some of it to console shows it does have real data.  Writing
>>> the magic number onto it (it's a copy of the original to be safe)
>>> shows in dump-super, but all other values are zero.
>>>
>>> I tried sending the above on my phone earlier but it was detected as a
>>> "virus" because it contained HTML.  Whoops.
>>>
>>> On Fri, Aug 18, 2017 at 11:00 AM, Chris Murphy <li...@colorremedies.com>
>>> wrote:
>>>>
>>>> On Fri, Aug 18, 2017 at 2:47 AM, Zirconium Hacker <jared.e...@gmail.com>
>>>> wrote:
>>>>
>>>>> I vaguely remember following this guide at some point:
>>>>>
>>>>> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html
>>>>> -- specifically the "Balance cannot run because the filesystem is
>>>>> full" part.  This may have broken some things?
>>>>>
>>>>
>>>> If you don't do 'btrfs device delete /dev/loop0' or if that command
>>>> does not complete, then it's possible to get into the situation you're
>>>> in.
>>>>
>>>> Have you ever mounted this file system with -o degraded?
>>>>
>>>> I'm going to guess the history is something like:
>>>> 1. enospc
>>>> 2. btrfs dev add
>>>> 3. some kind of filtered balance, which only causes data block groups
>>>> to be moved to the 2nd device
>>>> 4. 2nd device is physically removed without first 'btrfs dev del'
>>>>
>>>> Zirco's superblock very clearly says num_devices  2, so I'd expect
>>>> normal mount to always fail unless both devices are present. Is there
>>>> some weird edge case where Btrfs might permit non-degraded mount when
>>>> only data bg's are on a 2nd device? And then trouble only happens
>>>> later when a balance is done and it goes looking for these bg's? And
>>>> then, boom!
>>>>
>>>> --
>>>> Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-18 Thread Zirconium Hacker
The image doesn't have a valid superblock.  I'm really confused as to
how that could've happened.

On Fri, Aug 18, 2017 at 7:21 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
>
> On 2017-08-19 05:52, Zirconium Hacker wrote:
>>
>> Ok, so since it's clear now that I need that 5 GB device to be
>> present... I found the image file.  But how do I get BTRFS to
>> recognize the image as a device?
>
>
> # losetup -f
> Remember the loop*, here use /dev/loop1 as example.
>
> # losetup /dev/loop1 
> # partprobe /dev/loop1
> Then you should have /dev/loop1p1
>
> # btrfs dev rescan
> If nothing wrong happened, you should be good to go.
>
> Thanks,
> Qu
>
>
>>  I have zero experience with
>> multi-device systems.  Setting it up as a loop device doesn't fix
>> mounting, and wipefs doesn't detect the BTRFS magic number, but
>> printing some of it to console shows it does have real data.  Writing
>> the magic number onto it (it's a copy of the original to be safe)
>> shows in dump-super, but all other values are zero.
>>
>> I tried sending the above on my phone earlier but it was detected as a
>> "virus" because it contained HTML.  Whoops.
>>
>> On Fri, Aug 18, 2017 at 11:00 AM, Chris Murphy <li...@colorremedies.com>
>> wrote:
>>>
>>> On Fri, Aug 18, 2017 at 2:47 AM, Zirconium Hacker <jared.e...@gmail.com>
>>> wrote:
>>>
>>>> I vaguely remember following this guide at some point:
>>>>
>>>> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html
>>>> -- specifically the "Balance cannot run because the filesystem is
>>>> full" part.  This may have broken some things?
>>>>
>>>
>>> If you don't do 'btrfs device delete /dev/loop0' or if that command
>>> does not complete, then it's possible to get into the situation you're
>>> in.
>>>
>>> Have you ever mounted this file system with -o degraded?
>>>
>>> I'm going to guess the history is something like:
>>> 1. enospc
>>> 2. btrfs dev add
>>> 3. some kind of filtered balance, which only causes data block groups
>>> to be moved to the 2nd device
>>> 4. 2nd device is physically removed without first 'btrfs dev del'
>>>
>>> Zirco's superblock very clearly says num_devices  2, so I'd expect
>>> normal mount to always fail unless both devices are present. Is there
>>> some weird edge case where Btrfs might permit non-degraded mount when
>>> only data bg's are on a 2nd device? And then trouble only happens
>>> later when a balance is done and it goes looking for these bg's? And
>>> then, boom!
>>>
>>> --
>>> Chris Murphy


Re: BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-18 Thread Zirconium Hacker
Ok, so since it's clear now that I need that 5 GB device to be
present... I found the image file.  But how do I get BTRFS to
recognize the image as a device?  I have zero experience with
multi-device systems.  Setting it up as a loop device doesn't fix
mounting, and wipefs doesn't detect the BTRFS magic number, but
printing some of it to console shows it does have real data.  Writing
the magic number onto it (it's a copy of the original to be safe)
shows in dump-super, but all other values are zero.

I tried sending the above on my phone earlier but it was detected as a
"virus" because it contained HTML.  Whoops.
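
When wipefs finds no magic number but the image clearly holds data, one
thing worth checking is whether the magic is present at the wrong
offset.  The sketch below is an illustration under assumptions, not a
btrfs-progs feature: the function name is made up, it needs GNU grep,
and it only looks at the first match.  It relies on the primary
superblock starting at 64 KiB with the magic "_BHRfS_M" 64 bytes into it.

```shell
#!/bin/sh
# Sketch: report how many bytes an image's first btrfs magic is shifted from
# where the primary superblock expects it (65536 + 64 = 65600).
# magic_shift is a hypothetical helper name.
EXPECTED_MAGIC_OFFSET=65600

magic_shift() {
    img=$1
    # -a: treat binary data as text, -b: print byte offsets,
    # -o: with -b, report the offset of the match itself.
    found=$(LC_ALL=C grep -abo '_BHRfS_M' "$img" | head -n 1 | cut -d: -f1)
    [ -n "$found" ] || return 1
    echo $((found - EXPECTED_MAGIC_OFFSET))
}

# Example with a hypothetical file name:
#   magic_shift extra.img    # prints 0 if the superblock is where it should be
```

A non-zero result suggests the image carries extra leading bytes (or is
missing some), which would also explain dump-super showing zeros for
every other field.  Note that grep treats a run of bytes without
newlines as one line, so on a multi-gigabyte image this can use a lot of
memory.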

On Fri, Aug 18, 2017 at 11:00 AM, Chris Murphy <li...@colorremedies.com> wrote:
> On Fri, Aug 18, 2017 at 2:47 AM, Zirconium Hacker <jared.e...@gmail.com> 
> wrote:
>
>> I vaguely remember following this guide at some point:
>> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html
>> -- specifically the "Balance cannot run because the filesystem is
>> full" part.  This may have broken some things?
>>
>
> If you don't do 'btrfs device delete /dev/loop0' or if that command
> does not complete, then it's possible to get into the situation you're
> in.
>
> Have you ever mounted this file system with -o degraded?
>
> I'm going to guess the history is something like:
> 1. enospc
> 2. btrfs dev add
> 3. some kind of filtered balance, which only causes data block groups
> to be moved to the 2nd device
> 4. 2nd device is physically removed without first 'btrfs dev del'
>
> Zirco's superblock very clearly says num_devices  2, so I'd expect
> normal mount to always fail unless both devices are present. Is there
> some weird edge case where Btrfs might permit non-degraded mount when
> only data bg's are on a 2nd device? And then trouble only happens
> later when a balance is done and it goes looking for these bg's? And
> then, boom!
>
> --
> Chris Murphy


Re: BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-18 Thread Zirconium Hacker
# ./btrfs-debug-tree -b 131072 /dev/sda4
https://pastebin.com/TDa0GuqB
# ./btrfs-debug-tree -b 61809344512 /dev/sda4
btrfs-progs v4.12-dirty
bytenr mismatch, want=61809344512, have=0
Couldn't read tree root
bytenr mismatch, want=61809344512, have=0
ERROR: failed to read 61809344512
# ./btrfs-debug-tree -b 61807755264 /dev/sda4
btrfs-progs v4.12-dirty
bytenr mismatch, want=61809344512, have=0
Couldn't read tree root
bytenr mismatch, want=61807755264, have=0
ERROR: failed to read 61807755264

And that last one you wanted me to run debug-tree on was a duplicate.

Bonus:
# ./btrfs-debug-tree -b 108544 /dev/sda4
btrfs-progs v4.12-dirty
bytenr mismatch, want=61809344512, have=0
Couldn't read tree root
node 108544 level 1 items 2 free 491 generation 325709 owner 1
fs uuid 29889b3a-1c10-48e4-ad6d-21d03d06e90b
chunk uuid 33f664ec-d0bc-42f9-87f1-d2c05046
key (EXTENT_TREE ROOT_ITEM 0) block 1085456384 (66251) gen 325709
key (286 INODE_ITEM 0) block 1085505536 (66254) gen 325709

BTW, thank you for your quick responses and help so far.

On Fri, Aug 18, 2017 at 5:46 AM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
> Would you please try this patch?
> https://patchwork.kernel.org/patch/9908173/
>
> This should allow btrfs-debug-tree to output a tree block even if the tree
> root is corrupted.
> You could apply it on the latest master branch (tagged as v4.12).
>
> Then re-execute the following command (with patched btrfs-progs):
> # btrfs-debug-tree -b 131072 /dev/sda4
>
> And some new output:
> # btrfs-debug-tree -b 61809344512 /dev/sda4
> # btrfs-debug-tree -b 61807755264 /dev/sda4
> # btrfs-debug-tree -b 61809344512 /dev/sda4
>
> Thanks,
> Qu
>
>
> On 2017-08-18 17:29, Zirconium Hacker wrote:
>>
>> $ sudo btrfs check -r 108544 /dev/sda4
>> parent transid verify failed on 108544 wanted 325966 found 325709
>> parent transid verify failed on 108544 wanted 325966 found 325709
>> Ignoring transid failure
>> bytenr mismatch, want=61352312832, have=0
>> Couldn't setup device tree
>> ERROR: cannot open file system
>>
>> On Fri, Aug 18, 2017 at 5:19 AM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>
>>>
>>>
>>> On 2017-08-18 17:08, Zirconium Hacker wrote:
>>>>
>>>>
>>>> I already ran that earlier, here's the pastebin:
>>>> https://pastebin.com/KGB8nVRA
>>>>
>>>> Running debug-tree on all 1084 of them (I guess that was unnecessary)
>>>> gave the same errors every time:
>>>> bytenr mismatch, want=61809344512, have=0
>>>> Couldn't read tree root
>>>> ERROR: unable to open /dev/sda4
>>>>
>>>
>>> Then try using btrfs check with new root:
>>>
>>> # btrfs check -r 108544 /dev/sda4
>>>
>>> Please note that, the generation in superblock differs quite a lot with
>>> find-root result.
>>> So I'm afraid it will cause quite a lot of problems.
>>>
>>> But at least, it should help btrfs check to get over "Couldn't read tree
>>> root"
>>> error message.
>>>
>>> And for btrfs-debug-tree error, I'll submit a patch soon to allow it to
>>> be
>>> run on such heavily damaged fs.
>>>
>>>
>>> Thanks,
>>> Qu
>>>
>>>> On Fri, Aug 18, 2017 at 5:03 AM, Qu Wenruo <quwenruo.bt...@gmx.com>
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 2017-08-18 16:47, Zirconium Hacker wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> $ sudo btrfs-debug-tree -b 131072 /dev/sda4
>>>>>> btrfs-progs v4.12
>>>>>> bytenr mismatch, want=61809344512, have=0
>>>>>> Couldn't read tree root
>>>>>> ERROR: unable to open /dev/sda4
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I think this can be improved for case like this.
>>>>> I'll try to submit a patch to enhance btrfs-debug-tree.
>>>>>
>>>>> Would you please try "btrfs-find-root /dev/sda4"?
>>>>> This will try to locate on-disk old tree root, and if we're lucky, old
>>>>> tree
>>>>> root can allow us to mount the fs.
>>>>>
>>>>>>
>>>>>> Mounting with degraded,ro does not fix the multi-device issue.  The
>>>>>> system was never really intended to have a second device, though:

Re: BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-18 Thread Zirconium Hacker
$ sudo btrfs check -r 108544 /dev/sda4
parent transid verify failed on 108544 wanted 325966 found 325709
parent transid verify failed on 108544 wanted 325966 found 325709
Ignoring transid failure
bytenr mismatch, want=61352312832, have=0
Couldn't setup device tree
ERROR: cannot open file system

On Fri, Aug 18, 2017 at 5:19 AM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
>
> On 2017-08-18 17:08, Zirconium Hacker wrote:
>>
>> I already ran that earlier, here's the pastebin:
>> https://pastebin.com/KGB8nVRA
>>
>> Running debug-tree on all 1084 of them (I guess that was unnecessary)
>> gave the same errors every time:
>> bytenr mismatch, want=61809344512, have=0
>> Couldn't read tree root
>> ERROR: unable to open /dev/sda4
>>
>
> Then try using btrfs check with new root:
>
> # btrfs check -r 108544 /dev/sda4
>
> Please note that, the generation in superblock differs quite a lot with
> find-root result.
> So I'm afraid it will cause quite a lot of problems.
>
> But at least, it should help btrfs check to get over "Couldn't read tree root"
> error message.
>
> And for btrfs-debug-tree error, I'll submit a patch soon to allow it to be
> run on such heavily damaged fs.
>
>
> Thanks,
> Qu
>
>> On Fri, Aug 18, 2017 at 5:03 AM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>
>>>
>>>
>>> On 2017-08-18 16:47, Zirconium Hacker wrote:
>>>>
>>>>
>>>> $ sudo btrfs-debug-tree -b 131072 /dev/sda4
>>>> btrfs-progs v4.12
>>>> bytenr mismatch, want=61809344512, have=0
>>>> Couldn't read tree root
>>>> ERROR: unable to open /dev/sda4
>>>
>>>
>>>
>>> I think this can be improved for case like this.
>>> I'll try to submit a patch to enhance btrfs-debug-tree.
>>>
>>> Would you please try "btrfs-find-root /dev/sda4"?
>>> This will try to locate on-disk old tree root, and if we're lucky, old
>>> tree
>>> root can allow us to mount the fs.
>>>
>>>>
>>>> Mounting with degraded,ro does not fix the multi-device issue.  The
>>>> system was never really intended to have a second device, though:
>>>
>>>
>>>
>>> Wait for a minute, did you mean this btrfs doesn't ever have a second
>>> device?
>>> This seems quite weird now.
>>>
>>>>
>>>> $ sudo btrfs fi show /dev/sda4
>>>> bytenr mismatch, want=61809344512, have=0
>>>> Couldn't read tree root
>>>> Label: none  uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b
>>>> Total devices 2 FS bytes used 49.52GiB
>>>> devid 1 size 54.07GiB used 54.07GiB path /dev/sda4
>>>> *** Some devices missing
>>>>
>>>> I vaguely remember following this guide at some point:
>>>>
>>>>
>>>> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html
>>>> -- specifically the "Balance cannot run because the filesystem is
>>>> full" part.  This may have broken some things?
>>>
>>>
>>>
>>> Not sure, at least from your superblock, too many things are in doubt.
>>>  From the number of devices, to strange system chunk.
>>>
>>>
>>> Thanks,
>>> Qu
>>>>
>>>>
>>>>
>>>> On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruo <quwenruo.bt...@gmx.com>
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 2017-08-18 15:17, Zirconium Hacker wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> I checked my fstab, and my mount options for that partition are:
>>>>>> nodev,nosuid (so no discard).
>>>>>> As far as I remember, I had some issues converting from ext4 with
>>>>>> existing tools (I think that was on Debian so the tools were likely
>>>>>> older) so I did a manual conversion (backup, wipe, copy files back).
>>>>>>
>>>>>> $ sudo btrfs-find-root -o 3 /dev/sda4
>>>>>> Couldn't read tree root
>>>>>> Superblock thinks the generation is 311252
>>>>>> Superblock thinks the level is 0
>>>>>> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
>>>>>> Found tree root at 131072 gen 311252 level 0

Re: BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-18 Thread Zirconium Hacker
I already ran that earlier, here's the pastebin: https://pastebin.com/KGB8nVRA

Running debug-tree on all 1084 of them (I guess that was unnecessary)
gave the same errors every time:
bytenr mismatch, want=61809344512, have=0
Couldn't read tree root
ERROR: unable to open /dev/sda4

On Fri, Aug 18, 2017 at 5:03 AM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
>
> On 2017-08-18 16:47, Zirconium Hacker wrote:
>>
>> $ sudo btrfs-debug-tree -b 131072 /dev/sda4
>> btrfs-progs v4.12
>> bytenr mismatch, want=61809344512, have=0
>> Couldn't read tree root
>> ERROR: unable to open /dev/sda4
>
>
> I think this can be improved for case like this.
> I'll try to submit a patch to enhance btrfs-debug-tree.
>
> Would you please try "btrfs-find-root /dev/sda4"?
> This will try to locate on-disk old tree root, and if we're lucky, old tree
> root can allow us to mount the fs.
>
>>
>> Mounting with degraded,ro does not fix the multi-device issue.  The
>> system was never really intended to have a second device, though:
>
>
> Wait for a minute, did you mean this btrfs doesn't ever have a second
> device?
> This seems quite weird now.
>
>>
>> $ sudo btrfs fi show /dev/sda4
>> bytenr mismatch, want=61809344512, have=0
>> Couldn't read tree root
>> Label: none  uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b
>> Total devices 2 FS bytes used 49.52GiB
>> devid 1 size 54.07GiB used 54.07GiB path /dev/sda4
>> *** Some devices missing
>>
>> I vaguely remember following this guide at some point:
>>
>> http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html
>> -- specifically the "Balance cannot run because the filesystem is
>> full" part.  This may have broken some things?
>
>
> Not sure, at least from your superblock, too many things are in doubt.
> From the number of devices, to strange system chunk.
>
>
> Thanks,
> Qu
>>
>>
>> On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>>>
>>>
>>>
>>> On 2017-08-18 15:17, Zirconium Hacker wrote:
>>>>
>>>>
>>>> I checked my fstab, and my mount options for that partition are:
>>>> nodev,nosuid (so no discard).
>>>> As far as I remember, I had some issues converting from ext4 with
>>>> existing tools (I think that was on Debian so the tools were likely
>>>> older) so I did a manual conversion (backup, wipe, copy files back).
>>>>
>>>> $ sudo btrfs-find-root -o 3 /dev/sda4
>>>> Couldn't read tree root
>>>> Superblock thinks the generation is 311252
>>>> Superblock thinks the level is 0
>>>> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
>>>> Found tree root at 131072 gen 311252 level 0
>>>
>>>
>>>
>>> So chunk root (and since it's level 0, the whole chunk tree) seems good.
>>>
>>> Could you please try the following command?
>>> # btrfs-debug-tree -b 131072 /dev/sda4
>>>
>>> I assume it may fail due to the fact that root tree is corrupted.
>>> But maybe we are lucky?
>>>
>>>
>>> Further investigation of your super dump and the code shows some
>>> clues, mostly related to your multi-device setup.
>>>
>>> Your find-root output shows that, the only chunk leaf in /dev/sda4 seems
>>> good.
>>> And in btrfs_read_chunk_tree(), which returned -EIO and caused the error
>>> message, will first search chunk root.
>>>
>>> Since your chunk leaf is good, such search itself should not cause too
>>> much
>>> problem.
>>>
>>> Then btrfs_read_chunk_tree() will try to read out each device, by calling
>>> read_one_dev().
>>> Which can return -EIO if any device is missing and you're not using
>>> degraded
>>> mount option.
>>>
>>> Is your 2nd device missing? If so, would you please try to mount with
>>> "degraded,ro" mount option?
>>>
>>> BTW, if you didn't manually convert chunk profiles, did you first create
>>> btrfs on single device, and then added a new device to the btrfs?
>>>
>>> Thanks,
>>> Qu
>>>
>>>>
>>>> On Fri, Aug 18, 2017 at 12:10 AM, Chris Murphy <li...@colorremedies.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> On Thu, Aug 17, 2017 at 4:42 PM, Qu Wenruo <quwenruo.bt...@gmx.com>
>>>>> wrote:
>>>>>
>>>>>> BTW are you using discard mount option? Sometimes it can cause
>>>>>> problem.
>>>>>
>>>>>
>>>>>
>>>>> OP did not say if it was using discard mount option; but did say some
>>>>> time before this (I'm not sure how recent) he had used fstrim. The
>>>>> firmware for this SSD model is current.
>>>>>
>>>>>
>>>>> --
>>>>> Chris Murphy
>>>>
>>>>
>>>>
>>>
>


Re: BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-18 Thread Zirconium Hacker
$ sudo btrfs-debug-tree -b 131072 /dev/sda4
btrfs-progs v4.12
bytenr mismatch, want=61809344512, have=0
Couldn't read tree root
ERROR: unable to open /dev/sda4

Mounting with degraded,ro does not fix the multi-device issue.  The
system was never really intended to have a second device, though:

$ sudo btrfs fi show /dev/sda4
bytenr mismatch, want=61809344512, have=0
Couldn't read tree root
Label: none  uuid: 29889b3a-1c10-48e4-ad6d-21d03d06e90b
Total devices 2 FS bytes used 49.52GiB
devid 1 size 54.07GiB used 54.07GiB path /dev/sda4
*** Some devices missing

I vaguely remember following this guide at some point:
http://marc.merlins.org/perso/btrfs/post_2014-05-04_Fixing-Btrfs-Filesystem-Full-Problems.html
-- specifically the "Balance cannot run because the filesystem is
full" part.  This may have broken some things?
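
For context, the workaround in that part of the guide amounts to lending
the filesystem a temporary device, balancing, and then removing the
device again.  The sketch below shows that sequence under assumptions:
the helper names, mount point, backing file, loop device and balance
filter are all illustrative, not what was actually run on this system,
and the backing file has to live on storage other than the full
filesystem.  By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the "add a temporary device, balance, remove it again" workflow.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
# run and temp_device_balance are hypothetical helper names.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

temp_device_balance() {
    mnt=$1
    backing=$2
    loopdev=$3
    run truncate -s 5G "$backing" &&              # sparse 5 GiB backing file
    run losetup "$loopdev" "$backing" &&          # attach it as a loop device
    run btrfs device add "$loopdev" "$mnt" &&     # fs now spans two devices
    run btrfs balance start -dusage=20 "$mnt" &&  # compact mostly-empty data chunks
    # The step that must not be skipped: move everything off the temporary
    # device and drop it from the filesystem BEFORE detaching or deleting it.
    run btrfs device delete "$loopdev" "$mnt" &&
    run losetup -d "$loopdev" &&
    run rm -f "$backing"
}

# Example with hypothetical paths (prints the seven commands):
#   temp_device_balance /mnt/data /var/tmp/btrfs-extra.img /dev/loop0
```

The chain stops at the first failing command, so the loop device is never
detached and the backing file never deleted unless "btrfs device delete"
has completed.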

On Fri, Aug 18, 2017 at 4:15 AM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
>
> On 2017-08-18 15:17, Zirconium Hacker wrote:
>>
>> I checked my fstab, and my mount options for that partition are:
>> nodev,nosuid (so no discard).
>> As far as I remember, I had some issues converting from ext4 with
>> existing tools (I think that was on Debian so the tools were likely
>> older) so I did a manual conversion (backup, wipe, copy files back).
>>
>> $ sudo btrfs-find-root -o 3 /dev/sda4
>> Couldn't read tree root
>> Superblock thinks the generation is 311252
>> Superblock thinks the level is 0
>> ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
>> Found tree root at 131072 gen 311252 level 0
>
>
> So chunk root (and since it's level 0, the whole chunk tree) seems good.
>
> Could you please try the following command?
> # btrfs-debug-tree -b 131072 /dev/sda4
>
> I assume it may fail due to the fact that root tree is corrupted.
> But maybe we are lucky?
>
>
> Further investigation of your super dump and the code shows some
> clues, mostly related to your multi-device setup.
>
> Your find-root output shows that, the only chunk leaf in /dev/sda4 seems
> good.
> And in btrfs_read_chunk_tree(), which returned -EIO and caused the error
> message, will first search chunk root.
>
> Since your chunk leaf is good, such search itself should not cause too much
> problem.
>
> Then btrfs_read_chunk_tree() will try to read out each device, by calling
> read_one_dev().
> Which can return -EIO if any device is missing and you're not using degraded
> mount option.
>
> Is your 2nd device missing? If so, would you please try to mount with
> "degraded,ro" mount option?
>
> BTW, if you didn't manually convert chunk profiles, did you first create
> btrfs on single device, and then added a new device to the btrfs?
>
> Thanks,
> Qu
>
>>
>> On Fri, Aug 18, 2017 at 12:10 AM, Chris Murphy <li...@colorremedies.com>
>> wrote:
>>>
>>> On Thu, Aug 17, 2017 at 4:42 PM, Qu Wenruo <quwenruo.bt...@gmx.com>
>>> wrote:
>>>
>>>> BTW are you using discard mount option? Sometimes it can cause problem.
>>>
>>>
>>> OP did not say if it was using discard mount option; but did say some
>>> time before this (I'm not sure how recent) he had used fstrim. The
>>> firmware for this SSD model is current.
>>>
>>>
>>> --
>>> Chris Murphy
>>
>>
>


Re: BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-18 Thread Zirconium Hacker
I checked my fstab, and my mount options for that partition are:
nodev,nosuid (so no discard).
As far as I remember, I had some issues converting from ext4 with
existing tools (I think that was on Debian so the tools were likely
older) so I did a manual conversion (backup, wipe, copy files back).

$ sudo btrfs-find-root -o 3 /dev/sda4
Couldn't read tree root
Superblock thinks the generation is 311252
Superblock thinks the level is 0
ERROR: tree block bytenr 0 is not aligned to sectorsize 4096
Found tree root at 131072 gen 311252 level 0

On Fri, Aug 18, 2017 at 12:10 AM, Chris Murphy  wrote:
> On Thu, Aug 17, 2017 at 4:42 PM, Qu Wenruo  wrote:
>
>> BTW are you using discard mount option? Sometimes it can cause problem.
>
> OP did not say if it was using discard mount option; but did say some
> time before this (I'm not sure how recent) he had used fstrim. The
> firmware for this SSD model is current.
>
>
> --
> Chris Murphy


Re: BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-17 Thread Zirconium Hacker
I hope "Reply All" is the right option here.  Again, first time
interacting with a mailing list.  Google said that was what to do.

I have found no I/O errors in dmesg -- at least, none mentioning
'I/O', 'IO', or anything triggered by mount besides BTRFS's
complaints.

$ sudo btrfs rescue chunk -v /dev/sda4
(See https://pastebin.com/YaRHuKeT -- the output hasn't visibly
changed since I tried this around a week ago, but this output is
recent)
$ man btrfs | grep show-super -A1
   btrfs-show-super
   moved to btrfs inspect-internal dump-super
$ sudo btrfs inspect-internal dump-super -fa /dev/sda4
(See https://pastebin.com/DbABqXGQ)
$ sudo btrfs-find-root -o 5 /dev/sda4
(See https://zerobin.net/?496ed00aed01ab0c#Kvp+FqrF6mfqQLZvUYJ1ODWYIzGayJbdyuMXc9RTauA=
-- Pastebin wouldn't let me paste that much)

I hope the way I'm organizing the output is OK.

On Thu, Aug 17, 2017 at 6:42 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
>
>
> On 2017年08月18日 00:53, Chris Murphy wrote:
>>
>> Readding Btrfslist, and adding Qu:
>>
>>
>> On Thu, Aug 17, 2017 at 12:48 AM, Zirconium Hacker <jared.e...@gmail.com>
>> wrote:
>>>
>>> Oh, sorry, I guess the output of the command I ran wasn't clear -- it
>>> was collecting the output of running the debug command on all 1,084
>>> and showing that it was the same.  Here's specifically what you asked
>>> for:
>>>
>>> $ sudo btrfs-debug-tree -b 61809344512 /dev/sda4
>>> btrfs-progs v4.12
>>> bytenr mismatch, want=61809344512, have=0
>>> Couldn't read tree root
>>> ERROR: unable to open /dev/sda4
>>> $ sudo btrfs-debug-tree -b 108544 /dev/sda4
>>> btrfs-progs v4.12
>>> bytenr mismatch, want=61809344512, have=0
>
>
> This means either chunk root is corrupted, or system chunk array in
> superblock is corrupted.
> Bytenr mismatch is normally impossible for normal operation.
>
> BTW are you using discard mount option? Sometimes it can cause problem.
>
> And please also paste the following output:
>
> # btrfs-show-super -fa /dev/sda4
> # btrfs-find-root -o 5 /dev/sda4
>
> The first is to output the full backup roots and current chunk root for us
> to debug.
> The second one will try to iterate your whole disk to find a valid but old
> chunk root.
> If we could find one (even a little old), it may make it possible to mount
> the fs.
>
> Thanks,
> Qu
>
>>> Couldn't read tree root
>>> ERROR: unable to open /dev/sda4
>>>
>>> I'm using GMail, and it's confusing me by trimming off quotes and
>>> stuff, so sorry if I miss something.
>>>
>>
>> OK well now we're in the bad part of Btrfs repair where the error
>> messages don't help. It's one thing for it to complain about
>> 108544 being invalid, because by now it might have been
>> overwritten, but to say it wants some other root that we already know
>> it can't read, and then fail reading that root is not helpful
>> information.
>> Maybe Qu has an idea. But it does sound like something really
>> catastrophic happened to blow away all of the backup root trees.
>>
>> Going back to your first email, -o ro,usebackuproot failed with a
>> chunk tree error. I wonder if 'rescue chunk' might help.
>>
>> Try 'btrfs rescue chunk -v' and see what you get.
>>
>>
>
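
To illustrate the "bytenr mismatch, want=..., have=0" lines quoted above, here is a minimal Python sketch. It assumes the on-disk tree block header starts with a 32-byte checksum, a 16-byte fsid, and then a little-endian 64-bit bytenr. The names header_bytenr and check_block are made up for this example, and this is not the actual btrfs-progs code.

```python
import struct

# Assumed start of a btrfs tree block header: csum[32], fsid[16], bytenr (u64, LE).
HEADER_PREFIX = struct.Struct("<32s16sQ")

def header_bytenr(block):
    """Return the bytenr recorded in the header of a raw tree block."""
    _csum, _fsid, bytenr = HEADER_PREFIX.unpack_from(block, 0)
    return bytenr

def check_block(block, want):
    """Compare the recorded bytenr with the address the block was read from."""
    have = header_bytenr(block)
    if have != want:
        return f"bytenr mismatch, want={want}, have={have}"
    return "ok"

# A block that reads back as all zeros reports have=0.
zeroed_block = bytes(4096)
print(check_block(zeroed_block, 61809344512))
```

An all-zero block is one way to end up with have=0. Other kinds of corruption could presumably produce the same value, so this does not by itself show what happened to the disk.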


BTRFS error (device sda4): failed to read chunk tree: -5

2017-08-16 Thread Zirconium Hacker
Hi,
This is my first time using a mailing list, and I hope I'm doing this right.

$ uname -a
Linux thinkpad 4.12.6-1-ARCH #1 SMP PREEMPT Sat Aug 12 09:16:22 CEST
2017 x86_64 GNU/Linux
$ btrfs --version
btrfs-progs v4.12
$ sudo mount -o ro,recovery /dev/sda4 /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/sda4,
missing codepage or helper program, or other error.
$ dmesg | tail

[ 1289.087439] BTRFS warning (device sda4): 'recovery' is deprecated,
use 'usebackuproot' instead
[ 1289.087440] BTRFS info (device sda4): trying to use backup root at mount time
[ 1289.087442] BTRFS info (device sda4): disk space caching is enabled
[ 1289.097757] BTRFS error (device sda4): failed to read chunk tree: -5
[ 1289.135222] BTRFS error (device sda4): open_ctree failed
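
The -5 in "failed to read chunk tree: -5" looks like a negated errno value, which on Linux is EIO. A small Python sketch to decode such codes (describe_kernel_error is just an illustrative name, and the message text comes from the local C library, so it may differ between systems):

```python
import errno
import os

def describe_kernel_error(code):
    """Map a negative kernel return code such as -5 to its errno name and message."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"

print(describe_kernel_error(-5))
```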

$ sudo btrfs check /dev/sda4
bytenr mismatch, want=61809344512, have=0
Couldn't read tree root
ERROR: cannot open file system
$ sudo btrfs restore - -D /dev/sda4 .
bytenr mismatch, want=61809344512, have=0
Couldn't read tree root
Could not open root, trying backup super
bytenr mismatch, want=61809344512, have=0
Couldn't read tree root
Could not open root, trying backup super
ERROR: superblock bytenr 274877906944 is larger than device size 58056507392
Could not open root, trying backup super
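
The last error above appears to be about the third superblock copy. As far as I know, btrfs keeps superblock copies at 64 KiB, 64 MiB, and 256 GiB, and 256 GiB is exactly 274877906944 bytes, which is past the end of this 58056507392-byte partition. A minimal Python sketch of that arithmetic follows. The 4096-byte superblock size and the "must fit entirely" rule are my assumptions, and superblocks_on_device is a made-up helper name.

```python
# Byte offsets of the primary btrfs superblock and its two mirror copies.
SUPERBLOCK_OFFSETS = (64 * 1024, 64 * 1024**2, 256 * 1024**3)
# Assumed size of one superblock in bytes.
SUPERBLOCK_SIZE = 4096

def superblocks_on_device(device_size):
    """Return the superblock offsets that fit entirely within the device."""
    return [
        offset
        for offset in SUPERBLOCK_OFFSETS
        if offset + SUPERBLOCK_SIZE <= device_size
    ]

device_size = 58056507392  # device size reported in the error above
for offset in SUPERBLOCK_OFFSETS:
    present = offset in superblocks_on_device(device_size)
    print(offset, "fits" if present else "beyond end of device")
```

If that is right, only the first two copies exist on this partition, and the final message just means there is no third copy to try.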

A script called btrfs-undelete
(https://gist.github.com/Changaco/45f8d171027ea2655d74) also fails
with similar errors.

I'd like to recover at least one folder, my desktop -- everything else
was backed up.
I'm using PhotoRec to try and recover some files, but I'd like a
better solution that keeps filenames and at least some folder
structure.

Thanks in advance!