Re: System unable to mount partition after a power loss
I ran that command, but I cannot get the email to send to the mailing list properly, as the attachment with the output is over 4.6M.

On 12/7/2018 11:49 AM, Doni Crosby wrote:
> The output of the command is attached. These are the errors that showed up on the system:
>
> parent transid verify failed on 3563224842240 wanted 5184691 found 5184689
> parent transid verify failed on 3563224842240 wanted 5184691 found 5184689
> parent transid verify failed on 3563222974464 wanted 5184691 found 5184688
> parent transid verify failed on 3563222974464 wanted 5184691 found 5184688
> parent transid verify failed on 3563223121920 wanted 5184691 found 5184688
> parent transid verify failed on 3563223121920 wanted 5184691 found 5184688
> parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
> parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
> parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
> parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
> Ignoring transid failure
> parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
> parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
> parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
> parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
> Ignoring transid failure
> parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
> parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
> parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
> parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
> Ignoring transid failure
> parent transid verify failed on 3563231412224 wanted 5184691 found 5183325
> parent transid verify failed on 3563231412224 wanted 5184691 found 5183325
> parent transid verify failed on 3563231412224 wanted 5184691 found 5183325
> parent transid verify failed on 3563231412224 wanted 5184691 found 5183325
> Ignoring transid failure
> parent transid verify failed on 3563231461376 wanted 5184691 found 5183325
> parent transid verify failed on 3563231461376 wanted 5184691 found 5183325
> parent transid verify failed on 3563231461376 wanted 5184691 found 5183325
> parent transid verify failed on 3563231461376 wanted 5184691 found 5183325
> Ignoring transid failure
> WARNING: eb corrupted: parent bytenr 31801344 slot 132 level 1 child bytenr 3563231461376 level has 1 expect 0, skipping the slot
> parent transid verify failed on 3563231494144 wanted 5184691 found 5183325
> parent transid verify failed on 3563231494144 wanted 5184691 found 5183325
> parent transid verify failed on 3563231494144 wanted 5184691 found 5183325
> parent transid verify failed on 3563231494144 wanted 5184691 found 5183325
> Ignoring transid failure
> parent transid verify failed on 3563231526912 wanted 5184691 found 5183325
> parent transid verify failed on 3563231526912 wanted 5184691 found 5183325
> parent transid verify failed on 3563231526912 wanted 5184691 found 5183325
> parent transid verify failed on 3563231526912 wanted 5184691 found 5183325
> Ignoring transid failure
> parent transid verify failed on 3563229626368 wanted 5184691 found 5184689
> parent transid verify failed on 3563229626368 wanted 5184691 found 5184689
> parent transid verify failed on 3563229937664 wanted 5184691 found 5184689
> parent transid verify failed on 3563229937664 wanted 5184691 found 5184689
> parent transid verify failed on 3563226857472 wanted 5184691 found 5184689
> parent transid verify failed on 3563226857472 wanted 5184691 found 5184689
> parent transid verify failed on 3563230674944 wanted 5184691 found 5183325
> parent transid verify failed on 3563230674944 wanted 5184691 found 5183325
> parent transid verify failed on 3563230674944 wanted 5184691 found 5183325
> parent transid verify failed on 3563230674944 wanted 5184691 found 5183325
> Ignoring transid failure
>
> On Fri, Dec 7, 2018 at 2:22 AM Qu Wenruo wrote:
>> On 2018/12/7 1:24 PM, Doni Crosby wrote:
>>> All,
>>>
>>> I'm coming to you to see if there is a way to fix or at least recover
>>> most of the data I have from a btrfs filesystem. The system went down
>>> after both a breaker and the battery backup failed. I cannot currently
>>> mount the system, with the following error from dmesg:
>>>
>>> Note: The vda1 is just the entire disk being passed from the VM host
>>> to the VM; it's not an actual true virtual block device.
>>>
>>> [ 499.704398] BTRFS info (device vda1): disk space caching is enabled
>>> [ 499.704401] BTRFS info (device vda1): has skinny extents
>>> [ 499.739522] BTRFS error (device vda1): parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
>>
>> A transid mismatch normally means the fs is more or less screwed up. And
>> according to your mount failure, it looks like the fs got screwed up badly.
>>
>> What's the kernel version used in the VM? I don't really think the VM is
>> always using the latest kernel.
>>
>>> [ 499.740257] BTRFS error (device vda1): parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
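A quick way to digest long runs of these messages is to collapse them into one line per metadata block, converting the wanted/found pair into a generation lag. This is only a sketch: the awk program is illustrative, and the embedded sample lines (copied from the output above) stand in for the real check log you would normally pipe through it.

```shell
# Collapse "parent transid verify failed" spam into one summary line per
# metadata block: hit count plus how many generations stale the found copy is.
summary=$(cat <<'EOF' | awk '/parent transid verify failed/ { k=$6; w[k]=$8; f[k]=$10; n[k]++ } END { for (k in n) printf "bytenr %s: %d hit(s), %d generation(s) behind\n", k, n[k], w[k]-f[k] }'
parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
EOF
)
echo "$summary"
```

The lag is the interesting part: a block only a couple of commits behind is much less alarming than one more than a thousand commits stale, as several of the blocks above are.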
Re: System unable to mount partition after a power loss
I just looked at the VM; it does not have a cache mode set. That's the default in Proxmox, to improve performance.

On Fri, Dec 7, 2018 at 7:25 AM Austin S. Hemmelgarn wrote:
> On 2018-12-07 01:43, Doni Crosby wrote:
>>> This is qemu-kvm? What's the cache mode being used? It's possible the
>>> usual write guarantees are thwarted by VM caching.
>> Yes, it is a Proxmox host running the system, so it is a QEMU VM. I'm
>> unsure about the caching situation.
> On the note of QEMU and the cache mode, the only cache mode I've seen
> actually cause issues for BTRFS volumes _inside_ a VM is 'cache=unsafe',
> but that causes problems for most filesystems, so it's probably not the
> issue here.
>
> OTOH, I've seen issues with most of the cache modes other than
> 'cache=writeback' and 'cache=writethrough' when dealing with BTRFS as
> the back-end storage on the host system, and most of the time such
> issues will manifest as both problems with the volume inside the VM
> _and_ the volume the disk images are being stored on.
Re: System unable to mount partition after a power loss
On 2018-12-07 01:43, Doni Crosby wrote:
>> This is qemu-kvm? What's the cache mode being used? It's possible the
>> usual write guarantees are thwarted by VM caching.
> Yes, it is a Proxmox host running the system, so it is a QEMU VM. I'm
> unsure about the caching situation.
On the note of QEMU and the cache mode, the only cache mode I've seen
actually cause issues for BTRFS volumes _inside_ a VM is 'cache=unsafe',
but that causes problems for most filesystems, so it's probably not the
issue here.

OTOH, I've seen issues with most of the cache modes other than
'cache=writeback' and 'cache=writethrough' when dealing with BTRFS as
the back-end storage on the host system, and most of the time such
issues will manifest as both problems with the volume inside the VM
_and_ the volume the disk images are being stored on.
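On the host side, the effective cache mode can be read straight off the QEMU process's command line (e.g. via ps -ef | grep kvm). The command line below is a made-up example of the kind of -drive argument a Proxmox host can generate; only the grep is the point.

```shell
# Hypothetical QEMU -drive argument; an absent cache= option means QEMU's
# built-in default (writeback on modern QEMU versions).
qemu_cmdline='-drive file=/dev/sdb,if=virtio,format=raw,cache=none'
mode=$(printf '%s\n' "$qemu_cmdline" | grep -o 'cache=[a-z]*')
[ -n "$mode" ] || mode='cache=<default>'
echo "$mode"
```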
Re: System unable to mount partition after a power loss
On 2018/12/7 1:24 PM, Doni Crosby wrote:
> All,
>
> I'm coming to you to see if there is a way to fix or at least recover
> most of the data I have from a btrfs filesystem. The system went down
> after both a breaker and the battery backup failed. I cannot currently
> mount the system, with the following error from dmesg:
>
> Note: The vda1 is just the entire disk being passed from the VM host
> to the VM; it's not an actual true virtual block device.
>
> [ 499.704398] BTRFS info (device vda1): disk space caching is enabled
> [ 499.704401] BTRFS info (device vda1): has skinny extents
> [ 499.739522] BTRFS error (device vda1): parent transid verify failed
> on 3563231428608 wanted 5184691 found 5183327

A transid mismatch normally means the fs is more or less screwed up. And
according to your mount failure, it looks like the fs got screwed up badly.

What's the kernel version used in the VM? I don't really think the VM is
always using the latest kernel.

> [ 499.740257] BTRFS error (device vda1): parent transid verify failed
> on 3563231428608 wanted 5184691 found 5183327
> [ 499.770847] BTRFS error (device vda1): open_ctree failed
>
> I have tried running btrfsck:
>
> parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
> parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
> parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
> parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
> parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
> parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
> parent transid verify failed on 3563221630976 wanted 5184691 found 5184688
> parent transid verify failed on 3563221630976 wanted 5184691 found 5184688
> parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
> parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
> parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
> parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
> parent transid verify failed on 3563224072192 wanted 5184691 found 5184688
> parent transid verify failed on 3563224072192 wanted 5184691 found 5184688
> parent transid verify failed on 3563225268224 wanted 5184691 found 5184689
> parent transid verify failed on 3563225268224 wanted 5184691 found 5184689
> parent transid verify failed on 3563227398144 wanted 5184691 found 5184689
> parent transid verify failed on 3563227398144 wanted 5184691 found 5184689
> parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
> parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
> parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
> parent transid verify failed on 3563229593600 wanted 5184691 found 5184689

According to your later dump-super output, it looks quite possible that the
corrupted extents all belong to the extent tree. So it's still possible that
your fs tree and the other essential trees are OK.

Please dump the following output (with its stderr) to further confirm the damage:

# btrfs ins dump-tree -b 31801344 --follow /dev/vda1

If your objective is only to recover data, then you could start by trying
btrfs-restore. A heavily damaged extent tree is pretty hard to fix.

Thanks,
Qu

> Ignoring transid failure
> Checking filesystem on /dev/vda1
> UUID: 7c76bb05-b3dc-4804-bf56-88d010a214c6
> checking extents
> parent transid verify failed on 3563224842240 wanted 5184691 found 5184689
> parent transid verify failed on 3563224842240 wanted 5184691 found 5184689
> parent transid verify failed on 3563222974464 wanted 5184691 found 5184688
> parent transid verify failed on 3563222974464 wanted 5184691 found 5184688
> parent transid verify failed on 3563223121920 wanted 5184691 found 5184688
> parent transid verify failed on 3563223121920 wanted 5184691 found 5184688
> parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
> parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
> parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
> parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
> Ignoring transid failure
> parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
> parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
> parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
> parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
> Ignoring transid failure
> parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
> parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
> parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
> parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
> Ignoring transid failure
> parent transid verify
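Qu's suggestions can be laid out as the read-only sequence below. This is a sketch, not the thread's exact procedure: /mnt/restore is a placeholder destination on some other, healthy filesystem, and the guard makes the snippet a no-op on a machine that lacks the device or btrfs-progs.

```shell
DEV=/dev/vda1      # the affected device from this thread
OUT=/mnt/restore   # placeholder: any healthy filesystem with enough space
if [ -b "$DEV" ] && command -v btrfs >/dev/null 2>&1; then
    # First confirm the damage is confined to the extent tree (read-only):
    btrfs ins dump-tree -b 31801344 --follow "$DEV" 2>&1 | head -n 40
    # Then pull file data out without ever mounting; -v lists each file.
    mkdir -p "$OUT"
    btrfs restore -v "$DEV" "$OUT"
    ran=yes
else
    ran=no
    echo "device or btrfs-progs not available; shown for reference only"
fi
```

btrfs restore reads the fs trees directly, so it can succeed even when the extent tree is too damaged for a normal mount or repair.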
Re: System unable to mount partition after a power loss
> This is qemu-kvm? What's the cache mode being used? It's possible the
> usual write guarantees are thwarted by VM caching.

Yes, it is a Proxmox host running the system, so it is a QEMU VM. I'm
unsure about the caching situation.

> Old version of progs, I suggest upgrading to 4.17.1 and run

I updated the progs to 4.17 and ran the following:

btrfs insp dump-s -f /device/:
See attachment.

btrfs rescue super -v /device/ (insp rescue super wasn't valid):
All Devices:
	Device: id = 1, name = /dev/vda1
Before Recovering:
	[All good supers]:
		device name = /dev/vda1
		superblock bytenr = 65536
		device name = /dev/vda1
		superblock bytenr = 67108864
		device name = /dev/vda1
		superblock bytenr = 274877906944
	[All bad supers]:
All supers are valid, no need to recover

btrfs check --mode=lowmem /dev/vda1:
parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
parent transid verify failed on 3563221630976 wanted 5184691 found 5184688
parent transid verify failed on 3563221630976 wanted 5184691 found 5184688
parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
parent transid verify failed on 3563224072192 wanted 5184691 found 5184688
parent transid verify failed on 3563224072192 wanted 5184691 found 5184688
parent transid verify failed on 3563225268224 wanted 5184691 found 5184689
parent transid verify failed on 3563225268224 wanted 5184691 found 5184689
parent transid verify failed on 3563227398144 wanted 5184691 found 5184689
parent transid verify failed on 3563227398144 wanted 5184691 found 5184689
parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
Ignoring transid failure
ERROR: child eb corrupted: parent bytenr=3563210342400 item=120 parent level=1 child level=1
ERROR: cannot open file system

mount -o ro,norecovery,usebackuproot /dev/vda1 /mnt:
Same dmesg output as before.

On Fri, Dec 7, 2018 at 12:56 AM Chris Murphy wrote:
>
> On Thu, Dec 6, 2018 at 10:24 PM Doni Crosby wrote:
> >
> > All,
> >
> > I'm coming to you to see if there is a way to fix or at least recover
> > most of the data I have from a btrfs filesystem. The system went down
> > after both a breaker and the battery backup failed. I cannot currently
> > mount the system, with the following error from dmesg:
> >
> > Note: The vda1 is just the entire disk being passed from the VM host
> > to the VM; it's not an actual true virtual block device.
>
> This is qemu-kvm? What's the cache mode being used? It's possible the
> usual write guarantees are thwarted by VM caching.
>
> > btrfs check --recover also ends in a segmentation fault
>
> I'm not familiar with a --recover option; the --repair option is flagged
> with a warning in the man page:
>
>     Warning
>     Do not use --repair unless you are advised to do so by a
>     developer or an experienced user.
>
> > btrfs --version:
> > btrfs-progs v4.7.3
>
> Old version of progs; I suggest upgrading to 4.17.1 and running:
>
> btrfs insp dump-s -f /device/
> btrfs insp rescue super -v /device/
> btrfs check --mode=lowmem /device/
>
> These are all read-only commands. Please post the output to the list;
> hopefully a developer will get around to looking at it.
>
> It is safe to try:
>
> mount -o ro,norecovery,usebackuproot /device/ /mnt/
>
> If that works, I suggest updating your backup in the meantime, while
> it's still possible.
>
> --
> Chris Murphy

superblock: bytenr=65536, device=/dev/vda1

csum_type		0 (crc32c)
csum_size		4
csum			0xbfa6fd72 [match]
bytenr			65536
flags			0x1 ( WRITTEN )
magic			_BHRfS_M [match]
fsid			7c76bb05-b3dc-4804-bf56-88d010a214c6
label			Array
generation		5184693
root			31801344
sys_array_size		226
chunk_root_generation	5183734
root_level		1
chunk_root		20971520
chunk_root_level	1
log_root		0
log_root_transid	0
log_root_level		0
total_bytes		32003947737088
bytes_used		6652776640512
sectorsize		4096
nodesize		16384
leafsize (deprecated)	16384
stripesize		4096
root_dir		6
num_devices		1
compat_flags		0x0
compat_ro_flags		0x0
incompat_flags		0x161 ( MIXED_BACKREF |
Re: System unable to mount partition after a power loss
On Thu, Dec 6, 2018 at 10:24 PM Doni Crosby wrote:
>
> All,
>
> I'm coming to you to see if there is a way to fix or at least recover
> most of the data I have from a btrfs filesystem. The system went down
> after both a breaker and the battery backup failed. I cannot currently
> mount the system, with the following error from dmesg:
>
> Note: The vda1 is just the entire disk being passed from the VM host
> to the VM; it's not an actual true virtual block device.

This is qemu-kvm? What's the cache mode being used? It's possible the
usual write guarantees are thwarted by VM caching.

> btrfs check --recover also ends in a segmentation fault

I'm not familiar with a --recover option; the --repair option is flagged
with a warning in the man page:

    Warning
    Do not use --repair unless you are advised to do so by a
    developer or an experienced user.

> btrfs --version:
> btrfs-progs v4.7.3

Old version of progs; I suggest upgrading to 4.17.1 and running:

btrfs insp dump-s -f /device/
btrfs insp rescue super -v /device/
btrfs check --mode=lowmem /device/

These are all read-only commands. Please post the output to the list;
hopefully a developer will get around to looking at it.

It is safe to try:

mount -o ro,norecovery,usebackuproot /device/ /mnt/

If that works, I suggest updating your backup in the meantime, while
it's still possible.

--
Chris Murphy
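Chris's read-only recovery path can be sketched as a script. The backup destination is a placeholder of my choosing, and the guard keeps the snippet from touching anything on a machine without the device; the mount options themselves are the ones from the thread.

```shell
DEV=/dev/vda1        # affected device from this thread
MNT=/mnt
BACKUP=/srv/backup   # placeholder destination for the copy
if [ -b "$DEV" ]; then
    # norecovery skips log-tree replay; usebackuproot falls back to an
    # older tree root if the current one cannot be read. With ro, both
    # leave the device untouched.
    mount -o ro,norecovery,usebackuproot "$DEV" "$MNT" &&
        rsync -aHAX "$MNT"/ "$BACKUP"/
    ran=yes
else
    ran=no
    echo "device $DEV not present; sequence shown for reference only"
fi
```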
System unable to mount partition after a power loss
All,

I'm coming to you to see if there is a way to fix or at least recover
most of the data I have from a btrfs filesystem. The system went down
after both a breaker and the battery backup failed. I cannot currently
mount the system, with the following error from dmesg:

Note: The vda1 is just the entire disk being passed from the VM host
to the VM; it's not an actual true virtual block device.

[ 499.704398] BTRFS info (device vda1): disk space caching is enabled
[ 499.704401] BTRFS info (device vda1): has skinny extents
[ 499.739522] BTRFS error (device vda1): parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
[ 499.740257] BTRFS error (device vda1): parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
[ 499.770847] BTRFS error (device vda1): open_ctree failed

I have tried running btrfsck:

parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
parent transid verify failed on 3563224121344 wanted 5184691 found 5184688
parent transid verify failed on 3563221630976 wanted 5184691 found 5184688
parent transid verify failed on 3563221630976 wanted 5184691 found 5184688
parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
parent transid verify failed on 3563223138304 wanted 5184691 found 5184688
parent transid verify failed on 3563224072192 wanted 5184691 found 5184688
parent transid verify failed on 3563224072192 wanted 5184691 found 5184688
parent transid verify failed on 3563225268224 wanted 5184691 found 5184689
parent transid verify failed on 3563225268224 wanted 5184691 found 5184689
parent transid verify failed on 3563227398144 wanted 5184691 found 5184689
parent transid verify failed on 3563227398144 wanted 5184691 found 5184689
parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
parent transid verify failed on 3563229593600 wanted 5184691 found 5184689
Ignoring transid failure
Checking filesystem on /dev/vda1
UUID: 7c76bb05-b3dc-4804-bf56-88d010a214c6
checking extents
parent transid verify failed on 3563224842240 wanted 5184691 found 5184689
parent transid verify failed on 3563224842240 wanted 5184691 found 5184689
parent transid verify failed on 3563222974464 wanted 5184691 found 5184688
parent transid verify failed on 3563222974464 wanted 5184691 found 5184688
parent transid verify failed on 3563223121920 wanted 5184691 found 5184688
parent transid verify failed on 3563223121920 wanted 5184691 found 5184688
parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
parent transid verify failed on 3563229970432 wanted 5184691 found 5184689
Ignoring transid failure
parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
parent transid verify failed on 3563231428608 wanted 5184691 found 5183327
Ignoring transid failure
parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
parent transid verify failed on 3563231444992 wanted 5184691 found 5183325
Ignoring transid failure
parent transid verify failed on 3563231412224 wanted 5184691 found 5183325
parent transid verify failed on 3563231412224 wanted 5184691 found 5183325
parent transid verify failed on 3563231412224 wanted 5184691 found 5183325
parent transid verify failed on 3563231412224 wanted 5184691 found 5183325
Ignoring transid failure
parent transid verify failed on 3563231461376 wanted 5184691 found 5183325
parent transid verify failed on 3563231461376 wanted 5184691 found 5183325
parent transid verify failed on 3563231461376 wanted 5184691 found 5183325
parent transid verify failed on 3563231461376 wanted 5184691 found 5183325
Ignoring transid failure
Segmentation fault

btrfs check --recover also ends in a segmentation fault.

I am aware of chunk-recover and have tried to run it, but got wary when
I saw dev0 rather than vda1.

Any help would be appreciated,
Doni

uname -a:
Linux Homophone 4.18.0-0.bpo.1-amd64 #1 SMP Debian 4.18.6-1~bpo9+1 (2018-09-13)
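The wanted/found pairs in the output above are btrfs generation (transaction) numbers, so their difference is the number of committed transactions each stale block missed. Worked out for the two groups in this log (numbers copied from the output, arithmetic only):

```shell
wanted=5184691   # generation every parent pointer expects
recent=5184688   # what most of the blocks actually contain
oldest=5183325   # the worst blocks, e.g. bytenr 3563231444992
echo "recent group lag: $((wanted - recent)) commits"
echo "oldest group lag: $((wanted - oldest)) commits"
```

A lag of a few commits fits a write interrupted at power-loss time; a lag of well over a thousand commits suggests those writes were dropped long before the crash, which is why the VM cache-mode question comes up later in the thread.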