On Feb 11, 2017, at 12:34 PM, Kenneth Bogert <[email protected]> wrote:
> 
> Hello all,
> 
> I have been running Rockstor 3.8.16-8 on an older Dell Optiplex for about a 
> month.  The system has four drives separated into two Raid1 filesystems 
> (“pools” in Rockstor terminology).  A few days ago I restarted it and noticed 
> that the services (NFS, Samba, etc) weren’t working.  Looking at dmesg, I saw:
> 
> kernel: BTRFS error (device sdb): parent transid verify failed on 
> 1721409388544 wanted 19188 found 83121
> 
> and sure enough, one of the subvolumes on my main filesystem is corrupted.  
> By corrupted I mean it can’t be accessed, deleted, or even looked at:
> 
> ls -l
> kernel: BTRFS error (device sdb): parent transid verify failed on 
> 1721409388544 wanted 19188 found 83121
> kernel: BTRFS error (device sdb): parent transid verify failed on 
> 1721409388544 wanted 19188 found 83121
> ls: cannot access /mnt2/Primary/Movies: Input/output error
> 
> total 16
> drwxr-xr-x 1 root      root         100 Dec 29 02:00 .
> drwxr-xr-x 1 root      root         208 Jan  3 12:05 ..
> drwxr-x--- 1 kbogert   root         698 Feb  6 08:49 Documents
> drwxr-xrwx 1 root      root         916 Jan  3 12:54 Games
> drwxr-xrwx 1 xenserver xenserver   2904 Jan  3 12:54 ISO
> d????????? ? ?         ?              ?            ? Movies
> drwxr-xrwx 1 root      root      139430 Jan  3 12:53 Music
> drwxr-xrwx 1 root      root       82470 Jan  3 12:53 RawPhotos
> drwxr-xr-x 1 root      root          80 Jan  1 04:00 .snapshots
> drwxr-xrwx 1 root      root          72 Jan  3 13:07 VMs
> 
> The input/output error is given for any operation on Movies.
> 
> Luckily there has been no data loss that I am aware of.  As it turns out I 
> have a snapshot of the Movies subvolume taken a few days before the incident. 
>  I was able to simply cp -a all files off of the entire filesystem, with no 
> reported errors, and verified a handful of them.  Note that the transid error 
> in dmesg alternates between sdb and sda5 after each startup.
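
To check that every one of these messages points at the same block with the same wanted/found pair, whichever device reports it, saved dmesg output could be run through something like the Python sketch below. The function names and the regular expression are my own, written against the message format quoted above; other kernel versions may word the message differently.

```python
import re

# Matches the "parent transid verify failed" message as it appears in the
# dmesg output quoted above. The group names are labels chosen for this sketch.
TRANSID_RE = re.compile(
    r"BTRFS error \(device (?P<device>\S+)\): "
    r"parent transid verify failed on\s+(?P<bytenr>\d+)\s+"
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def parse_transid_errors(log_text):
    """Return one dict per transid verify failure found in log_text."""
    errors = []
    for m in TRANSID_RE.finditer(log_text):
        errors.append({
            "device": m.group("device"),
            "bytenr": int(m.group("bytenr")),
            "wanted": int(m.group("wanted")),
            "found": int(m.group("found")),
        })
    return errors


def summarize(errors):
    """Map each (bytenr, wanted, found) triple to the set of reporting devices."""
    summary = {}
    for e in errors:
        key = (e["bytenr"], e["wanted"], e["found"])
        summary.setdefault(key, set()).add(e["device"])
    return summary
```

If the summary contains a single key, all messages refer to the same block and the same pair of transids, and the set attached to it shows which devices reported it.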
> 
> 
> SETUP DETAILS
> 
> uname -a
> Linux ironmountain 4.8.7-1.el7.elrepo.x86_64 #1 SMP Thu Nov 10 20:47:24 EST 
> 2016 x86_64 x86_64 x86_64 GNU/Linux
> 
> btrfs --version
> btrfs-progs v4.8.3
> 
> btrfs dev scan
> kernel: BTRFS: device label Primary devid 1 transid 83461 /dev/sdb
> kernel: BTRFS: device label Primary devid 2 transid 83461 /dev/sda5
> 
> btrfs fi show /mnt2/Primary
> Label: 'Primary'  uuid: 21e09dd8-a54d-49ec-95cb-93fdd94f0c17
>       Total devices 2 FS bytes used 943.67GiB
>       devid    1 size 2.73TiB used 947.06GiB path /dev/sdb
>       devid    2 size 2.70TiB used 947.06GiB path /dev/sda5
> 
> btrfs dev usage /mnt2/Primary
> /dev/sda5, ID: 2
>   Device size:             2.70TiB
>   Device slack:              0.00B
>   Data,RAID1:            944.00GiB
>   Metadata,RAID1:          3.00GiB
>   System,RAID1:           64.00MiB
>   Unallocated:             1.77TiB
> 
> /dev/sdb, ID: 1
>   Device size:             2.73TiB
>   Device slack:              0.00B
>   Data,RAID1:            944.00GiB
>   Metadata,RAID1:          3.00GiB
>   System,RAID1:           64.00MiB
>   Unallocated:             1.80TiB
> 
> 
> btrfs fi df /mnt2/Primary
> Data, RAID1: total=944.00GiB, used=942.60GiB
> System, RAID1: total=64.00MiB, used=176.00KiB
> Metadata, RAID1: total=3.00GiB, used=1.07GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> 
> This server sees very light use.  However, I do have a number of VMs in the 
> VMs subvolume, exported over NFS, that are used by a Xenserver.  These are 
> not marked nocow, though they probably should have been.  At the time of the 
> restart no VMs were running.
> 
> I have deviated from Rockstor’s default setup a bit.  They take an 
> “appliance” view and try to enforce btrfs filesystems that cover entire 
> disks.  I installed Rockstor onto /dev/sda4, created the Primary filesystem 
> on /dev/sdb using Rockstor’s GUI, then on the command line added /dev/sda5 to 
> it and converted it to raid1.  As far as I can tell, Rockstor is just CentOS 7 
> with a few updated utilities and a bunch of Python scripts that provide a web 
> interface to btrfs-progs.  I have it set up to take monthly snapshots and do 
> monthly scrubs, with the exception of the Documents subvolume, which takes 
> daily snapshots.  These are all read-only and go in the .snapshots directory.  
> Rockstor automatically deletes old snapshots once a limit is reached (7 daily 
> snapshots, for instance).
> 
> Side note, btrfs-progs 4.8.3 apparently has problems with CentOS 7’s glibc: 
> https://github.com/rockstor/rockstor-core/issues/1608 .  I have confirmed 
> that bug in my own compiled version of 4.8.3, and that 4.9.1 does not have it.
> 
> 
> WHAT I’VE TRIED AND RESULTS
> 
> First off, I have created an image with btrfs-image that I can make available 
> (though it is large: I believe it was a few GB, and the filesystem is 3 TB).
> 
> * btrfs-zero-log 
>       had no discernible effect.
> 
> 
> * At this point, I compiled btrfs-progs 4.9.1.  The following commands were 
> run with this version:
> 
> 
> * btrfs check
>       This exits on an assertion failure fairly quickly:
> checking extents
> cmds-check.c:5406: check_owner_ref: BUG_ON `rec->is_root` triggered, value 1
> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x42139b]
> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x421483]
> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x430529]
> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x43160c]
> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x435d6f]
> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x43ab71]
> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x43b065]
> /mnt/usb/btrfs-progs-bin/bin/btrfs(cmd_check+0xbbc)[0x441b82]
> /mnt/usb/btrfs-progs-bin/bin/btrfs(main+0x12b)[0x40a734]
> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7ffff6fa7b35]
> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x40a179]
> 
> Full backtrace is attached as btrfsck_debug.log 
> 
> * btrfs check --mode lowmem
>       This outputs a large number of errors before finally segfaulting.  
> Full backtrace attached as btrfsck_lowmem_debug.log
> 
> * btrfs scrub
>       This completes with no errors.
> 
> 
> * Memtest86 completed more than 6 passes with no errors (left it running for 
> a day)
> 
> * No SMART errors, btrfs device stats shows no errors.  The drives the 
> filesystem is on are brand new.
> 
> * I have tried to recreate the problem by installing Rockstor into a number 
> of VMs and redoing my steps, but with no luck.
> 
> 
> The main Rockstor partition (btrfs) and the other Raid1 filesystem, which is 
> on completely separate drives, were not affected.  I can provide any other 
> logs requested.
> 
> Help would be greatly appreciated!
> 
> 
> Kenneth Bogert
> 
> <btrfsck_lowmem_debug.log><btrfsck_debug.log>

As a small update to this problem, below is the output of btrfs subvolume list 
(with 4.9.1).  The snapshot of the Movies subvolume is at gen 73808, but Movies 
itself is at gen 19188 (the same value as the “wanted” transid in the dmesg 
errors)?


ID 259 gen 83464 cgen 39 parent 5 top level 5 parent_uuid - path Music
ID 260 gen 19188 cgen 40 parent 5 top level 5 parent_uuid - path Movies
ID 261 gen 73808 cgen 41 parent 5 top level 5 parent_uuid - path ISO
ID 262 gen 73864 cgen 42 parent 5 top level 5 parent_uuid - path RawPhotos
ID 263 gen 83456 cgen 44 parent 5 top level 5 parent_uuid - path VMs
ID 601 gen 73810 cgen 356 parent 5 top level 5 parent_uuid - path Games
ID 882 gen 83462 cgen 526 parent 5 top level 5 parent_uuid - path Documents
ID 2104 gen 44513 cgen 44513 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_1
ID 2111 gen 55190 cgen 55190 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_weekly_201701220542
ID 2121 gen 68569 cgen 68569 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_weekly_201701290542
ID 2122 gen 68593 cgen 68593 parent 5 top level 5 parent_uuid 4e131f43-6ccb-7449-89ed-0d00b761cb08 path .snapshots/VMs/VMs_201701290600
ID 2124 gen 71873 cgen 71873 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201701310400
ID 2125 gen 73705 cgen 73705 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702010400
ID 2126 gen 73808 cgen 73808 parent 5 top level 5 parent_uuid 1d82b662-f291-b340-9424-804fa431a03b path .snapshots/ISO/ISO_201702010500
ID 2127 gen 73808 cgen 73808 parent 5 top level 5 parent_uuid 915e8022-4cf3-084b-8ac6-504822a168c4 path .snapshots/Movies/movies_201702010500
ID 2128 gen 73810 cgen 73810 parent 5 top level 5 parent_uuid adcb63c8-ee55-8b49-8f7a-aed491aab7e6 path .snapshots/Games/games_201702010500
ID 2129 gen 73811 cgen 73811 parent 5 top level 5 parent_uuid e23f7432-fc89-c849-a2f2-4280cefabcf7 path .snapshots/Music/music_201702010500
ID 2130 gen 73864 cgen 73864 parent 5 top level 5 parent_uuid 67dc081c-cf8e-a444-8c8f-7899865e2f08 path .snapshots/RawPhotos/rawphotos_201702010530
ID 2131 gen 73865 cgen 73865 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_monthly_201702010530
ID 2132 gen 73920 cgen 73920 parent 5 top level 5 parent_uuid 4e131f43-6ccb-7449-89ed-0d00b761cb08 path .snapshots/VMs/VMs_201702010600
ID 2133 gen 75516 cgen 75516 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702020400
ID 2134 gen 77397 cgen 77397 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702030400
ID 2135 gen 79229 cgen 79229 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702040400
ID 2136 gen 81109 cgen 81109 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702050400
ID 2137 gen 81246 cgen 81246 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_weekly_201702050542
ID 2138 gen 81273 cgen 81273 parent 5 top level 5 parent_uuid 4e131f43-6ccb-7449-89ed-0d00b761cb08 path .snapshots/VMs/VMs_201702050600
ID 2139 gen 82966 cgen 82966 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702060400
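
To make the gen comparison less of an eyeball exercise, here is a rough Python sketch that parses a listing in the format above and flags any subvolume whose gen is lower than the cgen of its newest snapshot. It assumes each entry is on a single line, as btrfs prints them, and that snapshots live under .snapshots/<subvolume name>/ as Rockstor lays them out here. The rule that a healthy subvolume's gen should be at least the cgen of its snapshots is only a heuristic of mine: it holds for every other subvolume in this listing, but I have not confirmed it against the btrfs internals. All function names are made up for the sketch.

```python
import re

# One entry of the listing above, on a single line.
LINE_RE = re.compile(
    r"ID (?P<id>\d+) gen (?P<gen>\d+) cgen (?P<cgen>\d+) "
    r"parent (?P<parent>\d+) top level (?P<top>\d+) "
    r"parent_uuid (?P<parent_uuid>\S+) path (?P<path>.+)"
)


def parse_subvolumes(text):
    """Parse the listing into dicts holding id, gen, cgen and path."""
    subvols = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            subvols.append({
                "id": int(m.group("id")),
                "gen": int(m.group("gen")),
                "cgen": int(m.group("cgen")),
                "path": m.group("path").strip(),
            })
    return subvols


def stale_subvolumes(subvols, snapshot_dir=".snapshots"):
    """Return (path, gen, newest snapshot cgen) for suspicious subvolumes.

    A subvolume is reported when its gen is lower than the cgen of the
    newest snapshot stored under <snapshot_dir>/<path>/ (heuristic only).
    """
    newest = {}
    for s in subvols:
        parts = s["path"].split("/")
        if len(parts) >= 3 and parts[0] == snapshot_dir:
            name = parts[1]
            newest[name] = max(newest.get(name, 0), s["cgen"])
    flagged = []
    for s in subvols:
        snap_cgen = newest.get(s["path"])
        if snap_cgen is not None and s["gen"] < snap_cgen:
            flagged.append((s["path"], s["gen"], snap_cgen))
    return flagged
```

Going by the numbers in the listing, Movies (gen 19188 against snapshot cgen 73808) is the only subvolume this rule would flag.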


Kenneth Bogert

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
