> On Feb 17, 2017, at 1:39 PM, Kenneth Bogert <[email protected]> wrote:
>
> On Feb 11, 2017, at 12:34 PM, Kenneth Bogert <[email protected]> wrote:
>>
>> Hello all,
>>
>> I have been running Rockstor 3.8.16-8 on an older Dell Optiplex for about a month. The system has four drives separated into two Raid1 filesystems (“pools” in Rockstor terminology). A few days ago I restarted it and noticed that the services (NFS, Samba, etc.) weren’t working. Looking at dmesg, I saw:
>>
>> kernel: BTRFS error (device sdb): parent transid verify failed on 1721409388544 wanted 19188 found 83121
>>
>> and sure enough, one of the subvolumes on my main filesystem is corrupted. By corrupted I mean it can’t be accessed, deleted, or even looked at:
>>
>> ls -l
>> kernel: BTRFS error (device sdb): parent transid verify failed on 1721409388544 wanted 19188 found 83121
>> kernel: BTRFS error (device sdb): parent transid verify failed on 1721409388544 wanted 19188 found 83121
>> ls: cannot access /mnt2/Primary/Movies: Input/output error
>>
>> total 16
>> drwxr-xr-x 1 root      root         100 Dec 29 02:00 .
>> drwxr-xr-x 1 root      root         208 Jan  3 12:05 ..
>> drwxr-x--- 1 kbogert   root         698 Feb  6 08:49 Documents
>> drwxr-xrwx 1 root      root         916 Jan  3 12:54 Games
>> drwxr-xrwx 1 xenserver xenserver   2904 Jan  3 12:54 ISO
>> d????????? ? ?         ?              ?            ? Movies
>> drwxr-xrwx 1 root      root      139430 Jan  3 12:53 Music
>> drwxr-xrwx 1 root      root       82470 Jan  3 12:53 RawPhotos
>> drwxr-xr-x 1 root      root          80 Jan  1 04:00 .snapshots
>> drwxr-xrwx 1 root      root          72 Jan  3 13:07 VMs
>>
>> The input/output error is given for any operation on Movies.
>>
>> Luckily there has been no data loss that I am aware of. As it turns out, I have a snapshot of the Movies subvolume taken a few days before the incident. I was able to simply cp -a all files off of the entire filesystem with no reported errors, and I verified a handful of them. Note that the transid error in dmesg alternates between sdb and sda5 after each startup.
>>
>>
>> SETUP DETAILS
>>
>> uname -a
>> Linux ironmountain 4.8.7-1.el7.elrepo.x86_64 #1 SMP Thu Nov 10 20:47:24 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
>>
>> btrfs --version
>> btrfs-progs v4.8.3
>>
>> btrfs dev scan
>> kernel: BTRFS: device label Primary devid 1 transid 83461 /dev/sdb
>> kernel: BTRFS: device label Primary devid 2 transid 83461 /dev/sda5
>>
>> btrfs fi show /mnt2/Primary
>> Label: 'Primary'  uuid: 21e09dd8-a54d-49ec-95cb-93fdd94f0c17
>>     Total devices 2 FS bytes used 943.67GiB
>>     devid 1 size 2.73TiB used 947.06GiB path /dev/sdb
>>     devid 2 size 2.70TiB used 947.06GiB path /dev/sda5
>>
>> btrfs dev usage /mnt2/Primary
>> /dev/sda5, ID: 2
>>     Device size:      2.70TiB
>>     Device slack:       0.00B
>>     Data,RAID1:     944.00GiB
>>     Metadata,RAID1:   3.00GiB
>>     System,RAID1:    64.00MiB
>>     Unallocated:      1.77TiB
>>
>> /dev/sdb, ID: 1
>>     Device size:      2.73TiB
>>     Device slack:       0.00B
>>     Data,RAID1:     944.00GiB
>>     Metadata,RAID1:   3.00GiB
>>     System,RAID1:    64.00MiB
>>     Unallocated:      1.80TiB
>>
>> btrfs fi df /mnt2/Primary
>> Data, RAID1: total=944.00GiB, used=942.60GiB
>> System, RAID1: total=64.00MiB, used=176.00KiB
>> Metadata, RAID1: total=3.00GiB, used=1.07GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> This server sees very light use; however, I do have a number of VMs in the VMs filesystem, exported over NFS, that are used by a Xenserver. These are not marked nocow, though they probably should have been; what I should have done is sketched just below. At the time of restart no VMs were running.
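>>
>> (As far as I understand chattr on btrfs, and this is untested on my part, +C on a directory only applies to files created in it afterwards, so something like the following would have had to be done before the images landed there:
>>
>> chattr +C /mnt2/Primary/VMs    # files created here from now on inherit No_COW
>> lsattr -d /mnt2/Primary/VMs    # verify the C attribute is set
>>
>> and then the existing VM images re-copied in fresh so they actually lose COW.)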
>>
>> I have deviated from Rockstor’s default setup a bit. Rockstor takes an “appliance” view and tries to enforce btrfs filesystems that span whole disks. I installed Rockstor onto /dev/sda4, created the Primary pool on /dev/sdb using Rockstor’s GUI, and then on the command line added /dev/sda5 to it and converted to raid1 (commands sketched below).
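>>
>> The conversion was the standard add-then-balance sequence. This is from memory, so treat it as approximate:
>>
>> btrfs device add /dev/sda5 /mnt2/Primary
>> btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt2/Primary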
>>
>> As far as I can tell, Rockstor is just CentOS 7 with a few updated utilities and a bunch of Python scripts providing a web interface to btrfs-progs. I have it set up to take monthly snapshots and do monthly scrubs, with the exception of the Documents subvolume, which gets daily snapshots. These are all readonly and go in the .snapshots directory. Rockstor automatically deletes old snapshots once a limit is reached (7 daily snapshots, for instance).
>>
>> Side note: btrfs-progs 4.8.3 apparently has problems with CentOS 7’s glibc: https://github.com/rockstor/rockstor-core/issues/1608 . I have confirmed that bug in my own compiled version of 4.8.3, and that 4.9.1 does not have it.
>>
>>
>> WHAT I’VE TRIED AND RESULTS
>>
>> First off, I have created an image with btrfs-image that I can make available (though it is large; I believe it was a few GBs, and the filesystem is 3 TB). The invocation is sketched below.
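>>
>> It was something close to this, again from memory; the compression level, thread count, and output filename are guesses at what I used, and I did not sanitize names with -s:
>>
>> btrfs-image -c9 -t4 /dev/sdb /mnt/usb/primary.img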
>>
>> * btrfs-zero-log
>> This had no discernible effect.
>>
>> * At this point I compiled btrfs-progs 4.9.1. The following commands were run with that version:
>>
>> * btrfs check
>> This exits in an assert fairly quickly:
>>
>> checking extents
>> cmds-check.c:5406: check_owner_ref: BUG_ON `rec->is_root` triggered, value 1
>> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x42139b]
>> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x421483]
>> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x430529]
>> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x43160c]
>> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x435d6f]
>> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x43ab71]
>> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x43b065]
>> /mnt/usb/btrfs-progs-bin/bin/btrfs(cmd_check+0xbbc)[0x441b82]
>> /mnt/usb/btrfs-progs-bin/bin/btrfs(main+0x12b)[0x40a734]
>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7ffff6fa7b35]
>> /mnt/usb/btrfs-progs-bin/bin/btrfs[0x40a179]
>>
>> The full backtrace is attached as btrfsck_debug.log.
>>
>> * btrfs check --mode=lowmem
>> This outputs a large number of errors before finally segfaulting. The full backtrace is attached as btrfsck_lowmem_debug.log.
>>
>> * btrfs scrub
>> This completes with no errors.
>>
>> * Memtest86 completed more than 6 passes with no errors (I left it running for a day).
>>
>> * No SMART errors, and btrfs device stats shows no errors. The drives the filesystem is on are brand new.
>>
>> * I have tried to recreate the problem by installing Rockstor into a number of VMs and redoing my steps, with no luck.
>>
>> Neither the main Rockstor partition (btrfs) nor the other Raid1 filesystem, which is on completely separate drives, was affected. I can provide any other logs requested.
>>
>> Help would be greatly appreciated!
>>
>> Kenneth Bogert
>>
>> <btrfsck_lowmem_debug.log><btrfsck_debug.log>
>
> As a small update to this problem, here is the output of btrfs subvolume list (run with 4.9.1). Oddly, the snapshot of the Movies subvolume is at gen 73808, while Movies itself is at gen 19188, which matches the “wanted 19188” in the transid error:
>
> ID 259 gen 83464 cgen 39 parent 5 top level 5 parent_uuid - path Music
> ID 260 gen 19188 cgen 40 parent 5 top level 5 parent_uuid - path Movies
> ID 261 gen 73808 cgen 41 parent 5 top level 5 parent_uuid - path ISO
> ID 262 gen 73864 cgen 42 parent 5 top level 5 parent_uuid - path RawPhotos
> ID 263 gen 83456 cgen 44 parent 5 top level 5 parent_uuid - path VMs
> ID 601 gen 73810 cgen 356 parent 5 top level 5 parent_uuid - path Games
> ID 882 gen 83462 cgen 526 parent 5 top level 5 parent_uuid - path Documents
> ID 2104 gen 44513 cgen 44513 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_1
> ID 2111 gen 55190 cgen 55190 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_weekly_201701220542
> ID 2121 gen 68569 cgen 68569 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_weekly_201701290542
> ID 2122 gen 68593 cgen 68593 parent 5 top level 5 parent_uuid 4e131f43-6ccb-7449-89ed-0d00b761cb08 path .snapshots/VMs/VMs_201701290600
> ID 2124 gen 71873 cgen 71873 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201701310400
> ID 2125 gen 73705 cgen 73705 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702010400
> ID 2126 gen 73808 cgen 73808 parent 5 top level 5 parent_uuid 1d82b662-f291-b340-9424-804fa431a03b path .snapshots/ISO/ISO_201702010500
> ID 2127 gen 73808 cgen 73808 parent 5 top level 5 parent_uuid 915e8022-4cf3-084b-8ac6-504822a168c4 path .snapshots/Movies/movies_201702010500
> ID 2128 gen 73810 cgen 73810 parent 5 top level 5 parent_uuid adcb63c8-ee55-8b49-8f7a-aed491aab7e6 path .snapshots/Games/games_201702010500
> ID 2129 gen 73811 cgen 73811 parent 5 top level 5 parent_uuid e23f7432-fc89-c849-a2f2-4280cefabcf7 path .snapshots/Music/music_201702010500
> ID 2130 gen 73864 cgen 73864 parent 5 top level 5 parent_uuid 67dc081c-cf8e-a444-8c8f-7899865e2f08 path .snapshots/RawPhotos/rawphotos_201702010530
> ID 2131 gen 73865 cgen 73865 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_monthly_201702010530
> ID 2132 gen 73920 cgen 73920 parent 5 top level 5 parent_uuid 4e131f43-6ccb-7449-89ed-0d00b761cb08 path .snapshots/VMs/VMs_201702010600
> ID 2133 gen 75516 cgen 75516 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702020400
> ID 2134 gen 77397 cgen 77397 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702030400
> ID 2135 gen 79229 cgen 79229 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702040400
> ID 2136 gen 81109 cgen 81109 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702050400
> ID 2137 gen 81246 cgen 81246 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_weekly_201702050542
> ID 2138 gen 81273 cgen 81273 parent 5 top level 5 parent_uuid 4e131f43-6ccb-7449-89ed-0d00b761cb08 path .snapshots/VMs/VMs_201702050600
> ID 2139 gen 82966 cgen 82966 parent 5 top level 5 parent_uuid 212f71b3-21a2-274c-b080-86f262f50ccb path .snapshots/Documents/documents_daily_201702060400
>
> Kenneth Bogert

Is anyone interested in this problem? If not, I’m planning on rebuilding this filesystem this weekend; my rough plan is in the postscript below.

Kenneth Bogert
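
P.S. The rebuild, roughly sketched. None of this is tested yet, /backup stands in for wherever the copied data actually lives, and the device names assume the current layout:

mkfs.btrfs -f -L Primary -d raid1 -m raid1 /dev/sdb /dev/sda5
mount /dev/sdb /mnt2/Primary
btrfs subvolume create /mnt2/Primary/VMs     # one per subvolume (Movies, Music, ...)
chattr +C /mnt2/Primary/VMs                  # nocow this time, before any images are copied in
cp -a /backup/VMs/. /mnt2/Primary/VMs/       # restore from the copies made earlier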
