On Sat, 26 Mar 2016 20:30:35 +0100, Kai Krakow <hurikha...@gmail.com> wrote:

> On Wed, 23 Mar 2016 12:16:24 +0800,
> Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
>
> > Kai Krakow wrote on 2016/03/22 19:48 +0100:
> > > On Tue, 22 Mar 2016 16:47:10 +0800,
> > > Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
> > >
> > > > [...]
> > [...]
> [...]
> > >
> > > Apparently, that system does not boot now due to errors in the bcache
> > > b-tree. That being so, it may well be some bcache error and not
> > > btrfs' fault. Unfortunately I couldn't catch the output, I was
> > > in a hurry. It said "write error" and had some backtrace. I will
> > > come back to this later.
> > >
> > > Let's go to the system I currently care about (the one with the
> > > always breaking VDI file):
> > >
> > [...]
> [...]
> > >
> > > After the error occurred?
> > >
> > > Yes, some text about the extent being compressed and that btrfs repair
> > > doesn't currently handle that case (I tried --repair as I
> > > have a backup). I simply decided not to investigate that
> > > further at that point but to delete and restore the affected file
> > > from backup. However, this is the message from dmesg (though I
> > > didn't catch the backtrace):
> > >
> > > btrfs_run_delayed_refs:2927: errno=-17 Object already exists
> >
> > That's nice, at least we have some clue.
> >
> > It's almost certainly a bug either in the btrfs kernel code, which
> > doesn't handle delayed refs well (low possibility), or a corrupted fs
> > which creates something the kernel can't handle (I bet that's the case).
>
> [kernel 4.5.0 gentoo, btrfs-progs 4.4.1]
>
> Well, this time it hit me on the USB backup drive, which uses no bcache
> and no other fancy options except compress-force=zlib. Unfortunately,
> I've only got a (real) screenshot, which I'm going to link here:
>
> https://www.dropbox.com/s/9qbc7np23y8lrii/IMG_20160326_200033.jpg?dl=0
>
> The same drive has no problems except "bad metadata crossing stripe
> boundary" - but a lot of them. This drive was never converted, it was
> freshly generated several months ago.
> [...]

I finally got copy&paste data:

# before mounting let's check the FS:

$ sudo btrfsck /dev/disk/by-label/usb-backup
Checking filesystem on /dev/disk/by-label/usb-backup
UUID: 1318ec21-c421-4e36-a44a-7be3d41f9c3f
checking extents
bad metadata [156041216, 156057600) crossing stripe boundary
bad metadata [181403648, 181420032) crossing stripe boundary
bad metadata [392167424, 392183808) crossing stripe boundary
bad metadata [783482880, 783499264) crossing stripe boundary
bad metadata [784924672, 784941056) crossing stripe boundary
bad metadata [130151612416, 130151628800) crossing stripe boundary
bad metadata [162826813440, 162826829824) crossing stripe boundary
bad metadata [162927083520, 162927099904) crossing stripe boundary
bad metadata [619740659712, 619740676096) crossing stripe boundary
bad metadata [619781947392, 619781963776) crossing stripe boundary
bad metadata [619795644416, 619795660800) crossing stripe boundary
bad metadata [619816091648, 619816108032) crossing stripe boundary
bad metadata [620011388928, 620011405312) crossing stripe boundary
bad metadata [890992459776, 890992476160) crossing stripe boundary
bad metadata [891022737408, 891022753792) crossing stripe boundary
bad metadata [891101773824, 891101790208) crossing stripe boundary
bad metadata [891301199872, 891301216256) crossing stripe boundary
bad metadata [1012219314176, 1012219330560) crossing stripe boundary
bad metadata [1017202409472, 1017202425856) crossing stripe boundary
bad metadata [1017365397504, 1017365413888) crossing stripe boundary
bad metadata [1020764422144, 1020764438528) crossing stripe boundary
bad metadata [1251103342592, 1251103358976) crossing stripe boundary
bad metadata [1251144695808, 1251144712192) crossing stripe boundary
bad metadata [1251147055104, 1251147071488) crossing stripe boundary
bad metadata [1259271225344, 1259271241728) crossing stripe boundary
bad metadata [1266223611904, 1266223628288) crossing stripe boundary
bad metadata [1304750063616, 1304750080000) crossing stripe boundary
bad metadata [1304790106112, 1304790122496) crossing stripe boundary
bad metadata [1304850792448, 1304850808832) crossing stripe boundary
bad metadata [1304869928960, 1304869945344) crossing stripe boundary
bad metadata [1305089540096, 1305089556480) crossing stripe boundary
bad metadata [1309561651200, 1309561667584) crossing stripe boundary
bad metadata [1309581443072, 1309581459456) crossing stripe boundary
bad metadata [1309583671296, 1309583687680) crossing stripe boundary
bad metadata [1309942808576, 1309942824960) crossing stripe boundary
bad metadata [1310050549760, 1310050566144) crossing stripe boundary
bad metadata [1313031585792, 1313031602176) crossing stripe boundary
bad metadata [1313232912384, 1313232928768) crossing stripe boundary
bad metadata [1555210764288, 1555210780672) crossing stripe boundary
bad metadata [1555395182592, 1555395198976) crossing stripe boundary
bad metadata [2050576744448, 2050576760832) crossing stripe boundary
bad metadata [2050803957760, 2050803974144) crossing stripe boundary
bad metadata [2050969108480, 2050969124864) crossing stripe boundary
checking free space tree
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 1860217443214 bytes used err is 0
total csum bytes: 1805105116
total tree bytes: 11793776640
total fs tree bytes: 8220835840
total extent tree bytes: 1443315712
btree space waste bytes: 2307850845
file data blocks allocated: 2137151094784
 referenced 2706830905344

# now let's wait for the backup to mount the FS and look at dmesg:

[21375.606479] BTRFS info (device sde1): force zlib compression
[21375.606483] BTRFS info (device sde1): using free space tree
[21375.606485] BTRFS: has skinny extents
[21383.710725] BTRFS: checking UUID tree
[21388.531267] ------------[ cut here ]------------
[21388.531279] WARNING: CPU: 0 PID: 27085 at fs/btrfs/extent-tree.c:2946 btrfs_run_delayed_refs+0x279/0x2b0()
[21388.531281] BTRFS: Transaction aborted (error -17)
[21388.531282] Modules linked in: nvidia_drm(PO) uas usb_storage vboxnetadp(O) vboxnetflt(O) vboxdrv(O) nvidia_modeset(PO) nvidia(PO)
[21388.531293] CPU: 0 PID: 27085 Comm: kworker/u8:3 Tainted: P O 4.5.0-gentoo #1
[21388.531295] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z68 Pro3, BIOS L2.16A 02/22/2013
[21388.531300] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
[21388.531302]  0000000000000000 ffffffff8159eaa9 ffff88029d723d68 ffffffff81ea1cf6
[21388.531306]  ffffffff810c6e37 ffff880407be5960 ffff88029d723db8 0000000000000020
[21388.531308]  ffff8802bd886510 ffff8803cc846d00 ffffffff810c6eb7 ffffffff81e8a088
[21388.531311] Call Trace:
[21388.531317]  [<ffffffff8159eaa9>] ? dump_stack+0x46/0x5d
[21388.531322]  [<ffffffff810c6e37>] ? warn_slowpath_common+0x77/0xb0
[21388.531326]  [<ffffffff810c6eb7>] ? warn_slowpath_fmt+0x47/0x50
[21388.531329]  [<ffffffff814923d9>] ? btrfs_run_delayed_refs+0x279/0x2b0
[21388.531332]  [<ffffffff8149243d>] ? delayed_ref_async_start+0x2d/0x70
[21388.531337]  [<ffffffff810da8c3>] ? process_one_work+0x133/0x350
[21388.531340]  [<ffffffff810dae25>] ? worker_thread+0x45/0x450
[21388.531343]  [<ffffffff810dade0>] ? rescuer_thread+0x300/0x300
[21388.531347]  [<ffffffff810df3f8>] ? kthread+0xb8/0xd0
[21388.531350]  [<ffffffff810e301f>] ? finish_task_switch+0x6f/0x1f0
[21388.531352]  [<ffffffff810df340>] ? kthread_park+0x50/0x50
[21388.531356]  [<ffffffff81bdb31f>] ? ret_from_fork+0x3f/0x70
[21388.531359]  [<ffffffff810df340>] ? kthread_park+0x50/0x50
[21388.531361] ---[ end trace 69a4c78997ef63b2 ]---
[21388.531364] BTRFS: error (device sde1) in btrfs_run_delayed_refs:2946: errno=-17 Object already exists
[21388.531367] BTRFS info (device sde1): forced readonly

This FS has been okay just a few reboots earlier (except the stripe
boundary thing).

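To make the "crossing stripe boundary" messages easier to reason about, here
is a minimal sketch of what such a test could look like. It assumes a stripe
length of 64 KiB, and the names STRIPE_LEN and crosses_stripe are made up for
illustration. Whether btrfsck 4.4.1 applies exactly this test is an assumption
on my side, not something taken from its sources.

```python
# Assumption: one stripe is 64 KiB long. The value and all names here are
# illustrative and not copied from btrfs-progs.
STRIPE_LEN = 64 * 1024


def crosses_stripe(start, end):
    """Return True if the half-open byte range [start, end) touches more
    than one stripe of STRIPE_LEN bytes."""
    first_stripe = start // STRIPE_LEN
    last_stripe = (end - 1) // STRIPE_LEN
    return first_stripe != last_stripe


# The first three ranges reported by btrfsck above.
reported = [
    (156041216, 156057600),
    (181403648, 181420032),
    (392167424, 392183808),
]

for start, end in reported:
    print(
        "[%d, %d) len=%d offset_in_stripe=%d crosses=%s"
        % (start, end, end - start, start % STRIPE_LEN, crosses_stripe(start, end))
    )
```

Under this assumption, each of these three ranges is 16384 bytes long (matching
nodesize), starts exactly on a 64 KiB boundary, and does not cross into the next
stripe. So either the check in btrfsck works differently from this sketch, or
these particular messages deserve a second look.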
History of kernel versions used on the system (back to the date when I
think the FS was still clean/not corrupted):

Wed Aug  5 08:12:01 2015 >>> sys-kernel/gentoo-sources-4.1.4
Sat Aug 15 19:04:34 2015 >>> sys-kernel/gentoo-sources-4.1.5
Sat Aug 22 11:10:53 2015 >>> sys-kernel/gentoo-sources-4.1.6
Tue Sep  1 00:03:38 2015 >>> sys-kernel/gentoo-sources-4.2.0
Sat Sep  5 17:45:11 2015 >>> sys-kernel/gentoo-sources-4.2.0-r1
Wed Sep 30 04:55:55 2015 >>> sys-kernel/gentoo-sources-4.2.2
Tue Oct  6 22:55:19 2015 >>> sys-kernel/gentoo-sources-4.2.3
Sun Oct 25 03:45:03 2015 >>> sys-kernel/gentoo-sources-4.2.4
Wed Nov  4 04:21:38 2015 >>> sys-kernel/gentoo-sources-4.2.5
Mon Nov  9 21:36:31 2015 >>> sys-kernel/gentoo-sources-4.3.0
Sat Nov 14 10:48:03 2015 >>> sys-kernel/gentoo-sources-4.2.6
Sun Dec 13 13:03:12 2015 >>> sys-kernel/gentoo-sources-4.2.7
Tue Dec 15 22:37:53 2015 >>> sys-kernel/gentoo-sources-4.2.8
Tue Dec 22 14:37:01 2015 >>> sys-kernel/gentoo-sources-4.3.3
Fri Jan 22 20:38:32 2016 >>> sys-kernel/gentoo-sources-4.3.3-r1
Wed Jan 27 09:27:57 2016 >>> sys-kernel/gentoo-sources-4.3.4
Sun Jan 31 10:25:00 2016 >>> sys-kernel/gentoo-sources-4.4.0-r1
Mon Feb  1 20:40:17 2016 >>> sys-kernel/gentoo-sources-4.4.1
Fri Feb 19 22:11:45 2016 >>> sys-kernel/gentoo-sources-4.4.2
Sat Feb 27 15:10:45 2016 >>> sys-kernel/gentoo-sources-4.4.3
Sun Mar  6 14:58:05 2016 >>> sys-kernel/gentoo-sources-4.4.4
Sat Mar 12 19:21:15 2016 >>> sys-kernel/gentoo-sources-4.4.5
Sat Mar 19 06:13:09 2016 >>> sys-kernel/gentoo-sources-4.4.6
Sun Mar 20 20:45:21 2016 >>> sys-kernel/gentoo-sources-4.5.0

I only saw unreliable behavior with 4.4.5, 4.4.6, and 4.5.0, though the
problem may have existed in my FS for longer.

$ sudo btrfs-show-super /dev/sde1
superblock: bytenr=65536, device=/dev/sde1
---------------------------------------------------------
csum                    0xcc976d97 [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    1318ec21-c421-4e36-a44a-7be3d41f9c3f
label                   usb-backup
generation              50814
root                    1251250159616
sys_array_size          129
chunk_root_generation   50784
root_level              1
chunk_root              2516518567936
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             2000397864960
bytes_used              1860398493696
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x1
incompat_flags          0x169
                        ( MIXED_BACKREF |
                          COMPRESS_LZO |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA )
csum_type               0
csum_size               4
cache_generation        50208
uuid_tree_generation    50742
dev_item.uuid           9008d5a0-ac7b-4505-8193-27428429f953
dev_item.fsid           1318ec21-c421-4e36-a44a-7be3d41f9c3f [match]
dev_item.type           0
dev_item.total_bytes    2000397864960
dev_item.bytes_used     1912308039680
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0

BTW: btrfsck reports the space tree as invalid every time it is run, no
matter whether the FS was cleanly unmounted, uncleanly unmounted, or
whether "btrfsck --repair" was run first and btrfsck then run a second
time.

-- 
Regards,
Kai

Replies to list-only preferred.