BTRFS kernel errors in 3.15.5
Hi,

This happens systematically these days, while trying to back up two of my laptops with rsync onto an external HD - BTRFS formatted.

First at mount:

juil. 19 09:08:32 zafu kernel: BTRFS info (device dm-3): disk space caching is enabled
juil. 19 09:08:38 zafu kernel: pool[7215]: segfault at 58252fd0 ip 7fd06e466ac7 sp 7fd062ffbf90 error 4 in libc-2.19.so[7fd06e41e000+1a4000]

Happening on 2 different machines (both Arch Linux with latest kernel), I suspected the external HD, but it spent the night passing all long SMART tests plus a btrfs scrub, all with flying colors...

After the following error, all accesses to the external HD lock up until a forced power off of the whole system...

juil. 19 09:12:57 zafu kernel: INFO: task kworker/u4:7:6677 blocked for more than 120 seconds.
juil. 19 09:12:57 zafu kernel: Tainted: G O 3.15.5-2-ARCH #1
juil. 19 09:12:57 zafu kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
juil. 19 09:12:57 zafu kernel: kworker/u4:7D 0 6677 2 0x
juil. 19 09:12:57 zafu kernel: Workqueue: btrfs-flush_delalloc normal_work_helper [btrfs]
juil. 19 09:12:57 zafu kernel: 88001e9dfb48 0046 8800879f1460 00014700
juil. 19 09:12:57 zafu kernel: 88001e9dffd8 00014700 8800879f1460 a0262692
juil. 19 09:12:57 zafu kernel: 0050 88001e9dfbc4 88001e9dfac8
juil. 19 09:12:57 zafu kernel: Call Trace:
juil. 19 09:12:57 zafu kernel: [a0262692] ? run_delalloc_range+0x192/0x350 [btrfs]
juil. 19 09:12:57 zafu kernel: [8113f880] ? filemap_fdatawait+0x30/0x30
juil. 19 09:12:57 zafu kernel: [81509fa9] schedule+0x29/0x70
juil. 19 09:12:57 zafu kernel: [8150a294] io_schedule+0x94/0xf0
juil. 19 09:12:57 zafu kernel: [8113f88e] sleep_on_page+0xe/0x20
juil. 19 09:12:57 zafu kernel: [8150a788] __wait_on_bit_lock+0x48/0xb0
juil. 19 09:12:57 zafu kernel: [8113f9e8] __lock_page+0x78/0x90
juil. 19 09:12:57 zafu kernel: [810b1c80] ? autoremove_wake_function+0x40/0x40
juil. 19 09:12:57 zafu kernel: [a027ac98] extent_write_cache_pages.isra.28.constprop.45+0x258/0x3a0 [btrfs]
juil. 19 09:12:57 zafu kernel: [810a714c] ? update_curr+0xec/0x1b0
juil. 19 09:12:57 zafu kernel: [810a3068] ? __enqueue_entity+0x78/0x80
juil. 19 09:12:57 zafu kernel: [810a902e] ? enqueue_entity+0x24e/0xaa0
juil. 19 09:12:57 zafu kernel: [a027c20c] extent_writepages+0x5c/0x90 [btrfs]
juil. 19 09:12:57 zafu kernel: [a025e9d0] ? __btrfs_submit_bio_start_direct_io+0x40/0x40 [btrfs]
juil. 19 09:12:57 zafu kernel: [a025d6e8] btrfs_writepages+0x28/0x30 [btrfs]
juil. 19 09:12:57 zafu kernel: [8114d1de] do_writepages+0x1e/0x30
juil. 19 09:12:57 zafu kernel: [8114139d] __filemap_fdatawrite_range+0x5d/0x80
juil. 19 09:12:57 zafu kernel: [8114140c] filemap_flush+0x1c/0x20
juil. 19 09:12:57 zafu kernel: [a026046a] btrfs_run_delalloc_work+0x5a/0xa0 [btrfs]
juil. 19 09:12:57 zafu kernel: [a028aa77] normal_work_helper+0x77/0x350 [btrfs]
juil. 19 09:12:57 zafu kernel: [810861d8] process_one_work+0x168/0x450
juil. 19 09:12:57 zafu kernel: [81086c32] worker_thread+0x132/0x3e0
juil. 19 09:12:57 zafu kernel: [81086b00] ? manage_workers.isra.23+0x2d0/0x2d0
juil. 19 09:12:57 zafu kernel: [8108d43a] kthread+0xea/0x100
juil. 19 09:12:57 zafu kernel: [8108d350] ? kthread_create_on_node+0x1b0/0x1b0
juil. 19 09:12:57 zafu kernel: [81515efc] ret_from_fork+0x7c/0xb0
juil. 19 09:12:57 zafu kernel: [8108d350] ? kthread_create_on_node+0x1b0/0x1b0

--
Swâmi Petaramesh sw...@petaramesh.org http://petaramesh.org PGP 9076E32E
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
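Editorial aside: several messages in this thread hinge on spotting the kernel's hung-task report ("blocked for more than 120 seconds") in a log. The following is only a minimal sketch of how such lines could be picked out of saved log text; the function name `find_hung_tasks` and the regular expression are illustrative assumptions, not part of any kernel or btrfs tooling.

```python
import re

# Matches the hung-task report quoted in this thread, e.g.
# "INFO: task kworker/u4:7:6677 blocked for more than 120 seconds."
# The task name may itself contain colons, so the pid is taken to be the
# last run of digits directly before " blocked".
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task report in log_text."""
    hits = []
    for line in log_text.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            hits.append(
                (match.group("comm"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


# Two lines copied from the log above.
sample = (
    "juil. 19 09:12:57 zafu kernel: INFO: task kworker/u4:7:6677 blocked for more than 120 seconds.\n"
    "juil. 19 09:12:57 zafu kernel: Workqueue: btrfs-flush_delalloc normal_work_helper [btrfs]\n"
)
print(find_hung_tasks(sample))
```

Feeding it the output of `dmesg` or a saved journal would list which tasks were reported as blocked; it says nothing about why they blocked.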
Re: BTRFS kernel errors in 3.15.5
Swâmi Petaramesh posted on Sat, 19 Jul 2014 09:19:13 +0200 as excerpted:

> This happens systematically these days, while trying to backup using rsync two of my laptops on an external HD - BTRFS formatted.

While I've not seen it, based on reports here on the list, there's apparently a known issue with rsync on btrfs for 3.15 so far. It seems to trigger most often on btrfs filesystems with compression enabled, though I'm not sure if compression is required to trigger it or if it simply happens more often with it.

I'm not sure of the status, but Chris Mason mentioned that he was investigating and trying to reproduce, so they're definitely actively working on it.

People seem to have better luck either reverting to the latest 3.14 series kernel for the moment, or updating to the latest 3.16-rc or live-git build, with 3.16 of course getting close to release now. I'd move forward as I know they're always fixing stuff and I routinely test kernel rcs anyway, but Marc Merlin (one of the people reporting this bug), at least, has moved back to 3.14 instead, as he says that works well for him.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman
Re: ENOSPC errors during balance
The 2nd dmesg (didn't look at the 1st) has many instances like this:

[96241.882138] ata2.00: exception Emask 0x1 SAct 0x7ffe0fff SErr 0x0 action 0x6 frozen
[96241.882139] ata2.00: Ata error. fis:0x21
[96241.882142] ata2.00: failed command: READ FPDMA QUEUED
[96241.882148] ata2.00: cmd 60/08:00:68:0a:2d/00:00:18:00:00/40 tag 0 ncq 4096 in
         res 41/00:58:40:5c:2c/00:00:18:00:00/40 Emask 0x1 (device error)

I'm not sure what this error is, it acts like an unrecoverable read error but I'm not seeing UNC reported. It looks like ata2.00 is sdb, which is a member of a btrfs raid10 volume. So this isn't related to your sdg2 and enospc error, it's a different problem.

I'm not sure of the reason for the

BTRFS info (device sdg2): 2 enospc errors during balance

but it seems informational rather than either a warning or problem. I'd treat ext4->btrfs converted file systems as something of an odd duck, in that it's uncommon, therefore isn't getting as much testing, and extra caution is a good idea. Make frequent backups.

Chris Murphy
Re: Blocked tasks on 3.15.1
On Thu, Jul 17, 2014 at 8:18 AM, Chris Mason c...@fb.com wrote:
>
> [ deadlocks during rsync in 3.15 with compression enabled ]
>
> Hi everyone,
>
> I still haven't been able to reproduce this one here, but I'm going through a series of tests with lzo compression forced and every operation forced to ordered. Hopefully it'll kick it out soon.
>
> While I'm hammering away, could you please try this patch. If this is the bug you're hitting, the deadlock will go away and you'll see this printk in the log.
>
> thanks!
>
> -chris
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 3668048..8ab56df 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -8157,6 +8157,13 @@ void btrfs_destroy_inode(struct inode *inode)
>  		spin_unlock(&root->fs_info->ordered_root_lock);
>  	}
>  
> +	spin_lock(&root->fs_info->ordered_root_lock);
> +	if (!list_empty(&BTRFS_I(inode)->ordered_operations)) {
> +		list_del_init(&BTRFS_I(inode)->ordered_operations);
> +printk(KERN_CRIT "racing inode deletion with ordered operations!!!\n");
> +	}
> +	spin_unlock(&root->fs_info->ordered_root_lock);
> +
>  	if (test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
>  		     &BTRFS_I(inode)->runtime_flags)) {
>  		btrfs_info(root->fs_info, "inode %llu still on the orphan list",

Thanks Chris.

Running 3.15.6 with this patch applied on top:
 - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/ /home/nyx/`
 - no extra error messages printed (`dmesg | grep racing`) compared to without the patch

To recap some details (so I can have it all in one place):
 - /home/ is btrfs with compress=lzo
 - /mnt/home is btrfs with no compression enabled.
 - I have _not_ created any nodatacow files.
 - Both filesystems are on different physical disks.
 - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others mentioning the use of dmcrypt)
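Editorial aside: since the reports in this thread keep coming back to whether compression is enabled, here is a minimal sketch of reading that from mount-table text. It assumes the usual /proc/mounts field order (device, mount point, type, options); the function name `btrfs_compression` and the sample device names are made up for illustration and do not describe anyone's actual setup.

```python
def btrfs_compression(mounts_text):
    """Map each btrfs mount point to its compression mount option, or None."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expect: device, mount point, filesystem type, options, ...
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        mountpoint, options = fields[1], fields[3].split(",")
        compress = [
            opt
            for opt in options
            if opt in ("compress", "compress-force")
            or opt.startswith(("compress=", "compress-force="))
        ]
        result[mountpoint] = compress[0] if compress else None
    return result


# Hypothetical /proc/mounts excerpt mirroring the recap above
# (device names are invented for the example).
sample = (
    "/dev/mapper/vg-home /home btrfs rw,relatime,compress=lzo,space_cache 0 0\n"
    "/dev/mapper/vg-old /mnt/home btrfs rw,relatime,space_cache 0 0\n"
    "/dev/sda1 /boot ext4 rw,relatime 0 0\n"
)
print(btrfs_compression(sample))
```

On a live system the input could come from `open("/proc/mounts").read()`. Note that a mount option only reflects the current default; files written earlier under a different option keep whatever compression they were written with.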
Re: BTRFS hang with 3.16-rc5
On Friday, 18 July 2014, 09:36:06, Chris Mason wrote:
> On 07/18/2014 03:51 AM, Martin Steigerwald wrote:
>> On Tuesday, 15 July 2014, 09:21:40, Chris Mason wrote:
>>> On 07/14/2014 05:58 PM, Martin Steigerwald wrote:
>>>> On Monday, 14 July 2014, 16:12:22, Chris Mason wrote:
>>>>> On 07/14/2014 11:10 AM, Martin Steigerwald wrote:
>>>>>> On Monday, 14 July 2014, 17:04:22, you wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> While with 3.16-rc3 and rc4 I didn't have a BTRFS hang in several days of usage, with 3.16-rc5 I had a hang again. Less than an hour after booting it.
>>>>>>>
>>>>>>> Since the hang bug I and others had with 3.15 and up to 3.16-rc2 usually didn't happen that quickly after boot, and since the backtrace looks a bit different from what I have in memory, I post this in a new thread.
>>>>>>>
>>>>>>> See thread "Blocked tasks on 3.15.1" for a discussion of previous hang issues.
>>>>>>
>>>>>> Probably good to add some basic information on the filesystem:
>>>>>
>>>>> Do you have compression enabled? I wasn't able to nail down the 3.15.1 hang before vacation attacked me, but I'm hoping to track it down today.
>>>>
>>>> Yes. I have.
>>>>
>>>> It just hung again while I was playing PlaneShift. Back to 3.16-rc4 as rc5 seems to be broke here.
>>>
>>> The btrfs hang you're hitting goes back to 3.15. So 3.16-rc4 vs rc5 shouldn't be a factor. Are you hitting other problems with 3.16?
>>
>> On this system it does matter.
>>
>> 3.16-rc5: Two hangs in one day
>>
>> 3.16-rc4: No hang so far with three days uptime (well, with hibernation cycles in between)
>>
>> So easy observation for me: 3.16-rc4 fine, 3.16-rc5 broke.
> Can you please try this patch on rc5 and look for the printk:
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 3668048..8ab56df 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -8157,6 +8157,13 @@ void btrfs_destroy_inode(struct inode *inode)
>  		spin_unlock(&root->fs_info->ordered_root_lock);
>  	}
>  
> +	spin_lock(&root->fs_info->ordered_root_lock);
> +	if (!list_empty(&BTRFS_I(inode)->ordered_operations)) {
> +		list_del_init(&BTRFS_I(inode)->ordered_operations);
> +printk(KERN_CRIT "racing inode deletion with ordered operations!!!\n");
> +	}
> +	spin_unlock(&root->fs_info->ordered_root_lock);
> +
>  	if (test_bit(BTRFS_INODE_HAS_ORPHAN_ITEM,
>  		     &BTRFS_I(inode)->runtime_flags)) {
>  		btrfs_info(root->fs_info, "inode %llu still on the orphan list",

Did so and again got a hang. No racing inodes though:

merkaba:/boot> zgrep -i "racing inode" /var/log/syslog*
merkaba:/boot#1>

Built kernel seems right:

martin@merkaba:[…]> LANG=C grep -ir "racing inode" fs/btrfs
fs/btrfs/inode.c:printk(KERN_CRIT "racing inode deletion with ordered operations!!!\n");
Binary file fs/btrfs/inode.o matches
Binary file fs/btrfs/btrfs.o matches
Binary file fs/btrfs/btrfs.ko matches

Backtrace doesn't seem to contain any function related to inodes.

Back to rc4 again for now.

These hangs seemed to occur first when writing several hundred MiB onto a high speed SDHC card… yet they persisted long after the write was finished, up to the point where I had to reboot because the machine hung on trying to switch between tty7 (X11) and tty1 (for diagnosis).

Jul 19 19:29:11 merkaba kernel: [12346.692457] mmc0: new high speed SDHC card at address 0001
Jul 19 19:29:11 merkaba kernel: [12346.744276] mmcblk0: mmc0:0001 0 29.2 GiB
Jul 19 19:29:11 merkaba kernel: [12346.769850] mmcblk0: p1
Jul 19 19:29:20 merkaba kernel: [12355.796267] FAT-fs (mmcblk0p1): utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
Jul 19 19:29:20 merkaba kernel: [12355.805515] FAT-fs (mmcblk0p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
Jul 19 19:33:27 merkaba kernel: [12602.162818] INFO: task btrfs-transacti:715 blocked for more than 120 seconds.
Jul 19 19:33:27 merkaba kernel: [12602.162826] Tainted: G O 3.16.0-rc5-tp520-btrfs-delrace+ #5
Jul 19 19:33:27 merkaba kernel: [12602.162827] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 19 19:33:27 merkaba kernel: [12602.162828] btrfs-transacti D 8800cf90e780 0 715 2 0x
Jul 19 19:33:27 merkaba kernel: [12602.162834] 880401ddbc80 0002 880407fc 880401ddbfd8
Jul 19 19:33:27 merkaba kernel: [12602.162836] 8800cf90e240 00013080 8800cf90e240 7fff
Jul 19 19:33:27 merkaba kernel: [12602.162839] 8804018efd28 0002 81470270 8804018efd20
Jul 19 19:33:27 merkaba kernel: [12602.162842] Call Trace:
Jul 19 19:33:27 merkaba kernel: [12602.162856] [81470270] ? michael_mic.part.6+0x1e/0x1e
Jul 19 19:33:27 merkaba kernel: [12602.162860] [81470c70] schedule+0x64/0x66
Jul 19 19:33:27 merkaba kernel: [12602.162862] [8147029f] schedule_timeout+0x2f/0x114
Jul 19 19:33:27 merkaba kernel: [12602.162867] [81062c4d] ? get_parent_ip+0xd/0x3c
Jul 19
Re: Blocked tasks on 3.15.1
On Saturday, 19 July 2014, 12:38:53, Cody P Schafer wrote:
> On Thu, Jul 17, 2014 at 8:18 AM, Chris Mason c...@fb.com wrote:
>> [ deadlocks during rsync in 3.15 with compression enabled ]
>> [...]
>
> Thanks Chris.
>
> Running 3.15.6 with this patch applied on top:
>  - still causes a hang with `rsync -hPaHAXx --del /mnt/home/nyx/ /home/nyx/`
>  - no extra error messages printed (`dmesg | grep racing`) compared to without the patch

I got same results with 3.16-rc5 + this patch (see thread "BTRFS hang with 3.16-rc5"). 3.16-rc4 still is fine with me. No hang whatsoever so far.

> To recap some details (so I can have it all in one place):
>  - /home/ is btrfs with compress=lzo

BTRFS RAID 1 with lzo.

>  - I have _not_ created any nodatacow files.

Me neither.

>  - Full stack is: sata - dmcrypt - lvm - btrfs (I noticed others mentioning the use of dmcrypt)

Same, except no dmcrypt.

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
Re: BTRFS hang with 3.16-rc5
On 07/19/2014 01:59 PM, Martin Steigerwald wrote:
> [...]
> Did so and again got a hang.
> [...]

Ok, this is definitely the same hang reported on 3.15.1. Thanks for giving the patch a try, I've got another long running test going this weekend in hopes of triggering it here.

-chris
Re: BTRFS hang with 3.16-rc5
On Saturday, 19 July 2014, 14:39:51, Chris Mason wrote:
> On 07/19/2014 01:59 PM, Martin Steigerwald wrote:
>> [...]
>
> Ok, this is definitely the same hang reported on 3.15.1. Thanks for giving the patch a try, I've got another long running test going this weekend in hopes of triggering it here.

I found make-kpkg (from Debian kernel-package) to trigger the BTRFS hang quite reliably with 3.14 and 3.15, at least after some update. Often while running objcopy commands.

Example call:

make-kpkg -j4 --rootcmd fakeroot --initrd --append-to-version -tp520-btrfs-delrace --revision 1 linux_image

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
Re: [PATCH] Btrfs: update commit root on snapshot creation after orphan cleanup
Hi Filipe,

It's quite possible I don't fully understand the issue. It seems that we are creating a read-only snapshot, committing a transaction, and then going on to modify the snapshot once again, by deleting all the ORPHAN_ITEMs we have in its file tree (btrfs_orphan_cleanup). Shouldn't all this be part of snapshot creation, so that after we commit, we have a clean file tree with no orphans there? (Not sure if this makes sense though.)

With your patch we do this additional commit after the cleanup. But nothing prevents send from starting before this additional commit, correct? And it would still see the orphans through the commit root. You say that it is not a problem, but I am not sure why (probably I am missing something here). So to me it looks like your patch narrows the race window significantly (at the cost of an additional commit), but does not close it fully.

But most important: perhaps send should look for ORPHAN_ITEMs and treat those inodes as deleted?

Thanks,
Alex.

On Tue, Jun 3, 2014 at 2:41 PM, Filipe David Borba Manana fdman...@gmail.com wrote:
> On snapshot creation (either writable or read-only), we do orphan cleanup against the root of the snapshot. If the cleanup did remove any orphans, then the current root node will be different from the commit root node until the next transaction commit happens.
>
> A send operation always uses the commit root of a snapshot - this means it will see the orphans if it starts computing the send stream before the next transaction commit happens (triggered by a timer or sync(), e.g.), which is when the commit root gets assigned a reference to the current root, where the orphans are not visible anymore. The consequence of send seeing the orphans is explained below.
> For example:
>
>     mkfs.btrfs -f /dev/sdd
>     mount -o commit=999 /dev/sdd /mnt
>
>     # open a file with O_TMPFILE and leave it open
>     # write some data to the file
>     btrfs subvolume snapshot -r /mnt /mnt/snap1
>
>     btrfs send /mnt/snap1 -f /tmp/send.data
>
> The send operation will fail with the following error:
>
>     ERROR: send ioctl failed with -116: Stale file handle
>
> What happens here is that our snapshot has an orphan inode still visible through the commit root, that corresponds to the tmpfile. However send will attempt to call inode.c:btrfs_iget(), with the goal of reading the file's data, which will return -ESTALE because it will use the current root (and not the commit root) of the snapshot.
>
> Of course, there are other cases where we can get orphans, but this example using a tmpfile makes it much easier to reproduce the issue.
>
> Therefore on snapshot creation, after calling btrfs_orphan_cleanup, if the commit root is different from the current root, just commit the transaction associated with the snapshot's root (if it exists), so that a send will not see any orphans that don't exist anymore. This also guarantees a send will always see the same content regardless of whether a transaction commit happened already before the send was requested and after the orphan cleanup (meaning the commit root and current roots are the same) or it hasn't happened yet (commit and current roots are different).
>
> Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
> ---
>  fs/btrfs/ioctl.c | 29 +
>  1 file changed, 29 insertions(+)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 95194a9..6680ad9 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -712,6 +712,35 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
>  	if (ret)
>  		goto fail;
>  
> +	/*
> +	 * If orphan cleanup did remove any orphans, it means the tree was
> +	 * modified and therefore the commit root is not the same as the
> +	 * current root anymore. This is a problem, because send uses the
> +	 * commit root and therefore can see inode items that don't exist
> +	 * in the current root anymore, and for example make calls to
> +	 * btrfs_iget, which will do tree lookups based on the current root
> +	 * and not on the commit root. Those lookups will fail, returning a
> +	 * -ESTALE error, and making send fail with that error. So make sure
> +	 * a send does not see any orphans we have just removed, and that it
> +	 * will see the same inodes regardless of whether a transaction
> +	 * commit happened before it started (meaning that the commit root
> +	 * will be the same as the current root) or not.
> +	 */
> +	if (readonly && pending_snapshot->snap->node !=
> +	    pending_snapshot->snap->commit_root) {
> +		trans = btrfs_join_transaction(pending_snapshot->snap);
> +		if (IS_ERR(trans) && PTR_ERR(trans) != -ENOENT) {
> +			ret = PTR_ERR(trans);
> +			goto fail;
> +		}
> +		if (!IS_ERR(trans)) {
> +			ret = btrfs_commit_transaction(trans,
> +
Fw: ENOSPC errors during balance
Begin forwarded message:

Huh, turns out the Reply-To was to Chris Murphy, so here it is again for the whole list.

Date: Sat, 19 Jul 2014 20:34:34 +0200
From: Marc Joliet mar...@gmx.de
To: Chris Murphy li...@colorremedies.com
Subject: Re: ENOSPC errors during balance

On Sat, 19 Jul 2014 11:38:08 -0600, Chris Murphy li...@colorremedies.com wrote:

> The 2nd dmesg (didn't look at the 1st), has many instances like this;
>
> [96241.882138] ata2.00: exception Emask 0x1 SAct 0x7ffe0fff SErr 0x0 action 0x6 frozen
> [96241.882139] ata2.00: Ata error. fis:0x21
> [96241.882142] ata2.00: failed command: READ FPDMA QUEUED
> [96241.882148] ata2.00: cmd 60/08:00:68:0a:2d/00:00:18:00:00/40 tag 0 ncq 4096 in
>          res 41/00:58:40:5c:2c/00:00:18:00:00/40 Emask 0x1 (device error)
>
> I'm not sure what this error is, it acts like an unrecoverable read error but I'm not seeing UNC reported. It looks like ata 2.00 is sdb, which is a member of a btrfs raid10 volume. So this isn't related to your sdg2 and enospc error, it's a different problem.

Yeah, from what I remember reading it's related to nforce2 chipsets, but I never pursued it, since I never really noticed any consequences (this is an old computer that I originally built in 2006). IIRC one workaround is to switch to 1.5 Gbps instead of 3 Gbps (but then, it already is at 1.5 Gbps, but none of the other ports are? Might be the hard drive, I *think* it's older than the others.), another is related to irqbalance (which I forgot about, I've just switched it off and will see if the messages stop, but then again, my first dmesg didn't have any of those messages).

Anyway, yes, it's unrelated to my problem :-) .

> I'm not sure of the reason for the BTRFS info (device sdg2): 2 enospc errors during balance but it seems informational rather than either a warning or problem. I'd treat ext4-btrfs converted file systems to be something of an odd duck, in that it's uncommon, therefore isn't getting as much testing and extra caution is a good idea. Make frequent backups.

Well, I *could* just recreate the file system. Since these are my only backups (no offsite backup as of yet), I wanted to keep the existing ones. So btrfs-convert was a convenient way to upgrade.

But since I ended up deleting those backups anyway, I would only be losing my hourly and a few daily backups. But it's not as if the file system is otherwise misbehaving.

Another random idea: the number of errors decreased the second time I ran balance (from 4 to 2), I could run another full balance and see if it keeps decreasing.

--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup
Re: ENOSPC errors during balance
On Sat, 19 Jul 2014 22:10:51 +0200, Marc Joliet mar...@gmx.de wrote:

[...]
> Another random idea: the number of errors decreased the second time I ran balance (from 4 to 2), I could run another full balance and see if it keeps decreasing.

Well, this time there were still 2 ENOSPC errors. But I can show the df output after such an ENOSPC error, to illustrate what I meant with the sudden surge in total usage:

# btrfs filesystem df /run/media/marcec/MARCEC_BACKUP
Data, single: total=236.00GiB, used=229.04GiB
System, DUP: total=32.00MiB, used=36.00KiB
Metadata, DUP: total=4.00GiB, used=3.20GiB
unknown, single: total=512.00MiB, used=0.00

And then after running a balance and (almost) immediately cancelling:

# btrfs filesystem df /run/media/marcec/MARCEC_BACKUP
Data, single: total=230.00GiB, used=229.04GiB
System, DUP: total=32.00MiB, used=36.00KiB
Metadata, DUP: total=4.00GiB, used=3.20GiB
unknown, single: total=512.00MiB, used=0.00

--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup
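Editorial aside: the "surge in total usage" in the two `btrfs filesystem df` listings above is the gap between `total` (space allocated to chunks) and `used`. As a rough sketch only, the gap can be computed from the text output. The parser below assumes exactly the line format shown in this message (`Kind, profile: total=…, used=…`, binary units, a bare `0.00` meaning zero bytes); the names `slack_by_kind`, `BEFORE` and `AFTER` are illustrative.

```python
import re

# Binary unit multipliers; a value without a unit (e.g. "0.00") counts as bytes.
UNITS = {None: 1, "B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

LINE = re.compile(
    r"^(?P<kind>[^:]+): total=(?P<total>[\d.]+)(?P<tunit>[KMGT]iB|B)?, "
    r"used=(?P<used>[\d.]+)(?P<uunit>[KMGT]iB|B)?$"
)


def slack_by_kind(df_text):
    """Return allocated-but-unused bytes for each 'Kind, profile' line."""
    slack = {}
    for line in df_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            total = float(match.group("total")) * UNITS[match.group("tunit")]
            used = float(match.group("used")) * UNITS[match.group("uunit")]
            slack[match.group("kind")] = total - used
    return slack


# The two listings from the message above.
BEFORE = """Data, single: total=236.00GiB, used=229.04GiB
System, DUP: total=32.00MiB, used=36.00KiB
Metadata, DUP: total=4.00GiB, used=3.20GiB
unknown, single: total=512.00MiB, used=0.00
"""
AFTER = BEFORE.replace("total=236.00GiB", "total=230.00GiB")

for label, text in (("after ENOSPC", BEFORE), ("after cancelled balance", AFTER)):
    data_slack = slack_by_kind(text)["Data, single"]
    print(f"{label}: {data_slack / 2**30:.2f} GiB of data chunks allocated but unused")
```

For the data chunks this gives a slack of about 6.96 GiB after the ENOSPC errors versus about 0.96 GiB after the cancelled balance, which is the 6 GiB difference in `total` visible in the listings.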
Re: ENOSPC errors during balance
On Sat, Jul 19, 2014 at 11:38:08AM -0600, Chris Murphy wrote:
> [96241.882138] ata2.00: exception Emask 0x1 SAct 0x7ffe0fff SErr 0x0 action 0x6 frozen
> [96241.882139] ata2.00: Ata error. fis:0x21
> [96241.882142] ata2.00: failed command: READ FPDMA QUEUED
> [96241.882148] ata2.00: cmd 60/08:00:68:0a:2d/00:00:18:00:00/40 tag 0 ncq 4096 in
>          res 41/00:58:40:5c:2c/00:00:18:00:00/40 Emask 0x1 (device error)

AFAIR those are somehow related to NCQ.

Piotr Szymaniak.
--
Golden Bruno's audience chamber surpassed everything I had seen before. He must have hired dozens of programmers and designers to create such a lascivious and refined interior. The sounds, colors, shapes and scents provoked an erection.
  -- Marcin Przybylek, Gamedec: Syndrom Adelheima
Re: ENOSPC errors during balance
On Jul 19, 2014, at 2:58 PM, Marc Joliet mar...@gmx.de wrote:

> On Sat, 19 Jul 2014 22:10:51 +0200, Marc Joliet mar...@gmx.de wrote:
>
> [...]
>> Another random idea: the number of errors decreased the second time I ran balance (from 4 to 2), I could run another full balance and see if it keeps decreasing.
>
> Well, this time there were still 2 ENOSPC errors. But I can show the df output after such an ENOSPC error, to illustrate what I meant with the sudden surge in total usage:
>
> # btrfs filesystem df /run/media/marcec/MARCEC_BACKUP
> Data, single: total=236.00GiB, used=229.04GiB
> System, DUP: total=32.00MiB, used=36.00KiB
> Metadata, DUP: total=4.00GiB, used=3.20GiB
> unknown, single: total=512.00MiB, used=0.00
>
> And then after running a balance and (almost) immediately cancelling:
>
> # btrfs filesystem df /run/media/marcec/MARCEC_BACKUP
> Data, single: total=230.00GiB, used=229.04GiB
> System, DUP: total=32.00MiB, used=36.00KiB
> Metadata, DUP: total=4.00GiB, used=3.20GiB
> unknown, single: total=512.00MiB, used=0.00

I think it's a bit weird. Two options:

a. Keep using the file system, with judicious backups; if a dev wants more info they'll reply to the thread.

b. Migrate the data to a new file system: first capture the file system with btrfs-image in case a dev wants more info and you've since blown away the filesystem, and then move it to a new btrfs fs. I'd use send/receive for this to preserve subvolumes and snapshots.

Chris Murphy
Re: ENOSPC errors during balance
Chris Murphy posted on Sat, 19 Jul 2014 11:38:08 -0600 as excerpted:

> I'm not sure of the reason for the "BTRFS info (device sdg2): 2 enospc
> errors during balance" but it seems informational rather than either a
> warning or problem. I'd treat ext4->btrfs converted file systems to be
> something of an odd duck, in that it's uncommon, therefore isn't getting
> as much testing and extra caution is a good idea. Make frequent backups.

Expanding on that a bit...

Balance simply rewrites chunks, combining where possible and possibly converting to a different layout (single/dup/raid0/1/10/5/6[1]) in the process. The most common reason for enospc during balance is of course all space allocated to chunks, with various workarounds for that if it happens, but that doesn't seem to be what was happening to you (Marc J./OP).

Based on very similar issues reported by another ext4 -> btrfs converter and the discussion on that thread, here's what I think happened:

First, a critical question for you, as it's a critical piece of this scenario that you didn't mention in your summary. The wiki page on ext4 -> btrfs conversion suggests deleting the ext2_saved subvolume and then doing a full defrag and rebalance. You're attempting a full rebalance, but have you deleted ext2_saved yet, and did you do the defrag before attempting the rebalance?

I'm guessing not, as was the case with the other user that reported this issue. Here's what apparently happened in his case and how we fixed it:

The problem is that btrfs data chunks are 1 GiB each. Thus, the maximum size of a btrfs extent is 1 GiB. But ext4 doesn't have an arbitrary limitation on extent size, and for files over a GiB in size, ext4 extents can /also/ be over a GiB in size.

That results in two potential issues at balance time. First, btrfs treats the ext2_saved subvolume as a read-only snapshot and won't touch it, thus keeping the ext* data intact in case the user wishes to roll back to ext*.
I don't think btrfs touches that data during a balance either, as it really couldn't do so /safely/ without incorporating all of the ext* code into btrfs. I'm not sure how it expresses that situation, but it's quite possible that btrfs treats it as enospc.

Second, for files that had ext4 extents greater than a GiB, balance will naturally enospc, because even the biggest possible btrfs extent, a full 1 GiB data chunk, is too small to hold the existing file extent. Of course this only happens on filesystems converted from ext*, because natively btrfs has no way to make an extent larger than a GiB, so it won't run into the problem if it was created natively instead of converted from ext*.

Once the ext2_saved subvolume/snapshot is deleted, defragging should cure the problem, as it rewrites those files to btrfs-native chunks, normally defragging but in this case fragging to the 1 GiB btrfs-native data-chunk-size extent size.

Alternatively, and this is what the other guy did, one can find all the files from the original ext*fs over a GiB in size, and move them off-filesystem and back. AFAIK he had several gigs of spare RAM and no files larger than that, so he used tmpfs as the temporary storage location, which is memory, so the only I/O is that on the btrfs in question. By doing that he deleted the existing files on btrfs and recreated them, naturally splitting the extents on data-chunk boundaries, as btrfs normally does, in the recreation.

If you had deleted the ext2_saved subvolume/snapshot and done the defrag already, that explanation doesn't work as-is, but I'd still consider it an artifact from the conversion, and try the alternative move-off-filesystem-temporarily method.

If you don't have any files over a GiB in size, then I don't know... perhaps it's some other bug.

---
[1] Raid5/6 support not yet complete. Operational code is there but recovery code is still incomplete.

-- 
Duncan - List replies preferred. No HTML msgs.
Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman
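The alternative described in the message above starts with finding every file over a GiB in size. A minimal Python sketch of that search step follows. The function name `find_large_files` and its `threshold` parameter are illustrative assumptions. It compares the apparent file size only, so a match is merely a candidate: a file over a GiB does not necessarily contain a single extent over a GiB.

```python
import os
import stat

GIB = 2**30  # 1 GiB, the data-chunk size referred to in the message above


def find_large_files(root, threshold=GIB):
    """Yield (path, size_in_bytes) for regular files under root larger than threshold."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Only regular files are candidates; skip symlinks, FIFOs, sockets.
            if stat.S_ISREG(info.st_mode) and info.st_size > threshold:
                yield path, info.st_size
```

For example, `for path, size in find_large_files("/run/media/marcec/MARCEC_BACKUP"): print(size, path)` would list the candidates on the filesystem discussed in this thread. Moving each file off the filesystem and back is left as a manual step, since it needs enough temporary space for the largest file.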