[PATCH] fs: btrfs: Replace -ENOENT by -ERANGE in btrfs_get_acl()
size contains the value returned by posix_acl_from_xattr(), which returns
-ERANGE, -ENODATA, zero, or an integer greater than zero. So replace
-ENOENT by -ERANGE.

Signed-off-by: Salah Triki
---
 fs/btrfs/acl.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index 67a6077..53bb7af 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -55,8 +55,7 @@ struct posix_acl *btrfs_get_acl(struct inode *inode, int type)
 	}
 	if (size > 0) {
 		acl = posix_acl_from_xattr(&init_user_ns, value, size);
-	} else if (size == -ENOENT || size == -ENODATA || size == 0) {
-		/* FIXME, who returns -ENOENT? I think nobody */
+	} else if (size == -ERANGE || size == -ENODATA || size == 0) {
 		acl = NULL;
 	} else {
 		acl = ERR_PTR(-EIO);
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
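The return-value contract the patch relies on can be made concrete with a small stand-alone sketch. This is illustrative user-space Python mirroring the patched C branch, not kernel code, and `classify` is a hypothetical helper name:

```python
import errno

# Sketch of the branch fixed by the patch: classify the value in `size`
# the way the patched btrfs_get_acl() does.  Illustrative only; the real
# code is the C diff above.
def classify(size):
    if size > 0:
        return "parse ACL"                  # posix_acl_from_xattr() path
    if size in (-errno.ERANGE, -errno.ENODATA, 0):
        return "no ACL (NULL)"
    return "error (ERR_PTR(-EIO))"
```

Note that after the patch, a hypothetical -ENOENT would fall through to the ERR_PTR(-EIO) branch, which is consistent with the FIXME comment's claim that nobody actually returns it here.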
btrfs RAID 10 truncates files over 2G to 4096 bytes.
Hi,

My setup is that I use one filesystem for / and /home (on SSD) and a larger RAID 10 for /mnt/share (6 x 2TB). Today I've discovered that 14 files that are supposed to be over 2GB are in fact just 4096 bytes. I've checked the content of those 4KB and it seems that it does contain the information that was at the beginning of the files.

I've experienced this problem in the past (3-4 years ago?) but attributed it to a different problem that I've spoken with you guys here about (corruption due to non-ECC RAM). At that time I deleted the affected files (56), and a similar problem was discovered between one and two years ago; I believe I deleted those files too.

I periodically (once a month) run a scrub on my system to eliminate any errors sneaking in. I believe I did a balance about half a year ago, to reclaim space after I deleted a large database.

root@noname_server:/mnt/share# btrfs fi show
Label: none  uuid: 060c2345-5d2f-4965-b0a2-47ed2d1a5ba2
	Total devices 1 FS bytes used 177.19GiB
	devid    3 size 899.22GiB used 360.06GiB path /dev/sde2

Label: none  uuid: d4cd1d5f-92c4-4b0f-8d45-1b378eff92a1
	Total devices 6 FS bytes used 4.02TiB
	devid    1 size 1.82TiB used 1.34TiB path /dev/sdg1
	devid    2 size 1.82TiB used 1.34TiB path /dev/sdh1
	devid    3 size 1.82TiB used 1.34TiB path /dev/sdi1
	devid    4 size 1.82TiB used 1.34TiB path /dev/sdb1
	devid    5 size 1.82TiB used 1.34TiB path /dev/sda1
	devid    6 size 1.82TiB used 1.34TiB path /dev/sdf1

root@noname_server:/mnt/share# uname -a
Linux noname_server 4.4.0-28-generic #47-Ubuntu SMP Fri Jun 24 10:09:13 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@noname_server:/mnt/share# btrfs --version
btrfs-progs v4.4
root@noname_server:/mnt/share#

Problem is that stuff on this filesystem moves so slowly that it's hard to remember historical events ... it's like AWS Glacier.
What I can state with 100% certainty is that:

- files that are affected are 2GB and over (safe to assume 4GB and over)
- affected files were only read (and some not even read), never written after putting them into storage
- in the past I assumed files were affected because of their size, but I have quite a few ISO files and some virtual machine backups ... no problems there
- it seems the problem originates in one folder & size > 2GB & extension .mkv
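A quick sweep matching those criteria could look like this. Illustrative Python only: the 4096-byte size and the .mkv extension come from the report above, and the function name and root path are made up; point `root` at the affected mount (e.g. /mnt/share).

```python
import os

# Walk a tree and report .mkv files that are exactly 4096 bytes: the
# truncation signature described in the report above.  Illustrative
# helper, not a repair tool.
def truncated_candidates(root, size=4096, ext=".mkv"):
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(ext):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == size:
                hits.append(path)
    return hits
```

Running this periodically would at least catch new truncations soon after they happen, instead of years later.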
Re: Filesystem locks up, also with older kernel on any action after booting into 4.7-rc4 once
On 07/02/2016 09:40 PM, Hans van Kranenburg wrote:
> On 07/02/2016 09:18 PM, Chris Murphy wrote:
>> On Sat, Jul 2, 2016 at 11:34 AM, Hans van Kranenburg wrote:
>>> On 07/02/2016 07:14 PM, Hans van Kranenburg wrote:
>>>> I just rebooted a VM into a 4.7 kernel. The joy didn't last long.
>>>> After 177 seconds the btrfs data partition (root is on ext4) locked
>>>> up. Worse, it keeps locking up on any action performed even when
>>>> rebooting it with older kernels again. D: The filesystem initially
>>>> mounts fine, but then locks up again immediately.
>>>>
>>>> Linux stacheldraht 4.7.0-rc4-amd64 #1 SMP Debian 4.7~rc4-1~exp1
>>>> (2016-06-20) x86_64 GNU/Linux
>>>>
>>>> ps output shows [btrfs-transaction] in D state:
>>>>
>>>> root 1108 0.0 0.0 0 0 ? D 17:42 0:00 \_ [btrfs-transacti]
>>>>
>>>> From dmesg:
>>>>
>>>> [blah blah blah]
>>>>
>>>> So, something happened inside the fs that makes it lock up every
>>>> time I try to do anything with it...
>>>
>>> I force-rebooted the poor thing again, and mounted the filesystem ro.
>>> It mounts without any complaint. I can see all files now, I can do
>>> sub list etc...
>>>
>>> So I think I'm going to copy some data to a new filesystem on a new
>>> block device just in case. The thing has to move to new storage
>>> anyway; it's about 100 subvolumes with about 150GB of data, so that's
>>> a nice exercise with send/receive.
>>
>> Two things might be interesting:
>>
>> 1. btrfs check (without repair) to add to the above and see whether it
>> finds any problems.
>>
>> 2. For send, try the -e option, if you have related subvolume
>> snapshots. See if this bug is really a bug or user error or maybe it's
>> fixed. https://bugzilla.kernel.org/show_bug.cgi?id=111221
>
> The directory structure is dirvish with my btrfs patches. These are the
> subvols:
>
> 2016050802/tree
> 2016051502/tree
>
> So they're all named tree. I cannot just send them all to some
> location. And I cannot rename them, because the fs is mounted ro...

Ok, I just moved the latest daily snapshots of all data to a new fs, so backups can run on top of it again tonight. The broken fs is still mounted ro, and I can try to fix it.
Trying to send extra snapshots with send -c fails consistently with "parent determination failed for ..." and I'm not going to find out why today, I guess. The backup system on this host works by snapshotting (rw) the tree of yesterday and then rsyncing the remote over it, so the snapshots are probably losing the btrfs-level parent relationship. Still, it would be nice to be able to use -c to move multiple ones with shared data to another fs. To be able to reconstruct the backup snapshot history, I would have to revert to send/receive + (snapshot + rsync) * N-1 now, which is not really btrfsish.

Ah, the send/receive finished, let's try some fun things...

-# btrfs check /dev/xvdc
Checking filesystem on /dev/xvdc
UUID: 49ca0cda-3233-4dac-936b-16265c0937a6
checking extents
checking free space tree
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 157548691476 bytes used err is 0
total csum bytes: 153411888
total tree bytes: 454918144
total fs tree bytes: 264257536
total extent tree bytes: 15941632
btree space waste bytes: 71694806
file data blocks allocated: 190005772288
 referenced 190005731328

Not many exciting explosions happening here. The space cache error is maybe a result of switching to space_cache=v2 while the old space cache is still present?

-- 
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com
Re: Filesystem locks up, also with older kernel on any action after booting into 4.7-rc4 once
On 07/02/2016 09:18 PM, Chris Murphy wrote:
> On Sat, Jul 2, 2016 at 11:34 AM, Hans van Kranenburg wrote:
>> On 07/02/2016 07:14 PM, Hans van Kranenburg wrote:
>>> I just rebooted a VM into a 4.7 kernel. The joy didn't last long.
>>> After 177 seconds the btrfs data partition (root is on ext4) locked
>>> up. Worse, it keeps locking up on any action performed even when
>>> rebooting it with older kernels again. D: The filesystem initially
>>> mounts fine, but then locks up again immediately.
>>>
>>> Linux stacheldraht 4.7.0-rc4-amd64 #1 SMP Debian 4.7~rc4-1~exp1
>>> (2016-06-20) x86_64 GNU/Linux
>>>
>>> ps output shows [btrfs-transaction] in D state:
>>>
>>> root 1108 0.0 0.0 0 0 ? D 17:42 0:00 \_ [btrfs-transacti]
>>>
>>> From dmesg:
>>>
>>> [blah blah blah]
>>>
>>> So, something happened inside the fs that makes it lock up every time
>>> I try to do anything with it...
>>
>> I force-rebooted the poor thing again, and mounted the filesystem ro.
>> It mounts without any complaint. I can see all files now, I can do sub
>> list etc...
>>
>> So I think I'm going to copy some data to a new filesystem on a new
>> block device just in case. The thing has to move to new storage
>> anyway; it's about 100 subvolumes with about 150GB of data, so that's
>> a nice exercise with send/receive.
>
> Two things might be interesting:
>
> 1. btrfs check (without repair) to add to the above and see whether it
> finds any problems.
>
> 2. For send, try the -e option, if you have related subvolume
> snapshots. See if this bug is really a bug or user error or maybe it's
> fixed. https://bugzilla.kernel.org/show_bug.cgi?id=111221

The directory structure is dirvish with my btrfs patches. These are the subvols:

2016050802/tree
2016051502/tree

So they're all named tree. I cannot just send them all to some location. And I cannot rename them, because the fs is mounted ro...
-- 
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com
Re: Filesystem locks up, also with older kernel on any action after booting into 4.7-rc4 once
On Sat, Jul 2, 2016 at 11:34 AM, Hans van Kranenburg wrote:
> On 07/02/2016 07:14 PM, Hans van Kranenburg wrote:
>>
>> I just rebooted a VM into a 4.7 kernel. The joy didn't last long. After
>> 177 seconds the btrfs data partition (root is on ext4) locked up. Worse,
>> it keeps locking up on any action performed even when rebooting it with
>> older kernels again. D: The filesystem initially mounts fine, but then
>> locks up again immediately.
>>
>> Linux stacheldraht 4.7.0-rc4-amd64 #1 SMP Debian 4.7~rc4-1~exp1
>> (2016-06-20) x86_64 GNU/Linux
>>
>> ps output shows [btrfs-transaction] in D state:
>>
>> root 1108 0.0 0.0 0 0 ? D 17:42 0:00 \_ [btrfs-transacti]
>>
>> From dmesg:
>>
>> [blah blah blah]
>>
>> So, something happened inside the fs that makes it lock up every time I
>> try to do anything with it...
>
> I force-rebooted the poor thing again, and mounted the filesystem ro. It
> mounts without any complaint. I can see all files now, I can do sub list
> etc...
>
> So I think I'm going to copy some data to a new filesystem on a new block
> device just in case. The thing has to move to new storage anyway; it's
> about 100 subvolumes with about 150GB of data, so that's a nice exercise
> with send/receive.

Two things might be interesting:

1. btrfs check (without repair) to add to the above and see whether it finds any problems.

2. For send, try the -e option, if you have related subvolume snapshots. See if this bug is really a bug or user error or maybe it's fixed.
https://bugzilla.kernel.org/show_bug.cgi?id=111221

-- 
Chris Murphy
Re: Cannot balance FS (No space left on device)
On Sat, Jul 2, 2016 at 9:07 AM, Hans van Kranenburg wrote:
>
> Also, the behaviour of *always* creating a new empty block group before
> starting to work (which makes it impossible to free up space on a fully
> allocated filesystem with balance) got reverted in:
>
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=cf25ce518e8ef9d59b292e51193bed2b023a32da
>
> This patch is in 4.5 and 4.7-rc, but *not* in 4.6.

Upstream it first appears in 4.5.7.

-- 
Chris Murphy
Re: btrfs lockup
On Fri, Jul 1, 2016 at 3:50 PM, Grey Christoforo wrote:
> On Fri, Jul 1, 2016 at 9:51 PM, Chris Murphy wrote:
>> This is interesting:
>>
>> Jul 01 11:56:40 kernel: BTRFS info (device nvme0n1p5): relocating
>> block group 232511242240 flags 1
>>
>> a. It's an NVMe drive.
>> b. Btrfs at this time is involved in a balance operation of some sort.
>
> The balance operation is one I started manually. It's a coincidence that
> it's running at this point and I don't believe it's related to the
> lockups because
> 1) I saw the lockups on a previous baobab scan of my
> /var/lib/docker/btrfs/subvolumes when no balance was taking place, and
> 2) after removing all of docker's subvolumes, the problem has gone away.
>
>> And then what you previously reported, the parts of which I don't follow:
>>
>> Jul 01 12:00:31 kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck
>> for 22s! [scanner:7121]
>
> I'd guess this is the kernel's protection mechanism to try to recover
> when some critical (blocking) thread does not return; it looks like it's
> essentially killing the process when a watchdog timer expires.

I suggest filing a bug and putting the URL here. If you can reproduce without balance, that's probably cleaner for a developer to follow. And then also, when the blocking happens, if you can do sysrq+t, then something like 'journalctl -k -o short-monotonic > dmesg_sysrqt.log', since sysrq+t often overfills the kernel message buffer, whereas journald will get it all, and that command extracts just the kernel messages. Attach that to the bug.

It's an open question whether there are kernel debug options to try; I can't really tell if this is directly Btrfs related or incidental. The soft lockup message refers to scanner as the running task, and it comes up more than once in the message log. No idea what that is. I guess it could be some unexpected interaction between Btrfs, VFS and this scanner task.
-- 
Chris Murphy
Re: Filesystem locks up, also with older kernel on any action after booting into 4.7-rc4 once
On 07/02/2016 07:14 PM, Hans van Kranenburg wrote:
> I just rebooted a VM into a 4.7 kernel. The joy didn't last long. After
> 177 seconds the btrfs data partition (root is on ext4) locked up. Worse,
> it keeps locking up on any action performed even when rebooting it with
> older kernels again. D: The filesystem initially mounts fine, but then
> locks up again immediately.
>
> Linux stacheldraht 4.7.0-rc4-amd64 #1 SMP Debian 4.7~rc4-1~exp1
> (2016-06-20) x86_64 GNU/Linux
>
> ps output shows [btrfs-transaction] in D state:
>
> root 1108 0.0 0.0 0 0 ? D 17:42 0:00 \_ [btrfs-transacti]
>
> From dmesg:
>
> [blah blah blah]
>
> So, something happened inside the fs that makes it lock up every time I
> try to do anything with it...

I force-rebooted the poor thing again, and mounted the filesystem ro. It mounts without any complaint. I can see all files now, I can do sub list etc...

So I think I'm going to copy some data to a new filesystem on a new block device just in case. The thing has to move to new storage anyway; it's about 100 subvolumes with about 150GB of data, so that's a nice exercise with send/receive.

-- 
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | hans.van.kranenb...@mendix.com | www.mendix.com
Filesystem locks up, also with older kernel on any action after booting into 4.7-rc4 once
I just rebooted a VM into a 4.7 kernel. The joy didn't last long. After 177 seconds the btrfs data partition (root is on ext4) locked up. Worse, it keeps locking up on any action performed even when rebooting it with older kernels again. D: The filesystem initially mounts fine, but then locks up again immediately.

Linux stacheldraht 4.7.0-rc4-amd64 #1 SMP Debian 4.7~rc4-1~exp1 (2016-06-20) x86_64 GNU/Linux

ps output shows [btrfs-transaction] in D state:

root 1108 0.0 0.0 0 0 ? D 17:42 0:00 \_ [btrfs-transacti]

From dmesg:

[  177.715994] [ cut here ]
[  177.716032] WARNING: CPU: 0 PID: 1108 at /build/linux-vIn3gu/linux-4.7~rc4/fs/btrfs/locking.c:251 btrfs_tree_lock+0x1eb/0x210 [btrfs]
[  177.716037] Modules linked in: binfmt_misc nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_mangle ip6table_raw ip6_tables nf_log_ipv4 nf_log_common xt_LOG xt_limit ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_owner xt_multiport xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw ip_tables x_tables intel_powerclamp evdev coretemp pcspkr crct10dif_pclmul crc32_pclmul ghash_clmulni_intel quota_v2 quota_tree loop autofs4 ext4 ecb crc16 jbd2 mbcache btrfs crc32c_generic xor raid6_pq crc32c_intel xen_netfront aesni_intel xen_blkfront aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd
[  177.716090] CPU: 0 PID: 1108 Comm: btrfs-transacti Tainted: G W 4.7.0-rc4-amd64 #1 Debian 4.7~rc4-1~exp1
[  177.716095] 0200 a4392a01 81312db5
[  177.716104] 8107896e 880079adb9d8 88007b940800
[  177.716113] 4000 880079adb9d8 880079887928
[  177.716121] Call Trace:
[  177.716129] [] ? dump_stack+0x5c/0x77
[  177.716138] [] ? __warn+0xbe/0xe0
[  177.716154] [] ? btrfs_tree_lock+0x1eb/0x210 [btrfs]
[  177.716168] [] ? btrfs_reserve_extent+0x1b5/0x200 [btrfs]
[  177.716182] [] ? btrfs_alloc_tree_block+0x167/0x4e0 [btrfs]
[  177.716197] [] ? __btrfs_cow_block+0x14c/0x5a0 [btrfs]
[  177.716210] [] ? btrfs_cow_block+0x10b/0x1d0 [btrfs]
[  177.716224] [] ? commit_cowonly_roots+0x5b/0x2f0 [btrfs]
[  177.716238] [] ? btrfs_run_delayed_refs+0x203/0x2b0 [btrfs]
[  177.716256] [] ? btrfs_qgroup_account_extents+0x84/0x180 [btrfs]
[  177.716273] [] ? btrfs_commit_transaction+0x568/0xa40 [btrfs]
[  177.716290] [] ? start_transaction+0x95/0x4a0 [btrfs]
[  177.716304] [] ? transaction_kthread+0x1e9/0x200 [btrfs]
[  177.716319] [] ? btrfs_cleanup_transaction+0x590/0x590 [btrfs]
[  177.716328] [] ? kthread+0xcd/0xf0
[  177.716336] [] ? ret_from_fork+0x1f/0x40
[  177.716341] [] ? kthread_create_on_node+0x190/0x190
[  177.716360] ---[ end trace 558c4b7ce67e3503 ]---

And then, repeated every 120 seconds:

[  360.096092] INFO: task btrfs-transacti:1108 blocked for more than 120 seconds.
[  360.096105] Tainted: G W 4.7.0-rc4-amd64 #1
[  360.096110] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.096120] btrfs-transacti D 88007d016d80 0 1108 2 0x
[  360.096128] 88000292ee80 880078f3bbf0 880078f3c000
[  360.096136] 880079adba40 880079adba58 880078f3bc08 880079adba38
[  360.096143] 815cdc11 880079adb9d8 c01424aa
[  360.096151] Call Trace:
[  360.096162] [] ? schedule+0x31/0x80
[  360.096193] [] ? btrfs_tree_lock+0xba/0x210 [btrfs]
[  360.096201] [] ? wake_atomic_t_function+0x60/0x60
[  360.096215] [] ? btrfs_alloc_tree_block+0x167/0x4e0 [btrfs]
[  360.096229] [] ? __btrfs_cow_block+0x14c/0x5a0 [btrfs]
[  360.096241] [] ? btrfs_cow_block+0x10b/0x1d0 [btrfs]
[  360.096256] [] ? commit_cowonly_roots+0x5b/0x2f0 [btrfs]
[  360.096269] [] ? btrfs_run_delayed_refs+0x203/0x2b0 [btrfs]
[  360.096287] [] ? btrfs_qgroup_account_extents+0x84/0x180 [btrfs]
[  360.096303] [] ? btrfs_commit_transaction+0x568/0xa40 [btrfs]
[  360.096320] [] ? start_transaction+0x95/0x4a0 [btrfs]
[  360.096334] [] ? transaction_kthread+0x1e9/0x200 [btrfs]
[  360.096348] [] ? btrfs_cleanup_transaction+0x590/0x590 [btrfs]
[  360.096356] [] ? kthread+0xcd/0xf0
[  360.096362] [] ? ret_from_fork+0x1f/0x40
[  360.096367] [] ? kthread_create_on_node+0x190/0x190

I'm surprised to see qgroup mentioned, because I'm quite sure I don't use that.

I just force-rebooted the thing. Starting went well, mounting the partition went without any error. But, any operation on the thing locks it up again.

-# btrfs sub list .

[   41.046160] [ cut here ]
[   41.046196] WARNING: CPU: 2 PID: 573 at /build/linux-vIn3gu/linux-4.7~rc4/fs/btrfs/locking.c:251 btrfs_tree_lock+0x1eb/0x210 [btrfs]
[   41.046201] Modules linked in: nf_log_ipv6 ip6t_REJECT
Re: Cannot balance FS (No space left on device)
On 06/13/2016 02:33 PM, Austin S. Hemmelgarn wrote:
> On 2016-06-10 18:39, Hans van Kranenburg wrote:
>> On 06/11/2016 12:10 AM, ojab // wrote:
>>> On Fri, Jun 10, 2016 at 9:56 PM, Hans van Kranenburg wrote:
>>>> You can work around it by either adding two disks (like Henk said),
>>>> or by temporarily converting some chunks to single. Just enough to
>>>> get some free space on the first two disks to get a balance going
>>>> that can fill the third one. You don't have to convert all of your
>>>> data or metadata to single! Something like:
>>>>
>>>> btrfs balance start -v -dconvert=single,limit=10 /mnt/xxx/
>>>
>>> Unfortunately it fails even if I set limit=1:
>>>
>>> $ sudo btrfs balance start -v -dconvert=single,limit=1 /mnt/xxx/
>>> Dumping filters: flags 0x1, state 0x0, force is off
>>>   DATA (flags 0x120): converting, target=281474976710656, soft is off, limit=1
>>> ERROR: error during balancing '/mnt/xxx/': No space left on device
>>> There may be more info in syslog - try dmesg | tail
>>
>> Ah, apparently the balance operation *always* wants to allocate some
>> new empty space before starting to look more closely at the task you
>> give it...
>
> No, that's not exactly true. It seems to be a rather common fallacy
> right now that balance repacks data into existing chunks, which is
> absolutely false. What a balance does is to send everything selected by
> the filters through the allocator again, and specifically prevent any
> existing chunks from being used to satisfy the allocation. When you
> have 5 data chunks that are 20% used and run 'balance -dlimit=20', it
> doesn't pack that all into the first chunk, it allocates a new chunk,
> and then packs it all into that, then frees all the other chunks. This
> behavior is actually a pretty important property when adding or
> removing devices or converting between profiles, because it's what
> forces things into the new configuration of the filesystem.
>
> In an ideal situation, the limit filters should make it repack into
> existing chunks when specified alone, but currently that's not how it
> works, and I kind of doubt that that will ever be how it works.

I have to disagree with you here, based on what I see happening. Two examples will follow, providing some pudding for the proof.

Also, the behaviour of *always* creating a new empty block group before starting to work (which makes it impossible to free up space on a fully allocated filesystem with balance) got reverted in:

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=cf25ce518e8ef9d59b292e51193bed2b023a32da

This patch is in 4.5 and 4.7-rc, but *not* in 4.6.

Script used to provide block group output, using python-btrfs:

-# cat show_block_groups.py
#!/usr/bin/python
from __future__ import print_function
import btrfs
import sys

fs = btrfs.FileSystem(sys.argv[1])
for chunk in fs.chunks():
    print(fs.block_group(chunk.vaddr, chunk.length))

Example 1:

-# uname -a
Linux ichiban 4.5.0-0.bpo.2-amd64 #1 SMP Debian 4.5.4-1~bpo8+1 (2016-05-13) x86_64 GNU/Linux

-# ./show_block_groups.py /
block group vaddr 86211821568 length 1073741824 flags DATA used 83712 used_pct 78
block group vaddr 87285563392 length 33554432 flags SYSTEM used 16384 used_pct 0
block group vaddr 87319117824 length 1073741824 flags DATA used 1070030848 used_pct 100
block group vaddr 88392859648 length 1073741824 flags DATA used 1057267712 used_pct 98
block group vaddr 89466601472 length 1073741824 flags DATA used 1066360832 used_pct 99
block group vaddr 90540343296 length 268435456 flags METADATA used 238256128 used_pct 89
block group vaddr 90808778752 length 268435456 flags METADATA used 226082816 used_pct 84
block group vaddr 91077214208 length 268435456 flags METADATA used 242548736 used_pct 90
block group vaddr 91345649664 length 268435456 flags METADATA used 218415104 used_pct 81
block group vaddr 91614085120 length 268435456 flags METADATA used 223723520 used_pct 83
block group vaddr 91882520576 length 268435456 flags METADATA used 68272128 used_pct 25
block group vaddr 92150956032 length 1073741824 flags DATA used 1048154112 used_pct 98
block group vaddr 93224697856 length 1073741824 flags DATA used 800985088 used_pct 75
block group vaddr 94298439680 length 1073741824 flags DATA used 62197760 used_pct 6
block group vaddr 95372181504 length 1073741824 flags DATA used 49541120 used_pct 5
block group vaddr 96445923328 length 1073741824 flags DATA used 142856192 used_pct 13
block group vaddr 97519665152 length 1073741824 flags DATA used 102051840 used_pct 10

Now do a balance, to remove the least used block group:

1st terminal:
-# watch -d './show_block_groups.py /'

2nd terminal:
-# btrfs balance start -v -dusage=5 /
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=5
Done, had to relocate 1 out of 17 chunks

After:
-# ./show_block_groups.py /
block group vaddr 86211821568 length 1073741824 flags DATA used 83712 used_pct 78
block group vaddr 87285563392 length 33554432 flags SYSTEM used
Re: btrfs ops hang indefinitely (process in D state)
Hi Duncan,

I pretty much understand the risks and do not need them explained to me. When I installed the remote system, the versions were pretty close to the cutting edge. And the problem looks as if it *could* be the same in 3.13 and in 4.4 kernels.

I wrote here to ask advice about "live" recovery *if* you have any, and to offer debug information *if* you are interested. If you do not have advice for me, and are not interested in the sort of debug data that I *can* provide, so be it...

Regards,
Eugene

On 07/02/2016 01:54 PM, Duncan wrote:
> Eugene Crosser posted on Sat, 02 Jul 2016 12:49:53 +0300 as excerpted:
>
>> Enter the second system. It is a rented physical server in a datacenter
>> with two hard disks, joined into a single root btrfs (/dev/sd[ab]1 are
>> swap partitions):
>>
>> root@dehost:~# uname -a
>> Linux dehost 3.13.0-91-generic [...]
>> root@dehost:~# btrfs --version
>> Btrfs v3.12
>> root@dehost:~#
>
> v3.12 userspace and v3.13 kernel are both ancient history in btrfs
> terms, far too old to provide anything useful in terms of debugging
> info.
>
> In general, btrfs is not yet fully stable, and usage on the production
> systems where that ancient a kernel and userspace might be considered
> for stability reasons is considered highly incompatible with that sort
> of an interest in stability at the cost of new features, because btrfs
> itself isn't anything close to that level of stable. So the general
> recommendation is choose one: either the still-stabilizing btrfs on a
> more current system if you want btrfs, or something truly stable, if
> you really need that sort of years-outdated stability.
>
> That said, while this list does tend to focus on mainline and the last
> two mainline release series of the current and LTS kernels, so ATM 4.6
> and 4.5 for current and 4.4 and 4.1 for LTS, not really much earlier,
> we recognize that various distros do backporting and support much
> further back. But this list tracks mainline, not those distro kernels,
> and specifically, we don't track what they've backported vs. what they
> haven't. So if you wish to use your distro's old kernels, that's fine,
> but you're going to be better off going to them for support, because
> they'll know what they've backported and what they haven't and are thus
> in a better position to provide that support.
>
> Meanwhile, I do recognize that you had something similar happen on a
> much newer kernel as well, but that was on a different system, and you
> don't have the details or logs left for that one, so that's not of much
> help either.
>
> Unless of course you can duplicate the behavior once again with a
> reasonably current kernel within the two-release-series (either LTS or
> current) range, as specified above, and can provide the logs, etc, from
> it...
Re: btrfs ops hang indefinitely (process in D state)
Eugene Crosser posted on Sat, 02 Jul 2016 12:49:53 +0300 as excerpted:

> Enter the second system. It is a rented physical server in a datacenter
> with two hard disks, joined into a single root btrfs (/dev/sd[ab]1 are
> swap partitions):
>
> root@dehost:~# uname -a
> Linux dehost 3.13.0-91-generic [...]
> root@dehost:~# btrfs --version
> Btrfs v3.12
> root@dehost:~#

v3.12 userspace and v3.13 kernel are both ancient history in btrfs terms, far too old to provide anything useful in terms of debugging info.

In general, btrfs is not yet fully stable, and usage on the production systems where that ancient a kernel and userspace might be considered for stability reasons is considered highly incompatible with that sort of an interest in stability at the cost of new features, because btrfs itself isn't anything close to that level of stable. So the general recommendation is choose one: either the still-stabilizing btrfs on a more current system if you want btrfs, or something truly stable, if you really need that sort of years-outdated stability.

That said, while this list does tend to focus on mainline and the last two mainline release series of the current and LTS kernels, so ATM 4.6 and 4.5 for current and 4.4 and 4.1 for LTS, not really much earlier, we recognize that various distros do backporting and support much further back. But this list tracks mainline, not those distro kernels, and specifically, we don't track what they've backported vs. what they haven't. So if you wish to use your distro's old kernels, that's fine, but you're going to be better off going to them for support, because they'll know what they've backported and what they haven't and are thus in a better position to provide that support.

Meanwhile, I do recognize that you had something similar happen on a much newer kernel as well, but that was on a different system, and you don't have the details or logs left for that one, so that's not of much help either.

Unless of course you can duplicate the behavior once again with a reasonably current kernel within the two-release-series (either LTS or current) range, as specified above, and can provide the logs, etc, from it...

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
btrfs ops hang indefinitely (process in D state)
Hello, This may be the same problem as "btrfs lockup". I have two systems using btrfs for several years. One is my home desktop, it has root+home ext4 fs on a PCI SSD, and "big stuff" on a btrfs using two hard disks in RAID1 configuration: root@pccross:/export# uname -a Linux pccross 4.7.0-rc2-custom #2 SMP Sat Jun 11 01:13:59 MSK 2016 x86_64 x86_64 x86_64 GNU/Linux # -- Was earlier 4.x version when the problem happened root@pccross:/export# btrfs --version btrfs-progs v4.4 root@pccross:/export# btrfs fi show Label: 'export' uuid: c94c3ef6-394e-4441-8992-d702bdff Total devices 2 FS bytes used 1.26TiB devid1 size 3.64TiB used 1.26TiB path /dev/sda devid2 size 3.64TiB used 1.26TiB path /dev/sdb root@pccross:/export# btrfs fi df /export Data, RAID1: total=1.26TiB, used=1.25TiB System, RAID1: total=32.00MiB, used=208.00KiB Metadata, RAID1: total=5.00GiB, used=3.82GiB GlobalReserve, single: total=512.00MiB, used=0.00B A month ago, I moved a directory containing a few Gb from home (ext4) to btrfs with `mv` command. The command took some minutes and eventually finished without error. After some hours, a cron job that uses files on btrfs did not run. I logged in to investigate and realized that its process was in 'D' state, and any command that I tried that would use btrfs (ls, ...) would enter 'D' state and stay there indefinitely. There was nothing interesting (that I remember) in dmesg. Reboot did not help and indeed could not complete because some of startup jobs use files on btfs, and they hang. I rebooted without mounting btrfs and ran `btrfsck`. It found and fixed some inconsistencies (no log, sorry), and I could mount, and since then everything works, except the directory that I moved disappeared altogether (I had a backup so could restore it). No debugging material left so this is just for background. = Enter the second system. 
It is a rented physical server in a datacenter with two hard disks, joined into a single root btrfs (/dev/sd[ab]1 are swap partitions):

root@dehost:~# uname -a
Linux dehost 3.13.0-91-generic #138-Ubuntu SMP Fri Jun 24 17:00:34 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
root@dehost:~# btrfs --version
Btrfs v3.12
root@dehost:~# btrfs fi show
Label: none  uuid: 67a2708c-f039-4783-a699-6f6be0dac318
        Total devices 2 FS bytes used 442.58GiB
        devid    1 size 2.72TiB used 444.04GiB path /dev/sda2
        devid    2 size 2.72TiB used 444.03GiB path /dev/sdb2
Btrfs v3.12
root@dehost:~# btrfs fi df /
Data, RAID1: total=440.00GiB, used=439.51GiB
System, RAID1: total=32.00MiB, used=72.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=4.00GiB, used=3.07GiB

A week ago, the system started to become unresponsive every day. The kernel works (it responds to ping) but no processes can start. Looking at the logs after reboot, I noticed that activity stops some time after the start of a backup cron job that covers a set of directories (/etc, /home, /var/mail and some more). I disabled the backup job, and since then, for several days, it has not hung.

= My question to the developers: what can I do to (1) recover the filesystem while it is mounted (I can use a recovery netboot system and run `btrfs check` as a last resort), and (2) provide any useful debugging information to the developers?

Thank you,
Eugene
stat() on btrfs reports the st_blocks with delay (data loss in archivers)
There are optimizations in archivers (tar, rsync, ...) that rely on up-to-date st_blocks info. For example, in GNU tar there is an optimization check [1]: if 'st_size' reports more data than 'st_blocks' can hold, then tar considers the file sparse (and does additional steps).

It looks like btrfs doesn't show the correct value in 'st_blocks' until the data are synced. ATM, the following can happen:

a) some "tool" creates a sparse file
b) that tool does not sync explicitly and exits
c) tar is called immediately after that to archive the sparse file
d) tar considers [2] the file to be completely sparse (because st_blocks is zero) and archives no data. Here comes data loss.

Because we fixed 'btrfs' to report non-zero 'st_blocks' when the file data is small and in-lined (no real data blocks), I consider this too a bug in btrfs worth fixing.

[1] http://git.savannah.gnu.org/cgit/paxutils.git/tree/lib/system.h?id=ec72abd9dd63bbff4534ec77e97b1a6cadfc3cf8#n392
[2] http://git.savannah.gnu.org/cgit/tar.git/tree/src/sparse.c?id=ac065c57fdc1788a2769fb119ed0c8146e1b9dd6#n273

Tested on kernel: kernel-4.5.7-300.fc24.x86_64

Originally reported here, reproducer available there: https://bugzilla.redhat.com/show_bug.cgi?id=1352061

Pavel
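For context, the tar-side check referenced in [1] and [2] boils down to comparing st_size against the bytes that st_blocks can account for. Here is a minimal C sketch of that heuristic (the function name `looks_entirely_sparse` is illustrative; GNU tar's actual sparse detection is more elaborate):

```c
#include <assert.h>
#include <sys/stat.h>

/* st_blocks counts 512-byte units (the ST_NBLOCKSIZE convention
 * GNU tar uses), independent of the filesystem block size. */
#define ST_NBLOCKSIZE 512LL

/* Tar-style sparseness heuristic: if the allocated blocks cannot
 * hold st_size bytes, assume the file contains holes.  With the
 * btrfs behaviour described above, an unsynced file can report
 * st_blocks == 0 while st_size > 0, so this check classifies it
 * as entirely sparse and tar archives no data at all. */
static int looks_entirely_sparse(const struct stat *st)
{
    return (long long)st->st_blocks * ST_NBLOCKSIZE
           < (long long)st->st_size;
}
```

An unsynced file on the affected kernel reports st_blocks == 0 with a non-zero st_size, so the predicate is true and tar writes a hole where the data should be.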
Re: btrfs/113: Assertion failure
On Friday, July 01, 2016 04:25:52 PM Josef Bacik wrote:
> On 07/01/2016 12:11 PM, Chandan Rajendra wrote:
> > Sorry, Forgot to add the mailing list to CC. Doing it now ...
> >
> >> While running btrfs/113, I see the following call trace,
> >>
> >> [ 182.272009] BTRFS: assertion failed: !current->journal_info || flush != BTRFS_RESERVE_FLUSH_ALL, file: /home/chandan/repos/linux/fs/btrfs/extent-tree.c, line: 5131
> >> [ 182.274010] [ cut here ]
> >> [ 182.274685] kernel BUG at /home/chandan/repos/linux/fs/btrfs/ctree.h:3347!
> >> [ 182.274982] invalid opcode: [#1] SMP DEBUG_PAGEALLOC
> >> [ 182.274982] Modules linked in:
> >> [ 182.274982] CPU: 2 PID: 2911 Comm: xfs_io Not tainted 4.6.0-g5027553-dirty #29
> >> [ 182.274982] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> >> [ 182.274982] task: 8818a4c3a400 ti: 8818aec4c000 task.ti: 8818aec4c000
> >> [ 182.274982] RIP: 0010:[] [] assfail+0x1a/0x1c
> >> [ 182.274982] RSP: 0018:8818aec4f5d8 EFLAGS: 00010282
> >> [ 182.274982] RAX: 0097 RBX: 0003 RCX: 8203dc18
> >> [ 182.274982] RDX: 0001 RSI: 0286 RDI: 822c8ccc
> >> [ 182.274982] RBP: 8818aec4f5d8 R08: fffe R09:
> >> [ 182.274982] R10: 0005 R11: 028d R12: 8818b44ff000
> >> [ 182.274982] R13: 0003 R14: 8818a54a01f0 R15: 8818a4f36150
> >> [ 182.274982] FS: 7f54fbc6b700() GS:88193350() knlGS:
> >> [ 182.274982] CS: 0010 DS: ES: CR0: 80050033
> >> [ 182.274982] CR2: 0061eba0 CR3: 0018aec54000 CR4: 06e0
> >> [ 182.274982] Stack:
> >> [ 182.274982]  8818aec4fa70 8134a891 0002 0003
> >> [ 182.274982]  8818b2502600 8818b44ff000
> >> [ 182.274982]
> >> [ 182.274982] Call Trace:
> >> [ 182.274982] [] __reserve_metadata_bytes+0xb1/0x1fe0
> >> [ 182.274982] [] ? lookup_address+0x23/0x30
> >> [ 182.274982] [] ? _lookup_address_cpa.isra.9+0x2d/0x30
> >> [ 182.274982] [] ? __change_page_attr_set_clr+0xeb/0xc80
> >> [ 182.274982] [] ? lookup_address+0x23/0x30
> >> [ 182.274982] [] ? get_alloc_profile+0x8a/0x1a0
> >> [ 182.274982] [] ? btrfs_get_alloc_profile+0x2b/0x30
> >> [ 182.274982] [] ? can_overcommit+0x9e/0x100
> >> [ 182.274982] [] ? __reserve_metadata_bytes+0xc88/0x1fe0
> >> [ 182.274982] [] ? __alloc_pages_nodemask+0x10d/0xc80
> >> [ 182.274982] [] ? _lookup_address_cpa.isra.9+0x2d/0x30
> >> [ 182.274982] [] ? __change_page_attr_set_clr+0xeb/0xc80
> >> [ 182.274982] [] ? lookup_address+0x23/0x30
> >> [ 182.274982] [] ? __slab_free+0x96/0x2b0
> >> [ 182.274982] [] ? __probe_kernel_read+0x39/0x90
> >> [ 182.274982] [] ? insert_state+0xc9/0x150
> >> [ 182.274982] [] ? add_delayed_ref_tail_merge+0x2e/0x350
> >> [ 182.274982] [] reserve_metadata_bytes+0x1f/0xe0
> >> [ 182.274982] [] btrfs_block_rsv_add+0x26/0x50
> >> [ 182.274982] [] ? free_extent_state+0x15/0x20
> >> [ 182.274982] [] btrfs_delalloc_reserve_metadata+0x13e/0x490
> >> [ 182.274982] [] btrfs_delalloc_reserve_space+0x2a/0x50
> >> [ 182.274982] [] btrfs_truncate_block+0x8a/0x430
> >> [ 182.274982] [] ? generic_bin_search.constprop.35+0x86/0x1e0
> >> [ 182.274982] [] truncate_inline_extent+0x157/0x260
> >> [ 182.274982] [] ? btrfs_search_slot+0x86c/0x990
> >> [ 182.274982] [] ? free_extent_map+0x4c/0xa0
> >> [ 182.274982] [] btrfs_truncate_inode_items+0xba7/0xdc0
> >> [ 182.274982] [] btrfs_truncate+0x168/0x280
> >> [ 182.274982] [] btrfs_setattr+0x214/0x320
> >> [ 182.274982] [] notify_change+0x1dc/0x380
> >> [ 182.274982] [] do_truncate+0x61/0xa0
> >> [ 182.274982] [] do_sys_ftruncate.constprop.17+0xf9/0x160
> >> [ 182.274982] [] SyS_ftruncate+0x9/0x10
> >> [ 182.274982] [] entry_SYSCALL_64_fastpath+0x13/0x8f
> >> [ 182.274982] Code: 48 c7 c7 48 14 df 81 48 89 e5 e8 ac ac d3 ff 0f 0b 55 89 d1 31 c0 48 89 f2 48 89 fe 48 c7 c7 48 14 df 81 48 89 e5 e8 90 ac d3 ff <0f> 0b 55 b9 11 00 00 00 48 89 e5 41 55 49 89 f5 41 54 49 89 d4
> >> [ 182.274982] RIP [] assfail+0x1a/0x1c
> >> [ 182.274982] RSP
> >> [ 182.327207] ---[ end trace 44721e14eef0a6b2 ]---
> >>
> >> Basically btrfs_truncate() starts a transaction, btrfs_truncate_inode_items() encounters an inline extent and invokes btrfs_truncate_block(). btrfs_truncate_block() tries to reserve delalloc (both data & metadata) space. While doing so it passes BTRFS_RESERVE_FLUSH_ALL as an argument. Since we already have a transaction running, we fail the following ASSERT()
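For readers following along, the invariant the failing ASSERT() encodes can be sketched in plain C. This is a hedged model of the condition only, not kernel code: the names `reserve_flush`, `FLUSH_ALL`, `in_transaction`, and `reserve_is_safe` are illustrative stand-ins for btrfs's `BTRFS_RESERVE_FLUSH_ALL` and the `current->journal_info` check in `__reserve_metadata_bytes()`:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for btrfs's reservation flush modes; the one
 * that matters here corresponds to BTRFS_RESERVE_FLUSH_ALL. */
enum reserve_flush { FLUSH_NONE, FLUSH_LIMIT, FLUSH_ALL };

/* Stand-in for current->journal_info being non-NULL, i.e. the task
 * already holds a transaction handle. */
static bool in_transaction = false;

/* Mirrors the asserted invariant: a task that already runs a
 * transaction must not request a full flush, because a full flush
 * may itself need to commit a transaction, risking deadlock. */
static bool reserve_is_safe(enum reserve_flush flush)
{
    return !in_transaction || flush != FLUSH_ALL;
}
```

The call chain in the trace violates exactly this: btrfs_truncate() has already attached a transaction to the task, yet btrfs_truncate_block() still requests a full-flush reservation.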