Re: [PATCH RFC v3 1/5] Revert "btrfs: add support for processing pending changes" related commits
Original Message
Subject: Re: [PATCH RFC v3 1/5] Revert "btrfs: add support for processing pending changes" related commits
From: Qu Wenruo
To: dste...@suse.cz, linux-btrfs@vger.kernel.org, miao...@huawei.com
Date: 2015-01-26 08:37

Original Message
Subject: Re: [PATCH RFC v3 1/5] Revert "btrfs: add support for processing pending changes" related commits
From: David Sterba
To: Qu Wenruo
Date: 2015-01-23 22:57

On Fri, Jan 23, 2015 at 05:31:41PM +0800, Qu Wenruo wrote:
>> For mount option change, later patches will introduce a copy-and-update
>> method and rwsem protection to keep mount options consistent during a
>> transaction.
>
> That's a better approach, for the mount options.

I'm glad that you like this method. Although the description in this patch
is outdated: it is now a per-transaction mount option. Sorry for the
confusion.

>> For the sysfs interface to change label/features, it will keep the same
>> behavior as 'btrfs prop set', so pending changes are also not needed.
>
> This still leaves the transaction commit inside the sysfs handler, which
> was one of the points not to do that. The call stack looks like this,
> e.g. from the label handler:
>
> [169148.523158] WARNING: CPU: 1 PID: 2044 at fs/btrfs/sysfs.c:394 btrfs_label_store+0x135/0x190 [btrfs]()
> [169148.533925] Modules linked in: btrfs dm_flakey rpcsec_gss_krb5 loop [last unloaded: btrfs]
> [169148.536950] CPU: 1 PID: 2044 Comm: bash Tainted: G W 3.19.0-rc5-default+ #211
> [169148.536952] Hardware name: Intel Corporation Santa Rosa platform/Matanzas, BIOS TSRSCRB1.86C.0047.B00.0610170821 10/17/06
> [169148.536954] 018a 88007a753dc8 81a9898b 018a
> [169148.536963] 88007a753e08 81077f65 880077fb0100
> [169148.536972] 880075dc 880077fbff00 0009 880075dc06d0
> [169148.536980] Call Trace:
> [169148.536983] [] dump_stack+0x4f/0x6c
> [169148.536991] [] warn_slowpath_common+0x95/0xe0
> [169148.537000] [] warn_slowpath_null+0x1a/0x20
> [169148.537005] [] btrfs_label_store+0x135/0x190 [btrfs]
> [169148.537030] [] kobj_attr_store+0x17/0x20
> [169148.537037] [] sysfs_kf_write+0x4f/0x70
> [169148.537044] [] kernfs_fop_write+0x128/0x180
> [169148.537051] [] vfs_write+0xd4/0x1d0
> [169148.537059] [] SyS_write+0x59/0xd0
> [169148.537070] [] system_call_fastpath+0x12/0x17
>
> Lockdep shows these locks held:
>
> [169148.537296] 4 locks held by bash/2044:
> [169148.537309] #0: (sb_writers#5){.+.+.+}, at: [] vfs_write+0x1b0/0x1d0
> [169148.537319] #1: (&of->mutex){+.+.+.}, at: [] kernfs_fop_write+0x8e/0x180
> [169148.537330] #2: (s_active#214){.+.+.+}, at: [] kernfs_fop_write+0x96/0x180
> [169148.537342] #3: (tasklist_lock){.+.+..}, at: [] debug_show_all_locks+0x44/0x1e0
>
> #3 is from lockdep
> #2 is not really a lock, annotated vfs atomic counter
> #0 is annotated atomic, the freezing barrier
> #1 is a kernfs mutex; AFAICS it's per-file, but I don't like to see the
> lock dependency here. That's a lock we can see now, but it's outside of
> btrfs or the vfs. It's a matter of precaution.

Thanks for pointing out the problem. It makes sense to delay it. But we
have the btrfs workqueue, so why not put it into the "worker" workqueue?
If using this method, we can just wrap btrfs_ioctl_set_fslabel() and queue
it to fs_info->workers. This can avoid the lockdep problem, but the
behavior is still inconsistent with the synchronous ioctl method. Although
not perfect, it should be good enough and still clean enough.

Wait a second, #1 is a mutex, so I didn't quite understand the problem.
Is it just because it is not a btrfs/vfs mutex that we want to avoid it?
That seems not convincing enough for me...

For the readonly/freeze check, I prefer an extra vfsmount from
sb->s_mounts, using mnt_want_write() (to handle RO) and the transaction
(to handle freeze). So IMHO it just needs some small tweaks on the
original implementation.

What do you think about such a method?

Thanks,
Qu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] btrfs-progs: Make csum tree rebuild work with extent tree rebuild.
Before this patch, csum tree rebuild does not work with extent tree
rebuild, since extent tree rebuild only builds up basic block groups,
while csum tree rebuild needs data extents to rebuild from. So if one uses
btrfsck with both --init-csum-tree and --init-extent-tree, the csum tree
will be empty and tons of "missing csum" errors will be output.

This patch allows csum tree rebuild to get its data from the fs/subvol
trees, using regular file extents (currently the only users of the csum
tree).

Signed-off-by: Qu Wenruo
---
 cmds-check.c | 158 +--
 1 file changed, 155 insertions(+), 3 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 45d3468..bafa743 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -8534,8 +8534,141 @@ static int populate_csum(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-static int fill_csum_tree(struct btrfs_trans_handle *trans,
-			  struct btrfs_root *csum_root)
+static int fill_csum_tree_from_one_fs(struct btrfs_trans_handle *trans,
+				      struct btrfs_root *csum_root,
+				      struct btrfs_root *cur_root)
+{
+	struct btrfs_path *path;
+	struct btrfs_key key;
+	struct extent_buffer *node;
+	struct btrfs_file_extent_item *fi;
+	char *buf = NULL;
+	u64 start = 0;
+	u64 len = 0;
+	int slot = 0;
+	int ret = 0;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return -ENOMEM;
+	buf = malloc(cur_root->fs_info->csum_root->sectorsize);
+	if (!buf) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	key.objectid = 0;
+	key.offset = 0;
+	key.type = 0;
+
+	ret = btrfs_search_slot(NULL, cur_root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+	/* Iterate all regular file extents and fill its csum */
+	while (1) {
+		btrfs_item_key_to_cpu(path->nodes[0], &key, path->slots[0]);
+
+		if (key.type != BTRFS_EXTENT_DATA_KEY)
+			goto next;
+		node = path->nodes[0];
+		slot = path->slots[0];
+		fi = btrfs_item_ptr(node, slot, struct btrfs_file_extent_item);
+		if (btrfs_file_extent_type(node, fi) != BTRFS_FILE_EXTENT_REG)
+			goto next;
+		start = btrfs_file_extent_disk_bytenr(node, fi);
+		len = btrfs_file_extent_disk_num_bytes(node, fi);
+
+		ret = populate_csum(trans, csum_root, buf, start, len);
+		if (ret == -EEXIST)
+			ret = 0;
+		if (ret < 0)
+			goto out;
+next:
+		/*
+		 * TODO: if next leaf is corrupted, jump to nearest next valid
+		 * leaf.
+		 */
+		ret = btrfs_next_item(cur_root, path);
+		if (ret < 0)
+			goto out;
+		if (ret > 0) {
+			ret = 0;
+			goto out;
+		}
+	}
+
+out:
+	btrfs_free_path(path);
+	free(buf);
+	return ret;
+}
+
+static int fill_csum_tree_from_fs(struct btrfs_trans_handle *trans,
+				  struct btrfs_root *csum_root)
+{
+	struct btrfs_fs_info *fs_info = csum_root->fs_info;
+	struct btrfs_path *path;
+	struct btrfs_root *tree_root = fs_info->tree_root;
+	struct btrfs_root *cur_root;
+	struct extent_buffer *node;
+	struct btrfs_key key;
+	int slot = 0;
+	int ret = 0;
+
+	path = btrfs_alloc_path();
+	if (!path)
+		return -ENOMEM;
+
+	key.objectid = BTRFS_FS_TREE_OBJECTID;
+	key.offset = 0;
+	key.type = BTRFS_ROOT_ITEM_KEY;
+
+	ret = btrfs_search_slot(NULL, tree_root, &key, path, 0, 0);
+	if (ret < 0)
+		goto out;
+	if (ret > 0) {
+		ret = -ENOENT;
+		goto out;
+	}
+
+	while (1) {
+		node = path->nodes[0];
+		slot = path->slots[0];
+		btrfs_item_key_to_cpu(node, &key, slot);
+		if (key.objectid > BTRFS_LAST_FREE_OBJECTID)
+			goto out;
+		if (key.type != BTRFS_ROOT_ITEM_KEY)
+			goto next;
+		if (!is_fstree(key.objectid))
+			goto next;
+		key.offset = (u64)-1;
+
+		cur_root = btrfs_read_fs_root(fs_info, &key);
+		if (IS_ERR(cur_root) || !cur_root) {
+			fprintf(stderr, "Fail to read fs/subvol tree: %lld\n",
+				key.objectid);
+			goto out;
+		}
+		ret = fill_csum_tree_from_one_fs(trans, csum_root, cur_root);
+		if (ret < 0)
+			goto out;
+next:
+		ret = btrfs_nex
Re: 3.19-rc5: Bug 91911: [REGRESSION] rm command hangs big time with deleting a lot of files at once
On Fri, Jan 23, 2015 at 02:38:09PM +, Holger Hoffstätte wrote:
> On Fri, 23 Jan 2015 15:01:28 +0100, Martin Steigerwald wrote:
>
>> Hi!
>>
>> Anyone seen this?
>>
>> Reported as:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=91911
>
> You might be interested in:
>
> https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git/commit/?h=evict-softlockup&id=29249e14d6e3379a5c4bb098dd4beddfefbc606f
>
> and
>
> https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git/commit/?h=evict-softlockup&id=e4a58b71ff981b098ac3371f4d573dc6a90006ce
>
> I'm sure everyone would love to hear how this works out for you ;-)

I merged both commits and I've been running with them since Friday.
Several softlockups since then, in unlinkat() and renameat2(). Some
typical stacks:

[] ? free_extent_state.part.29+0x34/0xb0
[] ? free_extent_state+0x25/0x30
[] ? __set_extent_bit+0x3aa/0x4f0
[] ? _raw_spin_unlock_irqrestore+0x32/0x70
[] ? get_parent_ip+0x11/0x50
[] schedule+0x29/0x70
[] lock_extent_bits+0x1b0/0x200
[] ? add_wait_queue+0x60/0x60
[] btrfs_evict_inode+0x139/0x550
[] evict+0xb8/0x190
[] iput+0x105/0x1a0
[] do_unlinkat+0x189/0x2d0
[] ? SyS_newlstat+0x2a/0x40
[] ? trace_hardirqs_on_thunk+0x3a/0x3c
[] SyS_unlink+0x16/0x20
[] system_call_fastpath+0x1a/0x1f

Note that the above stack is _very_ typical. I've caught machines with
well over 100 processes stuck in "D" state with an identical stack trace
from "btrfs_evict_inode" to "system_call_fastpath".

[] lock_extent_bits+0x1b0/0x200
[] btrfs_evict_inode+0x12a/0x540
[] evict+0xb8/0x190
[] iput+0x105/0x1a0
[] __dentry_kill+0x190/0x200
[] dput+0xba/0x190
[] SyS_renameat2+0x510/0x580
[] SyS_rename+0x1e/0x20
[] system_call_fastpath+0x16/0x1b
[] 0x

The above is a typical renameat2() softlockup stack.

[] wait_on_page_bit+0xb8/0xc0
[] shrink_page_list+0x8c4/0xb20
[] shrink_inactive_list+0x19d/0x500
[] shrink_lruvec+0x59d/0x760
[] shrink_zone+0x83/0x1c0
[] do_try_to_free_pages+0x16e/0x460
[] try_to_free_mem_cgroup_pages+0x9e/0x180
[] mem_cgroup_reclaim+0x4e/0xe0
[] try_charge+0x15d/0x500
[] mem_cgroup_try_charge+0x8d/0x1a0
[] __add_to_page_cache_locked+0x8f/0x280
[] add_to_page_cache_lru+0x28/0x80
[] pagecache_get_page+0xab/0x1d0
[] alloc_extent_buffer+0xe4/0x380 [btrfs]
[] btrfs_find_create_tree_block+0x1f/0x30 [btrfs]
[] readahead_tree_block+0x1f/0x60 [btrfs]
[] reada_for_balance+0x160/0x1e0 [btrfs]
[] btrfs_search_slot+0x687/0xac0 [btrfs]
[] btrfs_lookup_inode+0x2f/0xa0 [btrfs]
[] __btrfs_update_delayed_inode+0x65/0x210 [btrfs]
[] btrfs_commit_inode_delayed_inode+0x13a/0x150 [btrfs]
[] btrfs_evict_inode+0x2ca/0x520 [btrfs]
[] evict+0xb8/0x190
[] iput+0x105/0x1a0
[] __dentry_kill+0x1b8/0x210
[] dput+0xba/0x190
[] SyS_renameat2+0x440/0x530
[] SyS_rename+0x1e/0x20
[] system_call_fastpath+0x1a/0x1f
[] 0x

The last one is a little older (from 3.17.4) but it's a bit more
interesting. Since mem cgroups were involved, I allocated a lot more RAM
to the cgroup and it seems to have helped reduce the frequency of this
bug occurring.

> -h
Resolved...ish. was: Re: spurious I/O errors from btrfs...at the caching layer?
It seems that the rate of spurious I/O errors varies most according to the
vm.vfs_cache_pressure sysctl. At '10' the I/O errors occur so often that
building a kernel is impossible. At '100' I can't reproduce even a single
I/O error.

I guess this is my own fault for using non-default sysctl parameters,
although I wouldn't expect any value of this sysctl to cause these
symptoms...  :-P

On Sun, Jan 25, 2015 at 11:50:36AM -0500, Zygo Blaxell wrote:
> On Sat, Jan 24, 2015 at 01:06:01PM -0500, Zygo Blaxell wrote:
>> I am seeing a lot of spurious I/O errors that look like they come from
>> the cache-facing side of btrfs. While running a heavy load with some
>> extent-sharing (e.g. building 20 Linux kernels at once from source trees
>> copied with 'cp -a --reflink=always'), some files will return spurious
>> EIO on read. It happens often enough to prevent a Linux kernel build
>> about 1/3 of the time.
> [...]
>> Observed from 3.17..3.18.3. All filesystems affected use skinny-metadata.
>> No filesystems that are not using skinny-metadata seem to have this
>> problem.
>
> I ran a test overnight using 3.18.3 on a freshly formatted filesystem with
> no skinny-metadata.
>
> The test consisted of creating reflink copies of a Linux kernel source
> tree and running kernel builds in each copy simultaneously, like this:
>
>     # assume you have a ready-to-build kernel tree in 'linux'
>     for x in $(seq 1 5); do
>         cp -a --reflink linux linux-$x
>     done
>
>     # build all the kernels at once
>     for x in $(seq 1 5); do
>         (cd linux-$x && make -j10 2>&1 | tee make.log) &
>     done
>
>     wait
>     # then tail all the make.logs and see how many failed due to
>     # I/O errors
>
> Spurious I/O errors occurred with as few as two concurrent kernel builds.
>
> The test machine has 16GB of RAM and the filesystem is also 16GB,
> RAID1 on two spinning disks.
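[Editor's note: for anyone wanting to reproduce or undo this, the knob in question is the stock procfs/sysctl interface; a sketch of the usual commands (the config file name is an arbitrary example):]

```shell
# Inspect the current value (the kernel default is 100)
cat /proc/sys/vm/vfs_cache_pressure

# Set it back to the default for the running system (needs root)
sysctl -w vm.vfs_cache_pressure=100

# Or make the setting persistent across reboots
echo 'vm.vfs_cache_pressure = 100' > /etc/sysctl.d/99-vfs-cache-pressure.conf
```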
Re: [PATCH] btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
On Fri, 23 Jan 2015 17:59:49 +0100, David Sterba wrote:
> On Wed, Jan 21, 2015 at 03:04:02PM +0800, Miao Xie wrote:
>>> Pending changes are *not* only mount options. Feature change and label
>>> change are also pending changes if using sysfs.
>>
>> My miss, I didn't notice feature and label change by sysfs.
>>
>> But the implementation of feature and label change by sysfs is wrong; we
>> can not change them without write permission.
>
> Label change does not happen if the fs is readonly. If the filesystem is
> RW and the label is changed through sysfs, then remount to RO will sync
> the filesystem and the new label will be saved.
>
> The sysfs features write handler is missing that protection though, I'll
> send a patch.

First, that R/O protection is not enough: there is a race between R/O
remount and label/feature change. Please consider the following case:

    Remount R/O task            Label/Attr change task
                                check R/O
    remount ro
                                change label/feature

Second, it forgets to handle the freezing event.

>>> For freeze, it's not the same problem since the fs will be unfrozen
>>> sooner or later and a transaction will be initiated.
>>
>> You can not assume the operations of the users; they might freeze the fs
>> and then shut down the machine.
>
> The semantics of freezing should make the on-device image consistent,
> but still keep some changes in memory.

>>>> For example, if we change the features/label through sysfs, and then
>>>> umount the fs, it is different from pending change.
>>>
>>> No, now features/label changing using sysfs both use pending changes to
>>> do the commit. See BTRFS_PENDING_COMMIT bit.
>>> So freeze -> change features/label -> sync will still cause the deadlock
>>> in the same way, and you can try it yourself.
>>
>> As I said above, the implementation of sysfs feature and label change is
>> wrong; it is better to separate them from the pending mount option
>> change, and make the sysfs feature and label change be done in the
>> context of a transaction after getting the write permission. If so, we
>> needn't do anything special when syncing the fs.
>
> That would mean to drop the write support of sysfs files that change
> global filesystem state (label and features right now). This would leave
> only the ioctl way to do that. I'd like to keep the sysfs write support
> though, for ease of use from scripts and languages that are not
> ioctl-friendly.

No, not drop the write support of sysfs, just fix the bug and make it
change the label and features in a writable context.

Thanks
Miao
Re: 3.19-rc5: Bug 91911: [REGRESSION] rm command hangs big time with deleting a lot of files at once
On Fri, Jan 23, 2015 at 06:29:40PM -0500, Zygo Blaxell wrote:
> On Fri, Jan 23, 2015 at 03:01:28PM +0100, Martin Steigerwald wrote:
> > Hi!
> >
> > Anyone seen this?
> >
> > Reported as:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=91911
>
> I have seen something like this since 3.15.
>
> I've also seen its cousin, which gets stuck in evict_inode, but the stacks
> of the hanging processes start from renameat2() instead of unlinkat().
> I haven't seen the renameat2() variant of this bug since 3.18-rc6.

Since I wrote the above paragraph two days ago, I've seen the
renameat2()/btrfs_evict_inode bug twice on 3.18.3.  :-P

> > I just want to get rid of some 127000+ akonadi lost+found files, any
> > delete command I start just gets rid of some thousands and then hangs.
> >
> > merkaba:~> btrfs fi df /home
> > Data, RAID1: total=160.92GiB, used=111.09GiB
> > System, RAID1: total=32.00MiB, used=48.00KiB
> > Metadata, RAID1: total=5.99GiB, used=2.49GiB
> > GlobalReserve, single: total=512.00MiB, used=0.00B
> > merkaba:~> btrfs fi sh /home
> > Label: 'home'  uuid: […]
> >     Total devices 2 FS bytes used 113.58GiB
> >     devid 1 size 170.00GiB used 166.94GiB path /dev/mapper/msata-home
> >     devid 2 size 170.00GiB used 166.94GiB path /dev/mapper/sata-home
> >
> > Btrfs v3.18
> >
> > merkaba:/home/ms/.local/share/akonadi#1> find file_lost+found | wc -l
> > 110070
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found -delete &
> > [4] 2660
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 101645
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found -delete &
> > [5] 2663
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found -delete &
> > [6] 2664
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 91369
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 89844
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 88042
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found -delete &
> > [7] 2671
> > merkaba:/home/ms/.local/share/akonadi> uname -a
> > Linux merkaba 3.19.0-rc5-tp520-trim-all-bgroups+ #18 SMP PREEMPT
> > Mon Jan 19 09:58:33 CET 2015 x86_64 GNU/Linux
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found -delete &
> > [8] 2694
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found -delete &
> > [9] 2700
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 67278
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 65244
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 63713
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 62725
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 62213
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 61213
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found -delete &
> > [10] 2715
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 60470
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found -delete &
> > [11] 2718
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 53303
> >
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 51396
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 51396
> > merkaba:/home/ms/.local/share/akonadi> find file_lost+found | wc -l
> > 51396
> >
> > merkaba:/home/ms/.local/share/akonadi> ps aux | grep find
> > ms    2647  0.4  0.2  43096 36204 pts/3 D+  14:45  0:00 find file_lost+found -delete
> > root  2651  0.3  0.2  42568 35688 pts/0 DN  14:45  0:00 find file_lost+found -delete
> > root  2654  2.7  0.2  44544 35652 pts/0 DN  14:46  0:05 find file_lost+found -delete
> > root  2657  0.3  0.2  44016 35048 pts/0 DN  14:46  0:00 find file_lost+found -delete
> > root  2660  2.1  0.1  39136 32280 pts/0 DN  14:46  0:03 find file_lost+found -delete
> > root  2663  0.2  0.1  36760 29988 pts/0 DN  14:46  0:00 find file_lost+found -delete
> > root  2664  3.3  0.1  36760 29888 pts/0 DN  14:46  0:05 find file_lost+found -delete
> > root  2671  0.9  0.1  33856 26984 pts/0 DN  14:46  0:01 find file_lost+found -delete
> > root  2694  1.1  0.1  32404 25380 pts/0 DN  14:47  0:01 find file_lost+found -delete
> > root  2700  4.0  0.1  30952 24064 pts/0 DN  14:47  0:04 find file_lost+found -delete
> > root  2715  0.3  0.1  26200 19332 pts/0 DN  14:47  0:00 find file_lost+found -delete
> > root  2718  4.1  0.1  26068 19068 pts/0 DN  14:47  0:02 find file_lost+found -delete
> > root  2840  0.0  0.0  12672  1592 pts/0 S+  14:49  0:00 grep find
> > merkaba:/home/ms/.local/share/akonadi> ps aux | grep rm
Re: spurious I/O errors from btrfs...at the caching layer?
On Sat, Jan 24, 2015 at 01:06:01PM -0500, Zygo Blaxell wrote:
> I am seeing a lot of spurious I/O errors that look like they come from
> the cache-facing side of btrfs. While running a heavy load with some
> extent-sharing (e.g. building 20 Linux kernels at once from source trees
> copied with 'cp -a --reflink=always'), some files will return spurious
> EIO on read. It happens often enough to prevent a Linux kernel build
> about 1/3 of the time.
[...]
> Observed from 3.17..3.18.3. All filesystems affected use skinny-metadata.
> No filesystems that are not using skinny-metadata seem to have this
> problem.

I ran a test overnight using 3.18.3 on a freshly formatted filesystem with
no skinny-metadata.

The test consisted of creating reflink copies of a Linux kernel source
tree and running kernel builds in each copy simultaneously, like this:

    # assume you have a ready-to-build kernel tree in 'linux'
    for x in $(seq 1 5); do
        cp -a --reflink linux linux-$x
    done

    # build all the kernels at once
    for x in $(seq 1 5); do
        (cd linux-$x && make -j10 2>&1 | tee make.log) &
    done

    wait
    # then tail all the make.logs and see how many failed due to
    # I/O errors

Spurious I/O errors occurred with as few as two concurrent kernel builds.

The test machine has 16GB of RAM and the filesystem is also 16GB,
RAID1 on two spinning disks.
Re: btrfs convert running out of space
On Fri, 23 Jan 2015 08:46:23 + (UTC), Duncan <1i5t5.dun...@cox.net> wrote:

> Marc Joliet posted on Fri, 23 Jan 2015 08:54:41 +0100 as excerpted:
>
> > On Fri, 23 Jan 2015 04:34:19 + (UTC), Duncan <1i5t5.dun...@cox.net>
> > wrote:
> >
> >> Gareth Pye posted on Fri, 23 Jan 2015 08:58:08 +1100 as excerpted:
> >>
> >> > What are the chances that splitting all the large files up into sub
> >> > gig pieces, finish convert, then recombine them all will work?
> >>
> [...]
> >> Option 2: Since new files should be created using the desired target
> >> mode (raid1 IIRC), you may actually be able to move them off and
> >> immediately back on, so they appear as new files and thus get created
> >> in the desired mode.
> >
> > With current coreutils, wouldn't that also work if he moves the files
> > to another (temporary) subvolume? (And with future coreutils, by
> > copying the files without using reflinks and then removing the
> > originals.)
>
> If done correctly, yes.
>
> However, "off the filesystem" is far simpler to explain over email or
> the like, and is much less ambiguous in terms of "OK, but did you do it
> 'correctly'" if it doesn't end up helping. If it doesn't work, it
> doesn't work. If "move to a different subvolume under specific
> conditions in terms of reflinking and the like" doesn't work, there's
> always the question of whether it /really/ didn't work, or if somehow
> the instructions weren't clear enough and thus failure was simply the
> result of a failure to fully meet the technical requirements.
>
> Of course if I was doing it myself, and if I was absolutely sure of the
> technical details in terms of what command I had to use to be /sure/ it
> didn't simply reflink and thus defeat the whole exercise, I'd likely use
> the shortcut. But in reality, if it didn't work I'd be second-guessing
> myself and would probably move everything entirely off and back on to be
> sure, and knowing that, I'd probably do it the /sure/ way in the first
> place, avoiding the chance of having to redo it to prove to myself that
> I'd done it correctly.
>
> Of course, having demonstrated to myself that it worked, if I ever had
> the problem again, I might try the shortcut, just to demonstrate to my
> own satisfaction the full theory that the effect of the shortcut was the
> same as the effect of doing it the longer and more fool-proof way. But
> of course I'd rather not have the opportunity to try that second-half
> proof. =:^)
>
> Make sense? =:^)

I was going to argue that my suggestion was hardly difficult to get right,
but then I read that cp defaults to --reflink=always and that it is not
possible to turn off reflinks (i.e., there is no --reflink=never). So then
one would have to consider alternatives like dd, and, well, you are right,
I suppose :) .

(Of course, with the *current* version of coreutils, the simple
"mv somefile tmp_subvol/; mv tmp_subvol/somefile ." will still work.)

--
Marc Joliet
--
"People who think they know everything really annoy those of us who know
we don't" - Bjarne Stroustrup