[PATCH] fs: encode_fh: return FILEID_INVALID if invalid fid_type
From: Namjae Jeon namjae.j...@samsung.com

This patch is a follow up on below patch:

[PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type
commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63

Signed-off-by: Namjae Jeon namjae.j...@samsung.com
Signed-off-by: Vivek Trivedi t.vi...@samsung.com
Acked-by: Steven Whitehouse swhit...@redhat.com
---
 fs/btrfs/export.c   |4 ++--
 fs/ceph/export.c    |4 ++--
 fs/fuse/inode.c     |2 +-
 fs/gfs2/export.c    |4 ++--
 fs/isofs/export.c   |4 ++--
 fs/nilfs2/namei.c   |4 ++--
 fs/ocfs2/export.c   |4 ++--
 fs/reiserfs/inode.c |4 ++--
 fs/udf/namei.c      |4 ++--
 fs/xfs/xfs_export.c |4 ++--
 mm/cleancache.c     |2 +-
 mm/shmem.c          |2 +-
 12 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
index 614f34a..81ee29e 100644
--- a/fs/btrfs/export.c
+++ b/fs/btrfs/export.c
@@ -22,10 +22,10 @@ static int btrfs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
 	if (parent && (len < BTRFS_FID_SIZE_CONNECTABLE)) {
 		*max_len = BTRFS_FID_SIZE_CONNECTABLE;
-		return 255;
+		return FILEID_INVALID;
 	} else if (len < BTRFS_FID_SIZE_NON_CONNECTABLE) {
 		*max_len = BTRFS_FID_SIZE_NON_CONNECTABLE;
-		return 255;
+		return FILEID_INVALID;
 	}
 	len = BTRFS_FID_SIZE_NON_CONNECTABLE;
diff --git a/fs/ceph/export.c b/fs/ceph/export.c
index ca3ab3f..16796be 100644
--- a/fs/ceph/export.c
+++ b/fs/ceph/export.c
@@ -81,7 +81,7 @@ static int ceph_encode_fh(struct inode *inode, u32 *rawfh, int *max_len,
 		if (parent_inode) {
 			/* nfsd wants connectable */
 			*max_len = connected_handle_length;
-			type = 255;
+			type = FILEID_INVALID;
 		} else {
 			dout("encode_fh %p\n", dentry);
 			fh->ino = ceph_ino(inode);
@@ -90,7 +90,7 @@ static int ceph_encode_fh(struct inode *inode, u32 *rawfh, int *max_len,
 		}
 	} else {
 		*max_len = handle_length;
-		type = 255;
+		type = FILEID_INVALID;
 	}
 	if (dentry)
 		dput(dentry);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 9876a87..973e8f0 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -679,7 +679,7 @@ static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len,
 	if (*max_len < len) {
 		*max_len = len;
-		return 255;
+		return FILEID_INVALID;
 	}
 	nodeid = get_fuse_inode(inode)->nodeid;
diff --git a/fs/gfs2/export.c b/fs/gfs2/export.c
index 4767774..9973df4 100644
--- a/fs/gfs2/export.c
+++ b/fs/gfs2/export.c
@@ -37,10 +37,10 @@ static int gfs2_encode_fh(struct inode *inode, __u32 *p, int *len,
 	if (parent && (*len < GFS2_LARGE_FH_SIZE)) {
 		*len = GFS2_LARGE_FH_SIZE;
-		return 255;
+		return FILEID_INVALID;
 	} else if (*len < GFS2_SMALL_FH_SIZE) {
 		*len = GFS2_SMALL_FH_SIZE;
-		return 255;
+		return FILEID_INVALID;
 	}
 	fh[0] = cpu_to_be32(ip->i_no_formal_ino >> 32);
diff --git a/fs/isofs/export.c b/fs/isofs/export.c
index 2b4f235..12088d8 100644
--- a/fs/isofs/export.c
+++ b/fs/isofs/export.c
@@ -125,10 +125,10 @@ isofs_export_encode_fh(struct inode *inode,
 	 */
 	if (parent && (len < 5)) {
 		*max_len = 5;
-		return 255;
+		return FILEID_INVALID;
 	} else if (len < 3) {
 		*max_len = 3;
-		return 255;
+		return FILEID_INVALID;
 	}
 	len = 3;
diff --git a/fs/nilfs2/namei.c b/fs/nilfs2/namei.c
index 1d0c0b8..9de78f0 100644
--- a/fs/nilfs2/namei.c
+++ b/fs/nilfs2/namei.c
@@ -517,11 +517,11 @@ static int nilfs_encode_fh(struct inode *inode, __u32 *fh, int *lenp,
 	if (parent && *lenp < NILFS_FID_SIZE_CONNECTABLE) {
 		*lenp = NILFS_FID_SIZE_CONNECTABLE;
-		return 255;
+		return FILEID_INVALID;
 	}
 	if (*lenp < NILFS_FID_SIZE_NON_CONNECTABLE) {
 		*lenp = NILFS_FID_SIZE_NON_CONNECTABLE;
-		return 255;
+		return FILEID_INVALID;
 	}
 	fid->cno = root->cno;
diff --git a/fs/ocfs2/export.c b/fs/ocfs2/export.c
index 322216a..2965116 100644
--- a/fs/ocfs2/export.c
+++ b/fs/ocfs2/export.c
@@ -195,11 +195,11 @@ static int ocfs2_encode_fh(struct inode *inode, u32 *fh_in, int *max_len,
 	if (parent && (len < 6)) {
 		*max_len = 6;
-		type = 255;
+		type = FILEID_INVALID;
 		goto bail;
 	} else if (len < 3) {
 		*max_len = 3;
-		type = 255;
+		type = FILEID_INVALID;
 		goto bail;
 	}
diff --git
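For readers unfamiliar with the encode_fh contract, the pattern this patch touches can be sketched in userspace as follows. This is only an illustration under stated assumptions: demo_encode_fh, the DEMO_* handle types and the DEMO_* sizes are hypothetical names that belong to no filesystem, and only the value 0xff for FILEID_INVALID is taken from the exportfs patch referenced above. When the caller's buffer is too small, the function reports the required length through *max_len and returns FILEID_INVALID instead of a bare 255.

```c
#include <stddef.h>
#include <stdint.h>

/* Same value as FILEID_INVALID in the kernel's enum fid_type. */
#define FILEID_INVALID 0xff

/* Hypothetical handle types and sizes (in 32-bit words), illustration only. */
#define DEMO_FILEID_INO_GEN        1
#define DEMO_FILEID_INO_GEN_PARENT 2
#define DEMO_FID_SIZE              2
#define DEMO_FID_SIZE_CONNECTABLE  4

/*
 * self and parent each point at two words (inode number, generation).
 * parent may be NULL when no connectable handle is wanted.
 * *max_len is the buffer size on entry and the required or used size
 * on return, both counted in 32-bit words.
 */
static int demo_encode_fh(const uint32_t *self, const uint32_t *parent,
			  uint32_t *fh, int *max_len)
{
	int len = *max_len;

	if (parent && len < DEMO_FID_SIZE_CONNECTABLE) {
		/* Buffer too small: tell the caller how much is needed. */
		*max_len = DEMO_FID_SIZE_CONNECTABLE;
		return FILEID_INVALID;
	} else if (len < DEMO_FID_SIZE) {
		*max_len = DEMO_FID_SIZE;
		return FILEID_INVALID;
	}

	fh[0] = self[0];
	fh[1] = self[1];
	if (!parent) {
		*max_len = DEMO_FID_SIZE;
		return DEMO_FILEID_INO_GEN;
	}

	fh[2] = parent[0];
	fh[3] = parent[1];
	*max_len = DEMO_FID_SIZE_CONNECTABLE;
	return DEMO_FILEID_INO_GEN_PARENT;
}
```

A caller that gets FILEID_INVALID back can retry with a buffer of the size stored in *max_len.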
[PATCH] btrfs: accept zero for balance usage filter
The condition can be relaxed to also accept 0, which will delete unoccupied
chunks and does not need space for the actual data relocation. Until there
is an automatic empty chunk reclaim, we can use this as a last resort
option under ENOSPC.

CC: Ilya Dryomov idryo...@gmail.com
Signed-off-by: David Sterba dste...@suse.cz
---

Also needs progs update, but is not required for the kernel side.

 fs/btrfs/volumes.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5cbb7f4..fb871d0 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2623,7 +2623,10 @@ static int chunk_usage_filter(struct btrfs_fs_info *fs_info, u64 chunk_offset,
 	user_thresh = div_factor_fine(cache->key.offset, bargs->usage);
-	if (chunk_used < user_thresh)
+	pr_debug("btrfs: chunk_used %llu, user_thresh %llu\n",
+			(unsigned long long)chunk_used,
+			(unsigned long long)user_thresh);
+	if (chunk_used <= user_thresh)
 		ret = 0;
 	btrfs_put_block_group(cache);
--
1.7.9

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
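The effect of relaxing the comparison can be checked with a small userspace sketch. It assumes the threshold is computed as chunk size times usage divided by 100, which is how div_factor_fine is used in the hunk above; demo_div_factor_fine and demo_chunk_selected are hypothetical stand-ins, not kernel functions. Unlike the kernel filter, which sets ret = 0 for a matching chunk, the helper here returns 1 when the chunk would be picked for balance.

```c
#include <stdint.h>

/* Userspace stand-in for the percentage helper: num * factor / 100. */
static uint64_t demo_div_factor_fine(uint64_t num, unsigned int factor)
{
	if (factor == 100)
		return num;
	return num * factor / 100;
}

/*
 * Returns 1 when a chunk passes the usage filter, 0 otherwise.
 * inclusive == 0 models the old comparison, inclusive != 0 the relaxed one.
 */
static int demo_chunk_selected(uint64_t chunk_used, uint64_t chunk_size,
			       unsigned int usage, int inclusive)
{
	uint64_t user_thresh = demo_div_factor_fine(chunk_size, usage);

	if (inclusive)
		return chunk_used <= user_thresh;
	return chunk_used < user_thresh;
}
```

With usage set to 0 the threshold is 0, so the strict comparison can never match, while the relaxed one matches exactly the chunks that hold no data.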
Re: Raid1 not working as expected
On Mon, Feb 11, 2013 at 04:28:43PM +0400, Alexey Polyakov wrote:
> Hi! I created a filesystem with two devices, a 200 GB partition each,
> raid1 for both data and metadata, and 16k leaf and node size. After
> using it for a day or two I got a checksum error on one file. scrub
> reported about 70 more similar errors.

Did you use a kernel version between 3.8-rc1 and 3.8-rc5?

david
Re: Diff using send-receive code
On Sun, Feb 10, 2013 at 10:21:31PM +0530, nafisa mandliwala wrote:
> Hello, We're a team of 4 final year computer science students and are
> working on generating a diff between file system snapshots using the
> send receive code.

Looks like a duplicated effort. Did you see Mark's initial proposal
https://patchwork.kernel.org/patch/1921821/
and the latest version
https://patchwork.kernel.org/patch/2069961/ ?

Do you build on top of these patches or have your own kernel-side
implementation?

david
btrfs-progs 0.20rc1.1, linux 3.7.6 - new filesystem: failed to recover relocation
I have just completed installing an archlinux machine with 2 btrfs
partitions. One works fine (the rootfs); the other one I cannot mount.
Even after recreating the filesystem I still cannot mount it. I receive
the following output in my dmesg:

[ 6526.037089] device fsid 74cc8eb8-f60a-4797-9ab3-0c8ac4fe847f devid 1 transid 3 /dev/vdb1
[ 6526.037763] btrfs: disk space caching is enabled
[ 6526.038415] btrfs: failed to recover relocation
[ 6526.039716] btrfs: open_ctree failed

This is on a virtual machine using libvirt, a virtio hdd and kvm; btrfsck
reports no errors. An image of this filesystem can be found here:
http://home.react.nl/~sjon/bug-reports/btrfs/btrfs-failed-to-recover-relocation
(29 KiB)

After recreating the same filesystem again on a smaller (100 MiB)
partition, btrfsck reports:

Check tree block failed, want=139264, have=0
Check tree block failed, want=139264, have=0
Check tree block failed, want=139264, have=0
read block failed check_tree_block
Couldn't read chunk root

and btrfs-image segfaults on this. I have dumped the partition with dd, it
can be found here:
http://home.react.nl/~sjon/bug-reports/btrfs/vdb1.raw.gz (98 KiB).

I couldn't find any recent reports mentioning this sort of problem.

Thanks,
Sjon Hortensius
Re: Raid1 not working as expected
On Mon, Feb 11, 2013 at 4:49 PM, David Sterba dste...@suse.cz wrote:
> On Mon, Feb 11, 2013 at 04:28:43PM +0400, Alexey Polyakov wrote:
> Did you use a kernel version between 3.8-rc1 and 3.8-rc5 ?

I used 3.8-rc6 and 3.8-rc7 when working with fs in question, but also I
mounted it under kernel that Ubuntu ships, version number is 3.8.0-4 - I'm
not sure which vanilla version it is based on.

--
Alexey
Re: btrfs-progs 0.20rc1.1, linux 3.7.6 - new filesystem: failed to recover relocation
On Mon, Feb 11, 2013 at 02:09:56PM +0100, Sjon Hortensius wrote:
> I have just completed installing an archlinux machine with 2 btrfs
> partitions, one works fine (the rootfs), the other one I cannot mount.
> Even after recreating the filesystem I still cannot mount it. I receive
> the following output in my dmesg:
>
> [ 6526.037089] device fsid 74cc8eb8-f60a-4797-9ab3-0c8ac4fe847f devid 1 transid 3 /dev/vdb1
> [ 6526.037763] btrfs: disk space caching is enabled
> [ 6526.038415] btrfs: failed to recover relocation
> [ 6526.039716] btrfs: open_ctree failed

What was the size of the filesystem?

> btrfsck reports no errors. An image of this filesystem can be found here:
> http://home.react.nl/~sjon/bug-reports/btrfs/btrfs-failed-to-recover-relocation
> (29 KiB)

Restoring the image produces a 4MB file, which is too small for a usable
filesystem. There's a patch for mkfs to prevent creating such a fs; I'll
get to adding it to progs integration soon.

david
Re: [PATCH] fs: encode_fh: return FILEID_INVALID if invalid fid_type
Acked-by: Sage Weil s...@inktank.com

On Mon, 11 Feb 2013, Namjae Jeon wrote:
> From: Namjae Jeon namjae.j...@samsung.com
>
> This patch is a follow up on below patch:
>
> [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type
> commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63
>
> Signed-off-by: Namjae Jeon namjae.j...@samsung.com
> Signed-off-by: Vivek Trivedi t.vi...@samsung.com
> Acked-by: Steven Whitehouse swhit...@redhat.com
> ---
>  fs/btrfs/export.c   |4 ++--
>  fs/ceph/export.c    |4 ++--
>  fs/fuse/inode.c     |2 +-
>  fs/gfs2/export.c    |4 ++--
>  fs/isofs/export.c   |4 ++--
>  fs/nilfs2/namei.c   |4 ++--
>  fs/ocfs2/export.c   |4 ++--
>  fs/reiserfs/inode.c |4 ++--
>  fs/udf/namei.c      |4 ++--
>  fs/xfs/xfs_export.c |4 ++--
>  mm/cleancache.c     |2 +-
>  mm/shmem.c          |2 +-
>  12 files changed, 21 insertions(+), 21 deletions(-)
>
> [...]
Re: btrfs-progs 0.20rc1.1, linux 3.7.6 - new filesystem: failed to recover relocation
On Mon, Feb 11, 2013 at 2:27 PM, David Sterba dste...@suse.cz wrote:
> On Mon, Feb 11, 2013 at 02:09:56PM +0100, Sjon Hortensius wrote:
>> I have just completed installing an archlinux machine with 2 btrfs
>> partitions, one works fine (the rootfs), the other one I cannot mount.
>> Even after recreating the filesystem I still cannot mount it. I receive
>> the following output in my dmesg:
>>
>> [ 6526.037089] device fsid 74cc8eb8-f60a-4797-9ab3-0c8ac4fe847f devid 1 transid 3 /dev/vdb1
>> [ 6526.037763] btrfs: disk space caching is enabled
>> [ 6526.038415] btrfs: failed to recover relocation
>> [ 6526.039716] btrfs: open_ctree failed
>
> What was the size of the filesystem?

The first filesystem (of which I created the image) was ~ 20 GiB; the dd
dump is from a 100 MiB partition.

>> btrfsck reports no errors. An image of this filesystem can be found here:
>> http://home.react.nl/~sjon/bug-reports/btrfs/btrfs-failed-to-recover-relocation
>> (29 KiB)
>
> Restoring the image produces a 4MB file, which is kind of too small,
> there's a patch for mkfs to prevent creating such a fs. I'll get to
> adding it to progs integration soon.
>
> david

While debugging this my vm-host began throwing segfaults as well, so I
think this problem was caused by something else (namely this:
https://bugzilla.redhat.com/show_bug.cgi?id=893854).

Thanks anyway.
Re: Deleted subvolume reappears and other cleaner issues
On Thu, Jan 31, 2013 at 03:03:06PM +0200, Alex Lyakas wrote:
> # umount may get delayed because of pending-for-deletion subvolumes:
> btrfs_commit_super() locks the cleaner_mutex, so it will wait for the
> cleaner to complete. On the other hand, cleaner will not give up until
> it completes processing all its splice. If currently cleaner is not
> running, then btrfs_commit_super() calls btrfs_clean_old_snapshots()
> directly. So does it make sense:
> - btrfs_commit_super() will not call btrfs_clean_old_snapshots()
> - close_ctree() calls kthread_stop(cleaner_kthread) early, and cleaner
>   thread periodically checks if it needs to exit

This has been on my list of annoyances for a long time and I have WIP
patches to fix it. I'll send what I have for review.

david
Re: experimental raid5/6 code in git
On Sun, Feb 10, 2013 at 03:35:05PM -0700, Gordon Manning wrote:
> Hi,
>
> Is the BTRFS raid code susceptible to RAID-5 write holes? I think with
> the original plan, the problem was avoided by always giving full stripe
> writes to the raid layers. Does the current plan deal with the hole in a
> different manner?

The current code in my git tree does not deal with the raid-5 write hole.
That's the part I'm finishing off now.

-chris
Re: btrfs-progs 0.20rc1.1, linux 3.7.6 - new filesystem: failed to recover relocation
After converting all my raw images to qcow2 the host no longer segfaults,
but I still get a corrupted btrfs filesystem. Please have a look at this
new image at
http://home.react.nl/~sjon/bug-reports/btrfs/btrfs-corrupt.qcow2.gz (16 KiB)

It contains a 20 GiB disk with 1 partition that btrfsck (again) has no
problems with, but I cannot mount it.

On Mon, Feb 11, 2013 at 3:08 PM, Sjon Hortensius s...@hortensius.net wrote:
> On Mon, Feb 11, 2013 at 2:27 PM, David Sterba dste...@suse.cz wrote:
>> On Mon, Feb 11, 2013 at 02:09:56PM +0100, Sjon Hortensius wrote:
>>> I have just completed installing an archlinux machine with 2 btrfs
>>> partitions, one works fine (the rootfs), the other one I cannot mount.
>>> Even after recreating the filesystem I still cannot mount it. I receive
>>> the following output in my dmesg:
>>>
>>> [ 6526.037089] device fsid 74cc8eb8-f60a-4797-9ab3-0c8ac4fe847f devid 1 transid 3 /dev/vdb1
>>> [ 6526.037763] btrfs: disk space caching is enabled
>>> [ 6526.038415] btrfs: failed to recover relocation
>>> [ 6526.039716] btrfs: open_ctree failed
>>
>> What was the size of the filesystem?
>
> The first filesystem (of which I created the image) was ~ 20 GiB; the dd
> dump is from a 100 MiB partition.
>
>>> btrfsck reports no errors. An image of this filesystem can be found here:
>>> http://home.react.nl/~sjon/bug-reports/btrfs/btrfs-failed-to-recover-relocation
>>> (29 KiB)
>>
>> Restoring the image produces a 4MB file, which is kind of too small,
>> there's a patch for mkfs to prevent creating such a fs. I'll get to
>> adding it to progs integration soon.
>>
>> david
>
> While debugging this my vm-host began throwing segfaults as well, so I
> think this problem was caused by something else (namely this:
> https://bugzilla.redhat.com/show_bug.cgi?id=893854).
>
> Thanks anyway.
Re: Fwd: Current State of BTRFS
On Fri, Feb 08, 2013 at 04:12:17PM -0700, Florian Hofmann wrote:
> I ran it as root. The first times there was no output whatsoever. This
> time triggering gave the SysRq : Show Blocked State line. Strange

Well that's weird, it looks like you don't have any blocked tasks, so
either sysrq+w is screwing up or you are getting stuck in something CPU
intensive. Try doing sysrq+w next time it happens again and also run top
and see if something is using up 100% of the CPU. If it's something
chewing up CPU then I'll tell you how to figure out what's going on.

Thanks,

Josef
[RFC][PATCH] btrfs: clean snapshots one by one
Each time pick one dead root from the list and let the caller know if it's
needed to continue. This should improve responsiveness during umount and
balance which at some point wait for cleaning all currently queued dead
roots.

A new dead root is added to the end of the list, so the snapshots
disappear in the order of deletion.

Snapshot cleaning is now done only from the cleaner thread and the others
wake it if needed.

Signed-off-by: David Sterba dste...@suse.cz
---

* btrfs_clean_old_snapshots is removed from the reloc loop, I don't know
  if this is safe wrt reloc's assumptions
* btrfs_run_delayed_iputs is left in place in super_commit, may get
  removed as well because transaction commit calls it in the end
* the responsiveness can be improved further if btrfs_drop_snapshot checks
  fs_closing, but this needs changes to error handling in the main reloc
  loop

 fs/btrfs/disk-io.c |8 --
 fs/btrfs/relocation.c |3 --
 fs/btrfs/transaction.c | 57
 fs/btrfs/transaction.h |2 +-
 4 files changed, 44 insertions(+), 26 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 51bff86..6a02336 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1635,15 +1635,17 @@ static int cleaner_kthread(void *arg)
 	struct btrfs_root *root = arg;
 	do {
+		int again = 0;
+
 		if (!(root->fs_info->sb->s_flags & MS_RDONLY) &&
 		    mutex_trylock(&root->fs_info->cleaner_mutex)) {
 			btrfs_run_delayed_iputs(root);
-			btrfs_clean_old_snapshots(root);
+			again = btrfs_clean_one_deleted_snapshot(root);
 			mutex_unlock(&root->fs_info->cleaner_mutex);
 			btrfs_run_defrag_inodes(root->fs_info);
 		}
-		if (!try_to_freeze()) {
+		if (!try_to_freeze() && !again) {
 			set_current_state(TASK_INTERRUPTIBLE);
 			if (!kthread_should_stop())
 				schedule();
@@ -3301,8 +3303,8 @@ int btrfs_commit_super(struct btrfs_root *root)
 	mutex_lock(&root->fs_info->cleaner_mutex);
 	btrfs_run_delayed_iputs(root);
-	btrfs_clean_old_snapshots(root);
 	mutex_unlock(&root->fs_info->cleaner_mutex);
+	wake_up_process(root->fs_info->cleaner_kthread);
 	/* wait until ongoing cleanup work done */
 	down_write(&root->fs_info->cleanup_work_sem);
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index ba5a321..ab6a718 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4060,10 +4060,7 @@ int btrfs_relocate_block_group(struct btrfs_root *extent_root, u64 group_start)
 	while (1) {
 		mutex_lock(&fs_info->cleaner_mutex);
-
-		btrfs_clean_old_snapshots(fs_info->tree_root);
 		ret = relocate_block_group(rc);
-
 		mutex_unlock(&fs_info->cleaner_mutex);
 		if (ret < 0) {
 			err = ret;
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 361fb7d..f1e3606 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -895,7 +895,7 @@ static noinline int commit_cowonly_roots(struct btrfs_trans_handle *trans,
 int btrfs_add_dead_root(struct btrfs_root *root)
 {
 	spin_lock(&root->fs_info->trans_lock);
-	list_add(&root->root_list, &root->fs_info->dead_roots);
+	list_add_tail(&root->root_list, &root->fs_info->dead_roots);
 	spin_unlock(&root->fs_info->trans_lock);
 	return 0;
 }
@@ -1783,31 +1783,50 @@ cleanup_transaction:
 }
 /*
- * interface function to delete all the snapshots we have scheduled for deletion
+ * return <0 if error
+ * 0 if there are no more dead_roots at the time of call
+ * 1 there are more to be processed, call me again
+ *
+ * The return value indicates there are certainly more snapshots to delete, but
+ * if there comes a new one during processing, it may return 0. We don't mind,
+ * because btrfs_commit_super will poke cleaner thread and it will process it a
+ * few seconds later.
  */
-int btrfs_clean_old_snapshots(struct btrfs_root *root)
+int btrfs_clean_one_deleted_snapshot(struct btrfs_root *root)
 {
-	LIST_HEAD(list);
+	int ret;
+	int run_again = 1;
 	struct btrfs_fs_info *fs_info = root->fs_info;
+	if (root->fs_info->sb->s_flags & MS_RDONLY) {
+		pr_debug("btrfs: cleaner called for RO fs!\n");
+		return 0;
+	}
+
 	spin_lock(&fs_info->trans_lock);
-	list_splice_init(&fs_info->dead_roots, &list);
+	if (list_empty(&fs_info->dead_roots)) {
+		spin_unlock(&fs_info->trans_lock);
+		return 0;
+	}
+	root = list_first_entry(&fs_info->dead_roots,
+			struct btrfs_root, root_list);
+	list_del(&root->root_list);
 	spin_unlock(&fs_info->trans_lock);
-	while (!list_empty(&list)) {
-
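The "clean one item, tell the caller whether to come back" pattern from this RFC can be sketched outside the kernel roughly as below. All names (demo_root, demo_queue, demo_add_dead_root, demo_clean_one, demo_run_cleaner) are hypothetical, and the return value of demo_clean_one is an assumption: the quoted patch is cut off before its return statement, so the sketch simply reports whether more items remain after the one just processed. New entries go to the tail, so items are cleaned in the order they were queued, and the driving loop can look at a stop flag between items instead of being stuck until the whole list is drained.

```c
#include <stddef.h>

struct demo_root {
	int id;
	struct demo_root *next;
};

struct demo_queue {
	struct demo_root *head;
	struct demo_root *tail;
	int last_cleaned;	/* id of the most recently cleaned root */
};

/* Append at the tail so roots are cleaned in deletion order. */
static void demo_add_dead_root(struct demo_queue *q, struct demo_root *r)
{
	r->next = NULL;
	if (q->tail)
		q->tail->next = r;
	else
		q->head = r;
	q->tail = r;
}

/*
 * Clean at most one root.
 * Returns 0 if the queue was empty or is empty after this call,
 * 1 if more roots remain and the caller should call again.
 */
static int demo_clean_one(struct demo_queue *q)
{
	struct demo_root *r = q->head;

	if (!r)
		return 0;

	q->head = r->next;
	if (!q->head)
		q->tail = NULL;
	r->next = NULL;

	/* The expensive per-snapshot work would happen here. */
	q->last_cleaned = r->id;

	return q->head != NULL;
}

/*
 * Cleaner loop: the stop flag is checked between items, so a stop
 * request waits for at most one item. Returns the number of calls made.
 */
static int demo_run_cleaner(struct demo_queue *q,
			    const volatile int *stop_requested)
{
	int calls = 0;
	int again = 1;

	while (again && !*stop_requested) {
		again = demo_clean_one(q);
		calls++;
	}
	return calls;
}
```

Compared with splicing the whole list and draining it in one go, the cost of a stop request is bounded by the time needed for a single item.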
Re: Fwd: Current State of BTRFS
On Mon, Feb 11, 2013 at 4:05 PM, Josef Bacik jba...@fusionio.com wrote:
> On Fri, Feb 08, 2013 at 04:12:17PM -0700, Florian Hofmann wrote:
>> I ran it as root. The first times there was no output whatsoever. This
>> time triggering gave the SysRq : Show Blocked State line. Strange
>
> Well that's weird, it looks like you don't have any blocked tasks, so
> either sysrq+w is screwing up or you are getting stuck in something CPU
> intensive. Try doing sysrq+w next time it happens again and also run top
> and see if something is using up 100% of the CPU. If it's something
> chewing up CPU then I'll tell you how to figure out what's going on.
> Thanks,

I noticed you have the FS mounted with the compress flag (compress=lzo).
Could it be that your CPU is bottlenecking the process?

> Josef

--
Caution: breathing may be hazardous to your health.

#include <stdio.h>
int main(){printf("%s","\x4c\x65\x6f\x6e\x69\x64\x61\x73");}
Re: Oops when mounting btrfs partition
On Friday 08 February 2013, David Sterba wrote:
> On Mon, Feb 04, 2013 at 09:55:50PM +, Arnd Bergmann wrote:
>> On Saturday 02 February 2013, Chris Mason wrote:
>> I've done a full backup of all data now, without any further Ooops
>> messages, but I did get these:
>>
>> [66155.429029] btrfs no csum found for inode 1212139 start 23707648
>
> The missing csums were caused by a bug introduced in 3.8-rc1 and fixed
> in rc5.

Ok, thanks for the information.

Arnd
Re: [PATCH] btrfs: add cancellation points to defrag
On Mon, Feb 11, 2013 at 10:59:54AM -0600, Eric Sandeen wrote:
> On 2/9/13 5:38 PM, David Sterba wrote:
>> The defrag operation can take very long, we want to have a way how to
>> cancel it. The code checks for a pending signal at safe points in the
>> defrag loops and returns EAGAIN. This means a user can press ^C after
>> running 'btrfs fi defrag', works for both defrag modes, files and root.
>>
>> Returning from the command was instant in my light tests, but may take
>> longer depending on the aging factor of the filesystem.
>
> When __btrfs_run_defrag_inode() calls btrfs_defrag_file() and gets
> -EAGAIN back due to the cancellation, will it reset the defrag->
> counters and call btrfs_requeue_inode_defrag()? Is that ok? Should
> __btrfs_run_defrag_inode explicitly check for and handle an actual error
> returned to it?

__btrfs_run_defrag_inode -> btrfs_defrag_file applies only in case of
autodefrag. The ioctl 'defrag' goes directly to btrfs_defrag_file and what
you describe does not happen here.

(The autodefrag loop runs within kernel threads and I want to avoid
enabling signals for them.)

I agree that the negative error code should be handled in
__btrfs_run_defrag_inode (there are two of them, EINVAL and ENOMEM).
Requeuing defrag might make sense in theory in the ENOMEM case, but
triggering more activity when the system is low on memory is not
practical.

david
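The idea of cancellation points discussed in this thread can be illustrated with a userspace analogue. This is a sketch, not the kernel code: in the kernel the check is for a signal pending on the calling task, while here a SIGINT handler sets a flag that the work loop polls at a safe point in each iteration. demo_cancel_requested, demo_on_sigint, demo_install_handler and demo_defrag_ranges are hypothetical names made up for the example.

```c
#include <errno.h>
#include <signal.h>

/* Set from the signal handler, polled at safe points in the work loop. */
static volatile sig_atomic_t demo_cancel_requested;

static void demo_on_sigint(int signo)
{
	(void)signo;
	demo_cancel_requested = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int demo_install_handler(void)
{
	return signal(SIGINT, demo_on_sigint) == SIG_ERR ? -1 : 0;
}

/*
 * Process nr_ranges units of work. *done receives the number of units
 * finished. Returns 0 on completion or -EAGAIN when cancelled.
 */
static int demo_defrag_ranges(long nr_ranges, long *done)
{
	long i;

	*done = 0;
	for (i = 0; i < nr_ranges; i++) {
		/* Cancellation point: nothing is half-done here. */
		if (demo_cancel_requested)
			return -EAGAIN;

		/* One unit of work would be processed here. */
		(*done)++;
	}
	return 0;
}
```

Placing the check at the top of each iteration means a cancel request is honoured within one unit of work, which matches the observation in the thread that the return time depends on how long a single step takes.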
Re: [PATCH] btrfs: add cancellation points to defrag
On 2/11/13 11:45 AM, David Sterba wrote:
> On Mon, Feb 11, 2013 at 10:59:54AM -0600, Eric Sandeen wrote:
>> On 2/9/13 5:38 PM, David Sterba wrote:
>>> The defrag operation can take very long, we want to have a way how to
>>> cancel it. The code checks for a pending signal at safe points in the
>>> defrag loops and returns EAGAIN. This means a user can press ^C after
>>> running 'btrfs fi defrag', works for both defrag modes, files and root.
>>>
>>> Returning from the command was instant in my light tests, but may take
>>> longer depending on the aging factor of the filesystem.
>>
>> When __btrfs_run_defrag_inode() calls btrfs_defrag_file() and gets
>> -EAGAIN back due to the cancellation, will it reset the defrag->
>> counters and call btrfs_requeue_inode_defrag()? Is that ok? Should
>> __btrfs_run_defrag_inode explicitly check for and handle an actual
>> error returned to it?
>
> __btrfs_run_defrag_inode -> btrfs_defrag_file applies only in case of
> autodefrag. The ioctl 'defrag' goes directly to btrfs_defrag_file and
> what you describe does not happen here.

Ok, was thinking that might be the case, but wasn't sure what all that
worker thread handled. So I guess there should be no signals in that case.

> (The autodefrag loop runs within kernel threads and I want to avoid
> enabling signals for them.)

Understood.

> I agree that the negative error code should be handled in
> __btrfs_run_defrag_inode (there are two of them EINVAL and ENOMEM).
> Requeuing defrag might make sense in theory in the ENOMEM case, but
> triggering more activity when the system is low on memory is not
> practical.

*nod* - but a separate issue, I guess.

Thanks,
-Eric

> david
Re: [PATCH] btrfs: add cancellation points to defrag
On Mon, Feb 11, 2013 at 11:48:20AM -0600, Eric Sandeen wrote:
> On 2/11/13 11:45 AM, David Sterba wrote:
>> __btrfs_run_defrag_inode -> btrfs_defrag_file applies only in case of
>> autodefrag. The ioctl 'defrag' goes directly to btrfs_defrag_file and
>> what you describe does not happen here.
>
> Ok, was thinking that might be the case, but wasn't sure what all that
> worker thread handled. So I guess there should be no signals in that
> case.

Just for the record, btrfs_run_defrag_inodes is called from the cleaner
thread, so if we had a cleaner way to inform the thread to stop the defrag
it'd be possible to stop autodefrag as well.

I tested defrag of a 1G file with ~90k extents (produced by overwriting
random 4k ranges in the file) and then doing a sequential rewrite of the
file. CPU and IO activity went high, as expected, and in case of many
files under defrag it'd be more flexible to actually control the internal
defrag as well. Not by a signal, because in case of more mounted
filesystems one cannot know which process to target.

>> I agree that the negative error code should be handled in
>> __btrfs_run_defrag_inode (there are two of them EINVAL and ENOMEM).
>> Requeuing defrag might make sense in theory in the ENOMEM case, but
>> triggering more activity when the system is low on memory is not
>> practical.
>
> *nod* - but a separate issue, I guess.

Yes.

david
Re: [PATCH] Btrfs: fix the deadlock between the transaction attach and commit
On Thu, Feb 07, 2013 at 11:55:51PM -0700, Miao Xie wrote:
> Here is the whole story:
> Trans_Attach_Task                       Trans_Commit_Task
>                                         btrfs_commit_transaction()
>                                          |->wait writers to be 1
> btrfs_attach_transaction()               |
> btrfs_commit_transaction()               |
>  |                                       |->set trans_no_join to 1
>  |                                       |  (close join transaction)
>  |->btrfs_run_ordered_operations         |
>     (Those ordered operations            |
>      are added when releasing            |
>      file)                               |
>      |->btrfs_join_transaction()         |
>         |->wait_commit()                 |
>                                          |->wait writers to be 1
>
> Then these two tasks waited for each other.
>
> As we know, btrfs_attach_transaction() is used to catch the current
> transaction, and commit it, so if someone has committed the transaction,
> it is unnecessary to join it and commit it, wait is the best choice for
> it. In this way, we can fix the above problem.
>
> Signed-off-by: Miao Xie mi...@cn.fujitsu.com

This caused another problem

[ 8050.503904] btrfs-transacti D 0 5546 2 0x0080
[ 8050.503913] 88037bfb9d18 0046 88037bfb9cb8 810c6d4d
[ 8050.503924] 88037c4d8000 88037bfb9fd8 88037bfb9fd8 88037bfb9fd8
[ 8050.503933] 88042f17a000 88037c4d8000 88042c33b000 88037ba0bdb8
[ 8050.503943] Call Trace:
[ 8050.503953] [810c6d4d] ? trace_hardirqs_on+0xd/0x10
[ 8050.503962] [816507c9] schedule+0x29/0x70
[ 8050.504002] [a084eb75] wait_current_trans+0xb5/0x110 [btrfs]
[ 8050.504011] [810891f0] ? __init_waitqueue_head+0x60/0x60
[ 8050.504047] [a08503c0] start_transaction+0x160/0x4e0 [btrfs]
[ 8050.504082] [a0850757] btrfs_attach_transaction+0x17/0x20 [btrfs]
[ 8050.504114] [a084857a] transaction_kthread+0x15a/0x240 [btrfs]
[ 8050.504147] [a0848420] ? btrfs_destroy_delayed_refs+0x330/0x330 [btrfs]
[ 8050.504155] [8108883a] kthread+0xea/0xf0
[ 8050.504166] [81088750] ? flush_kthread_worker+0x150/0x150
[ 8050.504175] [8165a06c] ret_from_fork+0x7c/0xb0
[ 8050.504183] [81088750] ? flush_kthread_worker+0x150/0x150
[ 8050.504189] sync D 0 5572 5342 0x0080
[ 8050.504198] 88037c235dd8 0046 88037c235d78 810c6d4d
[ 8050.504207] 88037ca8a000 88037c235fd8 88037c235fd8 88037c235fd8
[ 8050.504217] 88042f184000 88037ca8a000 88042c33b000 88037ba0bdb8
[ 8050.504227] Call Trace:
[ 8050.504236] [810c6d4d] ? trace_hardirqs_on+0xd/0x10
[ 8050.504245] [816507c9] schedule+0x29/0x70
[ 8050.504278] [a084eb75] wait_current_trans+0xb5/0x110 [btrfs]
[ 8050.504287] [810891f0] ? __init_waitqueue_head+0x60/0x60
[ 8050.504322] [a08503c0] start_transaction+0x160/0x4e0 [btrfs]
[ 8050.504360] [a0866d94] ? btrfs_wait_ordered_extents+0x174/0x230 [btrfs]
[ 8050.504395] [a0850757] btrfs_attach_transaction+0x17/0x20 [btrfs]
[ 8050.504420] [a0820133] btrfs_sync_fs+0x53/0x130 [btrfs]
[ 8050.504430] [811cac30] ? __sync_filesystem+0x60/0x60
[ 8050.504438] [811cac30] ? __sync_filesystem+0x60/0x60
[ 8050.504447] [811cac50] sync_fs_one_sb+0x20/0x30
[ 8050.504455] [8119e0c1] iterate_supers+0xf1/0x100
[ 8050.504463] [811cad25] sys_sync+0x55/0x90
[ 8050.504472] [8165a119] system_call_fastpath+0x16/0x1b

So we're getting stuck in the

	if (may_wait_transaction())
		wait_current_trans();

thing. If we set blocked in __btrfs_end_transaction we'll just sit there
forever because nobody can actually commit the transaction. Probably need
to change this to

	if (type == TRANS_ATTACH && trans->in_commit)

or something like that. kdave and I reproduced it by running 274 in a
loop; it happened pretty quickly. I'd fix it myself but I have to leave my
house for people to come look at it. If you haven't fixed this by tomorrow
I'll fix it up.

Thanks,

Josef
Re: Fwd: Current State of BTRFS
@Leonidas: I don't think so ... this is running on an i7 core and lzo
is pretty damn fast.

2013/2/11 Leonidas Spyropoulos artafi...@gmail.com:
> On Mon, Feb 11, 2013 at 4:05 PM, Josef Bacik jba...@fusionio.com wrote:
> > On Fri, Feb 08, 2013 at 04:12:17PM -0700, Florian Hofmann wrote:
> > > I ran it as root. The first times there was no output whatsoever.
> > > This time triggering gave the "SysRq : Show Blocked State" line.
> > > Strange
> >
> > Well that's weird, it looks like you don't have any blocked tasks, so
> > either sysrq+w is screwing up or you are getting stuck in something CPU
> > intensive. Try doing sysrq+w next time it happens again and also run
> > top and see if something is using up 100% of the CPU. If it's something
> > chewing up CPU then I'll tell you how to figure out what's going on.
> > Thanks,
>
> I noticed you have the FS mounted with compress flag (compress=lzo).
> Could it be that your CPU is bottle-necking the process?
>
> > Josef
>
> --
> Caution: breathing may be hazardous to your health.
>
> #include <stdio.h>
> int main(){printf("%s","\x4c\x65\x6f\x6e\x69\x64\x61\x73");}
Re: [PATCH] fs: encode_fh: return FILEID_INVALID if invalid fid_type
On Mon, Feb 11, 2013 at 05:25:58PM +0900, Namjae Jeon wrote:
> From: Namjae Jeon namjae.j...@samsung.com
>
> This patch is a follow up on below patch:
>
> [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type
> commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63

> diff --git a/fs/xfs/xfs_export.c b/fs/xfs/xfs_export.c
> index a836118..3391800 100644
> --- a/fs/xfs/xfs_export.c
> +++ b/fs/xfs/xfs_export.c
> @@ -48,7 +48,7 @@ static int xfs_fileid_length(int fileid_type)
>  	case FILEID_INO32_GEN_PARENT | XFS_FILEID_TYPE_64FLAG:
>  		return 6;
>  	}
> -	return 255; /* invalid */
> +	return FILEID_INVALID; /* invalid */
>  }

I think you can drop the "/* invalid */" comment from there now as
it is redundant with this change.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: [PATCH] fs: encode_fh: return FILEID_INVALID if invalid fid_type
2013/2/12, Dave Chinner da...@fromorbit.com:
> On Mon, Feb 11, 2013 at 05:25:58PM +0900, Namjae Jeon wrote:
> > From: Namjae Jeon namjae.j...@samsung.com
> >
> > This patch is a follow up on below patch:
> >
> > [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type
> > commit: 216b6cbdcbd86b1db0754d58886b466ae31f5a63
>
> > diff --git a/fs/xfs/xfs_export.c b/fs/xfs/xfs_export.c
> > index a836118..3391800 100644
> > --- a/fs/xfs/xfs_export.c
> > +++ b/fs/xfs/xfs_export.c
> > @@ -48,7 +48,7 @@ static int xfs_fileid_length(int fileid_type)
> >  	case FILEID_INO32_GEN_PARENT | XFS_FILEID_TYPE_64FLAG:
> >  		return 6;
> >  	}
> > -	return 255; /* invalid */
> > +	return FILEID_INVALID; /* invalid */
> >  }
>
> I think you can drop the "/* invalid */" comment from there now as
> it is redundant with this change.

Okay, Thanks for review :-)

> Cheers,
>
> Dave.
> --
> Dave Chinner
> da...@fromorbit.com
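For readers following the thread, the change under review can be sketched outside the kernel as below. This is only an illustration, not the XFS source: the constant values are local stand-ins assumed to mirror the kernel headers, the function name fileid_length() is hypothetical, and of the four case lengths only the "return 6" arm appears in the quoted hunk (the other three are assumptions).

```c
#include <assert.h>

/*
 * Local stand-ins for the kernel constants. The values are assumptions
 * made for this sketch; the thread itself only shows that the invalid
 * marker used to be the literal 255.
 */
enum {
	FILEID_INO32_GEN        = 1,
	FILEID_INO32_GEN_PARENT = 2,
	FILEID_INVALID          = 0xff,
};
#define XFS_FILEID_TYPE_64FLAG 0x80

/* The named constant must keep the value of the old magic number. */
static_assert(FILEID_INVALID == 255, "FILEID_INVALID replaces literal 255");

/*
 * Hypothetical stand-in for xfs_fileid_length(): handle length in
 * 32-bit words for a known type, FILEID_INVALID otherwise.
 */
static int fileid_length(int fileid_type)
{
	switch (fileid_type) {
	case FILEID_INO32_GEN:
		return 2;	/* assumed */
	case FILEID_INO32_GEN_PARENT:
		return 4;	/* assumed */
	case FILEID_INO32_GEN | XFS_FILEID_TYPE_64FLAG:
		return 3;	/* assumed */
	case FILEID_INO32_GEN_PARENT | XFS_FILEID_TYPE_64FLAG:
		return 6;	/* from the quoted hunk */
	}
	return FILEID_INVALID;
}
```

With the named constant the trailing comment carries no extra information, which is the point of the review remark.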
Re: Fwd: Current State of BTRFS
Hi Florian,

On 10/02/13 00:35, Florian Hofmann wrote:
> (Sadly) No SSD.

I think Marc was asking as they can cause issues, so don't be sad. :-)

> I just upgraded from kernel version 3.7.5 to 3.7.6 (running Arch
> Linux) with no change in behavior.

There are rarely any updates to btrfs in stable kernel releases, if you
want to try the current state of the code you'll need to try 3.8-rc7.

cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
[PATCH] Btrfs-progs: check out if the swap device
Currently, the following commands succeed.

# cat /proc/swaps
Filename                                Type            Size    Used    Priority
/dev/sda3                               partition       8388604 0       -1
/dev/sdc8                               partition       9765884 0       -2
# mkfs.btrfs /dev/sdc8

WARNING! - Btrfs v0.20-rc1-165-g82ac345 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdc8
	nodesize 4096 leafsize 4096 sectorsize 4096 size 9.31GB
Btrfs v0.20-rc1-165-g82ac345
# btrfs fi sh /dev/sdc8
Label: none  uuid: fc0bdbd0-7eed-460f-b4e9-131273b66df2
	Total devices 1 FS bytes used 28.00KB
	devid    1 size 9.31GB used 989.62MB path /dev/sdc8

Btrfs v0.20-rc1-165-g82ac345
#

But we should check out the swap device. So fixed it.

Signed-off-by: Tsutomu Itoh t-i...@jp.fujitsu.com
---
(this patch is based on Chris's raid56-experimental branch)
---
 mkfs.c  | 18 ++++++++++++++++++
 utils.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 utils.h |  1 +
 3 files changed, 68 insertions(+)

diff --git a/mkfs.c b/mkfs.c
index 2d3c2af..fdc3373 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1366,6 +1366,15 @@ int main(int ac, char **av)
 
 	if (source_dir == 0) {
 		file = av[optind++];
+		ret = is_swap_device(file);
+		if (ret < 0) {
+			fprintf(stderr, "error checking %s status\n", file);
+			exit(1);
+		}
+		if (ret == 1) {
+			fprintf(stderr, "%s is a swap device\n", file);
+			exit(1);
+		}
 		ret = check_mounted(file);
 		if (ret < 0) {
 			fprintf(stderr, "error checking %s mount status\n", file);
@@ -1461,6 +1470,15 @@ int main(int ac, char **av)
 		int old_mixed = mixed;
 
 		file = av[optind++];
+		ret = is_swap_device(file);
+		if (ret < 0) {
+			fprintf(stderr, "error checking %s status\n", file);
+			exit(1);
+		}
+		if (ret == 1) {
+			fprintf(stderr, "%s is a swap device\n", file);
+			exit(1);
+		}
 		ret = check_mounted(file);
 		if (ret < 0) {
 			fprintf(stderr, "error checking %s mount status\n",
diff --git a/utils.c b/utils.c
index f9ee812..0c551a0 100644
--- a/utils.c
+++ b/utils.c
@@ -1386,3 +1386,52 @@ int get_fs_info(int fd, char *path, struct btrfs_ioctl_fs_info_args *fi_args,
 
 	return 0;
 }
+
+/*
+ * Checks if the swap device or not.
+ * Returns 1 if the swap device, < 0 on error or 0 if not the swap device.
+ */
+int is_swap_device(const char *file)
+{
+	FILE	*f;
+	struct stat	st_buf;
+	char	buf[1024];
+	char	*cp;
+	dev_t	rdev;
+	int	ret = 0;
+
+	if (stat(file, &st_buf) < 0)
+		return -errno;
+	if (!S_ISBLK(st_buf.st_mode))
+		return 0;
+
+	rdev = st_buf.st_rdev;
+
+	if ((f = fopen("/proc/swaps", "r")) == NULL)
+		return -errno;
+
+	/* skip the first line */
+	if (fgets(buf, sizeof(buf), f) == NULL)
+		goto out;
+
+	while (fgets(buf, sizeof(buf), f) != NULL) {
+		if ((cp = strchr(buf, ' ')) != NULL)
+			*cp = 0;
+		if ((cp = strchr(buf, '\t')) != NULL)
+			*cp = 0;
+		if (strcmp(file, buf) == 0) {
+			ret = 1;
+			break;
+		}
+		if ((stat(buf, &st_buf) == 0) && S_ISBLK(st_buf.st_mode) &&
+		    rdev == st_buf.st_rdev) {
+			ret = 1;
+			break;
+		}
+	}
+
+out:
+	fclose(f);
+
+	return ret;
+}
diff --git a/utils.h b/utils.h
index bbcaf6a..60a0fea 100644
--- a/utils.h
+++ b/utils.h
@@ -55,6 +55,7 @@ int get_fs_info(int fd, char *path, struct btrfs_ioctl_fs_info_args *fi_args,
 		struct btrfs_ioctl_dev_info_args **di_ret);
 
 char *__strncpy__null(char *dest, const char *src, size_t n);
+int is_swap_device(const char *file);
 /* Helper to always get proper size of the destination string */
 #define strncpy_null(dest, src) __strncpy__null(dest, src, sizeof(dest))
 
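The lookup the patch performs can also be tried in isolation. The following is a minimal sketch, not the btrfs-progs code: the function name listed_in_swaps() and the swaps_path parameter are hypothetical (the patch hard-codes /proc/swaps), it returns -1 rather than -errno when the table cannot be opened, and unlike the patch it still compares by name when the candidate is not a block device, so that it can be exercised against an ordinary text file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Return 1 if `file` is listed in the swaps table at `swaps_path`,
 * 0 if it is not, -1 if the table cannot be opened.
 * `swaps_path` is a hypothetical parameter added for this sketch.
 */
static int listed_in_swaps(const char *swaps_path, const char *file)
{
	char buf[1024];
	struct stat want, have;
	int want_is_blk;
	int ret = 0;
	FILE *f;

	assert(swaps_path != NULL && file != NULL);

	f = fopen(swaps_path, "r");
	if (f == NULL)
		return -1;

	/* Remember whether the candidate is a block device we can compare. */
	want_is_blk = stat(file, &want) == 0 && S_ISBLK(want.st_mode);

	/* Skip the header line. */
	if (fgets(buf, sizeof(buf), f) == NULL)
		goto out;

	while (fgets(buf, sizeof(buf), f) != NULL) {
		/* Keep only the first column, which holds the path. */
		buf[strcspn(buf, " \t\n")] = '\0';

		/* Match by name first ... */
		if (strcmp(file, buf) == 0) {
			ret = 1;
			break;
		}
		/* ... then by device number, to catch aliases of one device. */
		if (want_is_blk && stat(buf, &have) == 0 &&
		    S_ISBLK(have.st_mode) && want.st_rdev == have.st_rdev) {
			ret = 1;
			break;
		}
	}
out:
	fclose(f);
	return ret;
}
```

Calling listed_in_swaps("/proc/swaps", "/dev/sdc8") on the system from the report would be expected to return 1, since /dev/sdc8 appears in the table shown there.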
Re: [PATCH] Btrfs-progs: check out if the swap device
On 2/11/13 7:25 PM, Tsutomu Itoh wrote:
> Currently, the following commands succeed.
>
> # cat /proc/swaps
> Filename                                Type            Size    Used    Priority
> /dev/sda3                               partition       8388604 0       -1
> /dev/sdc8                               partition       9765884 0       -2
> # mkfs.btrfs /dev/sdc8
>
> WARNING! - Btrfs v0.20-rc1-165-g82ac345 IS EXPERIMENTAL
> WARNING! - see http://btrfs.wiki.kernel.org before using
>
> fs created label (null) on /dev/sdc8
> 	nodesize 4096 leafsize 4096 sectorsize 4096 size 9.31GB
> Btrfs v0.20-rc1-165-g82ac345
> # btrfs fi sh /dev/sdc8
> Label: none  uuid: fc0bdbd0-7eed-460f-b4e9-131273b66df2
> 	Total devices 1 FS bytes used 28.00KB
> 	devid    1 size 9.31GB used 989.62MB path /dev/sdc8
>
> Btrfs v0.20-rc1-165-g82ac345
> #
>
> But we should check out the swap device. So fixed it.

I guess it's nice to parse /proc/swaps to be able to offer the helpful
error message in this case. (though I wonder how long /proc/swaps will
be available, and in this format? Does it count as ABI?)

Your implementation looks just like the one in e2fsprogs, so it should
work fine.

But I also wonder if overall it would be safest to open the device
O_EXCL, which would fail with EBUSY if it were in use for swap, or
mounted, or opened O_EXCL by another process for any other reason:

[root@host tmp]# cat /proc/swaps
Filename                                Type            Size    Used    Priority
/dev/sda3                               partition       2048280 822616  -1
[root@host tmp]# strace -e open ./test
open("/etc/ld.so.cache", O_RDONLY)      = 3
open("/lib64/libc.so.6", O_RDONLY)      = 3
open("/dev/sda3", O_RDWR|O_EXCL)        = -1 EBUSY (Device or resource busy)
open: Device or resource busy

-Eric

> Signed-off-by: Tsutomu Itoh t-i...@jp.fujitsu.com
> ---
> (this patch is based on Chris's raid56-experimental branch)
> ---
>  mkfs.c  | 18 ++++++++++++++++++
>  utils.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
>  utils.h |  1 +
>  3 files changed, 68 insertions(+)
>
> diff --git a/mkfs.c b/mkfs.c
> index 2d3c2af..fdc3373 100644
> --- a/mkfs.c
> +++ b/mkfs.c
> @@ -1366,6 +1366,15 @@ int main(int ac, char **av)
>
>  	if (source_dir == 0) {
>  		file = av[optind++];
> +		ret = is_swap_device(file);
> +		if (ret < 0) {
> +			fprintf(stderr, "error checking %s status\n", file);
> +			exit(1);
> +		}
> +		if (ret == 1) {
> +			fprintf(stderr, "%s is a swap device\n", file);
> +			exit(1);
> +		}
>  		ret = check_mounted(file);
>  		if (ret < 0) {
>  			fprintf(stderr, "error checking %s mount status\n", file);
> @@ -1461,6 +1470,15 @@ int main(int ac, char **av)
>  		int old_mixed = mixed;
>
>  		file = av[optind++];
> +		ret = is_swap_device(file);
> +		if (ret < 0) {
> +			fprintf(stderr, "error checking %s status\n", file);
> +			exit(1);
> +		}
> +		if (ret == 1) {
> +			fprintf(stderr, "%s is a swap device\n", file);
> +			exit(1);
> +		}
>  		ret = check_mounted(file);
>  		if (ret < 0) {
>  			fprintf(stderr, "error checking %s mount status\n",
> diff --git a/utils.c b/utils.c
> index f9ee812..0c551a0 100644
> --- a/utils.c
> +++ b/utils.c
> @@ -1386,3 +1386,52 @@ int get_fs_info(int fd, char *path, struct btrfs_ioctl_fs_info_args *fi_args,
>
>  	return 0;
>  }
> +
> +/*
> + * Checks if the swap device or not.
> + * Returns 1 if the swap device, < 0 on error or 0 if not the swap device.
> + */
> +int is_swap_device(const char *file)
> +{
> +	FILE	*f;
> +	struct stat	st_buf;
> +	char	buf[1024];
> +	char	*cp;
> +	dev_t	rdev;
> +	int	ret = 0;
> +
> +	if (stat(file, &st_buf) < 0)
> +		return -errno;
> +	if (!S_ISBLK(st_buf.st_mode))
> +		return 0;
> +
> +	rdev = st_buf.st_rdev;
> +
> +	if ((f = fopen("/proc/swaps", "r")) == NULL)
> +		return -errno;
> +
> +	/* skip the first line */
> +	if (fgets(buf, sizeof(buf), f) == NULL)
> +		goto out;
> +
> +	while (fgets(buf, sizeof(buf), f) != NULL) {
> +		if ((cp = strchr(buf, ' ')) != NULL)
> +			*cp = 0;
> +		if ((cp = strchr(buf, '\t')) != NULL)
> +			*cp = 0;
> +		if (strcmp(file, buf) == 0) {
> +			ret = 1;
> +			break;
> +		}
> +		if ((stat(buf, &st_buf) == 0) && S_ISBLK(st_buf.st_mode) &&
> +		    rdev == st_buf.st_rdev) {
> +			ret = 1;
> +			break;
> +		}
> +	}
> +
> +out:
> +	fclose(f);
> +
> +	return ret;
> +}
> diff --git a/utils.h b/utils.h
> index bbcaf6a..60a0fea 100644
> ---
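The O_EXCL probe suggested in this message might look roughly like the sketch below. It is an illustration only: the helper name probe_exclusive() is hypothetical, and the ./test program from the strace output is not shown in the thread, so this is not a reconstruction of it. On Linux, opening a block device with O_EXCL (without O_CREAT) fails with EBUSY while the device is held exclusively, for example as active swap or by a mount.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to claim `path` exclusively. Returns 0 if the open succeeded
 * (the descriptor is closed again right away) or -errno on failure,
 * so a device that is in use yields -EBUSY.
 */
static int probe_exclusive(const char *path)
{
	int fd;

	assert(path != NULL);

	/* Same flags as in the strace output quoted in this message. */
	fd = open(path, O_RDWR | O_EXCL);
	if (fd < 0)
		return -errno;

	close(fd);
	return 0;
}
```

A caller can tell "busy" from other failures by comparing the result with -EBUSY, but the probe alone cannot say whether the holder is swap, a mount, or another process.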
Re: [PATCH] Btrfs-progs: check out if the swap device
Hi, Eric,

Thanks for your comment.

On 2013/02/12 13:22, Eric Sandeen wrote:
> On 2/11/13 7:25 PM, Tsutomu Itoh wrote:
> > Currently, the following commands succeed.
> >
> > # cat /proc/swaps
> > Filename                                Type            Size    Used    Priority
> > /dev/sda3                               partition       8388604 0       -1
> > /dev/sdc8                               partition       9765884 0       -2
> > # mkfs.btrfs /dev/sdc8
> >
> > WARNING! - Btrfs v0.20-rc1-165-g82ac345 IS EXPERIMENTAL
> > WARNING! - see http://btrfs.wiki.kernel.org before using
> >
> > fs created label (null) on /dev/sdc8
> > 	nodesize 4096 leafsize 4096 sectorsize 4096 size 9.31GB
> > Btrfs v0.20-rc1-165-g82ac345
> > # btrfs fi sh /dev/sdc8
> > Label: none  uuid: fc0bdbd0-7eed-460f-b4e9-131273b66df2
> > 	Total devices 1 FS bytes used 28.00KB
> > 	devid    1 size 9.31GB used 989.62MB path /dev/sdc8
> >
> > Btrfs v0.20-rc1-165-g82ac345
> > #
> >
> > But we should check out the swap device. So fixed it.
>
> I guess it's nice to parse /proc/swaps to be able to offer the helpful
> error message in this case. (though I wonder how long /proc/swaps will
> be available, and in this format? Does it count as ABI?)

Umm, I don't know how long /proc/swaps will be available, either...

> Your implementation looks just like the one in e2fsprogs, so it should
> work fine.

Yes.

> But I also wonder if overall it would be safest to open the device
> O_EXCL, which would fail with EBUSY if it were in use for swap, or
> mounted, or opened O_EXCL by another process for any other reason:

But when O_EXCL is used, the details of the error cannot be reported.
Also, check_mounted() checks the mount state after is_swap_device().
So I chose this approach (reading /proc/swaps).

Thanks,
Tsutomu

> [root@host tmp]# cat /proc/swaps
> Filename                                Type            Size    Used    Priority
> /dev/sda3                               partition       2048280 822616  -1
> [root@host tmp]# strace -e open ./test
> open("/etc/ld.so.cache", O_RDONLY)      = 3
> open("/lib64/libc.so.6", O_RDONLY)      = 3
> open("/dev/sda3", O_RDWR|O_EXCL)        = -1 EBUSY (Device or resource busy)
> open: Device or resource busy
>
> -Eric
>
> > Signed-off-by: Tsutomu Itoh t-i...@jp.fujitsu.com
> > ---
> > (this patch is based on Chris's raid56-experimental branch)
> > ---
> >  mkfs.c  | 18 ++++++++++++++++++
> >  utils.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  utils.h |  1 +
> >  3 files changed, 68 insertions(+)
> >
> > diff --git a/mkfs.c b/mkfs.c
> > index 2d3c2af..fdc3373 100644
> > --- a/mkfs.c
> > +++ b/mkfs.c
> > @@ -1366,6 +1366,15 @@ int main(int ac, char **av)
> >
> >  	if (source_dir == 0) {
> >  		file = av[optind++];
> > +		ret = is_swap_device(file);
> > +		if (ret < 0) {
> > +			fprintf(stderr, "error checking %s status\n", file);
> > +			exit(1);
> > +		}
> > +		if (ret == 1) {
> > +			fprintf(stderr, "%s is a swap device\n", file);
> > +			exit(1);
> > +		}
> >  		ret = check_mounted(file);
> >  		if (ret < 0) {
> >  			fprintf(stderr, "error checking %s mount status\n", file);
> > @@ -1461,6 +1470,15 @@ int main(int ac, char **av)
> >  		int old_mixed = mixed;
> >
> >  		file = av[optind++];
> > +		ret = is_swap_device(file);
> > +		if (ret < 0) {
> > +			fprintf(stderr, "error checking %s status\n", file);
> > +			exit(1);
> > +		}
> > +		if (ret == 1) {
> > +			fprintf(stderr, "%s is a swap device\n", file);
> > +			exit(1);
> > +		}
> >  		ret = check_mounted(file);
> >  		if (ret < 0) {
> >  			fprintf(stderr, "error checking %s mount status\n",
> > diff --git a/utils.c b/utils.c
> > index f9ee812..0c551a0 100644
> > --- a/utils.c
> > +++ b/utils.c
> > @@ -1386,3 +1386,52 @@ int get_fs_info(int fd, char *path, struct btrfs_ioctl_fs_info_args *fi_args,
> >
> >  	return 0;
> >  }
> > +
> > +/*
> > + * Checks if the swap device or not.
> > + * Returns 1 if the swap device, < 0 on error or 0 if not the swap device.
> > + */
> > +int is_swap_device(const char *file)
> > +{
> > +	FILE	*f;
> > +	struct stat	st_buf;
> > +	char	buf[1024];
> > +	char	*cp;
> > +	dev_t	rdev;
> > +	int	ret = 0;
> > +
> > +	if (stat(file, &st_buf) < 0)
> > +		return -errno;
> > +	if (!S_ISBLK(st_buf.st_mode))
> > +		return 0;
> > +
> > +	rdev = st_buf.st_rdev;
> > +
> > +	if ((f = fopen("/proc/swaps", "r")) == NULL)
> > +		return -errno;
> > +
> > +	/* skip the first line */
> > +	if (fgets(buf, sizeof(buf), f) == NULL)
> > +		goto out;
> > +
> > +	while (fgets(buf, sizeof(buf), f) != NULL) {
> > +		if ((cp = strchr(buf, ' ')) != NULL)
> > +			*cp = 0;
> > +		if ((cp = strchr(buf, '\t')) != NULL)
> > +			*cp = 0;
> > +		if (strcmp(file, buf) == 0) {
> > +			ret = 1;
> > +
Re: Heavy memory leak when using quota groups
Also, immediately after this problem, it's impossible to mount the
filesystem. It consistently fails with

[ 2092.254428] BUG: unable to handle kernel NULL pointer dereference at 03c4
[ 2092.255945] IP: [a033d0be] btrfs_search_old_slot+0x63e/0x940 [btrfs]
[ 2092.257340] PGD 23d42067 PUD 3a93a067 PMD 0
[ 2092.257982] Oops: [#1] SMP
[ 2092.257982] Modules linked in: raid1 xt_multiport xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 8021q garp stp llc bonding btrfs(OF) deflate zlib_deflate ctr twofish_generic twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic camellia_x86_64 serpent_sse2_x86_64 glue_helper lrw serpent_generic xts gf128mul blowfish_generic blowfish_x86_64 blowfish_common ablk_helper cryptd cast5_generic cast_common des_generic xcbc rmd160 crypto_null af_key xfrm_algo scst_vdisk(OF) iscsi_scst(OF) scst(OF) libcrc32c microcode nfsv4 psmouse nfsd(OF) virtio_balloon nfs_acl serio_raw auth_rpcgss nfs fscache lockd sunrpc lp parport floppy ixgbevf
[ 2092.257982] CPU 0
[ 2092.257982] Pid: 27156, comm: mount Tainted: GF O 3.8.0-030800rc5-generic #201301251535 Bochs Bochs
[ 2092.257982] RIP: 0010:[a033d0be] [a033d0be] btrfs_search_old_slot+0x63e/0x940 [btrfs]
[ 2092.257982] RSP: 0018:88003752f598 EFLAGS: 00010206
[ 2092.257982] RAX: RBX: 0001 RCX: 880017826560
[ 2092.257982] RDX: 0f83e0f83e0f83e1 RSI: 0066 RDI: 8800374bda00
[ 2092.257982] RBP: 88003752f628 R08: 880019dfc000 R09: 88003752f508
[ 2092.257982] R10: 000c R11: R12: 880018d60800
[ 2092.257982] R13: 88001c3bd900 R14: 88001c3ce158 R15: 8800
[ 2092.257982] FS: 7fdc62688800() GS:88003fc0() knlGS:
[ 2092.257982] CS: 0010 DS: ES: CR0: 8005003b
[ 2092.257982] CR2: 03c4 CR3: 3a91a000 CR4: 06f0
[ 2092.257982] DR0: DR1: DR2:
[ 2092.257982] DR3: DR6: 0ff0 DR7: 0400
[ 2092.257982] Process mount (pid: 27156, threadinfo 88003752e000, task 880018ea5d00)
[ 2092.257982] Stack:
[ 2092.257982] 88003752f5c8 88003d554480 880017826560 880019dfc000
[ 2092.257982] 18d60800 880018729498
[ 2092.257982] 00dc 0001 88001c3ce158 1c3bd900
[ 2092.257982] Call Trace:
[ 2092.257982] [a03b23f3] __resolve_indirect_refs+0x173/0x620 [btrfs]
[ 2092.257982] [a037aa17] ? free_extent_buffer+0x37/0x90 [btrfs]
[ 2092.257982] [a03b316a] find_parent_nodes+0x7da/0xf90 [btrfs]
[ 2092.257982] [a03b39b9] btrfs_find_all_roots+0x99/0x100 [btrfs]
[ 2092.257982] [81183beb] ? kfree+0x3b/0x150
[ 2092.257982] [a03b691b] btrfs_qgroup_account_ref+0xfb/0x550 [btrfs]
[ 2092.257982] [a0346088] ? btrfs_delayed_refs_qgroup_accounting+0x58/0x100 [btrfs]
[ 2092.257982] [81183cc4] ? kfree+0x114/0x150
[ 2092.257982] [a03460d3] btrfs_delayed_refs_qgroup_accounting+0xa3/0x100 [btrfs]
[ 2092.257982] [a034d269] btrfs_run_delayed_refs+0x49/0x2f0 [btrfs]
[ 2092.257982] [a0373f43] ? btrfs_run_ordered_operations+0x2b3/0x2e0 [btrfs]
[ 2092.257982] [a035ce25] btrfs_commit_transaction+0x85/0xad0 [btrfs]
[ 2092.257982] [a033c5de] ? btrfs_search_slot+0x2fe/0x7a0 [btrfs]
[ 2092.257982] [8107fc70] ? add_wait_queue+0x60/0x60
[ 2092.257982] [81183d42] ? kmem_cache_free+0x42/0x160
[ 2092.257982] [a03754c1] ? release_extent_buffer.isra.26+0x81/0xf0 [btrfs]
[ 2092.257982] [a0396aa5] btrfs_recover_log_trees+0x335/0x3b0 [btrfs]
[ 2092.257982] [a03953d0] ? fixup_inode_link_counts+0x150/0x150 [btrfs]
[ 2092.257982] [a035ae96] open_ctree+0x1646/0x1d70 [btrfs]
[ 2092.257982] [a0333bbb] btrfs_mount+0x57b/0x670 [btrfs]
[ 2092.257982] [8119e543] mount_fs+0x43/0x1b0
[ 2092.257982] [811b92e6] vfs_kern_mount+0x76/0x120
[ 2092.257982] [811ba761] do_new_mount+0xb1/0x1e0
[ 2092.257982] [811bbf76] do_mount+0x1b6/0x1f0
[ 2092.257982] [811bc040] sys_mount+0x90/0xe0
[ 2092.257982] [816f45dd] system_call_fastpath+0x1a/0x1f
[ 2092.257982] Code: 00 48 03 10 48 89 d0 48 ba 00 00 00 00 00 88 ff ff 48 c1 f8 06 48 c1 e0 0c 8b 74 10 60 49 8b 40 40 48 ba e1 83 0f 3e f8 e0 83 0f 8b 80 c4 03 00 00 48 83 e8 65 48 f7 e2 48 d1 ea 48 39 d6 0f 87
[ 2092.257982] RIP [a033d0be] btrfs_search_old_slot+0x63e/0x940 [btrfs]
[ 2092.257982] RSP 88003752f598
[ 2092.257982] CR2: 03c4
[ 2092.340821] ---[ end trace 25df2cb40c31fa55 ]---

I presume this is because of the partial failure before the
Re: [PATCH] btrfs: add delayed_iput list head to btrfs inode
On 2/6/13 11:02 AM, Liu Bo wrote:
> On Wed, Feb 06, 2013 at 09:53:05AM -0600, Eric Sandeen wrote:
> > On 2/5/13 8:08 PM, Liu Bo wrote:
> > > On Tue, Feb 05, 2013 at 03:14:05PM -0800, Zach Brown wrote:
> > > > > +	struct btrfs_inode *b_inode = BTRFS_I(inode);
> > > > > +	struct btrfs_fs_info *fs_info = b_inode->root->fs_info;
> > > > >
> > > > >  	if (atomic_add_unless(&inode->i_count, -1, 1))
> > > > >  		return;
> > > > >
> > > > > -	delayed = kmalloc(sizeof(*delayed), GFP_NOFS | __GFP_NOFAIL);
> > > > > -	delayed->inode = inode;
> > > > > -
> > > > >  	spin_lock(&fs_info->delayed_iput_lock);
> > > > > -	list_add_tail(&delayed->list, &fs_info->delayed_iputs);
> > > > > +	list_add_tail(&b_inode->delayed_iput, &fs_info->delayed_iputs);
> > > > >  	spin_unlock(&fs_info->delayed_iput_lock);
> > > > >  }
> > > >
> > > > Hmm. I'm not great with inode life cycles, but isn't this only safe
> > > > if someone else can't get an i_count reference while this is in
> > > > flight?
> > > >
> > > > It looks like the final iput does the unhashing, and so on, so
> > > > couldn't an iget/iput race with this and try to add the inode's
> > > > list_head twice?
> > >
> > > Yeah, same concern here. Basically this will result in inodes still
> > > being in use on unmount.
> > >
> > > Actually I did a similar one, here is some discussion:
> > > https://patchwork.kernel.org/patch/1824711/
> >
> > I read it, thanks. Did you try the counter approach?
>
> Yes, it'll bring a tradeoff situation.
>
> With counter, we need to lock the list all the time instead of doing a
> splice on the list and unlocking it. I think splice would be faster so
> I didn't go further (I MIGHT be wrong on this)..

Thanks for looking into this. I left this note to myself during the
development of the error handling patches while on a tangent to try to
eliminate NOFAIL allocs. It's not the alloc/free that's the issue
(though eliminating these can probably only help), it's that NOFAIL
allocs essentially become locks when memory pressure is high enough that
the NOFAIL functionality gets invoked. OTOH, bailing out of that path
when we encounter an allocation failure is impossible.

-Jeff

--
Jeff Mahoney
SUSE Labs