Re: [RFC PATCH v3 0/2] Online data deduplication
> You didn't use an INCOMPAT option for this so you need to deal with a
> user mounting the file system with an older kernel or even forgetting
> to use mount -o dedup.  Otherwise your dedup tree will become out of
> date and you could corrupt people's data.  So if you aren't going to
> use an INCOMPAT flag you need to at least use a COMPAT flag so we know
> the option has been used at all, and then you need to have a mechanism
> to know if you need to invalidate the hash tree.
>
> Users are also going to make the mistake of thinking dedup will make
> their workload awesome, and when it doesn't they need a way to turn it
> off.  If you do an INCOMPAT option then you need to have a way to
> delete the hash tree and unset the INCOMPAT flag.  If you do the COMPAT
> route then you get this for free since the user just needs to stop
> using -o dedup, but you'll probably also want to provide a mechanism to
> delete the tree to free up space.  Thanks,
>
> Josef

I made a few mistakes on this. Yes, I should also provide a way to
disable dedup, and I'm going to use INCOMPAT.

But forgetting to use mount -o dedup will not cause the dedup tree to go
out of date, because the dedup tree is loaded whenever it exists,
whether or not 'mount -o dedup' is used.

Thanks for the nice reminder, Josef :)

thanks,
liubo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Creating recursive snapshots for all filesystems
Alexander Skwar wrote (ao):
> Where I'm hanging right now, is that I can't seem to figure out a
> bullet proof way to find all the subvolumes of the filesystems I
> might have.
>
> Is there an easier way to achieve what I want? I want to achieve:
> Creating recursive snapshots for all filesystems

Not sure if this helps, but I have subvolid=0, which contains all my
subvolumes, mounted under /.root/

/etc/fstab:
LABEL=panda  /              btrfs  subvol=rootvolume,space_cache,inode_cache,compress=lzo,ssd  0 0
LABEL=panda  /home          btrfs  subvol=home         0 0
LABEL=panda  /root          btrfs  subvol=root         0 0
LABEL=panda  /var           btrfs  subvol=var          0 0
LABEL=panda  /holding       btrfs  subvol=.holding     0 0
LABEL=panda  /.root         btrfs  subvolid=0          0 0
LABEL=panda  /.backupadmin  btrfs  subvol=backupadmin  0 0
/Varlib      /var/lib       none   bind                0 0

panda:~# ls -l /.root/
total 0
drwxr-xr-x. 1 root root 580800 Jan 30 17:46 backupadmin
drwxr-xr-x. 1 root root     24 Mar 27  2012 home
drwx------. 1 root root    742 Mar 19 15:50 root
drwxr-xr-x. 1 root root    226 May 16  2012 rootvolume
drwxr-xr-x. 1 root root     96 Apr  3  2012 var

In my snapshots script:

...
mmddhhmm=`date +%Y%m%d_%H.%M`
...
for subvolume in `ls /.root/`
do
...
        /sbin/btrfs subvolume snapshot ${filesystem}/${subvolume}/ \
                /.root/.snapshot_${mmddhhmm}_${hostname}_${subvolume}/ || result=2
...
done
...

This creates timestamped snapshots for all subvolumes.

        Sander
open_ctree failure on upgrading 3.7 to 3.8 kernel
Hi,

Long story short: I've got a six-disk btrfs raid10 array plus two other
disks, each with a plain btrfs filesystem. Everything was running
happily under Linux 3.5 and 3.7. 3.5 was a stock Ubuntu kernel, 3.7 was
a slightly less stock Ubuntu kernel. Now I've upgraded my box to 3.8 and
none of the btrfs filesystems mount any more. I get open_ctree errors
every time I try to mount them. When I reboot and choose the old kernel
from grub, everything runs smoothly again. Was there any on-disk format
change or compatibility change?

Some kernel.log output:

[   13.517952] device fsid 9415cddb-e3b8-4977-804c-369553a7eda7 devid 4 transid 30 /dev/sdh1
[   13.518535] btrfs: disk space caching is enabled
[   13.518773] btrfs: failed to read the system array on sdh1
[   13.523175] btrfs: open_ctree failed
Re: btrfs subv list - ERROR: Failed to lookup path for root 0 - No such file or directory
Hi Russell

Russell Coker <russell at coker.com.au> writes:
> I asked a similar question about 10 days ago and got the below response
> which solved it for me.

Thanks a lot. This solved it for me as well.

Cheers,
Alexander
Script for creating/managing snapshots of all subvolumes of all filesystems
Hi

FWIW, I've also written a script which creates and manages (i.e. deletes
old) snapshots. It figures out all the available filesystems and creates
snapshots for all the available (sub)volumes.

It's also on https://copy.com/WI9AXqTH2nD4 and
http://pastebin.com/YX8WKcsR to avoid line break issues and also with
comments.

Regards,
Alexander

- cut here
#!/bin/sh

if [ $# -ne 2 ]; then
    echo "Usage: $0 SNAPSHOT_TAG NUM_SNAPSHOTS

Create hourly, daily, weekly, and monthly snapshots of btrfs filesystems.

Based somewhat on http://article.gmane.org/gmane.comp.file-systems.btrfs/12609

Here's my crontab:

00,15,30,45 * * * * $0 frequently 4
38 * * * * $0 hourly 24
08 00 * * * $0 daily 7
08 12 * * 0 $0 weekly 4
"
    exit 1
fi

SNAPSHOT_TAG=$1
NUM_SNAPSHOTS=$2

snap_prefix="snapshot:$SNAPSHOT_TAG:"
snap_date=`date +%Y-%m-%d--%H.%M.%S.%N`
script_name=`basename $0`
log_fac=local5
log_tag=$script_name

btrfs_progs_dev_path=/home/a/Copy/Computerkram/Programme/btrfs-progs.dev/bin
PATH=$btrfs_progs_dev_path:$PATH

# One iteration per device that "btrfs fi show" reports.
btrfs fi show 2>/dev/null | awk '/ path / {print $NF}' | while read dev; do
    set -- `btrfs fi show 2>/dev/null | grep -B2 "path $dev" | \
        grep Label: | sed 's,.*: \(.*\) uuid: \(.*\),\1 \2,'`
    label=$1
    uuid=$2

    logger -t $log_tag -p $log_fac.info -- \
        "Processing filesystem with label $label and uuid $uuid on $dev"

    safe_dev=`echo $dev | tr / .`
    tmp_mount_dir=`mktemp -d /tmp/.btrfs.mount.$uuid.$safe_dev.XXXXXX`

    if ! mount -t btrfs $dev $tmp_mount_dir; then
        logger -t $log_tag -p $log_fac.err -- \
            "Error! Could not do: mount -t btrfs $dev $tmp_mount_dir"
        exit 1
    fi

    # Read-only snapshot of the root volume.
    _snap_name="$tmp_mount_dir/,$snap_prefix$snap_date"
    if ! btrfs subv snaps -r $tmp_mount_dir $_snap_name >/dev/null; then
        logger -t $log_tag -p $log_fac.err -- \
            "Error! Could not do: btrfs subv snaps -r $tmp_mount_dir $_snap_name"
        exit 1
    else
        logger -t $log_tag -p $log_fac.info -- \
            "Created snapshot $Path,$snap_prefix$snap_date for root volume of fs with uuid $uuid"
    fi

    # List all matching snapshots plus the newest NUM_SNAPSHOTS again;
    # "uniq -u" then leaves only the older ones, which get deleted.
    (btrfs subv list -r $tmp_mount_dir | grep "path ,$snap_prefix" \
        | tail -$NUM_SNAPSHOTS;
     btrfs subv list -r $tmp_mount_dir | grep "path ,$snap_prefix") \
    | sort | uniq -u \
    | while read __id IdDel __gen GenDel __top __level ToplevelDel __path PathDel; do
        if ! btrfs subv del $tmp_mount_dir/$PathDel >/dev/null; then
            logger -t $log_tag -p $log_fac.err -- \
                "Error! Could not do: btrfs subv del $tmp_mount_dir/$PathDel"
            exit 1
        else
            logger -t $log_tag -p $log_fac.info -- \
                "Removed snapshot $PathDel"
        fi
    done

    # All subvolumes minus the read-only ones, i.e. everything that is
    # not itself a snapshot created by this script.
    (btrfs subv list -ar $tmp_mount_dir; btrfs subv list -a $tmp_mount_dir) \
    | sort | uniq -u \
    | while read _id Id _gen Gen _top _level Toplevel _path Path; do
        _snap_name="$tmp_mount_dir/$Path,$snap_prefix$snap_date"
        if ! btrfs subv snaps -r $tmp_mount_dir/$Path $_snap_name >/dev/null; then
            logger -t $log_tag -p $log_fac.err -- \
                "Error! Could not do: btrfs subv snaps -r $tmp_mount_dir/$Path $_snap_name"
            exit 1
        else
            logger -t $log_tag -p $log_fac.info -- \
                "Created snapshot $Path,$snap_prefix$snap_date for subvolume $Path"
        fi

        (btrfs subv list -r $tmp_mount_dir \
            | grep "path $Path,$snap_prefix" | tail -$NUM_SNAPSHOTS;
         btrfs subv list -r $tmp_mount_dir | grep "path $Path,$snap_prefix") \
        | sort | uniq -u \
        | while read __id IdDel __gen GenDel __top __level ToplevelDel __path PathDel; do
            if ! btrfs subv del $tmp_mount_dir/$PathDel >/dev/null; then
                logger -t $log_tag -p $log_fac.err -- \
                    "Error! Could not do: btrfs subv del $tmp_mount_dir/$PathDel"
                exit 1
            else
                logger -t $log_tag -p $log_fac.info -- \
                    "Removed snapshot $PathDel"
            fi
        done
    done

    if ! umount $tmp_mount_dir; then
        logger -t $log_tag -p $log_fac.err -- \
            "Error! Could not do: umount $tmp_mount_dir"
    fi

    if ! rmdir $tmp_mount_dir; then
        logger -t $log_tag -p $log_fac.err -- \
            "Error! Could not do: rmdir $tmp_mount_dir"
    fi
done

exit 0
Re: Creating recursive snapshots for all filesystems
Hi Sander

Sander <sander at humilis.net> writes:
> Alexander Skwar wrote (ao):
> > Where I'm hanging right now, is that I can't seem to figure out a
> > bullet proof way to find all the subvolumes of the filesystems I
> > might have.
> >
> > Is there an easier way to achieve what I want? I want to achieve:
> > Creating recursive snapshots for all filesystems
>
> Not sure if this helps, but I have subvolid=0, which contains all my
> subvolumes, mounted under /.root/

Hm, not quite what I'm after and not nearly as easy as ZFS...

Problem with your approach: The admin has to maintain this. I was
looking for something which maintains itself, so to say. And your
approach also wouldn't scale if there are sub-subvolumes.

ZFS really is so much easier (at least regarding that).

Thanks a lot, though. It's a worthwhile idea.

Regards,
Alexander
Re: [PATCH] xfstests 311: test fsync with dm flakey V3
Josef,

The patch does not compile on older kernels (e.g. SLES11 SP2).

fsync-tester.c: In function 'test_three':
fsync-tester.c:133: warning: implicit declaration of function 'syncfs'
/tmp/cciHR6Gb.o: In function `test_three':
/data/lwork/gulag1c/rjohnston/xfstests/src/fsync-tester.c:133: undefined reference to `syncfs'
collect2: ld returned 1 exit status
gmake[3]: *** [fsync-tester] Error 1
gmake[2]: *** [src] Error 2
make[1]: *** [default] Error 2
make: *** [default] Error 2

src/fsync-tester.c
133         syncfs(test_fd);

Typo ?      ^^
Did you mean fsync?

Regards,
--Rich
[PATCH] Btrfs: free up reserved space if we fail to insert extent entry
If we are inserting an extent entry for the first allocation of an extent and
the addition fails, we need to clean up the reserved space; otherwise we'll get
WARN_ON()'s on unmount because we have leftover reserved space.  Thanks,

Signed-off-by: Josef Bacik <jba...@fusionio.com>
---
 fs/btrfs/extent-tree.c |   45 +++++++++++++++++++++++++++++----------------
 1 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 2305b5c..7049bbc 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6407,16 +6407,16 @@ static int alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
 	size = sizeof(*extent_item) + btrfs_extent_inline_ref_size(type);
 
 	path = btrfs_alloc_path();
-	if (!path)
-		return -ENOMEM;
+	if (!path) {
+		ret = -ENOMEM;
+		goto out;
+	}
 
 	path->leave_spinning = 1;
 	ret = btrfs_insert_empty_item(trans, fs_info->extent_root, path,
 				      ins, size);
-	if (ret) {
-		btrfs_free_path(path);
-		return ret;
-	}
+	if (ret)
+		goto out;
 
 	leaf = path->nodes[0];
 	extent_item = btrfs_item_ptr(leaf, path->slots[0],
@@ -6444,14 +6444,21 @@ static int alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
 
 	btrfs_mark_buffer_dirty(path->nodes[0]);
 	btrfs_free_path(path);
+	path = NULL;
 
 	ret = update_block_group(root, ins->objectid, ins->offset, 1);
-	if (ret) { /* -ENOENT, logic error */
+	if (ret) {
 		btrfs_err(fs_info, "update block group failed for %llu %llu",
 			(unsigned long long)ins->objectid,
 			(unsigned long long)ins->offset);
-		BUG();
+		goto out;
 	}
+
+	return ret;
+out:
+	btrfs_free_path(path);
+	btrfs_pin_extent(root, ins->objectid, ins->offset, 1);
+	btrfs_del_csums(trans, root, ins->objectid, ins->offset);
 	return ret;
 }
 
@@ -6476,16 +6483,16 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
 		size += sizeof(*block_info);
 
 	path = btrfs_alloc_path();
-	if (!path)
-		return -ENOMEM;
+	if (!path) {
+		ret = -ENOMEM;
+		goto out;
+	}
 
 	path->leave_spinning = 1;
 	ret = btrfs_insert_empty_item(trans, fs_info->extent_root, path,
 				      ins, size);
-	if (ret) {
-		btrfs_free_path(path);
-		return ret;
-	}
+	if (ret)
+		goto out;
 
 	leaf = path->nodes[0];
 	extent_item = btrfs_item_ptr(leaf, path->slots[0],
@@ -6517,14 +6524,20 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
 
 	btrfs_mark_buffer_dirty(leaf);
 	btrfs_free_path(path);
+	path = NULL;
 
 	ret = update_block_group(root, ins->objectid, root->leafsize, 1);
-	if (ret) { /* -ENOENT, logic error */
+	if (ret) {
 		btrfs_err(fs_info, "update block group failed for %llu %llu",
 			(unsigned long long)ins->objectid,
 			(unsigned long long)ins->offset);
-		BUG();
+		goto out;
 	}
+
+	return ret;
+out:
+	btrfs_free_path(path);
+	btrfs_pin_extent(root, ins->objectid, root->leafsize, 1);
 	return ret;
 }
-- 
1.7.7.6
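The cleanup shape this patch moves to (every failure jumps to one "out" label
that releases what was taken) can be sketched in plain userspace C. This is
only an illustration under stated assumptions: record_extent(), insert_entry()
and reserved_bytes are hypothetical stand-ins, not btrfs functions, and the
"reservation" is just a counter.

```c
#include <stdlib.h>

/* Hypothetical stand-in for space reserved ahead of the insertion. */
long reserved_bytes;

/* Simulated insertion step; a non-zero argument makes it fail. */
int insert_entry(int fail)
{
	return fail ? -1 : 0;
}

/*
 * Returns 0 on success. On any failure the scratch buffer is freed and the
 * reservation is handed back through the single "out" label, so no reserved
 * space is left over.
 */
int record_extent(long bytes, int fail_insert)
{
	char *path;
	int ret;

	reserved_bytes += bytes;

	path = malloc(64);
	if (!path) {
		ret = -1;
		goto out;
	}

	ret = insert_entry(fail_insert);
	if (ret)
		goto out;

	free(path);
	return 0;
out:
	free(path);			/* free(NULL) is a no-op */
	reserved_bytes -= bytes;	/* give the reservation back */
	return ret;
}
```

With a single exit label, adding another failure point later only needs one
more "goto out" rather than another copy of the cleanup code.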
[PATCH] Btrfs: increase the max global reserve size to 1gig
Apparently 512mb was too small: with an fs_mark command we could get so much
delayed work built up that we'd never trip the "let's commit the transaction"
logic until we'd gotten too many delayed refs built up.  Increasing this to
1 gig makes us much safer and we no longer abort with Dave's fs_mark tester.
Thanks,

Signed-off-by: Josef Bacik <jba...@fusionio.com>
---
 fs/btrfs/extent-tree.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 7049bbc..f10ac46 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4516,7 +4516,7 @@ static void update_global_block_rsv(struct btrfs_fs_info *fs_info)
 	spin_lock(&sinfo->lock);
 	spin_lock(&block_rsv->lock);
 
-	block_rsv->size = min_t(u64, num_bytes, 512 * 1024 * 1024);
+	block_rsv->size = min_t(u64, num_bytes, 1024 * 1024 * 1024);
 
 	num_bytes = sinfo->bytes_used + sinfo->bytes_pinned +
 		    sinfo->bytes_reserved + sinfo->bytes_readonly +
-- 
1.7.7.6
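For readers unfamiliar with min_t(), the effect of the one-line change is just
a higher ceiling on the computed reserve. A minimal userspace sketch, assuming
num_bytes has already been computed by the caller; global_reserve_size() and
min_u64() are hypothetical names used only here:

```c
#include <stdint.h>

/* New ceiling from the patch: 1024 * 1024 * 1024 bytes. */
#define MAX_GLOBAL_RESERVE (1024ULL * 1024 * 1024)

/* Userspace stand-in for the kernel's min_t(u64, a, b). */
static inline uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Size the global reserve would get for a given computed num_bytes. */
uint64_t global_reserve_size(uint64_t num_bytes)
{
	return min_u64(num_bytes, MAX_GLOBAL_RESERVE);
}
```

A computed value of 600 MiB, for example, was cut to 512 MiB before the change
and passes through unchanged after it; anything above 1 GiB is still capped.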
Re: [PATCH] xfstests: btrfs/276 - stop all fsstress before exiting
On 04/26/2013 08:10 AM, Eric Sandeen wrote:
> On Apr 26, 2013, at 3:35 AM, Jan Schmidt <list.bt...@jan-o-sch.net> wrote:
>
> > On Fri, April 26, 2013 at 07:29 (+0200), Eric Sandeen wrote:
> > > Tests after 276 were failing because the background fsstress
> > > hadn't quit prior to exit, devices couldn't be unmounted, etc.
> >
> > I don't see how that would happen. Any further insight?
>
> Yes, sorry for not including it.  The parent process was killed,
> but the fsstress processes just got reparented to init.
>
> I tried for a while to use pkill to knock them off first but this
> seems simpler, actually.
>
> Eric

Jan, with Eric's explanation, may I put your Reviewed-by: on this patch?
Re: [PATCH] xfstests: btrfs/276 - stop all fsstress before exiting
Thanks for the patch Eric and the review Jan, this has been committed.

--Rich

commit 0b5677123b5d8c0a29b45f55c7b981aeeca9b2c8
Author: Eric Sandeen <sand...@redhat.com>
Date:   Fri Apr 26 05:29:21 2013 +

    xfstests: btrfs/276 - stop all fsstress before exiting
Re: [PATCH] Btrfs: increase the max global reserve size to 1gig
On Fri, May 03, 2013 at 08:56:54AM -0400, Josef Bacik wrote:
> Apparently 512mb was too small, with a fs_mark command we could get so much
> delayed work built up that we'd never trip the "lets commit the transaction"
> logic until we'd gotten too much delayed refs built up.  Increasing this to
> 1 gig makes us much safer and we no longer abort with Dave's fs_mark tester.
> Thanks,

I remember making a similar commit in the past, but users complained
that they could not boot their systems from a btrfs root partition
because of the lack of space, and Chris eventually had to revert that
one...

thanks,
liubo

> Signed-off-by: Josef Bacik <jba...@fusionio.com>
> ---
>  fs/btrfs/extent-tree.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 7049bbc..f10ac46 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4516,7 +4516,7 @@ static void update_global_block_rsv(struct btrfs_fs_info *fs_info)
>  	spin_lock(&sinfo->lock);
>  	spin_lock(&block_rsv->lock);
>
> -	block_rsv->size = min_t(u64, num_bytes, 512 * 1024 * 1024);
> +	block_rsv->size = min_t(u64, num_bytes, 1024 * 1024 * 1024);
>
>  	num_bytes = sinfo->bytes_used + sinfo->bytes_pinned +
>  		    sinfo->bytes_reserved + sinfo->bytes_readonly +
> --
> 1.7.7.6
Re: [PATCH] xfstests 311: test fsync with dm flakey V3
On Fri, May 03, 2013 at 06:28:03AM -0600, Rich Johnston wrote:
> Josef,
>
> The patch does not compile on older kernels (i.e. SLES11 SP2).
>
> fsync-tester.c: In function 'test_three':
> fsync-tester.c:133: warning: implicit declaration of function 'syncfs'
> /tmp/cciHR6Gb.o: In function `test_three':
> /data/lwork/gulag1c/rjohnston/xfstests/src/fsync-tester.c:133: undefined reference to `syncfs'
> collect2: ld returned 1 exit status
> gmake[3]: *** [fsync-tester] Error 1
> gmake[2]: *** [src] Error 2
> make[1]: *** [default] Error 2
> make: *** [default] Error 2
>
> src/fsync-tester.c
> 133         syncfs(test_fd);
>
> Typo ?      ^^
> Did you mean fsync?

Argh crap, I should have noticed this in the manpage:

    syncfs() first appeared in Linux 2.6.39

You can just replace it with sync(), or do you want me to resend the patch
with that change?  Thanks,

Josef
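As an aside, one way a test program could keep using syncfs where it exists and
still build and run on older systems is to go through the raw syscall and fall
back to sync(). This is a sketch under assumptions, not what was committed to
xfstests (the thread settles on plain sync()); flush_fs() and flush_path() are
hypothetical helper names.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/syscall.h>

/*
 * Flush the filesystem that fd lives on. If the build headers or the running
 * kernel do not know syncfs, fall back to a global sync().
 * Returns 0 on success, -1 on a real error such as a bad descriptor.
 */
int flush_fs(int fd)
{
#ifdef SYS_syncfs
	if (syscall(SYS_syncfs, fd) == 0)
		return 0;
	if (errno != ENOSYS)
		return -1;	/* real failure, not just an old kernel */
#else
	(void)fd;		/* headers predate syncfs */
#endif
	sync();			/* coarser: flushes every filesystem */
	return 0;
}

/* Convenience wrapper: flush the filesystem containing path. */
int flush_path(const char *path)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = flush_fs(fd);
	close(fd);
	return ret;
}
```

The trade-off is the one discussed above: sync() is always available but
flushes every mounted filesystem, not only the one under test.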
Re: [PATCH] xfstests 311: test fsync with dm flakey V3
On 05/03/2013 12:30 PM, Josef Bacik wrote:
> On Fri, May 03, 2013 at 06:28:03AM -0600, Rich Johnston wrote:
> > Josef,
> >
> > The patch does not compile on older kernels (i.e. SLES11 SP2).
> >
> > fsync-tester.c: In function 'test_three':
> > fsync-tester.c:133: warning: implicit declaration of function 'syncfs'
> > /tmp/cciHR6Gb.o: In function `test_three':
> > /data/lwork/gulag1c/rjohnston/xfstests/src/fsync-tester.c:133: undefined reference to `syncfs'
> > collect2: ld returned 1 exit status
> > gmake[3]: *** [fsync-tester] Error 1
> > gmake[2]: *** [src] Error 2
> > make[1]: *** [default] Error 2
> > make: *** [default] Error 2
> >
> > src/fsync-tester.c
> > 133         syncfs(test_fd);
> >
> > Typo ?      ^^
> > Did you mean fsync?
>
> Argh crap, I should have noticed this in the manpage:
>
>     syncfs() first appeared in Linux 2.6.39
>
> You can just replace it with sync(), or do you want me to resend the patch
> with that change?  Thanks,
>
> Josef

No need to repost, I will change it to sync() at commit time ;-)

--Rich
Re: [PATCH] xfstests 311: test fsync with dm flakey V3
Thanks for another patch Josef, it has been committed with the change discussed.

--Rich

commit 2ca254dfddbbab8def35472b6ca39140400aff76
Author: Josef Bacik <jba...@fusionio.com>
Date:   Fri Apr 26 19:13:59 2013 +

    xfstests 311: test fsync with dm flakey V3
Re: [PATCH] xfstests 311: test fsync with dm flakey V3
On Fri, May 03, 2013 at 12:21:59PM -0600, Rich Johnston wrote:
> Thanks for another patch Josef, it has been committed with the change discussed.

Err, I forgot to point out that I already have a variable named sync in
there, so it fails to compile. We'll need to rename the variable to
do_sync or something.  Want me to send a patch along?  Thanks,

Josef
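The clash is the ordinary C one: an object named sync cannot coexist with a
call to sync() from <unistd.h> in the same scope. A small sketch of the rename
suggested above; the thread does not say whether the original variable was
global or local, and maybe_sync() is a hypothetical helper, not code from
fsync-tester.c.

```c
#include <unistd.h>

/*
 * Naming this flag "sync" would collide with the sync() function declared in
 * <unistd.h>; do_sync is the name proposed in the thread.
 */
int do_sync = 0;

/* Returns 1 if a sync() was issued, 0 if it was skipped. */
int maybe_sync(void)
{
	if (!do_sync)
		return 0;
	sync();
	return 1;
}
```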
Re: [PATCH] xfstests 311: test fsync with dm flakey V3
On 05/03/2013 02:05 PM, Josef Bacik wrote:
> On Fri, May 03, 2013 at 12:21:59PM -0600, Rich Johnston wrote:
> > Thanks for another patch Josef, it has been committed with the change discussed.
>
> Err I forgot to point out I already have a sync variable in there so it fails to
> compile, we'll need to change the var to do_sync or something.  Want me to send
> a patch along?  Thanks,
>
> Josef

Sorry, this was my fault. I have reverted

commit 7f622f44b651aec13b99ef62c2942388a6fbee5d
Author: Rich Johnston <rjohns...@sgi.com>
Date:   Fri May 3 14:07:59 2013 -0500

    Revert "xfstests 311: test fsync with dm flakey V3"

and committed it again.

commit dd3b5268312e0518ae695e8ee2a618f13805c425
Author: Josef Bacik <jba...@fusionio.com>
Date:   Fri Apr 26 19:13:59 2013 +

    xfstests 311: test fsync with dm flakey V4
[PATCH] xfstests: unmount scratch mnt in test 307
So if you have a mount command that doesn't use /etc/mtab then it will spit out
a different device for the mounted device.  So say we have

SCRATCH_DEV_POOL="/dev/sda /dev/sdb /dev/sdc"

we will turn this into

SCRATCH_DEV=/dev/sda
SCRATCH_DEV_POOL="/dev/sdb /dev/sdc"

and then when you mkfs this you do

_scratch_mkfs $SCRATCH_DEV_POOL

which turns into this

mkfs.btrfs /dev/sdb /dev/sdc /dev/sda

because we do

mkfs $* $SCRATCH_DEV

Then btrfs will always show the lowest devid in /proc/mounts to maintain
consistency, so even though we do mount /dev/sda $SCRATCH_MNT, you will see
/dev/sdb as the mounted device in /proc/mounts.  So then say the next test
wants to just use $SCRATCH_DEV, it will do _require_scratchdev which will check
to see if $SCRATCH_DEV is mounted, which it will look like it is not because
/proc/mounts shows /dev/sdb instead of /dev/sda, and so it won't umount
$SCRATCH_MNT, and then that test will fail because we can't mkfs the device
because it is busy.  I reproduced this on a box that doesn't use /etc/mtab by
doing

./check btrfs/307 generic/015

and 015 would fail.  With this patch it passes now.  Thanks,

Signed-off-by: Josef Bacik <jba...@fusionio.com>
---
 tests/btrfs/307 |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/tests/btrfs/307 b/tests/btrfs/307
index 87314c6..15157b3 100644
--- a/tests/btrfs/307
+++ b/tests/btrfs/307
@@ -35,6 +35,7 @@ _cleanup()
 {
     cd /
     rm -f $tmp.*
+    umount $SCRATCH_MNT
 }
 
 # get standard environment, filters and checks
-- 
1.7.7.6
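The mismatch described above can be modelled in a few lines of C. This is a toy
model, not xfstests or btrfs code: it assumes devids are handed out in mkfs
command-line order starting at 1, and struct fs_model, model_mkfs(),
shown_device() and looks_mounted() are all hypothetical names.

```c
#include <string.h>

#define MAX_DEVS 8

/* Toy filesystem: dev[i] is the device that received devid i + 1. */
struct fs_model {
	const char *dev[MAX_DEVS];
	int ndevs;
};

/* Model "mkfs $* $SCRATCH_DEV": pool devices first, scratch device last. */
void model_mkfs(struct fs_model *fs, const char **pool, int npool,
		const char *scratch_dev)
{
	int i;

	fs->ndevs = 0;
	for (i = 0; i < npool && fs->ndevs < MAX_DEVS - 1; i++)
		fs->dev[fs->ndevs++] = pool[i];
	fs->dev[fs->ndevs++] = scratch_dev;
}

/* Name reported for the mount: the device with the lowest devid. */
const char *shown_device(const struct fs_model *fs)
{
	return fs->dev[0];
}

/* A check that compares the reported name against the device it expects. */
int looks_mounted(const struct fs_model *fs, const char *dev)
{
	return strcmp(shown_device(fs), dev) == 0;
}
```

With the pool /dev/sdb /dev/sdc and scratch device /dev/sda, shown_device()
yields /dev/sdb, so a check for /dev/sda concludes nothing is mounted even
though the filesystem is.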
Re: [PATCH] xfstests: unmount scratch mnt in test 307
On 5/3/13 3:11 PM, Josef Bacik wrote:
> So if you have a mount command that doesn't use /etc/mtab then it will spit out
> a different device for the mounted device.  So say we have
>
> SCRATCH_DEV_POOL="/dev/sda /dev/sdb /dev/sdc"
>
> we will turn this into
>
> SCRATCH_DEV=/dev/sda
> SCRATCH_DEV_POOL="/dev/sdb /dev/sdc"
>
> and then when you mkfs this you do
>
> _scratch_mkfs $SCRATCH_DEV_POOL
>
> which turns into this
>
> mkfs.btrfs /dev/sdb /dev/sdc /dev/sda
>
> because we do
>
> mkfs $* $SCRATCH_DEV
>
> Then btrfs will always show the lowest devid in /proc/mounts to maintain
> consistency, so even though we do mount /dev/sda $SCRATCH_MNT, you will see
> /dev/sdb as the mounted device in /proc/mounts.  So then say the next test
> wants to just use $SCRATCH_DEV, it will do _require_scratchdev which will check
> to see if $SCRATCH_DEV is mounted, which it will look like it is not because
> /proc/mounts shows /dev/sdb instead of /dev/sda, and so it won't umount
> $SCRATCH_MNT, and then that test will fail because we can't mkfs the device
> because it is busy.  I reproduced this on a box that doesn't use /etc/mtab by
> doing
>
> ./check btrfs/307 generic/015
>
> and 015 would fail.  With this patch it passes now.  Thanks,
>
> Signed-off-by: Josef Bacik <jba...@fusionio.com>
> ---
>  tests/btrfs/307 |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/tests/btrfs/307 b/tests/btrfs/307
> index 87314c6..15157b3 100644
> --- a/tests/btrfs/307
> +++ b/tests/btrfs/307
> @@ -35,6 +35,7 @@ _cleanup()
>  {
>      cd /
>      rm -f $tmp.*
> +    umount $SCRATCH_MNT
>  }
>
>  # get standard environment, filters and checks

This seems fine for this particular test.

Is it really a hard requirement that each test unmount
SCRATCH_[DEV|MNT] if it used it?  If so, fine... the README
does indicate this.

But I wonder if we can make it a little more foolproof by
updating _require_scratch to handle this situation more gracefully?

-Eric
Re: [PATCH] btrfs: use only memmove_extent_buffer and simplify the helpers
On Mon, Apr 29, 2013 at 07:38:01AM -0600, David Sterba wrote:
> After commit a65917156e34594 ("Btrfs: stop using highmem for extent_buffers")
> we don't need to call kmap_atomic anymore and can reduce the move_pages
> helper to a simple memmove.
>
> There's only one caller of memcpy_extent_buffer, we can use the
> memmove_ variant here.

This makes -l 64k blow the hell up, just try generic/001.  I'm kicking this
patch out.  Thanks,

Josef
Re: [PATCH] btrfs: use only memmove_extent_buffer and simplify the helpers
Quoting Josef Bacik (2013-05-03 16:33:44)
> On Mon, Apr 29, 2013 at 07:38:01AM -0600, David Sterba wrote:
> > After commit a65917156e34594 ("Btrfs: stop using highmem for extent_buffers")
> > we don't need to call kmap_atomic anymore and can reduce the move_pages
> > helper to a simple memmove.
> >
> > There's only one caller of memcpy_extent_buffer, we can use the
> > memmove_ variant here.
>
> This makes -l 64k blow the hell up, just try generic/001.  I'm kicking this
> patch out.  Thanks,

Sorry Dave, I only now remember having this same problem the last time I
tried to get rid of memcpy.

-chris
grub/grub2 boot into btrfs raid root and with no initrd
I've made a few attempts to boot into a root filesystem created using:

mkfs.btrfs -d raid1 -m raid1 -L btrfs_root_3 /dev/sda3 /dev/sdb3

Both grub and grub2 pick up a kernel image fine from an ext4 /boot on
/dev/sda1 for example, but then fail to find or assemble the btrfs root.

Setting up an initrd and grub operates fine for the btrfs raid.

What is the special magic to do this without the need for an initrd?

Is the comment/patch below from last year languishing unknown? Or is
there some problem with that kernel approach?

Thanks,
Martin


See:

http://forums.gentoo.org/viewtopic-t-923554-start-0.html

Below is my patch, which is working fine for me with 3.8.2.
Code:

$ cat /etc/portage/patches/sys-kernel/gentoo-sources/earlydevtmpfs.patch
--- init/do_mounts.c.orig   2013-03-24 20:49:53.446971127 +0100
+++ init/do_mounts.c   2013-03-24 20:51:46.408237541 +0100
@@ -529,6 +529,7 @@
 	create_dev("/dev/root", ROOT_DEV);
 	if (saved_root_name[0]) {
 		create_dev(saved_root_name, ROOT_DEV);
+		devtmpfs_mount("dev");
 		mount_block_root(saved_root_name, root_mountflags);
 	} else {
 		create_dev("/dev/root", ROOT_DEV);
Re: grub/grub2 boot into btrfs raid root and with no initrd
On Fri, May 3, 2013 at 10:42 PM, Martin <m_bt...@ml1.co.uk> wrote:
> I've made a few attempts to boot into a root filesystem created using:
>
> mkfs.btrfs -d raid1 -m raid1 -L btrfs_root_3 /dev/sda3 /dev/sdb3
>
> Both grub and grub2 pick up a kernel image fine from an ext4 /boot on
> /dev/sda1 for example, but then fail to find or assemble the btrfs root.
>
> Setting up an initrd and grub operates fine for the btrfs raid.
>
> What is the special magic to do this without the need for an initrd?
>
> Is the comment/patch below from last year languishing unknown? Or is
> there some problem with that kernel approach?
>
> Thanks,
> Martin
>
>
> See:
>
> http://forums.gentoo.org/viewtopic-t-923554-start-0.html
>
> Below is my patch, which is working fine for me with 3.8.2.
> Code:
>
> $ cat /etc/portage/patches/sys-kernel/gentoo-sources/earlydevtmpfs.patch
> --- init/do_mounts.c.orig   2013-03-24 20:49:53.446971127 +0100
> +++ init/do_mounts.c   2013-03-24 20:51:46.408237541 +0100
> @@ -529,6 +529,7 @@
>  	create_dev("/dev/root", ROOT_DEV);
>  	if (saved_root_name[0]) {
>  		create_dev(saved_root_name, ROOT_DEV);
> +		devtmpfs_mount("dev");
>  		mount_block_root(saved_root_name, root_mountflags);
>  	} else {
>  		create_dev("/dev/root", ROOT_DEV);

The initrd has to run btrfs-scan so that btrfs can find the other
devices that have btrfs on them.
Alternatively you can give all involved devices in the fstab and kernel
command line with device=/dev/name
[PATCH] Btrfs-progs: init free space ctl with proper unit
btrfsck was blowing up when checking the free space cache when we ran xfstests
with -l 64k.  That is because I was init'ing the free space ctl to whatever the
leafsize was, which isn't right for data block groups.  With this patch btrfsck
no longer complains.  This also fixes a tiny little typo in free-space-cache.c I
noticed while figuring this problem out.  Thanks,

Signed-off-by: Josef Bacik <jba...@fusionio.com>
---
 cmds-check.c       |   11 +++++++++--
 free-space-cache.c |    2 --
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 030ab77..02bfedd 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -3001,8 +3001,15 @@ static int check_space_cache(struct btrfs_root *root)
 		start = cache->key.objectid + cache->key.offset;
 		if (!cache->free_space_ctl) {
-			if (btrfs_init_free_space_ctl(cache,
-						      root->leafsize)) {
+			int sectorsize;
+
+			if (cache->flags & (BTRFS_BLOCK_GROUP_METADATA |
+			    BTRFS_BLOCK_GROUP_SYSTEM))
+				sectorsize = root->leafsize;
+			else
+				sectorsize = root->sectorsize;
+
+			if (btrfs_init_free_space_ctl(cache, sectorsize)) {
 				ret = -ENOMEM;
 				break;
 			}
diff --git a/free-space-cache.c b/free-space-cache.c
index 8a77a32..5fb8ece 100644
--- a/free-space-cache.c
+++ b/free-space-cache.c
@@ -808,8 +808,6 @@ int btrfs_add_free_space(struct btrfs_free_space_ctl *ctl, u64 offset,
 	try_merge_free_space(ctl, info);
 
 	ret = link_free_space(ctl, info);
-	if (ret)
-
 	if (ret) {
 		printk(KERN_CRIT "btrfs: unable to add free space :%d\n", ret);
 		BUG_ON(ret == -EEXIST);
-- 
1.7.7.6
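The rule the patch introduces (metadata and system block groups are tracked in
leaf-sized units, data block groups in sector-sized units) can be isolated into
a small standalone function. A minimal sketch: the flag values mirror the btrfs
on-disk block group type bits, while free_space_unit() is a hypothetical helper
name that does not exist in btrfs-progs.

```c
#include <stdint.h>

/* Block group type bits, as in the btrfs on-disk format. */
#define BTRFS_BLOCK_GROUP_DATA		(1ULL << 0)
#define BTRFS_BLOCK_GROUP_SYSTEM	(1ULL << 1)
#define BTRFS_BLOCK_GROUP_METADATA	(1ULL << 2)

/*
 * Unit to initialise the free space ctl with: leafsize for metadata and
 * system block groups, sectorsize for everything else (data).
 */
uint32_t free_space_unit(uint64_t flags, uint32_t leafsize,
			 uint32_t sectorsize)
{
	if (flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM))
		return leafsize;
	return sectorsize;
}
```

With -l 64k and 4k sectors, a data block group therefore gets a 4096-byte unit
rather than 65536, which is the case the old code got wrong.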
Re: [PATCH] xfstests: unmount scratch mnt in test 307
On Fri, May 03, 2013 at 03:15:01PM -0500, Eric Sandeen wrote:
> On 5/3/13 3:11 PM, Josef Bacik wrote:
> > So if you have a mount command that doesn't use /etc/mtab then it will
> > spit out a different device for the mounted device. So say we have
> >
> > SCRATCH_DEV_POOL="/dev/sda /dev/sdb /dev/sdc"
> >
> > we will turn this into
> >
> > SCRATCH_DEV=/dev/sda
> > SCRATCH_DEV_POOL="/dev/sdb /dev/sdc"
> >
> > and then when you mkfs this you do
> >
> > _scratch_mkfs $SCRATCH_DEV_POOL
> >
> > which turns into this
> >
> > mkfs.btrfs /dev/sdb /dev/sdc /dev/sda
> >
> > because we do mkfs $* $SCRATCH_DEV. Then btrfs will always show the
> > lowest devid in /proc/mounts to maintain consistency, so even though we
> > do mount /dev/sda $SCRATCH_MNT, you will see /dev/sdb as the mounted
> > device in /proc/mounts. So then say the next test wants to just use
> > $SCRATCH_DEV, it will do _require_scratchdev which will check to see if
> > $SCRATCH_DEV is mounted, which it will look like it is not because
> > /proc/mounts shows /dev/sdb instead of /dev/sda, and so it won't umount
> > $SCRATCH_MNT, and then that test will fail because we can't mkfs the
> > device because it is busy. I reproduced this on a box that doesn't use
> > /etc/mtab by doing
> >
> > ./check btrfs/307 generic/015
> >
> > and 015 would fail. With this patch it passes now. Thanks,
> >
> > Signed-off-by: Josef Bacik jba...@fusionio.com
> > ---
> >  tests/btrfs/307 |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/tests/btrfs/307 b/tests/btrfs/307
> > index 87314c6..15157b3 100644
> > --- a/tests/btrfs/307
> > +++ b/tests/btrfs/307
> > @@ -35,6 +35,7 @@ _cleanup()
> >  {
> >      cd /
> >      rm -f $tmp.*
> > +umount $SCRATCH_MNT
> >  }
> >
> >  # get standard environment, filters and checks
>
> This seems fine for this particular test.
>
> Is it really a hard requirement that each test unmount SCRATCH_[DEV|MNT]
> if it used it? If so, fine... the README does indicate this.
>
> But I wonder if we can make it a little more foolproof by updating
> _require_scratch to handle this situation more gracefully?
>
> It already tries to unmount $SCRATCH_DEV, and will throw an error if
> it's not mounted on $SCRATCH_MNT.

I guess the opposite checks are necessary in this case, i.e. check that
SCRATCH_MNT is not mounted, and throw an error if it's not SCRATCH_DEV
that is mounted there...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
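A minimal sketch of the "opposite check" Dave describes, assuming it is acceptable to read /proc/mounts directly. The helper names are hypothetical and are not the xfstests functions. The optional last argument exists only so the logic can be exercised against a saved copy of the mounts table.

```shell
#!/bin/sh
# mounted_dev MNT [MOUNTS_FILE]
# Print the device mounted on MNT according to MOUNTS_FILE (default
# /proc/mounts); print nothing if MNT is not a mount point. The last
# matching entry wins, since later mounts shadow earlier ones.
mounted_dev() {
    awk -v mnt="$1" '$2 == mnt { dev = $1 } END { if (dev != "") print dev }' \
        "${2:-/proc/mounts}"
}

# check_scratch_mnt DEV MNT [MOUNTS_FILE]
# Succeed if nothing is mounted on MNT or if DEV is what is mounted there;
# fail with a message if some other device occupies MNT.
# (Hypothetical helpers for illustration, not part of xfstests.)
check_scratch_mnt() {
    cur=$(mounted_dev "$2" "$3")
    if [ -n "$cur" ] && [ "$cur" != "$1" ]; then
        echo "$2 has $cur mounted, expected $1" >&2
        return 1
    fi
    return 0
}
```

In the multi-device btrfs case from Josef's description, /proc/mounts reports /dev/sdb while SCRATCH_DEV is /dev/sda, so `check_scratch_mnt "$SCRATCH_DEV" "$SCRATCH_MNT"` would fail loudly rather than letting the next test hit a busy device. Mount points containing spaces are escaped in /proc/mounts and are not handled by this sketch.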
[3.9] parallel fsmark perf is real bad on sparse devices
Hi folks,

It's that time again - I ran fsmark on btrfs and found performance was
awful.

tl;dr: memory pressure causes random writeback of metadata (bad),
fragmenting the underlying sparse storage. This causes a downward spiral
as btrfs cycles through good IO patterns that get fragmented at the
device level due to the bad IO patterns fragmenting the underlying
sparse device.

FYI, the storage hardware is a DM RAID0 stripe across 4 SSDs sitting
behind 512MB of BBWC with an XFS filesystem on it. The only file on the
filesystem is the sparse 100TB file used for the device, and the VM is
using virtio,cache=none to access the filesystem image.

i.e. the storage I'm working on this time is a thinly provisioned 100TB
device fed to an 8p, 4GB RAM VM, and this script is then run:

$ cat fsmark-50-test-btrfs.sh
#!/bin/bash

sudo umount /mnt/scratch > /dev/null 2>&1
sudo mkfs.btrfs /dev/vdc
sudo mount /dev/vdc /mnt/scratch
sudo chmod 777 /mnt/scratch
cd /home/dave/src/fs_mark-3.3/
time ./fs_mark -D 10000 -S0 -n 100000 -s 0 -L 63 \
	-d /mnt/scratch/0 -d /mnt/scratch/1 \
	-d /mnt/scratch/2 -d /mnt/scratch/3 \
	-d /mnt/scratch/4 -d /mnt/scratch/5 \
	-d /mnt/scratch/6 -d /mnt/scratch/7 \
	| tee >(stats --trim-outliers | tail -1 1>&2)
sync
$
$ ./fsmark-50-test-btrfs.sh

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/vdc
	nodesize 4096 leafsize 4096 sectorsize 4096 size 100.00TB
Btrfs Btrfs v0.19

# ./fs_mark -D 10000 -S0 -n 100000 -s 0 -L 63 -d /mnt/scratch/0 -d /mnt/scratch/1 -d /mnt/scratch/2 -d /mnt/scratch/3 -d /mnt/scratch/4 -d /mnt/scratch/5 -d /mnt/scratch/6 -d /mnt/scratch/7
# Version 3.3, 8 thread(s) starting at Fri May 3 17:08:46 2013
# Sync method: NO SYNC: Test does not issue sync() or fsync() calls.
# Directories: Time based hash between directories across 10000 subdirectories with 180 seconds per subdirectory.
# File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name)
# Files info: size 0 bytes, written with an IO size of 16384 bytes per write
# App overhead is time in microseconds spent in the test not doing file writing related system calls.

FSUse%        Count         Size    Files/sec     App Overhead
     0       800000            0      53498.9          7898900
     0      1600000            0      11186.5          9409278
     0      2400000            0      17026.1          7907599
     0      3200000            0      25815.6          9749980
     0      4000000            0      11503.0          8556349
     0      4800000            0      43561.9          8295238
     0      5600000            0      17175.3          8304668
^C
     0 800000-5600000(3.2e+06+/-1.1e+06) 0 11186.50-53498.90(23016.4+/-1.1e+04) 7898900-9749980(8.49463e+06+/-5e+05)

What I'm seeing is that the underlying image file is getting badly,
badly fragmented. This short test created approximately 8 million
extents in the image file in about 10 minutes runtime. Running xfs_fsr
on the image file pointed this out:

# xfs_fsr -d -v vm-100TB-sparse.img
vm-100TB-sparse.img
vm-100TB-sparse.img extents=7971773 can_save=7926036 tmp=./.fsr6198
DEBUG: fsize=109951162777600 blsz_dio=16773120 d_min=512 d_max=2147483136 pgsz=4096
Temporary file has 46107 extents (7971773 in original)
extents before:7971773 after:46107 vm-100TB-sparse.img
#

Most of the data written to the file is contiguous. This means that
btrfs is filling the filesystem in a contiguous manner, but its IO is
anything but contiguous.

So, what's happening here? Turns out that when the machine first runs
out of free memory (about 1.2m inodes in), btrfs goes from running a
couple of hundred nice large 512k IOs a second to an intense 10s long
burst of 10-15kiops of tiny random IOs.
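
One way to put numbers on the change in IO pattern described above is to summarize write completions from blkparse-style text output. The function below is a sketch under the assumption that the input uses the default blkparse column layout seen in this thread (device, cpu, sequence, time, pid, action, rwbs, sector, "+", sector count, process). The function name is made up for this example.

```shell
#!/bin/sh
# summarize_completions
# Read blkparse-style lines on stdin and report how many write
# completions ("C" events with a W in the rwbs field) were seen and
# their mean size in KiB, assuming 512-byte sectors.
# (Illustrative helper, not part of blktrace.)
summarize_completions() {
    awk '$6 == "C" && $7 ~ /W/ && $9 == "+" {
             n++
             sectors += $10
         }
         END {
             if (n > 0)
                 printf "%d completions, mean %.1f KiB\n", n, sectors * 512 / n / 1024
             else
                 print "no write completions found"
         }'
}
```

Feeding it one trace window from the fast phase and one from the slow phase gives a direct comparison: 1024-sector completions average 512 KiB, while the burst of tiny random IOs would show a much smaller mean at a much higher count.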

Looking at it from the IO completion side of things:

253,32   4      238     5.936043934     0  C   W 103680 + 1024 [0]
253,32   4      239     5.936155917     0  C   W 2201728 + 1024 [0]
253,32   4      240     5.936172087     0  C   W 104704 + 1024 [0]
253,32   4      241     5.936283060     0  C   W 2202752 + 1024 [0]
253,32   4      242     5.936294881     0  C   W 105728 + 1024 [0]
253,32   4      243     5.936385182     0  C   W 106752 + 1024 [0]
253,32   4      244     5.936394695     0  C   W 107776 + 1024 [0]
253,32   4      245     5.936402936     0  C   W 108800 + 1024 [0]
253,32   4      246     5.936406721     0  C   W 109824 + 896 [0]
253,32   4      247     5.936414258     0  C   W 2203776 + 1024 [0]
253,32   4      248     5.936515302     0  C   W 2204800 + 1024 [0]
253,32   4      249     5.936606737     0  C   W 2205824 + 1024 [0]
253,32   4      250     5.936689345     0  C   W 2206848 + 1024 [0]

All nice and large, mostly sequential IO patterns. Fast forward to where
we've run out of memory:

253,32