Re: btrfs raid1 degraded does not mount or fsck
On Friday, 29 October, 2010, Vladi Gergov wrote:
> kernel: scratch git repo from today 10.29.10 @ 14:30 PST
> Btrfs v0.19-35-g1b444cd-dirty
>
> gypsyops @ /mnt sudo btrfs filesystem show
> Label: 'das4' uuid: d0e5137f-e5e7-49da-91f6-a9c4e4e72c6f
>         Total devices 3 FS bytes used 1.38TB
>         devid 3 size 1.82TB used 0.00 path /dev/sdb
>         devid 2 size 1.82TB used 1.38TB path /dev/sdc
>         *** Some devices missing
>
> Btrfs v0.19-35-g1b444cd-dirty
>
> gypsyops @ /mnt sudo mount -o degraded /dev/sdc das3/
> Password:
> mount: wrong fs type, bad option, bad superblock on /dev/sdc,
>        missing codepage or helper program, or other error
>        In some cases useful info is found in syslog - try
>        dmesg | tail or so
>
> [ 684.577540] device label das4 devid 2 transid 107954 /dev/sdc
> [ 684.595150] btrfs: allowing degraded mounts
> [ 684.595594] btrfs: failed to read chunk root on sdb
> [ 684.604110] btrfs: open_ctree failed
>
> gypsyops @ /mnt sudo btrfsck /dev/sdc
> btrfsck: volumes.c:1367: btrfs_read_sys_array: Assertion `!(ret)' failed.
>
> any help please?

Did you execute "btrfs dev scan" before trying to mount the filesystem?

> --
> ,-| Vladi
> `-| Gergov

--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E C054 BF04 F161 3DC5 0512
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Can't remove missing drive
Hi,

I have a raid1 setup with a missing device. I have added a new device and everything seems to be working fine, except that I cannot remove the old, missing device. There is no error, but the 'Some devices missing' tag doesn't go away.

r...@willvo:~# btrfs filesystem show
failed to read /dev/sr0
Label: none uuid: f929c413-01c8-443f-b4f2-86f36702f519
        Total devices 3 FS bytes used 578.39GB
        devid 1 size 931.51GB used 604.00GB path /dev/sdb1
        devid 2 size 931.51GB used 604.00GB path /dev/sdc1
        *** Some devices missing

Btrfs Btrfs v0.19
r...@willvo:~# btrfs device delete missing /data
r...@willvo:~# btrfs filesystem show
failed to read /dev/sr0
Label: none uuid: f929c413-01c8-443f-b4f2-86f36702f519
        Total devices 3 FS bytes used 578.39GB
        devid 1 size 931.51GB used 604.00GB path /dev/sdb1
        devid 2 size 931.51GB used 604.00GB path /dev/sdc1
        *** Some devices missing

Btrfs Btrfs v0.19

There are a number of sub-volumes of /data that are mounted in other locations. I'm using kernel 2.6.36 (the lucid backport of the natty kernel) and similar btrfs-tools (lucid backport of natty tools).

Interestingly, looking at the output of `df -h`, it appears that the 'missing' devices are no longer being counted in the filesystem size; there is just a phantom 'missing' tag in btrfs-show.

Is this actually a problem, or can I just keep running as is? It seems to mount fine without -o degraded.

Any ideas how I can list the missing devices?
Any ideas on how I can remove the missing devices?

Be well,
Will :-}
[PATCH] Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)) - generated by Coccinelle
This patch was generated using the Coccinelle scripts and btrfs code in v2.6.36-9657-g7a3f8fe.

Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...))

The semantic patch that makes this change is available in scripts/coccinelle/api/err_cast.cocci.

More information about semantic patching is available at http://coccinelle.lip6.fr/

Signed-off-by: Chris Samuel <ch...@csamuel.org>

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 454ca52..23cb8da 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -335,7 +335,7 @@ struct extent_map *lookup_extent_mapping(struct extent_map_tree *tree,
 		goto out;
 	}
 	if (IS_ERR(rb_node)) {
-		em = ERR_PTR(PTR_ERR(rb_node));
+		em = ERR_CAST(rb_node);
 		goto out;
 	}
 	em = rb_entry(rb_node, struct extent_map, rb_node);
@@ -384,7 +384,7 @@ struct extent_map *search_extent_mapping(struct extent_map_tree *tree,
 		goto out;
 	}
 	if (IS_ERR(rb_node)) {
-		em = ERR_PTR(PTR_ERR(rb_node));
+		em = ERR_CAST(rb_node);
 		goto out;
 	}
 	em = rb_entry(rb_node, struct extent_map, rb_node);
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index ebe46c6..1bc7a0a 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -380,7 +380,7 @@ static struct dentry *get_default_root(struct super_block *sb,
 find_root:
 	new_root = btrfs_read_fs_root_no_name(root->fs_info, &location);
 	if (IS_ERR(new_root))
-		return ERR_PTR(PTR_ERR(new_root));
+		return ERR_CAST(new_root);
 
 	if (btrfs_root_refs(&new_root->root_item) == 0)
 		return ERR_PTR(-ENOENT);

--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP
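For readers unfamiliar with the idiom this patch touches, here is a minimal userspace sketch of how error pointers and ERR_CAST fit together. It is only a model, not the kernel's err.h: the lowercase helpers, find_node() and lookup() are hypothetical names invented for this illustration, and only MAX_ERRNO mirrors the kernel's value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of the kernel's error-pointer helpers (assumption:
 * simplified stand-ins, not the real include/linux/err.h). */
#define MAX_ERRNO 4095

struct rb_node { int dummy; };
struct extent_map { int dummy; };

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error) { return (void *)error; }

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr) { return (long)ptr; }

/* Error pointers occupy the top MAX_ERRNO addresses. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Re-type an error pointer without decoding and re-encoding it. */
static inline void *err_cast(const void *ptr) { return (void *)ptr; }

/* Hypothetical low-level lookup that may fail with -ENOMEM. */
static struct rb_node *find_node(int fail)
{
	static struct rb_node node;

	return fail ? err_ptr(-ENOMEM) : &node;
}

/* Hypothetical caller returning a different pointer type: the error
 * is passed through with err_cast() rather than
 * err_ptr(ptr_err(n)), which is the change the patch makes. */
static struct extent_map *lookup(int fail)
{
	static struct extent_map em;
	struct rb_node *n = find_node(fail);

	if (is_err(n))
		return err_cast(n);
	return &em;
}
```

Both spellings produce the same pointer value; the cast form simply states the intent (same error, different pointer type) in one step.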
Re: [patch 1/2] Balance progress monitoring.
On Sat, Oct 30, 2010 at 01:07:27AM +0100, Hugo Mills wrote:
> This patch introduces a basic form of progress monitoring for balance
> operations, by counting the number of block groups remaining. The
> information is exposed to userspace by an ioctl.

Dammit. An unrefreshed quilt patch let an error get through (see below). Updated patch in a few moments.

Hugo.

> Index: linux-mainline/fs/btrfs/volumes.c
> ===================================================================
> --- linux-mainline.orig/fs/btrfs/volumes.c	2010-10-26 18:03:38.0 +0100
> +++ linux-mainline/fs/btrfs/volumes.c	2010-10-29 17:23:40.463279287 +0100
> @@ -1902,6 +1902,7 @@
>  	struct btrfs_root *chunk_root = dev_root->fs_info->chunk_root;
>  	struct btrfs_trans_handle *trans;
>  	struct btrfs_key found_key;
> +	struct btrfs_balance_status *bal_info;

+	struct btrfs_balance_info *bal_info;

--
=== Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
    --- dragon A linked list is still a binary tree. Just a ---
                        very unbalanced one.
[GIT PULL] Btrfs updates for 2.6.37
Hi everyone,

There were some minor conflicts with Linus' current tree, so my branch is merged with Linus' tree as of this morning.

It includes some new writeback helpers so that btrfs can kick off IO to reclaim delalloc space. I bounced a few different interfaces off Christoph before this one. It isn't quite perfect for what btrfs is doing but it just adds a new func that limits the number of pages we'll send down to writeback_inodes_sb.

Otherwise these are all btrfs commits. The big focus of the work this time around is performance around ENOSPC and bug fixes.

Josef also has some block group caching code which writes out the free space cache with each commit. This is disabled by default, but you get it with mount -o space_cache. After a fresh mount, his new code makes us dramatically faster because we don't have to scan the btree for free blocks.

Sage has a collection of new ioctls that ceph will be using. They make it possible for ceph to stop using the transaction start/stop ioctls, so it's a big cleanup.

Linus, please pull the for-linus branch of the btrfs unstable tree:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable.git for-linus

The master branch has all of these changes against 2.6.36

Josef Bacik (17) commits (+1775/-254):
    Btrfs: set trans to null in reserve_metadata_bytes if we commit the transaction (+6/-3)
    Btrfs: check cache->caching_ctl before returning if caching has started (+6/-0)
    Btrfs: stop trying to shrink delalloc if there are no inodes to reclaim (+5/-0)
    Btrfs: add support for mixed data+metadata block groups (+28/-4)
    Btrfs: fix reservation code for mixed block groups (+6/-2)
    Btrfs: let the user know space caching is enabled (+2/-0)
    Btrfs: create special free space cache inode (+668/-46)
    Btrfs: rework how we reserve metadata bytes (+136/-127)
    Btrfs: fix the df ioctl to report raid types (+76/-24)
    Btrfs: don't allocate chunks as aggressively (+5/-2)
    Btrfs: load free space cache if it exists (+345/-3)
    Btrfs: fix error handling in btrfs_get_sb (+3/-4)
    Btrfs: remove warn_on from use_block_rsv (+0/-5)
    Btrfs: Add a clear_cache mount option (+8/-3)
    Btrfs: write out free space cache (+420/-13)
    Btrfs: re-work delalloc flushing (+38/-15)
    Btrfs: fix df regression (+23/-3)

Sage Weil (9) commits (+438/-47):
    Btrfs: allow subvol deletion by unprivileged user with -o user_subvol_rm_allowed (+116/-5)
    Btrfs: fix clone ioctl where range is adjacent to extent (+1/-1)
    Btrfs: fix deadlock in btrfs_commit_transaction (+5/-8)
    Btrfs: fix lockdep warning on clone ioctl (+4/-4)
    Btrfs: fix delalloc checks in clone ioctl (+5/-3)
    Btrfs: add START_SYNC, WAIT_SYNC ioctls (+89/-0)
    Btrfs: add SNAP_CREATE_ASYNC ioctl (+93/-25)
    Btrfs: async transaction commit (+124/-0)
    Btrfs: make SNAP_DESTROY async (+1/-1)

Chris Mason (7) commits (+108/-48):
    Btrfs: tune the chunk allocation to 5% of the FS as metadata (+18/-4)
    Btrfs: use the flusher threads for delalloc throttling (+15/-18)
    Btrfs: deal with errors from updating the tree log (+2/-1)
    Add new functions for triggering inode writeback (+44/-10)
    Btrfs: fix raid code for removing missing drives (+1/-2)
    Btrfs: drop unused variable in block_alloc_rsv (+0/-4)
    Btrfs: don't loop forever on bad btree blocks (+28/-9)

Julia Lawall (2) commits (+9/-17):
    Btrfs: use memdup_user helpers (+6/-14)
    Btrfs: Use ERR_CAST helpers (+3/-3)

Andi Kleen (2) commits (+14/-100):
    Btrfs: Fix variables set but not read (bugs found by gcc 4.6) (+10/-6)
    Btrfs: cleanup warnings from gcc 4.6 (nonbugs) (+4/-94)

Miao Xie (2) commits (+86/-80):
    Btrfs: Switch the extent buffer rbtree into a radix tree (+49/-69)
    Btrfs: restructure try_release_extent_buffer() (+37/-11)

Total: (39) commits (+2430/-546)

 fs/btrfs/compression.c      |    2 -
 fs/btrfs/ctree.c            |   57 ++--
 fs/btrfs/ctree.h            |  100 +-
 fs/btrfs/dir-item.c         |    2 +-
 fs/btrfs/disk-io.c          |   32 ++-
 fs/btrfs/extent-tree.c      |  694 +++-
 fs/btrfs/extent_io.c        |  168 +-
 fs/btrfs/extent_io.h        |    4 +-
 fs/btrfs/extent_map.c       |    4 +-
 fs/btrfs/free-space-cache.c |  751 +++
 fs/btrfs/free-space-cache.h |   18 +
 fs/btrfs/inode.c            |  202 +---
 fs/btrfs/ioctl.c            |  398 ++-
 fs/btrfs/ioctl.h            |   13 +-
 fs/btrfs/ordered-data.c     |    2 -
 fs/btrfs/relocation.c       |  109 ++-
 fs/btrfs/root-tree.c        |    2 -
 fs/btrfs/super.c            |   41 ++-
 fs/btrfs/transaction.c      |  234 --
 fs/btrfs/transaction.h      |    8 +
 fs/btrfs/tree-defrag.c      |    2 -
 fs/btrfs/tree-log.c         |   17 +-
 fs/btrfs/volumes.c          |    7 +-
 fs/btrfs/xattr.c            |    2 -
 fs/btrfs/zlib.c             |    5 -
 fs/fs-writeback.c           |
[2.6.37-rc0 patch] direct I/O submission fixes v3
Hi Chris,

These patches from myself and Josef are still relevant, but not in your last mainline pull request. Can you add them if you are happy please? I've rediffed them [1,2] against your for-linus tree.

Many thanks,
  Daniel

--- [1]

Fix use-after-free on error path.

Signed-off-by: Josef Bacik <jo...@redhat.com>

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 558cac2..986cc40 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5761,7 +5761,7 @@ free_ordered:
 	if (write) {
 		struct btrfs_ordered_extent *ordered;
 		ordered = btrfs_lookup_ordered_extent(inode,
-						      dip->logical_offset);
+						      file_offset);
 		if (!test_bit(BTRFS_ORDERED_PREALLOC, &ordered->flags) &&
 		    !test_bit(BTRFS_ORDERED_NOCOW, &ordered->flags))
 			btrfs_free_reserved_extent(root, ordered->start,

--- [2]

Fix leak of 'dip' on error path and unnecessary double-assignment.

Signed-off-by: Daniel J Blueman <daniel.blue...@gmail.com>

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 558cac2..312eeb7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5701,15 +5701,15 @@ static void btrfs_submit_direct(int rw, struct bio *bio, struct inode *inode,
 		ret = -ENOMEM;
 		goto free_ordered;
 	}
-	dip->csums = NULL;
 
 	if (!skip_sum) {
 		dip->csums = kmalloc(sizeof(u32) * bio->bi_vcnt, GFP_NOFS);
 		if (!dip->csums) {
 			ret = -ENOMEM;
-			goto free_ordered;
+			goto out_err;
 		}
-	}
+	} else
+		dip->csums = NULL;
 
 	dip->private = bio->bi_private;
 	dip->inode = inode;

-- Forwarded message --
From: Daniel J Blueman <daniel.blue...@gmail.com>
Date: 25 July 2010 19:53
Subject: Re: [2.6.35-rc6 patch] direct I/O submission fixes v2
To: Josef Bacik <jo...@redhat.com>, Chris Mason <chris.ma...@oracle.com>
Cc: Linux BTRFS <linux-btrfs@vger.kernel.org>

On 25 July 2010 15:42, Josef Bacik <jo...@redhat.com> wrote:
> On Sat, Jul 24, 2010 at 12:01:59AM +0100, Daniel J Blueman wrote:
> > Hi Chris,
> >
> > This fixes some issues relating to direct I/O submission, however a
> > further patch will be needed to handle the case where allocation of
> > 'dip' fails, which is always dereferenced when finding the ordered
> > extent.
>
> Hi,
>
> There's an easier way to do this. This patch should fix the problem,
>
> Signed-off-by: Josef Bacik <jo...@redhat.com>
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 3232945..7259ef9 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -5815,7 +5815,7 @@ free_ordered:
>  	if (write) {
>  		struct btrfs_ordered_extent *ordered;
>  		ordered = btrfs_lookup_ordered_extent(inode,
> -						      dip->logical_offset);
> +						      file_offset);
>  		if (!test_bit(BTRFS_ORDERED_PREALLOC, &ordered->flags) &&
>  		    !test_bit(BTRFS_ORDERED_NOCOW, &ordered->flags))
>  			btrfs_free_reserved_extent(root, ordered->start,

Good move! With your patch applied, mine (now not priority) then becomes:

Fix leak of 'dip' on error path and double assignment.

Signed-off-by: Daniel J Blueman <daniel.blue...@gmail.com>

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1bff92a..bd7f940 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5652,15 +5652,15 @@ static void btrfs_submit_direct(int rw, struct bio *bio, struct inode *inode,
 		ret = -ENOMEM;
 		goto free_ordered;
 	}
-	dip->csums = NULL;
 
 	if (!skip_sum) {
 		dip->csums = kmalloc(sizeof(u32) * bio->bi_vcnt, GFP_NOFS);
 		if (!dip->csums) {
 			ret = -ENOMEM;
-			goto free_ordered;
+			goto out_err;
 		}
-	}
+	} else
+		dip->csums = NULL;
 
 	dip->private = bio->bi_private;
 	dip->inode = inode;
--
Daniel J Blueman
Re: [GIT PULL] Btrfs updates for 2.6.37
On Sat, Oct 30, 2010 at 6:51 AM, Chris Mason <chris.ma...@oracle.com> wrote:
>
> There were some minor conflicts with Linus' current tree, so my branch is
> merged with Linus' tree as of this morning.

Gaah. Please don't do this. Unless it's a _really_ messy merge, I really do want to do the merge. It's fine to have an alternate pre-merged branch for me to compare against, but please do that separately.

So what I did was to just instead merge the state before your merge, and in the process I:

 (a) noticed that your merge was incorrect (you had left around an unused "error:" label in btrfs_mount()), since I did use your merge as something to compare against (see above). That label had been removed in your branch by commit 0e78340f3c1f, but your merge resurrected it.

 (b) saw just how horribly nasty your writeback_inodes_sb() end result was, and decided to clean up the estimation of dirty pages in order to not end up with the function call argument from hell.

Now, it's obviously totally possible that I screwed things up entirely in the process, but as mentioned elsewhere, I do feel that actually seeing the merge conflicts really does help me get a feel for what I'm merging, and what the points of conflict are.

And yes, maybe it's just me showing my insecurities again. I have various mental hangups, and liking to feel like I know roughly what is going on is one of them. Doing the merges and looking at the code that clashes makes me feel like I have some kind of awareness of how things are interacting in the development process.

Linus
Re: [patch 0/2] Control filesystem balances (kernel side)
On Saturday, 30 October, 2010, Hugo Mills wrote:
> These two patches give a degree of control over balance operations.
> The first makes it possible to get an idea of how much work remains to
> do, by tracking the number of block groups (chunks) that need to be
> moved/rewritten. The second patch allows a running balance operation
> to be cancelled when the current block group has been moved.
>
> One fundamental question, though -- is the progress monitor function
> best implemented as an ioctl, as I've done here, or should it be two
> or three sysfs files? I'm thinking of /proc/mdstat... Obviously,
> /proc/mdstat would never get into /sys, but exposing the expected and
> remaining values as files has an attractive simplicity to it.

I like the idea that this info should be put under sysfs. Something like:

/sys/btrfs/<filesystem-uuid>/
        balance     - info on balancing
        devices     - list of devices (a directory of links or a file
                      which contains the list of devices)
        subvolumes/ - info on subvolume(s)
        label       - label of the filesystem
        <other btrfs filesystem related knobs>

Obviously we need another btrfs command to extract a uuid from a btrfs filesystem, like:

# btrfs filesystem get-uuid /path/to/a/btrfs/filesystem
f9b9c413-0dc8-4e3f-94f2-86faa702f519

> The user-space side of things are in a separate patch series, to
> follow.
>
> Please be gentle with me, this is my first (serious, non-trivial)
> kernel patch. :)
>
> Hugo.
>
> --
> === Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
>   PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
>    --- No! My collection of rare, incurable diseases! Violated! ---

--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E C054 BF04 F161 3DC5 0512
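To make the "values as files" option concrete, here is a minimal sketch of what a userspace reader of one such file could look like. Nothing in this thread defines an actual sysfs layout, so the helper name read_u64_file() and any path passed to it (for example a "remaining" file under a per-filesystem directory) are assumptions for illustration only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Read a single unsigned decimal value from a one-value text file,
 * the usual shape of a sysfs attribute. The file layout is a
 * hypothetical one; no such btrfs interface is defined in this thread.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
static int read_u64_file(const char *path, uint64_t *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%" SCNu64, out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

The appeal Hugo mentions is visible here: a shell user could get the same value with cat, and no ioctl structure has to be kept in sync between kernel and tools.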
Re: [patch 0/2] Control filesystem balances (userspace)
On Saturday, 30 October, 2010, Hugo Mills wrote:
> These two patches complement the previous two kernel-side patches.
> The first implements a way of displaying the current progress of any
> running balance process. The second patch allows a running balance to
> be cancelled.
>
> I'm a bit uncertain about the best name for these commands. Several
> options:
>
> [...]
>
> 4) btrfs balance progress <path>
>    btrfs balance cancel <path>
>
> My current favourite, although we introduce a new namespace
> (balance) for commands. We could add "btrfs balance start <path>" as
> a synonym for "btrfs filesystem balance <path>", for some degree of
> consistency.

I like this.

Regards
G.Baroncelli

> At some point, I'll add a monitor function, which will poll at 1s
> intervals for progress updates, and print out progress when it
> changes.
>
> Hugo.
>
> --
> === Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
>   PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
>    --- No! My collection of rare, incurable diseases! Violated! ---

--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E C054 BF04 F161 3DC5 0512
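As a rough illustration of what a progress display or monitor could derive from the two counters discussed in this series (block groups expected and block groups remaining), here is a small sketch. The function name and the clamping behaviour are assumptions, not taken from Hugo's patches.

```c
#include <stdint.h>

/* Percentage (0..100) of block groups already processed, given the
 * number expected at the start of the balance and the number still
 * remaining. Hypothetical helper; assumes counts small enough that
 * multiplying by 100 does not overflow 64 bits. */
static unsigned balance_percent_done(uint64_t expected, uint64_t remaining)
{
	if (expected == 0)
		return 100;		/* nothing to balance */
	if (remaining > expected)
		remaining = expected;	/* clamp inconsistent input */
	return (unsigned)(((expected - remaining) * 100) / expected);
}
```

A monitor loop polling once a second would only need to print when this value differs from the last one it printed.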
[PATCH 1/3] update ioctl.h from 2.6.37-rc1
Signed-off-by: Sage Weil <s...@newdream.net>
---
 ioctl.h |   39 ++-
 1 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/ioctl.h b/ioctl.h
index 776d7a9..5a03317 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -23,13 +23,28 @@
 
 #define BTRFS_IOCTL_MAGIC 0x94
 #define BTRFS_VOL_NAME_MAX 255
-#define BTRFS_PATH_NAME_MAX 4087
 
+/* this should be 4k */
+#define BTRFS_PATH_NAME_MAX 4087
 struct btrfs_ioctl_vol_args {
 	__s64 fd;
 	char name[BTRFS_PATH_NAME_MAX + 1];
 };
 
+#define BTRFS_SNAPSHOT_NAME_MAX 4079
+struct btrfs_ioctl_async_vol_args {
+	__s64 fd;
+	__u64 transid;
+	char name[BTRFS_SNAPSHOT_NAME_MAX + 1];
+};
+
+#define BTRFS_INO_LOOKUP_PATH_MAX 4080
+struct btrfs_ioctl_ino_lookup_args {
+	__u64 treeid;
+	__u64 objectid;
+	char name[BTRFS_INO_LOOKUP_PATH_MAX];
+};
+
 struct btrfs_ioctl_search_key {
 	/* which root are we searching. 0 is the tree of tree roots */
 	__u64 tree_id;
@@ -72,7 +87,7 @@ struct btrfs_ioctl_search_header {
 	__u64 offset;
 	__u32 type;
 	__u32 len;
-} __attribute__((may_alias));
+};
 
 #define BTRFS_SEARCH_ARGS_BUFSIZE (4096 - sizeof(struct btrfs_ioctl_search_key))
 /*
@@ -85,11 +100,10 @@ struct btrfs_ioctl_search_args {
 	char buf[BTRFS_SEARCH_ARGS_BUFSIZE];
 };
 
-#define BTRFS_INO_LOOKUP_PATH_MAX 4080
-struct btrfs_ioctl_ino_lookup_args {
-	__u64 treeid;
-	__u64 objectid;
-	char name[BTRFS_INO_LOOKUP_PATH_MAX];
+struct btrfs_ioctl_clone_range_args {
+	__s64 src_fd;
+	__u64 src_offset, src_length;
+	__u64 dest_offset;
 };
 
 /* flags for the defrag range ioctl */
@@ -155,11 +169,14 @@ struct btrfs_ioctl_space_args {
 				   struct btrfs_ioctl_vol_args)
 #define BTRFS_IOC_BALANCE _IOW(BTRFS_IOCTL_MAGIC, 12, \
 				   struct btrfs_ioctl_vol_args)
-/* 13 is for CLONE_RANGE */
+
+#define BTRFS_IOC_CLONE_RANGE _IOW(BTRFS_IOCTL_MAGIC, 13, \
+				  struct btrfs_ioctl_clone_range_args)
+
 #define BTRFS_IOC_SUBVOL_CREATE _IOW(BTRFS_IOCTL_MAGIC, 14, \
 				   struct btrfs_ioctl_vol_args)
 #define BTRFS_IOC_SNAP_DESTROY _IOW(BTRFS_IOCTL_MAGIC, 15, \
-				   struct btrfs_ioctl_vol_args)
+				struct btrfs_ioctl_vol_args)
 #define BTRFS_IOC_DEFRAG_RANGE _IOW(BTRFS_IOCTL_MAGIC, 16, \
 				struct btrfs_ioctl_defrag_range_args)
 #define BTRFS_IOC_TREE_SEARCH _IOWR(BTRFS_IOCTL_MAGIC, 17, \
@@ -169,4 +186,8 @@ struct btrfs_ioctl_space_args {
 #define BTRFS_IOC_DEFAULT_SUBVOL _IOW(BTRFS_IOCTL_MAGIC, 19, u64)
 #define BTRFS_IOC_SPACE_INFO _IOWR(BTRFS_IOCTL_MAGIC, 20, \
 				    struct btrfs_ioctl_space_args)
+#define BTRFS_IOC_START_SYNC _IOR(BTRFS_IOCTL_MAGIC, 24, __u64)
+#define BTRFS_IOC_WAIT_SYNC _IOW(BTRFS_IOCTL_MAGIC, 22, __u64)
+#define BTRFS_IOC_SNAP_CREATE_ASYNC _IOW(BTRFS_IOCTL_MAGIC, 23, \
+				   struct btrfs_ioctl_async_vol_args)
 #endif
--
1.7.1
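The "this should be 4k" comment and the odd-looking name limits (4087, 4079, 4080) make more sense once the struct sizes are added up: each argument struct appears to be sized to exactly 4096 bytes. The sketch below checks that arithmetic at compile time. It uses stdint types and shortened names as stand-ins for the __s64/__u64 kernel types and the btrfs_ioctl_* structs, so it is an illustration of the layout, not the header itself.

```c
#include <stdint.h>

/* Name limits copied from the patch above. */
#define PATH_NAME_MAX       4087
#define SNAPSHOT_NAME_MAX   4079
#define INO_LOOKUP_PATH_MAX 4080

/* Stand-ins for the ioctl argument structs (assumption: int64_t and
 * uint64_t model __s64 and __u64). */
struct vol_args {
	int64_t fd;				/* 8 bytes */
	char name[PATH_NAME_MAX + 1];		/* 4088 bytes */
};

struct async_vol_args {
	int64_t fd;				/* 8 bytes */
	uint64_t transid;			/* 8 bytes */
	char name[SNAPSHOT_NAME_MAX + 1];	/* 4080 bytes */
};

struct ino_lookup_args {
	uint64_t treeid;			/* 8 bytes */
	uint64_t objectid;			/* 8 bytes */
	char name[INO_LOOKUP_PATH_MAX];		/* 4080 bytes */
};

/* 8 + 4088 = 4096, 8 + 8 + 4080 = 4096, 8 + 8 + 4080 = 4096 */
_Static_assert(sizeof(struct vol_args) == 4096, "vol_args is 4k");
_Static_assert(sizeof(struct async_vol_args) == 4096, "async_vol_args is 4k");
_Static_assert(sizeof(struct ino_lookup_args) == 4096, "ino_lookup_args is 4k");
```

This also shows why the async variant needs a shorter name limit than the plain one: the extra 8-byte transid field comes out of the name buffer.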
[PATCH 2/3] btrfs: implement 'async-snapshot' command
This is identical to 'snapshot', but uses the new async snapshot creation ioctl, and prints out the transid the new snapshot will be committed with.

Signed-off-by: Sage Weil <s...@newdream.net>
---
 btrfs.c      |    8 +++-
 btrfs_cmds.c |   32 +++-
 btrfs_cmds.h |    3 ++-
 3 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 46314cf..c4b9a31 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -44,11 +44,17 @@ static struct Command commands[] = {
 	/*
 		avoid short commands different for the case only
 	*/
-	{ do_clone, 2,
+	{ do_create_snap, 2,
 	  "subvolume snapshot", "<source> [<dest>/]<name>\n"
 		"Create a writable snapshot of the subvolume <source> with\n"
 		"the name <name> in the <dest> directory."
 	},
+	{ do_create_snap_async, 2,
+	  "subvolume async-snapshot", "<source> [<dest>/]<name>\n"
+		"Create a writable snapshot of the subvolume <source> with\n"
+		"the name <name> in the <dest> directory. Do not wait for\n"
+		"the snapshot creation to commit to disk before returning."
+	},
 	{ do_delete_subvolume, 1,
 	  "subvolume delete", "<subvolume>\n"
 		"Delete the subvolume <subvolume>."
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 8031c58..6da5862 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -307,7 +307,7 @@ int do_subvol_list(int argc, char **argv)
 	return 0;
 }
 
-int do_clone(int argc, char **argv)
+static int create_snap(int argc, char **argv, int async)
 {
 	char	*subvol, *dst;
 	int	res, fd, fddst, len;
@@ -316,7 +316,6 @@ int do_clone(int argc, char **argv)
 
 	subvol = argv[1];
 	dst = argv[2];
-	struct btrfs_ioctl_vol_args	args;
 
 	res = test_issubvolume(subvol);
 	if(res<0){
@@ -374,9 +373,22 @@ int do_clone(int argc, char **argv)
 
 	printf("Create a snapshot of '%s' in '%s/%s'\n",
 	       subvol, dstdir, newname);
-	args.fd = fd;
-	strcpy(args.name, newname);
-	res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE, &args);
+	if (async) {
+		struct btrfs_ioctl_async_vol_args async_args;
+		async_args.fd = fd;
+		async_args.transid = 0;
+		strcpy(async_args.name, newname);
+		res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE_ASYNC, &async_args);
+		if (res == 0)
+			printf("transid %llu\n",
+			       (unsigned long long)async_args.transid);
+	} else {
+		struct btrfs_ioctl_vol_args args;
+
+		args.fd = fd;
+		strcpy(args.name, newname);
+		res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE, &args);
+	}
 
 	close(fd);
 	close(fddst);
@@ -390,6 +402,16 @@ int do_clone(int argc, char **argv)
 
 }
 
+int do_create_snap_async(int argc, char **argv)
+{
+	return create_snap(argc, argv, 1);
+}
+
+int do_create_snap(int argc, char **argv)
+{
+	return create_snap(argc, argv, 0);
+}
+
 int do_delete_subvolume(int argc, char **argv)
 {
 	int	res, fd, len;
diff --git a/btrfs_cmds.h b/btrfs_cmds.h
index 7bde191..c44dc79 100644
--- a/btrfs_cmds.h
+++ b/btrfs_cmds.h
@@ -15,7 +15,8 @@
  */
 
 /* btrfs_cmds.c*/
-int do_clone(int nargs, char **argv);
+int do_create_snap(int nargs, char **argv);
+int do_create_snap_async(int nargs, char **argv);
 int do_delete_subvolume(int nargs, char **argv);
 int do_create_subvol(int nargs, char **argv);
 int do_fssync(int nargs, char **argv);
--
1.7.1
[PATCH 3/3] btrfs: implement 'start-sync' and 'wait-sync' commands
The 'start-sync' command initiates a sync, but does not wait for it to complete. A transaction is printed that can be fed to 'wait-sync', which will wait for it to commit.

'wait-sync' can also be used in combination with 'async-snapshot' to wait for an async snapshot creation to commit.

Signed-off-by: Sage Weil <s...@newdream.net>
---
 btrfs.c      |    9 +
 btrfs_cmds.c |   49 +
 btrfs_cmds.h |    2 ++
 3 files changed, 60 insertions(+), 0 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index c4b9a31..d45ac1f 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -83,6 +83,15 @@ static struct Command commands[] = {
 	  "filesystem sync", "<path>\n"
 		"Force a sync on the filesystem <path>."
 	},
+	{ do_start_sync, 1,
+	  "filesystem start-sync", "<path>\n"
+		"Start a sync on the filesystem <path>, and print the resulting\n"
+		"transaction id."
+	},
+	{ do_wait_sync, 2,
+	  "filesystem wait-sync", "<path> <transid>\n"
+		"Wait for the transaction <transid> on the filesystem at <path> to commit."
+	},
 	{ do_resize, 2,
 	  "filesystem resize", "[+/-]<newsize>[gkm]|max <filesystem>\n"
 		"Resize the file system. If 'max' is passed, the filesystem\n"
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 6da5862..5b5bb15 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -548,6 +548,55 @@ int do_fssync(int argc, char **argv)
 	return 0;
 }
 
+int do_start_sync(int argc, char **argv)
+{
+	int fd, res;
+	char	*path = argv[1];
+	__u64 transid;
+
+	fd = open_file_or_dir(path);
+	if (fd < 0) {
+		fprintf(stderr, "ERROR: can't access to '%s'\n", path);
+		return 12;
+	}
+
+	printf("StartSync '%s'\n", path);
+	res = ioctl(fd, BTRFS_IOC_START_SYNC, &transid);
+	close(fd);
+	if( res < 0 ){
+		fprintf(stderr, "ERROR: unable to fs-syncing '%s'\n", path);
+		return 16;
+	} else {
+		printf("transid %llu\n", (unsigned long long)transid);
+	}
+
+	return 0;
+}
+
+int do_wait_sync(int argc, char **argv)
+{
+	int fd, res;
+	char	*path = argv[1];
+	__u64 transid = atoll(argv[2]);
+
+	fd = open_file_or_dir(path);
+	if (fd < 0) {
+		fprintf(stderr, "ERROR: can't access to '%s'\n", path);
+		return 12;
+	}
+
+	printf("WaitSync '%s' transid %llu\n", path, (unsigned long long)transid);
+	res = ioctl(fd, BTRFS_IOC_WAIT_SYNC, &transid);
+	close(fd);
+	if( res < 0 ){
+		fprintf(stderr, "ERROR: unable to wait-sync on '%s' transid %llu: %s\n", path,
+			(unsigned long long)transid, strerror(errno));
+		return 16;
+	}
+
+	return 0;
+}
+
 int do_scan(int argc, char **argv)
 {
 	int	i, fd;
diff --git a/btrfs_cmds.h b/btrfs_cmds.h
index c44dc79..e0e5ceb 100644
--- a/btrfs_cmds.h
+++ b/btrfs_cmds.h
@@ -20,6 +20,8 @@ int do_create_snap_async(int nargs, char **argv);
 int do_delete_subvolume(int nargs, char **argv);
 int do_create_subvol(int nargs, char **argv);
 int do_fssync(int nargs, char **argv);
+int do_start_sync(int nargs, char **argv);
+int do_wait_sync(int nargs, char **argv);
 int do_defrag(int argc, char **argv);
 int do_show_filesystem(int nargs, char **argv);
 int do_add_volume(int nargs, char **args);
--
1.7.1
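One detail worth noting in do_wait_sync() is that the transid argument goes through atoll(), which silently turns input such as "abc" into 0. A stricter parse could look like the sketch below. This is not part of Sage's patch; parse_transid() is a hypothetical helper, shown only as one way a caller might reject malformed transaction ids before issuing the ioctl.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal transaction id strictly: digits only, no sign, no
 * leading whitespace, no trailing characters, no overflow.
 * Hypothetical helper, not taken from the patch above.
 * Returns 0 and stores the value on success, -1 on malformed input. */
static int parse_transid(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;

	/* Require the string to start with a digit; strtoull() alone
	 * would accept leading spaces and a minus sign. */
	if (!s || *s < '0' || *s > '9')
		return -1;

	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno != 0 || *end != '\0')
		return -1;

	*out = (uint64_t)v;
	return 0;
}
```

With something like this in place, "wait-sync" could report a usage error for a bad transid instead of waiting on transaction 0.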
Re: [PATCH 2/3] btrfs: implement 'async-snapshot' command
On Saturday, 30 October, 2010, Sage Weil wrote:
> This is identical to 'snapshot', but uses the new async snapshot creation
> ioctl, and prints out the transid the new snapshot will be committed with.
>
> Signed-off-by: Sage Weil <s...@newdream.net>
> ---
>  btrfs.c      |    8 +++-
>  btrfs_cmds.c |   32 +++-
>  btrfs_cmds.h |    3 ++-
>  3 files changed, 36 insertions(+), 7 deletions(-)
>
> diff --git a/btrfs.c b/btrfs.c
> index 46314cf..c4b9a31 100644
> --- a/btrfs.c
> +++ b/btrfs.c
> @@ -44,11 +44,17 @@ static struct Command commands[] = {
>  	/*
>  		avoid short commands different for the case only
>  	*/
> -	{ do_clone, 2,
> +	{ do_create_snap, 2,
>  	  "subvolume snapshot", "<source> [<dest>/]<name>\n"
>  		"Create a writable snapshot of the subvolume <source> with\n"
>  		"the name <name> in the <dest> directory."
>  	},
> +	{ do_create_snap_async, 2,
> +	  "subvolume async-snapshot", "<source> [<dest>/]<name>\n"
> +		"Create a writable snapshot of the subvolume <source> with\n"
> +		"the name <name> in the <dest> directory. Do not wait for\n"
> +		"the snapshot creation to commit to disk before returning."
> +	},

Why create another command? I think it is better to add a switch, like:

	btrfs subvolume snapshot [--async] <source> [<dest>/]<name>
		Create a writable snapshot of the subvolume <source> with
		the name <name> in the <dest> directory. If --async is
		passed, do not wait for the snapshot creation to commit
		to disk before returning.

In any case, please update the man page too.

>  	{ do_delete_subvolume, 1,
>  	  "subvolume delete", "<subvolume>\n"
>  		"Delete the subvolume <subvolume>."
> diff --git a/btrfs_cmds.c b/btrfs_cmds.c
> index 8031c58..6da5862 100644
> --- a/btrfs_cmds.c
> +++ b/btrfs_cmds.c
> @@ -307,7 +307,7 @@ int do_subvol_list(int argc, char **argv)
>  	return 0;
>  }
>
> -int do_clone(int argc, char **argv)
> +static int create_snap(int argc, char **argv, int async)
>  {
>  	char	*subvol, *dst;
>  	int	res, fd, fddst, len;
> @@ -316,7 +316,6 @@ int do_clone(int argc, char **argv)
>
>  	subvol = argv[1];
>  	dst = argv[2];
> -	struct btrfs_ioctl_vol_args	args;
>
>  	res = test_issubvolume(subvol);
>  	if(res<0){
> @@ -374,9 +373,22 @@ int do_clone(int argc, char **argv)
>
>  	printf("Create a snapshot of '%s' in '%s/%s'\n",
>  	       subvol, dstdir, newname);
> -	args.fd = fd;
> -	strcpy(args.name, newname);
> -	res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE, &args);
> +	if (async) {
> +		struct btrfs_ioctl_async_vol_args async_args;
> +		async_args.fd = fd;
> +		async_args.transid = 0;
> +		strcpy(async_args.name, newname);
> +		res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE_ASYNC, &async_args);
> +		if (res == 0)
> +			printf("transid %llu\n",
> +			       (unsigned long long)async_args.transid);
> +	} else {
> +		struct btrfs_ioctl_vol_args args;
> +
> +		args.fd = fd;
> +		strcpy(args.name, newname);
> +		res = ioctl(fddst, BTRFS_IOC_SNAP_CREATE, &args);
> +	}
>
>  	close(fd);
>  	close(fddst);
> @@ -390,6 +402,16 @@ int do_clone(int argc, char **argv)
>
>  }
>
> +int do_create_snap_async(int argc, char **argv)
> +{
> +	return create_snap(argc, argv, 1);
> +}
> +
> +int do_create_snap(int argc, char **argv)
> +{
> +	return create_snap(argc, argv, 0);
> +}
> +
>  int do_delete_subvolume(int argc, char **argv)
>  {
>  	int	res, fd, len;
> diff --git a/btrfs_cmds.h b/btrfs_cmds.h
> index 7bde191..c44dc79 100644
> --- a/btrfs_cmds.h
> +++ b/btrfs_cmds.h
> @@ -15,7 +15,8 @@
>   */
>
>  /* btrfs_cmds.c*/
> -int do_clone(int nargs, char **argv);
> +int do_create_snap(int nargs, char **argv);
> +int do_create_snap_async(int nargs, char **argv);
>  int do_delete_subvolume(int nargs, char **argv);
>  int do_create_subvol(int nargs, char **argv);
>  int do_fssync(int nargs, char **argv);
> --
> 1.7.1

--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E C054 BF04 F161 3DC5 0512
Re: [PATCH 3/3] btrfs: implement 'start-sync' and 'wait-sync' commands
On Saturday, 30 October, 2010, Sage Weil wrote:

The 'start-sync' command initiates a sync, but does not wait for it to complete. A transaction is printed that can be fed to 'wait-sync', which will wait for it to commit. 'wait-sync' can also be used in combination with 'async-snapshot' to wait for an async snapshot creation to commit.

As with the previous patch: if you add or update a command, please update the man page too.

Signed-off-by: Sage Weil s...@newdream.net
---
 btrfs.c      |    9 +++++++++
 btrfs_cmds.c |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 btrfs_cmds.h |    2 ++
 3 files changed, 60 insertions(+), 0 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index c4b9a31..d45ac1f 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -83,6 +83,15 @@ static struct Command commands[] = {
 	  "filesystem sync", "<path>\n"
 		"Force a sync on the filesystem <path>."
 	},
+	{ do_start_sync, 1,
+	  "filesystem start-sync", "<path>\n"
+		"Start a sync on the filesystem <path>, and print the resulting\n"
+		"transaction id."
+	},
+	{ do_wait_sync, 2,
+	  "filesystem wait-sync", "<path> <transid>\n"
+		"Wait for the transaction <transid> on the filesystem at <path> to commit."
+	},
 	{ do_resize, 2,
 	  "filesystem resize", "[+/-]<newsize>[gkm]|max <filesystem>\n"
 		"Resize the file system. If 'max' is passed, the filesystem\n"
diff --git a/btrfs_cmds.c b/btrfs_cmds.c
index 6da5862..5b5bb15 100644
--- a/btrfs_cmds.c
+++ b/btrfs_cmds.c
@@ -548,6 +548,55 @@ int do_fssync(int argc, char **argv)
 	return 0;
 }

+int do_start_sync(int argc, char **argv)
+{
+	int fd, res;
+	char	*path = argv[1];
+	__u64 transid;
+
+	fd = open_file_or_dir(path);
+	if (fd < 0) {
+		fprintf(stderr, "ERROR: can't access to '%s'\n", path);
+		return 12;
+	}
+
+	printf("StartSync '%s'\n", path);
+	res = ioctl(fd, BTRFS_IOC_START_SYNC, &transid);
+	close(fd);
+	if( res < 0 ){
+		fprintf(stderr, "ERROR: unable to fs-syncing '%s'\n", path);
+		return 16;
+	} else {
+		printf("transid %llu\n", (unsigned long long)transid);
+	}
+
+	return 0;
+}
+
+int do_wait_sync(int argc, char **argv)
+{
+	int fd, res;
+	char	*path = argv[1];
+	__u64 transid = atoll(argv[2]);
+
+	fd = open_file_or_dir(path);
+	if (fd < 0) {
+		fprintf(stderr, "ERROR: can't access to '%s'\n", path);
+		return 12;
+	}
+
+	printf("WaitSync '%s' transid %llu\n", path, (unsigned long long)transid);
+	res = ioctl(fd, BTRFS_IOC_WAIT_SYNC, &transid);
+	close(fd);
+	if( res < 0 ){
+		fprintf(stderr, "ERROR: unable to wait-sync on '%s' transid %llu: %s\n", path,
+			(unsigned long long)transid, strerror(errno));
+		return 16;
+	}
+
+	return 0;
+}
+
 int do_scan(int argc, char **argv)
 {
 	int	i, fd;
diff --git a/btrfs_cmds.h b/btrfs_cmds.h
index c44dc79..e0e5ceb 100644
--- a/btrfs_cmds.h
+++ b/btrfs_cmds.h
@@ -20,6 +20,8 @@ int do_create_snap_async(int nargs, char **argv);
 int do_delete_subvolume(int nargs, char **argv);
 int do_create_subvol(int nargs, char **argv);
 int do_fssync(int nargs, char **argv);
+int do_start_sync(int nargs, char **argv);
+int do_wait_sync(int nargs, char **argv);
 int do_defrag(int argc, char **argv);
 int do_show_filesystem(int nargs, char **argv);
 int do_add_volume(int nargs, char **args);
--
1.7.1

--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E C054 BF04 F161 3DC5 0512
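A side note on the 'wait-sync' argument handling: the patch converts the transid with atoll(), which yields 0 for non-numeric input and gives no way to detect trailing garbage. The following is only a sketch of a stricter conversion, not part of the posted patch; the helper name parse_transid is hypothetical and uint64_t stands in for __u64.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in the patch): parse a decimal transaction id.
 * Returns 0 and stores the value in *out on success, -1 on malformed or
 * out-of-range input.
 */
int parse_transid(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;

	/* Require a leading digit: rejects "", "-1", "+5" and " 5". */
	if (s == NULL || *s < '0' || *s > '9')
		return -1;

	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno == ERANGE || *end != '\0')
		return -1;	/* overflow or trailing garbage */

	*out = (uint64_t)v;
	return 0;
}
```

With something like this, do_wait_sync could refuse to issue the ioctl when argv[2] is not a plain number, instead of silently waiting on transid 0.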
Re: [PATCH 2/3] btrfs: implement 'async-snapshot' command
On Saturday, 30 October, 2010, Sage Weil wrote:

This is identical to 'snapshot', but uses the new async snapshot creation ioctl, and prints out the transid the new snapshot will be committed with.

Just out of curiosity, how long can it take to snapshot a tree? It should only be a copy and an update of a few pages on disk (the heads of the trees), so the time should be O(1)...

Goffredo

[...]

--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) kreij...@inwind.it
Key fingerprint = 4769 7E51 5293 D36C 814E C054 BF04 F161 3DC5 0512
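To make the O(1) reasoning in the question concrete, here is a toy copy-on-write tree in which a snapshot copies only the root node and shares everything below it. This is an illustration under simplifying assumptions, not btrfs's actual on-disk format or reference-counting scheme; struct node, node_new, snapshot and FANOUT are all made up for the example.

```c
#include <stdlib.h>
#include <string.h>

#define FANOUT 4

/* Toy in-memory tree node; real btrfs nodes look nothing like this. */
struct node {
	int refs;			/* number of parents/roots pointing here */
	int value;			/* payload */
	struct node *child[FANOUT];	/* NULL for unused slots */
};

/* Allocate a node holding one reference; NULL on allocation failure. */
struct node *node_new(int value)
{
	struct node *n = calloc(1, sizeof(*n));

	if (n) {
		n->refs = 1;
		n->value = value;
	}
	return n;
}

/*
 * Snapshot: copy only the root and share each direct child by bumping
 * its reference count.  The work is bounded by FANOUT, no matter how
 * many nodes hang below the root.
 */
struct node *snapshot(const struct node *root)
{
	struct node *copy = malloc(sizeof(*copy));

	if (!copy)
		return NULL;
	memcpy(copy, root, sizeof(*copy));
	copy->refs = 1;
	for (int i = 0; i < FANOUT; i++)
		if (copy->child[i])
			copy->child[i]->refs++;
	return copy;
}
```

In this model the cost of creating the snapshot itself does not depend on the size of the tree. Whatever extra time a real snapshot takes would have to come from elsewhere, for example from waiting for the transaction to commit, which is what the async ioctl avoids blocking on.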
Re: some issues with lots of snapshots
On Fri, 29 Oct 2010 08:31:05 +0200 Roman Kapusta roman.kapu...@gmail.com wrote:

On Fri, Oct 29, 2010 at 00:03, Pat Regan theh...@patshead.com wrote:

On Wed, 27 Oct 2010 10:39:48 +0200 Xavier Nicollet nicol...@jeru.org wrote:

On 26 October 2010 at 15:15, Pat Regan wrote:

I turned off the 5-minute snapshots and I'm now just keeping 4 weekly, 7 daily, and 24 hourly snapshots alive.

I have just rebooted and I am going with a 15-minute interval.

I'm just replying so this is documented somewhere. After I read your message I decided to turn on snapshots at 15-minute intervals yesterday. This morning I had snapshot processing filling up my process list again.

I think there is no problem with creating a snapshot every 5 or 15 minutes; the problem is with deleting old snapshots every 5 or 15 minutes. Can you try running the cleanup of old snapshots only once per day, to check whether that improves things?

Reducing the frequency of snapshot removal sure did work. I started with a pretty large amount of time between snapshot removal jobs and I have been decreasing that number. I have been running with a 1 hour delay between snapshot removals for almost 24 hours so far with no problems.

Pat
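For anyone scripting a similar setup, a rough sketch of the "4 weekly, 7 daily, 24 hourly" retention rule follows. It is one possible reading of that policy, not Pat's actual script: buckets are measured by age relative to the current time rather than aligned to the calendar, and all names (mark_snapshots, keep_newest_per_bucket, the *_SECS macros) are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define HOUR_SECS 3600L
#define DAY_SECS  (24 * HOUR_SECS)
#define WEEK_SECS (7 * DAY_SECS)

/*
 * Mark the newest snapshot in each of the most recent `buckets` periods
 * of `period` seconds.  ts[] must be sorted newest first.
 */
static void keep_newest_per_bucket(const time_t *ts, size_t n, time_t now,
				   long period, long buckets, bool *keep)
{
	long last_bucket = -1;

	for (size_t i = 0; i < n; i++) {
		/* Treat timestamps from the future as age 0. */
		long age = ts[i] > now ? 0 : (long)(now - ts[i]);
		long bucket = age / period;

		if (bucket >= buckets)
			break;		/* sorted input: the rest is older */
		if (bucket != last_bucket) {
			keep[i] = true;	/* first hit is the newest in its bucket */
			last_bucket = bucket;
		}
	}
}

/*
 * Fill keep[] for 24 hourly, 7 daily and 4 weekly snapshots and return
 * how many snapshots are left unmarked, i.e. candidates for removal.
 */
size_t mark_snapshots(const time_t *ts, size_t n, time_t now, bool *keep)
{
	size_t doomed = 0;

	for (size_t i = 0; i < n; i++)
		keep[i] = false;
	keep_newest_per_bucket(ts, n, now, HOUR_SECS, 24, keep);
	keep_newest_per_bucket(ts, n, now, DAY_SECS, 7, keep);
	keep_newest_per_bucket(ts, n, now, WEEK_SECS, 4, keep);
	for (size_t i = 0; i < n; i++)
		if (!keep[i])
			doomed++;
	return doomed;
}
```

The marking pass only reads timestamps, so it can run as often as the snapshots are taken. Going by the experience reported above, the actual deletion of the unmarked snapshots is the part worth batching and running less often, such as once an hour.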
Re: Blog: BTRFS is effectively stable
On Fri, Oct 29, 2010 at 4:38 PM, Chris Samuel ch...@csamuel.org wrote:

A friend of mine who builds storage systems designed for HPC use has been keeping an eye on btrfs and has just done some testing of it with 2.6.36 and seems to like what he sees in terms of stability.

That's a *very* misleading conclusion to come to based solely on a single file I/O test. It's more realistic to say "stable under fio load in ideal conditions".

For example:
No device-yanking tests were done.
No power-cord yanking tests were done.
No device cables were yanked, shaken, or plugged/unplugged in rapid succession.
No "dd the raw device underneath the filesystem while doing file I/O" tests were done.
No recovery tests were done.

IOW, you can't really say "it's stable" across the board like that.

--
Freddie Cash fjwc...@gmail.com
Re: Blog: BTRFS is effectively stable
For example:
No device-yanking tests were done.
No power-cord yanking tests were done.
No device cables were yanked, shaken, or plugged/unplugged in rapid succession.
No "dd the raw device underneath the filesystem while doing file I/O" tests were done.
No recovery tests were done.

Are there any real-life tests that show how close we are to being really stable? Ideally I'd like to know that we're, for example, 85% stable, failing N tests.
Re: Blog: BTRFS is effectively stable
On 10/30/2010 05:19 PM, Freddie Cash wrote:

On Fri, Oct 29, 2010 at 4:38 PM, Chris Samuel ch...@csamuel.org wrote:

A friend of mine who builds storage systems designed for HPC use has been keeping an eye on btrfs and has just done some testing of it with 2.6.36 and seems to like what he sees in terms of stability.

That's a *very* misleading conclusion to come to based solely on a single file I/O test. It's more realistic to say "stable under fio load in ideal conditions".

Since it's my blog post that is generating these responses, let me provide some more information.

We want to see if the file system, at a basic level, works under load. We aren't yanking power, or otherwise purposefully damaging the underlying platform during operations, as that is not what we are testing.

What we've found is that zfs on fuse doesn't pass these very basic tests. nilfs2 does (recent kernels anyway). btrfs does (now).

Our focus for the tests was quite simple: will the file system work when we are trying to shove GB/s down its throat? If the answer is no, then we don't even consider looking at the "let's see how stable it is under purposefully harmful conditions" tests.

If the answer is yes, that it works, then we have to ask whether the performance is near where we need it to be for it to be useful. Currently the answer to that is no. Once this changes (and I saw some posts recently from Chris M that suggest there have been some changes in this respect for the 2.6.37 time frame), we can start looking at the broader picture of suitability for use.

That latter set of issues (file system and metadata repair, stability in the face of less than ideal conditions) gets tested after we see the system able to perform where we need it to. We aren't there yet.

It's stable against the tests we ran on it, which, as noted, some other file systems (some in widespread use) aren't.

- Joe
Re: Can't remove missing drive
On 10/30/2010 12:37 AM, William Uther wrote:

Hi, I have a raid1 setup with a missing device. I have added a new device and everything seems to be working fine, except I cannot remove the old, missing, device. There is no error - but the 'some devices missing' tag doesn't go away.

r...@willvo:~# btrfs filesystem show
failed to read /dev/sr0
Label: none  uuid: f929c413-01c8-443f-b4f2-86f36702f519
	Total devices 3 FS bytes used 578.39GB
	devid    1 size 931.51GB used 604.00GB path /dev/sdb1
	devid    2 size 931.51GB used 604.00GB path /dev/sdc1
	*** Some devices missing

Btrfs Btrfs v0.19
r...@willvo:~# btrfs device delete missing /data
r...@willvo:~# btrfs filesystem show
failed to read /dev/sr0
Label: none  uuid: f929c413-01c8-443f-b4f2-86f36702f519
	Total devices 3 FS bytes used 578.39GB
	devid    1 size 931.51GB used 604.00GB path /dev/sdb1
	devid    2 size 931.51GB used 604.00GB path /dev/sdc1
	*** Some devices missing

Btrfs Btrfs v0.19

The lack of a message on the delete operation indicates success. What you see is the expected behavior, since 'btrfs filesystem show' reads the partitions directly. Therefore, it won't see any changes that haven't been committed to disk yet. The 'some devices missing' message should go away after running 'sync', rebooting, or unmounting the file system.