[PATCH] Btrfs: kill space_info pointer from inode structure
Inodes always allocate from space with BTRFS_BLOCK_GROUP_DATA type,
which means every inode has the same BTRFS_I(inode)->space_info.

Signed-off-by: Li Zefan
---
 fs/btrfs/btrfs_inode.h |    3 ---
 fs/btrfs/ctree.h       |    2 ++
 fs/btrfs/extent-tree.c |   20
 fs/btrfs/inode.c       |    3 ---
 4 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 52d7eca..0daab1f 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -69,9 +69,6 @@ struct btrfs_inode {
 	/* node for the red-black tree that links inodes in subvolume root */
 	struct rb_node rb_node;
 
-	/* the space_info for where this inode's data allocations are done */
-	struct btrfs_space_info *space_info;
-
 	/* full 64 bit generation number, struct vfs_inode doesn't have a big
 	 * enough field for this.
 	 */
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 3998d90..26d2bb9 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1068,6 +1068,8 @@ struct btrfs_fs_info {
 	 */
 	struct list_head space_info;
 
+	struct btrfs_space_info *data_sinfo;
+
 	struct reloc_control *reloc_ctl;
 
 	spinlock_t delalloc_lock;
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 71cd456..39e7bbe 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2935,6 +2935,8 @@ static int update_space_info(struct btrfs_fs_info *info, u64 flags,
 	*space_info = found;
 	list_add_rcu(&found->list, &info->space_info);
 	atomic_set(&found->caching_threads, 0);
+	if (flags & BTRFS_BLOCK_GROUP_DATA)
+		info->data_sinfo = found;
 	return 0;
 }
 
@@ -3016,12 +3018,6 @@ u64 btrfs_get_alloc_profile(struct btrfs_root *root, int data)
 	return get_alloc_profile(root, flags);
 }
 
-void btrfs_set_inode_space_info(struct btrfs_root *root, struct inode *inode)
-{
-	BTRFS_I(inode)->space_info = __find_space_info(root->fs_info,
-						       BTRFS_BLOCK_GROUP_DATA);
-}
-
 /*
  * This will check the space that the inode allocates from to make sure we have
  * enough space for bytes.
@@ -3030,6 +3026,7 @@ int btrfs_check_data_free_space(struct inode *inode, u64 bytes)
 {
 	struct btrfs_space_info *data_sinfo;
 	struct btrfs_root *root = BTRFS_I(inode)->root;
+	struct btrfs_fs_info *fs_info = root->fs_info;
 	u64 used;
 	int ret = 0, committed = 0, alloc_chunk = 1;
 
@@ -3042,7 +3039,7 @@ int btrfs_check_data_free_space(struct inode *inode, u64 bytes)
 		committed = 1;
 	}
 
-	data_sinfo = BTRFS_I(inode)->space_info;
+	data_sinfo = fs_info->data_sinfo;
 	if (!data_sinfo)
 		goto alloc;
 
@@ -3083,10 +3080,9 @@ alloc:
 				goto commit_trans;
 			}
 
-			if (!data_sinfo) {
-				btrfs_set_inode_space_info(root, inode);
-				data_sinfo = BTRFS_I(inode)->space_info;
-			}
+			if (!data_sinfo)
+				data_sinfo = root->fs_info->data_sinfo;
+
 			goto again;
 		}
 
@@ -3134,7 +3130,7 @@ void btrfs_free_reserved_data_space(struct inode *inode, u64 bytes)
 	/* make sure bytes are sectorsize aligned */
 	bytes = (bytes + root->sectorsize - 1) & ~((u64)root->sectorsize - 1);
 
-	data_sinfo = BTRFS_I(inode)->space_info;
+	data_sinfo = root->fs_info->data_sinfo;
 	spin_lock(&data_sinfo->lock);
 	data_sinfo->bytes_may_use -= bytes;
 	BTRFS_I(inode)->reserved_bytes -= bytes;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d340f63..af8e14f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3934,7 +3934,6 @@ static int btrfs_init_locked_inode(struct inode *inode, void *p)
 	struct btrfs_iget_args *args = p;
 	inode->i_ino = args->ino;
 	BTRFS_I(inode)->root = args->root;
-	btrfs_set_inode_space_info(args->root, inode);
 	return 0;
 }
 
@@ -4469,7 +4468,6 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	BTRFS_I(inode)->root = root;
 	BTRFS_I(inode)->generation = trans->transid;
 	inode->i_generation = BTRFS_I(inode)->generation;
-	btrfs_set_inode_space_info(root, inode);
 
 	if (mode & S_IFDIR)
 		owner = 0;
@@ -6720,7 +6718,6 @@ struct inode *btrfs_alloc_inode(struct super_block *sb)
 		return NULL;
 
 	ei->root = NULL;
-	ei->space_info = NULL;
 	ei->generation = 0;
 	ei->sequence = 0;
 	ei->last_trans = 0;
--
1.7.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
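The idea behind the patch above can be sketched in a small userspace model: instead of every inode carrying its own pointer, the filesystem-wide structure remembers the data space_info once, at the moment it is registered. The type and function names below (`fs_info_model`, `space_info_model`, `register_space_info`, `inode_data_space`) are hypothetical stand-ins, not the kernel's; only the flag test mirrors the hunk in update_space_info().

```c
#include <stddef.h>
#include <stdint.h>

/* Model flag values; the names are illustrative, not the kernel macros. */
#define MODEL_BLOCK_GROUP_DATA		(1ULL << 0)
#define MODEL_BLOCK_GROUP_METADATA	(1ULL << 2)

/* Hypothetical, stripped-down stand-in for struct btrfs_space_info. */
struct space_info_model {
	uint64_t flags;
	uint64_t bytes_may_use;
};

/* Hypothetical stand-in for struct btrfs_fs_info. */
struct fs_info_model {
	struct space_info_model *data_sinfo;	/* cached once per filesystem */
};

/*
 * Register a newly found space_info. As in the patch, the data
 * space_info is cached on the filesystem-wide structure.
 */
static void register_space_info(struct fs_info_model *fs,
				struct space_info_model *found,
				uint64_t flags)
{
	found->flags = flags;
	if (flags & MODEL_BLOCK_GROUP_DATA)
		fs->data_sinfo = found;
}

/*
 * Every inode allocates data from the same space_info, so the lookup
 * needs only the filesystem, not a per-inode pointer. Returns NULL
 * until a data space_info has been registered.
 */
static struct space_info_model *inode_data_space(const struct fs_info_model *fs)
{
	return fs->data_sinfo;
}
```

In this model a caller still has to handle a NULL result, which corresponds to the `if (!data_sinfo) goto alloc;` path that the patch keeps in btrfs_check_data_free_space().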
Example: Plain Text Message
This is a test message.

Best Regards,
Smart Williams
wsmar...@yahoo.nl
Sent on 2011-07-18
Re: [PATCH 7/7] btrfs: don't BUG_ON allocation errors in btrfs_drop_snapshot
Hi, Mark,

(2011/07/19 7:09), Mark Fasheh wrote:
> On Fri, Jul 15, 2011 at 12:04:46PM +0900, Tsutomu Itoh wrote:
>> (2011/07/15 7:15), Mark Fasheh wrote:
>>> In addition to properly handling allocation failure from btrfs_alloc_path, I
>>> also fixed up the kzalloc error handling code immediately below it.
>>
>> Need not you correct the caller of btrfs_drop_snapshot()?
>
> Hmm, I don't think so - the only two callers of btrfs_drop_snapshot() are
> merge_reloc_roots() and btrfs_clean_old_snapshots(). Both of which currently
> ignore the return code.

If you think so, I think that you should change the type of
btrfs_drop_snapshot() into 'void'.

Thanks,
Tsutomu

> 	--Mark
>
> --
> Mark Fasheh
Re: [PATCH 6/7] btrfs: Don't BUG_ON alloc_path errors in find_next_chunk
Excerpts from Mark Fasheh's message of 2011-07-18 17:36:57 -0400:
> Hi Tsutomu,
>
> Thanks for the review, it is appreciated!
>
> On Fri, Jul 15, 2011 at 11:43:52AM +0900, Tsutomu Itoh wrote:
> > > @@ -1037,7 +1037,8 @@ static noinline int find_next_chunk(struct btrfs_root *root,
> > >  	struct btrfs_key found_key;
> > >
> > >  	path = btrfs_alloc_path();
> > > -	BUG_ON(!path);
> > > +	if (!path)
> > > +		return -ENOMEM;
> >
> > If find_next_chunk() returns -ENOMEM, space_info->full becomes 1 by
> > following code.
> >
> > 3205 static int do_chunk_alloc(struct btrfs_trans_handle *trans,
> > 3206                           struct btrfs_root *extent_root, u64 alloc_bytes,
> > 3207                           u64 flags, int force)
> > 3208 {
> > ...
> > 3277         ret = btrfs_alloc_chunk(trans, extent_root, flags);
> > 3278         spin_lock(&space_info->lock);
> > 3279         if (ret)
> > 3280                 space_info->full = 1;
> > 3281         else
> > 3282                 ret = 1;
> >
> > Is it OK?
>
> I don't think so actually. It looks like in this case we might want to
> bubble the error back up past do_chunk_alloc and leave space_info untouched.
> Chris, does that seem reasonable?

Yeah, once space_info->full is 1, we don't flip it back to zero until
more space is available somehow. We should bubble the error up.

-chris
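What "bubble the error up" could look like is sketched below as a userspace model of the tail of do_chunk_alloc(). It is only an illustration of the behaviour agreed on in this thread, not a patch from it: the names `space_info_model` and `chunk_alloc_result` are hypothetical, and treating -ENOSPC as the one error that means "really full" is an assumption made for the sketch.

```c
#include <errno.h>

/* Hypothetical, simplified stand-in for struct btrfs_space_info. */
struct space_info_model {
	int full;	/* 1 once no further chunks can be allocated */
};

/*
 * Model of what happens after btrfs_alloc_chunk() returns. alloc_ret
 * stands in for that return value: 0 on success, negative errno on
 * failure.
 *
 * Returns 1 if a chunk was allocated, or the negative errno otherwise.
 * Only -ENOSPC (an assumption of this sketch) marks the space as full;
 * transient failures such as -ENOMEM leave the flag untouched.
 */
static int chunk_alloc_result(struct space_info_model *sinfo, int alloc_ret)
{
	if (alloc_ret < 0 && alloc_ret != -ENOSPC)
		return alloc_ret;	/* bubble up, space_info untouched */

	if (alloc_ret) {
		sinfo->full = 1;	/* genuinely out of space */
		return alloc_ret;
	}

	return 1;			/* a new chunk was allocated */
}
```

The point of the early return is the one Chris makes above: `full` is sticky, so it should only be set when the allocator has actually run out of space.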
Re: [PATCH 0/6] Series short description
Sorry, discard this email

On 07/18/2011 07:33 PM, Goffredo Baroncelli wrote:
> The following series implements...
>
> ---
>
> Goffredo Baroncelli (6):
>       Add info for the commands.
>       Add the header/footer/introduction of the man page.
>       helpextract: tool to extract the info for the help from the source.
>       Update the makefile for generating the man page.
>       Show the help messages from the info in the comment.
>       Update the makefile for generating the help messages.
>
>
>  Makefile           |   28 +++
>  btrfs.c            |  210 +
>  btrfs_cmds.c       |  276 +
>  helpextract.c      |  435
>  man/btrfs.8.in     |  359 ---
>  man/btrfs.8.in.old |  359 +++
>  scrub.c            |   79 +
>  7 files changed, 1380 insertions(+), 366 deletions(-)
>  create mode 100644 helpextract.c
>  delete mode 100644 man/btrfs.8.in
>  create mode 100644 man/btrfs.8.in.old
>
Re: [PATCH 7/7] btrfs: don't BUG_ON allocation errors in btrfs_drop_snapshot
On Fri, Jul 15, 2011 at 12:04:46PM +0900, Tsutomu Itoh wrote:
> (2011/07/15 7:15), Mark Fasheh wrote:
> > In addition to properly handling allocation failure from btrfs_alloc_path, I
> > also fixed up the kzalloc error handling code immediately below it.
>
> Need not you correct the caller of btrfs_drop_snapshot()?

Hmm, I don't think so - the only two callers of btrfs_drop_snapshot() are
merge_reloc_roots() and btrfs_clean_old_snapshots(). Both of which currently
ignore the return code.
	--Mark

--
Mark Fasheh
Re: [PATCH 6/7] btrfs: Don't BUG_ON alloc_path errors in find_next_chunk
Hi Tsutomu,

Thanks for the review, it is appreciated!

On Fri, Jul 15, 2011 at 11:43:52AM +0900, Tsutomu Itoh wrote:
> > @@ -1037,7 +1037,8 @@ static noinline int find_next_chunk(struct btrfs_root *root,
> >  	struct btrfs_key found_key;
> >
> >  	path = btrfs_alloc_path();
> > -	BUG_ON(!path);
> > +	if (!path)
> > +		return -ENOMEM;
>
> If find_next_chunk() returns -ENOMEM, space_info->full becomes 1 by following
> code.
>
> 3205 static int do_chunk_alloc(struct btrfs_trans_handle *trans,
> 3206                           struct btrfs_root *extent_root, u64 alloc_bytes,
> 3207                           u64 flags, int force)
> 3208 {
> ...
> 3277         ret = btrfs_alloc_chunk(trans, extent_root, flags);
> 3278         spin_lock(&space_info->lock);
> 3279         if (ret)
> 3280                 space_info->full = 1;
> 3281         else
> 3282                 ret = 1;
>
> Is it OK?

I don't think so actually. It looks like in this case we might want to
bubble the error back up past do_chunk_alloc and leave space_info untouched.
Chris, does that seem reasonable?
	--Mark

--
Mark Fasheh
Re: write(2) taking 4s
2011-07-18 11:39:12 +0100, Stephane Chazelas:
> 2011-07-17 10:17:37 +0100, Stephane Chazelas:
> > 2011-07-16 13:12:10 +0100, Stephane Chazelas:
> > > Still on my btrfs-based backup system. I still see one BUG()
> > > reached in btrfs-fixup per boot time, no memory exhaustion
> > > anymore. There is now however something new: write performance
> > > is down to a few bytes per second.
> > [...]
> >
> > The condition that was causing that seems to have cleared by
> > itself this morning before 4am.
> >
> > flush-btrfs-1 and sync are still in D state.
> >
> > Can't really tell what cleared it. Could be when the first of
> > the rsyncs ended as all the other ones (and ntfsclones from nbd
> > devices) ended soon after
> [...]
>
> New nightly backup, and it's happening again. Started about 40
> minutes after the start of the backup.
[...]
> Actively running at the moment are 1 rsync and 3 ntfsclone.
[...]

And then again today.

Interestingly, I "killall -STOP"ed all the ntfsclone and rsync
processes and:

# strace -tt -Te write yes > a-file-on-the-btrfs-fs
20:23:26.635848 write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = 4096 <4.095223>
20:23:30.731391 write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = 4096 <4.095769>
20:23:34.827390 write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = 4096 <4.095788>
20:23:38.923388 write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 4096) = 4096 <4.095771>

Now 95% of the write(2)s take 4 seconds (while it was about 15%
before I stopped the processes).

[304257.760119] yes S 88001e8e3780 0 13179 13178 0x0001
[304257.760119] 88001e8e3780 0086 8160b020
[304257.760119] 000127c0 880074543fd8 880074543fd8 000127c0
[304257.760119] 88001e8e3780 880074542010 0286 00010286
[304257.760119] Call Trace:
[304257.760119] [] ? schedule_timeout+0xa0/0xd7
[304257.760119] [] ? lock_timer_base+0x49/0x49
[304257.760119] [] ? shrink_delalloc+0x100/0x14e [btrfs]
[304257.760119] [] ? btrfs_delalloc_reserve_metadata+0xf9/0x10b [btrfs]
[304257.760119] [] ? btrfs_delalloc_reserve_space+0x20/0x3e [btrfs]
[304257.760119] [] ? __btrfs_buffered_write+0x137/0x2dc [btrfs]
[304257.760119] [] ? btrfs_dirty_inode+0x119/0x139 [btrfs]
[304257.760119] [] ? btrfs_file_aio_write+0x395/0x42b [btrfs]
[304257.760119] [] ? __switch_to+0x19c/0x288
[304257.760119] [] ? do_sync_write+0xb1/0xea
[304257.760119] [] ? ptrace_notify+0x7f/0x9d
[304257.760119] [] ? security_file_permission+0x18/0x2d
[304257.760119] [] ? vfs_write+0xa4/0xff
[304257.760119] [] ? syscall_trace_enter+0xb6/0x15b
[304257.760119] [] ? sys_write+0x45/0x6e
[304257.760119] [] ? tracesys+0xd9/0xde

After killall -CONT, it's back to 15% write(2)s delayed.

What's going on?

--
Stephane
Re: Applications using fsync cause hangs for several seconds every few minutes
On 06/06/2011 06:58 PM, Nirbheek Chauhan wrote:
> Hello list,
>
> I've been using btrfs on my personal machines for about two years now,
> and on this machine for about a year with absolutely no problems.
> In fact, it has held up better than ext4 with regards to reliability.
>
> However, recently, perhaps with 2.6.39, or after I quickly started
> filling up my disk again, it has become impossible for me to work for
> long periods on my machine.
>
> Every few minutes, (I guess) when applications do fsync (firefox,
> xchat, vim, etc), all applications that use fsync() hang for several
> seconds, and applications that use general IO suffer extreme
> slowdowns. iotop shows various combinations of the processes listed
> below doing writes, and the total write as 2-3MB/s.
>
> [btrfs-dealloc-]
> [btrfs-submit-0]
> [btrfs-transacti]
> [btrfs-endio-wri]
> [flush-btrfs-1]
>
> In some extreme cases, I've had hangs for 5 whole minutes. I'm really
> beginning to appreciate how little I/O GNOME Shell does since it
> remains completely responsive throughout this. I have a feeling that
> the cause for this is extreme fragmentation.
>
> My hard disk is a 500GB SATA hdd, my btrfs partition details are:
>
> # btrfs filesystem show
> Label: 'gentoo'  uuid: 6f539d7f-f70f-4216-a4a9-6f7a2117a04a
>         Total devices 1 FS bytes used 246.37GB
>         devid    1 size 345.13GB used 345.13GB path /dev/sda7
>
> Btrfs v0.19-35-g1b444cd-dirty
>
> What can I do to debug this issue? What other information should I
> supply? Could someone guide me on how to figure out why my machine is
> unusable now?
>
> Thanks in advance,
>

Hello,

I've been looking into this and I have a suspicion. Would you run with
this patch and see if the problem goes away? If so I'm on the right
track and I'll have more test patches for you to try :). Thanks,

Josef

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 19450bc..2e30350 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -150,7 +150,6 @@ static noinline int run_scheduled_bios(struct btrfs_device *device)
 	 * another device without first sending all of these down.
 	 * So, setup a plug here and finish it off before we return
 	 */
-	blk_start_plug(&plug);
 
 	bdi = blk_get_backing_dev_info(device->bdev);
 	fs_info = device->dev_root->fs_info;
@@ -290,7 +289,6 @@ loop_lock:
 	spin_unlock(&device->io_lock);
 
 done:
-	blk_finish_plug(&plug);
 	return 0;
 }
 
Re: [PATCH] Btrfs: don't be as agressive with delalloc metadata reservations V2
On 07/18/2011 02:11 PM, Josef Bacik wrote:
> Currently we reserve enough space to COW an entirely full btree for every extent
> we have reserved for an inode. This _sucks_, because you only need to COW once,
> and then everybody else is ok. Unfortunately we don't know we'll all be able to
> get into the same transaction so that's what we have had to do. But the global
> reserve holds a reservation large enough to cover a large percentage of all the
> metadata currently in the fs. So all we really need to account for is any new
> blocks that we may allocate. So fix this by
>
> 1) Passing to btrfs_alloc_free_block() whether this is a new block or a COW
> block. If it is a COW block we use the global reserve, if not we use the
> trans->block_rsv.
> 2) Reduce the amount of space we reserve. Since we don't need to account for
> cow'ing the tree we can just keep track of new blocks to reserve, which greatly
> reduces the reservation amount.
>
> This makes my basic random write test go from 3 mb/s to 75 mb/s. I've tested
> this with my horrible ENOSPC test and it seems to work out fine. Thanks,
>
> Signed-off-by: Josef Bacik
> ---
> V1->V2:
> -fix a problem reported by Liubo, we need to make sure that we move bytes
> over for any new extents we may add to the extent tree so we don't get a bunch
> of warnings.
> -fix the global reserve to reserve 50% of the metadata space currently used.

Argh helps if I actually send the updated patch, sorry!

---
 fs/btrfs/ctree.c       |   10 +-
 fs/btrfs/ctree.h       |    5 ++---
 fs/btrfs/disk-io.c     |    3 ++-
 fs/btrfs/extent-tree.c |   31 ---
 fs/btrfs/ioctl.c       |    2 +-
 5 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 2e66786..fbd48e9 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -206,7 +206,7 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans,
 
 	cow = btrfs_alloc_free_block(trans, root, buf->len, 0,
 				     new_root_objectid, &disk_key, level,
-				     buf->start, 0);
+				     buf->start, 0, 1);
 	if (IS_ERR(cow))
 		return PTR_ERR(cow);
 
@@ -412,7 +412,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
 
 	cow = btrfs_alloc_free_block(trans, root, buf->len, parent_start,
 				     root->root_key.objectid, &disk_key,
-				     level, search_start, empty_size);
+				     level, search_start, empty_size, 0);
 	if (IS_ERR(cow))
 		return PTR_ERR(cow);
 
@@ -1985,7 +1985,7 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans,
 
 	c = btrfs_alloc_free_block(trans, root, root->nodesize, 0,
 				   root->root_key.objectid, &lower_key,
-				   level, root->node->start, 0);
+				   level, root->node->start, 0, 1);
 	if (IS_ERR(c))
 		return PTR_ERR(c);
 
@@ -2112,7 +2112,7 @@ static noinline int split_node(struct btrfs_trans_handle *trans,
 
 	split = btrfs_alloc_free_block(trans, root, root->nodesize, 0,
 					root->root_key.objectid,
-					&disk_key, level, c->start, 0);
+					&disk_key, level, c->start, 0, 1);
 	if (IS_ERR(split))
 		return PTR_ERR(split);
 
@@ -2937,7 +2937,7 @@ again:
 
 	right = btrfs_alloc_free_block(trans, root, root->leafsize, 0,
 					root->root_key.objectid,
-					&disk_key, 0, l->start, 0);
+					&disk_key, 0, l->start, 0, 1);
 	if (IS_ERR(right))
 		return PTR_ERR(right);
 
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 3ba4d5f..1accb56 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2135,8 +2135,7 @@ static inline bool btrfs_mixed_space_info(struct btrfs_space_info *space_info)
 static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_root *root,
 						 unsigned num_items)
 {
-	return (root->leafsize + root->nodesize * (BTRFS_MAX_LEVEL - 1)) *
-		3 * num_items;
+	return root->leafsize * 3 * num_items;
 }
 
 void btrfs_put_block_group(struct btrfs_block_group_cache *cache);
@@ -2161,7 +2160,7 @@ struct extent_buffer *btrfs_alloc_free_block(struct btrfs_trans_handle *trans,
 					struct btrfs_root *root, u32 blocksize,
 					u64 parent, u64 root_objectid,
 					struct btrfs_disk_key *key, int level,
-					u64 hint, u64 empty_size);
+					u64 hint, u6
[PATCH] Btrfs: don't be as agressive with delalloc metadata reservations V2
Currently we reserve enough space to COW an entirely full btree for every extent
we have reserved for an inode. This _sucks_, because you only need to COW once,
and then everybody else is ok. Unfortunately we don't know we'll all be able to
get into the same transaction so that's what we have had to do. But the global
reserve holds a reservation large enough to cover a large percentage of all the
metadata currently in the fs. So all we really need to account for is any new
blocks that we may allocate. So fix this by

1) Passing to btrfs_alloc_free_block() whether this is a new block or a COW
block. If it is a COW block we use the global reserve, if not we use the
trans->block_rsv.
2) Reduce the amount of space we reserve. Since we don't need to account for
cow'ing the tree we can just keep track of new blocks to reserve, which greatly
reduces the reservation amount.

This makes my basic random write test go from 3 mb/s to 75 mb/s. I've tested
this with my horrible ENOSPC test and it seems to work out fine. Thanks,

Signed-off-by: Josef Bacik
---
V1->V2:
-fix a problem reported by Liubo, we need to make sure that we move bytes
over for any new extents we may add to the extent tree so we don't get a bunch
of warnings.
-fix the global reserve to reserve 50% of the metadata space currently used.

 fs/btrfs/ctree.c       |   10 +-
 fs/btrfs/ctree.h       |    5 ++---
 fs/btrfs/disk-io.c     |    3 ++-
 fs/btrfs/extent-tree.c |   20 +++-
 fs/btrfs/ioctl.c       |    2 +-
 5 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 2e66786..fbd48e9 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -206,7 +206,7 @@ int btrfs_copy_root(struct btrfs_trans_handle *trans,
 
 	cow = btrfs_alloc_free_block(trans, root, buf->len, 0,
 				     new_root_objectid, &disk_key, level,
-				     buf->start, 0);
+				     buf->start, 0, 1);
 	if (IS_ERR(cow))
 		return PTR_ERR(cow);
 
@@ -412,7 +412,7 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans,
 
 	cow = btrfs_alloc_free_block(trans, root, buf->len, parent_start,
 				     root->root_key.objectid, &disk_key,
-				     level, search_start, empty_size);
+				     level, search_start, empty_size, 0);
 	if (IS_ERR(cow))
 		return PTR_ERR(cow);
 
@@ -1985,7 +1985,7 @@ static noinline int insert_new_root(struct btrfs_trans_handle *trans,
 
 	c = btrfs_alloc_free_block(trans, root, root->nodesize, 0,
 				   root->root_key.objectid, &lower_key,
-				   level, root->node->start, 0);
+				   level, root->node->start, 0, 1);
 	if (IS_ERR(c))
 		return PTR_ERR(c);
 
@@ -2112,7 +2112,7 @@ static noinline int split_node(struct btrfs_trans_handle *trans,
 
 	split = btrfs_alloc_free_block(trans, root, root->nodesize, 0,
 					root->root_key.objectid,
-					&disk_key, level, c->start, 0);
+					&disk_key, level, c->start, 0, 1);
 	if (IS_ERR(split))
 		return PTR_ERR(split);
 
@@ -2937,7 +2937,7 @@ again:
 
 	right = btrfs_alloc_free_block(trans, root, root->leafsize, 0,
 					root->root_key.objectid,
-					&disk_key, 0, l->start, 0);
+					&disk_key, 0, l->start, 0, 1);
 	if (IS_ERR(right))
 		return PTR_ERR(right);
 
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 3ba4d5f..1accb56 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2135,8 +2135,7 @@ static inline bool btrfs_mixed_space_info(struct btrfs_space_info *space_info)
 static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_root *root,
 						 unsigned num_items)
 {
-	return (root->leafsize + root->nodesize * (BTRFS_MAX_LEVEL - 1)) *
-		3 * num_items;
+	return root->leafsize * 3 * num_items;
 }
 
 void btrfs_put_block_group(struct btrfs_block_group_cache *cache);
@@ -2161,7 +2160,7 @@ struct extent_buffer *btrfs_alloc_free_block(struct btrfs_trans_handle *trans,
 					struct btrfs_root *root, u32 blocksize,
 					u64 parent, u64 root_objectid,
 					struct btrfs_disk_key *key, int level,
-					u64 hint, u64 empty_size);
+					u64 hint, u64 empty_size, int new_block);
 void btrfs_free_tree_block(struct btrfs_trans_handle *trans,
 			   struct btrfs_root *root,
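To get a feel for how much the change to btrfs_calc_trans_metadata_size() shrinks each reservation, the two formulas from the hunk above can be evaluated side by side. This is a userspace sketch: the function names are hypothetical, `MODEL_MAX_LEVEL` is assumed to match the kernel's BTRFS_MAX_LEVEL of 8, and the 4096-byte leaf and node sizes used in the note below are an assumption rather than something stated in the patch.

```c
#include <stdint.h>

/* Assumed to match BTRFS_MAX_LEVEL in the kernel's ctree.h. */
#define MODEL_MAX_LEVEL 8

/* Old formula: room to COW a full path from leaf to root, three times. */
static uint64_t calc_trans_metadata_size_old(uint64_t leafsize,
					     uint64_t nodesize,
					     unsigned num_items)
{
	return (leafsize + nodesize * (MODEL_MAX_LEVEL - 1)) * 3 * num_items;
}

/* New formula: only account for newly allocated leaves. */
static uint64_t calc_trans_metadata_size_new(uint64_t leafsize,
					     unsigned num_items)
{
	return leafsize * 3 * num_items;
}
```

With 4096-byte leaves and nodes, the old formula reserves 98304 bytes per item and the new one 12288 bytes, an eight-fold reduction per reserved item under those assumed sizes.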
[PATCH 0/6] Series short description
The following series implements...

---

Goffredo Baroncelli (6):
      Add info for the commands.
      Add the header/footer/introduction of the man page.
      helpextract: tool to extract the info for the help from the source.
      Update the makefile for generating the man page.
      Show the help messages from the info in the comment.
      Update the makefile for generating the help messages.


 Makefile           |   28 +++
 btrfs.c            |  210 +
 btrfs_cmds.c       |  276 +
 helpextract.c      |  435
 man/btrfs.8.in     |  359 ---
 man/btrfs.8.in.old |  359 +++
 scrub.c            |   79 +
 7 files changed, 1380 insertions(+), 366 deletions(-)
 create mode 100644 helpextract.c
 delete mode 100644 man/btrfs.8.in
 create mode 100644 man/btrfs.8.in.old

--
Signature
Re: Applications using fsync cause hangs for several seconds every few minutes
On Tue, 2011-06-07 at 04:28 +0530, Nirbheek Chauhan wrote:
> Every few minutes, (I guess) when applications do fsync (firefox,
> xchat, vim, etc), all applications that use fsync() hang for several
> seconds, and applications that use general IO suffer extreme
> slowdowns. iotop shows various combinations of the processes listed
> below doing writes, and the total write as 2-3MB/s.

I have experienced this too.
It /seemed/ to help removing a lot of snapshots (i have hundreds that i
didn't really need).

Would it be stupid to try disabling fsync like described at
http://ubuntuforums.org/archive/index.php/t-1103926.html ?
I don't know of the consequences... but it would prove your theory?

~mck

--
“Don’t worry about people stealing your ideas. If your ideas are any
good, you’ll have to ram them down people’s throats.” - Howard Aiken

| http://semb.wever.org | http://sesat.no | http://tech.finn.no | Java XSS Filter
Re: [PATCH] Btrfs: don't be as agressive with delalloc metadata reservations
On 07/17/2011 11:04 PM, liubo wrote:
> On 07/16/2011 02:29 AM, Josef Bacik wrote:
>> Currently we reserve enough space to COW an entirely full btree for every extent
>> we have reserved for an inode. This _sucks_, because you only need to COW once,
>> and then everybody else is ok. Unfortunately we don't know we'll all be able to
>> get into the same transaction so that's what we have had to do. But the global
>> reserve holds a reservation large enough to cover a large percentage of all the
>> metadata currently in the fs. So all we really need to account for is any new
>> blocks that we may allocate. So fix this by
>>
>> 1) Passing to btrfs_alloc_free_block() whether this is a new block or a COW
>> block. If it is a COW block we use the global reserve, if not we use the
>> trans->block_rsv.
>> 2) Reduce the amount of space we reserve. Since we don't need to account for
>> cow'ing the tree we can just keep track of new blocks to reserve, which greatly
>> reduces the reservation amount.
>>
>> This makes my basic random write test go from 3 mb/s to 75 mb/s. I've tested
>> this with my horrible ENOSPC test and it seems to work out fine. Thanks,
>>
>
> Hi, Josef,
>
> After I patched this and did a "tar xf source.tar", I got lots of warnings,
>
> Would you like to look into this?
>
> ------------[ cut here ]------------
> WARNING: at fs/btrfs/extent-tree.c:5695 btrfs_alloc_free_block+0x178/0x340 [btrfs]()
> Hardware name: QiTianM7150
> Modules linked in: btrfs iptable_nat nf_nat zlib_deflate libcrc32c
> ebtable_nat ebtables bridge stp llc cpufreq_ondemand acpi_cpufreq freq_table
> mperf be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb3i libcxgbi cxgb3 mdio
> iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext3 jbd dm_mirror
> dm_region_hash dm_log dm_mod sg ppdev serio_raw pcspkr i2c_i801 iTCO_wdt
> iTCO_vendor_support sky2 parport_pc parport ext4 mbcache jbd2 sd_mod
> crc_t10dif pata_acpi ata_generic ata_piix i915 drm_kms_helper drm
> i2c_algo_bit i2c_core video [last unloaded: btrfs]
> Pid: 16008, comm: umount Tainted: GW 2.6.39+ #9
> Call Trace:
> [] warn_slowpath_common+0x7f/0xc0
> [] warn_slowpath_null+0x1a/0x20
> [] btrfs_alloc_free_block+0x178/0x340 [btrfs]
> [] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
> [] __btrfs_cow_block+0x155/0x5f0 [btrfs]
> [] btrfs_cow_block+0x10b/0x240 [btrfs]
> [] btrfs_search_slot+0x49e/0x7a0 [btrfs]
> [] btrfs_write_dirty_block_groups+0x1a9/0x4d0 [btrfs]
> [] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
> [] commit_cowonly_roots+0x105/0x1e0 [btrfs]
> [] btrfs_commit_transaction+0x428/0x850 [btrfs]
> [] ? wait_current_trans+0x28/0x100 [btrfs]
> [] ? join_transaction+0x25/0x250 [btrfs]
> [] ? wake_up_bit+0x40/0x40
> [] btrfs_sync_fs+0x67/0xd0 [btrfs]
> [] __sync_filesystem+0x5e/0x90
> [] sync_filesystem+0x4b/0x70
> [] generic_shutdown_super+0x34/0xf0
> [] kill_anon_super+0x16/0x60
> [] deactivate_locked_super+0x45/0x70
> [] deactivate_super+0x4a/0x70
> [] mntput_no_expire+0x13c/0x1c0
> [] sys_umount+0x7b/0x3a0
> [] system_call_fastpath+0x16/0x1b
> ---[ end trace 9a65800674b03b84 ]---
>

Hmm, not reserving enough room for the chunk tree it seems, or I screwed
something up and we're not using the right reserve. I will look into
this, thanks,

Josef
[PATCH v3 2/5] btrfs-progs: scrub ioctls
- scrub structs added
- ioctls for scrub
- BTRFS_FSID_SIZE moved

Signed-off-by: Jan Schmidt
---
 ctree.h |    2 +-
 ioctl.h |   54 +-
 2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/ctree.h b/ctree.h
index 61eb639..6e1b80b 100644
--- a/ctree.h
+++ b/ctree.h
@@ -24,6 +24,7 @@
 #include "radix-tree.h"
 #include "extent-cache.h"
 #include "extent_io.h"
+#include "ioctl.h"
 
 struct btrfs_root;
 struct btrfs_trans_handle;
@@ -250,7 +251,6 @@ static inline unsigned long btrfs_chunk_item_size(int num_stripes)
 		sizeof(struct btrfs_stripe) * (num_stripes - 1);
 }
 
-#define BTRFS_FSID_SIZE 16
 #define BTRFS_HEADER_FLAG_WRITTEN	(1ULL << 0)
 #define BTRFS_HEADER_FLAG_RELOC		(1ULL << 1)
 #define BTRFS_SUPER_FLAG_SEEDING	(1ULL << 32)
diff --git a/ioctl.h b/ioctl.h
index 7be1177..11730d2 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -23,8 +23,9 @@
 
 #define BTRFS_IOCTL_MAGIC 0x94
 #define BTRFS_VOL_NAME_MAX 255
-#define BTRFS_PATH_NAME_MAX 4087
 
+/* this should be 4k */
+#define BTRFS_PATH_NAME_MAX 4087
 struct btrfs_ioctl_vol_args {
 	__s64 fd;
 	char name[BTRFS_PATH_NAME_MAX + 1];
@@ -44,6 +45,52 @@ struct btrfs_ioctl_vol_args_v2 {
 #define BTRFS_FSID_SIZE 16
 #define BTRFS_UUID_SIZE 16
 
+struct btrfs_scrub_progress {
+	__u64 data_extents_scrubbed;
+	__u64 tree_extents_scrubbed;
+	__u64 data_bytes_scrubbed;
+	__u64 tree_bytes_scrubbed;
+	__u64 read_errors;
+	__u64 csum_errors;
+	__u64 verify_errors;
+	__u64 no_csum;
+	__u64 csum_discards;
+	__u64 super_errors;
+	__u64 malloc_errors;
+	__u64 uncorrectable_errors;
+	__u64 corrected_errors;
+	__u64 last_physical;
+	__u64 unverified_errors;
+};
+
+#define BTRFS_SCRUB_READONLY 1
+struct btrfs_ioctl_scrub_args {
+	__u64 devid;				/* in */
+	__u64 start;				/* in */
+	__u64 end;				/* in */
+	__u64 flags;				/* in */
+	struct btrfs_scrub_progress progress;	/* out */
+	/* pad to 1k */
+	__u64 unused[(1024-32-sizeof(struct btrfs_scrub_progress))/8];
+};
+
+#define BTRFS_DEVICE_PATH_NAME_MAX 1024
+struct btrfs_ioctl_dev_info_args {
+	__u64 devid;				/* in/out */
+	__u8 uuid[BTRFS_UUID_SIZE];		/* in/out */
+	__u64 bytes_used;			/* out */
+	__u64 total_bytes;			/* out */
+	__u64 unused[379];			/* pad to 4k */
+	__u8 path[BTRFS_DEVICE_PATH_NAME_MAX];	/* out */
+};
+
+struct btrfs_ioctl_fs_info_args {
+	__u64 max_id;				/* out */
+	__u64 num_devices;			/* out */
+	__u8 fsid[BTRFS_FSID_SIZE];		/* out */
+	__u64 reserved[124];			/* pad to 1k */
+};
+
 struct btrfs_ioctl_search_key {
 	/* which root are we searching. 0 is the tree of tree roots */
 	__u64 tree_id;
@@ -234,6 +281,11 @@ struct btrfs_ioctl_balance_start {
 				    struct btrfs_ioctl_space_args)
 #define BTRFS_IOC_SNAP_CREATE_V2 _IOW(BTRFS_IOCTL_MAGIC, 23, \
 				   struct btrfs_ioctl_vol_args_v2)
+#define BTRFS_IOC_SCRUB _IOWR(BTRFS_IOCTL_MAGIC, 27, \
+				struct btrfs_ioctl_scrub_args)
+#define BTRFS_IOC_SCRUB_CANCEL _IO(BTRFS_IOCTL_MAGIC, 28)
+#define BTRFS_IOC_SCRUB_PROGRESS _IOWR(BTRFS_IOCTL_MAGIC, 29, \
+					struct btrfs_ioctl_scrub_args)
 #define BTRFS_IOC_DEV_INFO _IOWR(BTRFS_IOCTL_MAGIC, 30, \
 				 struct btrfs_ioctl_dev_info_args)
 #define BTRFS_IOC_FS_INFO _IOR(BTRFS_IOCTL_MAGIC, 31, \
--
1.7.3.4
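The "pad to 1k" and "pad to 4k" comments in the structs above can be checked mechanically. The sketch below restates the three argument structs with fixed-width types from `<stdint.h>` and asserts their sizes at compile time. The `_model` names are hypothetical, the fifteen progress counters are collapsed into one array for brevity, and the check assumes a typical ABI where `uint64_t` is 8-byte aligned with no extra padding.

```c
#include <stdint.h>

#define MODEL_FSID_SIZE			16
#define MODEL_UUID_SIZE			16
#define MODEL_DEVICE_PATH_NAME_MAX	1024

/* Stand-in for struct btrfs_scrub_progress: fifteen __u64 counters. */
struct scrub_progress_model {
	uint64_t counters[15];
};

struct scrub_args_model {
	uint64_t devid;					/* in */
	uint64_t start;					/* in */
	uint64_t end;					/* in */
	uint64_t flags;					/* in */
	struct scrub_progress_model progress;		/* out */
	/* pad to 1k, same expression as in the patch */
	uint64_t unused[(1024 - 32 - sizeof(struct scrub_progress_model)) / 8];
};

struct dev_info_args_model {
	uint64_t devid;					/* in/out */
	uint8_t uuid[MODEL_UUID_SIZE];			/* in/out */
	uint64_t bytes_used;				/* out */
	uint64_t total_bytes;				/* out */
	uint64_t unused[379];				/* pad to 4k */
	uint8_t path[MODEL_DEVICE_PATH_NAME_MAX];	/* out */
};

struct fs_info_args_model {
	uint64_t max_id;				/* out */
	uint64_t num_devices;				/* out */
	uint8_t fsid[MODEL_FSID_SIZE];			/* out */
	uint64_t reserved[124];				/* pad to 1k */
};

/* Compile-time checks of the padding arithmetic. */
_Static_assert(sizeof(struct scrub_args_model) == 1024, "scrub args: 1k");
_Static_assert(sizeof(struct dev_info_args_model) == 4096, "dev info: 4k");
_Static_assert(sizeof(struct fs_info_args_model) == 1024, "fs info: 1k");
```

Fixing the sizes this way matters for an ioctl interface because the structure size is encoded into the request number by `_IOWR` and `_IOR`, so kernel and userspace must agree on it exactly.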
[PATCH v3 4/5] btrfs-progs: scrub userland implementation
Signed-off-by: Jan Schmidt --- scrub.c | 1666 +++ 1 files changed, 1666 insertions(+), 0 deletions(-) diff --git a/scrub.c b/scrub.c new file mode 100644 index 000..9dca5f6 --- /dev/null +++ b/scrub.c @@ -0,0 +1,1666 @@ +/* + * Copyright (C) 2011 STRATO. All rights reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public + * License v2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * General Public License for more details. + * + * You should have received a copy of the GNU General Public + * License along with this program; if not, write to the + * Free Software Foundation, Inc., 59 Temple Place - Suite 330, + * Boston, MA 021110-1307, USA. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "ctree.h" +#include "ioctl.h" +#include "btrfs_cmds.h" +#include "utils.h" +#include "volumes.h" +#include "disk-io.h" + +#define SCRUB_DATA_FILE "/var/lib/btrfs/scrub.status" +#define SCRUB_PROGRESS_SOCKET_PATH "/var/lib/btrfs/scrub.progress" +#define SCRUB_FILE_VERSION_PREFIX "scrub status" +#define SCRUB_FILE_VERSION "1" + +struct scrub_stats { + time_t t_start; + time_t t_resumed; + u64 duration; + u64 finished; + u64 canceled; +}; + +struct scrub_progress { + struct btrfs_ioctl_scrub_args scrub_args; + int fd; + int ret; + int skip; + struct scrub_stats stats; + struct scrub_file_record *resumed; + int ioctl_errno; + pthread_mutex_t progress_mutex; +}; + +struct scrub_file_record { + u8 fsid[BTRFS_FSID_SIZE]; + u64 devid; + struct scrub_stats stats; + struct btrfs_scrub_progress p; +}; + +struct scrub_progress_cycle { + int fdmnt; + int prg_fd; + int do_record; + struct 
btrfs_ioctl_fs_info_args *fi; + struct scrub_progress *progress; + struct scrub_progress *shared_progress; + pthread_mutex_t *write_mutex; +}; + +struct scrub_fs_stat { + struct btrfs_scrub_progress p; + struct scrub_stats s; + int i; +}; + +static void print_scrub_full(struct btrfs_scrub_progress *sp) +{ + printf("\tdata_extents_scrubbed: %lld\n", sp->data_extents_scrubbed); + printf("\ttree_extents_scrubbed: %lld\n", sp->tree_extents_scrubbed); + printf("\tdata_bytes_scrubbed: %lld\n", sp->data_bytes_scrubbed); + printf("\ttree_bytes_scrubbed: %lld\n", sp->tree_bytes_scrubbed); + printf("\tread_errors: %lld\n", sp->read_errors); + printf("\tcsum_errors: %lld\n", sp->csum_errors); + printf("\tverify_errors: %lld\n", sp->verify_errors); + printf("\tno_csum: %lld\n", sp->no_csum); + printf("\tcsum_discards: %lld\n", sp->csum_discards); + printf("\tsuper_errors: %lld\n", sp->super_errors); + printf("\tmalloc_errors: %lld\n", sp->malloc_errors); + printf("\tuncorrectable_errors: %lld\n", sp->uncorrectable_errors); + printf("\tunverified_errors: %lld\n", sp->unverified_errors); + printf("\tcorrected_errors: %lld\n", sp->corrected_errors); + printf("\tlast_physical: %lld\n", sp->last_physical); +} + +#define ERR(test, ...) do {\ + if (test) \ + fprintf(stderr, __VA_ARGS__); \ +} while (0) + +#define PRINT_SCRUB_ERROR(test, desc) do { \ + if (test) \ + printf(" %s=%llu", desc, test); \ +} while (0) + +static void print_scrub_summary(struct btrfs_scrub_progress *p) +{ + u64 err_cnt; + u64 err_cnt2; + char *bytes; + + err_cnt = p->read_errors + + p->csum_errors + + p->verify_errors + + p->super_errors; + + err_cnt2 = p->corrected_errors + p->uncorrectable_errors; + + if (p->malloc_errors) + printf("*** WARNING: memory allocation failed while scrubbing. 
" + "results may be inaccurate\n"); + bytes = pretty_sizes(p->data_bytes_scrubbed + p->tree_bytes_scrubbed); + printf("\ttotal bytes scrubbed: %s with %llu errors\n", bytes, + max(err_cnt, err_cnt2)); + free(bytes); + if (err_cnt || err_cnt2) { + printf("\terror details:"); + PRINT_SCRUB_ERROR(p->read_errors, "read"); + PRINT_SCRUB_ERROR(p->super_errors, "super"); + PRINT_SCRUB_ERROR(p->verify_errors, "verify"); + PRINT_SCRUB_ERROR(p->csum_errors, "csum"); + printf("\n"); + printf("\tcorr
[PATCH v3 3/5] btrfs-progs: added check_mounted_where
new version of check_mounted() returning more information gathered while searching. check_mounted() is now a wrapper for check_mounted_where(). the new version is needed by scrub.c Signed-off-by: Jan Schmidt --- utils.c | 29 ++--- utils.h |2 ++ 2 files changed, 24 insertions(+), 7 deletions(-) diff --git a/utils.c b/utils.c index 64a4298..86c643c 100644 --- a/utils.c +++ b/utils.c @@ -790,13 +790,8 @@ int blk_file_in_dev_list(struct btrfs_fs_devices* fs_devices, const char* file) */ int check_mounted(const char* file) { - int ret; int fd; - u64 total_devs = 1; - int is_btrfs; - struct btrfs_fs_devices* fs_devices_mnt = NULL; - FILE *f; - struct mntent *mnt; + int ret; fd = open(file, O_RDONLY); if (fd < 0) { @@ -804,11 +799,26 @@ int check_mounted(const char* file) return -errno; } + ret = check_mounted_where(fd, file, NULL, 0, NULL); + close(fd); + + return ret; +} + +int check_mounted_where(int fd, const char *file, char *where, int size, + struct btrfs_fs_devices **fs_dev_ret) +{ + int ret; + u64 total_devs = 1; + int is_btrfs; + struct btrfs_fs_devices *fs_devices_mnt = NULL; + FILE *f; + struct mntent *mnt; + /* scan the initial device */ ret = btrfs_scan_one_device(fd, file, &fs_devices_mnt, &total_devs, BTRFS_SUPER_INFO_OFFSET); is_btrfs = (ret >= 0); - close(fd); /* scan other devices */ if (is_btrfs && total_devs > 1) { @@ -844,6 +854,11 @@ int check_mounted(const char* file) } /* Did we find an entry in mnt table? 
*/ + if (mnt && size && where) + strncpy(where, mnt->mnt_dir, size); + if (fs_dev_ret) + *fs_dev_ret = fs_devices_mnt; + ret = (mnt != NULL); out_mntloop_err: diff --git a/utils.h b/utils.h index 0067728..c5f55e1 100644 --- a/utils.h +++ b/utils.h @@ -37,6 +37,8 @@ int btrfs_scan_for_fsid(struct btrfs_fs_devices *fs_devices, u64 total_devs, void btrfs_register_one_device(char *fname); int btrfs_scan_one_dir(char *dirname, int run_ioctl); int check_mounted(const char *devicename); +int check_mounted_where(int fd, const char *file, char *where, int size, + struct btrfs_fs_devices **fs_devices_mnt); int btrfs_device_already_in_root(struct btrfs_root *root, int fd, int super_offset); char *pretty_sizes(u64 size); -- 1.7.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
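One detail about the strncpy() call in check_mounted_where(): strncpy() does not write a terminating NUL when the source is at least size bytes long, so a caller passing a short buffer could end up with an unterminated string. Below is a minimal sketch of a copy that always terminates. The helper name copy_mount_dir is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): copy src into dst, which has
 * room for size bytes, and always NUL-terminate the result.
 * Returns 0 if src fit completely, -1 if it was truncated or the
 * arguments were unusable.
 */
static int copy_mount_dir(char *dst, size_t size, const char *src)
{
	size_t len;

	if (!dst || !src || size == 0)
		return -1;

	len = strlen(src);
	if (len >= size) {
		/* keep as much as fits and terminate explicitly */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -1;
	}

	/* fits: copy including the terminating NUL */
	memcpy(dst, src, len + 1);
	return 0;
}
```

A caller could treat the -1 return as "mount point too long for the buffer" instead of silently working with a cut-off path.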
[PATCH v3 5/5] btrfs-progs: scrub added to manpage
Signed-off-by: Jan Schmidt --- man/btrfs.8.in | 64 +++- 1 files changed, 63 insertions(+), 1 deletions(-) diff --git a/man/btrfs.8.in b/man/btrfs.8.in index e1a6ad9..84a60cd 100644 --- a/man/btrfs.8.in +++ b/man/btrfs.8.in @@ -41,7 +41,15 @@ btrfs \- control a btrfs filesystem .PP \fBbtrfs\fP \fBdevice add\fP\fI [...] \fP .PP -\fBbtrfs\fP \fBdevice delete\fP\fI [...] \fP] +\fBbtrfs\fP \fBdevice delete\fP\fI [...] \fP +.PP +\fBbtrfs\fP \fBscrub start\fP [-Bdqru] {\fI\fP|\fI\fP} +.PP +\fBbtrfs\fP \fBscrub cancel\fP {\fI\fP|\fI\fP} +.PP +\fBbtrfs\fP \fBscrub resume\fP [-Bdqru] {\fI\fP|\fI\fP} +.PP +\fBbtrfs\fP \fBscrub status\fP [-d] {\fI\fP|\fI\fP} .PP \fBbtrfs\fP \fBhelp|\-\-help|\-h \fP\fI\fP .PP @@ -225,6 +233,60 @@ Finally, if \fB--all-devices\fP is passed, all the devices under /dev are scanned. .TP +\fBscrub start\fP [-Bdqru] {\fI\fP|\fI\fP} +Start a scrub on all devices of the filesystem identified by \fI\fR or on +a single \fI\fR. Without options, scrub is started as a background +process. Progress can be obtained with the \fBscrub status\fR command. Scrubbing +involves reading all data from all disks and verifying checksums. Errors are +corrected along the way if possible. +.RS + +\fIOptions\fR +.IP -B 5 +Do not background and print scrub statistics when finished. +.IP -d 5 +Print separate statistics for each device of the filesystem (-B only). +.IP -q 5 +Quiet. Omit error messages and statistics. +.IP -r 5 +Read only mode. Do not attempt to correct anything. +.IP -u 5 +Scrub unused space as well. (NOT IMPLEMENTED) +.RE +.TP + +\fBscrub cancel\fP {\fI\fP|\fI\fP} +If a scrub is running on the filesystem identified by \fI\fR, cancel it. +Progress is saved in the scrub progress file and scrubbing can be resumed later +using the \fBscrub resume\fR command. +If a \fI\fR is given, the corresponding filesystem is found and +\fBscrub cancel\fP behaves as if it was called on that filesystem. 
+.TP + +\fBscrub resume\fP [-Bdqru] {\fI\fP|\fI\fP} +Resume a canceled or interrupted scrub cycle on the filesystem identified by +\fI\fR or on a given \fI\fR. Does not start a new scrub if the +last scrub finished successfully. +.RS + +\fIOptions\fR +.TP +see \fBscrub start\fP. +.RE +.TP + +\fBscrub status\fP [-d] {\fI\fP|\fI\fP} +Show status of a running scrub for the filesystem identified by \fI\fR or +for the specified \fI\fR. +If no scrub is running, show statistics of the last finished or canceled scrub +for that filesystem or device. +.RS + +\fIOptions\fR +.IP -d 5 +Print separate statistics for each device of the filesystem. +.RE + .PP \fBbalance progress\fP [\fB-m\fP|\fB--monitor\fP] \fI\fP -- 1.7.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 1/5] btrfs-progs: commands added
- scrub commands added - open_file_or_dir no longer static (needed by scrub.c) Signed-off-by: Jan Schmidt --- Makefile |4 ++-- btrfs.c | 23 +++ btrfs_cmds.c |2 +- btrfs_cmds.h |5 + 4 files changed, 31 insertions(+), 3 deletions(-) diff --git a/Makefile b/Makefile index 22bae25..edee1a0 100644 --- a/Makefile +++ b/Makefile @@ -35,8 +35,8 @@ all: version $(progs) manpages version: bash version.sh -btrfs: $(objects) btrfs.o btrfs_cmds.o - $(CC) -lpthread $(CFLAGS) -o btrfs btrfs.o btrfs_cmds.o \ +btrfs: $(objects) btrfs.o btrfs_cmds.o scrub.o + $(CC) -lpthread $(CFLAGS) -o btrfs btrfs.o btrfs_cmds.o scrub.o \ $(objects) $(LDFLAGS) $(LIBS) btrfsctl: $(objects) btrfsctl.o diff --git a/btrfs.c b/btrfs.c index ee4e756..f494738 100644 --- a/btrfs.c +++ b/btrfs.c @@ -142,6 +142,29 @@ static struct Command commands[] = { "balance cancel", "\n" "Cancel the balance operation running on ." }, + { do_scrub_start, -1, + "scrub start", "[-Bdqr] |\n" + "Start a new scrub.", + "\n-B do not background\n" + "-d stats per device (-B only)\n" + "-q quiet\n" + "-r read only mode\n" + }, + { do_scrub_cancel, 1, + "scrub cancel", "|\n" + "Cancel a running scrub.", + NULL + }, + { do_scrub_resume, -1, + "scrub resume", "[-Bdqr] |\n" + "Resume previously canceled or interrupted scrub.", + NULL + }, + { do_scrub_status, -1, + "scrub status", "[-d] |\n" + "Show status of running or finished scrub.", + NULL + }, { do_scan, 999, "device scan", "[--all-devices| [ ...]]\n" "Scan all device for or the passed device for a btrfs\n" diff --git a/btrfs_cmds.c b/btrfs_cmds.c index 43ffc38..0612f34 100644 --- a/btrfs_cmds.c +++ b/btrfs_cmds.c @@ -92,7 +92,7 @@ static int test_isdir(char *path) } -static int open_file_or_dir(const char *fname) +int open_file_or_dir(const char *fname) { int ret; struct stat st; diff --git a/btrfs_cmds.h b/btrfs_cmds.h index 067b53f..83faa5b 100644 --- a/btrfs_cmds.h +++ b/btrfs_cmds.h @@ -23,6 +23,10 @@ int do_defrag(int argc, char **argv); int do_show_filesystem(int 
nargs, char **argv); int do_add_volume(int nargs, char **args); int do_balance(int nargs, char **argv); +int do_scrub_start(int nargs, char **argv); +int do_scrub_status(int argc, char **argv); +int do_scrub_resume(int argc, char **argv); +int do_scrub_cancel(int nargs, char **argv); int do_balance_progress(int nargs, char **argv); int do_balance_cancel(int nargs, char **argv); int do_remove_volume(int nargs, char **args); @@ -35,3 +39,4 @@ int do_df_filesystem(int nargs, char **argv); int find_updated_files(int fd, u64 root_id, u64 oldest_gen); int do_find_newer(int argc, char **argv); int do_change_label(int argc, char **argv); +int open_file_or_dir(const char *fname); -- 1.7.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 0/5] btrfs-progs: scrub interface
This is the next patch series for scrub userland tools.

Change log v1->v2:
- commands now reachable as "btrfs scrub ..." instead of "btrfs filesystem scrub ..."
- ability to scrub a single device instead of a whole file system
- superfluous command line options removed
- resume is now a separate command ("scrub resume") instead of "scrub start -r"
- read-only mode (which inherited the -r option immediately, sorry for that)
- up to date progress numbers with "btrfs scrub status" while scrub is running
- effective locking to protect against multiple scrubs on a filesystem
- man page entry for scrub added

Change log v2->v3:
- unverified_errors counter added
- return code of utility now depends on detected disk errors
- bail out when check_mounted_where returns an error
- changes as suggested by Hugo Mills' review. incomplete list:
  - style (checkpatch is happy now)
  - pthread_* error handling
  - use /var/lib/btrfs instead of /var/btrfs for storing history

Attention: This version may be useful for Hugo, only. It is meant to be included in his current integration branch. Also, I wanted to avoid patching the patches previously sent and rather deliver a fresh version. Therefore, I used "integration-20110705" from http://git.darksatanic.net/repo/btrfs-progs-unstable.git/ as a base, reverted the scrub patches in the middle and built upon that.

-Jan

Jan Schmidt (5):
  btrfs-progs: commands added
  btrfs-progs: scrub ioctls
  btrfs-progs: added check_mounted_where
  btrfs-progs: scrub userland implementation
  btrfs-progs: scrub added to manpage

 Makefile       |    4 +-
 btrfs.c        |   23 +
 btrfs_cmds.c   |    2 +-
 btrfs_cmds.h   |    5 +
 ctree.h        |    2 +-
 ioctl.h        |   54 ++-
 man/btrfs.8.in |   64 +++-
 scrub.c        | 1666
 utils.c        |   29 +-
 utils.h        |    2 +
 10 files changed, 1838 insertions(+), 13 deletions(-)
 create mode 100644 scrub.c

-- 
1.7.3.4
[PATCH] Btrfs:don't check the return value of __btrfs_add_inode_defrag
Don't need to check the return value of __btrfs_add_inode_defrag(), since it will always return 0.

Signed-off-by: Wanlong Gao
---
 fs/btrfs/file.c |   11 +--
 1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index fa4ef18..4a6d190 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -74,7 +74,7 @@ struct inode_defrag {
  * If an existing record is found the defrag item you
  * pass in is freed
  */
-static int __btrfs_add_inode_defrag(struct inode *inode,
+static void __btrfs_add_inode_defrag(struct inode *inode,
 				    struct inode_defrag *defrag)
 {
 	struct btrfs_root *root = BTRFS_I(inode)->root;
@@ -106,11 +106,11 @@ static int __btrfs_add_inode_defrag(struct inode *inode,
 	BTRFS_I(inode)->in_defrag = 1;
 	rb_link_node(&defrag->rb_node, parent, p);
 	rb_insert_color(&defrag->rb_node, &root->fs_info->defrag_inodes);
-	return 0;
+	return;

 exists:
 	kfree(defrag);
-	return 0;
+	return;

 }

@@ -123,7 +123,6 @@ int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,
 {
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct inode_defrag *defrag;
-	int ret = 0;
 	u64 transid;

 	if (!btrfs_test_opt(root, AUTO_DEFRAG))
@@ -150,9 +149,9 @@ int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,

 	spin_lock(&root->fs_info->defrag_inodes_lock);
 	if (!BTRFS_I(inode)->in_defrag)
-		ret = __btrfs_add_inode_defrag(inode, defrag);
+		__btrfs_add_inode_defrag(inode, defrag);
 	spin_unlock(&root->fs_info->defrag_inodes_lock);
-	return ret;
+	return 0;
 }

 /*
-- 
1.7.4.1
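The shape of this change, an insert helper that either links the new record or frees it as a duplicate and cannot fail either way, can be illustrated in plain userspace C. This is only a sketch under stated assumptions: struct defrag_rec, add_defrag_rec(), new_defrag_rec() and count_defrag_recs() are hypothetical names, and a sorted singly linked list stands in for the kernel's red-black tree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct inode_defrag, keyed by inode number. */
struct defrag_rec {
	unsigned long long ino;
	struct defrag_rec *next;
};

/*
 * Insert rec into the list in ascending ino order. If a record with
 * the same ino already exists, rec is freed instead. Neither outcome
 * is an error, so there is nothing useful to return: hence void.
 */
static void add_defrag_rec(struct defrag_rec **head, struct defrag_rec *rec)
{
	struct defrag_rec **p = head;

	if (!rec)
		return;

	while (*p && (*p)->ino < rec->ino)
		p = &(*p)->next;

	if (*p && (*p)->ino == rec->ino) {
		free(rec);	/* duplicate: drop the new record */
		return;
	}

	rec->next = *p;
	*p = rec;
}

/* Allocate a record for ino; returns NULL if allocation fails. */
static struct defrag_rec *new_defrag_rec(unsigned long long ino)
{
	struct defrag_rec *rec = calloc(1, sizeof(*rec));

	if (rec)
		rec->ino = ino;
	return rec;
}

/* Count the records currently linked into the list. */
static int count_defrag_recs(const struct defrag_rec *head)
{
	int n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}
```

With a void helper the caller has no return value to propagate, which matches the patch making btrfs_add_inode_defrag() return 0 directly.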
Re: write(2) taking 4s
2011-07-17 10:17:37 +0100, Stephane Chazelas: > 2011-07-16 13:12:10 +0100, Stephane Chazelas: > > Still on my btrfs-based backup system. I still see one BUG() > > reached in btrfs-fixup per boot time, no memory exhaustion > > anymore. There is now however something new: write performance > > is down to a few bytes per second. > [...] > > The condition that was causing that seems to have cleared by > itself this morning before 4am. > > flush-btrfs-1 and sync are still in D state. > > Can't really tell what cleared it. Could be when the first of > the rsyncs ended as all the other ones (and ntfsclones from nbd > devices) ended soon after [...] New nightly backup, and it's happening again. Started about 40 minutes after the start of the backup. system -net/total- ---procs--- --dsk/sda-dsk/sdb-dsk/sdc-- time | recv send|run blk new| read writ: read writ: read writ 17-07 20:19:18| 0 0 |0.0 0.0 0.0| 142k 31k: 119k 36k: 120k 33k 17-07 20:19:48|8087k 224k|1.2 5.3 0.1|2976k 98k: 793k 400k:2856k 375k 17-07 20:20:18|5174k 134k|0.8 4.6 0.9| 880k 179k: 830k 916k:1801k 825k 17-07 20:20:48|6634k 148k|1.3 4.9 0.2| 609k 101k:1259k 96k:2628k 98k 17-07 20:21:18|6725k 165k|0.7 5.8 0.0| 237k 442k: 975k 723k:1870k 644k 17-07 20:21:48|7100k 153k|0.7 5.4 0| 305k 83k:1124k 314k:2155k 274k 17-07 20:22:18|4440k 178k|0.5 5.3 0.0| 296k 1775B:2094k 240k:1663k 239k 17-07 20:22:48|8181k 220k|0.9 5.8 0| 360k 410B:1579k 196k:2065k 196k 17-07 20:23:18|8144k 228k|1.3 5.6 0| 348k 54k:1781k 216k:2213k 164k 17-07 20:23:48|5506k 185k|0.8 5.2 0.1| 307k0 :2040k0 :2166k0 17-07 20:24:18|6260k 206k|1.0 5.4 0.1| 474k 78k:2034k 285k:2218k 207k 17-07 20:24:48|8420k 314k|1.5 5.4 0| 313k 363k:2367k 391k:2182k 124k 17-07 20:25:18|8367k 247k|0.9 5.1 0.2| 475k 77k:1797k 75k:2220k 410B 17-07 20:25:48|7511k 179k|1.0 4.7 0| 406k 7646B:1596k 145k:2397k 147k 17-07 20:26:18|7930k 162k|0.7 5.1 0| 991k 410B:1468k 26k:2186k 26k 17-07 20:26:48|7757k 176k|1.0 5.3 0|1884k 26k:1147k 58k:2761k 32k [...] 
17-07 20:57:18|6917k 120k|0.3 4.1 0| 56k 410B: 65k 4506B: 213k 4506B 17-07 20:57:48|5698k 103k|0.1 4.0 0| 0 410B: 27k 6007B: 590k 6007B 17-07 20:58:18|6582k 117k|0.2 4.0 0| 229k 20k: 195k 956B: 290k 21k 17-07 20:58:48|6048k 110k|0.6 4.0 0.1| 32k 21k: 81k 410B: 331k 21k 17-07 20:59:18|8057k 138k|0.6 4.1 0| 42k 5871B: 33k 410B: 35k 5871B 17-07 20:59:48|7369k 145k|0.5 4.1 0| 59k 3959B: 230k 410B: 532k 3959B 17-07 21:00:18|8189k 140k|0.7 4.0 0| 53k 6007B: 58k 410B: 40k 6007B 17-07 21:00:48|7596k 137k|0.3 4.2 0| 24k 6690B: 250k 410B: 15k 5734B 17-07 21:01:18|8448k 145k|0.7 4.2 0| 24k 1365B: 325k 6827B: 15k 7646B 17-07 21:01:48|6821k 119k|0.3 4.0 0| 17k 410B: 175k 3004B: 11k 3004B 17-07 21:02:18|3614k 66k|0.7 2.7 0| 39k 410B: 538k 4779B: 45k 4779B 17-07 21:02:48| 417k 14k|0.5 1.3 0.3| 106k 1638B: 209k 4779B: 0 4779B 17-07 21:03:18| 353k 7979B|0.8 1.2 0| 0 1229B: 449k 2867B: 0 2867B 17-07 21:03:48| 327k 8981B|1.1 1.2 0| 0 410B: 686k 4506B: 43k 4506B [...] 18-07 11:02:48| 243k 4866B|0.0 1.2 0.1| 0 2458B: 0 3550B: 0 3550B 18-07 11:03:18| 274k 5506B|0.1 1.2 0.1| 0 1775B: 0 3550B: 0 3550B 18-07 11:03:48| 238k 4851B|0.1 1.2 0.0| 0 4369B: 0 3550B: 0 3550B 18-07 11:04:18| 243k 4999B|0.1 1.1 0.1| 0 4506B: 0 3550B: 0 3550B 18-07 11:04:48| 288k 6488B|0.1 1.1 0.4| 0 2458B: 0 3550B: 0 3550B Because that's after the week-end, there's not much to write. What's holding 3 of the backups is actually writing log data like "xx% Completed". Actively running at the moment are 1 rsync and 3 ntfsclone. # strace -tt -s 2 -Te write -p 8771 -p 8567 -p 8856 -p 8403 Process 8771 attached - interrupt to quit Process 8567 attached - interrupt to quit Process 8856 attached - interrupt to quit Process 8403 attached - interrupt to quit [pid 8403] 11:12:26.539830 write(4, "es"..., 1024 [pid 8771] 11:12:26.540417 write(4, "hb"..., 4096 [pid 8567] 11:12:26.555211 write(1, " 3"..., 25 [pid 8856] 11:12:26.593232 write(1, " 6"..., 25 [pid 8403] 11:12:30.635257 <... 
write resumed> ) = 1024 <4.095271> [pid 8403] 11:12:30.635309 write(4, "19"..., 112 [pid 8567] 11:12:30.635364 <... write resumed> ) = 25 <4.080091> [pid 8856] 11:12:30.635553 <... write resumed> ) = 25 <4.042268> [pid 8771] 11:12:30.635799 <... write resumed> ) = 4096 <4.095350> [pid 8771] 11:12:30.636182 write(4, "hb"..., 4096 [pid 8567] 11:12:30.649904 write(1, " 3"..., 25 [pid 8403] 11:12:30.651452 <... write resumed> ) = 112 <0.015921> [pid 8567] 11:12:30.651595 <... write resumed> ) = 25 <0.001640> [pid 8403] 11:12:30.651787 write(4, "@d"..., 1024 [pid 8771] 11:12:30.651865 <... write resumed> ) = 4096 <0.015638> [pid 8771] 11:12:30.652281 write(4, "hb"..., 4096 [pid 8856] 11:12:30.657579 write(1, " 6"..., 25 [pid 8567] 11:12:30.691113 write(1, "
Re: Broken btrfs?
On 17.07.2011 16:01, Jan Schubert wrote:
> Jan Schubert writes:
>> Please find some data and log below. Is there any chance to fix this?
>
> After playing around (incl. deleting the log) I get the strong feeling
> it has something to do with compression=lzo. Dunno why it started suddenly
> but I disabled compression and did reinstall everything which helped a lot.
> I still have some broken configuration and other (non reinstallable) files
> which crash the box when I try to access them. I detect them
> manually, is there any way to do this automagically?

If you are on a 3.0 kernel, get the most current version of btrfs tools from Hugo's integration-20110705 branch at http://git.darksatanic.net/repo/btrfs-progs-unstable.git/ and do a scrub.

-Jan
[PATCH] btrfs-progs-unstable: replace debug-tree to btrfs-debug-tree in INSTALL
From c04da1655df6d75db834ddbd3a3b4a58a0d9a0c9 Mon Sep 17 00:00:00 2001
From: Wang Sheng-Hui
Date: Mon, 18 Jul 2011 02:17:31 -0500
Subject: [PATCH] btrfs-progs-unstable: replace debug-tree to btrfs-debug-tree in INSTALL

debug-tree doesn't exist after btrfs-progs installed. Use btrfs-debug-tree to print FS metadata in text form, not debug-tree.

Signed-off-by: Wang Sheng-Hui
---
 INSTALL |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/INSTALL b/INSTALL
index 16b45a5..2db6d27 100644
--- a/INSTALL
+++ b/INSTALL
@@ -42,7 +42,7 @@ btrfsctl: control program to create snapshots and subvolumes:
 btrfsck: do a limited check of the FS extent trees.

-debug-tree: print all of the FS metadata in text form. Example:
+btrfs-debug-tree: print all of the FS metadata in text form. Example:

-	debug-tree /dev/sda2 >& big_output_file
+	btrfs-debug-tree /dev/sda2 >& big_output_file

-- 
1.7.1
Re: [PATCH 3.0-rc7] btrfs: Update git repository links for btrfs utilities in Documentation/filesystems/btrfs.txt
On 2011-07-18 14:06, Wanlong Gao wrote:
> On 07/18/2011 12:53 PM, Wang Sheng-Hui wrote:
>> The patch is against 3.0-rc7 kernel.
>>
>>> From d22497ac8c5dd55a2ef9a47de5f2ddee55f8ec50 Mon Sep 17 00:00:00 2001
>> From: Wang Sheng-Hui
>> Date: Sun, 17 Jul 2011 21:45:01 -0500
>> Subject: [PATCH 3.0-rc7] btrfs: Update git repository links for btrfs
>> utilities in Documentation/filesystems/btrfs.txt
>>
>> git repository link for btrfs utilities
>> http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git
>> doesn't work, and git-clone can get failed as:
>> $ git clone
>> http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git
>> Initialized empty Git repository in
>> /home/crossover/dev/btrfs-progs-unstable/.git/
>> fatal:
>> http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git/info/refs
>> not found: did you run git update-server-info on the server?
>>
>> Update git repository links for btrfs utilities to the latest ones
>> in Documentation/filesystems/btrfs.txt
>>
>> Signed-off-by: Wang Sheng-Hui
>> ---
>> Documentation/filesystems/btrfs.txt |3 ++-
>> 1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/Documentation/filesystems/btrfs.txt
>> b/Documentation/filesystems/btrfs.txt
>> index 64087c3..b095261 100644
>> --- a/Documentation/filesystems/btrfs.txt
>> +++ b/Documentation/filesystems/btrfs.txt
>> @@ -63,8 +63,9 @@ IRC network.
>> Userspace tools for creating and manipulating Btrfs file systems are
>> available from the git repository at the following location:
>>
>> - http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git
> It's the right git-web URL.
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git
>> +
>> http://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git
>> +
>> https://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git
>>
>> These include the following tools:
>>
>
>

I reran git clone on my 2 boxes, and both failed:

$ git clone http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git
Initialized empty Git repository in /home/crossover/btrfs-progs-unstable/.git/
fatal: http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs-unstable.git/info/refs not found: did you run git update-server-info on the server?

But I can open the link in a browser. My git version is:

$ git version
git version 1.7.1

However, I think we should update the links to the latest ones.