[PATCH] Btrfs-progs: fix segmentation fault of 'btrfs-debug-tree -e'
Due to some historical reasons, the 'printing leaf' part was removed, which leads to a segmentation fault in 'btrfs-debug-tree -e'. This patch adds it back.

Signed-off-by: Liu Bo <bo.li@oracle.com>
---
 debug-tree.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/debug-tree.c b/debug-tree.c
index f6bd5d8..02b0389 100644
--- a/debug-tree.c
+++ b/debug-tree.c
@@ -52,6 +52,11 @@ static void print_extents(struct btrfs_root *root, struct extent_buffer *eb)
 	if (!eb)
 		return;
 
+	if (btrfs_is_leaf(eb)) {
+		btrfs_print_leaf(root, eb);
+		return;
+	}
+
 	size = btrfs_level_size(root, btrfs_header_level(eb) - 1);
 	nr = btrfs_header_nritems(eb);
 	for (i = 0; i < nr; i++) {
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
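
[Editor's sketch] The fix above presumably matters because the walker, without the leaf check, goes on to compute "level - 1" and treat every item as a child block pointer, which is not valid for a leaf. The following is a minimal stand-alone model of that guard, not btrfs-progs code; struct toy_node and the toy_* functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an extent buffer: level 0 means "leaf". */
struct toy_node {
	int level;
	int nritems;
	struct toy_node *children[4];	/* only meaningful when level > 0 */
	int visited;			/* set when the leaf is "printed" */
};

int toy_is_leaf(const struct toy_node *n)
{
	return n->level == 0;
}

/* Walk the tree and return the number of leaves reached. */
int toy_print_extents(struct toy_node *n)
{
	int i, leaves = 0;

	if (!n)
		return 0;

	/* The guard restored by the patch: handle leaves and stop here. */
	if (toy_is_leaf(n)) {
		n->visited = 1;
		return 1;
	}

	/* From here on the node is internal, so level - 1 is well defined. */
	assert(n->level > 0);
	for (i = 0; i < n->nritems; i++)
		leaves += toy_print_extents(n->children[i]);
	return leaves;
}
```

Calling toy_print_extents() on a one-level tree with two leaves visits both leaves and never dereferences a child pointer of a leaf.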
btrfs send receive produces Too many open files in system
I believe what I am about to write is a bug report.

I finally ran

# btrfs send -v /mnt/adama-docs/backups/20130101-192722 | btrfs receive /mnt/tmp/backups

to migrate btrfs from one partition layout to another. After a while the system keeps saying "Too many open files in system" and denies access to almost every command-line tool. While I still had access to iostat, I confirmed the expected pattern of disk activity (i.e. reads from all devices that make up /mnt/adama-docs, and writes to all devices that make up /mnt/tmp). Now that the system is almost unusable, the HDD LEDs are still blinking in the same pattern as they did when I confirmed the disk activity. When I canceled the send/receive process, everything went back to normal.

I use Ubuntu Quantal with the latest 3.7.8 kernel and the latest btrfs tools (v0.20-rc1) downloaded from git.

The btrfs filesystem /mnt/adama-docs sits on top of an lvm2 logical volume, which sits on top of a cryptsetup LUKS device, which in turn sits on top of an mdadm RAID-6 spanning a partition on each of 4 hard drives (I know that it is a sub-optimal setup). backups/20130101-192722 is a read-only snapshot which I estimate contains ca. 100GB of data.

/mnt/tmp/backups is a btrfs multi-device raid10 filesystem based on 4 cryptsetup LUKS devices, each living on a separate partition on the same 4 physical hard drives that ultimately make up /mnt/adama-docs.

Both btrfs filesystems are mounted with -o compress, and /mnt/adama-docs is also mounted with noatime.

I suspect that it may be some type of race condition, because my setup is so highly inefficient (I get only about 8 MB/s read, and the same write speed, from each of the 4 hard drives).

The problem is perfectly reproducible on my setup. I'm ready to assist with whatever info you need to troubleshoot this problem.
-- 
Adam Ryczkowski
+48505919892
Skype:sisteczko
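
[Editor's sketch] "Too many open files in system" is the message for ENFILE, the system-wide file table limit, as opposed to the per-process EMFILE. On Linux, /proc/sys/fs/file-nr reports three numbers: allocated file handles, allocated but unused handles, and the maximum. A small helper like the one below could be left running in another terminal to watch the count grow during send/receive. The struct and function names are made up for this illustration and are not part of any btrfs tool.

```c
#include <assert.h>
#include <stdio.h>

/* The three fields of /proc/sys/fs/file-nr, in file order. */
struct file_nr {
	unsigned long allocated;
	unsigned long unused;
	unsigned long max;
};

/* Parse one "allocated unused max" line. Returns 0 on success, -1 on error. */
int parse_file_nr(const char *line, struct file_nr *out)
{
	assert(line && out);
	if (sscanf(line, "%lu %lu %lu",
		   &out->allocated, &out->unused, &out->max) != 3)
		return -1;
	return 0;
}

/* Read the live counters. Returns 0 on success, -1 on any failure. */
int read_file_nr(struct file_nr *out)
{
	char buf[128];
	FILE *f = fopen("/proc/sys/fs/file-nr", "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return parse_file_nr(buf, out);
}
```

If "allocated" climbs steadily toward "max" while the transfer runs, that would point at handles being opened and not released, rather than at a limit that is simply set too low.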
Re: [PATCH] Btrfs: fix the deadlock between the transaction attach and commit
(Sorry for the late reply, I was on vacation for the Spring Festival last week.)

On Tue, 12 Feb 2013 13:56:32 +0100, David Sterba wrote:
On Mon, Feb 11, 2013 at 03:35:37PM -0500, Josef Bacik wrote:

or something like that. Me and kdave reproduced by running 274 in a loop, it happened pretty quick. I'd fix it myself but I have to leave my house for people to come look at it. If you haven't fixed this by tomorrow I'll fix it up. Thanks,

I found 224 stuck with this [SNIP] mounted with noatime,space_cache

Thanks for your test. My test run skipped case 274 because it always fails, and all the other cases passed, so I didn't hit this problem.

Anyway, very sorry for my stupid patch. (I have reviewed Josef's fix patch and commented on it, please see the reply to that patch.)

Thanks
Miao
Re: [btrfs-progs] testing btrfs hierarchical quotas
On Mon, Feb 18, 2013 at 10:44:06AM +0530, Hemanth Kumar wrote:
On Sat, Feb 16, 2013 at 2:29 AM, Hugo Mills h...@carfax.org.uk wrote:

Here's a question -- what are you testing? (Not just here, but in general, with your test infrastructure) There are (at least) four classes of tests you could be doing:

1) Unit tests, which test individual functions within the code and ensure they do what they're meant to do.
2) Integration tests, which test the full end-to-end system.
3) Partial integration tests, which exercise the kernel filesystem code.
4) Partial integration tests, which exercise the userspace tools code.

Now, clearly you're not doing (1) here. It's going to be hard to separate (2) from (3) and (4), but it's possible to write your tests to do more of one or of the other. (*)

I am trying to write a script to test the quota subsystem (qgroups) and hierarchical quotas, as suggested by Arne Jansen. Since I am trying to write a script to test a particular feature, I guess it falls under the unit testing category.

No, unit testing would typically be testing one individual function in the code, independently of the rest of the code-base, e.g. a battery of very small simple tests which verify that the device_size() function in utils.c returns the correct value in all cases.

When you say "test a particular feature", you haven't distinguished between testing the *kernel* feature (i.e. does the kernel behave correctly?) and the *userspace* feature (i.e. does the userspace tool make all of the checks that it should do, tell the kernel to do the right thing, and return useful information when something fails?). It's hard to separate these two fully without effectively reimplementing some part of the other side, but the decision either way will make a difference as to the set of tests you end up implementing.

xfstests clearly is much more geared to (3), and stresses the kernel filesystem implementation rather than the userspace tools.
If you want to test the implementation of qgroups, it belongs in xfstests. If you want to test the userspace code, you need to make sure that (over all your tests) you cover every command-line option, and every different way of using the tool, and ensure that it does the right things.

What you've written in this patch seems to be more about testing the kernel behaviour than the userspace tools, but it'd be good if you can put your work into the context I've just talked about above. More comments below...

On Fri, Feb 15, 2013 at 06:35:41PM +0530, Hemanth Kumar wrote:

Signed-off-by: Hemanth Kumar hemanthkuma...@gmail.com
---
 hq.sh | 33 +
 1 file changed, 33 insertions(+)
 create mode 100644 hq.sh

diff --git a/hq.sh b/hq.sh
new file mode 100644
index 000..6a0a820
--- /dev/null
+++ b/hq.sh

Rather cryptic filename here. If this is to be applied to btrfs-progs, I'd recommend putting all your test scripts in a test subdir, and a test target in the Makefile that invokes the tests.

Can you elaborate on this part a bit more.

Ignore my comment. As Dave Chinner pointed out, this is best integrated into xfstests.

[snip]

You may want to take a look at my earlier work on this, at: http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg13153.html That should at least give you a basic infrastructure to work in.

[snip]

Thank you, I will take a look at your script and continue my work.

Again, don't bother -- Dave was right, and my assumptions about what xfstests was actually doing were wrong. Use xfstests instead.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- We believe in free will because we have no choice. ---
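
[Editor's sketch] To make the "unit test" category from this thread concrete: a unit test exercises one function in isolation with a battery of small checks, with no filesystem or kernel involved. The function below, toy_parse_size(), is a hypothetical helper written only for this example (it is not the device_size() mentioned above, nor any real btrfs-progs function), followed by the kind of checks a unit test would run against it.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical function under test: parse "123", "4k", "2M" or "1G"
 * into a byte count. Returns -1 on malformed input. Overflow is not
 * handled; this is only a sketch.
 */
long long toy_parse_size(const char *s)
{
	char *end;
	long long v;

	if (!s || !isdigit((unsigned char)*s))
		return -1;

	v = strtoll(s, &end, 10);
	switch (*end) {
	case 'K': case 'k': v <<= 10; end++; break;
	case 'M': case 'm': v <<= 20; end++; break;
	case 'G': case 'g': v <<= 30; end++; break;
	case '\0': break;
	default: return -1;
	}
	/* Anything after the optional suffix is an error. */
	return *end == '\0' ? v : -1;
}

/* The unit test itself: many tiny, independent checks of one function. */
void toy_parse_size_tests(void)
{
	assert(toy_parse_size("0") == 0);
	assert(toy_parse_size("512") == 512);
	assert(toy_parse_size("4k") == 4096);
	assert(toy_parse_size("2M") == 2097152);
	assert(toy_parse_size("1G") == 1073741824LL);
	assert(toy_parse_size("") == -1);
	assert(toy_parse_size("-1") == -1);
	assert(toy_parse_size("12x") == -1);
	assert(toy_parse_size("4kb") == -1);
	assert(toy_parse_size(NULL) == -1);
}
```

By contrast, a script that creates subvolumes and checks qgroup accounting goes through the tool, the ioctl interface and the kernel together, which is why it falls under the integration classes (2) to (4).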
Re: [PATCH] Btrfs: place ordered operations on a per transaction list
On Wed, 13 Feb 2013 11:13:22 -0500, Josef Bacik wrote:

Miao made the ordered operations stuff run async, which introduced a deadlock where we could get somebody (sync) racing in and committing the transaction while a commit was already happening. The new committer would try and flush ordered operations which would hang waiting for the commit to finish because it is done asynchronously and no longer inherits the callers trans handle. To fix this we need to make the ordered operations list a per transaction list. We can get new inodes added to the ordered operation list by truncating them and then having another process writing to them, so this makes it so that anybody trying to add an ordered operation _must_ start a transaction in order to add itself to the list, which will keep new inodes from getting added to the ordered operations list after we start committing. This should fix the deadlock and also keeps us from doing a lot more work than we need to during commit. Thanks,

Firstly, thanks for dealing with the bug that was introduced by my patch.

But compared with this fix, I prefer the following one because:
- we won't have to worry about similar problems if we add more work during commit in the future.
- it is unnecessary to get a new handle and commit it if the transaction is already being committed.

Thanks
Miao

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index fc03aa6..c449cb5 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -277,7 +277,8 @@ static void wait_current_trans(struct btrfs_root *root)
 	}
 }
 
-static int may_wait_transaction(struct btrfs_root *root, int type)
+static int may_wait_transaction(struct btrfs_root *root, int type,
+				bool is_joined)
 {
 	if (root->fs_info->log_root_recovering)
 		return 0;
@@ -285,6 +286,14 @@ static int may_wait_transaction(struct btrfs_root *root, int type)
 	if (type == TRANS_USERSPACE)
 		return 1;
 
+	/*
+	 * If we are ATTACH, it means we just want to catch the current
+	 * transaction and commit it. So if someone is committing the
+	 * current transaction now, it is very glad to wait it.
+	 */
+	if (is_joined && type == TRANS_ATTACH)
+		return 1;
+
 	if (type == TRANS_START &&
 	    !atomic_read(&root->fs_info->open_ioctl_trans))
 		return 1;
@@ -355,7 +364,7 @@ again:
 	if (type < TRANS_JOIN_NOLOCK)
 		sb_start_intwrite(root->fs_info->sb);
 
-	if (may_wait_transaction(root, type))
+	if (may_wait_transaction(root, type, false))
 		wait_current_trans(root);
 
 	do {
@@ -383,16 +392,26 @@ again:
 	h->block_rsv = NULL;
 	h->orig_rsv = NULL;
 	h->aborted = 0;
-	h->qgroup_reserved = qgroup_reserved;
+	h->qgroup_reserved = 0;
 	h->delayed_ref_elem.seq = 0;
 	h->type = type;
 	INIT_LIST_HEAD(&h->qgroup_ref_list);
 	INIT_LIST_HEAD(&h->new_bgs);
 
 	smp_mb();
-	if (cur_trans->blocked && may_wait_transaction(root, type)) {
-		btrfs_commit_transaction(h, root);
-		goto again;
+	if (cur_trans->blocked && may_wait_transaction(root, type, true)) {
+		if (cur_trans->in_commit) {
+			btrfs_end_transaction(h, root);
+			wait_current_trans(root);
+		} else {
+			btrfs_commit_transaction(h, root);
+		}
+		if (unlikely(type == TRANS_ATTACH)) {
+			ret = -ENOENT;
+			goto alloc_fail;
+		} else {
+			goto again;
+		}
 	}
 
 	if (num_bytes) {
@@ -401,6 +420,7 @@ again:
 		h->block_rsv = &root->fs_info->trans_block_rsv;
 		h->bytes_reserved = num_bytes;
 	}
+	h->qgroup_reserved = qgroup_reserved;
 
 got_it:
 	btrfs_record_root_in_trans(h, root);
-- 
1.6.5.2

Signed-off-by: Josef Bacik jba...@fusionio.com
---
 fs/btrfs/ctree.h        |    7 ---
 fs/btrfs/disk-io.c      |   11 ++-
 fs/btrfs/file.c         |   15 ++-
 fs/btrfs/ordered-data.c |   13 -
 fs/btrfs/ordered-data.h |    3 ++-
 fs/btrfs/transaction.c  |    5 +++--
 fs/btrfs/transaction.h  |    1 +
 7 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 0c4e4df..9f72ec8 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1408,13 +1408,6 @@ struct btrfs_fs_info {
 	struct list_head delalloc_inodes;
 
 	/*
-	 * special rename and truncate targets that must be on disk before
-	 * we're allowed to commit.  This is basically the ext3 style
-	 * data=ordered list.
-	 */
-	struct list_head ordered_operations;
-
-	/*
 	 * there is a pool of worker threads for checksumming during writes
 	 * and a pool for checksumming
Re: Rebalancing RAID1
On Fri, 15 Feb 2013 22:56:19 +0100 (CET), Fredrik Tolf wrote:

The oops cut can be found here: http://www.dolda2000.com/~fredrik/tmp/btrfs-oops

This scrub issue is fixed since Linux 3.8-rc1 with commit 4ded4f6 "Btrfs: fix BUG() in scrub when first superblock reading gives EIO".
Re: [PATCH] Btrfs: place ordered operations on a per transaction list
On Mon, Feb 18, 2013 at 04:22:09AM -0700, Miao Xie wrote:
On Wed, 13 Feb 2013 11:13:22 -0500, Josef Bacik wrote:

Miao made the ordered operations stuff run async, which introduced a deadlock where we could get somebody (sync) racing in and committing the transaction while a commit was already happening. The new committer would try and flush ordered operations which would hang waiting for the commit to finish because it is done asynchronously and no longer inherits the callers trans handle. To fix this we need to make the ordered operations list a per transaction list. We can get new inodes added to the ordered operation list by truncating them and then having another process writing to them, so this makes it so that anybody trying to add an ordered operation _must_ start a transaction in order to add itself to the list, which will keep new inodes from getting added to the ordered operations list after we start committing. This should fix the deadlock and also keeps us from doing a lot more work than we need to during commit. Thanks,

Firstly, thanks for dealing with the bug that was introduced by my patch.

But compared with this fix, I prefer the following one because:
- we won't have to worry about similar problems if we add more work during commit in the future.
- it is unnecessary to get a new handle and commit it if the transaction is already being committed.

Mine has the benefit of not making a committing transaction flush more stuff than it needs to, so I think I'll keep mine as well, but I agree yours is good for the attach case as well. So can you send this along properly with a signed off and such and we can have our cake and eat it too. Thanks,

Josef
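
[Editor's sketch] The decision logic in Miao's proposed change can be modelled outside the kernel to see what the extra is_joined argument does: an ATTACH caller does not wait before joining, but once it has joined a blocked transaction it backs off and gets -ENOENT instead of retrying. Everything below is a toy model under that reading of the diff; the toy_* names and enum values are invented and do not match the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative transaction types; values are not the kernel's. */
enum toy_trans_type {
	TOY_TRANS_START,
	TOY_TRANS_JOIN,
	TOY_TRANS_USERSPACE,
	TOY_TRANS_ATTACH
};

/* The two pieces of global state the decision depends on. */
struct toy_state {
	bool log_root_recovering;
	int open_ioctl_trans;
};

/* What the caller does after it has joined the running transaction. */
enum toy_action {
	TOY_PROCEED,		/* keep the handle and carry on */
	TOY_RETRY,		/* handle given up, go back to "again" */
	TOY_FAIL_ENOENT		/* handle given up, ATTACH reports -ENOENT */
};

/* Model of may_wait_transaction() with the proposed third argument. */
bool toy_may_wait(const struct toy_state *st, enum toy_trans_type type,
		  bool is_joined)
{
	assert(st);
	if (st->log_root_recovering)
		return false;
	if (type == TOY_TRANS_USERSPACE)
		return true;
	/* New in the proposal: ATTACH waits only after it has joined. */
	if (is_joined && type == TOY_TRANS_ATTACH)
		return true;
	if (type == TOY_TRANS_START && st->open_ioctl_trans == 0)
		return true;
	return false;
}

/* Model of the block that follows smp_mb() in the patched code. */
enum toy_action toy_after_join(const struct toy_state *st,
			       enum toy_trans_type type, bool blocked)
{
	if (!(blocked && toy_may_wait(st, type, true)))
		return TOY_PROCEED;
	/* Here the real code ends or commits the handle, then: */
	return type == TOY_TRANS_ATTACH ? TOY_FAIL_ENOENT : TOY_RETRY;
}
```

In this model a JOIN caller never waits, a START caller retries when the transaction is blocked, and an ATTACH caller is the only one that ends in TOY_FAIL_ENOENT.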
Re: [PATCH V5] Btrfs: snapshot-aware defrag
On Sat, 16 Feb 2013 14:47:45 +0800, Liu Bo wrote:

What about this patch(UNTESTED)?

thanks,
liubo

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ca7ace7..dac9d4b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4142,9 +4142,14 @@ static void inode_tree_del(struct inode *inode)
 	 * root_refs of 0, so this could end up dropping the tree root as a
 	 * snapshot, so we need the extra !root->fs_info->tree_root check to
 	 * make sure we don't drop it.
+	 *
+	 * Inode cache's inodes may be iput and add root back to dead roots
+	 * list during killing super, which leads to use-after-free, so
+	 * we need to check fs_info->closing to keep us from use-after-free.
 	 */
 	if (empty && btrfs_root_refs(&root->root_item) == 0 &&
-	    root != root->fs_info->tree_root) {
+	    root != root->fs_info->tree_root &&
+	    btrfs_fs_closing(root->fs_info) 1) {
 		synchronize_srcu(&root->fs_info->subvol_srcu);
 		spin_lock(&root->inode_lock);
 		empty = RB_EMPTY_ROOT(&root->inode_tree);

No improvement with this patch. The inode_cache causes a crash in __list_add. I tested it on the latest cmason/for-linus with and without your patch.

This script is a 100% reproducer on my test box:

mkfs.btrfs -d single -m raid1 /dev/sdc /dev/sdj /dev/sds /dev/sdt /dev/sdu /dev/sdv
mount /dev/sdc /mnt -o compress=lzo,space_cache,inode_cache
btrfs subv create /mnt/src
(cd ~/git/btrfs/fs/btrfs && tar cf - .) | (cd /mnt/src && tar xf -)
for i in `seq 2000`; do btrfs subv create /mnt/${i}; (cd /mnt/src && tar cf - .) | (cd /mnt/${i} && tar xf -); done
for i in /mnt/[0-9]*; do btrfs subv dele ${i}; done
sleep 45
umount /mnt

BUG: unable to handle kernel paging request at 88023517d830
IP: [814415f7] __list_add+0x17/0xd0
PGD 1e0c063 PUD bf58e067 PMD bf737067 PTE 80023517d160
Oops: [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: btrfs raid1 mpt2sas scsi_transport_sas raid_class
CPU 2
Pid: 18503, comm: umount Not tainted 3.7.0+ #44 Supermicro X8SIL/X8SIL
RIP: 0010:[814415f7] [814415f7] __list_add+0x17/0xd0
RSP: 0018:88019e1abbd8 EFLAGS: 00010286
RAX: 8802353aa290 RBX: 880229e38828 RCX: 0001
RDX: 88023517d828 RSI: 8802327214c0 RDI: 880229e38828
RBP: 88019e1abbf8 R08: 0006e130 R09:
R10: R11: 0001 R12: 880229e38000
R13: 880229e38898 R14: R15: 88019e1abd30
FS: 7f75eabc4740() GS:880236a0() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 88023517d830 CR3: 00019e17e000 CR4: 07e0
DR0: DR1: DR2:
DR3: DR6: 0ff0 DR7: 0400
Process umount (pid: 18503, threadinfo 88019e1aa000, task 8802353aa290)
Stack:
 a008619f 880229e38000 880229e38000 880229e38898
 88019e1abc18 a00861c0 88012760dc38 88012760dc38
 88019e1abc48 a0095358 88012760dc38 88012760dcc0
Call Trace:
 [a008619f] ? btrfs_add_dead_root+0x1f/0x60 [btrfs]
 [a00861c0] btrfs_add_dead_root+0x40/0x60 [btrfs]
 [a0095358] btrfs_destroy_inode+0x1d8/0x2d0 [btrfs]
 [811af9c7] destroy_inode+0x37/0x60
 [811afafd] evict+0x10d/0x1a0
 [811b02a5] iput+0x105/0x190
 [a007dda8] free_fs_root+0x18/0x90 [btrfs]
 [a00811eb] btrfs_free_fs_root+0x7b/0x90 [btrfs]
 [a00812af] del_fs_roots+0xaf/0xf0 [btrfs]
 [a0082c16] close_ctree+0x1c6/0x300 [btrfs]
 [811b072c] ? evict_inodes+0xec/0x100
 [a00583a4] btrfs_put_super+0x14/0x20 [btrfs]
 [8119805c] generic_shutdown_super+0x5c/0xe0
 [81198171] kill_anon_super+0x11/0x20
 [a005c3a5] btrfs_kill_super+0x15/0x90 [btrfs]
 [811991a1] ? deactivate_super+0x41/0x70
 [8119856d] deactivate_locked_super+0x3d/0x70
 [811991a9] deactivate_super+0x49/0x70
 [811b4332] mntput_no_expire+0xd2/0x130
 [811b52e1] sys_umount+0x71/0x390
 [81956992] system_call_fastpath+0x16/0x1b
Re: Fixing mount points
On Feb 18, 2013, at 12:45 AM, Bob McGowan ramjr0...@gmail.com wrote:

Even though things look OK from the command line, logging in through the window system fails (actually, just hangs). I assume this means I should be doing something to clean up the subvolume? Or maybe there's something in the window system configuration to change? I'm running Linux Mint 14 KDE.

My fstab for the parts in question looks like:

# / was on /dev/sde2 during installation
UUID=1a...9 / btrfs defaults,subvol=@ 0 1
# /home was on /dev/sde2 during installation
UUID=1a...9 /home btrfs defaults,subvol=@home 0 2

What I want is something like:

# / was on /dev/sde2 during installation
UUID=1a...9 / btrfs defaults 0 1
# /home is on /dev/sda1
UUID=7f...3 /home btrfs defaults 0 2

The second fstab implies a completely different disk whose first partition is btrfs, mounted as /home. So long as the contents are user folders, i.e. the same thing found in the sde2 subvolume @home, it's functionally the same as what you had before.

Also, btrfs doesn't need fs_passno set.

Chris Murphy
Re: Kernel panic when scrub is used
On 02/18/13 18:14, Jérôme Poulin wrote:

I experience a kernel panic with General protection fault when doing a scrub on Kernel 3.8-rc7. Here is a screenshot: http://tinypic.com/r/34r6nad/6

I'd love to see the first stacktrace...

The weird part is that the scrub completes from initramfs, but when the system is fully booted, it kernel panics every time in the low percentage. (10%)
Re: Kernel panic when scrub is used
Here you go, I also added 2 other screenshots of the same problem.

http://tinypic.com/r/5ckgug/6
http://tinypic.com/r/t0i9t4/6
http://tinypic.com/r/2r3xdvl/6

On Mon, Feb 18, 2013 at 12:37 PM, Arne Jansen li...@die-jansens.de wrote:
On 02/18/13 18:14, Jérôme Poulin wrote:

I experience a kernel panic with General protection fault when doing a scrub on Kernel 3.8-rc7. Here is a screenshot: http://tinypic.com/r/34r6nad/6

I'd love to see the first stacktrace...

The weird part is that the scrub completes from initramfs, but when the system is fully booted, it kernel panics every time in the low percentage. (10%)
Re: Kernel panic when scrub is used
On 02/18/13 18:53, Jérôme Poulin wrote:

Here you go, I also added 2 other screenshots of the same problem.

http://tinypic.com/r/5ckgug/6
http://tinypic.com/r/t0i9t4/6
http://tinypic.com/r/2r3xdvl/6

Do you have any idea how I can reproduce it here?

-Arne

On Mon, Feb 18, 2013 at 12:37 PM, Arne Jansen li...@die-jansens.de wrote:
On 02/18/13 18:14, Jérôme Poulin wrote:

I experience a kernel panic with General protection fault when doing a scrub on Kernel 3.8-rc7. Here is a screenshot: http://tinypic.com/r/34r6nad/6

I'd love to see the first stacktrace...

The weird part is that the scrub completes from initramfs, but when the system is fully booted, it kernel panics every time in the low percentage. (10%)
[PATCH 1/8] Add some helpers to manage the strings allocation/deallocation.
This patch adds some helpers to manage the strings allocation and deallocation. The function string_list_add(char *) adds the passed string to a list; the function string_list_free() frees all the strings together.

Signed-off-by: Goffredo Baroncelli kreij...@inwind.it
---
 Makefile      |    3 ++-
 string_list.c |   62 +
 string_list.h |   23 +
 3 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 string_list.c
 create mode 100644 string_list.h

diff --git a/Makefile b/Makefile
index 596bf93..0d6c43a 100644
--- a/Makefile
+++ b/Makefile
@@ -5,7 +5,8 @@ objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
 	  root-tree.o dir-item.o file-item.o inode-item.o \
 	  inode-map.o crc32c.o rbtree.o extent-cache.o extent_io.o \
 	  volumes.o utils.o btrfs-list.o btrfslabel.o repair.o \
-	  send-stream.o send-utils.o qgroup.o raid6.o
+	  send-stream.o send-utils.o qgroup.o raid6.o \
+	  string_list.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
 	       cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
 	       cmds-quota.o cmds-qgroup.o cmds-replace.o
diff --git a/string_list.c b/string_list.c
new file mode 100644
index 000..d5a28b9
--- /dev/null
+++ b/string_list.c
@@ -0,0 +1,62 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+/* To store the strings */
+static void **strings_to_free;
+static int count_string_to_free;
+
+/*
+ * Add a string to the dynamic allocated string list
+ */
+void string_list_add(char *s)
+{
+	int size;
+
+	size = sizeof(void *) * ++count_string_to_free;
+	strings_to_free = realloc(strings_to_free, size);
+
+	/* if we don't have enough memory, we have more serius
+	   problem than that a wrong handling of not enough memory */
+	if (!strings_to_free) {
+		fprintf(stderr, "add_string_to_free(): Not enough memory\n");
+		strings_to_free = 0;
+		count_string_to_free = 0;
+	}
+
+	strings_to_free[count_string_to_free-1] = s;
+}
+
+/*
+ * Free the dynamic allocated strings list
+ */
+void string_list_free()
+{
+	int i;
+	for (i = 0 ; i < count_string_to_free ; i++)
+		free(strings_to_free[i]);
+
+	free(strings_to_free);
+
+	strings_to_free = 0;
+	count_string_to_free = 0;
+}
+
+
diff --git a/string_list.h b/string_list.h
new file mode 100644
index 000..f974fbc
--- /dev/null
+++ b/string_list.h
@@ -0,0 +1,23 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#ifndef STRING_LIST_H
+#define STRING_LIST_H
+
+void string_list_add(char *s);
+void string_list_free();
+
+#endif
-- 
1.7.10.4
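
[Editor's sketch] In the posted string_list_add(), the out-of-memory branch resets strings_to_free to 0 and count_string_to_free to 0 and then still stores into strings_to_free[count_string_to_free-1], so a failed realloc would end in a NULL dereference at index -1 (and the old array would be leaked). One possible alternative, shown here purely as an illustration and not as the author's code, is to keep the old pointer until realloc succeeds and report the failure to the caller. The toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Strings registered for a later bulk free. */
static char **toy_strings;
static size_t toy_count;

/*
 * Register s for later freeing. Returns 0 on success, -1 if the list
 * could not grow; on failure the list is unchanged and the caller
 * still owns s.
 */
int toy_string_list_add(char *s)
{
	char **tmp;

	assert(s != NULL);	/* registering NULL is treated as a caller bug */

	/* Use a temporary so the old array survives a failed realloc. */
	tmp = realloc(toy_strings, (toy_count + 1) * sizeof(char *));
	if (!tmp)
		return -1;

	toy_strings = tmp;
	toy_strings[toy_count++] = s;
	return 0;
}

/* Number of strings currently registered. */
size_t toy_string_list_count(void)
{
	return toy_count;
}

/* Free every registered string, then the list itself. */
void toy_string_list_free(void)
{
	size_t i;

	for (i = 0; i < toy_count; i++)
		free(toy_strings[i]);
	free(toy_strings);

	toy_strings = NULL;
	toy_count = 0;
}
```

The count is only incremented after the array has grown, so the list stays consistent on every path.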
[PATCH 2/8] Enhance the command btrfs filesystem df.
Enhance the command "btrfs filesystem df" to show space usage information for one or more mount points. It also shows an estimate of the available space, based on the current usage.

Signed-off-by: Goffredo Baroncelli kreij...@inwind.it
---
 Makefile             |    2 +-
 cmds-fi-disk_usage.c |  520 ++
 cmds-fi-disk_usage.h |   25 +++
 cmds-filesystem.c    |  125 +---
 ctree.h              |   17 +-
 utils.c              |   14 ++
 utils.h              |    2 +
 7 files changed, 579 insertions(+), 126 deletions(-)
 create mode 100644 cmds-fi-disk_usage.c
 create mode 100644 cmds-fi-disk_usage.h

diff --git a/Makefile b/Makefile
index 0d6c43a..bd792b6 100644
--- a/Makefile
+++ b/Makefile
@@ -9,7 +9,7 @@ objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
 	  string_list.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
 	       cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
-	       cmds-quota.o cmds-qgroup.o cmds-replace.o
+	       cmds-quota.o cmds-qgroup.o cmds-replace.o cmds-fi-disk_usage.o
 
 CHECKFLAGS= -D__linux__ -Dlinux -D__STDC__ -Dunix -D__unix__ -Wbitwise \
 	    -Wuninitialized -Wshadow -Wundef
diff --git a/cmds-fi-disk_usage.c b/cmds-fi-disk_usage.c
new file mode 100644
index 000..1e3589f
--- /dev/null
+++ b/cmds-fi-disk_usage.c
@@ -0,0 +1,520 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/ioctl.h>
+#include <errno.h>
+
+#include "utils.h"
+#include "kerncompat.h"
+#include "ctree.h"
+#include "string_list.h"
+
+#include "commands.h"
+
+#include "version.h"
+
+#define DF_HUMAN_UNIT	(1<<0)
+
+/*
+ * To store the size information about the chunks:
+ * the chunks info are grouped by the tuple (type, devid, num_stripes),
+ * i.e. if two chunks are of the same type (RAID1, DUP...), are on the
+ * same disk, have the same stripes then their sizes are grouped
+ */
+struct chunk_info {
+	u64	type;
+	u64	size;
+	u64	devid;
+	int	num_stripes;
+};
+
+/*
+ * Pretty print the size
+ */
+static char *df_pretty_sizes(u64 size, int mode)
+{
+	char *s;
+
+	if (mode & DF_HUMAN_UNIT) {
+		s = pretty_sizes(size);
+		if (!s)
+			return NULL;
+	} else {
+		s = malloc(20);
+		if (!s)
+			return NULL;
+		sprintf(s, "%llu", size);
+	}
+
+	string_list_add(s);
+	return s;
+}
+
+/*
+ * Add the chunk info to the chunk_info list
+ */
+static int add_info_to_list(struct chunk_info **info_ptr,
+			int *info_count,
+			struct btrfs_chunk *chunk)
+{
+
+	u64 type = btrfs_stack_chunk_type(chunk);
+	u64 size = btrfs_stack_chunk_length(chunk);
+	int num_stripes = btrfs_stack_chunk_num_stripes(chunk);
+	int j;
+
+	for (j = 0 ; j < num_stripes ; j++) {
+		int i;
+		struct chunk_info *p = 0;
+		struct btrfs_stripe *stripe;
+		u64	devid;
+
+		stripe = btrfs_stripe_nr(chunk, j);
+		devid = btrfs_stack_stripe_devid(stripe);
+
+		for (i = 0 ; i < *info_count ; i++)
+			if ((*info_ptr)[i].type == type &&
+			    (*info_ptr)[i].devid == devid &&
+			    (*info_ptr)[i].num_stripes == num_stripes ) {
+				p = (*info_ptr) + i;
+				break;
+			}
+
+		if (!p) {
+			int size = sizeof(struct btrfs_chunk) * (*info_count+1);
+			struct chunk_info *res = realloc(*info_ptr, size);
+
+			if (!res) {
+				fprintf(stderr, "ERROR: not enough memory\n");
+				return -1;
+			}
+
+			*info_ptr = res;
+			p = res + *info_count;
+			(*info_count)++;
+
+			p->devid = devid;
+			p->type = type;
+			p->size = 0;
+			p->num_stripes = num_stripes;
+		}
+
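
[Editor's sketch] The message is cut off before the end of add_info_to_list(), so the following is only a guess at the intended behaviour: find the bucket for the tuple (type, devid, num_stripes), create it if missing, and accumulate the size into it. Note that the posted hunk sizes the realloc with sizeof(struct btrfs_chunk) although the array holds struct chunk_info elements; the sketch sizes it by the element type. All toy_* names are hypothetical, and the final "size += size" step is an assumption, since that part of the original is not visible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the patch's struct chunk_info. */
struct toy_chunk_info {
	uint64_t type;
	uint64_t size;
	uint64_t devid;
	int num_stripes;
};

/*
 * Add size to the bucket identified by (type, devid, num_stripes),
 * appending a new bucket when none matches. Returns 0 or -1 (no memory).
 */
int toy_add_chunk(struct toy_chunk_info **list, int *count,
		  uint64_t type, uint64_t devid, int num_stripes,
		  uint64_t size)
{
	struct toy_chunk_info *p = NULL;
	int i;

	assert(list && count);

	/* Linear search for an existing bucket with the same tuple. */
	for (i = 0; i < *count; i++) {
		struct toy_chunk_info *c = &(*list)[i];

		if (c->type == type && c->devid == devid &&
		    c->num_stripes == num_stripes) {
			p = c;
			break;
		}
	}

	if (!p) {
		/* Grow by one element of the array's own type. */
		struct toy_chunk_info *res =
			realloc(*list, (*count + 1) * sizeof(*res));

		if (!res)
			return -1;

		*list = res;
		p = &res[(*count)++];
		p->type = type;
		p->devid = devid;
		p->num_stripes = num_stripes;
		p->size = 0;
	}

	/* Assumed accumulation step; not visible in the truncated patch. */
	p->size += size;
	return 0;
}
```

Two chunks with the same tuple end up in one bucket with their sizes summed, while a chunk on another device gets its own bucket.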
[PATCH 4/8] Add helpers functions to handle the printing of data in tabular format.
This patch adds some functions to manage the printing of data in
tabular format.

The function
	struct string_table *table_create(int columns, int rows)
creates an (empty) table.

The functions
	char *table_printf(struct string_table *tab, int column, int row,
			   char *fmt, ...)
	char *table_vprintf(struct string_table *tab, int column, int row,
			    char *fmt, va_list ap)
populate the table with text. To align the text to the left, the text
shall be prefixed with '<'; otherwise the text shall be prefixed with
a '>'. If the first character is a '=', the text is replaced by a
sequence of '=' that fills the column width.

The function
	void table_free(struct string_table *)
frees all the data associated with the table.

The function
	void table_dump(struct string_table *tab)
prints the table on stdout.

Signed-off-by: Goffredo Baroncelli kreij...@inwind.it
---
 Makefile       |    2 +-
 string_table.c |  157
 string_table.h |   36 +
 3 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100644 string_table.c
 create mode 100644 string_table.h

diff --git a/Makefile b/Makefile
index bd792b6..fd1b312 100644
--- a/Makefile
+++ b/Makefile
@@ -6,7 +6,7 @@ objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
 	  inode-map.o crc32c.o rbtree.o extent-cache.o extent_io.o \
 	  volumes.o utils.o btrfs-list.o btrfslabel.o repair.o \
 	  send-stream.o send-utils.o qgroup.o raid6.o \
-	  string_list.o
+	  string_list.o string_table.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
 	       cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
 	       cmds-quota.o cmds-qgroup.o cmds-replace.o cmds-fi-disk_usage.o
diff --git a/string_table.c b/string_table.c
new file mode 100644
index 000..9784422
--- /dev/null
+++ b/string_table.c
@@ -0,0 +1,157 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include <stdarg.h>
+
+#include "string_table.h"
+
+/*
+ *  This function create an array of char * which will represent a table
+ */
+struct string_table *table_create(int columns, int rows)
+{
+	struct string_table *p;
+	int size;
+
+
+	size = sizeof( struct string_table ) +
+		rows * columns* sizeof(char *);
+	p = calloc(1, size);
+
+	if (!p) return NULL;
+
+	p->ncols = columns;
+	p->nrows = rows;
+
+	return p;
+}
+
+/*
+ * This function is like a vprintf, but store the results in a cell of
+ * the table.
+ * If fmt starts with '<', the text is left aligned; if fmt starts with
+ * '>' the text is right aligned. If fmt is equal to '=' the text will
+ * be replaced by a '=' dimensioned in the basis of the column width
+ */
+char *table_vprintf(struct string_table *tab, int column, int row,
+			  char *fmt, va_list ap)
+{
+	int idx = tab->ncols*row+column;
+	char *msg = calloc(100, sizeof(char));
+
+	if (!msg)
+		return NULL;
+
+	if (tab->cells[idx])
+		free(tab->cells[idx]);
+	tab->cells[idx] = msg;
+	vsnprintf(msg, 99, fmt, ap);
+
+	return msg;
+}
+
+
+/*
+ * This function is like a printf, but store the results in a cell of
+ * the table.
+ */
+char *table_printf(struct string_table *tab, int column, int row,
+			  char *fmt, ...)
+{
+	va_list ap;
+	char *ret;
+
+	va_start(ap, fmt);
+	ret = table_vprintf(tab, column, row, fmt, ap);
+	va_end(ap);
+
+	return ret;
+}
+
+/*
+ * This function dumps the table. Every = string will be replaced by
+ * a === length as the column
+ */
+void table_dump(struct string_table *tab)
+{
+	int	sizes[tab->ncols];
+	int	i, j;
+
+	for (i = 0 ; i < tab->ncols ; i++) {
+		sizes[i] = 0;
+		for (j = 0 ; j < tab->nrows ; j++) {
+			int idx = i + j*tab->ncols;
+			int s;
+
+			if (!tab->cells[idx])
+				continue;
+
+			s = strlen(tab->cells[idx]) - 1;
+			if (s < 1 || tab->cells[idx][0] == '=')
+
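The prefix convention described in this patch (a leading '<' or '>' selects the alignment, a leading '=' asks for a separator as wide as the column) can be illustrated with a small stand-alone helper. This is only a sketch under that reading of the commit message: format_cell() is a hypothetical name that does not appear in the patch, and the patch's own table_dump() may well be structured differently.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): render one table cell into
 * out[], padded to `width` characters, following the prefix convention
 * from the commit message:
 *   '<'  left aligned text
 *   '>'  right aligned text
 *   '='  a run of '=' as wide as the column
 * outsz is the size of out[]; width + 1 bytes are needed for a full cell.
 */
static void format_cell(char *out, size_t outsz, const char *cell, int width)
{
	if (outsz == 0)
		return;

	if (!cell || !cell[0]) {
		/* empty cell: padding only */
		snprintf(out, outsz, "%*s", width, "");
	} else if (cell[0] == '=') {
		int i;

		for (i = 0; i < width && (size_t)i + 1 < outsz; i++)
			out[i] = '=';
		out[i] = '\0';
	} else if (cell[0] == '<') {
		snprintf(out, outsz, "%-*s", width, cell + 1);
	} else {
		/* '>' is right aligned; the prefix character is dropped */
		snprintf(out, outsz, "%*s", width, cell + 1);
	}
}
```

With a column width of 6, the cell "<abc" would be rendered as "abc" followed by three spaces, ">abc" as three spaces followed by "abc", and "=" as six '=' characters.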
[PATCH 6/8] Create entry in man page for btrfs filesystem disk-usage
Signed-off-by: Goffredo Baroncelli kreij...@inwind.it
---
 man/btrfs.8.in |   13 +
 1 file changed, 13 insertions(+)

diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index e2f86ea..50dc510 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -29,6 +29,9 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBfilesystem resize\fP\fI [devid:][+/\-]size[gkm]|[devid:]max filesystem\fP
 .PP
+\fBbtrfs\fP \fBfilesystem filesystem disk-usage [-t][-b]\fP\fI path
+[path..]\fP
+.PP
 \fBbtrfs\fP \fBfilesystem label\fP\fI dev [newlabel]\fP
 .PP
 \fBbtrfs\fP \fBfilesystem df\fP\fI [-b] \fIpath [path..]\fR\fP
@@ -251,6 +254,16 @@
 it with the new desired size. When recreating the partition make sure to use
 the same starting disk cylinder as before.
 .TP
+\fBfilesystem disk-usage\fP [-t][-b] \fIpath [path..]\fR
+
+Show in which disk the chunks are allocated.
+
+\fB-b\fP Set byte as unit
+
+\fB-t\fP Show data in tabular format
+
+.TP
+
 \fBfilesystem label\fP\fI dev [newlabel]\fP
 Show or update the label of a filesystem. \fIdev\fR is used to identify the
 filesystem.
-- 
1.7.10.4
[PATCH 7/8] Add btrfs device disk-usage command
Signed-off-by: Goffredo Baroncelli kreij...@inwind.it
---
 cmds-device.c        |    3 ++
 cmds-fi-disk_usage.c |  141 ++
 cmds-fi-disk_usage.h |    4 ++
 3 files changed, 148 insertions(+)

diff --git a/cmds-device.c b/cmds-device.c
index 198ad68..0dbc02c 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -27,6 +27,7 @@
 #include "ctree.h"
 #include "ioctl.h"
 #include "utils.h"
+#include "cmds-fi-disk_usage.h"
 
 #include "commands.h"
 
@@ -403,6 +404,8 @@ const struct cmd_group device_cmd_group = {
 		{ "scan", cmd_scan_dev, cmd_scan_dev_usage, NULL, 0 },
 		{ "ready", cmd_ready_dev, cmd_ready_dev_usage, NULL, 0 },
 		{ "stats", cmd_dev_stats, cmd_dev_stats_usage, NULL, 0 },
+		{ "disk-usage", cmd_device_disk_usage,
+			cmd_device_disk_usage_usage, NULL, 0 },
 		{ 0, 0, 0, 0, 0 }
 	}
 };
diff --git a/cmds-fi-disk_usage.c b/cmds-fi-disk_usage.c
index eea4168..18350ce 100644
--- a/cmds-fi-disk_usage.c
+++ b/cmds-fi-disk_usage.c
@@ -949,4 +949,145 @@ int cmd_filesystem_disk_usage(int argc, char **argv)
 	return 0;
 }
 
+static void print_disk_chunks(int fd,
+				u64 devid,
+				u64 total_size,
+				struct chunk_info *chunks_info_ptr,
+				int chunks_info_count,
+				int mode)
+{
+	int i;
+	u64 allocated = 0;
+	char *s;
+
+	for (i = 0 ; i < chunks_info_count ; i++) {
+		const char *description;
+		const char *r_mode;
+		u64 flags;
+		u64 size;
+
+		if (chunks_info_ptr[i].devid != devid)
+			continue;
+
+		flags = chunks_info_ptr[i].type;
+
+		description = btrfs_flags2description(flags);
+		r_mode = btrfs_flags2profile(flags);
+		size = calc_chunk_size(chunks_info_ptr+i);
+		s = df_pretty_sizes(size, mode);
+		printf("   %s,%s:%*s%10s\n",
+			description,
+			r_mode,
+			(int)(20 - strlen(description) - strlen(r_mode)), "",
+			s);
+
+		allocated += size;
+
+	}
+	s = df_pretty_sizes(total_size - allocated, mode);
+	printf("   Unallocated: %*s%10s\n",
+		(int)(20 - strlen("Unallocated")), "",
+		s);
+
+}
+
+static int _cmd_device_disk_usage(int fd, char *path, int mode)
+{
+	int i;
+	int ret = 0;
+	int info_count = 0;
+	struct chunk_info *info_ptr = 0;
+	struct disk_info *disks_info_ptr = 0;
+	int disks_info_count = 0;
+
+	if (load_chunk_info(fd, &info_ptr, &info_count) ||
+	    load_disks_info(fd, &disks_info_ptr, &disks_info_count)) {
+		ret = -1;
+		goto exit;
+	}
+
+	for (i = 0 ; i < disks_info_count ; i++) {
+		char *s;
+
+		s = df_pretty_sizes(disks_info_ptr[i].size, mode);
+		printf("%s\t%10s\n", disks_info_ptr[i].path, s);
+
+		print_disk_chunks(fd, disks_info_ptr[i].devid,
+				disks_info_ptr[i].size,
+				info_ptr, info_count,
+				mode);
+		printf("\n");
+
+	}
+
+
+exit:
+
+	string_list_free();
+	if (disks_info_ptr)
+		free(disks_info_ptr);
+	if (info_ptr)
+		free(info_ptr);
+
+	return ret;
+}
+
+const char * const cmd_device_disk_usage_usage[] = {
+	"btrfs device disk-usage [-b] path [path..]",
+	"Show which chunks are in a device.",
+	"",
+	"-b\tSet byte as unit",
+	NULL
+};
+
+int cmd_device_disk_usage(int argc, char **argv)
+{
+
+	int	flags = DF_HUMAN_UNIT;
+	int	i, more_than_one = 0;
+
+	optind = 1;
+	while (1) {
+		char	c = getopt(argc, argv, "b");
+
+		if (c < 0)
+			break;
+
+		switch (c) {
+		case 'b':
+			flags &= ~DF_HUMAN_UNIT;
+			break;
+		default:
+			usage(cmd_device_disk_usage_usage);
+		}
+	}
+
+	if (check_argc_min(argc - optind, 1)) {
+		usage(cmd_device_disk_usage_usage);
+		return 21;
+	}
+
+	for (i = optind; i < argc ; i++) {
+		int r, fd;
+		if (more_than_one)
+			printf("\n");
+
+		fd = open_file_or_dir(argv[i]);
+		if (fd < 0) {
+			fprintf(stderr, "ERROR: can't access to '%s'\n",
+				argv[1]);
+			return 12;
+		}
+		r = _cmd_device_disk_usage(fd, argv[i], flags);
+		close(fd);
+
+		if (r)
+
[PATCH 5/8] Add command btrfs filesystem disk-usage
Signed-off-by: Goffredo Baroncelli kreij...@inwind.it
---
 cmds-fi-disk_usage.c |  434 +-
 cmds-fi-disk_usage.h |    2 +
 cmds-filesystem.c    |    2 +
 utils.c              |   58 +++
 utils.h              |    3 +
 5 files changed, 498 insertions(+), 1 deletion(-)

diff --git a/cmds-fi-disk_usage.c b/cmds-fi-disk_usage.c
index 1e3589f..eea4168 100644
--- a/cmds-fi-disk_usage.c
+++ b/cmds-fi-disk_usage.c
@@ -20,11 +20,13 @@
 #include <unistd.h>
 #include <sys/ioctl.h>
 #include <errno.h>
+#include <stdarg.h>
 
 #include "utils.h"
 #include "kerncompat.h"
 #include "ctree.h"
 #include "string_list.h"
+#include "string_table.h"
 
 #include "commands.h"
 
@@ -42,7 +44,14 @@ struct chunk_info {
 	u64	type;
 	u64	size;
 	u64	devid;
-	int	num_stripes;
+	u64	num_stripes;
+};
+
+/* to store information about the disks */
+struct disk_info {
+	u64	devid;
+	char	path[BTRFS_DEVICE_PATH_NAME_MAX];
+	u64	size;
 };
 
 /*
@@ -518,3 +527,426 @@ int cmd_filesystem_df(int argc, char **argv)
 	return 0;
 }
 
+/*
+ *  Helper to sort the disk_info structure
+ */
+static int cmp_disk_info(const void *a, const void *b)
+{
+	return strcmp(((struct disk_info *)a)->path,
+			((struct disk_info *)b)->path);
+}
+
+/*
+ *  This function load the disk_info structure and put them in an array
+ */
+static int load_disks_info(int fd,
+			   struct disk_info **disks_info_ptr,
+			   int *disks_info_count)
+{
+
+	int ret, i, ndevs;
+	struct btrfs_ioctl_fs_info_args fi_args;
+	struct btrfs_ioctl_dev_info_args dev_info;
+	struct disk_info *info;
+
+	*disks_info_count = 0;
+	*disks_info_ptr = 0;
+
+	ret = ioctl(fd, BTRFS_IOC_FS_INFO, &fi_args);
+	if (ret < 0) {
+		fprintf(stderr, "ERROR: cannot get filesystem info\n");
+		return -1;
+	}
+
+	info = malloc(sizeof(struct disk_info) * fi_args.num_devices);
+	if (!info) {
+		fprintf(stderr, "ERROR: not enough memory\n");
+		return -1;
+	}
+
+	for (i = 0, ndevs = 0 ; i <= fi_args.max_id ; i++) {
+
+		BUG_ON(ndevs >= fi_args.num_devices);
+		ret = get_device_info(fd, i, &dev_info);
+
+		if (ret == -ENODEV)
+			continue;
+		if (ret) {
+			fprintf(stderr,
+			    "ERROR: cannot get info about device devid=%d\n",
+			    i);
+			free(info);
+			return -1;
+		}
+
+		info[ndevs].devid = dev_info.devid;
+		strcpy(info[ndevs].path, (char *)dev_info.path);
+		info[ndevs].size = get_partition_size((char *)dev_info.path);
+		++ndevs;
+	}
+
+	BUG_ON(ndevs != fi_args.num_devices);
+	qsort(info, fi_args.num_devices,
+		sizeof(struct disk_info), cmp_disk_info);
+
+	*disks_info_count = fi_args.num_devices;
+	*disks_info_ptr = info;
+
+	return 0;
+
+}
+
+/*
+ *  This function computes the size of a chunk in a disk
+ */
+static u64 calc_chunk_size(struct chunk_info *ci)
+{
+	if (ci->type & BTRFS_BLOCK_GROUP_RAID0)
+		return ci->size / ci->num_stripes;
+	else if (ci->type & BTRFS_BLOCK_GROUP_RAID1)
+		return ci->size ;
+	else if (ci->type & BTRFS_BLOCK_GROUP_DUP)
+		return ci->size ;
+	else if (ci->type & BTRFS_BLOCK_GROUP_RAID5)
+		return ci->size / (ci->num_stripes -1);
+	else if (ci->type & BTRFS_BLOCK_GROUP_RAID6)
+		return ci->size / (ci->num_stripes -2);
+	else if (ci->type & BTRFS_BLOCK_GROUP_RAID10)
+		return ci->size / ci->num_stripes;
+	return ci->size;
+}
+
+/*
+ *  This function print the results of the command btrfs fi disk-usage
+ *  in tabular format
+ */
+static void _cmd_filesystem_disk_usage_tabular(int mode,
+					struct btrfs_ioctl_space_args *sargs,
+					struct chunk_info *chunks_info_ptr,
+					int chunks_info_count,
+					struct disk_info *disks_info_ptr,
+					int disks_info_count)
+{
+	int i;
+	u64 total_unused = 0;
+	struct string_table *matrix = 0;
+	int  ncols, nrows;
+
+
+	ncols = sargs->total_spaces + 2;
+	nrows = 2 + 1 + disks_info_count + 1 + 2;
+
+	matrix = table_create(ncols, nrows);
+	if (!matrix) {
+		fprintf(stderr, "ERROR: not enough memory\n");
+		return;
+	}
+
+	/* header */
+	for (i = 0; i < sargs->total_spaces; i++) {
+		const char *description;
+
+		u64 flags = sargs->spaces[i].flags;
+		description =
[PATCH 8/8] Create a new entry in btrfs man page for btrfs device disk-usage.
Signed-off-by: Goffredo Baroncelli kreij...@inwind.it
---
 man/btrfs.8.in |    8
 1 file changed, 8 insertions(+)

diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index 50dc510..e60c81f 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -46,6 +46,8 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBdevice delete\fP\fI device [device...] path \fP
 .PP
+\fBbtrfs\fP \fBdevice disk-usage\fP\fI [-b] path [path...] \fP
+.PP
 \fBbtrfs\fP \fBreplace start\fP \fI[-Bfr] srcdev|devid targetdev path\fP
 .PP
 \fBbtrfs\fP \fBreplace status\fP \fI[-1] path\fP
@@ -360,6 +362,12 @@ Add device(s) to the filesystem identified by \fIpath\fR.
 Remove device(s) from a filesystem identified by \fIpath\fR.
 .TP
 
+\fBdevice disk-usage\fR\fI [-b] path [path..] path\fR
+Show which chunks are in a device.
+
+\fB-b\fP set byte as unit.
+.TP
+
 \fBdevice scan\fR \fI[--all-devices|device [device...]\fR
 If one or more devices are passed, these are scanned for a btrfs filesystem.
 If no devices are passed, \fBbtrfs\fR scans all the block devices listed
-- 
1.7.10.4
[PATCH][BTRFS-PROGS] Enhance btrfs fi df with raid5/6 support
Hi all,

I updated my previous patches [1] to add support for raid5/6.

These patches update the btrfs fi df command and add two new commands:
- btrfs filesystem disk-usage path
- btrfs device disk-usage path

The command btrfs filesystem df now shows only the disk usage/available.

$ sudo btrfs filesystem df /mnt/btrfs1/
Disk size:               400.00GB
Disk allocated:            8.04GB
Disk unallocated:        391.97GB
Used:                     11.29MB
Free (Estimated):        250.45GB    (Max: 396.99GB, min: 201.00GB)
Data to disk ratio:          63 %

The Free (Estimated) line tries to give an estimate of the free space on
the basis of the chunk usage. Max and min are the maximum allowable
space (if the next chunks are allocated as SINGLE) and the minimum one
(if the next chunks are allocated as DUP/RAID1/RAID10).

The other two commands show the chunks in the disks.

$ sudo btrfs filesystem disk-usage /mnt/btrfs1/
Data,Single: Size:8.00MB, Used:0.00
   /dev/vdb        8.00MB

Data,RAID6: Size:2.00GB, Used:11.25MB
   /dev/vdb        1.00GB
   /dev/vdc        1.00GB
   /dev/vdd        1.00GB
   /dev/vde        1.00GB

Metadata,Single: Size:8.00MB, Used:0.00
   /dev/vdb        8.00MB

Metadata,RAID5: Size:3.00GB, Used:36.00KB
   /dev/vdb        1.00GB
   /dev/vdc        1.00GB
   /dev/vdd        1.00GB
   /dev/vde        1.00GB

System,Single: Size:4.00MB, Used:0.00
   /dev/vdb        4.00MB

System,RAID5: Size:12.00MB, Used:4.00KB
   /dev/vdb        4.00MB
   /dev/vdc        4.00MB
   /dev/vdd        4.00MB
   /dev/vde        4.00MB

Unallocated:
   /dev/vdb       97.98GB
   /dev/vdc       98.00GB
   /dev/vdd       98.00GB
   /dev/vde       98.00GB

or in tabular format

$ sudo ./btrfs filesystem disk-usage -t /mnt/btrfs1/
            Data    Data     Metadata  Metadata  System  System
            Single  RAID6    Single    RAID5     Single  RAID5    Unallocated

/dev/vdb    8.00MB   1.00GB    8.00MB    1.00GB  4.00MB   4.00MB      97.98GB
/dev/vdc         -   1.00GB         -    1.00GB       -   4.00MB      98.00GB
/dev/vdd         -   1.00GB         -    1.00GB       -   4.00MB      98.00GB
/dev/vde         -   1.00GB         -    1.00GB       -   4.00MB      98.00GB
            ======  =======  ========  ========  ======  =======  ===========
Total       8.00MB   2.00GB    8.00MB    3.00GB  4.00MB  12.00MB     391.97GB
Used          0.00  11.25MB      0.00   36.00KB    0.00   4.00KB

This is the most complete output: it shows which disks a chunk uses
and the usage of every chunk.

Finally the last command shows which chunks a disk hosts:

$ sudo ./btrfs device disk-usage /mnt/btrfs1/
/dev/vdb        100.00GB
   Data,Single:              8.00MB
   Data,RAID6:               1.00GB
   Metadata,Single:          8.00MB
   Metadata,RAID5:           1.00GB
   System,Single:            4.00MB
   System,RAID5:             4.00MB
   Unallocated:             97.98GB

/dev/vdc        100.00GB
   Data,RAID6:               1.00GB
   Metadata,RAID5:           1.00GB
   System,RAID5:             4.00MB
   Unallocated:             98.00GB

/dev/vdd        100.00GB
   Data,RAID6:               1.00GB
   Metadata,RAID5:           1.00GB
   System,RAID5:             4.00MB
   Unallocated:             98.00GB

/dev/vde        100.00GB
   Data,RAID6:               1.00GB
   Metadata,RAID5:           1.00GB
   System,RAID5:             4.00MB
   Unallocated:             98.00GB

This is more or less the same information as above, only grouped by
disk. Unfortunately I don't have any information about the chunk usage
on a per-disk basis.

Comments are welcome.

The code is pullable from
	http://cassiopea.homelinux.net/git/btrfs-progs-unstable.git
branch
	df-du-raid56

BR
G.Baroncelli

[1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/21071

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
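The Free (Estimated), Max and min figures in the btrfs filesystem df output above can be reproduced from the other numbers in this mail. The sketch below is inferred from the example output and from the description of Max and min; it is not taken from the patch itself, and the names estimate_free() and struct space_estimate are hypothetical. It assumes that "Data to disk ratio" is the logical size of the chunks divided by the disk space they occupy, and that unallocated space is expected to fill up at that same ratio.

```c
/* Result of the estimate; same unit as the inputs. */
struct space_estimate {
	double est;	/* Free (Estimated) */
	double max;	/* if every future chunk is SINGLE */
	double min;	/* if every future chunk is DUP/RAID1/RAID10 */
};

/*
 * Hypothetical reconstruction, not the patch's code.
 *   disk_unallocated: raw disk space not yet assigned to any chunk
 *   disk_allocated:   raw disk space occupied by the chunks
 *   logical_size:     space the chunks offer to the filesystem
 *   used:             logical space already in use
 */
static struct space_estimate estimate_free(double disk_unallocated,
					   double disk_allocated,
					   double logical_size,
					   double used)
{
	struct space_estimate e;
	double free_in_chunks = logical_size - used;
	/* the "Data to disk ratio"; 1.0 if nothing is allocated yet */
	double ratio = disk_allocated > 0 ?
		       logical_size / disk_allocated : 1.0;

	e.max = disk_unallocated + free_in_chunks;
	e.min = disk_unallocated / 2 + free_in_chunks;
	e.est = disk_unallocated * ratio + free_in_chunks;
	return e;
}
```

For the example filesystem, working in MB: the chunks occupy 8228 MB of raw disk (8 + 4096 + 8 + 4096 + 4 + 16), they offer 5152 MB of logical space (8 + 2048 + 8 + 3072 + 4 + 12), 11.29 MB are used, and 409600 - 8228 = 401372 MB are unallocated. Under the assumptions above, estimate_free(401372, 8228, 5152, 11.29) gives values that, divided by 1024, agree with the 250.45GB, 396.99GB and 201.00GB shown in the output to within rounding, and the ratio 5152 / 8228 is about 63 %.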
[PATCH 3/8] Create the man page entry for the command btrfs fi df
Signed-off-by: Goffredo Baroncelli kreij...@inwind.it
---
 man/btrfs.8.in |   49 +
 1 file changed, 49 insertions(+)

diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index 94f4ffe..e2f86ea 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -31,6 +31,8 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBfilesystem label\fP\fI dev [newlabel]\fP
 .PP
+\fBbtrfs\fP \fBfilesystem df\fP\fI [-b] \fIpath [path..]\fR\fP
+.PP
 \fBbtrfs\fP \fBfilesystem balance\fP\fI path \fP
 .PP
 \fBbtrfs\fP \fBdevice scan\fP\fI [--all-devices|device [device...]]\fP
@@ -266,6 +268,53 @@
 NOTE: Currently there are the following limitations:
 - the filesystem should not have more than one device.
 .TP
+\fBfilesystem df\fP [-b] \fIpath [path..]\fR
+
+Show space usage information for a mount point.
+
+\fB-b\fP Set byte as unit
+
+The command \fBbtrfs filesystem df\fP is used to query how many space on the
+disk(s) are used and an estimation of the free
+space of the filesystem.
+The output of the command \fBbtrfs filesystem df\fP shows:
+
+.RS
+.IP \fBDisk\ size\fP
+the total size of the disks which compose the filesystem.
+
+.IP \fBDisk\ allocated\fP
+the size of the area of the disks used by the chunks.
+
+.IP \fBDisk\ unallocated\fP
+the size of the area of the disks which is free (i.e.
+the differences of the values above).
+
+.IP \fBUsed\fP
+the portion of the logical space used by the file and metadata.
+
+.IP \fBFree\ (estimated)\fP
+the estimated free space available: i.e. how many space can be used
+by the user. The evaluation
+cannot be rigorous because it depends by the allocation policy (DUP, Single,
+RAID1...) of the metadata and data chunks. If every chunk is stored as
+Single the sum of the \fBfree (estimated)\fP space and the \fBused\fP
+space is equal to the \fBdisk size\fP.
+Otherwise if all the chunk are mirrored (raid1 or raid10) or duplicated
+the sum of the \fBfree (estimated)\fP space and the \fBused\fP space is
+half of the \fBdisk size\fP. Normally the \fBfree (estimated)\fP is between
+these two limits.
+
+.IP \fBData\ to\ disk\ ratio\fP
+the ratio betwen the \fBlogical size\fP (i.e. the space available by
+the chunks) and the \fBdisk allocated\fP (by the chunks). Normally it is
+lower than 100% because the metadata is duplicated for security reasons.
+If all the data and metadata are duplicated (or have a profile like RAID1)
+the \fBData\ to\ disk\ ratio\fP could be 50%.
+
+.RE
+.TP
+
 \fBfilesystem show\fR [--all-devices|uuid|label]\fR
 Show the btrfs filesystem with some additional info. If no \fIUUID\fP or
 \fIlabel\fP is passed, \fBbtrfs\fR show info of all the btrfs filesystem.
-- 
1.7.10.4
Re: [PATCH 1/8] Add some helpers to manage the strings allocation/deallocation.
On Mon, Feb 18, 2013 at 10:04:26PM +0100, Goffredo Baroncelli wrote:
> This patch adds some helpers to manage the strings allocation and
> deallocation.
> The function string_list_add(char *) adds the passed string to a list;
> the function string_list_free() frees all the strings together.

Please don't do this.

To verify that a given pointer isn't freed before it's used we'd have to
make sure that there are no string_list_free() calls in the interim that
would hit their pointer on this global list.

As far as I can tell, this is only used for the pretty units? Instead
of

	printf("%s", leaked_string(raw));

how about

	printf("%llu%s", scaled_value(raw), static_unit_str(raw));

That'd avoid the need to pass back arbitrary allocated strings and this
code could go away.

> +	if (!strings_to_free) {
> +		fprintf(stderr, "add_string_to_free(): Not enough memory\n");
> +		strings_to_free = 0;

if (a == 0)
	a = 0?

> +		count_string_to_free = 0;
> +	}
> +
> +	strings_to_free[count_string_to_free-1] = s;

NULL[-1] = s?

- z
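The alternative suggested in this reply can be sketched as follows. Only the names scaled_value() and static_unit_str() come from the mail; their bodies, the unit table and the unit_index() helper are assumptions made for illustration, and integer scaling is used to keep the example short (the pretty sizes in the patch series print two decimals).

```c
#include <stdint.h>

/* Unit suffixes live in static storage, so callers never free them. */
static const char * const unit_str[] = {
	"B", "KB", "MB", "GB", "TB", "PB", "EB"
};

/* Hypothetical helper: index of the largest unit not exceeding raw. */
static unsigned unit_index(uint64_t raw)
{
	unsigned idx = 0;
	const unsigned last = sizeof(unit_str) / sizeof(unit_str[0]) - 1;

	while (raw >= 1024 && idx < last) {
		raw /= 1024;
		idx++;
	}
	return idx;
}

/* Value expressed in the chosen unit, truncated to an integer. */
static uint64_t scaled_value(uint64_t raw)
{
	return raw >> (10 * unit_index(raw));
}

/* Pointer to a static suffix string; nothing is allocated. */
static const char *static_unit_str(uint64_t raw)
{
	return unit_str[unit_index(raw)];
}
```

A caller would then write printf("%llu%s", (unsigned long long)scaled_value(raw), static_unit_str(raw)); because both helpers return either a plain value or a pointer to static storage, no allocated string has to be tracked or freed, which is the point being made about string_list_add() and string_list_free().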
RAID5/6 Implementation - Understanding first
Chris and team, hats off on the RAID5/6 being at least experimental. I have been following your work for a year now, and waiting for these days.

I am trying to get my head wrapped around the architecture for BTRFS before I jump in and start recommending code changes to the branch.

What I am trying to understand is the comment in the GIT commit which states:

    Read/modify/write is done after the higher levels of the filesystem have prepared a given bio. This means the higher layers are not responsible for building full stripes, and they don't need to query for the topology of the extents that may get allocated during delayed allocation runs. It also means different files can easily share the same stripe.

As I understand it, what we are doing is trying to hide the underlying extents architecture to gain some advantages in the higher level code.

I have been digging in the code, and believe I know the answer to this question. So by higher levels does this mean that RMW, snapshots, checksums and duplicate detection are all unaware of RAID architecture?

If so, I might have some points to consider in this space. If not, I will need to dig deeper in the code to understand how some of my concerns can be realized and how I missed the answer to my question.

Thank you for this awesome work you all are doing and thank you for the time to answer.

Anthony Plack
Re: RAID5/6 Implementation - Understanding first
On Mon, Feb 18, 2013 at 04:20:58PM -0700, Tony Plack wrote:
> Chris and team, hats off on the RAID5/6 being at least experimental. I have been following your work for a year now, and waiting for these days.
>
> I am trying to get my head wrapped around the architecture for BTRFS before I jump in and start recommending code changes to the branch.
>
> What I am trying to understand is the comment in the GIT commit which states:
>
>     Read/modify/write is done after the higher levels of the filesystem have prepared a given bio. This means the higher layers are not responsible for building full stripes, and they don't need to query for the topology of the extents that may get allocated during delayed allocation runs. It also means different files can easily share the same stripe.
>
> As I understand it, what we are doing is trying to hide the underlying extents architecture to gain some advantages in the higher level code.
>
> I have been digging in the code, and believe I know the answer to this question. So by higher levels does this mean that RMW, snapshots, checksums and duplicate detection are all unaware of RAID architecture?

Yes, although the allocator is aware of the raid code, and the raid code is aware that the higher levels are doing copy-on-write. They also share the same transaction subsystem, at least until my parity logging code is complete.

Longer term the two will cooperate more. For example, when we trigger read/modify/write in RAID because a sub-stripe write was made to a large file, we might as well use adjacent blocks from that file to fill the new stripe. This will reduce a lot of complexity in terms of small extent overhead in the rest of the code.

-chris
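For readers unfamiliar with the term, the read/modify/write cycle discussed in this thread can be illustrated with the generic XOR-parity arithmetic used by RAID5-style layouts. This is a textbook sketch and not btrfs code: rmw_update_parity() is a hypothetical name, the blocks are plain in-memory buffers, and RAID6's second parity block is not covered.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only, not btrfs code. A sub-stripe write changes one data
 * block of a stripe. Instead of re-reading every data block, the old data
 * and old parity are read, the parity is patched, and both are written
 * back:
 *
 *     new_parity = old_parity ^ old_data ^ new_data
 *
 * parity and data are updated in place; all buffers are len bytes long.
 */
static void rmw_update_parity(uint8_t *parity, uint8_t *data,
			      const uint8_t *new_data, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		/* XOR out the old contents, XOR in the new ones */
		parity[i] ^= data[i] ^ new_data[i];
		data[i] = new_data[i];
	}
}
```

The reply's remark about filling the new stripe with adjacent blocks from the same file refers to avoiding this extra read step: when a whole stripe is written at once, the parity can be computed from the new data alone.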
Re: [PATCH V5] Btrfs: snapshot-aware defrag
On Mon, Feb 18, 2013 at 05:53:50PM +0100, Stefan Behrens wrote: On Sat, 16 Feb 2013 14:47:45 +0800, Liu Bo wrote: What about this patch(UNTESTED)? thanks, liubo diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index ca7ace7..dac9d4b 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4142,9 +4142,14 @@ static void inode_tree_del(struct inode *inode) * root_refs of 0, so this could end up dropping the tree root as a * snapshot, so we need the extra !root-fs_info-tree_root check to * make sure we don't drop it. +* +* Inode cache's inodes may be iput and add root back to dead roots +* list during killing super, which leads to use-after-free, so +* we need to check fs_info-closing to keep us from use-after-free. */ if (empty btrfs_root_refs(root-root_item) == 0 - root != root-fs_info-tree_root) { + root != root-fs_info-tree_root + btrfs_fs_closing(root-fs_info) 1) { synchronize_srcu(root-fs_info-subvol_srcu); spin_lock(root-inode_lock); empty = RB_EMPTY_ROOT(root-inode_tree); No improvement with this patch. The inode_cache causes a crash in __list_add. I tested it on the latest cmason/for-linus with and without your patch. Ahh, I think I made a finger error, + btrfs_fs_closing(root-fs_info) 1) { SHOULD be + btrfs_fs_closing(root-fs_info) 2) { This script is an 100% reproducer on my test box: mkfs.btrfs -d single -m raid1 /dev/sdc /dev/sdj /dev/sds /dev/sdt /dev/sdu /dev/sdv mount /dev/sdc /mnt -o compress=lzo,space_cache,inode_cache btrfs subv create /mnt/src (cd ~/git/btrfs/fs/btrfs tar cf - .) | (cd /mnt/src tar xf -) for i in `seq 2000`; do btrfs subv create /mnt/${i}; (cd /mnt/src tar cf - .) | (cd /mnt/${i} tar xf -); done for i in /mnt/[0-9]*; do btrfs subv dele ${i}; done sleep 45 umount /mnt With the latest cmason/for-linus(commit 6f60cbd3ae442cb35861bb522f388db123d42ec1 btrfs: access superblock via pagecache in scan_one_device), I ran this script several times with all good, I used two 40G disks, others remains same. 
I'm wondering which line does 'del_fs_roots+0xaf/0xf0 [btrfs]' refer to? thanks, liubo BUG: unable to handle kernel paging request at 88023517d830 IP: [814415f7] __list_add+0x17/0xd0 PGD 1e0c063 PUD bf58e067 PMD bf737067 PTE 80023517d160 Oops: [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: btrfs raid1 mpt2sas scsi_transport_sas raid_class CPU 2 Pid: 18503, comm: umount Not tainted 3.7.0+ #44 Supermicro X8SIL/X8SIL RIP: 0010:[814415f7] [814415f7] __list_add+0x17/0xd0 RSP: 0018:88019e1abbd8 EFLAGS: 00010286 RAX: 8802353aa290 RBX: 880229e38828 RCX: 0001 RDX: 88023517d828 RSI: 8802327214c0 RDI: 880229e38828 RBP: 88019e1abbf8 R08: 0006e130 R09: R10: R11: 0001 R12: 880229e38000 R13: 880229e38898 R14: R15: 88019e1abd30 FS: 7f75eabc4740() GS:880236a0() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 88023517d830 CR3: 00019e17e000 CR4: 07e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process umount (pid: 18503, threadinfo 88019e1aa000, task 8802353aa290) Stack: a008619f 880229e38000 880229e38000 880229e38898 88019e1abc18 a00861c0 88012760dc38 88012760dc38 88019e1abc48 a0095358 88012760dc38 88012760dcc0 Call Trace: [a008619f] ? btrfs_add_dead_root+0x1f/0x60 [btrfs] [a00861c0] btrfs_add_dead_root+0x40/0x60 [btrfs] [a0095358] btrfs_destroy_inode+0x1d8/0x2d0 [btrfs] [811af9c7] destroy_inode+0x37/0x60 [811afafd] evict+0x10d/0x1a0 [811b02a5] iput+0x105/0x190 [a007dda8] free_fs_root+0x18/0x90 [btrfs] [a00811eb] btrfs_free_fs_root+0x7b/0x90 [btrfs] [a00812af] del_fs_roots+0xaf/0xf0 [btrfs] [a0082c16] close_ctree+0x1c6/0x300 [btrfs] [811b072c] ? evict_inodes+0xec/0x100 [a00583a4] btrfs_put_super+0x14/0x20 [btrfs] [8119805c] generic_shutdown_super+0x5c/0xe0 [81198171] kill_anon_super+0x11/0x20 [a005c3a5] btrfs_kill_super+0x15/0x90 [btrfs] [811991a1] ? 
deactivate_super+0x41/0x70 [8119856d] deactivate_locked_super+0x3d/0x70 [811991a9] deactivate_super+0x49/0x70 [811b4332] mntput_no_expire+0xd2/0x130 [811b52e1] sys_umount+0x71/0x390 [81956992] system_call_fastpath+0x16/0x1b -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] Add some helpers to manage the strings allocation/deallocation.
On 02/19/2013 12:08 AM, Zach Brown wrote:
> On Mon, Feb 18, 2013 at 10:04:26PM +0100, Goffredo Baroncelli wrote:
>> This patch adds some helpers to manage the strings allocation and
>> deallocation.
>> The function string_list_add(char *) adds the passed string to a list;
>> the function string_list_free() frees all the strings together.
>
> Please don't do this.
>
> To verify that a given pointer isn't freed before it's used we'd have to
> make sure that there are no string_list_free() calls in the interim that
> would hit their pointer on this global list.

The idea is that the code shouldn't have to care about deallocating the
strings until it has finished its job. At the end it calls
string_list_free().

Of course, if some dynamically allocated strings are used after
string_list_free(), then bad things could happen. Ideally
string_list_free() should be called at the end of main().

I don't think that btrfs-progs allocates a huge quantity of strings, so
this could be an acceptable behaviour.

> As far as I can tell, this is only used for the pretty units? Instead
> of
>
> 	printf("%s", leaked_string(raw));
>
> how about
>
> 	printf("%llu%s", scaled_value(raw), static_unit_str(raw));
>
> That'd avoid the need to pass back arbitrary allocated strings and this
> code could go away.

Sorry, I don't understand the differences between {leaked, scaled,
raw}_string. Could you elaborate a bit?

>> +	if (!strings_to_free) {
>> +		fprintf(stderr, "add_string_to_free(): Not enough memory\n");
>> +		strings_to_free = 0;
>
> if (a == 0)
> 	a = 0?
>
>> +		count_string_to_free = 0;
>> +	}
>> +
>> +	strings_to_free[count_string_to_free-1] = s;
>
> NULL[-1] = s?

Right, I will correct it soon.

>
> - z

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5