Re: fstrim on BTRFS
On Fri, Dec 30, 2011 at 1:19 PM, Li Zefan wrote:
>> Or could some data
>> block groups be converted to metadata, and vice versa?
>>
>
> This won't happen. Also empty block groups won't be reclaimed, but it's
> on the TODO list.

Ah, OK. 6G for metadata out of 50G total seems a bit much, but I can
live with it for now.

Thanks,

Fajar
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: fstrim on BTRFS
Fajar A. Nugraha wrote:
> On Thu, Dec 29, 2011 at 4:39 PM, Li Zefan wrote:
>> Fajar A. Nugraha wrote:
>>> On Wed, Dec 28, 2011 at 11:57 PM, Martin Steigerwald
>>> wrote:
>>>> But BTRFS does not:
>>>>
>>>> merkaba:~> fstrim -v /
>>>> /: 4431613952 bytes were trimmed
>>>> merkaba:~> fstrim -v /
>>>> /: 4341846016 bytes were trimmed
>>>
>>> and apparently it can't trim everything. Or maybe my kernel is
>>> just too old.
>>>
>>>
>>> $ sudo fstrim -v /
>>> 2258165760 Bytes was trimmed
>>>
>>> $ df -h /
>>> Filesystem      Size  Used Avail Use% Mounted on
>>> /dev/sda6        50G   34G   12G  75% /
>>>
>>> $ mount | grep "/ "
>>> /dev/sda6 on / type btrfs (rw,noatime,subvolid=258,compress-force=lzo)
>>>
>>> so only about 2G out of 12G can be trimmed. This is on kernel 3.1.4.
>>>
>>
>> That's because only free spaces in block groups will be trimmed. Btrfs
>> allocates space from block groups, and when there's no space available,
>> it will allocate a new block group from the pool. In your case there's
>> ~10G in the pool.
>
> Thanks for your response.
>
>>
>> You can do a "btrfs fi df /", and you'll see the total size of existing
>> block groups.
>
> $ sudo btrfs fi df /
> Data: total=43.47GB, used=31.88GB
> System, DUP: total=8.00MB, used=12.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=3.25GB, used=619.88MB

This is DUP, so the actual physical size is (3.25 * 2) = 6.5G

> Metadata: total=8.00MB, used=0.00
>
> That should mean existing block groups total at least 46GB, right? In

so the sum is 50G.

> which case my pool (a 50G partition) should only have about 4GB of
> space not allocated to block groups. The numbers don't seem to match.
>

The pool has been emptied, so there must be another reason that only
~2GB was trimmed. The likely one is that fstrim in btrfs is buggy. I sent
a fix weeks ago, which is not merged yet:

	http://marc.info/?l=linux-btrfs&m=132212530410572&w=2

>>
>> You can empty the pool by:
>>
>>	# dd if=/dev/zero of=/mytmpfile bs=1M
>>
>> Then release the space (but it won't return back to the pool):
>>
>>	# rm /mytmpfile
>>	# sync
>
> Is there a bad side effect of doing so? For example, since all free
> space in the pool would be allocated to data block group, would that
> mean my metadata block group is capped at 3.25GB?

You can configure the ratio of data block groups to metadata block groups
via the "metadata_ratio=" mount option.

> Or could some data
> block groups be converted to metadata, and vice versa?
>

This won't happen. Also empty block groups won't be reclaimed, but it's
on the TODO list.
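The arithmetic in the reply above (DUP block groups occupy twice their
reported total, so the block groups add up to the whole 50G partition) can
be checked with a few lines of C. This is only an illustrative sketch: the
struct and function names are made up for the example and are not part of
btrfs-progs, and the "total=" values have to be copied in by hand from the
"btrfs fi df" output.

```c
/* One row of "btrfs fi df" output, reduced to what the sum needs.
 * Hypothetical helper type, not a btrfs-progs structure. */
struct bg_total {
	double total_gb;	/* the "total=" value, converted to GB */
	int copies;		/* 1 for single, 2 for DUP */
};

/* Physical space taken from the device by all listed block groups. */
double physical_allocated_gb(const struct bg_total *rows, int n)
{
	double sum = 0.0;

	for (int i = 0; i < n; i++)
		sum += rows[i].total_gb * rows[i].copies;
	return sum;
}
```

Fed with the five rows from the thread (43.47GB single, 8MB DUP, 4MB
single, 3.25GB DUP, 8MB single), the function returns just under 50 GB,
which matches the statement that the pool of unallocated space is empty.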
Re: Cloning a Btrfs partition
Actually, I seem to be having problems where my rsync script ends up
hanging the system again. It's pretty repeatable: the script runs for a
couple of hours and then hangs the system every time. The system is
completely frozen and I have to do a hard reboot.

Of course, I'm not doing anything special other than an rsync of
compressed btrfs data and snapshots. Well, that and my btrfs partitions
are on external SATA port multipliers, and btrfs is used to create a
two-drive RAID-0 for each partition (the source and the destination).

I tried the bwlimit switch on rsync, which seemed to allow it to go
longer between crashes, but of course that just means I'm copying the
data slower too.

I can't find anything in the usual logs. Any suggestions?

I'm using CentOS 6.2 fully updated.

-BJ
Re: Compression, on filesystem or volume?
On 12/29/2011 07:11 PM, Fajar A. Nugraha wrote:
> On Thu, Dec 29, 2011 at 5:51 PM, Remco Hosman wrote:
>> Hi,
>>
>> Something I could not find in the documentation I managed to find:
>> if you mount with compress=lzo and rebalance, is compression on for that
>> filesystem or only a single volume?
>>
>> E.g., can I have a @boot volume uncompressed and @ and @home compressed?
>
> Last time I asked a similar question, the answer was no. It's per filesystem.
>
> However, you can change compression of individual files between
> zlib/lzo using "btrfs fi defragment -c", regardless of what the
> filesystem is currently mounted with.
>

For individual files and directories, we can also enable compression
via FS_IOC_SETFLAGS.

thanks,
liubo
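For readers who have not used FS_IOC_SETFLAGS, here is a minimal sketch of
what the reply above refers to. It assumes a Linux system with
<linux/fs.h>; the function name set_compress_flag() is invented for this
example, and whether the flag is honored depends on the filesystem and
kernel in use. As far as I know this is the same mechanism that
"chattr +c" relies on.

```c
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Set the "compress" inode flag (FS_COMPR_FL) on a file or directory.
 * Hypothetical helper: returns 0 on success, -1 on failure.
 */
int set_compress_flag(const char *path)
{
	int flags;
	int ret = -1;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	/* Read-modify-write so the other inode flags are preserved. */
	if (ioctl(fd, FS_IOC_GETFLAGS, &flags) == 0) {
		flags |= FS_COMPR_FL;
		if (ioctl(fd, FS_IOC_SETFLAGS, &flags) == 0)
			ret = 0;
	}

	close(fd);
	return ret;
}
```

Setting the flag on a directory is meant to make newly created files
inherit it; data that was already written is not recompressed by the
ioctl itself.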
Re: Compression, on filesystem or volume?
On Thu, Dec 29, 2011 at 5:51 PM, Remco Hosman wrote:
> Hi,
>
> Something I could not find in the documentation I managed to find:
> if you mount with compress=lzo and rebalance, is compression on for that
> filesystem or only a single volume?
>
> E.g., can I have a @boot volume uncompressed and @ and @home compressed?

Last time I asked a similar question, the answer was no. It's per filesystem.

However, you can change compression of individual files between
zlib/lzo using "btrfs fi defragment -c", regardless of what the
filesystem is currently mounted with.

--
Fajar
Compression, on filesystem or volume?
Hi,

Something I could not find in the documentation I managed to find:
if you mount with compress=lzo and rebalance, is compression on for that
filesystem or only a single volume?

E.g., can I have a @boot volume uncompressed and @ and @home compressed?

Thanks,
Remco
Re: fstrim on BTRFS
On Thursday, 29 December 2011, Li Zefan wrote:
> Martin Steigerwald wrote:
> > Hi!
> >
> > With 3.2-rc4 (probably earlier), Ext4 seems to remember what areas it
> > trimmed:
> >
> > merkaba:~> fstrim -v /boot
> > /boot: 224657408 bytes were trimmed
> > merkaba:~> fstrim -v /boot
> > /boot: 0 bytes were trimmed
> >
> >
> > But BTRFS does not:
> >
> > merkaba:~> fstrim -v /
> > /: 4431613952 bytes were trimmed
> > merkaba:~> fstrim -v /
> > /: 4341846016 bytes were trimmed
> >
> >
> > Is it planned to add this feature to BTRFS as well?
>
> There's no such plan, but it's do-able, and I can take care of it.
> There's an issue though.
>
> Do we want to store TRIMMED information on disk? ext4 doesn't
> do this, so the first fstrim will be slow even though you've done
> fstrim in a previous mount.
>
> For btrfs this issue can't be solved without a disk format change that
> will break older kernels, but only 3.2-rcX kernels will be affected if
> we push the following change into mainline before the 3.2 release.

I can't comment on the disk format change. But if it is accepted, I can
give your patchset a spin before the 3.3 merge window. Tell me when you'd
like that.

If not, then AFAIK there is another disk format change necessary to raise
the hard link limit. So maybe it then makes sense to combine both disk
format changes in some future kernel. Better an early one, before adoption
rises even more.

Thanks,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
Re: BTRFS fsck ?
On Fri, 23 Dec 2011 17:55:35 +0100
Erik Logtenberg wrote:

> Any updates on the results of another month of full time work on
> btrfsck?

I also wonder about it... During the summer I migrated from FreeBSD/ZFS
back to Linux and had a feeling, based on info I got (IRC, wiki etc.),
that fsck is around the corner, but now I see it's good that we opted
for ext4...


Sincerely,
Gour

--
Whatever action a great man performs,
common men follow. And whatever standards
he sets by exemplary acts, all the world pursues.

http://atmarama.net | Hlapicina (Croatia) | GPG: 52B5C810
Re: fstrim on BTRFS
On Thu, Dec 29, 2011 at 4:39 PM, Li Zefan wrote:
> Fajar A. Nugraha wrote:
>> On Wed, Dec 28, 2011 at 11:57 PM, Martin Steigerwald
>> wrote:
>>> But BTRFS does not:
>>>
>>> merkaba:~> fstrim -v /
>>> /: 4431613952 bytes were trimmed
>>> merkaba:~> fstrim -v /
>>> /: 4341846016 bytes were trimmed
>>
>> and apparently it can't trim everything. Or maybe my kernel is
>> just too old.
>>
>>
>> $ sudo fstrim -v /
>> 2258165760 Bytes was trimmed
>>
>> $ df -h /
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/sda6        50G   34G   12G  75% /
>>
>> $ mount | grep "/ "
>> /dev/sda6 on / type btrfs (rw,noatime,subvolid=258,compress-force=lzo)
>>
>> so only about 2G out of 12G can be trimmed. This is on kernel 3.1.4.
>>
>
> That's because only free spaces in block groups will be trimmed. Btrfs
> allocates space from block groups, and when there's no space available,
> it will allocate a new block group from the pool. In your case there's
> ~10G in the pool.

Thanks for your response.

>
> You can do a "btrfs fi df /", and you'll see the total size of existing
> block groups.

$ sudo btrfs fi df /
Data: total=43.47GB, used=31.88GB
System, DUP: total=8.00MB, used=12.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=3.25GB, used=619.88MB
Metadata: total=8.00MB, used=0.00

That should mean existing block groups total at least 46GB, right? In
which case my pool (a 50G partition) should only have about 4GB of
space not allocated to block groups. The numbers don't seem to match.

>
> You can empty the pool by:
>
>	# dd if=/dev/zero of=/mytmpfile bs=1M
>
> Then release the space (but it won't return back to the pool):
>
>	# rm /mytmpfile
>	# sync

Is there a bad side effect of doing so? For example, since all free
space in the pool would be allocated to data block group, would that
mean my metadata block group is capped at 3.25GB? Or could some data
block groups be converted to metadata, and vice versa?

--
Fajar
[PATCH 3/3] Btrfs: save trimmed flag onto disk
To speed up the first fstrim after mounting the filesystem, we save the
trimmed flag to disk.

	# fstrim -v /mnt/
	/mnt/: 267714560 bytes were trimmed
	# fstrim -v /mnt/
	/mnt/: 0 bytes were trimmed

	# sync
	# umount /mnt
	# !mount
	# fstrim -v /mnt/
	/mnt/: 152240128 bytes were trimmed

Because caches for block groups smaller than 100M will not be written to
disk, we'll still have to trim them.

Signed-off-by: Li Zefan
---
 fs/btrfs/ctree.h            |    1 +
 fs/btrfs/free-space-cache.c |   19 ++++++++++++++++---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index ca4eb2d..84e9ff6 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -280,6 +280,7 @@ struct btrfs_chunk {
 
 #define BTRFS_FREE_SPACE_EXTENT	(1 << 0)
 #define BTRFS_FREE_SPACE_BITMAP	(1 << 1)
+#define BTRFS_FREE_SPACE_TRIMMED	(1 << 2)
 
 struct btrfs_free_space_entry {
 	__le64 offset;
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index cba2a94..592ba54 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -469,7 +469,7 @@ static int io_ctl_check_crc(struct io_ctl *io_ctl, int index)
 }
 
 static int io_ctl_add_entry(struct io_ctl *io_ctl, u64 offset, u64 bytes,
-			    void *bitmap)
+			    void *bitmap, bool trimmed)
 {
 	struct btrfs_free_space_entry *entry;
 
@@ -481,6 +481,8 @@ static int io_ctl_add_entry(struct io_ctl *io_ctl, u64 offset, u64 bytes,
 	entry->bytes = cpu_to_le64(bytes);
 	entry->type = (bitmap) ? BTRFS_FREE_SPACE_BITMAP :
 		BTRFS_FREE_SPACE_EXTENT;
+	if (trimmed)
+		entry->type |= BTRFS_FREE_SPACE_TRIMMED;
 	io_ctl->cur += sizeof(struct btrfs_free_space_entry);
 	io_ctl->size -= sizeof(struct btrfs_free_space_entry);
 
@@ -669,6 +671,9 @@ int __load_free_space_cache(struct btrfs_root *root, struct inode *inode,
 			goto free_cache;
 		}
 
+		if (type & BTRFS_FREE_SPACE_TRIMMED)
+			e->trimmed = true;
+
 		if (type & BTRFS_FREE_SPACE_EXTENT) {
 			spin_lock(&ctl->tree_lock);
 			ret = link_free_space(ctl, e);
@@ -899,7 +904,7 @@ int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 		entries++;
 
 		ret = io_ctl_add_entry(&io_ctl, e->offset, e->bytes,
-				       e->bitmap);
+				       e->bitmap, e->trimmed);
 		if (ret)
 			goto out_nospc;
 
@@ -937,7 +942,7 @@ int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 		len = min(len, end + 1 - start);
 
 		entries++;
-		ret = io_ctl_add_entry(&io_ctl, start, len, NULL);
+		ret = io_ctl_add_entry(&io_ctl, start, len, NULL, false);
 		if (ret)
 			goto out_nospc;
 
@@ -2696,6 +2701,14 @@ int btrfs_trim_block_group(struct btrfs_block_group_cache *block_group,
 		if (update) {
 			spin_lock(&space_info->lock);
 			spin_lock(&block_group->lock);
+
+			if (btrfs_test_opt(fs_info->tree_root,
+					   SPACE_CACHE) &&
+			    block_group->disk_cache_state <
+			    BTRFS_DC_CLEAR)
+				block_group->disk_cache_state =
+					BTRFS_DC_CLEAR;
+			block_group->dirty = 1;
 			if (block_group->ro)
 				space_info->bytes_readonly += bytes;
 			block_group->reserved -= bytes;
--
1.7.3.1
[PATCH 2/3] Btrfs: speed up fstrim
By remembering which areas have been trimmed, we can speed up fstrim
significantly.

	# fstrim -v /mnt/
	/mnt/: 152772608 bytes were trimmed
	# fstrim -v /mnt/
	/mnt/: 0 bytes were trimmed

No bytes have to be trimmed for the second run.

Signed-off-by: Li Zefan
---
 fs/btrfs/extent-tree.c      |   29
 fs/btrfs/free-space-cache.c |   38
 fs/btrfs/free-space-cache.h |    7
 fs/btrfs/inode-map.c        |   16
 4 files changed, 63 insertions(+), 27 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f5fbe57..e743395 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -319,7 +319,7 @@ static u64 add_new_free_space(struct btrfs_block_group_cache *block_group,
 			size = extent_start - start;
 			total_added += size;
 			ret = btrfs_add_free_space(block_group, start,
-						   size);
+						   size, false);
 			BUG_ON(ret);
 			start = extent_end + 1;
 		} else {
@@ -330,7 +330,7 @@ static u64 add_new_free_space(struct btrfs_block_group_cache *block_group,
 	if (start < end) {
 		size = end - start;
 		total_added += size;
-		ret = btrfs_add_free_space(block_group, start, size);
+		ret = btrfs_add_free_space(block_group, start, size, false);
 		BUG_ON(ret);
 	}
 
@@ -4631,7 +4631,7 @@ static int unpin_extent_range(struct btrfs_root *root, u64 start, u64 end)
 
 		if (start < cache->last_byte_to_unpin) {
 			len = min(len, cache->last_byte_to_unpin - start);
-			btrfs_add_free_space(cache, start, len);
+			btrfs_add_free_space(cache, start, len, false);
 		}
 
 		start += len;
@@ -4987,7 +4987,7 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans,
 
 		WARN_ON(test_bit(EXTENT_BUFFER_DIRTY, &buf->bflags));
 
-		btrfs_add_free_space(cache, buf->start, buf->len);
+		btrfs_add_free_space(cache, buf->start, buf->len, false);
 		btrfs_update_reserved_bytes(cache, buf->len, RESERVE_FREE);
 	}
 out:
@@ -5427,14 +5427,16 @@ checks:
 		search_start = stripe_align(root, offset);
 		/* move on to the next group */
 		if (search_start + num_bytes >= search_end) {
-			btrfs_add_free_space(used_block_group, offset, num_bytes);
+			btrfs_add_free_space(used_block_group, offset,
+					     num_bytes, false);
 			goto loop;
 		}
 
 		/* move on to the next group */
 		if (search_start + num_bytes >
 		    used_block_group->key.objectid + used_block_group->key.offset) {
-			btrfs_add_free_space(used_block_group, offset, num_bytes);
+			btrfs_add_free_space(used_block_group, offset,
+					     num_bytes, false);
 			goto loop;
 		}
 
@@ -5443,13 +5445,14 @@ checks:
 
 		if (offset < search_start)
 			btrfs_add_free_space(used_block_group, offset,
-					     search_start - offset);
+					     search_start - offset, false);
 		BUG_ON(offset > search_start);
 
 		ret = btrfs_update_reserved_bytes(used_block_group, num_bytes,
 						  alloc_type);
 		if (ret == -EAGAIN) {
-			btrfs_add_free_space(used_block_group, offset, num_bytes);
+			btrfs_add_free_space(used_block_group, offset,
+					     num_bytes, false);
 			goto loop;
 		}
 
@@ -5459,7 +5462,7 @@ checks:
 
 		if (offset < search_start)
 			btrfs_add_free_space(used_block_group, offset,
-					     search_start - offset);
+					     search_start - offset, false);
 		BUG_ON(offset > search_start);
 		if (used_block_group != block_group)
 			btrfs_put_block_group(used_block_group);
@@ -5668,6 +5671,7 @@ static int __btrfs_free_reserved_extent(struct btrfs_root *root,
 {
 	struct btrfs_block_group_cache *cache;
 	int ret = 0;
+	bool trimmed = false;
 
 	cache = btrfs_lookup_block_group(root->fs_info, start);
 	if (!cache) {
@@ -5676,13 +5680,16 @@ static int __btrfs_free_reserved_extent(struct btrfs_root *root,
 		return -ENOSPC;
[PATCH 1/3][URGENT] Btrfs: allow future use of type field of struct btrfs_free_space_entry
This field indicates if an entry is an extent or a bitmap, and only 2
bits of it are used. This patch makes the other bits available for
future use without breaking old kernels.

For example, we're going to use one bit to mark if the free space has
been trimmed.

Signed-off-by: Li Zefan
---

This has to be queued for 3.2, so later patches can affect 3.2-rcX
kernels only.

---
 fs/btrfs/ctree.h            |    4 ++--
 fs/btrfs/free-space-cache.c |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 6738503..ca4eb2d 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -278,8 +278,8 @@ struct btrfs_chunk {
 	/* additional stripes go here */
 } __attribute__ ((__packed__));
 
-#define BTRFS_FREE_SPACE_EXTENT	1
-#define BTRFS_FREE_SPACE_BITMAP	2
+#define BTRFS_FREE_SPACE_EXTENT	(1 << 0)
+#define BTRFS_FREE_SPACE_BITMAP	(1 << 1)
 
 struct btrfs_free_space_entry {
 	__le64 offset;
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index ec23d43..044c0ec 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -669,7 +669,7 @@ int __load_free_space_cache(struct btrfs_root *root, struct inode *inode,
 			goto free_cache;
 		}
 
-		if (type == BTRFS_FREE_SPACE_EXTENT) {
+		if (type & BTRFS_FREE_SPACE_EXTENT) {
 			spin_lock(&ctl->tree_lock);
 			ret = link_free_space(ctl, e);
 			spin_unlock(&ctl->tree_lock);
@@ -679,7 +679,7 @@ int __load_free_space_cache(struct btrfs_root *root, struct inode *inode,
 				kmem_cache_free(btrfs_free_space_cachep, e);
 				goto free_cache;
 			}
-		} else {
+		} else if (type & BTRFS_FREE_SPACE_BITMAP) {
 			BUG_ON(!num_bitmaps);
 			num_bitmaps--;
 			e->bitmap = kzalloc(PAGE_CACHE_SIZE, GFP_NOFS);
--
1.7.3.1
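Why the switch from "==" to "&" in this patch matters can be shown outside
the kernel. The sketch below is a standalone illustration, not kernel code:
the macro names are shortened, FREE_SPACE_FUTURE stands in for any bit a
later kernel might add (such as the trimmed flag), and both helper
functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined by the patch. */
#define FREE_SPACE_EXTENT	(1 << 0)
#define FREE_SPACE_BITMAP	(1 << 1)
/* Hypothetical extra bit that a later kernel might set. */
#define FREE_SPACE_FUTURE	(1 << 2)

/* Old-style check: equality, so any extra bit makes the test fail. */
bool is_extent_by_equality(uint8_t type)
{
	return type == FREE_SPACE_EXTENT;
}

/* New-style check: mask, so bits this kernel does not know are ignored. */
bool is_extent_by_mask(uint8_t type)
{
	return (type & FREE_SPACE_EXTENT) != 0;
}
```

With the equality test, an extent entry that also carries an unknown bit
is not recognized as an extent. In the pre-patch loader it would then fall
into the plain "else" branch and be treated as a bitmap, which appears to
be how the BUG_ON(!num_bitmaps) there could be reached.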
[RFC][PATCH 0/3] Btrfs: speed up fstrim
(This patchset is not for merge or review, except the first patch)

By remembering which areas have been trimmed, we can speed up fstrim
significantly.

	# fstrim -v /mnt/
	/mnt/: 152772608 bytes were trimmed
	# fstrim -v /mnt/
	/mnt/: 0 bytes were trimmed

To implement this, after a free space item has been trimmed, we mark it as
trimmed before inserting it into free space cache.

(*) If we want to speed up the first fstrim after mounting the filesystem,
we have to save the trimmed flag to disk, which will break backward
compatibility, but only 3.2-rcX kernels will be affected.

That is, if you use fstrim in newest kernel with this patchset applied,
and then you mount the fs in a 3.2-rcX kernel, you may trigger a BUG_ON()
in __load_free_space_cache() sooner or later.

So, is this acceptable?

	# fstrim -v /mnt/
	/mnt/: 267714560 bytes were trimmed
	# fstrim -v /mnt/
	/mnt/: 0 bytes were trimmed

	# sync
	# umount /mnt
	# !mount
	# fstrim -v /mnt/
	/mnt/: 152240128 bytes were trimmed

Because caches for block groups smaller than 100M will not be written to
disk, we'll still have to trim them.

* See this thread for a user request for this feature:
https://lkml.org/lkml/2011/12/1/24
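The idea described in the cover letter above (mark a free space item as
trimmed, skip marked items on later runs) can be sketched in userspace C.
Everything here is an assumption made for illustration: struct free_entry,
trim_entries() and add_free_space() are simplified stand-ins, not the
kernel's structures or functions, and no real discard is issued.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory free space entry; not the kernel's struct. */
struct free_entry {
	uint64_t offset;
	uint64_t bytes;
	bool trimmed;
};

/*
 * Trim every entry that is not yet marked, mark it, and return the
 * number of bytes that were (notionally) discarded.
 */
uint64_t trim_entries(struct free_entry *entries, size_t count)
{
	uint64_t trimmed_bytes = 0;

	for (size_t i = 0; i < count; i++) {
		if (entries[i].trimmed)
			continue;	/* already discarded, skip it */
		/* a discard of [offset, offset + bytes) would go here */
		entries[i].trimmed = true;
		trimmed_bytes += entries[i].bytes;
	}
	return trimmed_bytes;
}

/* Newly freed space enters the cache untrimmed. */
void add_free_space(struct free_entry *entry, uint64_t offset, uint64_t bytes)
{
	entry->offset = offset;
	entry->bytes = bytes;
	entry->trimmed = false;
}
```

Calling trim_entries() twice in a row on the same array returns the full
byte count the first time and 0 the second time, mirroring the two fstrim
runs shown in the cover letter. Space freed after a trim comes back
unmarked, so only that space is discarded on the next run.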
Re: fstrim on BTRFS
Fajar A. Nugraha wrote:
> On Wed, Dec 28, 2011 at 11:57 PM, Martin Steigerwald
> wrote:
>> But BTRFS does not:
>>
>> merkaba:~> fstrim -v /
>> /: 4431613952 bytes were trimmed
>> merkaba:~> fstrim -v /
>> /: 4341846016 bytes were trimmed
>
> and apparently it can't trim everything. Or maybe my kernel is
> just too old.
>
>
> $ sudo fstrim -v /
> 2258165760 Bytes was trimmed
>
> $ df -h /
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sda6        50G   34G   12G  75% /
>
> $ mount | grep "/ "
> /dev/sda6 on / type btrfs (rw,noatime,subvolid=258,compress-force=lzo)
>
> so only about 2G out of 12G can be trimmed. This is on kernel 3.1.4.
>

That's because only free spaces in block groups will be trimmed. Btrfs
allocates space from block groups, and when there's no space available,
it will allocate a new block group from the pool. In your case there's
~10G in the pool.

You can do a "btrfs fi df /", and you'll see the total size of existing
block groups.

You can empty the pool by:

	# dd if=/dev/zero of=/mytmpfile bs=1M

Then release the space (but it won't return back to the pool):

	# rm /mytmpfile
	# sync

and try "btrfs fi df /" and trim again.