BTRFS setup advice for laptop performance ?
Hi,

I'm going to receive a new small laptop with a 500 GB 5400 RPM mechanical ole' rust HD, and I plan to install BTRFS on it.

It will have a kernel 3.13 for now, until 3.14 gets released.

However I'm still concerned with chronic BTRFS dreadful performance and still find that BTRFS degrades much over time even with periodic defrag and best practices etc.

So I'd like to start with the best possible options and have a few questions:

- Is it still recommended to mkfs with a nodesize or leafsize different (bigger) than the default? I wouldn't like to lose too much disk space anyway (1/2 nodesize per file on average?), as it will be limited...

- Is it recommended to alter the FS to have skinny extents? I've done this on all of my BTRFS machines without problem, still the kernel spits a notice at mount time, and I'm kind of worrying: "Why is the kernel warning me I have skinny extents? Is it bad? Is it something I should avoid?"

- Are there other optimization tricks I should perform at mkfs time because they can't be changed later on?

- Are there other btrfstune or mount options I should pass before starting to populate the FS with a system and data?

- Generally speaking, does LZO compression improve or degrade performance? I'm not able to figure it out clearly.

TIA for the insight.

--
Swâmi Petaramesh sw...@petaramesh.org http://petaramesh.org PGP 9076E32E

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[ANNOUNCE] xfstests: updated to cf1ed54
Hi folks,

The xfstests repository at git://oss.sgi.com/xfs/cmds/xfstests has just been updated.

Patches often get missed, so please check if your outstanding patches were in this update. If they have not been in this update, please resubmit them to x...@oss.sgi.com so they can be picked up in the next update.

The new head of the master branch is commit:

cf1ed54 check: fix RESULT_BASE typo in check script

The major new functionality worth mentioning in this update is the new config file format from Lukas. The existing format config files should continue to work without change, but the new format is much richer and allows specification of multiple different configurations to run tests under. Hence testing multiple mount and mkfs configurations becomes as simple as iterating the configurations in the config file.

New Commits:

Christoph Hellwig (2):
      [feb7da1] common: add flink support to _require_xfs_io_command
      [3bbbc25] generic: add a basic O_TMPFILE test

Eric Sandeen (1):
      [cf1ed54] check: fix RESULT_BASE typo in check script

Eryu Guan (1):
      [dadfd3c] shared: new test to use up free inodes

Filipe David Borba Manana (3):
      [bb2499e] btrfs: add test for btrfs send issuing premature rmdir operations
      [c99c847] btrfs: add test for btrfs incremental send
      [e0ff31a] btrfs: add test for btrfs incremental send data corruption

Filipe Manana (1):
      [1a87439] btrfs: add test for btrfs send directory moves/renames

Hannes Frederic Sowa (1):
      [947ee8b] fs: add directories hash collision test

Lukas Czerner (10):
      [4d18f5a] generic: add generic test for fallocate zero range
      [bf44459] check: Prepare for config section
      [667308d] check: Add support for sections in config file
      [f8e4f53] check: Allow to recreate TEST_DEV
      [b1ffb05] check: Remount file system if MOUNT_OPTIONS changed
      [7baa3e2] check: unmount TEST_DEV and SCRATCH_DEV after test run
      [21723cd] generic: Make some shared tests generic
      [259d680] ext4: Make shared/243 ext4 specific
      [5f8c711] fsx: Add fallocate collapse range operation
      [f98d930] fsstress: Add fallocate collapse range operation

Code Diffstat:

 .gitignore                                | 1 +
 README.config-sections                    | 87 +++
 check                                     | 404 ++
 common/config                             | 155 +++-
 common/rc                                 | 30 ++-
 ltp/fsstress.c                            | 20 ++
 ltp/fsx.c                                 | 107 +++-
 src/Makefile                              | 2 +-
 src/dirhash_collide.c                     | 223 +
 tests/btrfs/043                           | 149 +++
 tests/btrfs/043.out                       | 1 +
 tests/btrfs/044                           | 129 ++
 tests/btrfs/044.out                       | 1 +
 tests/btrfs/045                           | 376 +++
 tests/btrfs/045.out                       | 1 +
 tests/btrfs/046                           | 304 ++
 tests/btrfs/046.out                       | 213
 tests/btrfs/group                         | 4 +
 tests/{shared/243 => ext4/002}            | 4 +-
 tests/{shared/243.out => ext4/002.out}    | 2 +-
 tests/ext4/group                          | 1 +
 tests/generic/004                         | 65 +
 tests/generic/004.out                     | 6 +
 tests/generic/009                         | 78 ++
 tests/generic/009.out                     | 333
 tests/{shared/003 => generic/012}         | 6 +-
 tests/{shared/003.out => generic/012.out} | 2 +-
 tests/{shared/004 => generic/016}         | 6 +-
 tests/{shared/004.out => generic/016.out} | 2 +-
 tests/{shared/005 => generic/017}         | 4 +-
 tests/generic/017.out                     | 4 +
 tests/{shared/218 => generic/018}         | 4 +-
 tests/{shared/218.out => generic/018.out} | 2 +-
 tests/{shared/305 => generic/019}         | 4 +-
 tests/{shared/305.out => generic/019.out} | 2 +-
 tests/{shared/001 => generic/021}         | 6 +-
 tests/{shared/001.out => generic/021.out} | 2 +-
 tests/{shared/002 => generic/022}         | 6 +-
 tests/{shared/002.out => generic/022.out} | 2 +-
 tests/generic/group                       | 9 +
 tests/shared/005.out                      | 4 -
 tests/shared/006                          | 97 +++
 tests/shared/006.out                      | 2 +
 tests/shared/group                        | 10 +-
 tests/xfs/006                             | 63 +
 tests/xfs/006.out                         | 28 +++
 tests/xfs/group                           | 1 +
 47 files changed, 2746 insertions(+), 216 deletions(-)
 create mode 100644 README.config-sections
 create mode 100644 src/dirhash_collide.c
 create mode 100644 tests/btrfs/043
 create mode 100644
Re: [PATCH RFCv4] new ioctl TREE_SEARCH_V2
Hi Chris,

On Thu, Jan 30, 2014 at 04:23:56PM +0100, Gerhard Heift wrote:
> This patch series first rewrites tree_search to copy found items directly to userspace and then adds a new ioctl TREE_SEARCH_V2 with which we could store them in a varying buffer. Now even items larger than 3992 bytes or a large amount of items can be returned. This is the case for some EXTENT_CSUM items, which could have a size up to 16k.

can you add this patchset to the 3.15 queue? It hasn't gone through btrfs-next for unknown reasons, but it should be merged because it makes the search ioctl usable with bigblocks that are now 16k by default.

The newly added functionality is localized, so I don't see huge risks in adding it even though it has not been in -next yet.

thanks,
david
Re: [PATCH] btrfs: filter invalid arg for btrfs resize
On Mon, Mar 31, 2014 at 06:03:25PM +0800, Gui Hecheng wrote:
> Originally following cmds will work:
>   # btrfs fi resize -10A mnt
>   # btrfs fi resize -10Gaha mnt
> Filter the arg by checking the return pointer of memparse.

This should probably also go to stable@

> Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com
> ---
>  fs/btrfs/ioctl.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index e174770..2ed21fa 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1459,6 +1459,7 @@ static noinline int btrfs_ioctl_resize(struct file *file,
>  	struct btrfs_trans_handle *trans;
>  	struct btrfs_device *device = NULL;
>  	char *sizestr;
> +	char *retptr;

Minor nit, the variable is otherwise unused, it's better to put it into the enclosing statement block with memparse.

>  	char *devstr = NULL;
>  	int ret = 0;
>  	int mod = 0;
> @@ -1526,8 +1527,8 @@ static noinline int btrfs_ioctl_resize(struct file *file,
>  		mod = 1;
>  		sizestr++;
>  	}
> -	new_size = memparse(sizestr, NULL);
> -	if (new_size == 0) {
> +	new_size = memparse(sizestr, &retptr);
> +	if (*retptr != '\0' || new_size == 0) {
>  		ret = -EINVAL;
>  		goto out_free;
>  	}
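To see why checking the end pointer rejects both example commands, here is a small userspace model in Python. It is only a sketch: `memparse_model` and `parse_resize_amount` are hypothetical names, the model handles decimal digits only (the kernel's memparse also accepts other bases), and the leading sign is assumed to have been stripped already, as the ioctl does before parsing.

```python
# Binary shift for each size suffix accepted by the model.
SUFFIX_SHIFT = {"K": 10, "M": 20, "G": 30, "T": 40, "P": 50, "E": 60}


def memparse_model(text):
    """Parse leading decimal digits plus one optional size suffix.

    Returns (value, rest), where rest plays the role of *retptr:
    whatever was left unparsed.
    """
    i = 0
    while i < len(text) and text[i] in "0123456789":
        i += 1
    value = int(text[:i]) if i else 0
    if i < len(text) and text[i].upper() in SUFFIX_SHIFT:
        value <<= SUFFIX_SHIFT[text[i].upper()]
        i += 1
    return value, text[i:]


def parse_resize_amount(text):
    """Apply the patched check: reject trailing garbage or a zero size."""
    value, rest = memparse_model(text)
    if rest != "" or value == 0:
        raise ValueError("invalid resize argument: %r" % text)
    return value


if __name__ == "__main__":
    for arg in ("10G", "10A", "10Gaha"):
        try:
            print(arg, "->", parse_resize_amount(arg))
        except ValueError as err:
            print(arg, "->", err)
```

With the old check (size non-zero only), "10A" and "10Gaha" would both have been accepted, as 10 bytes and 10 GiB respectively.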
Re: BTRFS setup advice for laptop performance ?
On 2014-04-04 04:02, Swâmi Petaramesh wrote:
> Hi,
>
> I'm going to receive a new small laptop with a 500 GB 5400 RPM mechanical ole' rust HD, and I plan to install BTRFS on it.
>
> It will have a kernel 3.13 for now, until 3.14 gets released.
>
> However I'm still concerned with chronic BTRFS dreadful performance and still find that BTRFS degrades much over time even with periodic defrag and best practices etc.

I keep hearing this from people, but I personally don't see this to be the case at all. I'm pretty sure the 'big' performance degradation that people are seeing is due to how they are using snapshots, not a result of using BTRFS itself (I don't use them for anything other than ensuring a stable system image for rsync and/or tar based backups).

> So I'd like to start with the best possible options and have a few questions:
>
> - Is it still recommended to mkfs with a nodesize or leafsize different (bigger) than the default? I wouldn't like to lose too much disk space anyway (1/2 nodesize per file on average?), as it will be limited...

This depends on many things; the average size of the files on the disk is the biggest factor. In general you should get the best disk utilization by setting nodesize so that a majority of the files are less than the leafsize minus 256 bytes, and all but a few are smaller than two times the leafsize minus 256 bytes. However, if you want to really benefit from the data compression, you should just use the smallest leaf/nodesize for your system (which is what mkfs defaults to), because BTRFS stores files whose size is at least (roughly) 256 bytes less than the leafsize inline with the metadata, and doesn't compress such files.

> - Is it recommended to alter the FS to have skinny extents? I've done this on all of my BTRFS machines without problem, still the kernel spits a notice at mount time, and I'm kind of worrying: "Why is the kernel warning me I have skinny extents? Is it bad? Is it something I should avoid?"

I think that the primary reason for the warning is that it is backward incompatible: older kernels can't mount filesystems using it.

> - Are there other optimization tricks I should perform at mkfs time because they can't be changed later on?
>
> - Are there other btrfstune or mount options I should pass before starting to populate the FS with a system and data?

Unless you are using stuff like QEMU or VirtualBox, you should probably have autodefrag and space_cache on from the very start.

> - Generally speaking, does LZO compression improve or degrade performance? I'm not able to figure it out clearly.

As long as your memory bandwidth is significantly higher than disk bandwidth (which is almost always the case, even with SSDs), this should provide at least some improvement with respect to IO involving large files. Because you are using a traditional hard disk instead of an SSD, you might get better performance using zlib (assuming you don't mind slightly higher processor usage for IO to files larger than the leafsize). If you care less about disk utilization than you do about performance, you might want to use compress_force instead of compress, as the performance boost comes from not having to write as much data to disk.

> TIA for the insight.
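The compression argument above can be put into rough numbers. The sketch below is a back-of-the-envelope model, not a benchmark: `write_seconds` is a hypothetical helper, and the throughput and ratio figures in the usage lines are made-up illustrative values, not measurements of LZO, zlib or any real disk.

```python
def write_seconds(size_mb, disk_mb_s, compress_mb_s=None, ratio=1.0):
    """Estimate the seconds needed to write size_mb of data.

    Serial model: compress everything first (if compression is on), then
    write the compressed bytes. Real systems overlap the two steps, so
    this estimate is pessimistic for the compressed case.
    """
    cpu_time = size_mb / compress_mb_s if compress_mb_s else 0.0
    disk_time = (size_mb / ratio) / disk_mb_s
    return cpu_time + disk_time


if __name__ == "__main__":
    # Illustrative assumptions only: an 80 MB/s disk, a 300 MB/s
    # compressor, and data that shrinks to half its size.
    plain = write_seconds(1000, disk_mb_s=80)
    packed = write_seconds(1000, disk_mb_s=80, compress_mb_s=300, ratio=2.0)
    print("uncompressed: %.1f s, compressed: %.1f s" % (plain, packed))
```

Under these assumptions the compressed write wins because the disk term halves while the CPU term stays small. With incompressible data (ratio 1.0) the same model shows compression only adding time.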
Re: [PATCH 1/2 v2] btrfs: btrfs_rm_device() should zero mirror SB as well
On Mon, Mar 31, 2014 at 10:13:56PM +0800, Anand Jain wrote:
> From: Anand Jain anand.j...@oracle.com
>
> This fix will ensure all SB copies on the disk are zeroed when the disk is intentionally removed. This helps to better manage disks in the user land.
>
> Signed-off-by: Anand Jain anand.j...@oracle.com
>
> btrfs: don't double brelse on device rm
>
> Device removal currently causes bdev removal to try to double free a bh in the bdev:
>
> [ 55.714833] WARNING: at fs/buffer.c:1160 __brelse+0x36/0x40()
> [ 55.714833] VFS: brelse: Trying to free free buffer
>
> Commit 7e3d9ebb1 added a double release of the bh for a device being removed when all the supers don't fit in the device. In that case it releases the bh assuming that it's going to read a new one, finds that it won't read, and goes to a label that releases the bh again.
>
> All it needed to do was only brelse() right before overwriting the current bh with __bread().
>
> Signed-off-by: Zach Brown z...@redhat.com

This is a bit confusing: two changelogs, one patch, and the referenced commit id does not in fact exist. To keep all due credits, 2 patches would make sense but ... up to you.
Re: BTRFS setup advice for laptop performance ?
On Friday 4 April 2014 08:33:10 Austin S Hemmelgarn wrote:
> > However I'm still concerned with chronic BTRFS dreadful performance and still find that BTRFS degrades much over time even with periodic defrag and best practices etc.
>
> I keep hearing this from people, but I personally don't see this to be the case at all. I'm pretty sure the 'big' performance degradation that people are seeing is due to how they are using snapshots, not a result of using BTRFS itself (I don't use them for anything other than ensuring a stable system image for rsync and/or tar based backups).

Maybe I was wrong to suppose that if a feature exists, it is supposed to be usable...

I have used ZFS for years, and on ZFS having *hundreds* of snapshots of any given FS has exactly zero impact on performance...

With BTRFS, some time ago I tried to use SuSE snapper, which spends its time creating and releasing snapshots, but it soon made my systems unusable...

Now, I only keep 2-3 manually made snapshots, just for keeping a stable and OK archive of my machine in a known state just in case...

But if even this has a noticeable negative impact on BTRFS performance, then what the hell are BTRFS snapshots good at??

Kind regards.

--
Swâmi Petaramesh sw...@petaramesh.org http://petaramesh.org PGP 9076E32E
Re: [ANNOUNCE] xfstests: updated to cf1ed54
On Fri, Apr 4, 2014 at 10:03 AM, Dave Chinner da...@fromorbit.com wrote:
> Hi folks,
>
> The xfstests repository at git://oss.sgi.com/xfs/cmds/xfstests has just been updated.
>
> Patches often get missed, so please check if your outstanding patches were in this update. If they have not been in this update, please resubmit them to x...@oss.sgi.com so they can be picked up in the next update.
>
> The new head of the master branch is commit:
>
> cf1ed54 check: fix RESULT_BASE typo in check script
>
> The major new functionality worth mentioning in this update is the new config file format from Lukas. The existing format config files should continue to work without change, but the new format is much richer and allows specification of multiple different configurations to run tests under. Hence testing multiple mount and mkfs configurations becomes as simple as iterating the configurations in the config file.

Hi,

I might be missing something, but after checking out these changes, I am no longer able to run btrfs tests. Example:

  $ ./check btrfs/041
  common/config: Error: $SCRATCH_DEV should be unset when $SCRATCH_DEV_POOL is set
  Passed all 0 tests

  $ cat local.config
  export TEST_DEV=/dev/sdb
  export TEST_DIR=/home/fdmanana/btrfs-tests/dev
  export SCRATCH_MNT=/home/fdmanana/btrfs-tests/scratch_1
  export SCRATCH_DEV_POOL="/dev/sdc /dev/sdd"

I also checked that my shell environment didn't define/export SCRATCH_DEV. Going back to revision 3948694eb12e9699f558fab5e8169a8b090780d1, using the exact same config, it works.

Do I need to adjust something in my config or is it a regression?
thanks
Re: [PATCH RFC v2] Btrfs: device_list_add() should not update list when mounted
On Wed, Apr 02, 2014 at 05:48:21PM +0800, Anand Jain wrote:
> Device list add shouldn't update the list when FS is mounted, unless the whole loop w.r.t. bringing back the missing disk is completed. (That is making it to be part of the group profile and the code for this isn't there yet).
>
> As of now (without this patch) when device is scanned with missing disk, it would update in the device list but the disk is left unused, it won't be opened by the btrfs and won't be part of the btrfs group profile operation.
>
> reproducer 1:
>   mkfs.btrfs -draid1 -mraid1 /dev/sdb /dev/sdc
>   modprobe -r btrfs
>   mount -o degraded /dev/sdb /btrfs
>   btrfs dev scan /dev/sdc
>
> use btrfs-devlist (or any of your method) to know that /dev/sdc isn't actually part of the above FS, (though now the missing disk path is updated in the btrfs_fs_devices) what's more btrfs kernel didn't open it at all.

Up to now, it works as expected. A device is permanently opened only via mount, 'dev scan' lets the kernel module know about the device, but does not keep it open.

> And so mkfs.ext4 can be successfully created on /dev/sdc, whereas it fails on /dev/sdb

This should not matter, the device /dev/sdc is not opened, one could overwrite the block device without restrictions. Besides, this passes because mkfs.ext4 does not check the device for existing filesystems. Try other mkfs':

  # mkfs.btrfs /dev/sda14
  /dev/sda14 appears to contain an existing filesystem (btrfs).
  Error: Use the -f option to force overwrite.

  # mkfs.xfs /dev/sda14
  mkfs.xfs: /dev/sda14 appears to contain an existing filesystem (btrfs).
  mkfs.xfs: Use the -f option to force overwrite.

> reproducer 2:
> disappear a disk then replace (RAID1) the disappeared disk and then make the disappeared disk to reappear.
>
>   mkfs.btrfs -f -m raid1 -d raid1 /dev/sdc /dev/sdd
>   mount /dev/sdc /btrfs
>   dd if=/dev/zero of=/btrfs/tf1 count=1
>   btrfs fi sync /btrfs
>
> devmgt[1] will help to attach or detach a disk easily
>   devmgt show
>   devmgt detach /dev/sdc
>
> btrfs still unaware of device missing (but that's not the point)

Because it prints the device state cached in memory, there's no validation step, and no way as of now to tell btrfs the device is gone.

>   btrfs fi show -m
>   Label: none uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
>     Total devices 2 FS bytes used 32.00KiB
>     devid 1 size 958.94MiB used 115.88MiB path /dev/sdc <--
>     devid 2 size 958.94MiB used 103.88MiB path /dev/sdd
>
>   btrfs rep start -f 1 /dev/sde /btrfs
>   Label: none uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
>     Total devices 2 FS bytes used 32.00KiB
>     devid 1 size 958.94MiB used 115.88MiB path /dev/sde
>     devid 2 size 958.94MiB used 103.88MiB path /dev/sdd
>
> so far good. now missing /dev/sdc comes back.
>   devmgt attach host2
>
> btrfs fi show -m shows sdc
>   Label: none uuid: 5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120
>     Total devices 2 FS bytes used 32.00KiB
>     devid 1 size 958.94MiB used 115.88MiB path /dev/sdc <-- Wrong.
>     devid 2 size 958.94MiB used 103.88MiB path /dev/sdd
>
> this is wrong it should be sde.

Agreed, this is a problem.

> this happened because when disk comes back device_list_add() is called which would invariably replace the existing disk with the given disk with the same fsid/devid. But the actual IO is still going to sde not to sdc.

Yes, because it's the 'sde' device that has been open and used, so it's possible that 'btrfs fi show' wrongly identifies the device.

> Further when we start fresh with (modprobe -r btrfs) unless it is carefully managed using btrfs dev scan <dev> it may pair with wrong disk (there will be a new patch to fix this).

IOW, there is more than one device with the same fsid/id. The same as if we do 'dd'. This is a known problem, and you demonstrated another way to get to it. IIRC, there's no way for the kernel module to know which of the devices with the same fsid/id is the one desired to mount. The workaround is "don't do that".
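The ambiguity described above, several block devices carrying the same fsid/devid pair, is easy to detect from userspace once scan results are available. A minimal sketch follows: `find_ambiguous_devices` is a hypothetical helper, and the input tuples are typed in by hand from the reproducer rather than read from any real btrfs interface.

```python
from collections import defaultdict


def find_ambiguous_devices(scanned):
    """Group scanned devices by (fsid, devid) and return the clashes.

    scanned is an iterable of (path, fsid, devid) tuples. The result maps
    each (fsid, devid) pair claimed by more than one path to those paths.
    """
    paths_by_id = defaultdict(list)
    for path, fsid, devid in scanned:
        paths_by_id[(fsid, devid)].append(path)
    return {key: paths for key, paths in paths_by_id.items() if len(paths) > 1}


if __name__ == "__main__":
    fsid = "5dc0aaf4-4683-4050-b2d6-5ebe5f5cd120"
    # State after reproducer 2: sde replaced sdc as devid 1, then the
    # stale sdc reappeared with its old superblock.
    scanned = [
        ("/dev/sdc", fsid, 1),
        ("/dev/sdd", fsid, 2),
        ("/dev/sde", fsid, 1),
    ]
    for (fs, devid), paths in find_ambiguous_devices(scanned).items():
        print("devid %d of %s claimed by: %s" % (devid, fs, ", ".join(paths)))
```

Detection is the easy part; as the reply notes, nothing in the superblocks alone tells the kernel which of the clashing devices is the intended one.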
Re: [PATCH] btrfs: fix reversed warning condition in btrfs_delayed_inode_reserve_metadata
On Fri, Apr 04, 2014 at 11:03:16AM +0800, Liu Bo wrote:
> On Thu, Apr 03, 2014 at 06:18:40PM +0200, David Sterba wrote:
> > On Thu, Apr 03, 2014 at 01:34:23PM +0800, Liu Bo wrote:
> > > On Wed, Apr 02, 2014 at 07:13:00PM +0200, David Sterba wrote:
> > > > Commit fae7f21cece9a4c181 (btrfs: Use WARN_ON()'s return value in place of WARN_ON(1)) cleaned up WARN_ON usage and in one place reversed the condition that led to loads of warnings that were not supposed to occur. WARN_ON will trigger because it sees 'ret' though in the previous code did not reach the WARN_ON below.
> > > >
> > > > The correct pattern is
> > > >
> > > > - if (condition)
> > > > + if (WARN_ON(condition))
> > > >
> > > > CC: Dulshani Gunawardhana dulshani.gunawardhan...@gmail.com
> > > > CC: sta...@vger.kernel.org # 3.13
> > > > Reported-by: Liu Bo bo.li@oracle.com
> > > > Signed-off-by: David Sterba dste...@suse.cz
> > > > ---
> > > >  fs/btrfs/delayed-inode.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
> > > > index 451b00c86f6c..098af20abd88 100644
> > > > --- a/fs/btrfs/delayed-inode.c
> > > > +++ b/fs/btrfs/delayed-inode.c
> > > > @@ -649,7 +649,7 @@ static int btrfs_delayed_inode_reserve_metadata(
> > > >  		goto out;
> > > >  	ret = btrfs_block_rsv_migrate(src_rsv, dst_rsv, num_bytes);
> > > > -	if (!WARN_ON(ret))
> > > > +	if (WARN_ON(!ret))
> > > >  		goto out;
> > >
> > > Oh sorry, I'd have to get my Reviewed-by back and give a NACK instead.
> > >
> > > With this patch, (ret == 0) triggers the WARNING, which is not right.
> >
> > Thanks for catching this, you're right, my patch was wrong. I must say the patch (fae7f21ce) made the code harder to read at some places, I don't see much help in removing plain WARN_ON(1) at this cost.
>
> I agree, I prefer the original code which is easier to understand,
>
>   if (!ret)
>           goto out;
>   WARN_ON(1);
>
> > Back to the warning flood you observed, the comment under the warning says:
> >
> >   /*
> >    * Ok this is a problem, let's just steal from the global rsv
> >    * since this really shouldn't happen that often.
> >    */
> >   ret = btrfs_block_rsv_migrate(&root->fs_info->global_block_rsv,
> >                                 dst_rsv, num_bytes);
> >
> > so the question is why it does happen so often. A WARN_ON_ONCE hides the severity of the problem, so I'd rather suggest to put it under enospc_debug option so we can debug it and it does not bother users.
> >
> > As this is closer to the way you were going to fix that, I'm not sending a patch, take this as a review comment.
>
> The comment was based on some assumptions which could be wrong according to my observation.

Then the question is if the WARN_ON points to a problem or not.
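The three variants discussed in this thread are easy to confuse, so here is a tiny Python model of them. It is a sketch only: `warn_on` merely imitates the property of the kernel macro that matters here (it warns when its argument is true and returns that truth value), and the function names and return labels are invented for illustration.

```python
ENOSPC = -28  # stand-in for a failed btrfs_block_rsv_migrate()


def run(variant, ret):
    """Return (path_taken, number_of_warnings) for one migrate() result."""
    warnings = []

    def warn_on(condition):
        # Model of WARN_ON(): record a warning if the condition holds,
        # then hand the truth value back to the caller.
        if condition:
            warnings.append("WARNING")
        return bool(condition)

    return variant(ret, warn_on), len(warnings)


def original(ret, warn_on):
    # if (!ret) goto out; WARN_ON(1);
    if not ret:
        return "out"
    warn_on(True)
    return "steal_from_global_rsv"


def after_cleanup(ret, warn_on):
    # if (!WARN_ON(ret)) goto out;   (the form left by commit fae7f21ce)
    if not warn_on(ret):
        return "out"
    return "steal_from_global_rsv"


def proposed_patch(ret, warn_on):
    # if (WARN_ON(!ret)) goto out;   (the patch that was NACKed)
    if warn_on(not ret):
        return "out"
    return "steal_from_global_rsv"


if __name__ == "__main__":
    for variant in (original, after_cleanup, proposed_patch):
        for ret in (0, ENOSPC):
            print(variant.__name__, ret, run(variant, ret))
```

In this model the original code and the cleaned-up form behave identically, while the proposed patch warns on success and stays silent on failure, which matches the NACK above.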
Re: [PATCH] Btrfs: fix unlock in __start_delalloc_inodes()
On Wed, Apr 02, 2014 at 07:53:32PM +0800, Wang Shilong wrote:
> This patch fixes a regression caused by the following patch:
>
>   Btrfs: don't flush all delalloc inodes when we doesn't get s_umount lock
>
> break while loop will make us call @spin_unlock() without calling @spin_lock() before, fix it.
>
> Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com

Reviewed-by: David Sterba dste...@suse.cz
Re: [PATCH] Btrfs: implement inode_operations callback tmpfile
On 04/04/2014 09:59 AM, David Sterba wrote:
> On Tue, Apr 01, 2014 at 11:53:19PM +0100, Filipe David Borba Manana wrote:
> > This implements the tmpfile callback of struct inode_operations, introduced in the linux kernel 3.11 [1], and implemented already by some filesystems.
>
> Nice! Btw, would be good to mention 'O_TMPFILE' at least in the changelog.
>
> > Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
>
> Reviewed-by: David Sterba dste...@suse.cz
>
> CC: linuxpatc...@star.c10r.facebook.com
>
> how many likes so far?

He got a Nice from me ;) That's for when a like alone won't do.

It doesn't look like xfstests has an O_TMPFILE test?

-chris
Re: [PATCH] Btrfs: implement inode_operations callback tmpfile
On Fri, Apr 4, 2014 at 3:12 PM, Chris Mason c...@fb.com wrote:
> On 04/04/2014 09:59 AM, David Sterba wrote:
> > On Tue, Apr 01, 2014 at 11:53:19PM +0100, Filipe David Borba Manana wrote:
> > > This implements the tmpfile callback of struct inode_operations, introduced in the linux kernel 3.11 [1], and implemented already by some filesystems.
> >
> > Nice! Btw, would be good to mention 'O_TMPFILE' at least in the changelog.
> >
> > > Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
> >
> > Reviewed-by: David Sterba dste...@suse.cz
> >
> > CC: linuxpatc...@star.c10r.facebook.com
> >
> > how many likes so far?
>
> He got a Nice from me ;) That's for when a like alone won't do.
>
> It doesn't look like xfstests has an O_TMPFILE test?

It has now, if you look at Dave's latest xfstests update e-mail he sent earlier today.

For reference, I used this for testing: https://friendpaste.com/4n46iUoo4ZdqTAdkPntqYe (plus scrub, send, btrfsck, etc)

> -chris

--
Filipe David Manana,

"Reasonable men adapt themselves to the world. Unreasonable men adapt the world to themselves. That's why all progress depends on unreasonable men."
[RFC PATCH] Btrfs: send, add calculate data size flag to allow for progress estimation
This new send flag makes send first calculate the amount of new file data (in bytes) the send root has relative to the parent root, or for the case of a non-incremental send, the total amount of file data we will send through the send stream. In other words, it computes the sum of the lengths of all write and clone operations that will be sent through the send stream.

This data size value is sent in a new command, named BTRFS_SEND_C_TOTAL_DATA_SIZE, that immediately follows a BTRFS_SEND_C_SUBVOL or BTRFS_SEND_C_SNAPSHOT command, and precedes any command that changes a file or the filesystem hierarchy. Upon receiving a write or clone command, the receiving end can increment a counter by the data length of that command and therefore report progress by comparing the counter's value with the data size value received in the BTRFS_SEND_C_TOTAL_DATA_SIZE command.

The approach is simple: before the normal operation of send, do a scan in the file system tree for new inodes and file extent items, just like in send's normal operation, and keep incrementing a counter with new inodes' size and the size of file extents that are going to be written or cloned. This is actually a simpler and more lightweight tree scan/processing than the one we do when sending the changes, as it doesn't process inode references nor does any lookups in the extent tree for example.

After modifying btrfs-progs to understand this new command and report progress, here's an example (the -o flag tells btrfs send to pass the new flag to the kernel's send ioctl):

  $ btrfs send -o /mnt/sdd/base | btrfs receive /mnt/sdc
  At subvol /mnt/sdd/base
  At subvol base
  About to receive 9211507211 bytes
  Subvolume/snapshot /mnt/sdc//base, progress 24.73%, 2278015008 bytes received (9211507211 total bytes)

  $ btrfs send -o -p /mnt/sdd/base /mnt/sdd/incr | btrfs receive /mnt/sdc
  At subvol /mnt/sdd/incr
  At snapshot incr
  About to receive 9211747739 bytes
  Subvolume/snapshot /mnt/sdc//incr, progress 63.42%, 5843024211 bytes received (9211747739 total bytes)

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 fs/btrfs/send.c            | 194 +
 fs/btrfs/send.h            | 1 +
 include/uapi/linux/btrfs.h | 13 ++-
 3 files changed, 175 insertions(+), 33 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index c81e0d9..fa378c7 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -81,7 +81,13 @@ struct clone_root {
 #define SEND_CTX_MAX_NAME_CACHE_SIZE 128
 #define SEND_CTX_NAME_CACHE_CLEAN_SIZE (SEND_CTX_MAX_NAME_CACHE_SIZE * 2)
 
+enum btrfs_send_phase {
+	SEND_PHASE_STREAM_CHANGES,
+	SEND_PHASE_COMPUTE_DATA_SIZE,
+};
+
 struct send_ctx {
+	enum btrfs_send_phase phase;
 	struct file *send_filp;
 	loff_t send_off;
 	char *send_buf;
@@ -116,6 +122,7 @@ struct send_ctx {
 	u64 cur_inode_last_extent;
 
 	u64 send_progress;
+	u64 total_data_size;
 
 	struct list_head new_refs;
 	struct list_head deleted_refs;
@@ -687,6 +694,8 @@ static int send_rename(struct send_ctx *sctx,
 {
 	int ret;
 
+	ASSERT(sctx->phase != SEND_PHASE_COMPUTE_DATA_SIZE);
+
 	verbose_printk("btrfs: send_rename %s -> %s\n", from->start, to->start);
 
 	ret = begin_cmd(sctx, BTRFS_SEND_C_RENAME);
@@ -711,6 +720,8 @@ static int send_link(struct send_ctx *sctx,
 {
 	int ret;
 
+	ASSERT(sctx->phase != SEND_PHASE_COMPUTE_DATA_SIZE);
+
 	verbose_printk("btrfs: send_link %s -> %s\n", path->start, lnk->start);
 
 	ret = begin_cmd(sctx, BTRFS_SEND_C_LINK);
@@ -734,6 +745,8 @@ static int send_unlink(struct send_ctx *sctx, struct fs_path *path)
 {
 	int ret;
 
+	ASSERT(sctx->phase != SEND_PHASE_COMPUTE_DATA_SIZE);
+
 	verbose_printk("btrfs: send_unlink %s\n", path->start);
 
 	ret = begin_cmd(sctx, BTRFS_SEND_C_UNLINK);
@@ -756,6 +769,8 @@ static int send_rmdir(struct send_ctx *sctx, struct fs_path *path)
 {
 	int ret;
 
+	ASSERT(sctx->phase != SEND_PHASE_COMPUTE_DATA_SIZE);
+
 	verbose_printk("btrfs: send_rmdir %s\n", path->start);
 
 	ret = begin_cmd(sctx, BTRFS_SEND_C_RMDIR);
@@ -2286,6 +2301,9 @@ static int send_truncate(struct send_ctx *sctx, u64 ino, u64 gen, u64 size)
 	int ret = 0;
 	struct fs_path *p;
 
+	if (sctx->phase == SEND_PHASE_COMPUTE_DATA_SIZE)
+		return 0;
+
 	verbose_printk("btrfs: send_truncate %llu size=%llu\n", ino, size);
 
 	p = fs_path_alloc();
@@ -2315,6 +2333,8 @@ static int send_chmod(struct send_ctx *sctx, u64 ino, u64 gen, u64 mode)
 	int ret = 0;
 	struct fs_path *p;
 
+	ASSERT(sctx->phase != SEND_PHASE_COMPUTE_DATA_SIZE);
+
 	verbose_printk("btrfs: send_chmod %llu mode=%llu\n", ino, mode);
 
 	p = fs_path_alloc();
@@ -2344,6 +2364,8 @@ static int send_chown(struct send_ctx *sctx, u64 ino, u64 gen, u64 uid, u64 gid)
 	int ret = 0;
 	struct fs_path *p;
 
+
[RFC PATCH] Btrfs-progs: send, calculate and report progress based on the new flag
This is a followup to the kernel patch titled:

    Btrfs: send, add calculate data size flag to allow for progress estimation

This makes the btrfs send and receive commands aware of the new send command, named BTRFS_SEND_C_TOTAL_DATA_SIZE, which tells us the amount of file data that is new between the parent and send snapshots/roots. As this command immediately follows the commands to start a snapshot/subvolume, it can be used to compute and report progress, by keeping a counter that is incremented with the data length of each write or clone command that is received from the stream.

Example:

  $ btrfs send -o /mnt/sdd/base | btrfs receive /mnt/sdc
  At subvol /mnt/sdd/base
  At subvol base
  About to receive 9211507211 bytes
  Subvolume/snapshot /mnt/sdc//base, progress 24.73%, 2278015008 bytes received (9211507211 total bytes)

  $ btrfs send -o -p /mnt/sdd/base /mnt/sdd/incr | btrfs receive /mnt/sdc
  At subvol /mnt/sdd/incr
  At snapshot incr
  About to receive 9211747739 bytes
  Subvolume/snapshot /mnt/sdc//incr, progress 63.42%, 5843024211 bytes received (9211747739 total bytes)

At the moment progress is only reported by btrfs-receive, but it is possible and simple to do it for btrfs-send too, so that we can get a progress report when not piping btrfs-send output to btrfs-receive (for example, when writing directly to a file).

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 cmds-receive.c | 49 +
 cmds-send.c    | 14 --
 ioctl.h        |  7 +++
 send-stream.c  |  4
 send-stream.h  |  1 +
 send.h         |  1 +
 6 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index d6cd3da..5de24c3 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -71,6 +71,11 @@ struct btrfs_receive
 	struct subvol_uuid_search sus;
 
 	int honor_end_cmd;
+
+	/* For the subvolume/snapshot we're currently receiving. */
+	u64 total_data_size;
+	u64 bytes_received;
+	float progress;
 };
 
 static int finish_subvol(struct btrfs_receive *r)
@@ -156,6 +161,9 @@ static int process_subvol(const char *path, const u8 *uuid, u64 ctransid,
 		goto out;
 
 	r->cur_subvol = calloc(1, sizeof(*r->cur_subvol));
+	r->total_data_size = 0;
+	r->bytes_received = 0;
+	r->progress = 0.0;
 
 	if (strlen(r->dest_dir_path) == 0)
 		r->cur_subvol->path = strdup(path);
@@ -205,6 +213,9 @@ static int process_snapshot(const char *path, const u8 *uuid, u64 ctransid,
 		goto out;
 
 	r->cur_subvol = calloc(1, sizeof(*r->cur_subvol));
+	r->total_data_size = 0;
+	r->bytes_received = 0;
+	r->progress = 0.0;
 
 	if (strlen(r->dest_dir_path) == 0)
 		r->cur_subvol->path = strdup(path);
@@ -287,6 +298,41 @@ out:
 	return ret;
 }
 
+static int process_total_data_size(u64 size, void *user)
+{
+	struct btrfs_receive *r = user;
+
+	r->total_data_size = size;
+
+	fprintf(stdout, "About to receive %llu bytes\n", size);
+
+	return 0;
+}
+
+static void update_progress(struct btrfs_receive *r, u64 bytes)
+{
+	float new_progress;
+
+	if (r->total_data_size == 0)
+		return;
+
+	r->bytes_received += bytes;
+	new_progress = ((float)r->bytes_received / r->total_data_size) * 100.0;
+
+	if ((int)(new_progress * 100) > (int)(r->progress * 100) ||
+	    r->bytes_received == r->total_data_size)
+		fprintf(stdout,
+			"%sSubvolume/snapshot %s, progress %6.2f%%, %llu bytes received (%llu total bytes)%s",
+			(g_verbose ? "" : "\r"),
+			r->full_subvol_path, new_progress,
+			r->bytes_received, r->total_data_size,
+			(g_verbose ? "\n" : ""));
+	if (r->bytes_received == r->total_data_size)
+		fprintf(stdout, "\n");
+
+	r->progress = new_progress;
+}
+
 static int process_mkfile(const char *path, void *user)
 {
 	int ret;
@@ -562,6 +608,7 @@ static int process_write(const char *path, const void *data, u64 offset,
 		}
 		pos += w;
 	}
+	update_progress(r, len);
 
 out:
 	free(full_path);
@@ -638,6 +685,7 @@ static int process_clone(const char *path, u64 offset, u64 len,
 				path, strerror(-ret));
 		goto out;
 	}
+	update_progress(r, len);
 
 out:
 	if (si) {
@@ -819,6 +867,7 @@ static struct btrfs_send_ops send_ops = {
 	.chmod = process_chmod,
 	.chown = process_chown,
 	.utimes = process_utimes,
+	.total_data_size = process_total_data_size
 };
 
 static int do_receive(struct btrfs_receive *r, const char *tomnt, int r_fd)
diff --git a/cmds-send.c b/cmds-send.c
index dcb6607..e207ed3 100644
--- a/cmds-send.c
+++ b/cmds-send.c
@@ -45,6 +45,7 @@
 #include "send-utils.h"
 
 static int g_verbose = 0;
+static int
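For readers who want to try the receive-side accounting outside of btrfs-progs, here is a minimal standalone sketch of the same idea. It is not the patch's code: `struct recv_progress`, `progress_centipercent()` and the return-value convention are hypothetical names made up for this illustration, and it deliberately swaps the patch's `float` percentage for integer hundredths of a percent, on the assumption that exact integer arithmetic is preferable for multi-gigabyte byte counts.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the progress fields added to struct btrfs_receive. */
struct recv_progress {
	uint64_t total_data_size; /* value carried by the total-data-size command */
	uint64_t bytes_received;  /* sum of write/clone lengths seen so far */
	unsigned last_reported;   /* last reported progress, in 1/100 of a percent */
};

/*
 * Progress in hundredths of a percent (2473 means 24.73%).
 * Assumption: bytes_received * 10000 fits in 64 bits, i.e. the stream
 * is smaller than roughly 1.8 PB.
 */
static unsigned progress_centipercent(const struct recv_progress *p)
{
	assert(p->total_data_size != 0);
	return (unsigned)(p->bytes_received * 10000 / p->total_data_size);
}

/*
 * Account for one write or clone command carrying len bytes.
 * Returns 1 if the two-decimal percentage advanced or the stream is
 * complete (the caller should redraw the progress line), else 0.
 */
static int update_progress(struct recv_progress *p, uint64_t len)
{
	unsigned now;

	if (p->total_data_size == 0)
		return 0; /* the sender did not announce a total */

	p->bytes_received += len;
	now = progress_centipercent(p);
	if (now > p->last_reported ||
	    p->bytes_received == p->total_data_size) {
		p->last_reported = now;
		return 1;
	}
	return 0;
}

/* Print one progress line in the style of the example output above. */
static void print_progress(const struct recv_progress *p, const char *subvol)
{
	printf("Subvolume/snapshot %s, progress %u.%02u%%, %llu bytes received (%llu total bytes)\n",
	       subvol, p->last_reported / 100, p->last_reported % 100,
	       (unsigned long long)p->bytes_received,
	       (unsigned long long)p->total_data_size);
}
```

A caller would invoke `update_progress()` once per write or clone command and call `print_progress()` only when it returns 1, which keeps terminal output down to at most one line per 0.01% step.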
Re: determining snapshot size - adding work to do info to btrfs send
On Mon, Mar 31, 2014 at 5:35 PM, Marc MERLIN m...@merlins.org wrote:
> On Sat, Mar 29, 2014 at 05:21:23PM -0700, Marc MERLIN wrote:
>> I had a look at
>> http://bj0z.wordpress.com/2011/04/27/determining-snapshot-size-in-btrfs/#comment-35
>> but it's quite old and does not work anymore since userland became incompatible with it.
>>
>> Has anyone seen something newer or have a newer fixed version of this?
>
> While watching a btrfs send|receive going for hours, keeping the backup disk array spinning in my living room, and wondering how far it is from being done, I was thinking:
>
> Would it be reasonably simple for btrfs send -p to have a few more features?
>
> 1) don't read all the data from disk, just read the metadata and tell me how many megabytes it will take to send. I can do this with btrfs send | wc -c I believe, but it would be better if it could do this without reading all the data blocks to send when I only care about the byte count.
> In turn this could be used to easily compute snapshot size diffs, at least from one another.

Totally agree. A progress indicator makes the feature much more user friendly. In fact this is something I had been thinking about for a while, but I didn't start the implementation until a couple of days ago, after your mail.

Here are a couple of RFC patches (kernel and btrfs-progs) with a prototype:

https://patchwork.kernel.org/patch/3938801/
https://patchwork.kernel.org/patch/3938811/

> 2) output a list of files added/changed/removed, maybe with how much data is related to each. Arguably this would supersede 1) above even if it would be a little bit more work to do

That would be cool I think. Most of what is needed is already implemented in the send code. You can get it today, in a not very user-friendly way, by passing the flag BTRFS_SEND_FLAG_NO_FILE_DATA to the send ioctl (which sends no file data back, only the offset and length of each new/modified extent) and passing -vvv to btrfs receive.

> 3) when a real btrfs send is running, just like dd, I could send it a USR1 signal and it would output some kind of progress report.
>
> The Ted Ts'o motto (used for e2fsck) is "everything with a progress bar is faster" :)
>
> Note, the progress wouldn't have to be perfect, it could be by number of blocks, number of files, anything reasonably easy to base it on.
>
> Does that sound reasonable?

It does.

> Thanks,
> Marc
> --
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
> Microsoft is to operating systems what McDonalds is to gourmet cooking
> Home page: http://marc.merlins.org/ | PGP 1024R/763BE901

-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
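Regarding point 3), the dd-style "report on USR1" behaviour can be sketched in userspace roughly as follows. This is only an illustration of the general POSIX pattern, not something btrfs send implements: `install_progress_signal()` and `maybe_report()` are hypothetical helper names, and the sketch assumes the sending loop can call `maybe_report()` between chunks of work.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t report_requested;

static void on_usr1(int signo)
{
	(void)signo;
	report_requested = 1; /* only set a flag; printing happens outside the handler */
}

/* Install the SIGUSR1 handler. Returns 0 on success, -1 on failure. */
static int install_progress_signal(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_usr1;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART; /* do not make blocking reads/writes fail with EINTR */
	return sigaction(SIGUSR1, &sa, NULL);
}

/*
 * Call between chunks of work. If a report was requested since the last
 * call, print one line to out and return 1; otherwise return 0.
 */
static int maybe_report(FILE *out, uint64_t bytes_done)
{
	assert(out != NULL);
	if (!report_requested)
		return 0;
	report_requested = 0;
	fprintf(out, "%llu bytes sent so far\n", (unsigned long long)bytes_done);
	return 1;
}
```

With something like this in place, `kill -USR1 <pid>` from another terminal would make the next `maybe_report()` call print a line. Doing the printing outside the handler avoids calling non-async-signal-safe functions such as `fprintf()` from signal context.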
[PATCH] btrfs-progs: add separate make target to clean docs
Regenerating the asciidoc takes much longer now and makes quick build tests long. There's a separate clean-doc target for that, and a clean-all target that cleans both docs and sources.

Signed-off-by: David Sterba dste...@suse.cz
---
This applies on top of the new asciidoc patches and makes frequent build tests more friendly

 Makefile | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 70eaf5765dd8..f2f2579ed9c6 100644
--- a/Makefile
+++ b/Makefile
@@ -56,9 +56,9 @@ btrfs_convert_libs = -lext2fs -lcom_err
 btrfs_image_libs = -lpthread
 btrfs_fragment_libs = -lgd -lpng -ljpeg -lfreetype
 
-SUBDIRS = Documentation
-BUILDDIRS = $(patsubst %,build-%,$(SUBDIRS))
-INSTALLDIRS = $(patsubst %,install-%,$(SUBDIRS))
+SUBDIRS =
+BUILDDIRS = $(patsubst %,build-%,$(SUBDIRS)) Documentation
+INSTALLDIRS = $(patsubst %,install-%,$(SUBDIRS)) Documentation
 CLEANDIRS = $(patsubst %,clean-%,$(SUBDIRS))
 
 .PHONY: $(SUBDIRS)
@@ -120,7 +120,7 @@ endif
 	@echo "    [CC]     $@"
 	$(Q)$(CC) $(AM_CFLAGS) $(STATIC_CFLAGS) -c $< -o $@
 
-all: $(progs) manpages $(BUILDDIRS)
+all: $(progs) $(BUILDDIRS)
 $(SUBDIRS): $(BUILDDIRS)
 $(BUILDDIRS):
 	@echo "Making all in $(patsubst build-%,%,$@)"
@@ -223,6 +223,8 @@ send-test: $(objects) $(libs) send-test.o
 manpages:
 	$(Q)$(MAKE) $(MAKEOPTS) -C Documentation
 
+clean-all: clean-doc clean
+
 clean: $(CLEANDIRS)
 	@echo "Cleaning"
 	$(Q)rm -f $(progs) cscope.out *.o *.o.d btrfs-convert btrfs-image btrfs-select-super \
@@ -231,6 +233,10 @@ clean: $(CLEANDIRS)
 	      version.h $(check_defs) \
 	      $(libs) $(lib_links)
 
+clean-doc:
+	@echo "Cleaning Documentation"
+	$(Q)$(MAKE) $(MAKEOPTS) -C Documentation clean
+
 $(CLEANDIRS):
 	@echo "Cleaning $(patsubst clean-%,%,$@)"
 	$(Q)$(MAKE) $(MAKEOPTS) -C $(patsubst clean-%,%,$@) clean
-- 
1.9.0
Re: [RFC PATCH] Btrfs: send, add calculate data size flag to allow for progress estimation
On 4/4/2014 6:20 PM, Filipe David Borba Manana wrote:
> This new send flag makes send first calculate the amount of new file data (in bytes) the send root has relative to the parent root or, for a non-incremental send, the total amount of file data we will send through the send stream. In other words, it computes the sum of the lengths of all write and clone operations that will be sent through the send stream.
>
> This data size value is sent in a new command, named BTRFS_SEND_C_TOTAL_DATA_SIZE, that immediately follows a BTRFS_SEND_C_SUBVOL or BTRFS_SEND_C_SNAPSHOT command, and precedes any command that changes a file or the filesystem hierarchy. Upon receiving a write or clone command, the receiving end can increment a counter by the data length of that command and therefore report progress by comparing the counter's value with the data size value received in the BTRFS_SEND_C_TOTAL_DATA_SIZE command.
>
> The approach is simple: before the normal operation of send, scan the file system tree for new inodes and file extent items, just like in send's normal operation, and keep incrementing a counter with the new inodes' sizes and the sizes of the file extents that are going to be written or cloned. This is actually a simpler and more lightweight tree scan/processing than the one we do when sending the changes, as it doesn't process inode references or do any lookups in the extent tree, for example.
>
> After modifying btrfs-progs to understand this new command and report progress, here's an example (the -o flag tells btrfs send to pass the new flag to the kernel's send ioctl):
>
>   $ btrfs send -o /mnt/sdd/base | btrfs receive /mnt/sdc
>   At subvol /mnt/sdd/base
>   At subvol base
>   About to receive 9211507211 bytes
>   Subvolume/snapshot /mnt/sdc//base, progress 24.73%, 2278015008 bytes received (9211507211 total bytes)
>
>   $ btrfs send -o -p /mnt/sdd/base /mnt/sdd/incr | btrfs receive /mnt/sdc
>   At subvol /mnt/sdd/incr
>   At snapshot incr
>   About to receive 9211747739 bytes
>   Subvolume/snapshot /mnt/sdc//incr, progress 63.42%, 5843024211 bytes received (9211747739 total bytes)

Hi, as a user of send I can say that this feature is very useful. Is it possible to add a current speed indication (MB/sec)?
Re: [PATCH 3/4][RFC] btrfs: export global block reserve size as space_info
On 2/10/14, 12:03 PM, Chris Mason wrote:
> On 02/07/2014 08:34 AM, David Sterba wrote:
>> Introduce a block group type bit for a global reserve and fill the space info for SPACE_INFO ioctl. This should replace the newly added ioctl (01e219e8069516cdb98594d417b8bb8d906ed30d) to get just the 'size' part of the global reserve, while the actual usage can be now visible in the 'btrfs fi df' output during ENOSPC stress.
>>
>> The unpatched userspace tools will show the blockgroup as 'unknown'.
>
> This wasn't in my rc2 pull because I wanted to sync up with Jeff on it. I like the idea of combining this into SPACE_INFO, any objections?

Sorry, was on vacation when this went by and just got the ping. Yeah, I have no objections here.

-Jeff

-- 
Jeff Mahoney
SUSE Labs
Re: BTRFS setup advice for laptop performance ?
On Fri, Apr 04, 2014 at 10:02:27AM +0200, Swâmi Petaramesh wrote:
> Hi,
>
> I'm going to receive a new small laptop with a 500 GB 5400 RPM mechanical ole' rust HD, and I plan to install BTRFS on it.
>
> It will have a kernel 3.13 for now, until 3.14 gets released.
>
> However I'm still concerned with chronic BTRFS dreadful performance and still find that BTRFS degrades much over time even with periodic defrag and best practices etc.

There's something funny going on here. There are, apparently, a reasonable number of people using btrfs in daily use, with things like snapper (regular and frequent snapshots). I'm one of them, although I don't use snapper. We don't have lots of reports of massive slowdowns after a long period of use, so whatever you're doing, there seems to be something unusual involved. It's almost certainly not your fault, but there would appear to be something in your configuration or your use-case which is leading to these problems, and without knowing what's different, it's hard to set about identifying the problem.

What software do you run on the machine? Browser? Any databases? Anything that contains a database? Torrents or other filesharing software? Bitcoin mining? Bitcoin wallet? Anything else beyond the ordinary boring desktop/office type applications? Are you compiling lots of things (e.g. Gentoo)? Creating and deleting lots of files? If so, large ones or small ones? Are you running very close to a full filesystem?

How are you measuring the slowdown -- do you have a specific piece of benchmarking software, or just anecdotal evidence?

> So I'd like to start with the best possible options and have a few questions:
>
> - Is it still recommended to mkfs with a nodesize or leafsize different (bigger) than the default? I wouldn't like to lose too much disk space anyway (1/2 nodesize per file on average?), as it will be limited...

No, nodes are used for the metadata trees, not for file storage. I'd suggest nodesize=leafsize=16k or 32k. I don't think you can change the block size at the moment.

> - Is it recommended to alter the FS to have skinny extents? I've done this on all of my BTRFS machines without problem, still the kernel spits a notice at mount time, and I'm kind of worrying: why is the kernel warning me I have skinny extents? Is it bad? Is it something I should avoid?

As far as I know, they're considered safe and stable. I suspect that the message is just a developer info thing that hasn't been taken out yet.

> - Are there other optimization tricks I should perform at mkfs time because they can't be changed later on?

Nodesize/leafsize are the only things you should probably change at mkfs time. The other thing would be --mixed, but you probably don't want that on a 500 GiB drive.

> - Are there other btrfstune or mount options I should pass before starting to populate the FS with a system and data?

I think everything else other than the above can be done after the fact with btrfstune. I'd definitely suggest extended inode refs simply because it fixes a known limitation.

> - Generally speaking, does LZO compression improve or degrade performance? I'm not able to figure it out clearly.

Yes, it improves or degrades performance. :)

It'll depend entirely on what you're doing with it. If you're storing lots of zeroes (Phoronix, I'm looking at you), then you'll get huge speedups. If you're storing video data, you'll get a (very) slight performance drop as it compresses the first few blocks of the file and then gives up. I suspect that in general, the performance differences won't be noticeable unless you have highly compressible large files, but if you _really_ care about it, benchmark it(*).

Hugo.

(*) If you don't want to go through the effort of benchmarking, you don't care enough about it, and should just pick something at random.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- And what rough beast, its hour come round at last / slouches ---
               towards Bethlehem, to be born?
Re: BTRFS setup advice for laptop performance ?
On 2014-04-04 08:48, Swâmi Petaramesh wrote:
> On Friday 4 April 2014 08:33:10 Austin S Hemmelgarn wrote:
>>> However I'm still concerned with chronic BTRFS dreadful performance and still find that BTRFS degrades much over time even with periodic defrag and best practices etc.
>>
>> I keep hearing this from people, but I personally don't see this to be the case at all. I'm pretty sure the 'big' performance degradation that people are seeing is due to how they are using snapshots, not a result of using BTRFS itself (I don't use them for anything other than ensuring a stable system image for rsync and/or tar based backups).
>
> Maybe I was wrong to suppose that if a feature exists, it is supposed to be usable...
>
> I have used ZFS for years, and on ZFS having *hundreds* of snapshots of any given FS has exactly zero impact on performance...
>
> With BTRFS, some time ago I tried to use SuSE snapper, which spends its time taking and releasing snapshots, but it soon made my systems unusable...
>
> Now, I only keep 2-3 manually made snapshots just for keeping a stable and OK archive of my machine in a known state just in case...
>
> But if even this has a noticeable negative impact on BTRFS performance, then what the hell are BTRFS snapshots good for??
>
> Kind regards.

I'm not saying that using a few snapshots is a bad thing, I'm saying that thousands of snapshots is a bad thing (I have actually seen people with that many, including one individual who had almost 32,000 snapshots on the same drive). I personally do keep a few around on my system on a regular basis, even aside from the backups, and have no noticeable performance degradation.

For reference, the (main) system that I am using has an Intel Celeron 847 running at 1.1GHz, 4G of DDR3-1333 RAM, and a 500G 5400 RPM SATAII hard disk. My root filesystem is a BTRFS volume mounted with autodefrag,space_cache,compress-force=lzo,noatime (the noatime improves performance (and power efficiency) for btrfs because metadata updates end up cascading up the metadata tree (updating the atime on /etc/foo/bar causes the atime to be updated on /etc/foo, which causes the atime to be updated on /etc, which causes the atime to be updated on /)).
Re: [RFC PATCH] Btrfs: send, add calculate data size flag to allow for progress estimation
On Fri, Apr 04, 2014 at 04:20:41PM +0100, Filipe David Borba Manana wrote:
> @@ -4307,6 +4348,22 @@ out:
>  	return num_read;
>  }
>  
> +static int send_total_data_size(struct send_ctx *sctx, u64 data_size)
> +{
> +	int ret;
> +
> +	ret = begin_cmd(sctx, BTRFS_SEND_C_TOTAL_DATA_SIZE);
> +	if (ret < 0)
> +		goto out;
> +
> +	TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, data_size);
> +	ret = send_cmd(sctx);
> +
> +tlv_put_failure:
> +out:
> +	return ret;
> +}
> +
>  /*
>   * Send a clone command to user space.
>   */
> --- a/fs/btrfs/send.h
> +++ b/fs/btrfs/send.h
> @@ -87,6 +87,7 @@ enum btrfs_send_cmd {
>  	BTRFS_SEND_C_END,
>  	BTRFS_SEND_C_UPDATE_EXTENT,
> +	BTRFS_SEND_C_TOTAL_DATA_SIZE,

Though it is a simple modification, it changes the existing send protocol. The unpatched receiver would fail (it has a lower value of __BTRFS_SEND_C_MAX), so it could work even without revving the protocol. But, it's not the cleanest way.

There's a number of deficiencies found in v1 protocol, see
https://btrfs.wiki.kernel.org/index.php/Design_notes_on_Send/Receive#Send_stream_v2_draft

I would rather see a proper v2 revision instead of relying on the fact that current implementation will deal with the change.

>  	__BTRFS_SEND_C_MAX,
>  };
>  #define BTRFS_SEND_C_MAX (__BTRFS_SEND_C_MAX - 1)
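As background for the TLV_PUT_U64 call in the quoted hunk, here is a rough standalone sketch of type-length-value encoding for a single u64 attribute. It assumes, as far as I understand the v1 stream, that each attribute is a 16-bit type, a 16-bit length and then the payload, all little-endian; that layout should be checked against send.h before relying on it. `tlv_put_u64()`, `tlv_get_u64()` and `EXAMPLE_ATTR_SIZE` are hypothetical names for this illustration, not the kernel's macros, and the attribute number is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TLV_HDR_LEN 4u      /* u16 type + u16 length */
#define EXAMPLE_ATTR_SIZE 4 /* placeholder, not necessarily the real BTRFS_SEND_A_SIZE */

static void put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

/* Encode one u64 attribute into buf. Returns bytes written, or 0 if it does not fit. */
static size_t tlv_put_u64(uint8_t *buf, size_t cap, uint16_t type, uint64_t val)
{
	size_t i;

	if (cap < TLV_HDR_LEN + 8)
		return 0;
	put_le16(buf, type);
	put_le16(buf + 2, 8); /* payload length in bytes */
	for (i = 0; i < 8; i++)
		buf[TLV_HDR_LEN + i] = (uint8_t)(val >> (8 * i));
	return TLV_HDR_LEN + 8;
}

/* Decode a u64 attribute of the expected type. Returns 0 on success, -1 otherwise. */
static int tlv_get_u64(const uint8_t *buf, size_t len, uint16_t want_type,
		       uint64_t *val)
{
	uint16_t type, alen;
	uint64_t v = 0;
	size_t i;

	assert(val != NULL);
	if (len < TLV_HDR_LEN + 8)
		return -1;
	type = (uint16_t)(buf[0] | (buf[1] << 8));
	alen = (uint16_t)(buf[2] | (buf[3] << 8));
	if (type != want_type || alen != 8)
		return -1; /* unknown or malformed attribute */
	for (i = 0; i < 8; i++)
		v |= (uint64_t)buf[TLV_HDR_LEN + i] << (8 * i);
	*val = v;
	return 0;
}
```

The decoder rejecting anything it does not recognise is the crux of the compatibility question raised above: a stream that carries something new is only safe for receivers that either know about it or are told about it through a protocol revision.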
Re: [RFC PATCH] Btrfs: send, add calculate data size flag to allow for progress estimation
On Fri, Apr 4, 2014 at 3:52 PM, Konstantinos Skarlatos k.skarla...@gmail.com wrote:
> Hi, as a user of send I can say that this feature is very useful. Is it possible to add a current speed indication (MB/sec)?

Yes, it is.
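To make the speed question concrete, here is a minimal sketch of how an average rate could be derived from the byte counter the receiver already keeps. Nothing here comes from the posted patches: `elapsed_seconds()` and `rate_mb_per_sec()` are hypothetical helpers, and the sketch assumes timestamps are taken with something like `clock_gettime(CLOCK_MONOTONIC, ...)` at the start of the receive and at each report.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Seconds elapsed between two timestamps (end must not precede start). */
static double elapsed_seconds(const struct timespec *start,
			      const struct timespec *end)
{
	return (double)(end->tv_sec - start->tv_sec) +
	       (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Average throughput in MB/s (1 MB = 10^6 bytes); 0 if no time has passed yet. */
static double rate_mb_per_sec(uint64_t bytes, double seconds)
{
	assert(seconds >= 0.0);
	if (seconds == 0.0)
		return 0.0;
	return (double)bytes / 1e6 / seconds;
}
```

This gives the average since the start of the transfer. A "current" speed would instead use the bytes and time accumulated since the previous report, at the cost of a noisier number.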
Re: [RFC PATCH] Btrfs: send, add calculate data size flag to allow for progress estimation
On Fri, Apr 4, 2014 at 4:53 PM, David Sterba dste...@suse.cz wrote:
> On Fri, Apr 04, 2014 at 04:20:41PM +0100, Filipe David Borba Manana wrote:
>> @@ -4307,6 +4348,22 @@ out:
>>  	return num_read;
>>  }
>>  
>> +static int send_total_data_size(struct send_ctx *sctx, u64 data_size)
>> +{
>> +	int ret;
>> +
>> +	ret = begin_cmd(sctx, BTRFS_SEND_C_TOTAL_DATA_SIZE);
>> +	if (ret < 0)
>> +		goto out;
>> +
>> +	TLV_PUT_U64(sctx, BTRFS_SEND_A_SIZE, data_size);
>> +	ret = send_cmd(sctx);
>> +
>> +tlv_put_failure:
>> +out:
>> +	return ret;
>> +}
>> +
>>  /*
>>   * Send a clone command to user space.
>>   */
>> --- a/fs/btrfs/send.h
>> +++ b/fs/btrfs/send.h
>> @@ -87,6 +87,7 @@ enum btrfs_send_cmd {
>>  	BTRFS_SEND_C_END,
>>  	BTRFS_SEND_C_UPDATE_EXTENT,
>> +	BTRFS_SEND_C_TOTAL_DATA_SIZE,
>
> Though it is a simple modification, it changes the existing send protocol. The unpatched receiver would fail (it has a lower value of __BTRFS_SEND_C_MAX), so it could work even without revving the protocol. But, it's not the cleanest way.

The same problem happened when BTRFS_SEND_C_UPDATE_EXTENT was added. Since it's a command that's only sent if a new special flag is supplied, I don't think it's that bad.

> There's a number of deficiencies found in v1 protocol, see
> https://btrfs.wiki.kernel.org/index.php/Design_notes_on_Send/Receive#Send_stream_v2_draft
>
> I would rather see a proper v2 revision instead of relying on the fact that current implementation will deal with the change.

Right, by 2020 we'll have v2 fully specified and maybe implemented :)

Thanks David

>>  	__BTRFS_SEND_C_MAX,
>>  };
>>  #define BTRFS_SEND_C_MAX (__BTRFS_SEND_C_MAX - 1)

-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
Re: Hard restart required
On 04/03/2014 05:41 PM, Avi Miller wrote:
> UUID should work fine on OL6. Can you confirm that you have the UEK3 (3.8.13) kernel running? If you’ve installed from OL6U5 media, it should be enabled by default, but older OL6 ISOs only had UEK2 on the media and the UEK3 yum channel would need to be manually enabled.

As suggested earlier on this list, I started with a fresh, vanilla CentOS 6 install and used the centos2ol.sh script to switch to using OL before doing a yum update/upgrade and install of BTRFS. Here's the relevant info after the switchover.

[root@oracle ~]# uname -a
Linux oracle.schoolpathways.com 2.6.32-431.11.2.el6.x86_64 #1 SMP Tue Mar 25 08:15:39 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux

[root@oracle ~]# rpm -qa | grep -i kernel
dracut-kernel-004-336.0.1.el6_5.2.noarch
kernel-firmware-2.6.32-431.11.2.el6.noarch
kernel-2.6.32-431.11.2.el6.x86_64
kernel-2.6.32-358.el6.x86_64

[root@oracle ~]# cat /etc/yum.repos.d/public-yum-ol6.repo
[ol6_latest]
name=Oracle Linux $releasever Latest ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/latest/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=1

[ol6_addons]
name=Oracle Linux $releasever Add ons ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/addons/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_ga_base]
name=Oracle Linux $releasever GA installation media copy ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/0/base/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_u1_base]
name=Oracle Linux $releasever Update 1 installation media copy ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/1/base/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_u2_base]
name=Oracle Linux $releasever Update 2 installation media copy ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/2/base/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_u3_base]
name=Oracle Linux $releasever Update 3 installation media copy ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/3/base/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_u4_base]
name=Oracle Linux $releasever Update 4 installation media copy ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/4/base/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_u5_base]
name=Oracle Linux $releasever Update 5 installation media copy ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/5/base/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_UEKR3_latest]
name=Latest Unbreakable Enterprise Kernel for Oracle Linux $releasever ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_UEK_latest]
name=Latest Unbreakable Enterprise Kernel for Oracle Linux $releasever ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEK/latest/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=1

[ol6_UEK_base]
name=Unbreakable Enterprise Kernel for Oracle Linux $releasever ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEK/base/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_playground_latest]
name=Latest mainline stable kernel for Oracle Linux 6 ($basearch) - Unsupported
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/playground/latest/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_MySQL]
name=MySQL 5.5 for Oracle Linux 6 ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/MySQL/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_gdm_multiseat]
name=Oracle Linux 6 GDM Multiseat ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/gdm_multiseat/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_ofed_UEK]
name=OFED supporting tool packages for Unbreakable Enterprise Kernel on Oracle Linux 6 ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/ofed_UEK/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_MySQL56]
name=MySQL 5.6 for Oracle Linux 6 ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/MySQL56/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_spacewalk20_server]
name=Spacewalk Server 2.0 for Oracle Linux 6 ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/spacewalk20/server/$basearch/
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
gpgcheck=1
enabled=0

[ol6_spacewalk20_client]
Re: [RFC PATCH] Btrfs: send, add calculate data size flag to allow for progress estimation
On Fri, Apr 04, 2014 at 05:01:41PM +0100, Filipe David Manana wrote:
>>>   * Send a clone command to user space.
>>>   */
>>> --- a/fs/btrfs/send.h
>>> +++ b/fs/btrfs/send.h
>>> @@ -87,6 +87,7 @@ enum btrfs_send_cmd {
>>>      BTRFS_SEND_C_END,
>>>      BTRFS_SEND_C_UPDATE_EXTENT,
>>> +    BTRFS_SEND_C_TOTAL_DATA_SIZE,
>>
>> Though it is a simple modification, it changes the existing send protocol. The unpatched receiver would fail (it has a lower value of __BTRFS_SEND_C_MAX), so it could work even without revving the protocol. But, it's not the cleanest way.
>
> Same problem happened when BTRFS_SEND_C_UPDATE_EXTENT was added.

That's a good example of why objecting from the beginning is wise, because it will backfire later. You can blame me that I did not object back then, but was involved in the patch reviews.

> Since it's a command that's only sent if a new special flag is supplied, I don't think it's that bad.

Yeah, it's not that bad and I don't want to stand in the way of a good enhancement. The flag makes it better wrt backward compatibility.

>> There's a number of deficiencies found in v1 protocol, see
>> https://btrfs.wiki.kernel.org/index.php/Design_notes_on_Send/Receive#Send_stream_v2_draft
>>
>> I would rather see a proper v2 revision instead of relying on the fact that current implementation will deal with the change.
>
> Right, by 2020 we'll have v2 fully specified and maybe implemented :)

I saw that coming :) We're not waiting for a full v2 spec, but for someone who implements what's been collected so far. And because it also includes implementing the versioning infrastructure, it's not that attractive compared to adding one more TLV command.

So, if others think this is ok for v1, proceed.
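The compatibility argument in this exchange can be made concrete with a small model. What follows is only a sketch under stated assumptions: the command tables are shortened, the numbers are illustrative rather than the real values from fs/btrfs/send.h, and `accept` and `build_stream` are hypothetical helpers, not btrfs-progs functions. It shows a receiver built against the older table rejecting the new command number, and a sender that emits the command only when a flag is set.

```python
# Illustrative, shortened command tables. The real numbering is in fs/btrfs/send.h.
OLD_COMMANDS = ["UNSPEC", "END", "UPDATE_EXTENT"]
NEW_COMMANDS = OLD_COMMANDS + ["TOTAL_DATA_SIZE"]


def c_max(commands):
    """Highest valid command number, like BTRFS_SEND_C_MAX = __BTRFS_SEND_C_MAX - 1."""
    sentinel = len(commands)  # value the trailing __..._MAX enumerator would take
    return sentinel - 1


def accept(cmd, commands):
    """Hypothetical receiver check: return the command name or reject the number."""
    if cmd <= 0 or cmd > c_max(commands):
        raise ValueError(f"unknown send command {cmd}")
    return commands[cmd]


def build_stream(calculate_size):
    """Hypothetical sender: the new command is emitted only when the flag is set."""
    stream = []
    if calculate_size:
        stream.append(NEW_COMMANDS.index("TOTAL_DATA_SIZE"))
    stream.append(NEW_COMMANDS.index("UPDATE_EXTENT"))
    stream.append(NEW_COMMANDS.index("END"))
    return stream
```

In this model a stream built without the flag is accepted by both receivers, while a stream built with the flag makes the old receiver raise on its first command. That is the "fails, but cleanly" behaviour the thread relies on in place of a protocol version bump.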
BTRFS send/receive limitations
I read recently that you can't send/receive concurrent streams on the same filesystem, which begs the question of what is meant by a filesystem. Is that to say that you can't send/receive snapshots on different subvolumes to the same root filesystem? Or that you can't send/receive multiple snapshots on the same subvolume? Can you send/receive a snapshot or subvolume to the same root filesystem?

-Ben
Re: BTRFS send/receive limitations
On Fri, Apr 04, 2014 at 09:50:05AM -0700, Lists wrote:
> I read recently that you can't send/receive concurrent streams on the same filesystem, which begs the question of what is meant by a filesystem. Is that to say that you can't send/receive snapshots on different subvolumes to the same root filesystem? Or that you can't send/receive multiple snapshots on the same subvolume? Can you send/receive a snapshot or subvolume to the same root filesystem?

The restriction was on the same *filesystem* as a whole: there was a global lock on the whole FS, which could cause deadlocks with send and receive both accessing the same FS (any subvolumes). I don't recall hearing about problems with two sends from different subvols on the same FS, but that might just be because I wasn't paying attention. :)

I think those restrictions are gone now, in some patch in the pipeline. Possibly for 3.15 -- I'm not sure if the patches made it into 3.14.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Gomez, darling, don't torture yourself. That's my job. ---
Re: [PATCH 24/27] btrfs-progs: Convert man page for btrfs-zero-log
On Wed, Apr 02, 2014 at 04:29:35PM +0800, Qu Wenruo wrote:
> Convert man page for btrfs-zero-log
>
> Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
> ---
>  Documentation/Makefile           |  2 +-
>  Documentation/btrfs-zero-log.txt | 39 +++
>  2 files changed, 40 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/btrfs-zero-log.txt
>
> diff --git a/Documentation/Makefile b/Documentation/Makefile
> index e002d53..de06629 100644
> --- a/Documentation/Makefile
> +++ b/Documentation/Makefile
> @@ -11,7 +11,7 @@ MAN8_TXT += btrfs-image.txt
>  MAN8_TXT += btrfs-map-logical.txt
>  MAN8_TXT += btrfs-show-super.txt
>  MAN8_TXT += btrfstune.txt
> -#MAN8_TXT += btrfs-zero-log.txt
> +MAN8_TXT += btrfs-zero-log.txt
>  #MAN8_TXT += fsck.btrfs.txt
>  #MAN8_TXT += mkfs.btrfs.txt
>
> diff --git a/Documentation/btrfs-zero-log.txt b/Documentation/btrfs-zero-log.txt
> new file mode 100644
> index 000..e3041fa
> --- /dev/null
> +++ b/Documentation/btrfs-zero-log.txt
> @@ -0,0 +1,39 @@
> +btrfs-zero-log(8)
> +=================
> +
> +NAME
> +----
> +btrfs-zero-log - clear out log tree
> +
> +SYNOPSIS
> +--------
> +'btrfs-zero-log' <dev>
> +
> +DESCRIPTION
> +-----------
> +'btrfs-zero-log' will remove the log tree if log tree is corrupt, which will
> +allow you to mount the filesystem again.
> +
> +The common case where this happens has been fixed a long time ago,
> +so it is unlikely that you will see this particular problem.

A note on this one: this can happen if your SSD writes things in the wrong order or potentially writes garbage when power is lost, or before locking up. I hit this problem about 10 times and it wasn't a btrfs bug, just the drive doing bad things.

I had debian add this to the initramfs initrd by default so that someone can recover their root filesystem with this command if it won't mount.

What got fixed is that the kernel used to oops and crash, and now it gives a nice "can't mount" error message.

Marc
--
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: BTRFS setup advice for laptop performance ?
Austin S Hemmelgarn posted on Fri, 04 Apr 2014 08:33:10 -0400 as excerpted:

> On 2014-04-04 04:02, Swâmi Petaramesh wrote:
>> Hi, I'm going to receive a new small laptop with a 500 GB 5400 RPM mechanical ole' rust HD, and I plan to install BTRFS on it.

Reminds me of my query to the list, some months ago. (Altho I was/am using dual 238 GiB SSDs, in btrfs raid1 mode both data and metadata, in a desktop, additionally with a 500 gig spinning rust drive for media that is still running reiserfs, so the details are somewhat different.)

>> It will have a kernel 3.13 for now, until 3.14 gets released.

$ uname -r
3.14.0

=:^)

But it's good you (SP) keep reasonably current. I see people posting with old 2.6.* kernels and wonder why they're even bothering with btrfs, since they obviously aren't current, kernel-wise.

>> However I'm still concerned with chronic BTRFS dreadful performance and still find that BTRFS degrades much over time even with periodic defrag and best practices etc.
>
> I keep hearing this from people, but I personally don't see this to be the case at all. I'm pretty sure the 'big' performance degradation that people are seeing is due to how they are using snapshots, not a result of using BTRFS itself (I don't use them for anything other than ensuring a stable system image for rsync and/or tar based backups).

I'll second what you (AH) and Hugo say elsewhere, and I've written some on the subject in other threads too. Snapshots per se aren't bad, but there's really no reason to have thousands of them against the same base subvolume -- in practice, if you need to mount a snapshot a month or six old, are you really going to know or care what exact minute to mount?

While I /personally/ think per-minute snapshots are overdoing it, per hour or so is definitely logically supportable and if you /want/ per-minute, well, fine.
But per-minute or per-hour or per-day, or just taking an occasional manual snapshot, /do/ strongly consider thinning them out on a reasonable schedule, and the more frequently you take 'em the more you need to thin. So if for example you're taking per-minute, thin them down to perhaps one per half-hour after six hours and one per hour after a day, then to one a day after a week and one a week after four weeks. At some point between a month and a quarter, external backups should have taken over, and deleting older snapshots or only keeping perhaps one every 13 weeks (quarter) should suffice.

Meanwhile, as Hugo hints there are still known issues with snapshots and large (half-gig-plus) frequently internally rewritten files such as VM images, databases, etc, even if set NOCOW. If you're running something like this, strongly consider putting those files on a dedicated subvolume and using conventional backups instead of snapshotting for that subvolume. (And set NOCOW using the directory inheritance mechanism described in other posts.) For smaller stuff the autodefrag option should help.

>> So I'd like to start with the best possible options and have a few questions:
>>
>> - Is it still recommended to mkfs with a nodesize or leafsize different (bigger) than the default? I wouldn't like to lose too much disk space anyway (1/2 nodesize per file on average?), as it will be limited...
>
> This depends on many things, the average size of the files on the disk is the biggest factor. In general you should get the best disk utilization [snip]

As Hugo says, btrfs' current nodesize settings, etc, apply to metadata, not data; the data block size is currently fixed at the page size, which is the standard 4K on x86. Metadata nodesize now defaults to 16K with newer mkfs.btrfs, which should be reasonable.
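The thinning schedule suggested above (everything for six hours, then one per half-hour, one per hour after a day, one per day after a week, one per week after four weeks, one per 13 weeks after a quarter) can be sketched as a small selection function. This is a minimal illustration, not an existing tool: `snapshots_to_keep` and `TIERS` are hypothetical names, snapshot times are plain naive datetimes, and keeping the oldest snapshot in each slot is an arbitrary choice made for the sketch.

```python
from datetime import datetime, timedelta

# (minimum age, spacing to keep at that age), oldest tier first.
TIERS = [
    (timedelta(weeks=13), timedelta(weeks=13)),
    (timedelta(weeks=4), timedelta(weeks=1)),
    (timedelta(weeks=1), timedelta(days=1)),
    (timedelta(days=1), timedelta(hours=1)),
    (timedelta(hours=6), timedelta(minutes=30)),
]

EPOCH = datetime(1970, 1, 1)


def snapshots_to_keep(snapshots, now):
    """Return the sorted snapshot times to keep; everything else may be deleted."""
    kept = {}
    for taken in sorted(snapshots):
        age = now - taken
        for tier, (min_age, spacing) in enumerate(TIERS):
            if age >= min_age:
                # One slot per `spacing`-wide window, counted from the epoch.
                key = (tier, (taken - EPOCH) // spacing)
                break
        else:
            key = ("recent", taken)  # younger than six hours: keep every snapshot
        kept.setdefault(key, taken)  # the oldest snapshot in each slot wins
    return sorted(kept.values())
```

The caller would delete every snapshot not in the returned list, for example by passing the corresponding paths to `btrfs subvolume delete`. Running the function on every snapshot cycle keeps the total count bounded however frequently snapshots are taken.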
(There's work to make the data-block size configurable as well, in part because it's currently not possible to mount btrfs created on architectures with different page sizes, tho luckily both arm and x86/amd64 have 4k page sizes so are compatible.)

>> - Is it recommended to alter the FS to have skinny extents? I've done this on all of my BTRFS machines without problem, still the kernel spits a notice at mount time, and I'm kind of worrying: why is the kernel warning me I have skinny extents? Is it bad? Is it something I should avoid?
>
> I think that the primary reason for the warning is that it is backward incompatible, older kernels can't mount filesystems using it.

Agreed. When skinny extents first came out there were some initial bugs, but I believe they've been worked out by now in general, so it shouldn't be a problem. The big remaining issue is backward compatibility. Tho at least here (where I've been running 3.14 pre-releases since before rc1), the on-mount skinny-extents comment seems more informational than actual warning.

That said, more conservative users might wish to stay with fat extents, since AFAIK that's still the default, so it's going to get the most testing.

FWIW, when I last re-did my partitions in order to take advantage of the 16k metadata
Re: [PATCH 14/27] btrfs-progs: Convert man page for btrfs-replace.
On Wed, Apr 02, 2014 at 04:29:25PM +0800, Qu Wenruo wrote:
> +If the source device is not available anymore, or if the -r option is set,
> +the data is built only using the RAID redundancy mechanisms.
> +After completion of the operation, the source device is removed from the
> +filesystem.

Would it make sense to add a paragraph explaining that for raid5/6, someone should either:
1) use balance to rebuild on a new drive if one of the drives is missing
2) use btrfs device add of a new drive, then btrfs device delete of the drive to replace, and effectively btrfs will do the same thing that replace would?

Thanks,
Marc
--
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On 03/26/2014 01:01 PM, Jeff Mahoney wrote:
> On 3/17/14, 9:05 AM, David Sterba wrote:
>> On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
>>> While fuzzing with trinity inside a KVM tools guest running the latest -next kernel I've stumbled on the following:
>>>
>>> [ 788.458756]        CPU0                    CPU1
>>> [ 788.459188]        ----                    ----
>>> [ 788.459625]   lock(found->groups_sem);
>>> [ 788.460041]                                local_irq_disable();
>>> [ 788.460041]                                lock(delayed_node->mutex);
>>> [ 788.460041]                                lock(found->groups_sem);
>>> [ 788.460041]   <Interrupt>
>>> [ 788.460041]     lock(delayed_node->mutex);
>>> [ 788.460041]
>>> [ 788.460041]  *** DEADLOCK ***
>>> [ 788.460041]
>>> [ 788.460041] 2 locks held by kswapd3/4199:
>>
>> I've once (3.14-rc5) seen the same warning also caused by xfstests/generic/224
>
> I think this is from my sysfs patches. We call kobject_add while holding the groups_sem. kobject_add ultimately allocates with GFP_KERNEL, so it can enter reclaim. This particular case isn't dangerous, but it could hit while hot-adding a device. The fix should be pretty simple.

Is that fix available anywhere? I'm still seeing the issue in -next.

Thanks,
Sasha
Re: Hard restart required
Hi,

On 5 Apr 2014, at 3:26 am, Lists li...@benjamindsmith.com wrote:
> On 04/03/2014 05:41 PM, Avi Miller wrote:
>> UUID should work fine on OL6. Can you confirm that you have the UEK3 (3.8.13) kernel running? If you’ve installed from OL6U5 media, it should be enabled by default, but older OL6 ISOs only had UEK2 on the media and the UEK3 yum channel would need to be manually enabled.
>
> As suggested earlier on this list, I started with a fresh, vanilla CentOS 6 install and used the centos2ol.sh script to switch to using OL before doing a yum update/upgrade and install of BTRFS. Here's the relevant info after the switchover.

Any particular reason why you didn’t just start with Oracle Linux 6? The installation media is available for download from the official Oracle Software Delivery Cloud at http://edelivery.oracle.com/linux or, if you can’t be bothered with the registration requirement, from one of the mirrors listed at https://wikis.oracle.com/display/oraclelinux/Downloading+Oracle+Linux

There is no good reason to start with CentOS.

> [root@oracle ~]# uname -a
> Linux oracle.schoolpathways.com 2.6.32-431.11.2.el6.x86_64 #1 SMP Tue Mar 25 08:15:39 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux

This kernel is very, very, very old in btrfs terms. In fact, this is the RHCK or Red Hat Compatible Kernel that ships with RHEL6/C6 and isn’t one of the newer Oracle UEK releases at all.

> [ol6_UEKR3_latest]
> name=Latest Unbreakable Enterprise Kernel for Oracle Linux $releasever ($basearch)
> baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/$basearch/
> gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-oracle
> gpgcheck=1
> enabled=0

Enable this repository and do a yum update to get the latest UEK Release 3, i.e. the 3.8.13 kernel.

Cheers,
Avi

--
Oracle http://www.oracle.com
Avi Miller | Product Management Director | +61 (3) 8616 3496
Oracle Linux and Virtualization
417 St Kilda Road, Melbourne, Victoria 3004 Australia
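A quick way to see which repositories yum will actually use is to parse the .repo file and list the sections with `enabled=1`. The sketch below uses Python's standard `configparser` on a shortened copy of the file posted in this thread; `enabled_repos` and `SAMPLE` are hypothetical names, and treating a missing `enabled` key as enabled is an assumption about yum's default.

```python
import configparser

# Shortened excerpt of the public-yum-ol6.repo file quoted in this thread.
SAMPLE = """
[ol6_latest]
name=Oracle Linux $releasever Latest ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/latest/$basearch/
gpgcheck=1
enabled=1

[ol6_UEKR3_latest]
name=Latest Unbreakable Enterprise Kernel for Oracle Linux $releasever ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEKR3/latest/$basearch/
gpgcheck=1
enabled=0

[ol6_UEK_latest]
name=Latest Unbreakable Enterprise Kernel for Oracle Linux $releasever ($basearch)
baseurl=http://public-yum.oracle.com/repo/OracleLinux/OL6/UEK/latest/$basearch/
gpgcheck=1
enabled=1
"""


def enabled_repos(repo_text):
    """Return the section names whose `enabled` flag is true, in file order."""
    # interpolation=None keeps any '%' in URLs from being treated as a placeholder.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    return [
        name
        for name in parser.sections()
        if parser.getboolean(name, "enabled", fallback=True)  # assumed default
    ]
```

On the excerpt this reports `ol6_latest` and `ol6_UEK_latest` but not `ol6_UEKR3_latest`, which matches the diagnosis above: the UEK R3 channel has to be switched to `enabled=1` before `yum update` can pull in the newer kernel.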
Re: BTRFS setup advice for laptop performance ?
On Friday 4 April 2014 16:09:06, Hugo Mills wrote:
> We don't have lots of reports of massive slowdowns after a long period of use, so whatever you're doing, there seems to be something unusual involved. It's almost certainly not your fault, but there would appear to be something in your configuration or your use-case which is leading to these problems, and without knowing what's different, it's hard to set about identifying the problem.

I would have a hard time finding what!

I have seen this on each and every machine on which I have installed BTRFS over the past 2 years. The first were Ubuntus, now there is one Mint, 2 Arch, 1 Fedora, all with decently recent kernels and all updates applied.

All those machines do mainly boring office tasks: email, web surfing, word processing, spreadsheets. No databases except for the system package DB and KDE akonadi email storage... Few compilations, if any, no heavy disk tasks, all mounted with noatime, space_cache, inode_cache, etc... No torrents, no bitcoins, very seldom used Virtualbox (and this is nocow).

No filesystem is over 80% full, some are below 20%... No filesystems currently have more than 4 active snapshots.

All get slow like hell over time.

> How are you measuring the slowdown -- do you have a specific piece of benchmarking software, or just anecdotal evidence?

When your system slowly shifts from normally responsive to dreadfully slow over time, when starting any app takes over a full minute with the HD LED steadily lit, when booting has become so long that the GUI DM dies of timeout before it even starts and you have to restart it manually... you can tell it's gone sloow without any benchmark figures...

(Disk health good on all machines...)

Go figure...

--
Swâmi Petaramesh sw...@petaramesh.org http://petaramesh.org PGP 9076E32E