Re: duperemove : some real world figures on BTRFS deduplication
> 2016-12-08 16:11 GMT+01:00 Swâmi Petaramesh:
> > Then it took another 48 hours just for "loading the hashes of duplicate
> > extents".

This is the issue I am currently addressing with the following patches: https://github.com/Floyddotnet/duperemove/commits/digest_trigger

Tested with a 3.9 TB directory with 4723 objects:
old implementation of dbfile_load_hashes took 36593ms
new implementation of dbfile_load_hashes took 11ms

You can use this version safely, but I still have more work to do (for example, a migration script for existing hashfiles).

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] btrfs-convert: Fix migrate_super_block() to work with 64k sectorsize
On Friday, December 09, 2016 09:09:29 AM Qu Wenruo wrote:
> At 12/08/2016 09:56 PM, Chandan Rajendra wrote:
> > migrate_super_block() uses sectorsize to refer to the size of the
> > superblock. Hence on 64k sectorsize filesystems, it ends up computing
> > checksum beyond the super block length (i.e. BTRFS_SUPER_INFO_SIZE).
> > This commit fixes the bug by using BTRFS_SUPER_INFO_SIZE instead of
> > sectorsize of the underlying filesystem.
> >
> > Signed-off-by: Chandan Rajendra
>
> Reviewed-by: Qu Wenruo
>
> BTW would you please enhance the convert tests?
> Current convert tests only use 4K as block size.
> So adding 64K blocksize would definitely improve the tests.

Thanks for the hint. I just executed btrfs/012 with 64k blocksize hardcoded and found that 'btrfs rollback' failed. I will fix rollback first and then work on getting btrfs/012 to support 64k blocksize.

--
chandan
Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs
On Wed, Dec 07, 2016 at 01:45:08PM -0800, Liu Bo wrote:
> Since I haven't figure out how to map multiple devices to userspace without
> pagecache, this DAX support is only for single-device, and I don't think
> DAX(Direct Access) can work with cow, this is limited to nocow case. I made
> this by setting nodatacow in dax mount option.

DAX can be made to work with COW quite easily - it's already been done, in fact. Go look up Nova for how it works with DAX: https://github.com/Andiry/nova

Essentially, it has a set of "temporary pages" it links to the inode where writes are done directly, and when a synchronisation event occurs it pulls them from the per-inode list, does whatever transformations are needed (e.g. CRC calculation, mirroring, etc) and marks them as current in the inode extent list. When a new overwrite comes along, it allocates a new block in the temporary page list, copies the existing data into it, and then uses that block for DAX until the next synchronisation event occurs.

For XFS, CoW for DAX through read/write isn't really any different to the direct IO path we currently already have. And for page write faults on shared extents, instead of zeroing the newly allocated block we simply copy the original data into the new block before the allocation returns. It does mean, however, that XFS does not have the capability for data transformations in the IO path. This limits us to atomic write devices (software raid 0 or hardware redundancy such as DIMM mirroring), but we can still do out-of-band online data transformations and movement (e.g. dedupe, defrag) with DAX.

Yes, I know these methods are very different to how btrfs uses COW. However, my point is that DAX and CoW and/or multiple devices are not incompatible if the architecture is correctly structured. i.e. DAX should be able to work even with most of btrfs's special magic still enabled.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: [PATCH 1/6] Btrfs: add mount option for dax
On Wed, Dec 07, 2016 at 01:45:05PM -0800, Liu Bo wrote:
> Signed-off-by: Liu Bo
> ---
>  fs/btrfs/ctree.h |  1 +
>  fs/btrfs/super.c | 40 +++-
>  2 files changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 0b8ce2b..e54c6e6 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -1317,6 +1317,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_root *root)
>  #define BTRFS_MOUNT_FRAGMENT_METADATA	(1 << 25)
>  #define BTRFS_MOUNT_FREE_SPACE_TREE	(1 << 26)
>  #define BTRFS_MOUNT_NOLOGREPLAY	(1 << 27)
> +#define BTRFS_MOUNT_DAX	(1 << 28)
>
>  #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
>  #define BTRFS_DEFAULT_MAX_INLINE	(2048)
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 74ed5aa..9b18f3d 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -323,7 +323,7 @@ enum {
> 	Opt_commit_interval, Opt_barrier, Opt_nodefrag, Opt_nodiscard,
> 	Opt_noenospc_debug, Opt_noflushoncommit, Opt_acl, Opt_datacow,
> 	Opt_datasum, Opt_treelog, Opt_noinode_cache, Opt_usebackuproot,
> -	Opt_nologreplay, Opt_norecovery,
> +	Opt_nologreplay, Opt_norecovery, Opt_dax,

Can we please not create more filesystems with a DAX mount option? This was only ever an enabler, and not meant to be a permanent thing. The permanent functionality for DAX is supposed to be per-inode inheritable DAX flags - not mount options - so that applications can choose on a per-file basis to enable/disable DAX access as they see fit. This also enables the filesystem to reject the attempt to turn on DAX if the set of contexts for the file are not DAX compatible.

Cheers,

Dave.

--
Dave Chinner
da...@fromorbit.com
Re: [PATCH 1/2] btrfs-progs: btrfs-convert: Prevent accounting blocks beyond end of device
On Friday, December 09, 2016 09:03:57 AM Qu Wenruo wrote:
> Hi Chandan,
>
> Thanks for the patch.
>
> At 12/08/2016 09:56 PM, Chandan Rajendra wrote:
> > When looping across data block bitmap, __ext2_add_one_block() may add
> > blocks which do not exist on the underlying disk. This commit prevents
> > this from happening by checking the block index against the maximum
> > block count that was present in the ext4 filesystem instance that is
> > being converted.
>
> The patch looks good to me.
>
> Reviewed-by: Qu Wenruo
>
> Just curious about if such image can pass e2fsck.
> And would you please upload a minimal image as btrfs-progs test case?

Hi Qu,

Such an ext4 filesystem can be consistently created on ppc64 with 64k as the blocksize of the filesystem. Also, the filesystem thus created passes e2fsck.

--
chandan
Re: duperemove : some real world figures on BTRFS deduplication
On Thu, Dec 8, 2016 at 8:11 AM, Swâmi Petaramesh wrote:
> Well, the damn thing has been running for 15 days uninterrupted !
> ...Until I [Ctrl]-C it this morning as I had to move with the machine (I
> wasn't expecting it to last THAT long...).

Can you check some bigger files and see if they've become fragmented? I'm seeing 1.4GiB files with 2-3 extents reported by filefrag go to over 5000 fragments during dedupe. This is not something I recall happening some months ago.

I inadvertently replied to the wrong dedupe thread about my test and what I'm finding; it's here: https://www.spinics.net/lists/linux-btrfs/msg61304.html

But if you're seeing something similar, then it would explain why it's so slow in your case.

--
Chris Murphy
Re: out-of-band dedup status?
On Thu, Dec 8, 2016 at 7:26 PM, Darrick J. Wong wrote:
> On Thu, Dec 08, 2016 at 05:45:40PM -0700, Chris Murphy wrote:
>> OK something's wrong.
>>
>> Kernel 4.8.12 and duperemove v0.11.beta4. Brand new file system
>> (mkfs.btrfs -dsingle -msingle, default mount options) and two
>> identical files separately copied.
>>
>> [chris@f25s]$ ls -li /mnt/test
>> total 2811904
>> 260 -rw-r--r--. 1 root root 1439694848 Dec 8 17:26 Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso
>> 259 -rw-r--r--. 1 root root 1439694848 Dec 8 17:26 Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2
>>
>> [chris@f25s]$ filefrag /mnt/test/*
>> /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso: 3 extents found
>> /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2: 2 extents found
>>
>> [chris@f25s duperemove]$ sudo ./duperemove -dv /mnt/test/*
>> Using 128K blocks
>> Using hash: murmur3
>> Gathering file list...
>> Using 4 threads for file hashing phase
>> [1/2] (50.00%) csum: /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso
>> [2/2] (100.00%) csum: /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2
>> Total files: 2
>> Total hashes: 21968
>> Loading only duplicated hashes from hashfile.
>> Using 4 threads for dedupe phase
>> [0xba8400] (1/10947) Try to dedupe extents with id e47862ea
>> [0xba84a0] (3/10947) Try to dedupe extents with id ffed44f2
>> [0xba84f0] (2/10947) Try to dedupe extents with id ffeefcdd
>> [0xba8540] (4/10947) Try to dedupe extents with id ffe4cf64
>> [0xba8540] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset 1182924800 (4)
>> [0xba8540] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 1182924800 (5)
>> [0xba8540] Dedupe 1 extents (id: ffe4cf64) with target: (1182924800, 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso"
>
> Ew, it's deduping these two 1.4GB files 128K at a time, which results in
> 12000 ioctl calls.
> Each of those 12000 calls has to lock the two inodes, read the file
> contents, remap the blocks, etc. instead of finding the maximal
> identical range and making a single call for the whole range.
>
> That's probably why it's taking forever to dedupe.

Yes, but it looks like it's also heavily fragmenting the files as a result.

--
Chris Murphy
[PATCH 00/10] Qgroup fixes for kdave/for-next-20161125 branch
The branch can be fetched from github:
https://github.com/adam900710/linux.git for-david-next-qgroup-fixes

If David wants to push these fixes to 4.10, then I can rebase these patches onto Chris' for-linus branch.

Recent qgroup fixes for several problems:

1) Qgroup reserved space underflow
   Caused by several things, from buffered writes happening before qgroup
   enable, to freeing qgroup space not reserved by the caller.

2) inode_cache mount option corruption

3) Enhanced qgroup trace points
   Used for debugging the above problems.

All patches are already submitted to the mailing list. The 1st patch is the diff between the V4 and V5 patch, which adds WARN_ON() for underflowing qgroup reserved space.

Qu Wenruo (10):
  btrfs: qgroup: Fix wrong qgroup passed to reserved space error report
  btrfs: qgroup: Add trace point for qgroup reserved space
  btrfs: qgroup: Re-arrange tracepoint timing to co-operate with reserved space tracepoint
  btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option
  btrfs: qgroup: Add quick exit for non-fs extents
  btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function
  btrfs: qgroup: Return actually freed bytes for qgroup release or free data
  btrfs: qgroup: Fix qgroup reserved space underflow caused by buffered write and quota enable
  btrfs: qgroup: Introduce extent changeset for qgroup reserve functions
  btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges

 fs/btrfs/ctree.h             |  14 +--
 fs/btrfs/extent-tree.c       |  32 ---
 fs/btrfs/extent_io.h         |  24 +-
 fs/btrfs/file.c              |  45 ++
 fs/btrfs/inode-map.c         |   4 +-
 fs/btrfs/inode.c             |  69 +--
 fs/btrfs/ioctl.c             |   9 +-
 fs/btrfs/qgroup.c            | 197 +++
 fs/btrfs/qgroup.h            |  14 ++-
 fs/btrfs/relocation.c        |  12 +-
 fs/btrfs/transaction.c       |  20 ++---
 include/trace/events/btrfs.h |  43 ++
 12 files changed, 341 insertions(+), 142 deletions(-)

--
2.10.2
Re: out-of-band dedup status?
On Thu, Dec 08, 2016 at 05:45:40PM -0700, Chris Murphy wrote: > OK something's wrong. > > Kernel 4.8.12 and duperemove v0.11.beta4. Brand new file system > (mkfs.btrfs -dsingle -msingle, default mount options) and two > identical files separately copied. > > [chris@f25s]$ ls -li /mnt/test > total 2811904 > 260 -rw-r--r--. 1 root root 1439694848 Dec 8 17:26 > Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso > 259 -rw-r--r--. 1 root root 1439694848 Dec 8 17:26 > Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2 > > [chris@f25s]$ filefrag /mnt/test/* > /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso: 3 extents found > /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2: 2 extents found > > > [chris@f25s duperemove]$ sudo ./duperemove -dv /mnt/test/* > Using 128K blocks > Using hash: murmur3 > Gathering file list... > Using 4 threads for file hashing phase > [1/2] (50.00%) csum: /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso > [2/2] (100.00%) csum: > /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2 > Total files: 2 > Total hashes: 21968 > Loading only duplicated hashes from hashfile. > Using 4 threads for dedupe phase > [0xba8400] (1/10947) Try to dedupe extents with id e47862ea > [0xba84a0] (3/10947) Try to dedupe extents with id ffed44f2 > [0xba84f0] (2/10947) Try to dedupe extents with id ffeefcdd > [0xba8540] (4/10947) Try to dedupe extents with id ffe4cf64 > [0xba8540] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset > 1182924800 (4) > [0xba8540] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset > 1182924800 (5) > [0xba8540] Dedupe 1 extents (id: ffe4cf64) with target: (1182924800, > 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" Ew, it's deduping these two 1.4GB files 128K at a time, which results in 12000 ioctl calls. Each of those 12000 calls has to lock the two inodes, read the file contents, remap the blocks, etc. 
instead of finding the maximal identical range and making a single call for the whole range. That's probably why it's taking forever to dedupe. --D > [0xba8540] (4/10947) Try to dedupe extents with id ffe4cf64 > [0xba84a0] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset > 543293440 (4) > [0xba84a0] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset > 543293440 (5) > [0xba84a0] Dedupe 1 extents (id: ffed44f2) with target: (543293440, > 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" > [0xba8540] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset > 1182924800 (5) > [0xba8540] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset > 1182924800 (4) > [0xba8540] Dedupe 1 extents (id: ffe4cf64) with target: (1182924800, > 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" > [0xba84a0] (3/10947) Try to dedupe extents with id ffed44f2 > [0xba84a0] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset > 543293440 (5) > [0xba84a0] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset > 543293440 (4) > [0xba84a0] Dedupe 1 extents (id: ffed44f2) with target: (543293440, > 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" > [0xba84f0] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset > 101580800 (4) > [0xba84f0] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset > 101580800 (5) > [0xba84f0] Dedupe 1 extents (id: ffeefcdd) with target: (101580800, > 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" > [0xba84a0] (5/10947) Try to dedupe extents with id ffe24eaf > [0xba84a0] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset > 171835392 (4) > [0xba84a0] Add 
extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset > 171835392 (5) > [0xba84a0] Dedupe 1 extents (id: ffe24eaf) with target: (171835392, > 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" > [0xba84f0] (2/10947) Try to dedupe extents with id ffeefcdd > [0xba8540] (6/10947) Try to dedupe extents with id ffe116c8 > [0xba8400] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset > 52035584 (4) > [0xba8400] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset > 52035584 (5) > [0xba8400] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset > 52166656 (5) > [0xba8400] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset > 60030976 (5) > [0xba8400] Add extent for file > "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset >
[PATCH v3 2/2] fstests: btrfs: Use _require_btrfs_qgroup_report to replace open code
Introduce a new _require_btrfs_qgroup_report function, which will check the availability of "btrfs check --qgroup-report", then set a global flag to tell _check_scratch_fs() to do an extra qgroup check.

Signed-off-by: Qu Wenruo
---
v2:
  Use "${RESULT_DIR}/require_scratch.require_qgroup_report" instead of a global variant
  Rebased to latest master
  Replace btrfsck with $BTRFS_UTIL_PROG check
v3:
  Fix a rebase error, which removed a necessary check in btrfs/042
---
 check           |  4 ++--
 common/btrfs    | 22 +-
 common/rc       |  5 +++--
 tests/btrfs/022 |  4
 tests/btrfs/028 |  5 ++---
 tests/btrfs/042 |  1 +
 tests/btrfs/099 |  1 +
 tests/btrfs/104 | 20 +---
 tests/btrfs/122 |  9 +++--
 tests/btrfs/123 |  5 ++---
 tests/btrfs/126 |  1 +
 11 files changed, 45 insertions(+), 32 deletions(-)

diff --git a/check b/check
index 8f2a1bb..76eb054 100755
--- a/check
+++ b/check
@@ -405,11 +405,11 @@ _check_filesystems()
 {
 	if [ -f ${RESULT_DIR}/require_test ]; then
 		_check_test_fs || err=true
-		rm -f ${RESULT_DIR}/require_test
+		rm -f ${RESULT_DIR}/require_test*
 	fi
 	if [ -f ${RESULT_DIR}/require_scratch ]; then
 		_check_scratch_fs || err=true
-		rm -f ${RESULT_DIR}/require_scratch
+		rm -f ${RESULT_DIR}/require_scratch*
 	fi
 }

diff --git a/common/btrfs b/common/btrfs
index 302edc6..a860228 100644
--- a/common/btrfs
+++ b/common/btrfs
@@ -40,6 +40,13 @@ _require_btrfs_command()
 	[ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old (must support $cmd $param)"
 }

+# Require extra check on btrfs qgroup numbers
+_require_btrfs_qgroup_report()
+{
+	_require_btrfs_command check --qgroup-report
+	touch ${RESULT_DIR}/require_scratch.require_qgroup_report
+}
+
 _run_btrfs_util_prog()
 {
 	run_check $BTRFS_UTIL_PROG $*
@@ -97,7 +104,20 @@ _check_btrfs_filesystem()
 		mountpoint=`_umount_or_remount_ro $device`
 	fi

-	btrfsck $device >$tmp.fsck 2>&1
+	if [ -f ${RESULT_DIR}/require_scratch.require_qgroup_report ]; then
+		$BTRFS_UTIL_PROG check $device --qgroup-report > $tmp.qgroup_report 2>&1
+		if grep -qE "Counts for qgroup.*are different" $tmp.qgroup_report ; then
+			echo "_check_btrfs_filesystem: filesystem on $device has wrong qgroup numbers (see $seqres.full)"
+			echo "_check_btrfs_filesystem: filesystem on $device has wrong qgroup numbers" \
+				>> $seqres.full
+			echo "*** qgroup_report.$FSTYP output ***" >>$seqres.full
+			cat $tmp.qgroup_report >>$seqres.full
+			echo "*** qgroup_report.$FSTYP output ***" >>$seqres.full
+		fi
+		rm -f $tmp.qgroup_report
+	fi
+
+	$BTRFS_UTIL_PROG check $device >$tmp.fsck 2>&1
 	if [ $? -ne 0 ]; then
 		echo "_check_btrfs_filesystem: filesystem on $device is inconsistent (see $seqres.full)"

diff --git a/common/rc b/common/rc
index 2719b23..b50283c 100644
--- a/common/rc
+++ b/common/rc
@@ -1222,8 +1222,9 @@ _notrun()
 {
 	echo "$*" > $seqres.notrun
 	echo "$seq not run: $*"
-	rm -f ${RESULT_DIR}/require_test
-	rm -f ${RESULT_DIR}/require_scratch
+	rm -f ${RESULT_DIR}/require_test*
+	rm -f ${RESULT_DIR}/require_scratch*
+
 	status=0
 	exit
 }

diff --git a/tests/btrfs/022 b/tests/btrfs/022
index 56d4f3d..9abad6c 100755
--- a/tests/btrfs/022
+++ b/tests/btrfs/022
@@ -43,6 +43,7 @@ _cleanup()
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_btrfs_qgroup_report

 rm -f $seqres.full

@@ -125,16 +126,19 @@ _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _basic_test
 _scratch_unmount
+_check_scratch_fs

 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _rescan_test
 _scratch_unmount
+_check_scratch_fs

 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _limit_test_exceed
 _scratch_unmount
+_check_scratch_fs

 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount

diff --git a/tests/btrfs/028 b/tests/btrfs/028
index 1425609..a3d9a27 100755
--- a/tests/btrfs/028
+++ b/tests/btrfs/028
@@ -51,6 +51,7 @@ rm -f $seqres.full
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_btrfs_qgroup_report

 _scratch_mkfs
 _scratch_mount
@@ -86,9 +87,7 @@ _run_btrfs_util_prog filesystem sync $SCRATCH_MNT
 _scratch_unmount

-# generate a qgroup report and look for inconsistent groups
-$BTRFS_UTIL_PROG check --qgroup-report $SCRATCH_DEV 2>&1 | \
-	grep -E "Counts for qgroup.*are different"
+# qgroup will be checked at _check_scratch_fs() by fstest.

 echo "Silence is golden"
 status=0

diff --git a/tests/btrfs/042 b/tests/btrfs/042
index 498ccc9..cf3eac2 100755
--- a/tests/btrfs/042
+++ b/tests/btrfs/042
@@ -43,6 +43,7 @@ _cleanup()
WARNING at extent-tree.c:1621 lookup_inline_extent_backref and extent-tree.c:2960 btrfs_run_delayed_refs (Linux 4.8.11)
Basically, some sectors on a multi-disk btrfs filesystem got corrupted due to an HDD fault. Now, after scrub has finished (with some uncorrectable errors), the following warnings appear when doing a metadata balance:

$ btrfs balance start -m /mnt/Data

[ 612.683979] WARNING: CPU: 0 PID: 6 at /mnt/linux/fs/btrfs/extent-tree.c:1621 lookup_inline_extent_backref+0x45a/0x5a0 [btrfs]
[ 612.683980] Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun cfg80211 rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT xt_tcpudp ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute ebtable_nat ip6table_mangle ip6table_security ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw iptable_mangle iptable_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw ebtable_filter ebtables ip6table_filter bridge ip6_tables stp iptable_filter llc it87 hwmon_vid kvm_amd kvm irqbypass crct10dif_pclmul saa7134_alsa crc32_pclmul snd_hda_codec_hdmi snd_hda_codec_realtek ghash_clmulni_intel aesni_intel snd_hda_codec_generic aes_x86_64 saa7134 lrw gf128mul snd_hda_intel
[ 612.684023] tveeprom snd_hda_codec glue_helper rc_core ablk_helper snd_hda_core v4l2_common cryptd videobuf2_dma_sg videobuf2_memops videobuf2_v4l2 videobuf2_core videodev mac_hid evdev mxm_wmi pcspkr k10temp fam15h_power media r8169 mii snd_hwdep snd_pcm snd_timer snd sp5100_tco acpi_cpufreq shpchp tpm_infineon tpm_tis tpm_tis_core wmi button tpm i2c_piix4 soundcore nfsd auth_rpcgss oid_registry nfs_acl lockd sch_fq_codel grace sunrpc nvidia_uvm(PO) btrfs xor zlib_deflate raid6_pq sd_mod ata_generic pata_acpi serio_raw atkbd libps2 nvidia_drm(PO) nvidia_modeset(PO) ohci_pci nvidia(PO) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops xhci_pci drm xhci_hcd firewire_ohci ohci_hcd ehci_pci ehci_hcd firewire_core crc_itu_t pata_atiixp usbcore ahci libahci usb_common i2c_core i8042 serio
[ 612.684077] mvsas libsas
scsi_transport_sas libata scsi_mod crc32c_generic crc32c_intel vfat fat ip_tables x_tables [ 612.684088] CPU: 0 PID: 6 Comm: kworker/u16:0 Tainted: P O4.8.11-ARCH-dirty #1 [ 612.684089] Hardware name: Gigabyte Technology Co., Ltd. GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013 [ 612.684107] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs] [ 612.684110] 0286 defb6477 880612f27a10 812e9a00 [ 612.684113] 880612f27a50 8107bcab [ 612.684116] 06550053 880477ed6e00 009626bbf000 [ 612.684119] Call Trace: [ 612.684125] [] dump_stack+0x63/0x83 [ 612.684128] [] __warn+0xcb/0xf0 [ 612.684131] [] warn_slowpath_null+0x1d/0x20 [ 612.684145] [] lookup_inline_extent_backref+0x45a/0x5a0 [btrfs] [ 612.684159] [] ? leaf_space_used+0xeb/0x120 [btrfs] [ 612.684174] [] insert_inline_extent_backref+0x68/0xf0 [btrfs] [ 612.684188] [] ? btrfs_alloc_path+0x1a/0x20 [btrfs] [ 612.684202] [] __btrfs_inc_extent_ref.isra.22+0x9c/0x260 [btrfs] [ 612.684218] [] ? find_ref_head+0x6f/0x80 [btrfs] [ 612.684233] [] __btrfs_run_delayed_refs+0x768/0x12e0 [btrfs] [ 612.684249] [] btrfs_run_delayed_refs+0xa1/0x2a0 [btrfs] [ 612.684264] [] delayed_ref_async_start+0x94/0xb0 [btrfs] [ 612.684281] [] btrfs_scrubparity_helper+0x7d/0x340 [btrfs] [ 612.684298] [] btrfs_extent_refs_helper+0xe/0x10 [btrfs] [ 612.684302] [] process_one_work+0x1e5/0x470 [ 612.684304] [] worker_thread+0x48/0x4e0 [ 612.684307] [] ? process_one_work+0x470/0x470 [ 612.684310] [] kthread+0xd8/0xf0 [ 612.684312] [] ? __switch_to+0x300/0x720 [ 612.684316] [] ret_from_fork+0x1f/0x40 [ 612.684319] [] ? 
kthread_worker_fn+0x170/0x170 [ 612.684320] ---[ end trace 89b3284677d454e0 ]--- [ 612.684323] [ cut here ] [ 612.684339] WARNING: CPU: 0 PID: 6 at /mnt/linux/fs/btrfs/extent-tree.c:2960 btrfs_run_delayed_refs+0x277/0x2a0 [btrfs] [ 612.684340] BTRFS: Transaction aborted (error -5) [ 612.684340] Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun cfg80211 rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT xt_tcpudp ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute ebtable_nat ip6table_mangle ip6table_security ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw iptable_mangle iptable_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw ebtable_filter ebtables ip6table_filter bridge ip6_tables stp iptable_filter llc it87 hwmon_vid kvm_amd kvm irqbypass crct10dif_pclmul saa7134_alsa crc32_pclmul snd_hda_codec_hdmi snd_hda_codec_realtek ghash_clmulni_intel aesni_intel snd_hda_codec_generic aes_x86_64 saa7134 lrw gf128mul snd_hda_intel [ 612.684375] tveeprom snd_hda_codec glue_helper rc_core ablk_helper snd_hda_core v4l2_common cryptd videobuf2_dma_sg videobuf2_memops
Re: [PATCH 2/2] btrfs-convert: Fix migrate_super_block() to work with 64k sectorsize
At 12/08/2016 09:56 PM, Chandan Rajendra wrote:

migrate_super_block() uses sectorsize to refer to the size of the superblock. Hence on 64k sectorsize filesystems, it ends up computing checksum beyond the super block length (i.e. BTRFS_SUPER_INFO_SIZE). This commit fixes the bug by using BTRFS_SUPER_INFO_SIZE instead of sectorsize of the underlying filesystem.

Signed-off-by: Chandan Rajendra

Reviewed-by: Qu Wenruo

BTW, would you please enhance the convert tests? Current convert tests only use 4K as block size. So adding 64K blocksize would definitely improve the tests.

Thanks,
Qu

---
 convert/main.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/convert/main.c b/convert/main.c
index 1148a36..fd6f77b 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -1360,7 +1360,7 @@ err:
 /*
  * Migrate super block to its default position and zero 0 ~ 16k
  */
-static int migrate_super_block(int fd, u64 old_bytenr, u32 sectorsize)
+static int migrate_super_block(int fd, u64 old_bytenr)
 {
 	int ret;
 	struct extent_buffer *buf;
@@ -1368,13 +1368,13 @@ static int migrate_super_block(int fd, u64 old_bytenr, u32 sectorsize)
 	u32 len;
 	u32 bytenr;

-	buf = malloc(sizeof(*buf) + sectorsize);
+	buf = malloc(sizeof(*buf) + BTRFS_SUPER_INFO_SIZE);
 	if (!buf)
 		return -ENOMEM;
-	buf->len = sectorsize;
-	ret = pread(fd, buf->data, sectorsize, old_bytenr);
-	if (ret != sectorsize)
+	buf->len = BTRFS_SUPER_INFO_SIZE;
+	ret = pread(fd, buf->data, BTRFS_SUPER_INFO_SIZE, old_bytenr);
+	if (ret != BTRFS_SUPER_INFO_SIZE)
 		goto fail;

 	super = (struct btrfs_super_block *)buf->data;
@@ -1382,19 +1382,20 @@ static int migrate_super_block(int fd, u64 old_bytenr, u32 sectorsize)
 	btrfs_set_super_bytenr(super, BTRFS_SUPER_INFO_OFFSET);
 	csum_tree_block_size(buf, BTRFS_CRC32_SIZE, 0);
-	ret = pwrite(fd, buf->data, sectorsize, BTRFS_SUPER_INFO_OFFSET);
-	if (ret != sectorsize)
+	ret = pwrite(fd, buf->data, BTRFS_SUPER_INFO_SIZE,
+		     BTRFS_SUPER_INFO_OFFSET);
+	if (ret != BTRFS_SUPER_INFO_SIZE)
 		goto fail;

 	ret = fsync(fd);
 	if (ret)
 		goto fail;

-	memset(buf->data, 0, sectorsize);
+	memset(buf->data, 0, BTRFS_SUPER_INFO_SIZE);
 	for (bytenr = 0; bytenr < BTRFS_SUPER_INFO_OFFSET; ) {
 		len = BTRFS_SUPER_INFO_OFFSET - bytenr;
-		if (len > sectorsize)
-			len = sectorsize;
+		if (len > BTRFS_SUPER_INFO_SIZE)
+			len = BTRFS_SUPER_INFO_SIZE;
 		ret = pwrite(fd, buf->data, len, bytenr);
 		if (ret != len) {
 			fprintf(stderr, "unable to zero fill device\n");
@@ -2519,7 +2520,7 @@ static int do_convert(const char *devname, int datacsum, int packing,
 	 * If this step succeed, we get a mountable btrfs. Otherwise
 	 * the source fs is left unchanged.
 	 */
-	ret = migrate_super_block(fd, mkfs_cfg.super_bytenr, blocksize);
+	ret = migrate_super_block(fd, mkfs_cfg.super_bytenr);
 	if (ret) {
 		error("unable to migrate super block: %d", ret);
 		goto fail;
Re: [PATCH 1/2] btrfs-progs: btrfs-convert: Prevent accounting blocks beyond end of device
Hi Chandan,

Thanks for the patch.

At 12/08/2016 09:56 PM, Chandan Rajendra wrote:

When looping across data block bitmap, __ext2_add_one_block() may add blocks which do not exist on the underlying disk. This commit prevents this from happening by checking the block index against the maximum block count that was present in the ext4 filesystem instance that is being converted.

The patch looks good to me.

Reviewed-by: Qu Wenruo

Just curious whether such an image can pass e2fsck. And would you please upload a minimal image as a btrfs-progs test case?

Thanks,
Qu

Signed-off-by: Chandan Rajendra
---
 convert/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/convert/main.c b/convert/main.c
index 4b4cea4..1148a36 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -1525,6 +1525,9 @@ static int __ext2_add_one_block(ext2_filsys fs, char *bitmap,
 	offset /= EXT2FS_CLUSTER_RATIO(fs);
 	offset += group_nr * EXT2_CLUSTERS_PER_GROUP(fs->super);
 	for (i = 0; i < EXT2_CLUSTERS_PER_GROUP(fs->super); i++) {
+		if ((i + offset) >= ext2fs_blocks_count(fs->super))
+			break;
+
 		if (ext2fs_test_bit(i, bitmap)) {
 			u64 start;
Re: out-of-band dedup status?
OK something's wrong. Kernel 4.8.12 and duperemove v0.11.beta4. Brand new file system (mkfs.btrfs -dsingle -msingle, default mount options) and two identical files separately copied.

[chris@f25s]$ ls -li /mnt/test
total 2811904
260 -rw-r--r--. 1 root root 1439694848 Dec 8 17:26 Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso
259 -rw-r--r--. 1 root root 1439694848 Dec 8 17:26 Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2

[chris@f25s]$ filefrag /mnt/test/*
/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso: 3 extents found
/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2: 2 extents found

[chris@f25s duperemove]$ sudo ./duperemove -dv /mnt/test/*
Using 128K blocks
Using hash: murmur3
Gathering file list...
Using 4 threads for file hashing phase
[1/2] (50.00%) csum: /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso
[2/2] (100.00%) csum: /mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2
Total files: 2
Total hashes: 21968
Loading only duplicated hashes from hashfile.
Using 4 threads for dedupe phase
[0xba8400] (1/10947) Try to dedupe extents with id e47862ea
[0xba84a0] (3/10947) Try to dedupe extents with id ffed44f2
[0xba84f0] (2/10947) Try to dedupe extents with id ffeefcdd
[0xba8540] (4/10947) Try to dedupe extents with id ffe4cf64
[0xba8540] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset 1182924800 (4)
[0xba8540] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 1182924800 (5)
[0xba8540] Dedupe 1 extents (id: ffe4cf64) with target: (1182924800, 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso"
[0xba8540] (4/10947) Try to dedupe extents with id ffe4cf64
[0xba84a0] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset 543293440 (4)
[0xba84a0] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 543293440 (5)
[0xba84a0] Dedupe 1 extents (id: ffed44f2) with target: (543293440, 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso"
[0xba8540] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 1182924800 (5)
[0xba8540] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset 1182924800 (4)
[0xba8540] Dedupe 1 extents (id: ffe4cf64) with target: (1182924800, 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2"
[0xba84a0] (3/10947) Try to dedupe extents with id ffed44f2
[0xba84a0] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 543293440 (5)
[0xba84a0] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset 543293440 (4)
[0xba84a0] Dedupe 1 extents (id: ffed44f2) with target: (543293440, 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2"
[0xba84f0] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset 101580800 (4)
[0xba84f0] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 101580800 (5)
[0xba84f0] Dedupe 1 extents (id: ffeefcdd) with target: (101580800, 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso"
[0xba84a0] (5/10947) Try to dedupe extents with id ffe24eaf
[0xba84a0] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset 171835392 (4)
[0xba84a0] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 171835392 (5)
[0xba84a0] Dedupe 1 extents (id: ffe24eaf) with target: (171835392, 131072), "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso"
[0xba84f0] (2/10947) Try to dedupe extents with id ffeefcdd
[0xba8540] (6/10947) Try to dedupe extents with id ffe116c8
[0xba8400] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso" at offset 52035584 (4)
[0xba8400] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 52035584 (5)
[0xba8400] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 52166656 (5)
[0xba8400] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 60030976 (5)
[0xba8400] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 60162048 (5)
[0xba8400] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 60293120 (5)
[0xba8400] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 60424192 (5)
[0xba8400] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 60555264 (5)
[0xba8400] Add extent for file "/mnt/test/Fedora-Workstation-Live-x86_64-25_Beta-1.1.iso2" at offset 60686336 (5)
[...snip...]

10 minutes later...

[0xba84f0] (06233/10947) Try to dedupe extents with id 703ebf5c
[0xba8400] (06234/10947) Try to dedupe extents with
Re: [PATCH 1/2] btrfs: qgroup: Introduce extent changeset for qgroup reserve functions
At 12/08/2016 10:29 PM, Chandan Rajendra wrote:
> On Friday, December 02, 2016 10:03:06 AM Qu Wenruo wrote:
>> Introduce a new parameter, struct extent_changeset, for
>> btrfs_qgroup_reserve_data() and its callers.
>>
>> Such an extent_changeset is used in btrfs_qgroup_reserve_data() to
>> record which ranges it reserved in the current reserve, so it can free
>> them at the error path.
>>
>> The reason we need to export it to callers is that, at the buffered
>> write error path, without knowing exactly which ranges we reserved in
>> the current allocation, we may free space which was not reserved by us.
>> This will lead to qgroup reserved space underflow.
>
> Hi Qu,
>
> On which git tree are these patches based off? This patch fails to apply
> cleanly on kdave/for-next branch that is available as of today.

It's based on David's for-next-20161125 branch, but with several other qgroup fixes as prerequisite:

btrfs: qgroup: Fix qgroup reserved space underflow caused by buffered write and quota enable
btrfs: qgroup: Return actually freed bytes for qgroup release or free data
btrfs: Add trace point for qgroup reserved space
btrfs: Add WARN_ON for qgroup reserved underflow

Or you can get the branch from my github repo:
https://github.com/adam900710/linux.git qgroup_fixes

Thanks,
Qu
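The idea behind the extent_changeset parameter can be sketched outside the kernel. This is a hypothetical Python model (invented names, not the btrfs code): the reservation records exactly which ranges it newly added, so the error path frees only those and never touches space reserved by someone else.

```python
# Hypothetical sketch of the extent_changeset idea: reserve_data() records
# exactly which units it newly reserved, so the error path can free only
# those, avoiding the underflow caused by freeing someone else's reservation.

class Changeset:
    def __init__(self):
        self.bytes_changed = 0
        self.ranges = []                 # units newly reserved by this call

def reserve_data(reserved, start, length, changeset):
    """Reserve [start, start+length); skip parts already reserved."""
    for off in range(start, start + length):
        if off not in reserved:
            reserved.add(off)
            changeset.ranges.append(off)
            changeset.bytes_changed += 1

def free_changeset(reserved, changeset):
    """Error path: release only what this reservation actually added."""
    for off in changeset.ranges:
        reserved.discard(off)

reserved = {0, 1, 2}                     # reserved by an earlier write
cs = Changeset()
reserve_data(reserved, 0, 6, cs)         # overlaps the earlier reservation
free_changeset(reserved, cs)             # buffered-write error path
print(sorted(reserved), cs.bytes_changed)
# Prints [0, 1, 2] 3 -- the earlier reservation survives the error path.
```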
Re: out-of-band dedup status?
On Thursday 08 December 2016 13:41:36 Chris Murphy wrote:
> Pretty sure it will not dedupe extents that are referenced in a read
> only subvolume.

I've used duperemove to de-duplicate files in read-only snapshots (of different systems) on my backup drive, so unless you're referencing some specific issue, I'm pretty sure you're wrong about that. Maybe you're thinking of the occasionally mentioned old dedup kernel implementation?
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup
Re: out-of-band dedup status?
On Thu, 2016-12-08 at 13:41 -0700, Chris Murphy wrote:
> Pretty sure it will not dedupe extents that are referenced in a read
> only subvolume.

Oh... hm.. well that would be quite some limitation, cause as soon as one has a snapshot of the full fs (which is probably not so unlikely) it won't work anymore, cause everything is referenced by the backup ro-snapshots... :(

Cheers,
Chris.
Re: duperemove : some real world figures on BTRFS deduplication
On 2016-12-08 15:07, Jeff Mahoney wrote: On 12/8/16 10:42 AM, Austin S. Hemmelgarn wrote: On 2016-12-08 10:11, Swâmi Petaramesh wrote: Hi, Some real world figures about running duperemove deduplication on BTRFS : I have an external 2,5", 5400 RPM, 1 TB HD, USB3, on which I store the BTRFS backups (full rsync) of 5 PCs, using 2 different distros, typically at the same update level, and all of them more of less sharing the entirety or part of the same set of user files. For each of these PCs I keep a series of 4-5 BTRFS subvolume snapshots for having complete backups at different points in time. The HD was full to 93% and made a good testbed for deduplicating. So I ran duperemove on this HD, on a machine doing "only this", using a hashfile. The machine being an Intel i5 with 6 GB of RAM. Well, the damn thing has been running for 15 days uninterrupted ! ...Until I [Ctrl]-C it this morning as I had to move with the machine (I wasn't expecting it to last THAT long...). It took about 48 hours just for calculating the files hashes. Then it took another 48 hours just for "loading the hashes of duplicate extents". Then it took 11 days deduplicating until I killed it. At the end, the disk that was 93% full is now 76% full, so I saved 17% of 1 TB (170 GB) by deduplicating for 15 days. Well the thing "works" and my disk isn't full anymore, so that's a very partial success, but still l wonder if the gain is worth the effort... So, some general explanation here: Duperemove hashes data in blocks of (by default) 128kB, which means for ~930GB, you've got about 7618560 blocks to hash, which partly explains why it took so long to hash. Once that's done, it then has to compare hashes for all combinations of those blocks, which totals to 58042456473600 comparisons (hence that taking a long time). 
The block size thus becomes a trade-off between performance when hashing and actual space savings (smaller block size makes hashing take longer, but gives overall slightly better results for deduplication).

IIRC, the core of the duperemove duplicate matcher isn't an O(n^2) algorithm. I think Mark used a bloom filter to reduce the data set prior to matching, but I haven't looked at the code in a while.

You're right, I had completely forgotten about that. Regardless of that though, it's still a lot of processing that needs done.
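The figures quoted in this thread can be reproduced directly; note the 58042456473600 number is the naive n*n all-pairs count, which (as just noted) the real matcher avoids:

```python
# Reproducing the figures from the thread: ~930 GB hashed in 128 KiB
# blocks, and the quoted comparison count, which is the naive n*n
# all-pairs figure.

data_bytes = 930 * 1024**3               # ~930 GB of backup data
block_bytes = 128 * 1024                 # duperemove's default block size

blocks = data_bytes // block_bytes
print(blocks)                            # 7618560, as stated in the thread

naive_pairs = blocks * blocks            # compare every block against every block
print(naive_pairs)                       # 58042456473600, the quoted figure

unordered_pairs = blocks * (blocks - 1) // 2
print(unordered_pairs)                   # 29021224427520, still ~2.9e13 even
                                         # without self- and mirrored pairs
```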
Re: out-of-band dedup status?
Pretty sure it will not dedupe extents that are referenced in a read only subvolume.

Chris Murphy
Re: out-of-band dedup status?
On 12/8/16 1:36 PM, Christoph Anton Mitterer wrote: > Hey. > > I just wondered whether out-of-band/"offline" dedup is safe for general > use... https://btrfs.wiki.kernel.org/index.php/Status kinda implies so > (it tells about unspecified performance issues), but this seems again > already outdated (kernel 4.7)... > :-( SUSE supports it in SLE12 using our 3.12 and 4.4 -based kernels. There haven't been a lot of changes to the kernel component of it. It's pretty simple: check to see if the ranges are identical between two files and then reflink between them. > My intention was to use it with duperemove, but AFAIU, the kernel > itself will anyway do a byte-by-byte comparison before any > deduplication, so in principle it should be totally safe regardless of > the stability of the userland tool, right? > Especially I wouldn't want that "identity" is only assumed because of > some checksum identity (or collision ;) ). Yep. It does a full check in the kernel for precisely that reason. It's not even enough to do it in userspace because we don't want dedupe to be race prone. It's either atomically identical or it's not, and we don't dedupe if it's not. If it changes immediately after the ioctl returns, that's fine -- the cloned range will be CoW'd properly. > Also, is there anything to take note of when this is used with > compression and snapshots? I don't believe so. IIRC dedupe maps the file to see if it's already cloned, so it's safe for snapshots (or could relink extents in a snapshot that diverged and then were restored to their original contents. Dedupe works with the uncompressed data, so compression shouldn't matter here. I haven't tested it, though. > What when I use it with incremental send/receive... i.e. I dedupe the > "master" and then send/receive this to another btrfs... will it work > (that is will the copy be also deduplicated, with no longer needed > extents properly being freed)... or at least not cause any corruptions? It should. 
IIRC send also maps the file (using a different mechanism) and receive will clone those ranges on the other end.

> Any other things in terms of possible issues, data corruption, etc.
> that one should know when using deduplication?

There shouldn't be. We haven't had any bug reports at SUSE.

-Jeff
--
Jeff Mahoney
SUSE Labs
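For reference, the kernel interface Jeff is describing is the FIDEDUPERANGE ioctl. Below is a hedged Python sketch of driving it (struct layout from linux/fs.h; not duperemove's actual code): the kernel locks both ranges, compares them byte-by-byte, and only shares extents when they are identical, which is why a userspace hash collision cannot corrupt data. It needs a reflink-capable filesystem (btrfs, XFS) to actually succeed.

```python
# Sketch of calling the FIDEDUPERANGE ioctl from Python. The kernel does
# the byte-by-byte identity check itself; if the ranges differ, status is
# set to FILE_DEDUPE_RANGE_DIFFERS and nothing is shared. On filesystems
# without reflink support the ioctl fails with EOPNOTSUPP.

import struct, fcntl

FIDEDUPERANGE = 0xC0189436               # _IOWR(0x94, 54, struct file_dedupe_range)
FILE_DEDUPE_RANGE_SAME = 0

def dedupe(src_fd, src_off, length, dest_fd, dest_off):
    # struct file_dedupe_range: u64 src_offset, u64 src_length,
    #                           u16 dest_count, u16 + u32 reserved
    hdr = struct.pack("=QQHHI", src_off, length, 1, 0, 0)
    # struct file_dedupe_range_info: s64 dest_fd, u64 dest_offset,
    #                                u64 bytes_deduped, s32 status, u32 rsvd
    info = struct.pack("=qQQiI", dest_fd, dest_off, 0, 0, 0)
    arg = bytearray(hdr + info)
    fcntl.ioctl(src_fd, FIDEDUPERANGE, arg)  # kernel fills in the result
    _, _, deduped, status, _ = struct.unpack_from("=qQQiI", arg, len(hdr))
    return deduped if status == FILE_DEDUPE_RANGE_SAME else 0
```

With two fds open on identical data on btrfs, dedupe(src, 0, 131072, dst, 0) should report 131072 bytes deduped; duperemove and xfs_io's dedupe command drive this same ioctl.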
Re: duperemove : some real world figures on BTRFS deduplication
On 12/8/16 10:42 AM, Austin S. Hemmelgarn wrote: > On 2016-12-08 10:11, Swâmi Petaramesh wrote: >> Hi, Some real world figures about running duperemove deduplication on >> BTRFS : >> >> I have an external 2,5", 5400 RPM, 1 TB HD, USB3, on which I store the >> BTRFS backups (full rsync) of 5 PCs, using 2 different distros, >> typically at the same update level, and all of them more of less sharing >> the entirety or part of the same set of user files. >> >> For each of these PCs I keep a series of 4-5 BTRFS subvolume snapshots >> for having complete backups at different points in time. >> >> The HD was full to 93% and made a good testbed for deduplicating. >> >> So I ran duperemove on this HD, on a machine doing "only this", using a >> hashfile. The machine being an Intel i5 with 6 GB of RAM. >> >> Well, the damn thing has been running for 15 days uninterrupted ! >> ...Until I [Ctrl]-C it this morning as I had to move with the machine (I >> wasn't expecting it to last THAT long...). >> >> It took about 48 hours just for calculating the files hashes. >> >> Then it took another 48 hours just for "loading the hashes of duplicate >> extents". >> >> Then it took 11 days deduplicating until I killed it. >> >> At the end, the disk that was 93% full is now 76% full, so I saved 17% >> of 1 TB (170 GB) by deduplicating for 15 days. >> >> Well the thing "works" and my disk isn't full anymore, so that's a very >> partial success, but still l wonder if the gain is worth the effort... > So, some general explanation here: > Duperemove hashes data in blocks of (by default) 128kB, which means for > ~930GB, you've got about 7618560 blocks to hash, which partly explains > why it took so long to hash. Once that's done, it then has to compare > hashes for all combinations of those blocks, which totals to > 58042456473600 comparisons (hence that taking a long time). 
> The block size thus becomes a trade-off between performance when hashing
> and actual space savings (smaller block size makes hashing take longer,
> but gives overall slightly better results for deduplication).

IIRC, the core of the duperemove duplicate matcher isn't an O(n^2) algorithm. I think Mark used a bloom filter to reduce the data set prior to matching, but I haven't looked at the code in a while.

-Jeff
--
Jeff Mahoney
SUSE Labs
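A generic sketch of that bloom-filter pre-pass (illustrative only, not duperemove's actual code): the first scan inserts every block hash into the filter, and a hash becomes a duplicate *candidate* only if it was possibly seen before, so unique blocks never reach the expensive matching phase.

```python
# Generic bloom-filter pre-pass: unique block hashes are filtered out
# cheaply, so only likely duplicates enter the exact matching phase.
# (False positives are possible but only cost an extra exact comparison.)

import hashlib

class BloomFilter:
    def __init__(self, bits=1 << 20, hashes=4):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, item):
        # Derive several bit positions from salted blake2b digests.
        for i in range(self.hashes):
            h = hashlib.blake2b(item, salt=i.to_bytes(8, "little")).digest()
            yield int.from_bytes(h[:8], "little") % self.bits

    def add(self, item):
        """Insert item; return True if it was possibly seen before."""
        seen = True
        for p in self._positions(item):
            if not (self.array[p // 8] >> (p % 8)) & 1:
                seen = False
                self.array[p // 8] |= 1 << (p % 8)
        return seen

bf = BloomFilter()
block_hashes = [b"aaaa", b"bbbb", b"aaaa", b"cccc", b"bbbb"]
candidates = [h for h in block_hashes if bf.add(h)]
print(candidates)                        # only the repeated hashes survive
```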
Re: duperemove : some real world figures on BTRFS deduplication
On 12/8/16 10:11 AM, Swâmi Petaramesh wrote: > Hi, Some real world figures about running duperemove deduplication on > BTRFS : > > I have an external 2,5", 5400 RPM, 1 TB HD, USB3, on which I store the > BTRFS backups (full rsync) of 5 PCs, using 2 different distros, > typically at the same update level, and all of them more of less sharing > the entirety or part of the same set of user files. > > For each of these PCs I keep a series of 4-5 BTRFS subvolume snapshots > for having complete backups at different points in time. > > The HD was full to 93% and made a good testbed for deduplicating. > > So I ran duperemove on this HD, on a machine doing "only this", using a > hashfile. The machine being an Intel i5 with 6 GB of RAM. > > Well, the damn thing has been running for 15 days uninterrupted ! > ...Until I [Ctrl]-C it this morning as I had to move with the machine (I > wasn't expecting it to last THAT long...). > > It took about 48 hours just for calculating the files hashes. > > Then it took another 48 hours just for "loading the hashes of duplicate > extents". > > Then it took 11 days deduplicating until I killed it. > > At the end, the disk that was 93% full is now 76% full, so I saved 17% > of 1 TB (170 GB) by deduplicating for 15 days. > > Well the thing "works" and my disk isn't full anymore, so that's a very > partial success, but still l wonder if the gain is worth the effort... What version were you using? I know Mark had put a bunch of effort into reducing the memory footprint and runtime. The earlier versions were "can we get this thing working" while the newer versions are more efficient. What throughput are you getting to that disk? I get that it's USB3, but reading 1TB doesn't take a terribly long time so 15 days is pretty ridiculous. At any rate, the good news is that when you run it again, assuming you used the hash file, it will not have to rescan most of your data set. 
-Jeff
--
Jeff Mahoney
SUSE Labs
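Jeff's throughput point in numbers (using the figures reported earlier in the thread):

```python
# Hashing ~930 GB took ~48 hours, which works out to single-digit MB/s,
# far below what even a 5400 RPM USB3 disk can stream sequentially.

data_mb = 930 * 1024                     # ~930 GB expressed in MB
hash_seconds = 48 * 3600                 # the reported 48-hour hashing phase

observed = data_mb / hash_seconds
print(round(observed, 1))                # ~5.5, close to the ~5.6 MB/s
                                         # quoted elsewhere in the thread

for disk_mb_s in (80, 100):              # plausible sequential disk rates
    hours = data_mb / disk_mb_s / 3600
    print(disk_mb_s, round(hours, 1))    # a full read is ~2.6-3.3 hours
```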
out-of-band dedup status?
Hey.

I just wondered whether out-of-band/"offline" dedup is safe for general use... https://btrfs.wiki.kernel.org/index.php/Status kinda implies so (it tells about unspecified performance issues), but this seems again already outdated (kernel 4.7)... :-(

My intention was to use it with duperemove, but AFAIU, the kernel itself will anyway do a byte-by-byte comparison before any deduplication, so in principle it should be totally safe regardless of the stability of the userland tool, right? Especially I wouldn't want that "identity" is only assumed because of some checksum identity (or collision ;) ).

Also, is there anything to take note of when this is used with compression and snapshots?

What when I use it with incremental send/receive... i.e. I dedupe the "master" and then send/receive this to another btrfs... will it work (that is will the copy be also deduplicated, with no longer needed extents properly being freed)... or at least not cause any corruptions?

Any other things in terms of possible issues, data corruption, etc. that one should know when using deduplication?

Thanks :)
Chris.
Re: duperemove : some real world figures on BTRFS deduplication
2016-12-08 18:42 GMT+03:00 Austin S. Hemmelgarn: > On 2016-12-08 10:11, Swâmi Petaramesh wrote: >> >> Hi, Some real world figures about running duperemove deduplication on >> BTRFS : >> >> I have an external 2,5", 5400 RPM, 1 TB HD, USB3, on which I store the >> BTRFS backups (full rsync) of 5 PCs, using 2 different distros, >> typically at the same update level, and all of them more of less sharing >> the entirety or part of the same set of user files. >> >> For each of these PCs I keep a series of 4-5 BTRFS subvolume snapshots >> for having complete backups at different points in time. >> >> The HD was full to 93% and made a good testbed for deduplicating. >> >> So I ran duperemove on this HD, on a machine doing "only this", using a >> hashfile. The machine being an Intel i5 with 6 GB of RAM. >> >> Well, the damn thing has been running for 15 days uninterrupted ! >> ...Until I [Ctrl]-C it this morning as I had to move with the machine (I >> wasn't expecting it to last THAT long...). >> >> It took about 48 hours just for calculating the files hashes. >> >> Then it took another 48 hours just for "loading the hashes of duplicate >> extents". >> >> Then it took 11 days deduplicating until I killed it. >> >> At the end, the disk that was 93% full is now 76% full, so I saved 17% >> of 1 TB (170 GB) by deduplicating for 15 days. >> >> Well the thing "works" and my disk isn't full anymore, so that's a very >> partial success, but still l wonder if the gain is worth the effort... > > So, some general explanation here: > Duperemove hashes data in blocks of (by default) 128kB, which means for > ~930GB, you've got about 7618560 blocks to hash, which partly explains why > it took so long to hash. Once that's done, it then has to compare hashes > for all combinations of those blocks, which totals to 58042456473600 > comparisons (hence that taking a long time). 
The block size thus becomes a
> trade-off between performance when hashing and actual space savings (smaller
> block size makes hashing take longer, but gives overall slightly better
> results for deduplication).
>
> As far as the rest, given your hashing performance (which is not
> particularly good I might add, roughly 5.6MB/s), the amount of time it was
> taking to do the actual deduplication is reasonable since the deduplication
> ioctl does a byte-wise comparison of the extents to be deduplicated prior to
> actually ref-linking them to ensure you don't lose data.
>
> Because of this, generic batch deduplication is not all that great on BTRFS.
> There are cases where it can work, but usually they're pretty specific
> cases. In most cases though, you're better off doing a custom tool that
> knows about how your data is laid out and what's likely to be duplicated
> (I've actually got two tools for this for the two cases where I use
> deduplication, they use knowledge of the data-set itself to figure out
> what's duplicated, then just call the ioctl through a wrapper (previously
> the one included in duperemove, currently xfs_io)).

Zygo did a good job on this too. Try: https://github.com/Zygo/bees

It's cool and can work better on large masses of data, because it dedupes at the same time as the scanning phase.

--
Have a nice day,
Timofey.
Re: [RESEND][PATCH v2] btrfs-progs: add dev stats returncode option
On 2016-12-08 12:20, David Sterba wrote:
> On Mon, Dec 05, 2016 at 01:35:20PM -0500, Austin S. Hemmelgarn wrote:
>> Currently, `btrfs device stats` returns non-zero only when there was an
>> error getting the counter values. This is fine for when it gets run by a
>> user directly, but is a serious pain when trying to use it in a script or
>> for monitoring since you need to parse the (not at all machine friendly)
>> output to check the counter values.
>>
>> This patch adds an option ('-s') which causes `btrfs device stats` to set
>> bit 6 in the return code if any of the counters are non-zero. This greatly
>> simplifies checking from a script or monitoring software if any errors
>> have been recorded. In the event that this switch is passed and an error
>> occurs reading the stats, the return code will have bit 0 set (so if there
>> are errors reading counters, and the counters which were read were
>> non-zero, the return value will be 65).
>
> So a typical check in a script would look for either 64 or 65 returned
> from the command, I don't think we can do it simpler.
>
> The option naming is a bit confusing to me, as it duplicates the 'stats'
> from the command itself. I'd suggest to use '--check' instead, does it
> sound OK to you?
>
> I'll apply the patch as-is for now (and maybe do some cleanups in the
> surrounding code).

Yeah, --check is fine. Like I said, I'm not too picky about the name as long as it works.
Re: [RESEND][PATCH v2] btrfs-progs: add dev stats returncode option
On Mon, Dec 05, 2016 at 01:35:20PM -0500, Austin S. Hemmelgarn wrote:
> Currently, `btrfs device stats` returns non-zero only when there was an
> error getting the counter values. This is fine for when it gets run by a
> user directly, but is a serious pain when trying to use it in a script or
> for monitoring since you need to parse the (not at all machine friendly)
> output to check the counter values.
>
> This patch adds an option ('-s') which causes `btrfs device stats`
> to set bit 6 in the return code if any of the counters are non-zero.
> This greatly simplifies checking from a script or monitoring software if
> any errors have been recorded. In the event that this switch is passed
> and an error occurs reading the stats, the return code will have bit
> 0 set (so if there are errors reading counters, and the counters which
> were read were non-zero, the return value will be 65).

So a typical check in a script would look for either 64 or 65 returned from the command, I don't think we can do it simpler.

The option naming is a bit confusing to me, as it duplicates the 'stats' from the command itself. I'd suggest to use '--check' instead, does it sound OK to you?

I'll apply the patch as-is for now (and maybe do some cleanups in the surrounding code).
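The convention described above, as a monitoring script might consume it (the option name was still being settled between '-s' and '--check'):

```python
# Interpreting the return-code convention from the patch discussion:
# bit 6 set => some device counter was non-zero, bit 0 set => an error
# occurred while reading the counters themselves.

ERRORS_FOUND = 1 << 6                    # 64: a device counter is non-zero
READ_FAILED = 1 << 0                     # 1: reading the stats itself failed

def interpret(returncode):
    msgs = []
    if returncode & ERRORS_FOUND:
        msgs.append("device error counters are non-zero")
    if returncode & READ_FAILED:
        msgs.append("could not read all counters")
    return msgs or ["all counters clean"]

print(interpret(0))    # clean run
print(interpret(64))   # counters non-zero
print(interpret(65))   # counters non-zero AND a read error, as in the mail
```

In a script, the code to check would come from something like `subprocess.run(["btrfs", "device", "stats", ...]).returncode`.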
Re: [PATCH 0/6] btrfs dax IO
On Thu, Dec 08, 2016 at 07:48:18AM +0100, Xin Zhou wrote:
> Hi Liu,
>
> From the patch, is the snapshot disabled by disabling the COW in the
> mounting path?
> It seems the create_snapshot() in ioctl.c does not get changed.

Well, I think I made a mistake in this cover letter, snapshot still works while mounting with dax, but if a snapshot is taken, then we'll get -EIO while writing to dax inodes that belong to either snapshot tree or its source tree. So in fact, only a readonly snapshot makes sense in practice, I'll update the patch so that we only allow a readonly snapshot to be taken.

COW is disabled by letting the dax mount option imply the "nodatacow" option.

Thanks for spotting this!

Thanks,
-liubo

> I experienced some similar system but am a bit new to the btrfs code.
>
> Thanks,
> Xin
>
> Subject: [PATCH 0/6] btrfs dax IO
> From: Liu Bo
> Date: Wed, 7 Dec 2016 13:45:04 -0800
> Cc: Chris Mason , Jan Kara , David Sterba
>
> This is a preliminary patch set to add dax support for btrfs, with
> this we can do normal read/write to dax files and can mmap dax files
> to userspace so that applications have the ability to access
> persistent memory directly.
>
> Please note that currently this is limited to nocow, i.e. all dax
> inodes do not have COW behaviour.
>
> COW: no
> multiple device: no
> clone/reflink: no
> snapshot: no
> compression: no
> checksum: no
>
> Right now snapshot is disabled while mounting with -odax, but snapshot
> can be created without -odax, and writing to a dax file in snapshot
> will get -EIO.
>
> Clone/reflink is dealt with as same as snapshot, -EIO will be returned
> when writing to shared extents.
>
> This has adopted the latest iomap framework for dax read/write
> and dax mmap.
Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs
On Thu, Dec 08, 2016 at 11:47:41AM +0100, Jan Kara wrote:
> On Wed 07-12-16 17:15:42, Chris Mason wrote:
> > On 12/07/2016 04:45 PM, Liu Bo wrote:
> > > This has implemented DAX support for btrfs with nocow and single-device.
> > >
> > > DAX is developed for block devices that are memory-like in order to avoid
> > > double buffer in both page cache and the storage, so DAX can performs
> > > reads and writes directly to the storage device, and for those who prefer
> > > to using filesystem, filesystem dax support can help to map the storage
> > > into userspace for file-mapping.
> > >
> > > Since I haven't figure out how to map multiple devices to userspace
> > > without pagecache, this DAX support is only for single-device, and I
> > > don't think DAX(Direct Access) can work with cow, this is limited to
> > > nocow case. I made this by setting nodatacow in dax mount option.
> >
> > Interesting, this is a nice small start. It might make more sense to limit
> > snapshots to readonly in DAX mode until we can figure out how to cow
> > properly. I think it can be done, I just need to sit down with the dax code
> > to do a good review.
> >
> > But bigger picture, if we can't cow and we can't crc and we can't
> > multi-device, I'd rather let XFS/ext4 sort out the dax space until we pull
> > in more of the btrfs features too.
>
> So normal DAX IO (via read(2) and write(2)) is very similar to direct IO so
> I don't think there would be any obstacle to support all the features with
> that.

For DAX IO via read(2)/write(2), cow is OK while multiple devices are a problem, as currently iomap_dax_actor only takes one pair:

- raid 0, one device is written to at a time
- raid 1/10 and others, 2 or more devices need to be written each time

> For mmap(2) things get more difficult but still: The filesystem gets
> normal ->fault notifications when the page is first faulted in. So you
> can COW if you need to at that moment.

Right.
> Also DAX PTEs can be write-protected (well, as of the coming merge
> window) as normal PTEs and then you'll get ->pfn_mkwrite /
> ->page_mkwrite notification when someone tries to write via mmap and
> you can do your stuff at that point.

That's right, but I think the problem comes from the fact that only ->fault with FAULT_FLAG_WRITE gets to space allocation where we could cow to a new location. For page_mkwrite, btrfs does cow while writing back a dirty page, but dax doesn't do delayed allocation so dax_writeback_one doesn't have a place to do cow.

Also thank you for the great write-protection patch, since another reason I decided to disable cow is that there is no write-protection on DAX PTEs, so without that even if we can do cow, we don't have a way to update every pte pointing to our cow'd dax pfn.

> So DAX mappings are not that
> different from filesystem point of view. There are some differences wrt.
> locking (you don't have page lock, but you use a lock bit in radix tree
> entry instead for that) but that's about it. So I don't see a principial
> reason why we cannot support all btrfs features for DAX... But if you see
> some problem, let me know and we can talk if we could somehow help from the
> DAX side.

Yeah, looks like we have two problems at least, one is dax_writeback_one and the other is iomap.

> BTW, I also don't see how the multiple devices are a problem. Actually XFS
> supports that (with its real-time devices) just fine - your ->iomap_begin()
> returns a pair and that should be all that's needed, no?

xfs is a bit different, it only writes to one device at a time, sort of a raid0.

Thanks,
-liubo
Re: duperemove : some real world figures on BTRFS deduplication
On 2016-12-08 10:11, Swâmi Petaramesh wrote: Hi, Some real world figures about running duperemove deduplication on BTRFS : I have an external 2,5", 5400 RPM, 1 TB HD, USB3, on which I store the BTRFS backups (full rsync) of 5 PCs, using 2 different distros, typically at the same update level, and all of them more of less sharing the entirety or part of the same set of user files. For each of these PCs I keep a series of 4-5 BTRFS subvolume snapshots for having complete backups at different points in time. The HD was full to 93% and made a good testbed for deduplicating. So I ran duperemove on this HD, on a machine doing "only this", using a hashfile. The machine being an Intel i5 with 6 GB of RAM. Well, the damn thing has been running for 15 days uninterrupted ! ...Until I [Ctrl]-C it this morning as I had to move with the machine (I wasn't expecting it to last THAT long...). It took about 48 hours just for calculating the files hashes. Then it took another 48 hours just for "loading the hashes of duplicate extents". Then it took 11 days deduplicating until I killed it. At the end, the disk that was 93% full is now 76% full, so I saved 17% of 1 TB (170 GB) by deduplicating for 15 days. Well the thing "works" and my disk isn't full anymore, so that's a very partial success, but still l wonder if the gain is worth the effort... So, some general explanation here: Duperemove hashes data in blocks of (by default) 128kB, which means for ~930GB, you've got about 7618560 blocks to hash, which partly explains why it took so long to hash. Once that's done, it then has to compare hashes for all combinations of those blocks, which totals to 58042456473600 comparisons (hence that taking a long time). The block size thus becomes a trade-off between performance when hashing and actual space savings (smaller block size makes hashing take longer, but gives overall slightly better results for deduplication). 
As far as the rest, given your hashing performance (which is not particularly good I might add, roughly 5.6MB/s), the amount of time it was taking to do the actual deduplication is reasonable since the deduplication ioctl does a byte-wise comparison of the extents to be deduplicated prior to actually ref-linking them to ensure you don't lose data. Because of this, generic batch deduplication is not all that great on BTRFS. There are cases where it can work, but usually they're pretty specific cases. In most cases though, you're better off doing a custom tool that knows about how your data is laid out and what's likely to be duplicated (I've actually got two tools for this for the two cases where I use deduplication, they use knowledge of the data-set itself to figure out what's duplicated, then just call the ioctl through a wrapper (previously the one included in duperemove, currently xfs_io)).
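The block-count figures in the explanation above can be reproduced with a quick back-of-the-envelope sketch (one assumption: the comparison count quoted in the thread is n × n, not n(n−1)/2):

```python
# Back-of-the-envelope cost model for duperemove's default settings.
# 93% of a 1 TB disk is taken as ~930 GiB; block size is the 128 KiB default.
data_bytes = 930 * 1024**3
block_size = 128 * 1024

blocks = data_bytes // block_size   # hashes to compute
comparisons = blocks * blocks       # naive all-pairs hash comparisons

print(blocks)       # 7618560
print(comparisons)  # 58042456473600
```

Both numbers match the ones quoted in the reply, which suggests the estimate there simply squares the block count.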
Re: LSF/MM 2017: Call for Proposals
On Thu 08-12-16 07:30:43, James Bottomley wrote: > On Thu, 2016-12-08 at 13:26 +0100, Michal Hocko wrote: > > On Wed 07-12-16 06:57:06, James Bottomley wrote: > > [...] > > > Just on this point, since there seems to be a lot of confusion: lsf > > > -pc > > > is the list for contacting the programme committee, so you cannot > > > subscribe to it. > > > > > > There is no -discuss equivalent, like kernel summit has, because we > > > expect you to cc the relevant existing mailing list and have the > > > discussion there instead rather than expecting people to subscribe > > > to a > > > new list. > > > > There used to be l...@lists.linux-foundation.org. Is it not active > > anymore? > > Not until we get the list of invitees sorted out. And then it's only > for stuff actually to do with lsf (like logistics, bof or session > requests or other meetups). Any technical stuff should still be cc'd > to the relevant mailing list. OK, I tend to forget about that. Thanks for the clarification! -- Michal Hocko SUSE Labs
Re: LSF/MM 2017: Call for Proposals
On Thu, 2016-12-08 at 13:26 +0100, Michal Hocko wrote: > On Wed 07-12-16 06:57:06, James Bottomley wrote: > [...] > > Just on this point, since there seems to be a lot of confusion: lsf > > -pc > > is the list for contacting the programme committee, so you cannot > > subscribe to it. > > > > There is no -discuss equivalent, like kernel summit has, because we > > expect you to cc the relevant existing mailing list and have the > > discussion there instead rather than expecting people to subscribe > > to a > > new list. > > There used to be l...@lists.linux-foundation.org. Is it not active > anymore? Not until we get the list of invitees sorted out. And then it's only for stuff actually to do with lsf (like logistics, bof or session requests or other meetups). Any technical stuff should still be cc'd to the relevant mailing list. James
Re: [PATCH] btrfs-progs: mkfs, balance convert: warn about RAID5/6 in fiery letters
On Wed, Nov 30, 2016 at 04:12:46PM +0100, David Sterba wrote: > On Mon, Nov 28, 2016 at 07:51:53PM +0100, Adam Borowski wrote: > > People who don't frequent IRC nor the mailing list tend to believe RAID 5/6 > > are stable; this leads to data loss. Thus, let's do warn them. > > > > At this point, I think fiery letters that won't be missed are warranted. > > > > Kernel 4.9 and its -progs will be a part of LTS of multiple distributions, > > so leaving experimental features without a warning is inappropriate. > > I'm ok with adding the warning about raid56 feature, but I have some > comments to how it's implemented. > > Special case warning for the raid56 is ok, as it corresponds to the > 'mkfs_features' table where the missing value for 'safe' should lead to > a similar warning. This is planned to be more generic, so I just want to > make sure we can adjust it later without problems. There are at least two tools that can make a raid56: mkfs and balance convert. mkfs_features is used for the former but not for the latter. It doesn't appear to be machine-parseable as well (but that's a matter of editing the #define). Of course, making this generic is doable, I just don't know how do you plan to do that. Should I try to use mkfs_features somehow, or do it as a variant of the current patch for now? > The warning should go last, after the final summary (and respect verbosity > level). and for balance convert, before the output. > The colors seem a bit too much to me, red text or just emphasize > 'warning' would IMHO suffice. (in the paragraph before:) > If the message were not colored, I'd completely miss the warning. ... which means the color works! It _is_ intended to be offensive, so people can't possibly miss it. XFS uses something far more tame: black text on red background (\e[0;30;41m) but I want something that conveys the message appropriately. 
Obviously I'm not married to a particular color for this bikeshed, I just want a color that makes the neighbours complain :p English lacks a proper word for this, something akin to https://en.wiktionary.org/wiki/oczojebny -- u-boot problems can be solved with the help of your old SCSI manuals, the parts that deal with goat termination. You need a black-handled knife, and an appropriate set of candles (number and color matters). Or was it a silver-handled knife? Crap, need to look that up.
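For what it's worth, the mechanics being debated are just SGR escape sequences; a minimal sketch follows (it uses the black-on-red \033[0;30;41m from the XFS example quoted above — the color the patch finally picks may differ, and the warning text is made up here):

```python
# Hypothetical helper printing a hard-to-miss RAID5/6 warning.
# \033[0;30;41m = black text on red background; \033[0m resets attributes.
RED_BG = "\033[0;30;41m"
RESET = "\033[0m"

def raid56_warning():
    return (RED_BG
            + "WARNING: RAID5/6 support has known problems and may cause data loss!"
            + RESET)

print(raid56_warning())
```

In a real tool the escapes should only be emitted when stdout is a terminal (isatty), so logs piped to files stay clean.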
Re: [PATCH 2/6] Btrfs: set single device limit for dax usecase
On Thu, Dec 08, 2016 at 02:35:59PM +0100, David Sterba wrote: > On Wed, Dec 07, 2016 at 01:45:06PM -0800, Liu Bo wrote: > > Dax on btrfs is not ready for multiple device. > > How about DUP? Technically it's not multi-device but still stores > multiple copies, so I don't know if the implementation is ok with that. Good question. For this patch set, meta DUP is OK while data DUP is not; the main obstacle is that the actual copy-to-device operation happens in iomap code (iomap_dax_actor), not inside the filesystem itself, and it only takes one pair to copy to the device. Thanks, -liubo
Re: [PATCH] btrfs-progs: Initialize ret to suppress compiler warning
On Thu, Dec 01, 2016 at 11:28:20AM -0600, Goldwyn Rodrigues wrote: > From: Goldwyn Rodrigues

The path that leaves ret uninitialized goes through the second if block and requires dback->found_ref to be 0. A quick search leads to several places where it's set according to found items, so we won't reach the for loop with found_ref 0. Cleaning up such compiler warnings is desired, but the fix description should also say why it is ok to just set the value. Patch applied with updated description.
duperemove : some real world figures on BTRFS deduplication
Hi, Some real world figures about running duperemove deduplication on BTRFS: I have an external 2,5", 5400 RPM, 1 TB HD, USB3, on which I store the BTRFS backups (full rsync) of 5 PCs, using 2 different distros, typically at the same update level, and all of them more or less sharing the entirety or part of the same set of user files. For each of these PCs I keep a series of 4-5 BTRFS subvolume snapshots for having complete backups at different points in time. The HD was full to 93% and made a good testbed for deduplicating. So I ran duperemove on this HD, on a machine doing "only this", using a hashfile. The machine is an Intel i5 with 6 GB of RAM. Well, the damn thing has been running for 15 days uninterrupted! ...Until I [Ctrl]-C it this morning as I had to move with the machine (I wasn't expecting it to last THAT long...). It took about 48 hours just for calculating the file hashes. Then it took another 48 hours just for "loading the hashes of duplicate extents". Then it took 11 days deduplicating until I killed it. At the end, the disk that was 93% full is now 76% full, so I saved 17% of 1 TB (170 GB) by deduplicating for 15 days. Well, the thing "works" and my disk isn't full anymore, so that's a very partial success, but still I wonder if the gain is worth the effort... Best regards. ॐ -- Swâmi Petaramesh PGP 9076E32E
Re: [PATCH] btrfs-progs: Fix extents after finding all errors
On Thu, Dec 01, 2016 at 11:28:08AM -0600, Goldwyn Rodrigues wrote: > From: Goldwyn Rodrigues > > Simplifying the logic of fixing. > > Calling fixup_extent_ref() after encountering every error causes > more error messages after the extent is fixed. In case of multiple errors, > this is confusing because the error message is displayed after the fix > message and it works on stale data. It is best to show all errors and > then fix the extents. > > Set a variable and call fixup_extent_ref() if it is set. err is not used, > so cleared it. > > Changes since v1: > + assign fix from ret for a correct repair_abort code path > > Signed-off-by: Goldwyn Rodrigues Applied, thanks.
Re: [PATCH] btrfs-progs: man mkfs: warn about RAID5/6 being experimental
On Mon, Nov 28, 2016 at 07:52:43PM +0100, Adam Borowski wrote: > Signed-off-by: Adam Borowski

Applied, thanks.
Re: [PATCH 1/2] btrfs: qgroup: Introduce extent changeset for qgroup reserve functions
On Friday, December 02, 2016 10:03:06 AM Qu Wenruo wrote: > Introduce a new parameter, struct extent_changeset for > btrfs_qgroup_reserved_data() and its callers. > > Such extent_changeset was used in btrfs_qgroup_reserve_data() to record > which range it reserved in current reserve, so it can free it at error > path. > > The reason we need to export it to callers is, at buffered write error > path, without knowing what exactly which range we reserved in current > allocation, we can free space which is not reserved by us. > > This will lead to qgroup reserved space underflow. > Hi Qu, On which git tree are these patches based off? This patch fails to apply cleanly on kdave/for-next branch that is available as of today. -- chandan
[PATCH 2/2] btrfs-convert: Fix migrate_super_block() to work with 64k sectorsize
migrate_super_block() uses sectorsize to refer to the size of the superblock. Hence on 64k sectorsize filesystems, it ends up computing checksum beyond the super block length (i.e. BTRFS_SUPER_INFO_SIZE). This commit fixes the bug by using BTRFS_SUPER_INFO_SIZE instead of sectorsize of the underlying filesystem.

Signed-off-by: Chandan Rajendra
---
 convert/main.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/convert/main.c b/convert/main.c
index 1148a36..fd6f77b 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -1360,7 +1360,7 @@ err:
 /*
  * Migrate super block to its default position and zero 0 ~ 16k
  */
-static int migrate_super_block(int fd, u64 old_bytenr, u32 sectorsize)
+static int migrate_super_block(int fd, u64 old_bytenr)
 {
 	int ret;
 	struct extent_buffer *buf;
@@ -1368,13 +1368,13 @@ static int migrate_super_block(int fd, u64 old_bytenr, u32 sectorsize)
 	u32 len;
 	u32 bytenr;
 
-	buf = malloc(sizeof(*buf) + sectorsize);
+	buf = malloc(sizeof(*buf) + BTRFS_SUPER_INFO_SIZE);
 	if (!buf)
 		return -ENOMEM;
 
-	buf->len = sectorsize;
-	ret = pread(fd, buf->data, sectorsize, old_bytenr);
-	if (ret != sectorsize)
+	buf->len = BTRFS_SUPER_INFO_SIZE;
+	ret = pread(fd, buf->data, BTRFS_SUPER_INFO_SIZE, old_bytenr);
+	if (ret != BTRFS_SUPER_INFO_SIZE)
 		goto fail;
 
 	super = (struct btrfs_super_block *)buf->data;
@@ -1382,19 +1382,20 @@
 	btrfs_set_super_bytenr(super, BTRFS_SUPER_INFO_OFFSET);
 	csum_tree_block_size(buf, BTRFS_CRC32_SIZE, 0);
 
-	ret = pwrite(fd, buf->data, sectorsize, BTRFS_SUPER_INFO_OFFSET);
-	if (ret != sectorsize)
+	ret = pwrite(fd, buf->data, BTRFS_SUPER_INFO_SIZE,
+		     BTRFS_SUPER_INFO_OFFSET);
+	if (ret != BTRFS_SUPER_INFO_SIZE)
 		goto fail;
 	ret = fsync(fd);
 	if (ret)
 		goto fail;
 
-	memset(buf->data, 0, sectorsize);
+	memset(buf->data, 0, BTRFS_SUPER_INFO_SIZE);
 	for (bytenr = 0; bytenr < BTRFS_SUPER_INFO_OFFSET; ) {
 		len = BTRFS_SUPER_INFO_OFFSET - bytenr;
-		if (len > sectorsize)
-			len = sectorsize;
+		if (len > BTRFS_SUPER_INFO_SIZE)
+			len = BTRFS_SUPER_INFO_SIZE;
 		ret = pwrite(fd, buf->data, len, bytenr);
 		if (ret != len) {
 			fprintf(stderr, "unable to zero fill device\n");
@@ -2519,7 +2520,7 @@ static int do_convert(const char *devname, int datacsum, int packing,
 	 * If this step succeed, we get a mountable btrfs. Otherwise
 	 * the source fs is left unchanged.
 	 */
-	ret = migrate_super_block(fd, mkfs_cfg.super_bytenr, blocksize);
+	ret = migrate_super_block(fd, mkfs_cfg.super_bytenr);
 	if (ret) {
 		error("unable to migrate super block: %d", ret);
 		goto fail;
-- 
2.5.5
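The zero-fill loop in the patch walks the region below the superblock in fixed-size chunks; the same loop can be sketched in Python using the standard btrfs constants (BTRFS_SUPER_INFO_OFFSET = 64 KiB, BTRFS_SUPER_INFO_SIZE = 4 KiB):

```python
BTRFS_SUPER_INFO_OFFSET = 64 * 1024  # where the primary superblock lives
BTRFS_SUPER_INFO_SIZE = 4 * 1024     # superblock length

def zero_fill_chunks():
    """Yield the (offset, length) writes that zero 0..BTRFS_SUPER_INFO_OFFSET."""
    bytenr = 0
    while bytenr < BTRFS_SUPER_INFO_OFFSET:
        length = min(BTRFS_SUPER_INFO_OFFSET - bytenr, BTRFS_SUPER_INFO_SIZE)
        yield (bytenr, length)
        bytenr += length

chunks = list(zero_fill_chunks())
print(len(chunks))  # 16 equal writes of 4 KiB each
```

Before the fix the chunk size (and the checksummed buffer) was the filesystem sectorsize, which is why a 64k-sector filesystem computed the checksum past the 4 KiB superblock.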
[PATCH 1/2] btrfs-progs: btrfs-convert: Prevent accounting blocks beyond end of device
When looping across data block bitmap, __ext2_add_one_block() may add blocks which do not exist on the underlying disk. This commit prevents this from happening by checking the block index against the maximum block count that was present in the ext4 filesystem instance that is being converted.

Signed-off-by: Chandan Rajendra
---
 convert/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/convert/main.c b/convert/main.c
index 4b4cea4..1148a36 100644
--- a/convert/main.c
+++ b/convert/main.c
@@ -1525,6 +1525,9 @@ static int __ext2_add_one_block(ext2_filsys fs, char *bitmap,
 	offset /= EXT2FS_CLUSTER_RATIO(fs);
 	offset += group_nr * EXT2_CLUSTERS_PER_GROUP(fs->super);
 	for (i = 0; i < EXT2_CLUSTERS_PER_GROUP(fs->super); i++) {
+		if ((i + offset) >= ext2fs_blocks_count(fs->super))
+			break;
+
 		if (ext2fs_test_bit(i, bitmap)) {
 			u64 start;
-- 
2.5.5
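The effect of the added bounds check can be illustrated with a small sketch (hypothetical numbers: a block group of 8 clusters starting at block 60 on a 64-block filesystem, all bitmap bits set):

```python
def blocks_in_group(bitmap, offset, clusters_per_group, blocks_count):
    """Return the in-range block numbers whose bitmap bit is set."""
    found = []
    for i in range(clusters_per_group):
        # The fix: never account blocks past the end of the source fs.
        if i + offset >= blocks_count:
            break
        if bitmap[i]:
            found.append(i + offset)
    return found

# Without the bounds check, blocks 64..67 would (wrongly) be accounted too.
print(blocks_in_group([1] * 8, 60, 8, 64))  # [60, 61, 62, 63]
```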
Re: [PATCH 2/6] Btrfs: set single device limit for dax usecase
On Wed, Dec 07, 2016 at 01:45:06PM -0800, Liu Bo wrote: > Dax on btrfs is not ready for multiple device. How about DUP? Technically it's not multi-device but still stores multiple copies, so I don't know if the implementation is ok with that.
[PULL] Btrfs cleanups for 4.10
Hi, now that the regular patches are on the way to 4.10, it's a good time to squeeze the cleanups before we start merging further patches. This pull request contains the series from Jeff, the code changes are all over the map. I think we won't find a better time to merge that so be it now. The branch is based on my previous pull request head, and with a signed tag. The following changes since commit 515bdc479097ec9d5f389202842345af3162f71c: Merge branch 'misc-4.10' into for-chris-4.10-20161130 (2016-11-30 14:02:20 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-chris-4.10-20161206 for you to fetch changes up to 34441361c4f52a5f6e41d8de8e5debbeb415dbf0: btrfs: opencode chunk locking, remove helpers (2016-12-06 16:07:00 +0100) David Sterba (1): btrfs: opencode chunk locking, remove helpers Jeff Mahoney (19): btrfs: call functions that overwrite their root parameter with fs_info btrfs: call functions that always use the same root with fs_info instead btrfs: btrfs_init_new_device should use fs_info->dev_root btrfs: alloc_reserved_file_extent trace point should use extent_root btrfs: struct btrfsic_state->root should be an fs_info btrfs: struct reada_control.root -> reada_control.fs_info btrfs: root->fs_info cleanup, use fs_info->dev_root everywhere btrfs: root->fs_info cleanup, io_ctl_init btrfs: pull node/sector/stripe sizes out of root and into fs_info btrfs: root->fs_info cleanup, btrfs_calc_{trans,trunc}_metadata_size btrfs: root->fs_info cleanup, lock/unlock_chunks btrfs: root->fs_info cleanup, update_block_group{,flags} btrfs: root->fs_info cleanup, add fs_info convenience variables btrfs: root->fs_info cleanup, access fs_info->delayed_root directly btrfs: convert extent-tree tracepoints to use fs_info btrfs: simplify btrfs_wait_cache_io prototype btrfs: take an fs_info directly when the root is not used otherwise btrfs: split btrfs_wait_marked_extents into normal and tree log functions btrfs: 
remove root parameter from transaction commit/end routines fs/btrfs/backref.c | 10 +- fs/btrfs/check-integrity.c | 73 +- fs/btrfs/check-integrity.h |5 +- fs/btrfs/compression.c | 54 +- fs/btrfs/ctree.c | 468 ++-- fs/btrfs/ctree.h | 216 +++--- fs/btrfs/delayed-inode.c | 138 ++-- fs/btrfs/delayed-inode.h | 19 +- fs/btrfs/dev-replace.c | 68 +- fs/btrfs/dev-replace.h |4 +- fs/btrfs/dir-item.c| 45 +- fs/btrfs/disk-io.c | 546 +++--- fs/btrfs/disk-io.h | 30 +- fs/btrfs/export.c | 10 +- fs/btrfs/extent-tree.c | 1300 fs/btrfs/extent_io.c | 63 +- fs/btrfs/extent_io.h |8 +- fs/btrfs/file-item.c | 152 ++-- fs/btrfs/file.c| 214 +++--- fs/btrfs/free-space-cache.c| 154 ++-- fs/btrfs/free-space-cache.h| 12 +- fs/btrfs/free-space-tree.c | 44 +- fs/btrfs/inode-item.c | 11 +- fs/btrfs/inode-map.c | 22 +- fs/btrfs/inode.c | 765 ++- fs/btrfs/ioctl.c | 577 +++--- fs/btrfs/ordered-data.c| 38 +- fs/btrfs/ordered-data.h|4 +- fs/btrfs/print-tree.c | 19 +- fs/btrfs/print-tree.h |4 +- fs/btrfs/props.c |5 +- fs/btrfs/qgroup.c | 64 +- fs/btrfs/qgroup.h |2 +- fs/btrfs/raid56.c | 62 +- fs/btrfs/raid56.h |8 +- fs/btrfs/reada.c | 34 +- fs/btrfs/relocation.c | 257 --- fs/btrfs/root-tree.c | 28 +- fs/btrfs/scrub.c | 166 ++-- fs/btrfs/send.c| 33 +- fs/btrfs/super.c | 134 ++-- fs/btrfs/tests/btrfs-tests.c | 13 +- fs/btrfs/tests/btrfs-tests.h |4 +- fs/btrfs/tests/extent-buffer-tests.c |7 +- fs/btrfs/tests/extent-io-tests.c |5 +- fs/btrfs/tests/free-space-tests.c | 18 +- fs/btrfs/tests/free-space-tree-tests.c |9 +- fs/btrfs/tests/inode-tests.c | 16 +- fs/btrfs/tests/qgroup-tests.c | 11 +- fs/btrfs/transaction.c | 615 +++ fs/btrfs/transaction.h | 29 +- fs/btrfs/tree-log.c| 195 ++--- fs/btrfs/uuid-tree.c
Re: LSF/MM 2017: Call for Proposals
On Wed 07-12-16 06:57:06, James Bottomley wrote: [...] > Just on this point, since there seems to be a lot of confusion: lsf-pc > is the list for contacting the programme committee, so you cannot > subscribe to it. > > There is no -discuss equivalent, like kernel summit has, because we > expect you to cc the relevant existing mailing list and have the > discussion there instead rather than expecting people to subscribe to a > new list. There used to be l...@lists.linux-foundation.org. Is it not active anymore? -- Michal Hocko SUSE Labs
Re: [Lsf-pc] LSF/MM 2017: Call for Proposals
On Thu 08-12-16 13:26:19, Michal Hocko wrote: > On Wed 07-12-16 06:57:06, James Bottomley wrote: > [...] > > Just on this point, since there seems to be a lot of confusion: lsf-pc > > is the list for contacting the programme committee, so you cannot > > subscribe to it. > > > > There is no -discuss equivalent, like kernel summit has, because we > > expect you to cc the relevant existing mailing list and have the > > discussion there instead rather than expecting people to subscribe to a > > new list. > > There used to be l...@lists.linux-foundation.org. Is it not active > anymore? Not sure about the current state, but usually it gets populated by the selected attendees each year. Honza -- Jan Kara, SUSE Labs, CR
Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs
On Wed 07-12-16 17:15:42, Chris Mason wrote: > On 12/07/2016 04:45 PM, Liu Bo wrote: > >This has implemented DAX support for btrfs with nocow and single-device. > > > >DAX is developed for block devices that are memory-like in order to avoid > >double buffer in both page cache and the storage, so DAX can performs reads > >and > >writes directly to the storage device, and for those who prefer to using > >filesystem, filesystem dax support can help to map the storage into userspace > >for file-mapping. > > > >Since I haven't figure out how to map multiple devices to userspace without > >pagecache, this DAX support is only for single-device, and I don't think > >DAX(Direct Access) can work with cow, this is limited to nocow case. I made > >this by setting nodatacow in dax mount option. > > Interesting, this is a nice small start. It might make more sense to limit > snapshots to readonly in DAX mode until we can figure out how to cow > properly. I think it can be done, I just need to sit down with the dax code > to do a good review. > > But bigger picture, if we can't cow and we can't crc and we can't > multi-device, I'd rather let XFS/ext4 sort out the dax space until we pull > in more of the btrfs features too. So normal DAX IO (via read(2) and write(2)) is very similar to direct IO so I don't think there would be any obstacle to support all the features with that. For mmap(2) things get more difficult but still: The filesystem gets normal ->fault notifications when the page is first faulted in. So you can COW if you need to at that moment. Also DAX PTEs can be write-protected (well, as of the coming merge window) as normal PTEs and then you'll get ->pfn_mkwrite / ->page_mkwrite notification when someone tries to write via mmap and you can do your stuff at that point. So DAX mappings are not that different from filesystem point of view. There are some differences wrt. 
locking (you don't have page lock, but you use a lock bit in radix tree entry instead for that) but that's about it. So I don't see a principled reason why we cannot support all btrfs features for DAX... But if you see some problem, let me know and we can talk if we could somehow help from the DAX side. BTW, I also don't see how the multiple devices are a problem. Actually XFS supports that (with its real-time devices) just fine - your ->iomap_begin() returns a pair and that should be all that's needed, no? Honza -- Jan Kara, SUSE Labs, CR
Re: [PATCH v2 2/2] fstests: btrfs: Use _require_btrfs_qgroup_report to replace open code
On Thu, Dec 08, 2016 at 05:20:49PM +0800, Qu Wenruo wrote: > > > At 12/08/2016 04:47 PM, Eryu Guan wrote: > > On Thu, Dec 08, 2016 at 04:12:13PM +0800, Qu Wenruo wrote: > > > Introduce new _require_btrfs_qgroup_report function, which will check > > > the accessibility to "btrfs check --qgroup-report", then set a global > > > flag to info _check_scratch_fs() to do extra qgroup check. > > > > > > Signed-off-by: Qu Wenruo> > > --- > > > v2: > > > Use "${RESULT_DIR}/require_scratch.require_qgroup_report" instead of > > > global variant > > > Rebased to latest master > > > Replace btrfsck with $BTRFS_UTIL_PROG check. > > [snip] > > > diff --git a/tests/btrfs/042 b/tests/btrfs/042 > > > index 498ccc9..dc9b762 100755 > > > --- a/tests/btrfs/042 > > > +++ b/tests/btrfs/042 > > > @@ -43,6 +43,7 @@ _cleanup() > > > _supported_fs btrfs > > > _supported_os Linux > > > _require_scratch > > > +_require_btrfs_qgroup_report > > > > > > rm -f $seqres.full > > > > > > @@ -84,10 +85,7 @@ for i in `seq 10 -1 1`; do > > > total_written=$(($total_written+$filesize)) > > > done > > > > > > -#check if total written exceeds limit > > > -if [ $total_written -gt $LIMIT_SIZE ];then > > > - _fail "total written should be less than $LIMIT_SIZE" > > > -fi > > > +# qgroup will be checked automatically at _check_scratch_fs() by fstest > > > > This doesn't look like an equivalent replacement, and btrfs/042 fails > > for me after this update (wrong qgroup numbers) on 4.9-rc4 kernel. Is > > this change intentional? > > > > Thanks, > > Eryu > > > > > Sorry, just a typo.For 042 only the _require_btrfs_qgroup_report line is > need. > > I also ran the test with v4.9-rc6 with some new btrfs patches (David's > for-next branch), it ran without problem. > > So the error seems to be a fixed btrfs qgroup problem. Ok, good to know. Can you please send an updated patch? 
Thanks, Eryu
Re: [PATCH v2 2/2] fstests: btrfs: Use _require_btrfs_qgroup_report to replace open code
At 12/08/2016 04:47 PM, Eryu Guan wrote: On Thu, Dec 08, 2016 at 04:12:13PM +0800, Qu Wenruo wrote: Introduce new _require_btrfs_qgroup_report function, which will check the accessibility to "btrfs check --qgroup-report", then set a global flag to info _check_scratch_fs() to do extra qgroup check. Signed-off-by: Qu Wenruo --- v2: Use "${RESULT_DIR}/require_scratch.require_qgroup_report" instead of global variant Rebased to latest master Replace btrfsck with $BTRFS_UTIL_PROG check. [snip] diff --git a/tests/btrfs/042 b/tests/btrfs/042 index 498ccc9..dc9b762 100755 --- a/tests/btrfs/042 +++ b/tests/btrfs/042 @@ -43,6 +43,7 @@ _cleanup() _supported_fs btrfs _supported_os Linux _require_scratch +_require_btrfs_qgroup_report rm -f $seqres.full @@ -84,10 +85,7 @@ for i in `seq 10 -1 1`; do total_written=$(($total_written+$filesize)) done -#check if total written exceeds limit -if [ $total_written -gt $LIMIT_SIZE ];then - _fail "total written should be less than $LIMIT_SIZE" -fi +# qgroup will be checked automatically at _check_scratch_fs() by fstest This doesn't look like an equivalent replacement, and btrfs/042 fails for me after this update (wrong qgroup numbers) on 4.9-rc4 kernel. Is this change intentional? Thanks, Eryu Sorry, just a typo. For 042 only the _require_btrfs_qgroup_report line is needed. I also ran the test with v4.9-rc6 with some new btrfs patches (David's for-next branch), it ran without problem. So the error seems to be a fixed btrfs qgroup problem. Thanks, Qu
Re: [PATCH v2 2/2] fstests: btrfs: Use _require_btrfs_qgroup_report to replace open code
On Thu, Dec 08, 2016 at 04:12:13PM +0800, Qu Wenruo wrote: > Introduce new _require_btrfs_qgroup_report function, which will check > the accessibility to "btrfs check --qgroup-report", then set a global > flag to info _check_scratch_fs() to do extra qgroup check. > > Signed-off-by: Qu Wenruo > --- > v2: > Use "${RESULT_DIR}/require_scratch.require_qgroup_report" instead of > global variant > Rebased to latest master > Replace btrfsck with $BTRFS_UTIL_PROG check. [snip] > diff --git a/tests/btrfs/042 b/tests/btrfs/042 > index 498ccc9..dc9b762 100755 > --- a/tests/btrfs/042 > +++ b/tests/btrfs/042 > @@ -43,6 +43,7 @@ _cleanup() > _supported_fs btrfs > _supported_os Linux > _require_scratch > +_require_btrfs_qgroup_report > > rm -f $seqres.full > > @@ -84,10 +85,7 @@ for i in `seq 10 -1 1`; do > total_written=$(($total_written+$filesize)) > done > > -#check if total written exceeds limit > -if [ $total_written -gt $LIMIT_SIZE ];then > - _fail "total written should be less than $LIMIT_SIZE" > -fi > +# qgroup will be checked automatically at _check_scratch_fs() by fstest This doesn't look like an equivalent replacement, and btrfs/042 fails for me after this update (wrong qgroup numbers) on 4.9-rc4 kernel. Is this change intentional? Thanks, Eryu
[PATCH v2 2/2] fstests: btrfs: Use _require_btrfs_qgroup_report to replace open code
Introduce a new _require_btrfs_qgroup_report() function, which checks that "btrfs check --qgroup-report" is available, then sets a flag file to tell _check_scratch_fs() to do an extra qgroup check.

Signed-off-by: Qu Wenruo
---
v2:
  Use "${RESULT_DIR}/require_scratch.require_qgroup_report" instead of a global variant
  Rebased to latest master
  Replace btrfsck with $BTRFS_UTIL_PROG check.
---
 check           |  4 ++--
 common/btrfs    | 22 +-
 common/rc       |  5 +++--
 tests/btrfs/022 |  4
 tests/btrfs/028 |  5 ++---
 tests/btrfs/042 |  6 ++
 tests/btrfs/099 |  1 +
 tests/btrfs/104 | 20 +---
 tests/btrfs/122 |  9 +++--
 tests/btrfs/123 |  5 ++---
 tests/btrfs/126 |  1 +
 11 files changed, 46 insertions(+), 36 deletions(-)

diff --git a/check b/check
index 8f2a1bb..76eb054 100755
--- a/check
+++ b/check
@@ -405,11 +405,11 @@ _check_filesystems()
 {
 	if [ -f ${RESULT_DIR}/require_test ]; then
 		_check_test_fs || err=true
-		rm -f ${RESULT_DIR}/require_test
+		rm -f ${RESULT_DIR}/require_test*
 	fi
 	if [ -f ${RESULT_DIR}/require_scratch ]; then
 		_check_scratch_fs || err=true
-		rm -f ${RESULT_DIR}/require_scratch
+		rm -f ${RESULT_DIR}/require_scratch*
 	fi
 }

diff --git a/common/btrfs b/common/btrfs
index 302edc6..a860228 100644
--- a/common/btrfs
+++ b/common/btrfs
@@ -40,6 +40,13 @@ _require_btrfs_command()
 	[ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old (must support $cmd $param)"
 }
 
+# Require extra check on btrfs qgroup numbers
+_require_btrfs_qgroup_report()
+{
+	_require_btrfs_command check --qgroup-report
+	touch ${RESULT_DIR}/require_scratch.require_qgroup_report
+}
+
 _run_btrfs_util_prog()
 {
 	run_check $BTRFS_UTIL_PROG $*
@@ -97,7 +104,20 @@ _check_btrfs_filesystem()
 		mountpoint=`_umount_or_remount_ro $device`
 	fi
 
-	btrfsck $device >$tmp.fsck 2>&1
+	if [ -f ${RESULT_DIR}/require_scratch.require_qgroup_report ]; then
+		$BTRFS_UTIL_PROG check $device --qgroup-report > $tmp.qgroup_report 2>&1
+		if grep -qE "Counts for qgroup.*are different" $tmp.qgroup_report ; then
+			echo "_check_btrfs_filesystem: filesystem on $device has wrong qgroup numbers (see $seqres.full)"
+			echo "_check_btrfs_filesystem: filesystem on $device has wrong qgroup numbers" \
+				>> $seqres.full
+			echo "*** qgroup_report.$FSTYP output ***" >>$seqres.full
+			cat $tmp.qgroup_report >>$seqres.full
+			echo "*** qgroup_report.$FSTYP output ***" >>$seqres.full
+		fi
+		rm -f $tmp.qgroup_report
+	fi
+
+	$BTRFS_UTIL_PROG check $device >$tmp.fsck 2>&1
 	if [ $? -ne 0 ]; then
 		echo "_check_btrfs_filesystem: filesystem on $device is inconsistent (see $seqres.full)"

diff --git a/common/rc b/common/rc
index 2719b23..b50283c 100644
--- a/common/rc
+++ b/common/rc
@@ -1222,8 +1222,9 @@ _notrun()
 {
 	echo "$*" > $seqres.notrun
 	echo "$seq not run: $*"
-	rm -f ${RESULT_DIR}/require_test
-	rm -f ${RESULT_DIR}/require_scratch
+	rm -f ${RESULT_DIR}/require_test*
+	rm -f ${RESULT_DIR}/require_scratch*
+
 	status=0
 	exit
 }

diff --git a/tests/btrfs/022 b/tests/btrfs/022
index 56d4f3d..9abad6c 100755
--- a/tests/btrfs/022
+++ b/tests/btrfs/022
@@ -43,6 +43,7 @@ _cleanup()
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_btrfs_qgroup_report
 
 rm -f $seqres.full
 
@@ -125,16 +126,19 @@
 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _basic_test
 _scratch_unmount
+_check_scratch_fs
 
 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _rescan_test
 _scratch_unmount
+_check_scratch_fs
 
 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _limit_test_exceed
 _scratch_unmount
+_check_scratch_fs
 
 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount

diff --git a/tests/btrfs/028 b/tests/btrfs/028
index 1425609..a3d9a27 100755
--- a/tests/btrfs/028
+++ b/tests/btrfs/028
@@ -51,6 +51,7 @@ rm -f $seqres.full
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_btrfs_qgroup_report
 
 _scratch_mkfs
 _scratch_mount
@@ -86,9 +87,7 @@
 _run_btrfs_util_prog filesystem sync $SCRATCH_MNT
 _scratch_unmount
 
-# generate a qgroup report and look for inconsistent groups
-$BTRFS_UTIL_PROG check --qgroup-report $SCRATCH_DEV 2>&1 | \
-	grep -E "Counts for qgroup.*are different"
+# qgroup will be checked at _check_scratch_fs() by fstest.
 
 echo "Silence is golden"
 status=0

diff --git a/tests/btrfs/042 b/tests/btrfs/042
index 498ccc9..dc9b762 100755
--- a/tests/btrfs/042
+++ b/tests/btrfs/042
@@ -43,6 +43,7 @@ _cleanup()
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
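For reference, the qgroup verification this patch wires into _check_btrfs_filesystem() reduces to one grep over the report output. A minimal standalone sketch, where the $report string is a made-up stand-in for real "btrfs check --qgroup-report" output:

```shell
# Hypothetical one-line sample of a mismatch report; the real text comes
# from "$BTRFS_UTIL_PROG check $device --qgroup-report".
report="Counts for qgroup id 5 are different"

# Same extended-regexp pattern the patch greps $tmp.qgroup_report with.
if echo "$report" | grep -qE "Counts for qgroup.*are different"; then
	echo "wrong qgroup numbers"
fi
```

The benefit of moving this grep out of individual tests (as done for btrfs/028 above) and into the common check is that every test which calls _require_btrfs_qgroup_report gets the report validated automatically when the scratch fs is checked.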
[PATCH v2 1/2] fstests: common: rename and enhance _require_btrfs to _require_btrfs_command
Rename _require_btrfs() to _require_btrfs_command() to avoid confusion, as all other _require_btrfs_* helpers have a quite clear suffix, like _require_btrfs_mkfs_feature() or _require_btrfs_fs_feature().

Also enhance _require_btrfs_command() to accept second-level commands or options. Options are detected by a leading "-" character. This is quite useful for cases like "btrfs inspect-internal dump-tree" and "btrfs check --qgroup-report".

Signed-off-by: Qu Wenruo
---
v2:
  Replace _subcommand with _command.
  Rebase to latest master
  Use grep -w and -q to make it safer and less noisy
  Update the _notrun string
  Fix typo _not_run
---
 common/btrfs    | 22 -
 tests/btrfs/004 |  3 ++-
 tests/btrfs/048 |  2 +-
 tests/btrfs/059 |  2 +-
 tests/btrfs/131 |  2 +-
 5 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/common/btrfs b/common/btrfs
index ab6497d..302edc6 100644
--- a/common/btrfs
+++ b/common/btrfs
@@ -10,16 +10,34 @@ _btrfs_get_subvolid()
 	$BTRFS_UTIL_PROG sub list $mnt | grep $name | awk '{ print $2 }'
 }
 
+# _require_btrfs_command <command> [<subcommand>|<option>]
 # We check for btrfs and (optionally) features of the btrfs command
-_require_btrfs()
+# It can handle both subcommands like "inspect-internal dump-tree" and
+# options like "check --qgroup-report"
+_require_btrfs_command()
 {
 	cmd=$1
+	param=$2
+
 	_require_command "$BTRFS_UTIL_PROG" btrfs
 	if [ -z "$1" ]; then
 		return 1;
 	fi
-	$BTRFS_UTIL_PROG $cmd --help >/dev/null 2>&1
+	$BTRFS_UTIL_PROG $cmd --help &> /dev/null
 	[ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old (must support $cmd)"
+
+	test -z "$param" && return
+
+	# If $param is an option, replace leading "-"s for grep
+	if [ ${param:0:1} == "-" ]; then
+		safe_param=$(echo $param | sed 's/^-*//')
+		$BTRFS_UTIL_PROG $cmd --help | grep -wq $safe_param || \
+			_notrun "$BTRFS_UTIL_PROG too old (must support $cmd $param)"
+		return
+	fi
+
+	$BTRFS_UTIL_PROG $cmd $param --help &> /dev/null
+	[ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old (must support $cmd $param)"
 }

diff --git a/tests/btrfs/004 b/tests/btrfs/004
index 905770a..3f8330f 100755
--- a/tests/btrfs/004
+++ b/tests/btrfs/004
@@ -51,7 +51,8 @@ _supported_fs btrfs
 _supported_os Linux
 _require_scratch
 _require_no_large_scratch_dev
-_require_btrfs inspect-internal
+_require_btrfs_command inspect-internal logical-resolve
+_require_btrfs_command inspect-internal inode-resolve
 _require_command "/usr/sbin/filefrag" filefrag
 
 rm -f $seqres.full

diff --git a/tests/btrfs/048 b/tests/btrfs/048
index 0b907b0..e03b3c5 100755
--- a/tests/btrfs/048
+++ b/tests/btrfs/048
@@ -48,7 +48,7 @@ _supported_fs btrfs
 _supported_os Linux
 _require_test
 _require_scratch
-_require_btrfs "property"
+_require_btrfs_command "property"
 
 send_files_dir=$TEST_DIR/btrfs-test-$seq

diff --git a/tests/btrfs/059 b/tests/btrfs/059
index 8f106d2..2d1ec23 100755
--- a/tests/btrfs/059
+++ b/tests/btrfs/059
@@ -51,7 +51,7 @@ _supported_fs btrfs
 _supported_os Linux
 _require_test
 _require_scratch
-_require_btrfs "property"
+_require_btrfs_command "property"
 
 rm -f $seqres.full

diff --git a/tests/btrfs/131 b/tests/btrfs/131
index d1a11d2..ce486e6 100755
--- a/tests/btrfs/131
+++ b/tests/btrfs/131
@@ -48,7 +48,7 @@ rm -f $seqres.full
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
-_require_btrfs inspect-internal
+_require_btrfs_command inspect-internal dump-super
 
 mkfs_v1()
 {
-- 
2.7.4
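The interesting part of the enhanced helper is how it distinguishes an option from a second-level subcommand: a leading "-" selects the grep-over-help-text path, after stripping the dashes so grep -w can match the bare word. A minimal, self-contained sketch of just that classification (the classify name and sample parameters are illustrative, not from the patch; the patch itself uses the bash-only ${param:0:1} test rather than case):

```shell
# Classify a second-level argument the way _require_btrfs_command does:
# anything starting with "-" is an option, everything else a subcommand.
classify() {
	param=$1
	case "$param" in
	-*)
		# Strip leading "-"s so the bare word can be grepped with
		# grep -w in the command's --help output.
		safe_param=$(echo "$param" | sed 's/^-*//')
		echo "option: $safe_param"
		;;
	*)
		echo "subcommand: $param"
		;;
	esac
}

classify --qgroup-report   # option: qgroup-report
classify dump-tree         # subcommand: dump-tree
```

Stripping the dashes before grepping matters because "--qgroup-report" passed straight to grep would be parsed as a grep option; grep -wq on the bare word is both safe and quiet.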