On 2017-09-11 17:14, Qu Wenruo wrote:


On 2017-09-11 16:57, shally verma wrote:
On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:


On 2017-09-11 15:54, shally verma wrote:

On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.bt...@gmx.com>
wrote:



On 2017-09-11 14:05, shally verma wrote:


I was going through the BTRFS Deduplication page
(https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read

"As such, xfs_io, is able to perform deduplication on a BTRFS file
system," ..

Following this, I went on to the xfs_io link:
https://linux.die.net/man/8/xfs_io

As I understand it, this is a set of commands that lets us perform different
operations on an "xfs" filesystem.



Nope, it's just a tool for triggering different read/write calls or ioctls.
In fact, most of its commands are fs-independent.
Only a limited number of operations are XFS-only.

It's only for historical reasons that it's still named xfs_io.

I won't be surprised if one day it's split out as an independent tool.

And in the command set mentioned there, I couldn't see which command
invokes the dedupe task.



The "dedupe" and "reflink" commands.

Oh. That means the page linked from the BTRFS Wiki page has not been updated
with these. I googled another xfs_io page that does reference these two
commands, here:
https://www.systutorials.com/docs/linux/man/8-xfs_io/
Maybe the Wiki needs an update here.


If XFS has a regularly updated online man page, we can just use that.
(But unfortunately, not every fs user tool uses asciidoc like btrfs does, which
can generate both man pages and html).



And how does this work with BTRFS?



A filesystem that supports the FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME
ioctl can use it to determine whether two ranges contain identical data.

And if they are identical, we use the FICLONERANGE or BTRFS_IOC_CLONE_RANGE
ioctl to reflink one to the other, freeing one of them.

BTW, nowadays these dedupe and reflink ioctls are made generic in VFS.
The file_operations structure now includes both clone_file_range() and
dedupe_file_range() callbacks.
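
For illustration, the reflink step mentioned above could be driven from user space roughly as follows. This is only a minimal sketch, assuming a kernel and headers that provide FICLONERANGE in <linux/fs.h>; the helper name reflink_range() is made up for this example and is not part of xfs_io or btrfs-progs.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FICLONERANGE, struct file_clone_range */

/*
 * Hypothetical helper: make dst_fd share the extents backing `len` bytes
 * at src_off in src_fd, placing them at dst_off. The kernel does not
 * compare contents here; whatever was at the destination is replaced.
 * Returns 0 on success or a negative errno value on failure.
 */
static int reflink_range(int src_fd, unsigned long long src_off,
                         int dst_fd, unsigned long long dst_off,
                         unsigned long long len)
{
    struct file_clone_range arg = {
        .src_fd = src_fd,
        .src_offset = src_off,
        .src_length = len,
        .dest_offset = dst_off,
    };

    /* The ioctl is issued on the destination file descriptor. */
    if (ioctl(dst_fd, FICLONERANGE, &arg) < 0)
        return -errno;
    return 0;
}
```

On a filesystem without reflink support the call is expected to fail with an error rather than fall back to a copy, so the caller has to check the return value.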

Yeah, I understand that part. So going by the descriptions of "dedupe" and
"reflink", it seems that through these commands one can do the deduplication
part and NOT the duplicate-finding part.


Yes, one doesn't need to call the "dedupe" ioctl if one already knows some data
is identical, and can go straight to reflink.

That's still outside the scope of the xfs_io commands.


I'm not sure what you mean by "scope" here, sorry.

By "scope", I meant the duplicate-finding part, but that contradicts the
statement you have just written below:
Since xfs_io can be used to find duplication,

Since the "dedupe" command takes only a "source file" plus src and
dst offsets within it, it can deduplicate content within a
file, whereas the actual FS dedupe ioctl can first check whether two extents
are identical and, if so, deduplicate them.

By "deduplicate", if you mean "removing duplication", then the xfs_io "dedupe" command itself doesn't do that.

The old btrfs ioctl name describes this better: FILE_EXTENT_SAME.
The "dedupe" command itself only verifies whether they have the same content.

So to make it clear, the "dedupe" command and ioctl only do the *verification* work.

Sorry, I just checked the code and tried the ioctl.

If they are the same, "dedupe" will also do the "reflink" part.

The code also shows that:
---
        /* pass original length for comparison so we stay within i_size */
        ret = btrfs_cmp_data(olen, &cmp);
        if (ret == 0)
                ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
---

So the "dedupe" ioctl itself can do de-duplication.
And my previous answer was just totally wrong.
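
To make the compare-then-clone behaviour concrete from the user-space side, here is a minimal sketch of calling the generic FIDEDUPERANGE ioctl for a single destination range. It assumes headers that define struct file_dedupe_range in <linux/fs.h>; the helper name dedupe_range() is hypothetical and only used for this example.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FIDEDUPERANGE, struct file_dedupe_range */

/*
 * Hypothetical helper: ask the kernel to compare `len` bytes at src_off in
 * src_fd with `len` bytes at dst_off in dst_fd and, only if they match,
 * make the destination share the source extents.
 * Returns FILE_DEDUPE_RANGE_SAME (0) if the ranges matched and were
 * deduped, FILE_DEDUPE_RANGE_DIFFERS (1) if the contents differ, or a
 * negative errno value on failure.
 */
static int dedupe_range(int src_fd, unsigned long long src_off,
                        int dst_fd, unsigned long long dst_off,
                        unsigned long long len)
{
    struct file_dedupe_range *arg;
    int ret;

    /* The request header is followed by one info slot per destination. */
    arg = calloc(1, sizeof(*arg) + sizeof(struct file_dedupe_range_info));
    if (!arg)
        return -ENOMEM;

    arg->src_offset = src_off;
    arg->src_length = len;
    arg->dest_count = 1;
    arg->info[0].dest_fd = dst_fd;
    arg->info[0].dest_offset = dst_off;

    /* The ioctl is issued on the source file descriptor. */
    if (ioctl(src_fd, FIDEDUPERANGE, arg) < 0)
        ret = -errno;
    else
        ret = arg->info[0].status;   /* per-destination result */

    free(arg);
    return ret;
}
```

A caller that gets FILE_DEDUPE_RANGE_DIFFERS back knows the destination was left untouched, which is the difference from a plain reflink.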

Sorry for that,
Qu


"Reflink" will really remove the duplication (or even non-duplicated data, if you really want).


But please be careful: "reflink" is much like copy, so it can be executed on file ranges with different contents. In that case, reflink can free some space, but it also modifies the content of the destination.

So for full de-duplication, one must go through the full *verify* then *reflink* cycle. Although the "dedupe" (FILE_EXTENT_SAME) ioctl provides one verification method, it's not the only solution.
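
As an illustration of the *verify* then *reflink* cycle with a different verification method, the sketch below compares the two ranges in user space with pread() and memcmp(), and only then issues FICLONERANGE. All helper names (read_full, ranges_identical, verify_then_reflink) are made up for this example. Note the assumption it relies on: nothing modifies either file between the comparison and the clone, which this code does nothing to guarantee.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FICLONERANGE, struct file_clone_range */

/* Read up to len bytes at off; returns bytes read (short only at EOF)
 * or a negative errno value. */
static ssize_t read_full(int fd, char *buf, size_t len, off_t off)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = pread(fd, buf + done, len - done, off + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -errno;
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Returns 1 if both ranges hold the same bytes, 0 if they differ,
 * -EINVAL if a range runs past end of file, or another negative errno. */
static int ranges_identical(int fd_a, off_t off_a, int fd_b, off_t off_b,
                            size_t len)
{
    char buf_a[4096], buf_b[4096];

    while (len > 0) {
        size_t want = len < sizeof(buf_a) ? len : sizeof(buf_a);
        ssize_t got_a = read_full(fd_a, buf_a, want, off_a);
        ssize_t got_b = read_full(fd_b, buf_b, want, off_b);

        if (got_a < 0)
            return (int)got_a;
        if (got_b < 0)
            return (int)got_b;
        if ((size_t)got_a != want || (size_t)got_b != want)
            return -EINVAL;
        if (memcmp(buf_a, buf_b, want) != 0)
            return 0;
        off_a += (off_t)want;
        off_b += (off_t)want;
        len -= want;
    }
    return 1;
}

/* Returns 0 if the ranges matched and were reflinked, 1 if they differ
 * (nothing is changed), or a negative errno value on failure. */
static int verify_then_reflink(const char *src_path, const char *dst_path,
                               off_t src_off, off_t dst_off, size_t len)
{
    int ret;
    int src_fd = open(src_path, O_RDONLY);
    if (src_fd < 0)
        return -errno;
    int dst_fd = open(dst_path, O_RDWR);
    if (dst_fd < 0) {
        ret = -errno;
        close(src_fd);
        return ret;
    }

    ret = ranges_identical(src_fd, src_off, dst_fd, dst_off, len);
    if (ret == 1) {
        struct file_clone_range arg = {
            .src_fd = src_fd,
            .src_offset = (unsigned long long)src_off,
            .src_length = len,
            .dest_offset = (unsigned long long)dst_off,
        };
        ret = ioctl(dst_fd, FICLONERANGE, &arg) < 0 ? -errno : 0;
    } else if (ret == 0) {
        ret = 1;                /* contents differ: skip the reflink */
    }

    close(src_fd);
    close(dst_fd);
    return ret;
}
```

Because the comparison and the clone are two separate steps here, this is weaker than letting the "dedupe" ioctl do both in the kernel; it is only meant to show that the verification can come from elsewhere.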

But anyway, the "dedupe" and "reflink" commands provided by xfs_io do provide every piece needed to do de-duplication, so the wiki is still correct IMHO.

Thanks,
Qu


Is that correct?

Thanks
Shally

  and can remove duplication, I
don't find anything strange in that wiki page.
(Especially considering how popular the tool is, you can't find a
handier tool than xfs_io.)

Thanks,
Qu


Is that understanding correct?
Thanks
Shally


Thanks,
Qu



So, can anyone help and point out what I am missing here?

Thanks
Shally
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

