Re: How to find (out if) files sharing content?
On Wed, Oct 31, 2012 at 09:02:15PM +0800, Jeff Liu wrote:
> I propose this because OCFS2 reports shared space this way, in combination with du(1).
> An old patch set to teach du(1) to be aware of reflinked files:
> https://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007293.html

The patch looks OK; the shared size is requested by an option.

> Do you mean that the per-file extent status check in userland is very expensive?

The most expensive part is IMO not in userspace; it does in-memory lookups.

>> And without any possibility to turn this off, I'm afraid this will render FIEMAP unusable in practice.
>
> For OCFS2, the FIEMAP_EXTENT_SHARED flag is set by the fiemap ioctl(2) if an extent is OCFS2_EXT_REFCOUNTED (i.e. reflinked or cloned), which means that FIEMAP_EXTENT_SHARED is not a persistent flag, but I have no idea how Btrfs behaves on this point. :(

After some research, I think this could work for btrfs without unwanted performance penalties. There's the fiemap::fm_flags field, which can be extended to request the shared extent info from fiemap, so the information is not computed unconditionally (that was my concern before). The rest is only implementation details of how to speed up the file extent - refcount info lookups.

david
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: How to find (out if) files sharing content?
On 11/06/2012 06:45 AM, David Sterba wrote:
> On Wed, Oct 31, 2012 at 09:02:15PM +0800, Jeff Liu wrote:
>> I propose this because OCFS2 reports shared space this way, in combination with du(1).
>> An old patch set to teach du(1) to be aware of reflinked files:
>> https://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007293.html
>
> The patch looks OK; the shared size is requested by an option.
>
>> Do you mean that the per-file extent status check in userland is very expensive?
>
> The most expensive part is IMO not in userspace; it does in-memory lookups.
>
>>> And without any possibility to turn this off, I'm afraid this will render FIEMAP unusable in practice.
>>
>> For OCFS2, the FIEMAP_EXTENT_SHARED flag is set by the fiemap ioctl(2) if an extent is OCFS2_EXT_REFCOUNTED (i.e. reflinked or cloned), which means that FIEMAP_EXTENT_SHARED is not a persistent flag, but I have no idea how Btrfs behaves on this point. :(
>
> After some research, I think this could work for btrfs without unwanted performance penalties. There's the fiemap::fm_flags field, which can be extended to request the shared extent info from fiemap, so the information is not computed unconditionally (that was my concern before). The rest is only implementation details of how to speed up the file extent - refcount info lookups.

Thanks for your confirmation.

-Jeff

> david
Re: How to find (out if) files sharing content?
On Wed, Oct 31, 2012 at 10:30:22AM +0800, Jeff Liu wrote:
> One idea is to mark those cloned extents as FIEMAP_EXTENT_SHARED so that we can go through a file with fiemap(2) to figure out how many extents are shared, and calculate the real storage (fs/subvolume) footprint in the end.

This will cost at least one more seek per extent to find out whether the extent is shared, which could be quite expensive. And without any possibility to turn this off, I'm afraid this will render FIEMAP unusable in practice.

david
Re: How to find (out if) files sharing content?
On 10/31/2012 07:31 PM, David Sterba wrote:
> On Wed, Oct 31, 2012 at 10:30:22AM +0800, Jeff Liu wrote:
>> One idea is to mark those cloned extents as FIEMAP_EXTENT_SHARED so that we can go through a file with fiemap(2) to figure out how many extents are shared, and calculate the real storage (fs/subvolume) footprint in the end.
>
> This will cost at least one more seek per extent to find out whether the extent is shared, which could be quite expensive.

I propose this because OCFS2 reports shared space this way, in combination with du(1).
An old patch set to teach du(1) to be aware of reflinked files:
https://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007293.html

Do you mean that the per-file extent status check in userland is very expensive? If yes, I once tested a 50 GB OCFS2 partition filled with reflinked files on an old laptop; it took around 4 minutes to show the total results, if I recall correctly, but this definitely depends on the real-world scenario.

> And without any possibility to turn this off, I'm afraid this will render FIEMAP unusable in practice.

For OCFS2, the FIEMAP_EXTENT_SHARED flag is set by the fiemap ioctl(2) if an extent is OCFS2_EXT_REFCOUNTED (i.e. reflinked or cloned), which means that FIEMAP_EXTENT_SHARED is not a persistent flag, but I have no idea how Btrfs behaves on this point. :(

Thanks,
-Jeff

> david
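For readers who want to see what the per-file check discussed above could look like from userland, here is a minimal Python sketch. It assumes a filesystem that actually reports FIEMAP_EXTENT_SHARED (OCFS2 according to this message; btrfs behaviour was still an open question in the thread). The constants and struct layouts follow linux/fiemap.h; the ioctl number is the value on x86 and ARM and may differ on architectures with a different ioctl encoding. The function names are illustrative only.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B       # _IOWR('f', 11, struct fiemap) on x86/ARM (assumption)
FIEMAP_FLAG_SYNC = 0x0001        # sync the file before mapping
FIEMAP_EXTENT_LAST = 0x0001      # this is the last extent of the file
FIEMAP_EXTENT_SHARED = 0x2000    # extent is shared with other files

# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved (32 bytes)
HDR = struct.Struct("=QQLLLL")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# fe_reserved64[2], fe_flags, fe_reserved[3] (56 bytes)
EXT = struct.Struct("=QQQQQLLLL")


def parse_extents(buf):
    """Decode the extent records from a buffer filled in by the kernel."""
    mapped = HDR.unpack_from(buf, 0)[3]
    extents = []
    for i in range(mapped):
        rec = EXT.unpack_from(buf, HDR.size + i * EXT.size)
        logical, physical, length, flags = rec[0], rec[1], rec[2], rec[5]
        extents.append((logical, physical, length, flags))
    return extents


def fiemap(fd, batch=128):
    """Yield (logical, physical, length, flags) for every extent of fd."""
    start = 0
    while True:
        buf = bytearray(HDR.size + batch * EXT.size)
        HDR.pack_into(buf, 0, start, 0xFFFFFFFFFFFFFFFF,
                      FIEMAP_FLAG_SYNC, 0, batch, 0)
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)   # bytearray is filled in place
        extents = parse_extents(buf)
        if not extents:
            return
        yield from extents
        logical, _, length, flags = extents[-1]
        if flags & FIEMAP_EXTENT_LAST:
            return
        start = logical + length              # continue after the last extent


def shared_bytes(extents):
    """Sum the lengths of all extents flagged as shared."""
    return sum(length for _, _, length, flags in extents
               if flags & FIEMAP_EXTENT_SHARED)
```

Something like `shared_bytes(fiemap(f.fileno()))` on an open file `f` would then give the shared byte count for that file. Note that the ioctl is issued once per batch of extents, so the cost David is worried about would be inside the kernel, not in this loop.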
How to find (out if) files sharing content?
Hi,

How could one find out if 2 files share any extents on a btrfs file system?

A more generic variation of the above: How to list files on the same file system/subvolume sharing content?

Thanks,
Gábor
Re: How to find (out if) files sharing content?
On Tue, Oct 30, 2012 at 04:20:05PM +0100, Gábor Nyers wrote:
> Hi,
>
> How could one find out if 2 files share any extents on a btrfs file system?
>
> A more generic variation of the above: How to list files on the same file system/subvolume sharing content?

You have direct (read-only) access to the metadata trees through the TREE_SEARCH ioctl. It should be possible to walk through the extents of a given file, and (I think) follow back-refs from the extent back to the other files that share it. There's no simple code to do that right now, though.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- And what rough beast, its hour come round at last / slouches ---
towards Bethlehem, to be born?
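This is not the TREE_SEARCH approach described above. As a cruder check for two specific files, one could compare their extent maps (obtained by whatever means, for example fiemap or filefrag) and look for overlapping on-disk ranges. The Python sketch below assumes each file is given as a list of (physical, length) byte ranges, and that ranges within one file do not overlap each other. The function name is illustrative.

```python
def shared_ranges(a, b):
    """Return the (start, end) byte ranges present in both extent lists.

    a and b are lists of (physical, length) tuples, one list per file.
    """
    ra = sorted((p, p + n) for p, n in a)
    rb = sorted((p, p + n) for p, n in b)
    i = j = 0
    overlaps = []
    while i < len(ra) and j < len(rb):
        start = max(ra[i][0], rb[j][0])
        end = min(ra[i][1], rb[j][1])
        if start < end:                 # the two ranges intersect
            overlaps.append((start, end))
        # advance whichever range ends first
        if ra[i][1] <= rb[j][1]:
            i += 1
        else:
            j += 1
    return overlaps
```

An empty result means the two maps have no on-disk range in common. It says nothing about other files on the file system, which is where the back-ref walk would be needed.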
Re: How to find (out if) files sharing content?
On Tue, October 30, 2012 at 16:39 (+0100), Hugo Mills wrote:
> It should be possible to walk through the extents of a given file, and (I think) follow back-refs from the extent back to the other files that share it.

You wish :-) Backrefs are not made to be walked while the file system is online. However, btrfs inspect logical manages quite well; at least I haven't heard otherwise so far. You still need to get the logical block numbers, either via the TREE_SEARCH ioctl or with filefrag.

-Jan
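A rough Python sketch of the filefrag route: parse the output of `filefrag -v`, convert the physical block numbers to byte addresses, and build the matching `btrfs inspect-internal logical-resolve` command lines (the full name of the "btrfs inspect logical" command mentioned above). The `filefrag -v` output layout assumed here is shown in the comment and may vary between e2fsprogs versions. It is also an assumption that the "physical" offsets filefrag prints on btrfs are the logical addresses that logical-resolve expects. The function names are illustrative.

```python
import re

# Extent rows of `filefrag -v` are assumed to look like:
#    0:        0..     255:      34816..     35071:    256:             last,eof
ROW = re.compile(r"^\s*\d+:\s*(\d+)\.\.\s*(\d+):\s*(\d+)\.\.\s*(\d+):\s*(\d+):")
# Header line, e.g. "File size of foo is 1048576 (256 blocks of 4096 bytes)"
BLOCKS = re.compile(r"\(\d+ blocks? of (\d+) bytes\)")


def parse_filefrag(text):
    """Return the starting byte address of each extent in `filefrag -v` output."""
    m = BLOCKS.search(text)
    block_size = int(m.group(1)) if m else 4096   # assumption: 4 KiB fallback
    addresses = []
    for line in text.splitlines():
        row = ROW.match(line)
        if row:
            addresses.append(int(row.group(3)) * block_size)
    return addresses


def resolve_commands(addresses, mountpoint):
    """Build one logical-resolve command line per extent address."""
    return [["btrfs", "inspect-internal", "logical-resolve", str(a), mountpoint]
            for a in addresses]
```

Each command list could be passed to `subprocess.run`; the paths it prints are the files referencing that address, so an address that resolves to more than one path indicates shared content.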
Re: How to find (out if) files sharing content?
On 10/30/2012 11:20 PM, Gábor Nyers wrote:
> Hi,
>
> How could one find out if 2 files share any extents on a btrfs file system?
>
> A more generic variation of the above: How to list files on the same file system/subvolume sharing content?

Indeed, ocfs2 already has a feature where you can get the shared parts via 'du'; we're planning to support this in btrfs, too.

thanks,
liubo

> Thanks,
> Gábor
Re: How to find (out if) files sharing content?
On 10/31/2012 08:40 AM, Liu Bo wrote:
> On 10/30/2012 11:20 PM, Gábor Nyers wrote:
>> Hi,
>>
>> How could one find out if 2 files share any extents on a btrfs file system?
>>
>> A more generic variation of the above: How to list files on the same file system/subvolume sharing content?

One idea is to mark those cloned extents as FIEMAP_EXTENT_SHARED so that we can go through a file with fiemap(2) to figure out how many extents are shared, and calculate the real storage (fs/subvolume) footprint in the end.

Thanks,
-Jeff

> Indeed, ocfs2 already has a feature where you can get the shared parts via 'du'; we're planning to support this in btrfs, too.
>
> thanks,
> liubo
>
>> Thanks,
>> Gábor
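The "real footprint" calculation could be sketched as follows, assuming the extent maps of all files of interest have already been collected as (physical, length) byte ranges. This is only an illustration of the accounting, not how du(1) or the OCFS2 patch set does it. On-disk ranges referenced by several files are counted once. The function name and the input layout are hypothetical.

```python
def footprint(files):
    """Return (apparent, actual) byte counts for a set of files.

    files maps a file name to a list of (physical, length) tuples.
    apparent sums every extent; actual counts each on-disk byte once.
    """
    ranges = sorted((p, p + n) for extents in files.values() for p, n in extents)
    apparent = sum(end - start for start, end in ranges)

    actual = 0
    cur_start = cur_end = None
    for start, end in ranges:
        if cur_end is None or start > cur_end:
            # gap before this range: close the merged run collected so far
            if cur_end is not None:
                actual += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # overlapping or adjacent: extend the current merged run
            cur_end = max(cur_end, end)
    if cur_end is not None:
        actual += cur_end - cur_start
    return apparent, actual
```

The difference between the two numbers is the space saved by sharing, which is roughly what a du(1) option for shared size would have to report.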