I've merged this back to bug 6048
On 05/09/2012 03:28 PM, Kai Petzke wrote:
> Hello,
>
>
> there has been work by others on adding support for the OCFS2 "reflink"
> ioctl() call, which is similar to the btrfs "clone" call and creates a
> copy-on-write copy of the original. This makes it possible to "copy" even
> gigabyte-sized files within a tiny fraction of a second, without using much
> additional file system space. See:
> http://lists.gnu.org/archive/html/coreutils/2011-08/msg00046.html
> http://lists.gnu.org/archive/html/bug-coreutils/2010-04/msg00185.html
>
> I have updated those patches to work against coreutils 8.16 and fixed the
> bugs that I spotted. In particular, if the destination file exists, the
> "reflink" is automatically retried after removing it, and if not all
> attributes are copied, the following open() system call is made without
> O_TRUNC, so that it does not truncate the just-created copy.
>
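To check my reading of the retry and no-truncate behaviour described above, here is a minimal standalone sketch. The names reflink_with_retry, dest_open_flags and fake_reflink are made up for illustration and are not in the patch. fake_reflink only stands in for the OCFS2 ioctl: it creates the destination exclusively, so it fails with EEXIST in the same situation, but it shares no data. The gnulib O_BINARY flag is left out to keep the sketch free of dependencies.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical callback type standing in for the OCFS2 reflink wrapper.
   It must return 0 on success, or -1 with errno set.  */
typedef int (*reflink_fn) (char const *src, char const *dst);

/* Try DO_REFLINK once.  If it fails with EEXIST, unlink DST and try
   exactly one more time.  Return true if a reflink was created.  */
bool
reflink_with_retry (reflink_fn do_reflink, char const *src, char const *dst)
{
  if (do_reflink (src, dst) == 0)
    return true;
  if (errno != EEXIST)
    return false;
  return unlink (dst) == 0 && do_reflink (src, dst) == 0;
}

/* Flags for opening an already existing destination.  After a successful
   reflink the data is in place, so O_TRUNC must not be used.  */
int
dest_open_flags (bool reflink_ok)
{
  return O_WRONLY | (reflink_ok ? 0 : O_TRUNC);
}

/* Stand-in for the real ioctl so the sketch runs on any file system
   (assumption: only the EEXIST behaviour matters here).  */
int
fake_reflink (char const *src, char const *dst)
{
  (void) src;
  int fd = open (dst, O_WRONLY | O_CREAT | O_EXCL, 0600);
  if (fd < 0)
    return -1;          /* errno is EEXIST when DST already exists */
  return close (fd);
}
```

With an existing destination, reflink_with_retry (fake_reflink, src, dst) fails once with EEXIST, removes dst, and succeeds on the second attempt.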
> I strongly suggest including that patch in the coreutils package,
I'm less enthused about adding this as it doesn't fit very cleanly.
> even though having to use two different system calls to achieve the
> same thing is awkward.
> But, as laid out in the comments in the source, btrfs clone and ocfs2 reflink
> are semantically quite different, so unifying them into one call on the
> kernel side is not likely to happen soon, if it happens at all.
That would be unfortunate. Hopefully a generic reflink() call can be sorted out.
> If users don't use the --reflink option of "cp", the additional code makes no
> difference, so it doesn't hurt.
Fair point.
> And if users use "--reflink" on either of the supported file systems, they
> get a huge advantage out of it!
I really dislike that xattrs are copied unconditionally.
It might be best to auto clear xattrs after the "reflink", if possible?
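A rough sketch of such a clearing step, assuming the Linux <sys/xattr.h> interface. The helper name clear_user_xattrs is hypothetical, and restricting it to the "user." namespace is my assumption, since attributes in other namespaces (security.*, system.*) may need privileges or separate handling.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical helper, not part of the patch: remove every extended
   attribute in the "user." namespace from PATH.
   Return 0 on success (including when the file system has no xattr
   support), otherwise return -1 and set errno.  */
int
clear_user_xattrs (char const *path)
{
  /* First call with a zero-sized buffer to learn the list size.  */
  ssize_t len = listxattr (path, NULL, 0);
  if (len < 0)
    return errno == ENOTSUP ? 0 : -1;
  if (len == 0)
    return 0;

  char *names = malloc (len);
  if (!names)
    return -1;

  /* The result is a sequence of NUL-terminated attribute names.  */
  len = listxattr (path, names, len);
  int ret = len < 0 ? -1 : 0;

  for (char *p = names; ret == 0 && p < names + len; p += strlen (p) + 1)
    {
      if (strncmp (p, "user.", 5) == 0 && removexattr (path, p) != 0)
        ret = -1;
    }

  /* Preserve errno across free ().  */
  int saved_errno = errno;
  free (names);
  errno = saved_errno;
  return ret;
}
```

In the patch this would run on the destination right after a successful reflink, and only when xattr preservation was not requested.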
cheers,
Pádraig.
--- copy.c.orig 2012-03-24 21:26:51.000000000 +0100
+++ copy.c 2012-05-09 16:07:46.000000000 +0200
@@ -60,6 +60,12 @@
#include "areadlink.h"
#include "yesno.h"
+#if HAVE_SYS_VFS_H
+# include <sys/vfs.h>
+#else
+# include <sys/statfs.h>
+#endif
+
#if USE_XATTR
# include <attr/error_context.h>
# include <attr/libattr.h>
@@ -218,6 +224,47 @@
return true;
}
+/* Perform the OCFS2 CoW reflink ioctl(2) operation if possible.
+ When the '-p' option is used, the file's default attributes (i.e. mode,
+ timestamps, ownership and, if possible, security context) are reflinked to
+ the destination file as well. We then skip the standard preserve process
+ for such attributes. Also, xattrs are always reflinked, even with
+ 'REFLINK_ATTR_NONE'.
+ Upon success, return 0. Otherwise, return -1 and set errno. */
+static inline int
+reflink_file (char const *src_name, char const *dst_name,
+ bool preserve_attrs, int src_fd)
+{
+#ifdef __linux__
+# ifndef REFLINK_ATTR_NONE
+# define REFLINK_ATTR_NONE 0
+# endif
+# ifndef REFLINK_ATTR_PRESERVE
+# define REFLINK_ATTR_PRESERVE 1
+# endif
+# ifndef OCFS2_IOC_REFLINK
+ struct reflink_arguments {
+ uint64_t old_path;
+ uint64_t new_path;
+ uint64_t preserve;
+ };
+# define OCFS2_IOC_REFLINK _IOW ('o', 4, struct reflink_arguments)
+# endif
+ struct reflink_arguments args = {
+ .old_path = (unsigned long) src_name,
+ .new_path = (unsigned long) dst_name,
+ .preserve = preserve_attrs ? REFLINK_ATTR_PRESERVE : REFLINK_ATTR_NONE,
+ };
+ return ioctl (src_fd, OCFS2_IOC_REFLINK, &args);
+#else
+ (void) src_name;
+ (void) dst_name;
+ (void) preserve_attrs;
+ (void) src_fd;
+ errno = ENOTSUP;
+ return -1;
+#endif
+}
+
/* Perform the O(1) btrfs clone operation, if possible.
Upon success, return 0. Otherwise, return -1 and set errno. */
static inline int
@@ -822,11 +869,55 @@
goto close_src_desc;
}
+ bool reflink_ok = false;
+ if (x->reflink_mode)
+ {
+ /* When cp is invoked with '--reflink=[WHEN]', try to do OCFS2 reflink
+ ioctl(2) first. If it fails, then try Btrfs clone later on.
+ These operations are performed separately because the OCFS2 reflink
+ ioctl() works on file names, while Btrfs clone works on open file
+ descriptors.
+ If OCFS2 reflink ioctl() succeeds and attribute preservation was
+ enabled, we are done. If OCFS2 reflink succeeds and only some of
+ the attributes are preserved, we still have to open the destination
+ file and go through the attribute copying code, but don't need
+ to execute the actual copy. Of course, the open() system call must
+ be performed without O_TRUNC set in that case.
+ If OCFS2 reflink fails, Btrfs clone is tried later on, after the
+ destination file has been opened normally.
+
+ Note that the OCFS2 reflink ioctl() fails with errno set to EEXIST
+ if the destination file already exists. If that happens, we
+ unlink() the destination file and try again. */
+ bool preserve_attributes = (x->preserve_ownership
+ && x->preserve_mode
+ && x->preserve_timestamps);
+ reflink_ok = reflink_file (src_name, dst_name, preserve_attributes,
+ source_desc) == 0;
+ if (! reflink_ok && errno == EEXIST)
+ {
+ reflink_ok = unlink (dst_name) == 0 &&
+ reflink_file (src_name, dst_name, preserve_attributes,
+ source_desc) == 0;
+ }
+ if (reflink_ok)
+ {
+ *new_dst = false;
+ data_copy_required = false;
+
+ /* Skip over the standard attributes preserve process
+ if reflink succeeds and they are already reflinked. */
+ if (preserve_attributes)
+ goto close_src_desc;
+ }
+ }
+
/* The semantics of the following open calls are mandated
by the specs for both cp and mv. */
if (! *new_dst)
{
- dest_desc = open (dst_name, O_WRONLY | O_TRUNC | O_BINARY);
+ int open_flags = O_WRONLY | (reflink_ok ? 0 : O_TRUNC) | O_BINARY;
+ dest_desc = open (dst_name, open_flags);
dest_errno = errno;
/* When using cp --preserve=context to copy to an existing destination,
@@ -955,18 +1046,19 @@
/* --attributes-only overrides --reflink. */
if (data_copy_required && x->reflink_mode)
{
+ /* If the preceding OCFS2 reflink failed, try Btrfs clone now.
+ If that fails too and `cp' was invoked with '--reflink=always',
+ report an error; otherwise, fall back to a standard copy. */
bool clone_ok = clone_file (dest_desc, source_desc) == 0;
- if (clone_ok || x->reflink_mode == REFLINK_ALWAYS)
+ if (!clone_ok && x->reflink_mode == REFLINK_ALWAYS)
{
- if (!clone_ok)
- {
- error (0, errno, _("failed to clone %s from %s"),
- quote_n (0, dst_name), quote_n (1, src_name));
- return_val = false;
- goto close_src_and_dst_desc;
- }
- data_copy_required = false;
- }
+ error (0, errno, _("failed to clone %s from %s"),
+ quote_n (0, dst_name), quote_n (1, src_name));
+ return_val = false;
+ goto close_src_and_dst_desc;
+ }
+ if (clone_ok)
+ data_copy_required = false;
}
if (data_copy_required)