On Fri, Mar 12, 2021 at 02:30:15AM +0100, David Disseldorp wrote: > From: Luis Henriques <lhenriq...@suse.de>
I guess you can drop the above as you've added quite a bit of work on top of my original patch (which was just an RFC) ;-) > This helper function uses the copy_file_range(2) Linux syscall for > in-kernel transfer between two file descriptors. Copy-on-write > enabled filesystems may optimize I/O by using reflinks. > > Signed-off-by: David Disseldorp <dd...@suse.de> > --- > configure.ac | 6 ++++++ > src/extern.h | 4 ++++ > src/util.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 60 insertions(+) > > diff --git a/configure.ac b/configure.ac > index 0a35e4c..ed6e3d2 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -51,6 +51,12 @@ AC_COMPILE_CHECK_RETTYPE([minor], [0]) > AC_CHECK_FUNCS([fchmod fchown]) > # This is needed for mingw build > AC_CHECK_FUNCS([setmode getpwuid getpwnam getgrgid getgrnam pipe fork getuid > geteuid]) > +AC_CHECK_FUNC([copy_file_range], > + [AC_CHECK_SIZEOF(off_t) > + AC_CHECK_SIZEOF(loff_t) > + if (test "$ac_cv_sizeof_off_t" -eq "$ac_cv_sizeof_loff_t"); > then > + AC_DEFINE(HAVE_CPIO_REFLINK, [1], [cpio --reflink > support]) > + fi]) > > # gnulib modules > gl_INIT > diff --git a/src/extern.h b/src/extern.h > index 3b59053..6be95cf 100644 > --- a/src/extern.h > +++ b/src/extern.h > @@ -171,6 +171,10 @@ void tape_toss_input (int in_des, off_t num_bytes); > void copy_files_tape_to_disk (int in_des, int out_des, off_t num_bytes); > void copy_files_disk_to_tape (int in_des, int out_des, off_t num_bytes, char > *filename); > void copy_files_disk_to_disk (int in_des, int out_des, off_t num_bytes, char > *filename); > +#ifdef HAVE_CPIO_REFLINK > +ssize_t copy_files_range (int in_des, int out_des, off_t num_bytes); > +#endif > + > void warn_if_file_changed (char *file_name, off_t old_file_size, > time_t old_file_mtime); > void create_all_directories (char const *name); > diff --git a/src/util.c b/src/util.c > index d81ef30..4c0d75f 100644 > --- a/src/util.c > +++ b/src/util.c > @@ -516,6 +516,56 @@ copy_files_disk_to_tape (int in_des, int out_des, off_t > num_bytes, > in_buff += size; > } > } > + > +#ifdef HAVE_CPIO_REFLINK > +/* > + * Use the copy_file_range(2) syscall for in-kernel transfer between two file > + * descriptors. Copy-on-write enabled filesystems may optimize I/O by using > + * reflinks. > + */ > +ssize_t > +copy_files_range (int in_des, int out_des, off_t num_bytes) > +{ > + loff_t in_off = 0, out_off = 0; > + ssize_t total = 0; > + ssize_t ret; > + > + in_off = lseek (in_des, 0, SEEK_CUR); > + if (in_off < 0) > + return -errno; > + > + out_off = lseek (out_des, 0, SEEK_CUR); > + if (out_off < 0) > + return -errno; > + As I've already told you somewhere else, copy_file_range (crf for short!) implementations may expand any existing holes in the original file. That's OK, I guess, but it may be worth adding a comment here and possibly on the cpio(1) manpage, when documenting the --reflink option. Another option would be to add a bit more complexity here by calling crf in a loop and lseek'ing SEEK_DATA/SEEK_HOLE. That's what my original RFC patch was doing, but I agree that it's probably not worth it. My suggestion is to, at least, add here a small comment mentioning this decision and referring to the cfr manpage for the details. Cheers, -- Luís > + while (total < num_bytes) > + { > + ret = copy_file_range (in_des, &in_off, out_des, &out_off, > + num_bytes - total, 0); > + /* simplify read/write fallback: don't partially seek on error */ > + if (ret == 0) > + return -ENODATA; > + if (ret < 0) > + return -errno; > + total += ret; > + } > + > + /* copy complete. seek to new offset */ > + in_off = lseek (in_des, in_off, SEEK_SET); > + if (in_off < 0) > + error(1, errno, "failed final src seek"); > + > + out_off = lseek (out_des, out_off, SEEK_SET); > + if (out_off < 0) > + error(1, errno, "failed final dst seek"); > + > + output_bytes += total; > + input_bytes += total; > + > + return total; > +} > +#endif /* HAVE_CPIO_REFLINK */ > + > /* Copy a file using the input and output buffers, which may start out > partly full. After the copy, the files are not closed nor the last > block flushed to output, and the input buffer may still be partly > -- > 2.26.2 >