Paul Eggert wrote:
> (By "oodles faster" I mean "as much faster as you like".
> The benchmark below shows a 2800x speedup.)
>
> In response to an idea by Kit Westneat for GNU tar reported in
> <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>,
> Eric Blake wrote:
>
>> Meanwhile, if you are indeed correct that there are easy ways to detect
>> completely sparse files, even when the ioctl or SEEK_HOLE directives are
>> not present, then the coreutils cp(1) hole iteration routine should
>> probably be taught that corner case to recognize an entirely sparse file
>> as a single hole.
>
> Here's a patch to coreutils to implement this idea.  It's based on a patch
> <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html> that
> I just now installed into GNU tar.  I think of it as a quick first cut
> at full fiemap / SEEK_HOLE implementation, but unlike the full
> implementation this optimization does not depend on any special ioctls
> or lseek extensions, so it should work on any POSIX or POSIX-like host.
>
> On a simple benchmark this sped up GNU cp by a factor of 2800
> (measuring by real-time seconds) on my host:
>
>   $ truncate -s 10GB bigfile
>   $ time old/cp bigfile bigfile-slow
>
>   real    2m3.231s
>   user    0m1.497s
>   sys     0m5.738s
>   $ time new/cp bigfile bigfile-fast
>
>   real    0m0.044s
>   user    0m0.000s
>   sys     0m0.002s
>   $ ls -ls bigfile*
>   0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:11 bigfile
>   0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-fast
>   0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-slow
>
> From 2e535b590d675e6d96f954c1f840d678fb133f6a Mon Sep 17 00:00:00 2001
> From: Paul Eggert <[email protected]>
> Date: Tue, 24 Aug 2010 22:20:55 -0700
> Subject: [PATCH] cp: copy entirely-sparse files oodles faster
>
> * src/copy.c (copy_reg): Bypass reads if the file is entirely
> sparse.  Idea suggested by Kit Westneat via Bernd Shubert in
> <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>
> for the Lustre file system.  Implementation stolen from my patch
> <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html>
> to GNU tar.  On my machine this sped up a cp benchmark, which
> copied a 10 GB entirely-sparse file on an NFS file system, by a
> factor of 2800 in real seconds.

Hi Paul,

Somehow I didn't see this patch from you until now, while looking
through the hundreds of outstanding (but mostly resolved) bugs at
http://debbugs.gnu.org/coreutils.  Sorry about that.

Now that we have FIEMAP support (and, by the looks of things, will soon
have SEEK_HOLE support in cp and in the Linux kernel), do you think
adding support for this special case is worthwhile?
I could go either way.

If so, would you care to rebase it for 8.13?

coreutils-8.12 will probably be coming soon, to adjust FIEMAP support
so that it does not collide with the combination of XFS, 2.6.39
release-candidate kernels and so-called "unwritten extents".
