Re: better buffer size for copy

2005-11-24 Thread Jim Meyering
Paul Eggert [EMAIL PROTECTED] wrote:

> [EMAIL PROTECTED] (Robert Latham) writes:
>
> > That's what i thought you'd say.  Ok, this patch vs. today's
> > CVS adds buffer-lcm.h and buffer-lcm.c, adds those files to
> > Makefile.am, and makes copy.c call
> > buffer_lcm.
>
> That patch is a reasonable first cut, but it mishandles sparse files
> among other things.  I installed the following instead.  Thanks for
> prompting us to look into the problem.
>
> 2005-11-23  Paul Eggert  [EMAIL PROTECTED]
>
>   * src/copy.c: Improve performance a bit by optimizing away
>   unnecessary system calls and going to a block size of at least
...

Thanks for handling that.
I see that you too are adding declarations after statements :-)

   {
 +   word *wp = NULL;
 +
 +   ssize_t n_read = read (source_desc, buf, buf_size);

For the record, we've discussed this before, but now there are
two files in coreutils/src that use the C99 feature allowing
declarations after statements: copy.c and remove.c.

The plan is that people stuck with compilers unable to deal
with that syntax will be able to apply a patch converting to
equivalent C89.  It may even happen automatically: if/when
configure detects the lack of a suitable compiler, it'd apply
the C99-to-C89 patch.  The only hitch is that we'll have to maintain
the patch manually, but that shouldn't involve too much work.
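
A minimal sketch of the kind of conversion such a patch would perform,
assuming nothing about the real copy.c or remove.c code.  The function
names sum_c99 and sum_c89 are hypothetical.  The first version declares
variables after a statement (and inside the for header, another C99
feature); the second hoists every declaration to the top of the block so
that a C89 compiler accepts it.

```c
#include <stddef.h>

/* C99 style: `sum' is declared after the early-return statement,
   and `i' is declared inside the for header.  */
size_t
sum_c99 (unsigned char const *buf, size_t n)
{
  if (n == 0)
    return 0;

  size_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += buf[i];
  return sum;
}

/* Equivalent C89: all declarations precede the first statement.  */
size_t
sum_c89 (unsigned char const *buf, size_t n)
{
  size_t sum = 0;
  size_t i;

  if (n == 0)
    return 0;

  for (i = 0; i < n; i++)
    sum += buf[i];
  return sum;
}
```

Both functions compute the same result; only the placement of the
declarations differs, which is why the conversion can be mechanical.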


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: better buffer size for copy

2005-11-23 Thread Paul Eggert
[EMAIL PROTECTED] (Robert Latham) writes:

> That's what i thought you'd say.  Ok, this patch vs. today's
> CVS adds buffer-lcm.h and buffer-lcm.c, adds those files to
> Makefile.am, and makes copy.c call
> buffer_lcm.

That patch is a reasonable first cut, but it mishandles sparse files
among other things.  I installed the following instead.  Thanks for
prompting us to look into the problem.

2005-11-23  Paul Eggert  [EMAIL PROTECTED]

* src/copy.c: Improve performance a bit by optimizing away
unnecessary system calls and going to a block size of at least
8192 (on normal hosts, anyway).  This improved performance 5% on my
Debian stable host (2.4.27 kernel, x86, copying from root
ext3 file system to itself).
Include buffer-lcm.h.
(copy_reg): Omit last argument.  All callers changed.
Use xmalloc to allocate rather than trusting alloca
(which is unwise with large block sizes).
Declare locals more locally, if possible.
Use uintptr_t words instead of int words, for a bit more speed
when looking for null blocks on 64-bit hosts.
Optimize away reads of zero bytes on regular files.
In the typical case, insist on 8 KiB buffers, at least.
Avoid unnecessary extra call to fstat when checking for sparse files.
Avoid now-unnecessary cast to off_t, and 0L.
Avoid unnecessary test of *new_dst when checking for same owner
and group.

* Makefile.am (libcoreutils_a_SOURCES): Add buffer-lcm.c, buffer-lcm.h.
* buffer-lcm.c, buffer-lcm.h: New files, from diffutils.
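
The ChangeLog item about using uintptr_t words when looking for null
blocks can be illustrated with a standalone sketch.  This is not the code
installed in copy.c: block_is_zero is a hypothetical helper, and it uses
memcpy to load each word so that it makes no assumption about buffer
alignment (the real code works on an aligned buffer instead).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return true if the N bytes at BUF are all zero.  Whole uintptr_t
   words are tested first; any trailing bytes are tested one by one.  */
bool
block_is_zero (void const *buf, size_t n)
{
  unsigned char const *p = buf;
  size_t i = 0;

  for (; i + sizeof (uintptr_t) <= n; i += sizeof (uintptr_t))
    {
      uintptr_t w;
      memcpy (&w, p + i, sizeof w);  /* safe for unaligned input */
      if (w != 0)
        return false;
    }

  for (; i < n; i++)
    if (p[i] != 0)
      return false;

  return true;
}
```

On a 64-bit host each loop iteration tests eight bytes rather than the
four an int would cover, which is the speedup the ChangeLog refers to.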

Index: src/copy.c
===================================================================
RCS file: /fetish/cu/src/copy.c,v
retrieving revision 1.190
diff -p -u -r1.190 copy.c
--- src/copy.c  25 Sep 2005 03:07:33 -  1.190
+++ src/copy.c  24 Nov 2005 06:40:25 -
@@ -31,6 +31,7 @@
 
 #include "system.h"
 #include "backupfile.h"
+#include "buffer-lcm.h"
 #include "copy.h"
 #include "cp-hash.h"
 #include "dirname.h"
@@ -199,29 +200,21 @@ copy_dir (char const *src_name_in, char 
X provides many option settings.
Return true if successful.
*NEW_DST and *CHOWN_SUCCEEDED are as in copy_internal.
-   SRC_SB and DST_SB are the results of calling XSTAT (aka stat for
-   SRC_SB) on SRC_NAME and DST_NAME.  */
+   SRC_SB is the result of calling XSTAT (aka stat) on SRC_NAME.  */
 
 static bool
 copy_reg (char const *src_name, char const *dst_name,
  const struct cp_options *x, mode_t dst_mode, bool *new_dst,
  bool *chown_succeeded,
- struct stat const *src_sb,
- struct stat const *dst_sb)
+ struct stat const *src_sb)
 {
   char *buf;
-  size_t buf_size;
-  size_t buf_alignment;
+  char *buf_alloc = NULL;
   int dest_desc;
   int source_desc;
   struct stat sb;
   struct stat src_open_sb;
-  char *cp;
-  int *ip;
   bool return_val = true;
-  off_t n_read_total = 0;
-  bool last_write_made_hole = false;
-  bool make_holes = false;
 
   source_desc = open (src_name, O_RDONLY | O_BINARY);
   if (source_desc < 0)
@@ -282,8 +275,6 @@ copy_reg (char const *src_name, char con
   goto close_src_desc;
 }
 
-  /* Determine the optimal buffer size.  */
-
   if (fstat (dest_desc, &sb))
     {
       error (0, errno, _("cannot fstat %s"), quote (dst_name));
@@ -291,126 +282,167 @@ copy_reg (char const *src_name, char con
   goto close_src_and_dst_desc;
 }
 
-  buf_size = ST_BLKSIZE (sb);
-
-  /* Even with --sparse=always, try to create holes only
- if the destination is a regular file.  */
-  if (x->sparse_mode == SPARSE_ALWAYS && S_ISREG (sb.st_mode))
-    make_holes = true;
-
-#if HAVE_STRUCT_STAT_ST_BLOCKS
-  if (x->sparse_mode == SPARSE_AUTO && S_ISREG (sb.st_mode))
+  if (! (S_ISREG (src_open_sb.st_mode) && src_open_sb.st_size == 0))
 {
-  /* Use a heuristic to determine whether SRC_NAME contains any
-sparse blocks. */
+  typedef uintptr_t word;
+  off_t n_read_total = 0;
 
-  if (fstat (source_desc, &sb))
-   {
- error (0, errno, _("cannot fstat %s"), quote (src_name));
- return_val = false;
- goto close_src_and_dst_desc;
-   }
+  /* Choose a suitable buffer size; it may be adjusted later.  */
+  size_t buf_alignment = lcm (getpagesize (), sizeof (word));
+  size_t buf_alignment_slop = sizeof (word) + buf_alignment - 1;
+  size_t buf_size = ST_BLKSIZE (sb);
+
+  /* Deal with sparse files.  */
+  bool last_write_made_hole = false;
+  bool make_holes = false;
+
+  if (S_ISREG (sb.st_mode))
+   {
+ /* Even with --sparse=always, try to create holes only
+if the destination is a regular file.  */
+ if (x->sparse_mode == SPARSE_ALWAYS)
+   make_holes = true;
 
-  /* If the file has fewer blocks than would normally
-be needed for a file of its size, then
-at least one of the blocks in the file is a hole. */
-  if (S_ISREG (sb.st_mode)

Re: better buffer size for copy

2005-11-22 Thread Robert Latham
On Mon, Nov 21, 2005 at 12:45:40AM -0500, Phillip Susi wrote:
> If we are talking about the conventional blocking cached read,
> followed by a blocking cached write, then I think you will find that
> using a buffer size of several pages ( say 32 or 64 KB ) will be
> MUCH more efficient than 1024 bytes ( the typical local filesystem
> block size ), so using st_blksize for the size of the read/write
> buffer is not good.  I think you may be ascribing meaning to
> st_blksize that is not there.

I mean no offense cutting out most of your points.  You describe great
ways to achieve high I/O rates for anyone writing a custom file mover.
I shouldn't have mentioned network file systems.  It's a distraction
from the real point of my patch: cp(1) should consider both the source
and the destination st_blksize.

All I expect from st_blksize is what the stat(2)
manpage suggests:

   The value st_blocks gives the size of  the  file  in  512-byte
   blocks.  (This  may  be  smaller than st_size/512 e.g. when the
   file has holes.) The value st_blksize gives the preferred
   blocksize for efficient file system  I/O.  (Writing to a file
   in smaller chunks may cause an inefficient
   read-modify-rewrite.)

All I really want is for cp(1) to do the right thing no matter what
the source or destination st_blksize value might be.

In copying from a 4k blocksize file system to a 64k blocksize
filesystem, cp(1) will perform well, as it is using a 64k buffer.

In copying *from* that 64k blocksize filesystem *to* a 4k blocksize
filesystem, cp(1) will not perform as well: it's using a 4k buffer and
so reading from the source filesystem in less-than-ideal chunks.
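
To make the two directions concrete, here is a minimal sketch, assuming
the copy loop sizes its buffer from the least common multiple of both
st_blksize values as the patch proposes.  copy_buf_size and gcd_size are
hypothetical names, the 512-byte fallback for a zero or negative block
size is an assumption, and overflow is not checked here (the diffutils
buffer_lcm routine handles that case).

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Greatest common divisor of A and B (Euclid's algorithm).  */
static size_t
gcd_size (size_t a, size_t b)
{
  while (b != 0)
    {
      size_t r = a % b;
      a = b;
      b = r;
    }
  return a;
}

/* Buffer size that is a multiple of both the source and the
   destination preferred I/O block size.  */
size_t
copy_buf_size (struct stat const *src_sb, struct stat const *dst_sb)
{
  /* Assumption: fall back to 512 if a file system reports nothing.  */
  size_t s = src_sb->st_blksize > 0 ? (size_t) src_sb->st_blksize : 512;
  size_t d = dst_sb->st_blksize > 0 ? (size_t) dst_sb->st_blksize : 512;

  return s / gcd_size (s, d) * d;
}
```

With a 4k and a 64k file system this yields 64k whichever side is the
source, so both directions read and write in chunks that suit the
larger block size.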

Thanks again for taking the time to respond.  I hope I have made the
intent of my patch more clear. 

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B




Re: better buffer size for copy

2005-11-22 Thread Phillip Susi

Robert Latham wrote:

> I mean no offense cutting out most of your points.  You describe great
> ways to achieve high I/O rates for anyone writing a custom file mover.
> I shouldn't have mentioned network file systems.  It's a distraction
> from the real point of my patch: cp(1) should consider both the source
> and the destination st_blksize.



No problem... I kind of went off on a tangent there.


> All I expect from st_blksize is what the stat(2)
> manpage suggests:
>
>    The value st_blocks gives the size of  the  file  in  512-byte
>    blocks.  (This  may  be  smaller than st_size/512 e.g. when the
>    file has holes.) The value st_blksize gives the preferred
>    blocksize for efficient file system  I/O.  (Writing to a file
>    in smaller chunks may cause an inefficient
>    read-modify-rewrite.)
>
> All I really want is for cp(1) to do the right thing no matter what
> the source or destination st_blksize value might be.



Ok, I see what you are talking about now.  Using a copy block size 
smaller than the filesystem block size can result in a lot of extra IO, 
thus reducing throughput.  Of course, this doesn't really apply in a 
typical use case because the kernel will cache the writes and combine 
them when it flushes the IO to disk, however, yes... it is a good idea 
to use an IO block size that is at least as large as the larger of the 
source and destination filesystem block sizes.



> In copying from a 4k blocksize file system to a 64k blocksize
> filesystem, cp(1) will perform well, as it is using a 64k buffer.
>
> In copying *from* that 64k blocksize filesystem *to* a 4k blocksize
> filesystem, cp(1) will not perform as well: it's using a 4k buffer and
> so reading from the source filesystem in less-than-ideal chunks.



Again, this probably won't happen in real practice due to the influence 
of the filesystem cache, but I do see your point.  In practice though, I 
don't know of any filesystem with a 64k block size.  By default ext2/3 
use 1k, and reiserfs uses 4k.  These are going to be typical values for 
st_blksize, yet if you use a copy block size of say, 64k, I think you 
will find the performance to be significantly better than either 1k or 
4k.  I think that a good case in point is copying to/from a typical 
ext2/3 filesystem using a 1k block size.  Using a buffer smaller than a 
single 4k page is going to significantly degrade performance.  You 
certainly do not want to go smaller than the block size, but really, you 
should be going larger.
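
One way to combine both constraints (never smaller than the block size,
but larger than it for small block sizes) is to start from a fixed floor
and round it up to a multiple of the block size.  The sketch below is
only an illustration of that idea: io_size_at_least is a hypothetical
name, the floor (for example 64 KiB) is the estimate given above rather
than a measured value, and overflow for very large inputs is not checked.

```c
#include <stddef.h>

/* Return the smallest multiple of BLKSIZE that is at least FLOOR_SIZE.
   If BLKSIZE is zero, return FLOOR_SIZE unchanged.  */
size_t
io_size_at_least (size_t floor_size, size_t blksize)
{
  size_t rem;

  if (blksize == 0)
    return floor_size;

  rem = floor_size % blksize;
  return rem == 0 ? floor_size : floor_size + (blksize - rem);
}
```

For a 1k or 4k block size and a 64 KiB floor the result is 64 KiB; for
a 4 MB block size the result is the block size itself.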



> Thanks again for taking the time to respond.  I hope I have made the
> intent of my patch more clear.
>
> ==rob



You did... and I thank you as well and hope that I have made myself more 
clear.






Re: better buffer size for copy

2005-11-20 Thread Phillip Susi
What would such network filesystems report as their blocksize?  I have a 
feeling it isn't going to be on the order of a MB.  At least for local 
filesystems, the ideal transfer block size is going to be quite a bit 
larger than the filesystem block size ( if the filesystem is even block 
oriented... think reiser4, or cramfs ).  In the case of network 
filesystems, they should be performing readahead in the background 
between small block copies to keep the pipeline full.  As long as the 
copy program isn't blocked elsewhere for long periods, say in the write 
to the destination, then the readahead mechanism should keep the 
pipeline full.  Up to a point, using larger block sizes saves some cpu 
by lowering the number of system calls.  After a certain point, the copy 
program can start to waste enough time in the write that the readahead 
stops and stalls the pipeline. 

If you want really fast copies of large files, then you want to send 
down multiple overlapped aio ( real aio, not the glibc threaded 
implementation ) O_DIRECT reads and writes, but that gets quite 
complicated.  Simply using blocking O_DIRECT reads into a memory mapped 
destination file buffer performs nearly as well, provided you use a 
decent block size.  On my system I have found that 128 KB+ buffers are 
needed to keep the pipeline full because I'm using a 2 disk raid0 with a 
64k stripe factor.  As a result, blocks smaller than 128 KB only keep 
one disk going at a time.  That's probably getting a bit too complicated 
though for this conversation. 

If we are talking about the conventional blocking cached read, followed 
by a blocking cached write, then I think you will find that using a 
buffer size of several pages ( say 32 or 64 KB ) will be MUCH more 
efficient than 1024 bytes ( the typical local filesystem block size ), 
so using st_blksize for the size of the read/write buffer is not good.  
I think you may be ascribing meaning to st_blksize that is not there. 
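
For reference, the "conventional blocking cached read, followed by a
blocking cached write" amounts to a loop like the one below, where the
only tunable is the buffer size.  This is a minimal sketch, not the
coreutils code: copy_fd is a hypothetical name, BUF_SIZE must be nonzero,
and sparse-file handling is left out entirely.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Copy everything from SRC_FD to DST_FD through a heap buffer of
   BUF_SIZE bytes.  Return 0 on success, -1 on error with errno set.  */
int
copy_fd (int src_fd, int dst_fd, size_t buf_size)
{
  char *buf = malloc (buf_size);
  int result = 0;
  int saved_errno;

  if (!buf)
    return -1;

  while (result == 0)
    {
      ssize_t n_read = read (src_fd, buf, buf_size);
      ssize_t off = 0;

      if (n_read == 0)
        break;                  /* end of file */
      if (n_read < 0)
        {
          if (errno != EINTR)
            result = -1;        /* real error: leave the loop */
          continue;             /* EINTR: retry the read */
        }

      /* A write may be partial, so loop until the chunk is written.  */
      while (off < n_read)
        {
          ssize_t n_written = write (dst_fd, buf + off, n_read - off);
          if (n_written < 0)
            {
              if (errno == EINTR)
                continue;
              result = -1;
              break;
            }
          off += n_written;
        }
    }

  saved_errno = errno;
  free (buf);
  errno = saved_errno;
  return result;
}
```

Each pass costs one read and at least one write system call, so a larger
BUF_SIZE means fewer calls for the same file; that is the trade-off the
paragraphs above discuss.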



Robert Latham wrote:


> In local file systems, i'm sure you are correct.  If you are working
> with a remote file system, however, the optimal size is on the order
> of megabytes, not kilobytes.  For a specific example, consider the
> PVFS2 file system, where the plateau in blocksize vs. bandwidth is
> two orders of magnitude larger than 64 KB.  PVFS2 is a parallel file
> system for linux clusters.  I am not nearly as familiar with Lustre,
> GPFS, or GFS, but I suspect those filesystems too would benefit from
> block sizes larger than 64 KB.
>
> Are you taking umbrage at the idea of using st_blksize to direct how
> large the transfer size should be for I/O?  I don't know what other
> purpose st_blksize should have, nor are there any other fields which
> are remotely valid for that purpose.
>
> Thanks for your feedback.
> ==rob


 







Re: better buffer size for copy

2005-11-19 Thread Phillip Susi
I don't see why the filesystem's cluster size should have a thing to do 
with the buffer size used to copy files.  For optimal performance, the 
larger the buffer, the better.  Diminishing returns applies of course, 
so at some point the increase in buffer size results in little to no 
further increase in performance, so that's the size you should use.  I 
believe that the optimal size is about 64 KB. 




Robert Latham wrote:


> (README says to ping if there's not been an ack of a patch after two
> weeks.  here i go)
>
> This patch to today's (18 Nov 2005) coreutils CVS makes copy.c
> consider both the source and destination blocksize when computing
> buf_size.  With this patch, src/copy.c will use the LCM of the source
> and destination block sizes.  As Paul suggested, I used the buffer_lcm
> routine from diffutils.
>
> For what it's worth, this patch does not introduce any regressions
> into the coreutils testsuite.
>
> When copying from a remote filesystem with a block size of 4MB to a
> filesystem with a 4k blocksize, the copy is *very* slow.  Going from a
> filesystem with 4k blocks to a filesystem with 4MB blocks is much
> faster.  With this patch, both operations are equally performant.
>
> I went ahead and added a ChangeLog entry as well.
>
> Thanks.  I'll be more than happy to incorporate any suggestions or
> comments.
>
> ==rob


 







Re: better buffer size for copy

2005-11-18 Thread Robert Latham

(README says to ping if there's not been an ack of a patch after two
weeks.  here i go)

This patch to today's (18 Nov 2005) coreutils CVS makes copy.c
consider both the source and destination blocksize when computing
buf_size.  With this patch, src/copy.c will use the LCM of the source
and destination block sizes.  As Paul suggested, I used the buffer_lcm
routine from diffutils.

For what it's worth, this patch does not introduce any regressions
into the coreutils testsuite.

When copying from a remote filesystem with a block size of 4MB to a
filesystem with a 4k blocksize, the copy is *very* slow.  Going from a
filesystem with 4k blocks to a filesystem with 4MB blocks is much
faster.  With this patch, both operations are equally performant.

I went ahead and added a ChangeLog entry as well.  

Thanks.  I'll be more than happy to incorporate any suggestions or
comments.

==rob


-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B
diff -burpN -x CVS -x 'cscope*' -x Makefile.in -x configure -x autom4te.cache 
-x aclocal.m4 coreutils/ChangeLog coreutils.lcm/ChangeLog
--- coreutils/ChangeLog 2005-11-18 16:24:52.0 -0600
+++ coreutils.lcm/ChangeLog 2005-11-18 16:24:34.0 -0600
@@ -1,3 +1,10 @@
+2005-11-18  Rob Latham [EMAIL PROTECTED]
+   * lib/Makefile.am, lib/buffer-lcm.c, lib/buffer-lcm.h: add code to find
+ least common multiple of two values, with logic to handle unusual
+ input (taken from diffutils)
+   * src/copy.c: use the LCM of the source and dest blocksize when
+ figuring out the ideal blocksize.
+
 2005-11-17  Jim Meyering  [EMAIL PROTECTED]
 
* Version 6.0-cvs.
diff -burpN -x CVS -x 'cscope*' -x Makefile.in -x configure -x autom4te.cache 
-x aclocal.m4 coreutils/lib/buffer-lcm.c coreutils.lcm/lib/buffer-lcm.c
--- coreutils/lib/buffer-lcm.c  1969-12-31 18:00:00.0 -0600
+++ coreutils.lcm/lib/buffer-lcm.c  2005-11-18 10:04:54.0 -0600
@@ -0,0 +1,47 @@
+/* buffer-lcm.c - an lcm routine used for computing optimal buffer size
+ 
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+
+/* Least common multiple of two buffer sizes A and B.  However, if
+   either A or B is zero, or if the multiple is greater than LCM_MAX,
+   return a reasonable buffer size.  
+ 
+   This method was taken from diffutils/lib/cmpbuf.c */
+
+#include <sys/types.h>
+
+size_t
+buffer_lcm (size_t a, size_t b, size_t lcm_max)
+{
+  size_t lcm, m, n, q, r;
+
+  /* Yield reasonable values if buffer sizes are zero.  */
+  if (!a)
+    return b ? b : 8 * 1024;
+  if (!b)
+    return a;
+
+  /* n = gcd (a, b) */
+  for (m = a, n = b;  (r = m % n) != 0;  m = n, n = r)
+    continue;
+
+  /* Yield a if there is an overflow.  */
+  q = a / n;
+  lcm = q * b;
+  return lcm <= lcm_max && lcm / b == q ? lcm : a;
+}
diff -burpN -x CVS -x 'cscope*' -x Makefile.in -x configure -x autom4te.cache 
-x aclocal.m4 coreutils/lib/buffer-lcm.h coreutils.lcm/lib/buffer-lcm.h
--- coreutils/lib/buffer-lcm.h  1969-12-31 18:00:00.0 -0600
+++ coreutils.lcm/lib/buffer-lcm.h  2005-11-18 10:04:54.0 -0600
@@ -0,0 +1,23 @@
+/* buffer-lcm.c - an lcm routine used for computing optimal buffer size
+
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+/* Taken from diffutils/lib/cmpbuf.c */
+
+#include <sys/types.h>
+
+size_t buffer_lcm(size_t a, size_t b, size_t lcm_max);
diff -burpN -x CVS -x 'cscope*' -x Makefile.in -x configure -x autom4te.cache 
-x 

Re: better buffer size for copy

2005-11-07 Thread Robert Latham
On Fri, Nov 04, 2005 at 10:07:51PM -0800, Paul Eggert wrote:
> [EMAIL PROTECTED] (Robert Latham) writes:
>
> > In the time since the above thread was started, there is now an
> > implementation of lcm in src/system.h.
>
> I'd rather use something more like buffer_lcm in diffutils, since it
> handles weird cases without dumping core.
 

Ok, no problem.  In the old thread you wanted a new file under lib to
contain the implementation of buffer_lcm.  Coreutils now has a lot of
inlined routines in src/system.h, so would it be better to add
buffer_lcm to src/system.h, or stick with creating new files under
lib/ ?

Thanks
==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B




Re: better buffer size for copy

2005-11-07 Thread Paul Eggert
It's too much for an inlined function, I think.

/* Buffer primitives for comparison operations.

   Copyright (C) 1993, 1995, 1998, 2001, 2002 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 2, or (at your option)
   any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; see the file COPYING.
   If not, write to the Free Software Foundation,
   59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.  */

...

/* Least common multiple of two buffer sizes A and B.  However, if
   either A or B is zero, or if the multiple is greater than LCM_MAX,
   return a reasonable buffer size.  */

size_t
buffer_lcm (size_t a, size_t b, size_t lcm_max)
{
  size_t lcm, m, n, q, r;

  /* Yield reasonable values if buffer sizes are zero.  */
  if (!a)
    return b ? b : 8 * 1024;
  if (!b)
    return a;

  /* n = gcd (a, b) */
  for (m = a, n = b;  (r = m % n) != 0;  m = n, n = r)
    continue;

  /* Yield a if there is an overflow.  */
  q = a / n;
  lcm = q * b;
  return lcm <= lcm_max && lcm / b == q ? lcm : a;
}
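
The "weird cases" this routine handles are easiest to see with a few
concrete calls.  The sketch below repeats buffer_lcm so that it compiles
on its own and adds a hypothetical helper, buffer_lcm_examples, that is
not part of diffutils or coreutils.  The expected values were worked out
by hand from the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* buffer_lcm as quoted in this thread.  */
size_t
buffer_lcm (size_t a, size_t b, size_t lcm_max)
{
  size_t lcm, m, n, q, r;

  /* Yield reasonable values if buffer sizes are zero.  */
  if (!a)
    return b ? b : 8 * 1024;
  if (!b)
    return a;

  /* n = gcd (a, b) */
  for (m = a, n = b;  (r = m % n) != 0;  m = n, n = r)
    continue;

  /* Yield a if there is an overflow.  */
  q = a / n;
  lcm = q * b;
  return lcm <= lcm_max && lcm / b == q ? lcm : a;
}

/* Hypothetical helper: exercise the ordinary and the special cases.  */
void
buffer_lcm_examples (void)
{
  /* Ordinary cases.  */
  assert (buffer_lcm (4096, 65536, SIZE_MAX) == 65536);
  assert (buffer_lcm (4096, 6144, SIZE_MAX) == 12288);

  /* Zero inputs fall back to the other size, or to 8 KiB.  */
  assert (buffer_lcm (0, 4096, SIZE_MAX) == 4096);
  assert (buffer_lcm (512, 0, SIZE_MAX) == 512);
  assert (buffer_lcm (0, 0, SIZE_MAX) == 8 * 1024);

  /* A multiple above LCM_MAX yields A instead.  */
  assert (buffer_lcm (4096, 6144, 8192) == 4096);

  /* A product that would overflow size_t also yields A.  */
  assert (buffer_lcm (SIZE_MAX, SIZE_MAX - 1, SIZE_MAX) == SIZE_MAX);
}
```

A plain lcm would divide by zero on the zero inputs and wrap around
silently in the last case, which is presumably what "dumping core" refers
to.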




Re: better buffer size for copy

2005-11-07 Thread Robert Latham
On Mon, Nov 07, 2005 at 12:20:47PM -0800, Paul Eggert wrote:
> It's too much for an inlined function, I think.

That's what i thought you'd say.  Ok, this patch vs. today's
CVS adds buffer-lcm.h and buffer-lcm.c, adds those files to
Makefile.am,  and makes copy.c call
buffer_lcm. 

I left alone the other places that call lcm.  

Thanks for the feedback

==rob


diff -burpN -x CVS -x 'cscope*' -x Makefile.in -x configure -x autom4te.cache 
-x aclocal.m4 coreutils/lib/buffer-lcm.c coreutils.lcm/lib/buffer-lcm.c
--- coreutils/lib/buffer-lcm.c  1969-12-31 18:00:00.0 -0600
+++ coreutils.lcm/lib/buffer-lcm.c  2005-11-07 14:46:29.0 -0600
@@ -0,0 +1,47 @@
+/* buffer-lcm.c - an lcm routine used for computing optimal buffer size
+ 
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+
+/* Least common multiple of two buffer sizes A and B.  However, if
+   either A or B is zero, or if the multiple is greater than LCM_MAX,
+   return a reasonable buffer size.  
+ 
+   This method was taken from diffutils/lib/cmpbuf.c */
+
+#include <sys/types.h>
+
+size_t
+buffer_lcm (size_t a, size_t b, size_t lcm_max)
+{
+  size_t lcm, m, n, q, r;
+
+  /* Yield reasonable values if buffer sizes are zero.  */
+  if (!a)
+    return b ? b : 8 * 1024;
+  if (!b)
+    return a;
+
+  /* n = gcd (a, b) */
+  for (m = a, n = b;  (r = m % n) != 0;  m = n, n = r)
+    continue;
+
+  /* Yield a if there is an overflow.  */
+  q = a / n;
+  lcm = q * b;
+  return lcm <= lcm_max && lcm / b == q ? lcm : a;
+}
diff -burpN -x CVS -x 'cscope*' -x Makefile.in -x configure -x autom4te.cache 
-x aclocal.m4 coreutils/lib/buffer-lcm.h coreutils.lcm/lib/buffer-lcm.h
--- coreutils/lib/buffer-lcm.h  1969-12-31 18:00:00.0 -0600
+++ coreutils.lcm/lib/buffer-lcm.h  2005-11-07 14:46:26.0 -0600
@@ -0,0 +1,23 @@
+/* buffer-lcm.c - an lcm routine used for computing optimal buffer size
+
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+/* Taken from diffutils/lib/cmpbuf.c */
+
+#include <sys/types.h>
+
+size_t buffer_lcm(size_t a, size_t b, size_t lcm_max);
diff -burpN -x CVS -x 'cscope*' -x Makefile.in -x configure -x autom4te.cache 
-x aclocal.m4 coreutils/lib/Makefile.am coreutils.lcm/lib/Makefile.am
--- coreutils/lib/Makefile.am   2005-10-05 09:54:17.0 -0500
+++ coreutils.lcm/lib/Makefile.am   2005-11-07 14:49:01.0 -0600
@@ -27,6 +27,7 @@ DEFS += -DLIBDIR=\"$(libdir)\"
 
 libcoreutils_a_SOURCES = \
   allocsa.c allocsa.h \
+  buffer-lcm.c buffer-lcm.h \
   euidaccess.h \
   exit.h \
   fprintftime.c fprintftime.h \
diff -burpN -x CVS -x 'cscope*' -x Makefile.in -x configure -x autom4te.cache 
-x aclocal.m4 coreutils/src/copy.c coreutils.lcm/src/copy.c
--- coreutils/src/copy.c2005-09-24 22:07:33.0 -0500
+++ coreutils.lcm/src/copy.c2005-11-07 14:42:27.0 -0600
@@ -31,6 +31,7 @@
 
 #include "system.h"
 #include "backupfile.h"
+#include "buffer-lcm.h"
 #include "copy.h"
 #include "cp-hash.h"
 #include "dirname.h"
@@ -291,7 +292,7 @@ copy_reg (char const *src_name, char con
   goto close_src_and_dst_desc;
 }
 
-  buf_size = ST_BLKSIZE (sb);
+  buf_size = buffer_lcm(ST_BLKSIZE (sb), ST_BLKSIZE(src_open_sb), SIZE_MAX);
 
   /* Even with --sparse=always, try to create holes only
  if the destination is a regular file.  */

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B



Re: better buffer size for copy

2005-11-04 Thread Paul Eggert
[EMAIL PROTECTED] (Robert Latham) writes:

> In the time since the above thread was started, there is now an
> implementation of lcm in src/system.h.

I'd rather use something more like buffer_lcm in diffutils, since it
handles weird cases without dumping core.

