On Saturday 08 November 2008 20:05:25 Jim Meyering wrote:
> Andrew McGill <[EMAIL PROTECTED]> wrote:
> > Greetings coreutils folks,
> >
> > There are a number of interesting filesystems (glusterfs, lustre? ...
> > NFS) which could benefit from userspace utilities doing certain
> > operations in parallel.  (I have a very slow glusterfs installation
> > that makes me think that some things can be done better.)
> >
> > For example, copying a number of files is currently done in series ...
> >     cp a b c d e f g h dest/
> > but, on certain filesystems, it would be roughly twice as efficient if
> > implemented in two parallel threads, something like:
> >     cp a c e g dest/ &
> >     cp b d f h dest/
> > since the source and destination files can be stored on multiple
> > physical volumes.
>
> How about parallelizing it via xargs, e.g.,
>
>     $ echo a b c d e f g h | xargs -t -n4 --no-run-if-empty \
>         --max-procs=2 -- cp --target-directory=dest
>     cp --target-directory=dest a b c d
>     cp --target-directory=dest e f g h
>
> Obviously the above is tailored (-n4) to your 8-input example.
> In practice, you'd use a larger number, unless latency is
> so high as to dwarf the cost of extra "fork/exec" syscalls,
> in which case even -n1 might make sense.

I ran the command above with md5sum as the command, and got missing
lines in the output.  I optimistically hoped that would not happen!
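The same idea extends to arbitrary file names if the list comes from
find -print0 rather than echo.  A minimal sketch (untested; the src/
and dest/ names and the -n 64 / --max-procs=2 numbers are only
placeholders):

    # copy all regular files from src/ into dest/, two cp processes
    # at a time; -print0/-0 keep awkward file names intact, and -n 64
    # batches files per cp to amortise the fork/exec cost
    mkdir -p dest
    find src -maxdepth 1 -type f -print0 |
      xargs -0 -n 64 --max-procs=2 cp --target-directory=dest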
> mv and ln also accept the --target-directory=dest option.
>
> > Similarly, ls -l . will readdir(), and then stat() each file in the
> > directory.  On a filesystem with high latency, it would be faster to
> > issue the stat() calls asynchronously, and in parallel, and then
> > collect the results for
>
> If you can demonstrate a large performance gain on
> systems that many people use, then maybe...
>
> There is more than a little value in keeping programs
> like those in the coreutils package relatively simple,
> but if the cost(maintenance+portability burden)/benefit
> ratio is low enough, then anything is possible.
>
> For example, a well-encapsulated, optionally-threaded
> "stat_all_dir_entries" API might be useful in some situations.

So a relatively small change for parallel stat() in "ls" could fly.

> If getting any eventual patch into upstream coreutils is
> important to you, be sure there is some consensus on this
> list before doing a lot of work on it.

Any ideas on how to do a parallel cp / mv in a way that is not
Considered Harmful?  Maybe prefetch_files(max_bytes,file1,...,NULL) ... aargh.

> > display.  (This could improve performance for NFS, in proportion to
> > the latency and the number of threads.)
> >
> > Question: Is there already a set of "improved" utilities that
> > implement this kind of technique?
>
> Not that I know of.
>
> > If not, would this kind of performance enhancement be considered
> > useful?
>
> It's impossible to say without knowing more.

On the (de?)merits of xargs for parallel processing: what would you
expect this to do --

    find -type f -print0 | xargs -0 -n 8 --max-procs=16 md5sum >& ~/md5sums
    sort -k2 < ~/md5sums > md5sums.sorted

compared to this?

    find -type f -print0 | xargs -0 md5sum >& ~/md5sums
    sort -k2 < ~/md5sums > md5sums.sorted

I was a little surprised that on my system the parallel run (the first
version) loses around 1 line of output per thousand (md5sum of 22 GB in
mostly small files).

Is there a correct way to do md5sums in parallel without having a shared
output buffer which eats output (I presume) -- or is losing output when
haphazardly combining output streams actually strange and unusual?
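One workaround that seems to sidestep the shared output descriptor
entirely (a sketch only, not benchmarked on the tree above; the $outdir
temporary directory and the -n 8 / --max-procs=16 numbers are just the
placeholders from the example): let each xargs-spawned shell append to
its own file, named after its PID, and only merge the pieces once
everything has finished:

    # every sh invocation writes to its own $outdir/out.<pid>, so no two
    # concurrent md5sum processes ever share a write position
    outdir=$(mktemp -d)
    find . -type f -print0 |
      xargs -0 -n 8 --max-procs=16 sh -c 'md5sum "$@" >> "$0/out.$$"' "$outdir"
    cat "$outdir"/out.* > ~/md5sums
    sort -k2 < ~/md5sums > md5sums.sorted

(Here "$outdir" becomes $0 inside the sh -c script and the file names
supplied by xargs become "$@"; whether that is the cleanest idiom for
this, I don't know.)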
