On 3/7/06, Carl Lowenstein <[EMAIL PROTECTED]> wrote:
> On 3/7/06, Michael O'Keefe <[EMAIL PROTECTED]> wrote:
> > > Does not xargs(1) accept the output from find(1) as it arrives, and
> > > ship it off to grep(1) in suitable buffersful without waiting for
> > > find(1) to finish?
> >
> > No, xargs takes from stdin a list and builds the argv[] for its command.
> > So worst case scenario it will wait for 1023 "lines" of input (without
> > the minus-ell flag) before doing its fork().
> > This can of course take a LONG time to accumulate if you are walking
> > many NFS mounts. But if you use a suitable minus-ell flag, it will still
> > be faster than -exec
>
> Are we not still saying the same thing -- xargs does not wait for
> stdin to finish (from the previous process) but does wait for some
> amount of input to accumulate.
> "worst case 1023 lines" must be implementation-dependent. My memory
> of xargs dates back to when it was a lot less than that.
> In any case it can be changed by the --max-lines=<lines> or -l<lines>
> or -L<lines> switch.
Interesting experimental evidence, since my previous experiment was
still lying around in another window.
[EMAIL PROTECTED] include]$ time find . -type f -print | xargs -L477 grep -i \
largefile64 /dev/null > /tmp/find_largefile64
real 0m0.229s
user 0m0.067s
sys 0m0.185s
[EMAIL PROTECTED] include]$ time find . -type f -print | xargs -L478 grep -i \
largefile64 /dev/null > /tmp/find_largefile64
xargs: argument list too long
real 0m0.196s
user 0m0.054s
sys 0m0.107s
[EMAIL PROTECTED] include]$
So the default value is not hard-wired to 1023. Presumably without a
specified number of lines, xargs maxes out on some other property of
its input. Food for thought:
[EMAIL PROTECTED] include]$ find . -type f -print | head -477 | wc -c
13095
[EMAIL PROTECTED] include]$ find . -type f -print | head -478 | wc -c
13114
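One way to check that the cutoff tracks bytes rather than line count is
to hold the number of inputs fixed and vary their length. A minimal
sketch, assuming an xargs that honors the POSIX -s size flag; the
count_batches helper, the 200 synthetic names, and the 1024-byte cap are
made up for illustration and are not from the run above:

```shell
# count_batches LEN: feed 200 synthetic names, each LEN digits long,
# through xargs with an illustrative 1024-byte cap (-s) and print how
# many separate echo invocations (batches) xargs needed.
count_batches() {
    awk -v len="$1" 'BEGIN {
        fmt = "%0" len "d\n"              # zero-padded name of width len
        for (i = 0; i < 200; i++) printf fmt, i
    }' |
        xargs -s 1024 echo |              # one output line per batch
        awk 'END { print NR }'            # count the batches
}

echo "short names: $(count_batches 4) batch(es)"
echo "long names:  $(count_batches 40) batch(es)"
```

If the limit were purely a line count, both calls would report the same
number of batches.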
More food for thought. What size of argument lists are produced by
xargs in this case? A fair amount of fumbling around results in the
following command:
[EMAIL PROTECTED] include]$ find . -type f | xargs | \
gawk '{printf ("%d ", length); gsub (/[^ ]/,""); print length}'
22286 938
22293 878
22294 1021
22298 749
22281 569
22277 517
22278 561
22307 581
22291 660
16836 597
[EMAIL PROTECTED] include]$
Explanation: xargs, run with no command, defaults to echo, so it splits
the long list into acceptable pieces (10 of them in this case) and
prints each piece on one line. gawk then prints the number of characters
in each piece and the number of spaces in it. The number of spaces is
within 1 of the number of original find(1) output lines that xargs
concatenated.
Roughly speaking, xargs(1) seems to produce an argv[] list of just
over 22000 characters, aggregating roughly 500 to 1000 of its inputs to
do this. Again YMMV.
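For anyone who wants to repeat the measurement without my include
directory, here is a rough sketch on synthetic input. It counts fields
with NF instead of counting spaces, so the second column is the argument
count itself. The batch_report name and the ./include/sys/header_ prefix
are invented for the example, and the batch sizes will depend on the
local xargs and its default limits:

```shell
# batch_report N: pipe N synthetic path names (hypothetical, not the
# files from the original run) through xargs with its default limits,
# then print "<characters> <arguments>" for each batch xargs emits.
batch_report() {
    seq 1 "$1" | sed 's|^|./include/sys/header_|' |
        xargs echo |                      # one output line per batch
        awk '{ print length($0), NF }'    # line length and argument count
}

batch_report 20000
```

The second column should add up to the number of names fed in, which is
a quick sanity check that no input was dropped or split.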
carl
--
carl lowenstein marine physical lab u.c. san diego
[EMAIL PROTECTED]