Re: Overriding ARG_MAX

2002-01-08 Thread Garance A Drosihn

At 10:54 PM -0500 1/4/02, David Miller wrote:
> What I usually want to do is something more like ls *.out |wc -l,
> or grep something *.data or cat *.foo | grep something.
>
> I have rebuilt the system in the past after greatly expanding
> ARG_MAX, and that does what I want.  I'm just looking for an easy
> way to preserve it across cvsups, not looking for alternate ways
> to list the files in a directory:)

While greatly expanding ARG_MAX might do what you want, it is a bad
idea as there are a number of side-effects to doing that.  You are not
just fixing your problem, you are greatly increasing the memory usage
of many things in the system -- some of which are going to assume the
official POSIX setting for ARG_MAX (intentionally or unintentionally)
no matter what you change it to.  That is a mighty big hammer to swing
to fix the problem you're talking about, and it's a hammer that you're
going to have to keep expanding as you get more files to process.

I doubt you'll be thrilled with this answer, as I am also going to
ignore your direct question to answer what *I* consider to be the
bigger question, but I would do this some other way.  If it were me,
I would write a script in perl or ruby which would do the operations
I feel I need to do on these directories of files.  Maybe I'd even
generalize it, so I could feed it normal-looking commands, and the
script would know how to break up the list of files to get the right
results -- without going over ARG_MAX.  This way you don't have to
care about changing the size of ARG_MAX.
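
A minimal sketch of that idea, in plain sh rather than perl or ruby, for illustration only.  The function name run_on_matches and its argument order are made up here.  It assumes a POSIX find that supports "-exec ... {} +", which hands the command as many file names as fit per invocation, so no single call goes over ARG_MAX.

```shell
# run_on_matches DIR PATTERN COMMAND [ARGS...]
# Hypothetical helper: run COMMAND on every regular file directly inside
# DIR whose name matches PATTERN.  find does the batching.
run_on_matches() {
    dir=$1
    pattern=$2
    shift 2
    (
        cd "$dir" || exit 1
        # "! -name . -prune" keeps find from descending into subdirectories.
        # "{} +" appends a batch of path names to the end of COMMAND.
        find . ! -name . -prune -type f -name "$pattern" -exec "$@" {} +
    )
}
```

For example, run_on_matches . '*.data' grep something /dev/null would search every *.data file in the current directory, however many there are.  The pattern has to be quoted so that find, not the shell, expands it.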

-- 
Garance Alistair Drosehn            =   [EMAIL PROTECTED]
Senior Systems Programmer               or  [EMAIL PROTECTED]
Rensselaer Polytechnic Institute        or  [EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: Overriding ARG_MAX

2002-01-05 Thread Terry Lambert

David Miller wrote:
> > Probably, you are doing something which you aren't telling us,
> > like saying ls *.c | wc -l or otherwise using globbing that
> > the shell expands to too large a list.
> >
> > The easy answer is use ``find'' instead of ``ls''.
>
> Indeed, but it doesn't answer the basic question, which was: Is there any
> easy way to override ARG_MAX (or arbitrary other parameters) in make.conf
> or a config file, or something else altogether?

No, there is no easy way.  The limit is there in the POSIX
standard, and it exists because of the need to pass data
from a parent to a child over a fork + exec.
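
To see the limit a given system actually enforces, getconf can report it.  A small sketch; the variable name limit is only for illustration, and the value printed differs between systems and releases.

```shell
# Ask the system for ARG_MAX: the maximum combined size, in bytes, of the
# arguments and environment that exec will pass to a new program.
limit=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $limit bytes"
```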

The correct answer is to use find, and either pipe or
call xargs on the output.
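
A sketch of both variants, for illustration.  The function names are made up, and the xargs form assumes file names without whitespace, quotes or backslashes, because xargs splits its input on those by default.

```shell
# count_out: count the *.out entries in the current directory.
# The pattern is quoted, so find does the matching and the shell never
# builds a huge argument list.  "! -name . -prune" keeps find from
# descending into subdirectories.
count_out() {
    find . ! -name . -prune -name '*.out' -print | wc -l
}

# grep_data PATTERN: search the *.data files for PATTERN.
# xargs runs grep as many times as needed to stay under ARG_MAX.
grep_data() {
    find . ! -name . -prune -name '*.data' -print | xargs grep "$1"
}
```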

No matter what you do, you are not going to overcome the
limit in programs that are already built (like your shell),
which included the header file and referenced the manifest
constant where the limit is defined.  So compatibility with
existing binary applications already limits any change you
might want to make to this.

-- Terry




Re: Overriding ARG_MAX

2002-01-05 Thread Oliver Fromme

David Miller [EMAIL PROTECTED] wrote:
> What I usually want to do is something more like ls *.out |wc -l

ls | grep '\.out$' | wc -l

> or grep something *.data

ls | grep '\.data$' | xargs grep something

> or cat *.foo | grep something.

ls | grep '\.foo$' | xargs cat | grep something

In general, changing ARG_MAX is bad for several reasons.
 - It makes your commands nonportable.
 - It only hides the actual problem without really solving
   it.
 - It just moves the limit further away, but it doesn't
   remove it.

The alternative commands that I suggested above are
portable and work for _any_ number of files, no matter
what the ARG_MAX limit is.  Sure, they're a bit longer
to type, but if you're concerned about that, then you
could wrap them into small scripts or shell functions.
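
As a rough sketch, such shell functions could look like the following.  The names count_ext and grep_ext are invented for this example, and the extension is assumed to contain no regular-expression metacharacters.

```shell
# count_ext EXT: count the files in the current directory ending in .EXT
count_ext() {
    ls | grep "\.$1\$" | wc -l
}

# grep_ext EXT PATTERN: search those files for PATTERN
grep_ext() {
    ls | grep "\.$1\$" | xargs grep "$2"
}
```

With these, count_ext out replaces ls *.out | wc -l, and grep_ext data something replaces grep something *.data.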

Regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

All that we see or seem is just a dream within a dream (E. A. Poe)




Re: Overriding ARG_MAX

2002-01-05 Thread Leo Bicknell


Ok, there are several issues here that I just have to point out. :-)

In a message written on Sat, Jan 05, 2002 at 09:48:40PM +0100, Oliver Fromme wrote:
> ls | grep '\.out$' | wc -l

One process shorter: find . -maxdepth 1 -name '*.out' | wc -l

> ls | grep '\.data$' | xargs grep something

Two problems: first, use find; second, use /dev/null:

find . -maxdepth 1 -name '*.data' | xargs grep something /dev/null

> ls | grep '\.foo$' | xargs cat | grep something

A completely unnecessary use of cat.


One of the often missed things is the use of /dev/null on grep in
this case.  If you grep a single file (grep string file) then it
prints the matches.  If you grep multiple files (grep string file1
file2) it prints the matches with the file name prepended.

When using xargs, if the last call to grep happens to be left with
just one file, you will get all of your results prepended with the
file name, except for those from the last file, which will be just
the matching lines.  Adding /dev/null ensures grep always has two
or more files.

Another fix would be:

find . -maxdepth 1 -name '*.data' | xargs grep -H something

I don't believe old greps had -H, though, which is where the /dev/null
trick came from.
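
The difference is easy to see in a scratch directory.  A small demonstration, with a made-up function name and file name, and assuming mktemp -d is available:

```shell
# demo_devnull: show how grep's output changes with the number of operands.
demo_devnull() {
    scratch=$(mktemp -d) || return 1
    printf 'needle\n' > "$scratch/only.data"
    (
        cd "$scratch" || exit 1
        # One file operand: matching lines are printed bare.
        grep needle only.data
        # /dev/null as an extra operand: every match is prefixed with its
        # file name, even though only one real file was searched.
        grep needle only.data /dev/null
    )
    rm -rf "$scratch"
}
demo_devnull
```

The first grep prints "needle", the second prints "only.data:needle".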

In any event, using find is much better, not so much for this example,
but because it allows you to do things like check permissions in a 
portable way:

find . -name '*.data' -perm 444 | xargs grep -H something

You can't do that with ls | grep, since only the filenames make it
to grep.
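
A sketch of the permission filter as a function, for illustration.  The name grep_readonly is made up; the sketch stays in the current directory, uses /dev/null in place of -H, and assumes file names without whitespace.  Note that -perm 444 matches only files whose mode is exactly 0444.

```shell
# grep_readonly PATTERN: search the *.data files in the current directory
# whose permission bits are exactly r--r--r-- (0444).
grep_readonly() {
    find . ! -name . -prune -type f -name '*.data' -perm 444 -print |
        xargs grep "$1" /dev/null
}
```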

-- 
   Leo Bicknell - [EMAIL PROTECTED] - CCIE 3440
PGP keys at http://www.ufp.org/~bicknell/
Read TMBG List - [EMAIL PROTECTED], www.tmbg.org




Re: Overriding ARG_MAX

2002-01-05 Thread Oliver Fromme

Leo Bicknell [EMAIL PROTECTED] wrote:
> In a message written on Sat, Jan 05, 2002 at 09:48:40PM +0100, Oliver Fromme wrote:
> > ls | grep '\.out$' | wc -l
>
> One process shorter: find . -maxdepth 1 -name '*.out' | wc -l

OK, but I tried to be as portable as possible, while
-maxdepth is not portable (e.g. Solaris doesn't have it).

> > ls | grep '\.data$' | xargs grep something
>
> Two problems: first, use find; second, use /dev/null:
> find . -maxdepth 1 -name '*.data' | xargs grep something /dev/null

But that one will give you a different result than the
original command.

I'm well aware of the /dev/null issue, but this wasn't
the question in the first place.

> > ls | grep '\.foo$' | xargs cat | grep something
>
> A completely unnecessary use of cat.

Yep, already in the original command, so I kept it.
(I had already typed a comment about useless use of cat
in my mail, but deleted it again, thinking that it was
intentional to achieve the same effect as grep -h.)

> You can't do that with ls | grep, since only the filenames make it
> to grep.

You could do it with ls -l | awk, though.  ;-)

Regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

All that we see or seem is just a dream within a dream (E. A. Poe)




Re: Overriding ARG_MAX

2002-01-05 Thread Oliver Fromme

Sorry for replying to myself, but this just came to my
mind ...

Oliver Fromme [EMAIL PROTECTED] wrote:
> David Miller [EMAIL PROTECTED] wrote:
> > What I usually want to do is something more like ls *.out |wc -l
>
> ls | grep '\.out$' | wc -l

A smaller solution would be:
echo *.out | wc -w

Note that the ARG_MAX limitation does not apply to echo,
because it is a shell-builtin.
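
A variant of the same trick, sketched here with a made-up function name.  It assumes printf is a builtin in the shell in use (it is in bash and dash, for instance), so no exec takes place and ARG_MAX does not come into play.  Unlike echo piped to wc -w, it counts one line per name, so names containing spaces are counted once; names containing newlines would still be miscounted.

```shell
# count_matches NAME...: count the names the shell expanded a glob to.
# Called as: count_matches *.out
count_matches() {
    # If the glob matched nothing, the shell passes the pattern through
    # literally; treat a non-existent first name as "no matches".
    [ -e "$1" ] || { echo 0; return; }
    printf '%s\n' "$@" | wc -l
}
```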

Similarly:

> > or grep something *.data
>
> ls | grep '\.data$' | xargs grep something

echo *.data | xargs grep something

> > or cat *.foo | grep something.
>
> ls | grep '\.foo$' | xargs cat | grep something

echo *.foo | xargs cat | grep something

(Yes, I know, useless cat.  The same can probably be
achieved like this:
echo *.foo | xargs grep -h something
)

Regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

All that we see or seem is just a dream within a dream (E. A. Poe)




Overriding ARG_MAX

2002-01-04 Thread David Miller

Apologies if this belongs on -questions.  I couldn't find what I needed in
the archives or handbook.


I have a system where I need/want to handle lots of files in a single
directory.  Lots as in 100-200K files.  ls | wc -l breaks because the
value of ARG_MAX in sys/syslimits.h is too small.  If I change it from
65536 to 4meg and rebuild the world it works fine.

I do cvsup from time to time and have to re-edit the file, which I usually
forget.  Is there a way to set this parameter in make.conf or the config
file so it's always done when compiling the kernel?


Thanks in advance,

--- David





Re: Overriding ARG_MAX

2002-01-04 Thread Brooks Davis

On Fri, Jan 04, 2002 at 09:50:45PM -0500, David Miller wrote:
> Apologies if this belongs on -questions.  I couldn't find what I needed in
> the archives or handbook.

It almost certainly did.

> I have a system where I need/want to handle lots of files in a single
> directory.  Lots as in 100-200K files.  ls | wc -l breaks because the
> value of ARG_MAX in sys/syslimits.h is too small.  If I change it from
> 65536 to 4meg and rebuild the world it works fine.

ls | xargs wc -l

would work with an arbitrary number of files.

> I do cvsup from time to time and have to re-edit the file, which I usually
> forget.  Is there a way to set this parameter in make.conf or the config
> file so it's always done when compiling the kernel?

One solution is to use cvsup to maintain a local copy of the cvs
tree and check the source tree out using cvs.  This means that cvs's
automerging support will keep your changes untouched.  You may have to
resolve an occasional conflict if something changes near your changes,
but otherwise your changes will remain intact.

-- Brooks

-- 
Any statement of the form "X is the one, true Y" is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529  9BF0 5D8E 8BE9 F238 1AD4





Re: Overriding ARG_MAX

2002-01-04 Thread Terry Lambert

David Miller wrote:
> Apologies if this belongs on -questions.  I couldn't find what I needed in
> the archives or handbook.
>
> I have a system where I need/want to handle lots of files in a single
> directory.  Lots as in 100-200K files.  ls | wc -l breaks because the
> value of ARG_MAX in sys/syslimits.h is too small.  If I change it from
> 65536 to 4meg and rebuild the world it works fine.

I don't believe you.  There is no ARG_MAX limit on pipes.

Probably, you are doing something which you aren't telling us,
like saying ls *.c | wc -l or otherwise using globbing that
the shell expands to too large a list.
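
One rough way to see how close a glob comes to the limit is to add up the bytes its expansion would occupy.  This is only a sketch: the function name glob_bytes is made up, it assumes printf is a shell builtin, and it ignores the environment and any per-argument overhead, which may also count against ARG_MAX.

```shell
# glob_bytes NAME...: bytes the given names would occupy as exec arguments
# (each name plus its terminating NUL byte).
# Called as: glob_bytes *.c
glob_bytes() {
    printf '%s\0' "$@" | wc -c
}
```

Comparing the result with the output of getconf ARG_MAX gives an idea of whether ls *.c can work at all.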

The easy answer is use ``find'' instead of ``ls''.

-- Terry




Re: Overriding ARG_MAX

2002-01-04 Thread Terry Lambert

Brooks Davis wrote:
> > I have a system where I need/want to handle lots of files in a single
> > directory.  Lots as in 100-200K files.  ls | wc -l breaks because the
> > value of ARG_MAX in sys/syslimits.h is too small.  If I change it from
> > 65536 to 4meg and rebuild the world it works fine.
>
> ls | xargs wc -l
>
> would work with an arbitrary number of files.

No, it wouldn't.  First off, you would be line counting the file
contents, not the number of files.  Second, the ls command alone
will *never* hit ARG_MAX, and neither will wc -l, if ls is piped
to it to count the number of files.  He's obviously using a globbing
expression he's not telling us about, and the message is from the
shell expansion of the expression.

Thirdly, even if it was the number of lines in the files he wanted
to count, the totals will be off if xargs were to invoke wc -l
multiple times, so the command as written can't work anyway.
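
If a grand total of lines were the goal, one way around the multiple-invocation problem is to concatenate the files first and run wc once.  A sketch, with a made-up function name, assuming a find that supports "-exec ... {} +":

```shell
# total_lines: total number of lines in all regular files directly inside
# the current directory.  cat may be run several times by find, but wc
# sees one continuous stream, so there is only one count.
total_lines() {
    find . ! -name . -prune -type f -exec cat {} + | wc -l
}
```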

-- Terry




Re: Overriding ARG_MAX

2002-01-04 Thread David Miller

On Fri, 4 Jan 2002, Terry Lambert wrote:

> David Miller wrote:
> > Apologies if this belongs on -questions.  I couldn't find what I needed in
> > the archives or handbook.
> >
> > I have a system where I need/want to handle lots of files in a single
> > directory.  Lots as in 100-200K files.  ls | wc -l breaks because the
> > value of ARG_MAX in sys/syslimits.h is too small.  If I change it from
> > 65536 to 4meg and rebuild the world it works fine.
>
> I don't believe you.  There is no ARG_MAX limit on pipes.

My apologies.  Leo also noted that.

I was trying to give a quick example to a general problem and I got a
little too quick.

What I usually want to do is something more like ls *.out |wc -l, or 
grep something *.data or cat *.foo | grep something.

I have rebuilt the system in the past after greatly expanding ARG_MAX, and
that does what I want.  I'm just looking for an easy way to preserve it
across cvsups, not looking for alternate ways to list the files in a
directory:)

 
> Probably, you are doing something which you aren't telling us,
> like saying ls *.c | wc -l or otherwise using globbing that
> the shell expands to too large a list.
>
> The easy answer is use ``find'' instead of ``ls''.

Indeed, but it doesn't answer the basic question, which was: Is there any
easy way to override ARG_MAX (or arbitrary other parameters) in make.conf
or a config file, or something else altogether?

--- David






Re: Overriding ARG_MAX

2002-01-04 Thread Brooks Davis

On Fri, Jan 04, 2002 at 07:53:52PM -0800, Terry Lambert wrote:
> Brooks Davis wrote:
> > > I have a system where I need/want to handle lots of files in a single
> > > directory.  Lots as in 100-200K files.  ls | wc -l breaks because the
> > > value of ARG_MAX in sys/syslimits.h is too small.  If I change it from
> > > 65536 to 4meg and rebuild the world it works fine.
> >
> > ls | xargs wc -l
> >
> > would work with an arbitrary number of files.
>
> No, it wouldn't.

Correct, I misread what he was trying to do.  The second part of my
answer is a working solution to do what he asked to do.

-- Brooks

-- 
Any statement of the form "X is the one, true Y" is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529  9BF0 5D8E 8BE9 F238 1AD4


