Re: limit to number of files seen by ls?

2009-07-30 Thread Karl Vogel
 Karl Vogel wrote:

K The main reason I stick with 1000 is because directories are read
K linearly unless you're using something like ReiserFS...

 On Sun, 26 Jul 2009 08:34:50 +0100, 
 Matthew Seaman m.sea...@infracaninophile.co.uk said:

M You mean filesystems like FreeBSD UFS2 with DIRHASH?  The problem with
M linear time scanning of directory contents has been solved for awhile...

   Sure, that's why I said "something like".  Not everyone is using the
   latest and greatest, especially if you have anything to do with the
   public sector.  It's not unusual to see people using servers that
   are 8-10 years old and run around the clock, and they can't upgrade
   because they're not allowed the downtime.

   I'm not saying we should act like everyone's using the moral equivalent
   of FreeBSD 2.2.7.  I am saying that if you have a design decision to
   make, you'll solve more problems than you cause if you add the extra
   2-3 lines of code to hash a huge directory into several smaller ones.
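
   Something along these lines, say -- an untested /bin/sh sketch that
   buckets files on the first hex digit of an MD5 hash of each name.
   The directory path is made up, and md5(1) is the FreeBSD spelling
   (Linux would want md5sum):

       #!/bin/sh
       # Spread one huge directory across 16 subdirectories, keyed on
       # the first hex digit of an MD5 hash of each filename.
       cd /some/huge/directory || exit 1
       for f in *; do                          # glob expands inside the
           [ -f "$f" ] || continue             # shell: no execve(2) limit
           bucket=$(echo "$f" | md5 | cut -c1) # 0-9 or a-f
           mkdir -p "$bucket"
           mv "$f" "$bucket/"
       done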

-- 
Karl Vogel  I don't speak for the USAF or my company

Since we have to speak well of the dead, let's knock them while they're alive.
 --John Sloan


Re: limit to number of files seen by ls?

2009-07-27 Thread John Almberg

H'mmm... I haven't come back on this question, because I want my next
question to be an intelligent one, but I'm having a hard time
understanding what is going on. I'm reading up on this, and as soon
as I know enough to either understand the issue, or ask an
intelligent question, I will do so...


When a program is executed with arguments, there is a system-imposed limit
on the size of this argument list. On FreeBSD this limit can be seen with
sysctl kern.argmax, which is the length in bytes.

When you do ls *, what really happens is that the shell expands the asterisk
to all entries in the current directory, except entries starting with a dot
(hidden files and directories). As a result, ls is really called as:

ls file1 file2 ... fileN

If the combined length of file1 through fileN is bigger than kern.argmax,
you will get an "Argument list too long" error.


Mel,

What I get is this:

   % sysctl kern.argmax
   kern.argmax: 262144

Which is why I'm starting to think that (a) my problem is different or
(b) I'm so clueless that there isn't any problem at all, and I'm just
not understanding something (most likely scenario!).


I'm going to write a little script that generates a bunch of files to
test my hypothesis that once I get more than n files in a directory,
some things stop working correctly, like ls and ftp directory listings,
and to discover the value of n. That will give me some hard data to
work with.
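
Something like this, probably -- an untested /bin/sh sketch, where the
5000 count and the /tmp/lstest path are just placeholders for the test:

#!/bin/sh
# Create a pile of numbered files, then compare two ways of counting them.
mkdir -p /tmp/lstest && cd /tmp/lstest || exit 1
i=1
while [ "$i" -le 5000 ]; do
    touch "file$i"
    i=$((i + 1))
done
ls | wc -l      # ls reads the directory itself: should report all 5000
ls * | wc -l    # the shell expands the glob first: this is where an
                # argument-list limit would bite, once the names add up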


This problem has been nagging at me for a while, so it's time I nail it
down once and for all...


I'll be back...

-- John





Re: limit to number of files seen by ls?

2009-07-27 Thread Chris Cowart
John Almberg wrote:
> Which is why I'm starting to think that (a) my problem is different
> or (b) I'm so clueless that there isn't any problem at all, and I'm
> just not understanding something (most likely scenario!)

It looks to me like the thread began assuming that you must be typing
`ls *` in order to run into problems. I think we'll have better luck
helping you if you tell us exactly what it is you're typing when you
observe the problem.

-- 
Chris Cowart
Network Technical Lead
Network & Infrastructure Services, RSSP-IT
UC Berkeley




Re: limit to number of files seen by ls?

2009-07-27 Thread Mel Flynn
On Monday 27 July 2009 12:42:32 Chris Cowart wrote:
> John Almberg wrote:
>> Which is why I'm starting to think that (a) my problem is different
>> or (b) I'm so clueless that there isn't any problem at all, and I'm
>> just not understanding something (most likely scenario!)
>
> It looks to me like the thread began assuming that you must be typing
> `ls *` in order to run into problems.

Yeah, I just noticed that too. So how did you determine there should be ~4000
files in the directory when ls shows ~2300? Also, does ls give an error
message?
ls -l > /tmp/out should clear that up, and you can use wc -l /tmp/out to see
how many files are returned.
-- 
Mel


Re: limit to number of files seen by ls?

2009-07-26 Thread Matthew Seaman

Karl Vogel wrote:


   That arbitrary number has worked very nicely for me for 20 years under
   Solaris, Linux, and several BSD variants.  The main reason I stick
   with 1000 is because directories are read linearly unless you're using
   something like ReiserFS, and I get impatient waiting for more than that
   number of filenames to be sorted when using ls.


Um... you mean filesystems like FreeBSD UFS2 with DIRHASH?  The problem
with linear-time scanning of directory contents has been solved for a while,
and directories containing on the order of 10^5 files are nowadays usable.

You are entirely right that with ls(1) one of the biggest causes of delay
when returning a long list of files is actually sorting the output, but
that would happen whatever filesystem you're using.  If this is a problem,
most of the time you can avoid the sorting altogether by using 'ls -f'.


   If your application is trying to create hundreds of thousands or
   millions of files in any one directory, or you're creating lots of
   200-character filenames from hell, then your design is a poor match for
   most varieties of Unix; small directories perform better than enormous
   ones, and lots of commonly-used scripts and programs will fall over
   when handed zillion-file argument lists.


Yep.  You are correct here, but I think your concept of 'small' could grow
by an order of magnitude when using a reasonably up-to-date OS -- anything
produced in the last 3--5 years should be able to cope with directories
containing tens of thousands of files without slowing down disastrously.

Most of the limitations on the number of arguments a command will accept
are due to the shell imposing a maximum, which in turn is due to limitations
on the size of the argv[] array allowed for execve(2).  This is a general
limitation and applies to anything listed on a command line, not just file
names.  It's fairly rare to run into this as a practical limitation during
most day-to-day use, and there are various tricks, like using xargs(1), to
extend the usable range.  Even so, for really big applications that need
to process long lists of data, you'd have to code the whole thing to
input the list via a file or pipe.
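
A quick illustration -- getconf(1) reports the same execve(2) limit under
its POSIX name (262144 being the default on recent FreeBSD), and xargs(1)
chops a long file list into batches that fit under it; the '*.jpg' pattern
is just an example:

   % getconf ARG_MAX
   262144
   % find . -name '*.jpg' -print0 | xargs -0 ls -lh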

Long filenames per se aren't a bad thing -- a descriptive filename is
quite beneficial to human users.  But long names aren't necessary: after all,
just 12 alphanumeric characters will give you:

 3,226,266,762,397,899,821,056

possible different filenames, which should be enough for anyone, and all
the computer cares about is that the name is distinct (and that doesn't even
include punctuation characters).  So what if you can't remember whether
Jx64rQWundkS contains "The Beatles: White Album" or "Annual Report of the
Departmental Sub-committee in Charge of Producing Really Boring Reports"?

I've seen long filenames taken to extremes: people saving the text of a letter
using a filename that consists of the names of the sender and recipients, the
date sent and a precis of the contents.  I'm pretty sure that some of those
filenames were longer than the actual letters...


   I'm sure the latest version of <insert-cool-OS-or-filesystem-here>
   fixes all these objections, but not everyone gets to run the latest
   and greatest.  Don't fight your filesystem, and it won't fight you.


So, FreeBSD is a *cool OS*?   But we knew that already...

Cheers,

Matthew

--
Dr Matthew J Seaman MA, D.Phil.   7 Priory Courtyard
 Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
 Kent, CT11 9PW





Re: limit to number of files seen by ls?

2009-07-26 Thread Mel Flynn
On Saturday 25 July 2009 23:34:50 Matthew Seaman wrote:

 It's fairly rare to run into this as a practical
 limitation during most day to day use, and there are various tricks like
 using xargs(1) to extend the usable range.  Even so, for really big
 applications that need to process long lists of data, you'd have to code
 the whole thing to input the list via a file or pipe.

ls itself is not glob(3) aware, but there are programs that are, like scp. So 
the fastest solution in those cases is to single-quote the argument and let 
the program expand the glob. for loops are also a common workaround:
ls */*  =>  for f in */*; do ls "$f"; done
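
For example, with a hypothetical remote path -- the quotes stop the local
shell from expanding the glob, so the remote side does it instead:
scp 'remote:/var/images/*.jpg' .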

The point of it all being that the cause of the OP's observed behavior is only 
indirectly related to the directory size. He will have the same problem if he 
divides the 4000 files over 4 directories and calls ls */*.
-- 
Mel


Re: limit to number of files seen by ls?

2009-07-26 Thread John Almberg


On Jul 26, 2009, at 4:45 AM, Mel Flynn wrote:


> On Saturday 25 July 2009 23:34:50 Matthew Seaman wrote:
>> It's fairly rare to run into this as a practical
>> limitation during most day to day use, and there are various tricks like
>> using xargs(1) to extend the usable range.  Even so, for really big
>> applications that need to process long lists of data, you'd have to code
>> the whole thing to input the list via a file or pipe.
>
> ls itself is not glob(3) aware, but there are programs that are, like scp.
> So the fastest solution in those cases is to single-quote the argument and
> let the program expand the glob. for loops are also a common workaround:
> ls */*  =>  for f in */*; do ls "$f"; done
>
> The point of it all being that the cause of the OP's observed behavior is
> only indirectly related to the directory size. He will have the same
> problem if he divides the 4000 files over 4 directories and calls ls */*


H'mmm... I haven't come back on this question, because I want my next
question to be an intelligent one, but I'm having a hard time
understanding what is going on. I'm reading up on this, and as soon
as I know enough to either understand the issue, or ask an
intelligent question, I will do so...


Thanks for all the comments...

-- John



Re: limit to number of files seen by ls?

2009-07-26 Thread Mel Flynn
On Sunday 26 July 2009 10:24:31 John Almberg wrote:
> On Jul 26, 2009, at 4:45 AM, Mel Flynn wrote:
>> On Saturday 25 July 2009 23:34:50 Matthew Seaman wrote:
>>> It's fairly rare to run into this as a practical
>>> limitation during most day to day use, and there are various tricks
>>> like using xargs(1) to extend the usable range.  Even so, for really
>>> big applications that need to process long lists of data, you'd have
>>> to code the whole thing to input the list via a file or pipe.
>>
>> ls itself is not glob(3) aware, but there are programs that are, like
>> scp. So the fastest solution in those cases is to single-quote the
>> argument and let the program expand the glob. for loops are also a
>> common workaround:
>> ls */*  =>  for f in */*; do ls "$f"; done
>>
>> The point of it all being that the cause of the OP's observed behavior
>> is only indirectly related to the directory size. He will have the same
>> problem if he divides the 4000 files over 4 directories and calls ls */*
>
> H'mmm... I haven't come back on this question, because I want my next
> question to be an intelligent one, but I'm having a hard time
> understanding what is going on. I'm reading up on this, and as soon
> as I know enough to either understand the issue, or ask an
> intelligent question, I will do so...

When a program is executed with arguments, there is a system-imposed limit on 
the size of this argument list. On FreeBSD this limit can be seen with sysctl 
kern.argmax, which is the length in bytes.
When you do ls *, what really happens is that the shell expands the asterisk 
to all entries in the current directory, except entries starting with a dot 
(hidden files and directories). As a result, ls is really called as:
ls file1 file2 ... fileN

If the combined length of file1 through fileN is bigger than kern.argmax, you 
will get an "Argument list too long" error.
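
A rough way to check in advance whether a given glob would blow that limit
(echo is a shell builtin, so it is not itself subject to the execve(2) limit):
% sysctl -n kern.argmax
262144
% echo * | wc -c
If the second number approaches the first, the expanded command will fail.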
-- 
Mel


Re: limit to number of files seen by ls?

2009-07-25 Thread Karl Vogel
 On Thursday 23 July 2009 09:41:26 Karl Vogel wrote:

K Every version of Unix I've ever used had an upper limit on the size of
K the argument list you could pass to a program, so it won't just be ls
K that's affected here.  That's why I use 1,000 as a rule of thumb for the
K maximum number of files I put in a directory.

 On Thu, 23 Jul 2009 10:25:49 -0800, 
 Mel Flynn mel.flynn+fbsd.questi...@mailing.thruhere.net said:

M That arbitrary number works simply because kern.argmax default has been
M raised somewhere in 6.x (before it was 64kB).

   That arbitrary number has worked very nicely for me for 20 years under
   Solaris, Linux, and several BSD variants.  The main reason I stick
   with 1000 is because directories are read linearly unless you're using
   something like ReiserFS, and I get impatient waiting for more than that
   number of filenames to be sorted when using ls.

M And MAXNAMLEN in sys/dirent.h is 255.

   That's the maximum length of a single filename in a directory.  Since
   I keep my filenames much shorter, I don't have a problem.

M Knowing your way around maximum argument-list length through xargs as
M suggested in this thread is a much better solution than trying to exercise
M control over directory sizes, which may or may not be under your control in
M the first place.

   Xargs is very useful, but it's no cure for poor design, and
   it's not something you can drop into any existing pipeline without a
   little thought first.

   If your application is trying to create hundreds of thousands or
   millions of files in any one directory, or you're creating lots of
   200-character filenames from hell, then your design is a poor match for
   most varieties of Unix; small directories perform better than enormous
   ones, and lots of commonly-used scripts and programs will fall over
   when handed zillion-file argument lists.

   I'm sure the latest version of <insert-cool-OS-or-filesystem-here>
   fixes all these objections, but not everyone gets to run the latest
   and greatest.  Don't fight your filesystem, and it won't fight you.

-- 
Karl Vogel  I don't speak for the USAF or my company
Birds of a feather flock together and usually crap on your car.


Re: limit to number of files seen by ls?

2009-07-23 Thread Matthew Seaman

John Almberg wrote:

I seem to have run into an odd problem...

A client has a directory with a big-ish number of jpgs... maybe 4000. 
Problem is, I can only see 2329 of them with ls, and I'm running into 
other problems, I think.


Question: Is there some limit to the number of files that a directory 
can contain? Or rather, is there some number where things like ls start 
working incorrectly?


There's a limit to the number of arguments the shell will deal with
for one command.  So if you type:

   % ls -lh *

(meaning the shell expands '*' to a list of filenames), you'll run into
that limitation.  However, if you type

   % ls -lh

and let ls(1) read the directory contents itself, it should cope
with 4000 items easily.  [It might slow down because of sorting the
results, but for only 4000 items that's probably not significant]

Now, if your problem is that these 4000 jpegs are mixed up with other
files and you only want to list the jpeg files, then you could do
something like this:

   % find . -name '*.jpeg' -print0 | xargs -0 ls -lh

or even just:

   % find . -name '*.jpg' -ls

Cheers,

Matthew

--
Dr Matthew J Seaman MA, D.Phil.   7 Priory Courtyard
 Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
 Kent, CT11 9PW





Re: limit to number of files seen by ls?

2009-07-23 Thread Karl Vogel
 On Wed, 22 Jul 2009 20:01:57 -0400, 
 John Almberg jalmb...@identry.com said:

J A client has a directory with a big-ish number of jpgs... maybe 4000.
J Problem is, I can only see 2329 of them with ls, and I'm running into
J other problems, I think.

J Question: Is there some limit to the number of files that a directory
J can contain? Or rather, is there some number where things like ls start
J working incorrectly?

   Every version of Unix I've ever used had an upper limit on the size
   of the argument list you could pass to a program, so it won't just be
   ls that's affected here.  That's why I use 1,000 as a rule of thumb
   for the maximum number of files I put in a directory.

   A longer-term fix for your client would be to break up that JPEG file
   list into smaller sets based on (say) date or image topic or whatever.
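
   Something like this untested sketch would do a date-based split;
   stat -f/-t is the FreeBSD syntax (GNU stat spells it differently):

       #!/bin/sh
       # File each JPEG into a YYYY-MM subdirectory named for its mtime.
       for f in *.jpg; do
           [ -f "$f" ] || continue
           d=$(stat -f '%Sm' -t '%Y-%m' "$f")  # e.g. 2009-07
           mkdir -p "$d"
           mv "$f" "$d/"
       done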

-- 
Karl Vogel  I don't speak for the USAF or my company
Al Capone's business card said he was a used furniture dealer.
--item for a lull in conversation


Re: limit to number of files seen by ls?

2009-07-23 Thread Mel Flynn
On Thursday 23 July 2009 09:41:26 Karl Vogel wrote:
> On Wed, 22 Jul 2009 20:01:57 -0400,
> John Almberg jalmb...@identry.com said:
>
> J A client has a directory with a big-ish number of jpgs... maybe 4000.
> J Problem is, I can only see 2329 of them with ls, and I'm running into
> J other problems, I think.
>
> J Question: Is there some limit to the number of files that a directory
> J can contain? Or rather, is there some number where things like ls start
> J working incorrectly?
>
> Every version of Unix I've ever used had an upper limit on the size
> of the argument list you could pass to a program, so it won't just be
> ls that's affected here.  That's why I use 1,000 as a rule of thumb
> for the maximum number of files I put in a directory.

That arbitrary number works simply because the kern.argmax default was raised 
somewhere in 6.x (before that, it was 64kB).
% echo `sysctl -n kern.argmax`/1000|bc
262

And MAXNAMLEN in sys/dirent.h is 255.
Knowing your way around maximum argument-list length through xargs, as 
suggested in this thread, is a much better solution than trying to exercise 
control over directory sizes, which may or may not be under your control in 
the first place.
-- 
Mel


Re: limit to number of files seen by ls?

2009-07-22 Thread Julian Zottl
How are you using ls?  I presume something along the lines of ls -la |
more.

What do sysctl fs.file-max and sysctl kern.maxfiles tell you?

I've seen directories with 1+ files.  The only problem I've ever had
with that many is using the rm command.  In that case, you will need to use
something like find ./ -type f -exec rm {} \;
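
If xargs is available, the following should be much faster, since it batches
filenames up to the argument-size limit instead of forking one rm per file:
find ./ -type f -print0 | xargs -0 rm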

Take care,


Julian


On Wed, Jul 22, 2009 at 8:01 PM, John Almberg jalmb...@identry.com wrote:

> I seem to have run into an odd problem...
>
> A client has a directory with a big-ish number of jpgs... maybe 4000.
> Problem is, I can only see 2329 of them with ls, and I'm running into other
> problems, I think.
>
> Question: Is there some limit to the number of files that a directory can
> contain? Or rather, is there some number where things like ls start working
> incorrectly?
>
> -- John


