Re: limit to number of files seen by ls?
>> Karl Vogel wrote:
K> The main reason I stick with 1000 is because directories are read
K> linearly unless you're using something like ReiserFS...

>> On Sun, 26 Jul 2009 08:34:50 +0100,
>> Matthew Seaman said:
M> You mean filesystems like FreeBSD UFS2 with DIRHASH?  The problem with
M> linear time scanning of directory contents has been solved for a while...

Sure, that's why I said "something like".  Not everyone is using the
latest and greatest, especially if you have anything to do with the
public sector.  It's not unusual to see people using servers that are
8-10 years old and run around the clock, and they can't upgrade because
they're not allowed the downtime.

I'm not saying we should act like everyone's using the moral equivalent
of FreeBSD 2.2.7.  I am saying that if you have a design decision to
make, you'll solve more problems than you cause if you add the extra
2-3 lines of code to hash a huge directory into several smaller ones.

-- 
Karl Vogel                      I don't speak for the USAF or my company

Since we have to speak well of the dead, let's knock them while they're
alive.                                                      --John Sloan
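For what it's worth, those "2-3 lines" might look something like the sh
sketch below; the 16-bucket layout and the choice of md5(1) are
arbitrary assumptions for illustration, not Karl's actual code:

    # Spread *.jpg into 16 buckets keyed on the first hex digit of an
    # md5 of each name (FreeBSD md5(1); on Linux use md5sum instead).
    for f in *.jpg; do
        b=$(echo "$f" | md5 | cut -c1)
        mkdir -p "$b" && mv -- "$f" "$b/"
    done

The for loop is a shell construct, so it never execs anything with a
huge argv, no matter how many files the glob matches.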
Re: limit to number of files seen by ls?
On Monday 27 July 2009 12:42:32 Chris Cowart wrote:
> John Almberg wrote:
> > Which is why I'm starting to think that (a) my problem is different
> > or (b) I'm so clueless that there isn't any problem at all, and I'm
> > just not understanding something (most likely scenario!)
>
> It looks to me like the thread began assuming that you must be typing
> `ls *` in order to run into problems.

Yeah, I just noticed that too.  So how did you determine that there
should be ~4000 files in the directory when ls shows ~2300?  Also, does
ls give an error message?  "ls -l >/tmp/out" should clear that up, and
you can use "wc -l /tmp/out" to see how many files are returned.
-- 
Mel
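One caveat with counting that way: "ls -l" output starts with a "total"
line, so the wc figure will be one higher than the number of files.  A
plain listing sidesteps that (a minimal sketch; the path is a
placeholder):

    % ls /path/to/dir | wc -l       # non-hidden entries, one per line
    % ls -A /path/to/dir | wc -l    # dot files included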
Re: limit to number of files seen by ls?
John Almberg wrote:
> Which is why I'm starting to think that (a) my problem is different
> or (b) I'm so clueless that there isn't any problem at all, and I'm
> just not understanding something (most likely scenario!)

It looks to me like the thread began assuming that you must be typing
`ls *` in order to run into problems.  I think we'll have better luck
helping you if you tell us exactly what it is you're typing when you
observe the problem.

-- 
Chris Cowart
Network Technical Lead
Network & Infrastructure Services, RSSP-IT
UC Berkeley
Re: limit to number of files seen by ls?
> > ... understanding what is going on.  I'm reading up on this, and as
> > soon as I know enough to either understand the issue, or ask an
> > intelligent question, I will do so...
>
> When a program is executed with arguments, there is a system-imposed
> limit on the size of this argument list.  On FreeBSD this limit can
> be seen with sysctl kern.argmax; the value is a length in bytes.
>
> When you do "ls *", what really happens is that the shell expands the
> asterisk to all entries in the current directory, except entries
> starting with a dot ("hidden" files and directories).  As a result,
> ls is really called as:
>
>     ls file1 file2 ... fileN
>
> If the combined length of file1 through fileN is bigger than
> kern.argmax, then you will get an "Argument list too long" error.

Mel,

What I get is this:

    > sysctl kern.argmax
    kern.argmax: 262144

Which is why I'm starting to think that (a) my problem is different or
(b) I'm so clueless that there isn't any problem at all, and I'm just
not understanding something (most likely scenario!)

I'm going to write a little script that generates a bunch of files, to
test my hypothesis that once I get more than n files in a directory
some things stop working correctly -- like ls and ftp directory
listings -- and to discover the value of n.  That will give me some
hard data to work with.  This problem has been nagging at me for a
while, so it's time I nailed it down once and for all...

I'll be back...

-- John
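A minimal version of the test script John describes might look like the
sketch below; the default count, the /tmp/lstest path, and the name
pattern are arbitrary choices for illustration:

    #!/bin/sh
    # Make N dummy files, then compare a plain listing with a globbed one.
    N=${1:-5000}
    mkdir -p /tmp/lstest && cd /tmp/lstest || exit 1
    jot -w 'file%d.jpg' "$N" | xargs touch    # create file1.jpg .. fileN.jpg
    ls | wc -l      # no argv expansion: prints N on a fresh run
    ls * | wc -l    # glob expansion: breaks once argv exceeds kern.argmax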
Re: limit to number of files seen by ls?
On Sunday 26 July 2009 10:24:31 John Almberg wrote:
> On Jul 26, 2009, at 4:45 AM, Mel Flynn wrote:
> > On Saturday 25 July 2009 23:34:50 Matthew Seaman wrote:
> > > It's fairly rare to run into this as a practical limitation
> > > during most day to day use, and there are various tricks like
> > > using xargs(1) to extend the usable range.  Even so, for really
> > > big applications that need to process long lists of data, you'd
> > > have to code the whole thing to input the list via a file or pipe.
> >
> > ls itself is not glob(3) aware, but there are programs that are,
> > like scp.  So the fastest solution in those cases is to
> > single-quote the argument and let the program expand the glob.
> > for loops are also a common workaround:
> >
> >     ls */*  ==  for f in */*; do ls $f; done
> >
> > Point of it all being that the cause of the OP's observed behavior
> > is only indirectly related to the directory size.  He will have the
> > same problem if he divides the 4000 files over 4 directories and
> > calls ls */*.
>
> H'mmm... I haven't come back on this question, because I want my next
> question to be an intelligent one, but I'm having a hard time
> understanding what is going on.  I'm reading up on this, and as soon
> as I know enough to either understand the issue, or ask an
> intelligent question, I will do so...

When a program is executed with arguments, there is a system-imposed
limit on the size of this argument list.  On FreeBSD this limit can be
seen with sysctl kern.argmax; the value is a length in bytes.

When you do "ls *", what really happens is that the shell expands the
asterisk to all entries in the current directory, except entries
starting with a dot ("hidden" files and directories).  As a result, ls
is really called as:

    ls file1 file2 ... fileN

If the combined length of file1 through fileN is bigger than
kern.argmax, then you will get an "Argument list too long" error.

-- 
Mel
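To watch the limit bite, a contrived sketch; the word length and repeat
count are arbitrary, and /bin/echo is used because the shell's builtin
echo never goes through execve(2):

    % sysctl -n kern.argmax
    262144
    % /bin/echo $(jot -b 123456789012345 20000) > /dev/null
    # fails with "Argument list too long" (E2BIG): 20000 strings of
    # 16 bytes each (15 characters plus NUL) already exceed 262144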
Re: limit to number of files seen by ls?
On Jul 26, 2009, at 4:45 AM, Mel Flynn wrote:
> On Saturday 25 July 2009 23:34:50 Matthew Seaman wrote:
> > It's fairly rare to run into this as a practical limitation during
> > most day to day use, and there are various tricks like using
> > xargs(1) to extend the usable range.  Even so, for really big
> > applications that need to process long lists of data, you'd have to
> > code the whole thing to input the list via a file or pipe.
>
> ls itself is not glob(3) aware, but there are programs that are, like
> scp.  So the fastest solution in those cases is to single-quote the
> argument and let the program expand the glob.  for loops are also a
> common workaround:
>
>     ls */*  ==  for f in */*; do ls $f; done
>
> Point of it all being that the cause of the OP's observed behavior is
> only indirectly related to the directory size.  He will have the same
> problem if he divides the 4000 files over 4 directories and calls
> ls */*.

H'mmm... I haven't come back on this question, because I want my next
question to be an intelligent one, but I'm having a hard time
understanding what is going on.  I'm reading up on this, and as soon as
I know enough to either understand the issue, or ask an intelligent
question, I will do so...

Thanks for all the comments...

-- John
Re: limit to number of files seen by ls?
On Saturday 25 July 2009 23:34:50 Matthew Seaman wrote:
> It's fairly rare to run into this as a practical limitation during
> most day to day use, and there are various tricks like using xargs(1)
> to extend the usable range.  Even so, for really big applications
> that need to process long lists of data, you'd have to code the whole
> thing to input the list via a file or pipe.

ls itself is not glob(3) aware, but there are programs that are, like
scp.  So the fastest solution in those cases is to single-quote the
argument and let the program expand the glob.  for loops are also a
common workaround:

    ls */*  ==  for f in */*; do ls $f; done

Point of it all being that the cause of the OP's observed behavior is
only indirectly related to the directory size.  He will have the same
problem if he divides the 4000 files over 4 directories and calls
ls */*.

-- 
Mel
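To make the scp case concrete (host and path are placeholders): with
single quotes the local shell passes the glob through untouched and the
remote shell expands it, so the local argv stays tiny.

    % scp 'user@somehost:/big/dir/*.jpg' .

The for-loop workaround is the opposite trade: it execs ls once per
name, so it never hits kern.argmax, at the cost of many more forks.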
Re: limit to number of files seen by ls?
Karl Vogel wrote:
> That arbitrary number has worked very nicely for me for 20 years under
> Solaris, Linux, and several BSD variants.  The main reason I stick
> with 1000 is because directories are read linearly unless you're using
> something like ReiserFS, and I get impatient waiting for more than
> that number of filenames to be sorted when using ls.

Um... you mean filesystems like FreeBSD UFS2 with DIRHASH?  The problem
with linear-time scanning of directory contents has been solved for a
while, and directories containing of the order of 10^5 files are
nowadays usable.

You are entirely right that with ls(1) one of the biggest causes of
delay when returning a long list of files is actually sorting the
output, but that would happen whatever filesystem you're using.  If
this is a problem, most of the time you can just avoid the sorting
altogether by using 'ls -f'.

> If your application is trying to create hundreds of thousands or
> millions of files in any one directory, or you're creating lots of
> 200-character filenames from hell, then your design is a poor match
> for most varieties of Unix; small directories perform better than
> enormous ones, and lots of commonly-used scripts and programs will
> fall over when handed zillion-file argument lists.

Yep.  You are correct here, but I think your concept of 'small' could
grow by an order of magnitude when using a reasonably up-to-date OS --
anything produced in the last 3-5 years should be able to cope with
directories containing tens of thousands of files without slowing down
disastrously.

Most of the limitation on the number of arguments a command will accept
is due to the shell imposing a maximum, which in turn is due to
limitations on the size of the argv[] array allowed for execve(2).
This is a general limitation and applies to anything listed on a
command line, not just file names.  It's fairly rare to run into this
as a practical limitation during most day to day use, and there are
various tricks like using xargs(1) to extend the usable range.  Even
so, for really big applications that need to process long lists of
data, you'd have to code the whole thing to input the list via a file
or pipe.

Long filenames per se aren't a bad thing -- a descriptive filename is
quite beneficial to human users.  But long names aren't necessary:
after all, just 12 alphanumeric characters will give you 62^12 =
3,226,266,762,397,899,821,056 possible different filenames, which
should be enough for anyone, and all the computer cares about is that
the name is distinct.  (And that doesn't even include punctuation
characters.)  So what if you can't remember whether Jx64rQWundkS
contains "The Beatles: White Album" or "Annual Report of the
Departmental Sub-committee in Charge of Producing Really Boring
Reports."

I've seen long filenames taken to extremes: people saving the text of a
letter using a filename that consists of the names of the sender and
recipients, the date sent and a precis of the contents.  I'm pretty
sure that some of those filenames were longer than the actual
letters...

> I'm sure the latest version fixes all these objections, but not
> everyone gets to run the latest and greatest.  Don't fight your
> filesystem, and it won't fight you.

So, FreeBSD is a *cool OS*?  But we knew that already...

    Cheers,

    Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard, Flat 3
Ramsgate, Kent, CT11 9PW
PGP: http://www.infracaninophile.co.uk/pgpkey
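A quick way to see the sorting cost Matthew describes, on any directory
big enough to care about (timings will vary, obviously):

    % time ls    > /dev/null    # readdir plus sort
    % time ls -f > /dev/null    # raw directory order, no sort

Note that on FreeBSD -f also turns on -a, so dot entries show up in the
unsorted listing.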
Re: limit to number of files seen by ls?
>> On Thursday 23 July 2009 09:41:26 Karl Vogel wrote:
K> Every version of Unix I've ever used had an upper limit on the size of
K> the argument list you could pass to a program, so it won't just be "ls"
K> that's affected here.  That's why I use 1,000 as a rule of thumb for
K> the maximum number of files I put in a directory.

>> On Thu, 23 Jul 2009 10:25:49 -0800,
>> Mel Flynn said:
M> That arbitrary number works simply because the kern.argmax default has
M> been raised somewhere in 6.x (before it was 64kB).

That arbitrary number has worked very nicely for me for 20 years under
Solaris, Linux, and several BSD variants.  The main reason I stick with
1000 is because directories are read linearly unless you're using
something like ReiserFS, and I get impatient waiting for more than that
number of filenames to be sorted when using ls.

M> And MAXNAMLEN in sys/dirent.h is 255.

That's the maximum length of a single filename in a directory.  Since I
keep my filenames much shorter, I don't have a problem.

M> Knowing your way around maximum argument length through xargs, as
M> suggested in this thread, is a much better solution than trying to
M> exercise control over directory sizes, which may or may not be under
M> your control in the first place.

Xargs is very useful, but it's not a fix for poor design, and it's not
something you can drop into any existing pipeline without a little
thought first.

If your application is trying to create hundreds of thousands or
millions of files in any one directory, or you're creating lots of
200-character filenames from hell, then your design is a poor match for
most varieties of Unix; small directories perform better than enormous
ones, and lots of commonly-used scripts and programs will fall over
when handed zillion-file argument lists.

I'm sure the latest version fixes all these objections, but not
everyone gets to run the latest and greatest.  Don't fight your
filesystem, and it won't fight you.

-- 
Karl Vogel                      I don't speak for the USAF or my company

Birds of a feather flock together and usually crap on your car.
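One example of the "little thought" xargs needs: it may split a long
list across several invocations, so a command that changes behaviour
with its argument count can surprise you.  The traditional guard for
grep (pattern and paths invented here) is to append /dev/null, so every
invocation has at least two file arguments and matches stay prefixed
with their filenames:

    % find . -type f -print0 | xargs -0 grep pattern /dev/null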
Re: limit to number of files seen by ls?
On Thursday 23 July 2009 09:41:26 Karl Vogel wrote:
> >> On Wed, 22 Jul 2009 20:01:57 -0400,
> >> John Almberg said:
>
> J> A client has a directory with a big-ish number of jpgs... maybe
> J> 4000.  Problem is, I can only see 2329 of them with ls, and I'm
> J> running into other problems, I think.
>
> J> Question: Is there some limit to the number of files that a
> J> directory can contain?  Or rather, is there some number where
> J> things like ls start working incorrectly?
>
> Every version of Unix I've ever used had an upper limit on the size
> of the argument list you could pass to a program, so it won't just be
> "ls" that's affected here.  That's why I use 1,000 as a rule of thumb
> for the maximum number of files I put in a directory.

That arbitrary number works simply because the kern.argmax default has
been raised somewhere in 6.x (before it was 64kB).

    % echo `sysctl -n kern.argmax`/1000|bc
    262

And MAXNAMLEN in sys/dirent.h is 255.

Knowing your way around maximum argument length through xargs, as
suggested in this thread, is a much better solution than trying to
exercise control over directory sizes, which may or may not be under
your control in the first place.

-- 
Mel
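Put differently: at the old 64kB default, 1,000 names fit only if they
averaged about 65 bytes each; the current default allows roughly 262.
The same bc arithmetic for both defaults:

    % echo 65536/1000 | bc
    65
    % echo 262144/1000 | bc
    262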
Re: limit to number of files seen by ls?
>> On Wed, 22 Jul 2009 20:01:57 -0400,
>> John Almberg said:

J> A client has a directory with a big-ish number of jpgs... maybe 4000.
J> Problem is, I can only see 2329 of them with ls, and I'm running into
J> other problems, I think.

J> Question: Is there some limit to the number of files that a directory
J> can contain?  Or rather, is there some number where things like ls
J> start working incorrectly?

Every version of Unix I've ever used had an upper limit on the size of
the argument list you could pass to a program, so it won't just be "ls"
that's affected here.  That's why I use 1,000 as a rule of thumb for
the maximum number of files I put in a directory.

A longer-term fix for your client would be to break up that JPEG file
list into smaller sets based on (say) date or image topic or whatever.

-- 
Karl Vogel                      I don't speak for the USAF or my company

Al Capone's business card said he was a used furniture dealer.
                                  --item for a lull in conversation
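A minimal sketch of that kind of date-based split, assuming FreeBSD's
stat(1) and that the files' modification times are meaningful; the
YYYY-MM layout is just one choice:

    # Move each jpg into a YYYY-MM subdirectory based on its mtime.
    for f in *.jpg; do
        d=$(stat -f '%Sm' -t '%Y-%m' "$f")
        mkdir -p "$d" && mv -- "$f" "$d/"
    done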
Re: limit to number of files seen by ls?
John Almberg wrote:
> I seem to have run into an odd problem...
>
> A client has a directory with a big-ish number of jpgs... maybe 4000.
> Problem is, I can only see 2329 of them with ls, and I'm running into
> other problems, I think.
>
> Question: Is there some limit to the number of files that a directory
> can contain?  Or rather, is there some number where things like ls
> start working incorrectly?

There's a limit to the number of arguments the shell will deal with for
one command.  So if you type:

    % ls -lh *

(meaning the shell expands '*' to a list of filenames), you'll run into
that limitation.  However, if you type

    % ls -lh

and let ls(1) read the directory contents itself, it should cope with
4000 items easily.  [It might slow down because of sorting the results,
but for only 4000 items that's probably not significant.]

Now, if your problem is that these 4000 jpegs are mixed up with other
files and you only want to list the jpeg files, then you could do
something like this:

    % find . -name '*.jpeg' -print0 | xargs -0 ls -lh

or even just:

    % find . -name '*.jpg' -ls

    Cheers,

    Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard, Flat 3
Ramsgate, Kent, CT11 9PW
PGP: http://www.infracaninophile.co.uk/pgpkey
Re: limit to number of files seen by ls?
How are you using ls?  I presume something along the lines of
"ls -la | more".  What does "sysctl fs.file-max" and "sysctl
kern.maxfiles" tell you?

I've seen directories with far more files than that.  The only problem
I've ever had with that many is using the rm command.  In that case,
you will need to use something like

    find ./ -type f -exec rm {} \;

Take care,
Julian

On Wed, Jul 22, 2009 at 8:01 PM, John Almberg wrote:
> I seem to have run into an odd problem...
>
> A client has a directory with a big-ish number of jpgs... maybe 4000.
> Problem is, I can only see 2329 of them with ls, and I'm running into
> other problems, I think.
>
> Question: Is there some limit to the number of files that a directory
> can contain?  Or rather, is there some number where things like ls
> start working incorrectly?
>
> -- John
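On that rm recipe: -exec rm {} \; forks one rm per file, which gets
slow in huge directories.  A reasonably modern find(1) offers two
equivalent but much cheaper spellings:

    % find . -type f -exec rm {} +    # batches arguments, xargs-style
    % find . -type f -delete          # find unlinks the files itself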