Re: Hashing and directories

2001-04-27 Thread Daniel Phillips
On Thursday 08 March 2001 13:42, Goswin Brederlow wrote: > > " " == Pavel Machek <[EMAIL PROTECTED]> writes: > > Hi! > > > >> I was hoping to point out that in real life, most systems that > >> need to access large numbers of files are already designed to > >> do some

Re: Hashing and directories

2001-04-27 Thread Goswin Brederlow
> " " == Pavel Machek <[EMAIL PROTECTED]> writes: > Hi! >> I was hoping to point out that in real life, most systems that >> need to access large numbers of files are already designed to >> do some kind of hashing, or at least to divide-and-conquer by >> using

Re: Hashing and directories

2001-03-12 Thread Xavier Bestel
On 12 Mar 2001 21:05:58 +1100, Herbert Xu wrote: > Pavel Machek <[EMAIL PROTECTED]> wrote: > > > xargs is very ugly. I want to rm 12*. Just plain "rm 12*". *Not* "find > . -name "12*" | xargs rm, which has terrible issues with file names > > Try > > printf "%s\0" 12* | xargs -0 rm Or

Re: Hashing and directories

2001-03-12 Thread Herbert Xu
Pavel Machek <[EMAIL PROTECTED]> wrote: > xargs is very ugly. I want to rm 12*. Just plain "rm 12*". *Not* "find > . -name "12*" | xargs rm, which has terrible issues with file names Try printf "%s\0" 12* | xargs -0 rm -- Debian GNU/Linux 2.2 is out! ( http://www.debian.org/ ) Email:
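
For reference, a minimal sketch of why the NUL-terminated pipeline survives awkward file names, assuming a POSIX shell with GNU findutils; the demo directory and file names are made up for illustration:

  # Sketch only: illustrative file names, not from the original mails.
  mkdir /tmp/demo && cd /tmp/demo
  touch "12 xyzzy bla" "xyzzy bla" "bla" "12plain"

  # Naive form: xargs splits its input on whitespace, so "12 xyzzy bla"
  # would reach rm as three separate words.
  #   find . -name "12*" | xargs rm

  # NUL-terminated forms: each name travels intact, whatever it contains.
  printf "%s\0" 12* | xargs -0 rm
  # or, doing the matching inside find rather than via the shell glob:
  #   find . -maxdepth 1 -name "12*" -print0 | xargs -0 rm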

Re: Hashing and directories

2001-03-10 Thread Kai Henningsen
[EMAIL PROTECTED] (Bill Crawford) wrote on 22.02.01 in <[EMAIL PROTECTED]>: > A particular reason for this, apart from filesystem efficiency, > is to make it easier for people to find things, as it is usually > easier to spot what you want amongst a hundred things than among > a thousand or

Re: Hashing and directories

2001-03-07 Thread Linus Torvalds
In article <003701c0a722$f6b02700$5517fea9@local>, Manfred Spraul <[EMAIL PROTECTED]> wrote: > >exec_mmap currently avoids mm_alloc()/activate_mm()/mm_drop() for single >threaded apps, and that would become impossible. >I'm not sure how expensive these calls are. They aren't that expensive:

Re: Hashing and directories

2001-03-07 Thread Manfred Spraul
From: "Jamie Lokier" <[EMAIL PROTECTED]> > Manfred Spraul wrote: > > I'm not sure that this is the right way: It means that every exec() > > must call dup_mmap(), and usually only to copy a few hundert > > bytes. But I don't see a sane alternative. I won't propose to > > create a temporary file

Re: Hashing and directories

2001-03-07 Thread Jamie Lokier
Manfred Spraul wrote: > I'm not sure that this is the right way: It means that every exec() must > call dup_mmap(), and usually only to copy a few hundred bytes. But I > don't see a sane alternative. I won't propose to create a temporary file > in a kernel tmpfs mount ;-) Every exec creates a

Re: Hashing and directories

2001-03-07 Thread Manfred Spraul
Jamie wrote: > Linus Torvalds wrote: > > The long-term solution for this is to create the new VM space for the > > new process early, and add it to the list of mm_struct's that the > > swapper knows about, and then just get rid of the pages[MAX_ARG_PAGES] > > array completely and instead just

Re: Hashing and directories

2001-03-07 Thread Jamie Lokier
Linus Torvalds wrote: > The long-term solution for this is to create the new VM space for the > new process early, and add it to the list of mm_struct's that the > swapper knows about, and then just get rid of the pages[MAX_ARG_PAGES] > array completely and instead just populate the new VM

Re: Hashing and directories

2001-03-06 Thread Linus Torvalds
In article <[EMAIL PROTECTED]>, Jamie Lokier <[EMAIL PROTECTED]> wrote: >Pavel Machek wrote: >> > the space allowed for arguments is not a userland issue, it is a kernel >> > limit defined by MAX_ARG_PAGES in binfmts.h, so one could tweak it if one >> > wanted to without breaking any userland.

Re: Hashing and directories

2001-03-06 Thread Jamie Lokier
Pavel Machek wrote: > > the space allowed for arguments is not a userland issue, it is a kernel > > limit defined by MAX_ARG_PAGES in binfmts.h, so one could tweak it if one > > wanted to without breaking any userland. > > Which is exactly what I did on my system. 2MB for command line is > very

Re: Hashing and directories

2001-03-02 Thread Bill Crawford
Pavel Machek wrote: > Hi! > > I was hoping to point out that in real life, most systems that > > need to access large numbers of files are already designed to do > > some kind of hashing, or at least to divide-and-conquer by using > > multi-level directory structures. > Yes -- because their

Re: Hashing and directories

2001-03-02 Thread Tim Wright
On Fri, Mar 02, 2001 at 10:04:10AM +0100, Pavel Machek wrote: > > xargs is very ugly. I want to rm 12*. Just plain "rm 12*". *Not* "find > . -name "12*" | xargs rm, which has terrible issues with file names > > "xyzzy" > "bla" > "xyzzy bla" > "12 xyzzy bla" > Getting a bit OffTopic(TM) here,

Re: Hashing and directories

2001-03-02 Thread David Weinehall
On Fri, Mar 02, 2001 at 10:04:10AM +0100, Pavel Machek wrote: > Hi! > > > > > * userland issues (what, you thought that limits on the > > > > command size will go away?) > > > > > > Last I checked, the command line size limit wasn't a userland issue, but > > > rather a limit of the

Re: Hashing and directories

2001-03-02 Thread Tobias Ringstrom
On 2 Mar 2001, Oystein Viggen wrote: > Pavel Machek wrote: > > xargs is very ugly. I want to rm 12*. Just plain "rm 12*". *Not* "find > These you work around using the smarter, \0 terminated, version: Another example demonstrating why xargs is not always good (and why a bigger command line is

Re: Hashing and directories

2001-03-02 Thread Oystein Viggen
Pavel Machek wrote: > xargs is very ugly. I want to rm 12*. Just plain "rm 12*". *Not* "find > . -name "12*" | xargs rm, which has terrible issues with file names > > "xyzzy" > "bla" > "xyzzy bla" > "12 xyzzy bla" These you work around using the smarter, \0 terminated, version: find . -name

Re: Hashing and directories

2001-03-02 Thread Pavel Machek
Hi! > > > * userland issues (what, you thought that limits on the > > > command size will go away?) > > > > Last I checked, the command line size limit wasn't a userland issue, but > > rather a limit of the kernel exec(). This might have changed. > > I _really_ don't want to trust the

Re: Hashing and directories

2001-03-02 Thread Pavel Machek
Hi! > > > I was hoping to point out that in real life, most systems that > > > need to access large numbers of files are already designed to do > > > some kind of hashing, or at least to divide-and-conquer by using > > > multi-level directory structures. > > > > Yes -- because they work around

Re: Hashing and directories

2001-03-02 Thread Pavel Machek
Hi! > > * userland issues (what, you thought that limits on the > > command size will go away?) > > the space allowed for arguments is not a userland issue, it is a kernel > limit defined by MAX_ARG_PAGES in binfmts.h, so one could tweak it if one > wanted to without breaking any userland.

Re: Hashing and directories

2001-03-01 Thread Andreas Dilger
H. Peter Anvin writes [re hashed directories]: > I don't see there being any fundamental reason to not do such an > improvement, except the one Alan Cox mentioned -- crash recovery -- > (which I think can be dealt with; in my example above as long as the leaf > nodes can get recovered, the tree

Re: Hashing and directories

2001-03-01 Thread Bill Crawford
Before I reply: I apologise for starting this argument, or at least making it worse, and please let me say again that I really would like to see improvements in directory searching etc. ... my original point was simply a half-joking aside to the effect that we should not encourage people to put

Re: Hashing and directories

2001-03-01 Thread H. Peter Anvin
Alexander Viro wrote: > > I _really_ don't want to trust the ability of shell to deal with long > command lines. I also don't like the failure modes with history expansion > causing OOM, etc. > > AFAICS right now we hit the kernel limit first, but I really doubt that > raising said limit is a

Re: Hashing and directories

2001-03-01 Thread Alexander Viro
On Thu, 1 Mar 2001, H. Peter Anvin wrote: > > * userland issues (what, you thought that limits on the > > command size will go away?) > > Last I checked, the command line size limit wasn't a userland issue, but > rather a limit of the kernel exec(). This might have changed. I

Re: Hashing and directories

2001-03-01 Thread Tigran Aivazian
On Thu, 1 Mar 2001, Alexander Viro wrote: > * userland issues (what, you thought that limits on the > command size will go away?) the space allowed for arguments is not a userland issue, it is a kernel limit defined by MAX_ARG_PAGES in binfmts.h, so one could tweak it if one wanted to
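
For context, a rough illustration of how that kernel constant translates into the limit userland sees, assuming a 2.4-era kernel with 4 KiB pages where MAX_ARG_PAGES is 32:

  # Sketch, assuming MAX_ARG_PAGES = 32 and PAGE_SIZE = 4096 (Linux 2.4 defaults):
  # the combined budget for argv plus the environment is 32 * 4096 bytes.
  getconf ARG_MAX       # typically prints 131072 on such kernels
  echo $((32 * 4096))   # 131072, the same figure derived from binfmts.h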

Re: Hashing and directories

2001-03-01 Thread H. Peter Anvin
Alexander Viro wrote: > > > > Yes -- because they work around kernel slowness. > > Pavel, I'm afraid that you are missing the point. Several, actually: > * limits of _human_ capability to deal with large unstructured > sets of objects Not an issue if you're a machine. > *

Re: Hashing and directories

2001-03-01 Thread Alexander Viro
On Sat, 1 Jan 2000, Pavel Machek wrote: > Hi! > > > I was hoping to point out that in real life, most systems that > > need to access large numbers of files are already designed to do > > some kind of hashing, or at least to divide-and-conquer by using > > multi-level directory structures. >

Re: Hashing and directories

2001-03-01 Thread Pavel Machek
Hi! > I was hoping to point out that in real life, most systems that > need to access large numbers of files are already designed to do > some kind of hashing, or at least to divide-and-conquer by using > multi-level directory structures. Yes -- because they work around kernel slowness. I had
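
A minimal sketch of the userland "do some kind of hashing" workaround being referred to, assuming standard POSIX tools; the file name and bucket count are illustrative:

  # Sketch: derive a bucket from a hash of the name instead of keeping
  # everything in one flat directory.
  name="some-file.dat"
  bucket=$(printf '%s' "$name" | cksum | awk '{ printf "%02d", $1 % 64 }')
  mkdir -p "data/$bucket" && echo "would store $name under data/$bucket/"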

Re: Hashing and directories

2001-02-22 Thread Bill Crawford
"H. Peter Anvin" wrote: > Bill Crawford wrote: ... > > We use Solaris and NFS a lot, too, so large directories are a bad > > thing in general for us, so we tend to subdivide things using a > > very simple scheme: taking the first letter and then sometimes > > the second letter or a pair of

Re: Hashing and directories

2001-02-22 Thread H. Peter Anvin
Bill Crawford wrote: > > A particular reason for this, apart from filesystem efficiency, > is to make it easier for people to find things, as it is usually > easier to spot what you want amongst a hundred things than among > a thousand or ten thousand. > > A couple of practical examples from
