Re: [RFC] Parallelize IO for e2fsck

2008-02-03 Thread KOSAKI Motohiro
Hi Pavel

> > > As user pages are always in highmem, this should be easy to decide:
> > > only send SIGDANGER when highmem is full. (Yes, there are
> > > inodes/dentries/file descriptors in lowmem, but I doubt apps will
> > > respond to SIGDANGER by closing files).
> >
> > Good point; for a system

Re: Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck)

2008-02-03 Thread KOSAKI Motohiro
Hi Jon

> I looked at this a year or two back, then ran out of time. But the thing
> I wanted to do was have libc's memory allocation routines extended to
> handle these through reservations - the kernel should send a userspace
> notification and then there should be some kind of concept of

Re: [RFC] Parallelize IO for e2fsck

2008-01-28 Thread david
On Mon, 28 Jan 2008, Theodore Tso wrote: On Mon, Jan 28, 2008 at 07:30:05PM +, Pavel Machek wrote: As user pages are always in highmem, this should be easy to decide: only send SIGDANGER when highmem is full. (Yes, there are inodes/dentries/file descriptors in lowmem, but I doubt apps

Re: Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck)

2008-01-28 Thread Jon Masters
On Sat, 2008-01-26 at 16:55 +0300, Al Boldi wrote:
> KOSAKI Motohiro wrote:
> > > > And from a performance point of view letting applications voluntarily
> > > > free some memory is better even than starting to swap.
> > >
> > > Absolutely.
> >
> > the mem_notify patch can realize "just before

Re: [RFC] Parallelize IO for e2fsck

2008-01-28 Thread Pavel Machek
On Mon 2008-01-28 14:56:33, Theodore Tso wrote:
> On Mon, Jan 28, 2008 at 07:30:05PM +, Pavel Machek wrote:
> >
> > As user pages are always in highmem, this should be easy to decide:
> > only send SIGDANGER when highmem is full. (Yes, there are
> > inodes/dentries/file descriptors in lowmem,

Re: [RFC] Parallelize IO for e2fsck

2008-01-28 Thread Theodore Tso
On Mon, Jan 28, 2008 at 07:30:05PM +, Pavel Machek wrote:
>
> As user pages are always in highmem, this should be easy to decide:
> only send SIGDANGER when highmem is full. (Yes, there are
> inodes/dentries/file descriptors in lowmem, but I doubt apps will
> respond to SIGDANGER by closing

Re: [RFC] Parallelize IO for e2fsck

2008-01-28 Thread Pavel Machek
Hi!

> It's been discussed before, but I suspect the main reason why it was
> never done is no one submitted a patch. Also, the problem is actually
> a pretty complex one. There are a couple of different stages where
> you might want to send an alert to processes:
>
> * Data is starting to

Re: Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck)

2008-01-26 Thread KOSAKI Motohiro
Hi Al

> > the mem_notify patch can realize "just before starting swapping"
> > notification :)
> >
> > to be honest, I don't know fs guys requirement.
> > if lacking feature of fs guys needed, I implement it with pleasure if
> > you tell me it.
>
> These notifications are really useful, but it may

Kernel Event Notifications (was: [RFC] Parallelize IO for e2fsck)

2008-01-26 Thread Al Boldi
KOSAKI Motohiro wrote:
> > > And from a performance point of view letting applications voluntarily
> > > free some memory is better even than starting to swap.
> >
> > Absolutely.
>
> the mem_notify patch can realize "just before starting swapping"
> notification :)
>
> to be honest, I don't know

Re: [RFC] Parallelize IO for e2fsck

2008-01-26 Thread Theodore Tso
On Fri, Jan 25, 2008 at 05:55:51PM -0800, Bryan Henderson wrote:
> I was surprised to see AIX do late allocation by default, because IBM's
> traditional style is bulletproof systems. A system where a process can be
> killed at unpredictable times because of resource demands of unrelated
>

Re: [RFC] Parallelize IO for e2fsck

2008-01-26 Thread KOSAKI Motohiro
> > And from a performance point of view letting applications voluntarily
> > free some memory is better even than starting to swap.
>
> Absolutely.

the mem_notify patch can realize "just before starting swapping"
notification :)

to be honest, I don't know fs guys requirement.
if lacking feature

Re: [RFC] Parallelize IO for e2fsck

2008-01-26 Thread KOSAKI Motohiro
> The commentary on the mem_notify threads claimed that the signal is
> easily provided by setting up the file handle for SIGIO.

BTW: Of course, you can receive any signal instead of SIGIO by using
fcntl(F_SETSIG) :-)

Re: [RFC] Parallelize IO for e2fsck

2008-01-25 Thread Bryan Henderson
>> Incidentally, some context for the AIX approach to the OOM problem: a
>> process may exclude itself from OOM vulnerability altogether. It places
>> itself in "early allocation" mode, which means at the time it creates
>> virtual memory, it reserves enough backing store for the worst case.

Re: [RFC] Parallelize IO for e2fsck

2008-01-25 Thread Zan Lynx
On Fri, 2008-01-25 at 04:09 -0700, Andreas Dilger wrote:
> On Jan 24, 2008 17:25 -0700, Zan Lynx wrote:
> > Have y'all been following the /dev/mem_notify patches?
> > http://article.gmane.org/gmane.linux.kernel/628653
>
> Having the notification be via poll() is a very restrictive processing
>

Re: [RFC] Parallelize IO for e2fsck

2008-01-25 Thread Andreas Dilger
On Jan 24, 2008 17:25 -0700, Zan Lynx wrote:
> Have y'all been following the /dev/mem_notify patches?
> http://article.gmane.org/gmane.linux.kernel/628653

Having the notification be via poll() is a very restrictive processing
model. Having the notification be via a signal means that any kind of

Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Zan Lynx
On Thu, 2008-01-24 at 18:40 -0500, Theodore Tso wrote:
> On Fri, Jan 25, 2008 at 01:08:09AM +0200, Adrian Bunk wrote:
> > In practice, there is a small number of programs that are both the
> > common memory hogs and should be able to reduce their memory consumption
> > by 10% or 20% without big

Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Theodore Tso
On Fri, Jan 25, 2008 at 01:08:09AM +0200, Adrian Bunk wrote:
> In practice, there is a small number of programs that are both the
> common memory hogs and should be able to reduce their memory consumption
> by 10% or 20% without big problems when requested (e.g. Java VMs,
> Firefox and databases

Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Adrian Bunk
On Thu, Jan 24, 2008 at 06:32:15PM +0100, Bodo Eggert wrote:
> Alan Cox <[EMAIL PROTECTED]> wrote:
>
> >> I'd tried to advocate SIGDANGER some years ago as well, but none of
> >> the kernel maintainers were interested. It definitely makes sense
> >> to have some sort of mechanism like this. At

Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Andreas Dilger
On Jan 24, 2008 18:32 +0100, Bodo Eggert wrote:
> I think a single, system-wide signal is the second-to worst solution: All
> applications (or the wrong one, if you select one) would free their caches
> and start to crawl, and either stay in this state or slowly increase their
> caches again

Re: [RFC] Parallelize IO for e2fsck

2008-01-24 Thread Bodo Eggert
Alan Cox <[EMAIL PROTECTED]> wrote:

>> I'd tried to advocate SIGDANGER some years ago as well, but none of
>> the kernel maintainers were interested. It definitely makes sense
>> to have some sort of mechanism like this. At the time I first brought
>> it up it was in conjunction with Netscape

Re: [RFC] Parallelize IO for e2fsck

2008-01-22 Thread Bryan Henderson
>I think there is a clear need for applications to be able to
>register a callback from the kernel to indicate that the machine as
>a whole is running out of memory and that the application should
>trim its caches to reduce memory utilisation.
>
>Perhaps instead of swapping immediately, a

Re: [RFC] Parallelize IO for e2fsck

2008-01-22 Thread Arnaldo Carvalho de Melo
On Tue, Jan 22, 2008 at 09:40:52AM -0500, Theodore Tso wrote:
> On Tue, Jan 22, 2008 at 12:00:50AM -0700, Andreas Dilger wrote:
> > > AIX had SIGDANGER some 15 years ago. Admittedly, that was sent when
> > > the system was about to hit OOM, not when it was about to start swapping.
> >
> > I'd

Re: [RFC] Parallelize IO for e2fsck

2008-01-22 Thread Theodore Tso
On Tue, Jan 22, 2008 at 12:00:50AM -0700, Andreas Dilger wrote:
> > AIX had SIGDANGER some 15 years ago. Admittedly, that was sent when
> > the system was about to hit OOM, not when it was about to start swapping.
>
> I'd tried to advocate SIGDANGER some years ago as well, but none of
> the

Re: [RFC] Parallelize IO for e2fsck

2008-01-22 Thread Alan Cox
> I'd tried to advocate SIGDANGER some years ago as well, but none of
> the kernel maintainers were interested. It definitely makes sense
> to have some sort of mechanism like this. At the time I first brought
> it up it was in conjunction with Netscape using too much cache on some
> system, but

Re: [RFC] Parallelize IO for e2fsck

2008-01-22 Thread David Chinner
On Tue, Jan 22, 2008 at 12:05:11AM -0700, Andreas Dilger wrote:
> On Jan 22, 2008 14:38 +1100, David Chinner wrote:
> > On Mon, Jan 21, 2008 at 04:00:41PM -0700, Andreas Dilger wrote:
> > > I discussed this with Ted at one point also. This is a generic problem,
> > > not just for readahead,

Re: [RFC] Parallelize IO for e2fsck

2008-01-21 Thread Andreas Dilger
On Jan 22, 2008 14:38 +1100, David Chinner wrote:
> On Mon, Jan 21, 2008 at 04:00:41PM -0700, Andreas Dilger wrote:
> > I discussed this with Ted at one point also. This is a generic problem,
> > not just for readahead, because "fsck" can run multiple e2fsck in parallel
> > and in case of many

Re: [RFC] Parallelize IO for e2fsck

2008-01-21 Thread Andreas Dilger
On Jan 21, 2008 23:17 -0500, [EMAIL PROTECTED] wrote:
> On Tue, 22 Jan 2008 14:38:30 +1100, David Chinner said:
> > Perhaps instead of swapping immediately, a SIGLOWMEM could be sent
> > to processes that aren't masking the signal followed by a short
> > grace period to allow the processes to

Re: [RFC] Parallelize IO for e2fsck

2008-01-21 Thread Valdis . Kletnieks
On Tue, 22 Jan 2008 14:38:30 +1100, David Chinner said:
> Perhaps instead of swapping immediately, a SIGLOWMEM could be sent
> to processes that aren't masking the signal followed by a short
> grace period to allow the processes to free up some memory before
> swapping out pages from that

Re: [RFC] Parallelize IO for e2fsck

2008-01-21 Thread David Chinner
On Mon, Jan 21, 2008 at 04:00:41PM -0700, Andreas Dilger wrote:
> On Jan 16, 2008 13:30 -0800, Valerie Henson wrote:
> > I have a partial solution that sort of blindly manages the buffer
> > cache. First, the user passes e2fsck a parameter saying how much
> > memory is available as buffer cache.

Re: [RFC] Parallelize IO for e2fsck

2008-01-21 Thread Andreas Dilger
On Jan 16, 2008 13:30 -0800, Valerie Henson wrote:
> I have a partial solution that sort of blindly manages the buffer
> cache. First, the user passes e2fsck a parameter saying how much
> memory is available as buffer cache. The readahead thread reads
> things in and immediately throws them

Re: [RFC] Parallelize IO for e2fsck

2008-01-17 Thread Valerie Henson
On Jan 17, 2008 5:15 PM, David Chinner <[EMAIL PROTECTED]> wrote:
> On Wed, Jan 16, 2008 at 01:30:43PM -0800, Valerie Henson wrote:
> > Hi y'all,
> >
> > This is a request for comments on the rewrite of the e2fsck IO
> > parallelization patches I sent out a few months ago. The mechanism is
> >

Re: [RFC] Parallelize IO for e2fsck

2008-01-17 Thread David Chinner
On Wed, Jan 16, 2008 at 01:30:43PM -0800, Valerie Henson wrote:
> Hi y'all,
>
> This is a request for comments on the rewrite of the e2fsck IO
> parallelization patches I sent out a few months ago. The mechanism is
> totally different. Previously IO was parallelized by issuing IOs from
>
