Re: [RFC 0/8] Cpuset aware writeback

2007-04-21 Thread Christoph Lameter
On Sat, 21 Apr 2007, Ethan Solomita wrote: >Exactly -- your patch should be consistent and do it the same way as > whatever your patch is built against. Your patch is built against a kernel > that subtracts off highmem. "Do it..." are you handing off the patch and are > done with it? Yes as

Re: [RFC 0/8] Cpuset aware writeback

2007-04-21 Thread Ethan Solomita
Christoph Lameter wrote: On Fri, 20 Apr 2007, Ethan Solomita wrote: cpuset_write_dirty_map.htm In __set_page_dirty_nobuffers() you always call cpuset_update_dirty_nodes() but in __set_page_dirty_buffers() you call it only if page->mapping is still set after locking. Is there a reason

Re: [RFC 0/8] Cpuset aware writeback

2007-04-21 Thread Ethan Solomita
Christoph Lameter wrote: On Fri, 20 Apr 2007, Ethan Solomita wrote: cpuset_write_dirty_map.htm In __set_page_dirty_nobuffers() you always call cpuset_update_dirty_nodes() but in __set_page_dirty_buffers() you call it only if page->mapping is still set after locking. Is there a reason for

Re: [RFC 0/8] Cpuset aware writeback

2007-04-21 Thread Christoph Lameter
On Sat, 21 Apr 2007, Ethan Solomita wrote: Exactly -- your patch should be consistent and do it the same way as whatever your patch is built against. Your patch is built against a kernel that subtracts off highmem. Do it... are you handing off the patch and are done with it? Yes as said

Re: [RFC 0/8] Cpuset aware writeback

2007-04-20 Thread Christoph Lameter
On Fri, 20 Apr 2007, Ethan Solomita wrote: > cpuset_write_dirty_map.htm > >In __set_page_dirty_nobuffers() you always call cpuset_update_dirty_nodes() > but in __set_page_dirty_buffers() you call it only if page->mapping is still > set after locking. Is there a reason for the difference?

Re: [RFC 0/8] Cpuset aware writeback

2007-04-20 Thread Ethan Solomita
Christoph Lameter wrote: H Sorry. I got distracted and I have sent them to Kame-san who was interested in working on them. I have placed the most recent version at http://ftp.kernel.org/pub/linux/kernel/people/christoph/cpuset_dirty Hi Christoph -- a few comments on the

Re: [RFC 0/8] Cpuset aware writeback

2007-04-20 Thread Christoph Lameter
On Fri, 20 Apr 2007, Ethan Solomita wrote: cpuset_write_dirty_map.htm In __set_page_dirty_nobuffers() you always call cpuset_update_dirty_nodes() but in __set_page_dirty_buffers() you call it only if page->mapping is still set after locking. Is there a reason for the difference? Also a

Re: [RFC 0/8] Cpuset aware writeback

2007-04-19 Thread Christoph Lameter
On Thu, 19 Apr 2007, Ethan Solomita wrote: > > H Sorry. I got distracted and I have sent them to Kame-san who was > > interested in working on them. > > I have placed the most recent version at > > http://ftp.kernel.org/pub/linux/kernel/people/christoph/cpuset_dirty > > > >Do you

Re: [RFC 0/8] Cpuset aware writeback

2007-04-19 Thread Ethan Solomita
Christoph Lameter wrote: On Wed, 18 Apr 2007, Ethan Solomita wrote: Any new ETA? I'm trying to decide whether to go back to your original patches or wait for the new set. Adding new knobs isn't as important to me as having something that fixes the core problem, so hopefully this isn't

Re: [RFC 0/8] Cpuset aware writeback

2007-04-19 Thread Christoph Lameter
On Thu, 19 Apr 2007, Ethan Solomita wrote: H Sorry. I got distracted and I have sent them to Kame-san who was interested in working on them. I have placed the most recent version at http://ftp.kernel.org/pub/linux/kernel/people/christoph/cpuset_dirty Do you expect any

Re: [RFC 0/8] Cpuset aware writeback

2007-04-18 Thread Christoph Lameter
On Wed, 18 Apr 2007, Ethan Solomita wrote: >Any new ETA? I'm trying to decide whether to go back to your original > patches or wait for the new set. Adding new knobs isn't as important to me as > having something that fixes the core problem, so hopefully this isn't waiting > on them. They

Re: [RFC 0/8] Cpuset aware writeback

2007-04-18 Thread Ethan Solomita
Christoph Lameter wrote: On Wed, 21 Mar 2007, Ethan Solomita wrote: Christoph Lameter wrote: On Thu, 1 Feb 2007, Ethan Solomita wrote: Hi Christoph -- has anything come of resolving the NFS / OOM concerns that Andrew Morton expressed concerning the patch? I'd be happy to

Re: [RFC 0/8] Cpuset aware writeback

2007-04-18 Thread Christoph Lameter
On Wed, 18 Apr 2007, Ethan Solomita wrote: Any new ETA? I'm trying to decide whether to go back to your original patches or wait for the new set. Adding new knobs isn't as important to me as having something that fixes the core problem, so hopefully this isn't waiting on them. They could

Re: [RFC 0/8] Cpuset aware writeback

2007-03-21 Thread Christoph Lameter
On Wed, 21 Mar 2007, Andrew Morton wrote: > > The NFS patch went into Linus tree a couple of days ago > > Did it fix the oom issues which you were observing? Yes it reduced the dirty ratios to reasonable numbers in a simple copy operation that created large amounts of dirty pages before. The

Re: [RFC 0/8] Cpuset aware writeback

2007-03-21 Thread Andrew Morton
On Wed, 21 Mar 2007 14:29:42 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Wed, 21 Mar 2007, Ethan Solomita wrote: > > > Christoph Lameter wrote: > > > On Thu, 1 Feb 2007, Ethan Solomita wrote: > > > > > > >Hi Christoph -- has anything come of resolving the NFS / OOM

Re: [RFC 0/8] Cpuset aware writeback

2007-03-21 Thread Christoph Lameter
On Wed, 21 Mar 2007, Ethan Solomita wrote: > Christoph Lameter wrote: > > On Thu, 1 Feb 2007, Ethan Solomita wrote: > > > > >Hi Christoph -- has anything come of resolving the NFS / OOM concerns > > > that > > > Andrew Morton expressed concerning the patch? I'd be happy to see some > > >

Re: [RFC 0/8] Cpuset aware writeback

2007-03-21 Thread Ethan Solomita
Christoph Lameter wrote: On Thu, 1 Feb 2007, Ethan Solomita wrote: Hi Christoph -- has anything come of resolving the NFS / OOM concerns that Andrew Morton expressed concerning the patch? I'd be happy to see some progress on getting this patch (i.e. the one you posted on 1/23) through.

Re: [RFC 0/8] Cpuset aware writeback

2007-03-21 Thread Christoph Lameter
On Wed, 21 Mar 2007, Ethan Solomita wrote: Christoph Lameter wrote: On Thu, 1 Feb 2007, Ethan Solomita wrote: Hi Christoph -- has anything come of resolving the NFS / OOM concerns that Andrew Morton expressed concerning the patch? I'd be happy to see some progress on getting

Re: [RFC 0/8] Cpuset aware writeback

2007-03-21 Thread Andrew Morton
On Wed, 21 Mar 2007 14:29:42 -0700 (PDT) Christoph Lameter [EMAIL PROTECTED] wrote: On Wed, 21 Mar 2007, Ethan Solomita wrote: Christoph Lameter wrote: On Thu, 1 Feb 2007, Ethan Solomita wrote: Hi Christoph -- has anything come of resolving the NFS / OOM concerns that

Re: [RFC 0/8] Cpuset aware writeback

2007-03-21 Thread Christoph Lameter
On Wed, 21 Mar 2007, Andrew Morton wrote: The NFS patch went into Linus tree a couple of days ago Did it fix the oom issues which you were observing? Yes it reduced the dirty ratios to reasonable numbers in a simple copy operation that created large amounts of dirty pages before. The

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Andrew Morton
On Thu, 1 Feb 2007 21:29:06 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Thu, 1 Feb 2007, Andrew Morton wrote: > > > > Peter Zijlstra addressed the NFS issue. > > > > Did he? Are you yet in a position to confirm that? > > He provided a solution to fix the congestion issue in

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Neil Brown
On Thursday February 1, [EMAIL PROTECTED] wrote: > > > The network stack is of course a different (much harder) problem. > > An NFS solution is possible without solving the network stack issue? NFS is currently able to make more than max_dirty_ratio of memory Dirty/Writeback without being

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Christoph Lameter
On Fri, 2 Feb 2007, Neil Brown wrote: > md/raid doesn't cause any problems here. It preallocates enough to be > sure that it can always make forward progress. In general the entire > block layer from generic_make_request down can always successfully > write a block out in a reasonable amount of

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Neil Brown
On Thursday February 1, [EMAIL PROTECTED] wrote: >The NFS problems also exist for non cpuset scenarios > and we have by and large been able to live with it so I think they are > lower priority. It seems that the basic problem is created by the dirty > ratios in a cpuset.

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Christoph Lameter
On Thu, 1 Feb 2007, Andrew Morton wrote: > > Peter Zijlstra addressed the NFS issue. > > Did he? Are you yet in a position to confirm that? He provided a solution to fix the congestion issue in NFS. I thought that is what you were looking for? That should make NFS behave more like a block

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Andrew Morton
On Thu, 1 Feb 2007 18:16:05 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Thu, 1 Feb 2007, Ethan Solomita wrote: > > >Hi Christoph -- has anything come of resolving the NFS / OOM concerns > > that > > Andrew Morton expressed concerning the patch? I'd be happy to see some >

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Christoph Lameter
On Thu, 1 Feb 2007, Ethan Solomita wrote: >Hi Christoph -- has anything come of resolving the NFS / OOM concerns that > Andrew Morton expressed concerning the patch? I'd be happy to see some > progress on getting this patch (i.e. the one you posted on 1/23) through. Peter Zijlstra addressed

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Ethan Solomita
Hi Christoph -- has anything come of resolving the NFS / OOM concerns that Andrew Morton expressed concerning the patch? I'd be happy to see some progress on getting this patch (i.e. the one you posted on 1/23) through. Thanks, -- Ethan - To unsubscribe from this list: send the line

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Christoph Lameter
On Thu, 1 Feb 2007, Ethan Solomita wrote: Hi Christoph -- has anything come of resolving the NFS / OOM concerns that Andrew Morton expressed concerning the patch? I'd be happy to see some progress on getting this patch (i.e. the one you posted on 1/23) through. Peter Zijlstra addressed the

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Andrew Morton
On Thu, 1 Feb 2007 18:16:05 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: On Thu, 1 Feb 2007, Ethan Solomita wrote: Hi Christoph -- has anything come of resolving the NFS / OOM concerns that Andrew Morton expressed concerning the patch? I'd be happy to see some progress

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Christoph Lameter
On Thu, 1 Feb 2007, Andrew Morton wrote: Peter Zijlstra addressed the NFS issue. Did he? Are you yet in a position to confirm that? He provided a solution to fix the congestion issue in NFS. I thought that is what you were looking for? That should make NFS behave more like a block device

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Neil Brown
On Thursday February 1, [EMAIL PROTECTED] wrote: The NFS problems also exist for non cpuset scenarios and we have by and large been able to live with it so I think they are lower priority. It seems that the basic problem is created by the dirty ratios in a cpuset. Some

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Christoph Lameter
On Fri, 2 Feb 2007, Neil Brown wrote: md/raid doesn't cause any problems here. It preallocates enough to be sure that it can always make forward progress. In general the entire block layer from generic_make_request down can always successfully write a block out in a reasonable amount of

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Neil Brown
On Thursday February 1, [EMAIL PROTECTED] wrote: The network stack is of course a different (much harder) problem. An NFS solution is possible without solving the network stack issue? NFS is currently able to make more than max_dirty_ratio of memory Dirty/Writeback without being

Re: [RFC 0/8] Cpuset aware writeback

2007-02-01 Thread Andrew Morton
On Thu, 1 Feb 2007 21:29:06 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: On Thu, 1 Feb 2007, Andrew Morton wrote: Peter Zijlstra addressed the NFS issue. Did he? Are you yet in a position to confirm that? He provided a solution to fix the congestion issue in NFS. I

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Christoph Lameter
On Wed, 17 Jan 2007, Andrew Morton wrote: > > The problem there is that we do a GFP_ATOMIC allocation (no allocation > > context) that may fail when the first page is dirtied. We must therefore > > be able to subsequently allocate the nodemask_t in set_page_dirty(). > > Otherwise the first

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Andrew Morton
> On Wed, 17 Jan 2007 17:10:25 -0800 (PST) Christoph Lameter <[EMAIL > PROTECTED]> wrote: > On Wed, 17 Jan 2007, Andrew Morton wrote: > > > > The inode lock is not taken when the page is dirtied. > > > > The inode_lock is taken when the address_space's first page is dirtied. It > > is > >

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Christoph Lameter
On Wed, 17 Jan 2007, Andrew Morton wrote: > > The inode lock is not taken when the page is dirtied. > > The inode_lock is taken when the address_space's first page is dirtied. It is > also taken when the address_space's last dirty page is cleaned. So the place > where the inode is added to and

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Andrew Morton
> On Wed, 17 Jan 2007 11:43:42 -0800 (PST) Christoph Lameter <[EMAIL > PROTECTED]> wrote: > On Tue, 16 Jan 2007, Andrew Morton wrote: > > > Do what blockdevs do: limit the number of in-flight requests (Peter's > > recent patch seems to be doing that for us) (perhaps only when PF_MEMALLOC > > is

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: > Do what blockdevs do: limit the number of in-flight requests (Peter's > recent patch seems to be doing that for us) (perhaps only when PF_MEMALLOC > is in effect, to keep Trond happy) and implement a mempool for the NFS > request critical store.

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Andrew Morton
> On Wed, 17 Jan 2007 00:01:58 -0800 Paul Jackson <[EMAIL PROTECTED]> wrote: > Andrew wrote: > > - consider going off-cpuset for critical allocations. > > We do ... in mm/page_alloc.c: > > * This is the last chance, in general, before the goto nopage. > * Ignore cpuset if

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Paul Jackson
Andrew wrote: > - consider going off-cpuset for critical allocations. We do ... in mm/page_alloc.c: * This is the last chance, in general, before the goto nopage. * Ignore cpuset if GFP_ATOMIC (!wait) rather than fail alloc. * See also cpuset_zone_allowed() comment in

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Paul Jackson
Andrew wrote: - consider going off-cpuset for critical allocations. We do ... in mm/page_alloc.c: * This is the last chance, in general, before the goto nopage. * Ignore cpuset if GFP_ATOMIC (!wait) rather than fail alloc. * See also cpuset_zone_allowed() comment in

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Andrew Morton
On Wed, 17 Jan 2007 00:01:58 -0800 Paul Jackson [EMAIL PROTECTED] wrote: Andrew wrote: - consider going off-cpuset for critical allocations. We do ... in mm/page_alloc.c: * This is the last chance, in general, before the goto nopage. * Ignore cpuset if GFP_ATOMIC

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: Do what blockdevs do: limit the number of in-flight requests (Peter's recent patch seems to be doing that for us) (perhaps only when PF_MEMALLOC is in effect, to keep Trond happy) and implement a mempool for the NFS request critical store.

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Andrew Morton
On Wed, 17 Jan 2007 11:43:42 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: On Tue, 16 Jan 2007, Andrew Morton wrote: Do what blockdevs do: limit the number of in-flight requests (Peter's recent patch seems to be doing that for us) (perhaps only when PF_MEMALLOC is in effect,

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Christoph Lameter
On Wed, 17 Jan 2007, Andrew Morton wrote: The inode lock is not taken when the page is dirtied. The inode_lock is taken when the address_space's first page is dirtied. It is also taken when the address_space's last dirty page is cleaned. So the place where the inode is added to and

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Andrew Morton
On Wed, 17 Jan 2007 17:10:25 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: On Wed, 17 Jan 2007, Andrew Morton wrote: The inode lock is not taken when the page is dirtied. The inode_lock is taken when the address_space's first page is dirtied. It is also taken when the

Re: [RFC 0/8] Cpuset aware writeback

2007-01-17 Thread Christoph Lameter
On Wed, 17 Jan 2007, Andrew Morton wrote: The problem there is that we do a GFP_ATOMIC allocation (no allocation context) that may fail when the first page is dirtied. We must therefore be able to subsequently allocate the nodemask_t in set_page_dirty(). Otherwise the first failure

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
> On Tue, 16 Jan 2007 22:27:36 -0800 (PST) Christoph Lameter <[EMAIL > PROTECTED]> wrote: > On Tue, 16 Jan 2007, Andrew Morton wrote: > > > > Yes this is the result of the hierarchical nature of cpusets which already > > > causes issues with the scheduler. It is rather typical that cpusets are

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: > > Yes this is the result of the hierarchical nature of cpusets which already > > causes issues with the scheduler. It is rather typical that cpusets are > > used to partition the memory and cpus. Overlapping cpusets seem to have > > mainly an

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
> On Tue, 16 Jan 2007 19:40:17 -0800 (PST) Christoph Lameter <[EMAIL > PROTECTED]> wrote: > On Tue, 16 Jan 2007, Andrew Morton wrote: > > > Consider: non-exclusive cpuset A consists of mems 0-15, non-exclusive > > cpuset B consists of mems 0-3. A task running in cpuset A can freely dirty > >

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Paul Jackson
> Yes this is the result of the hierarchical nature of cpusets which already > causes issues with the scheduler. It is rather typical that cpusets are > used to partition the memory and cpus. Overlapping cpusets seem to have > mainly an administrative function. Paul? The heavy weight tasks,

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: > Consider: non-exclusive cpuset A consists of mems 0-15, non-exclusive > cpuset B consists of mems 0-3. A task running in cpuset A can freely dirty > all of cpuset B's memory. A task running in cpuset B gets oomkilled. > > Consider: a 32-node machine

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
> On Tue, 16 Jan 2007 17:30:26 -0800 (PST) Christoph Lameter <[EMAIL > PROTECTED]> wrote: > > Nope. You've completely omitted the little fact that we'll do writeback in > > the offending zone off the LRU. Slower, maybe. But it should work and the > > system should recover. If it's not doing

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: > Nope. You've completely omitted the little fact that we'll do writeback in > the offending zone off the LRU. Slower, maybe. But it should work and the > system should recover. If it's not doing that (it isn't) then we should > fix it rather than

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
> On Tue, 16 Jan 2007 16:16:30 -0800 (PST) Christoph Lameter <[EMAIL > PROTECTED]> wrote: > On Tue, 16 Jan 2007, Andrew Morton wrote: > > > It's a workaround for a still-unfixed NFS problem. > > No, it's doing proper throttling. Without this patchset there will be *no* > writeback and throttling at

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: > It's a workaround for a still-unfixed NFS problem. No, it's doing proper throttling. Without this patchset there will be *no* writeback and throttling at all. F.e. let's say we have 20 nodes of 1G each and a cpuset that only spans one node. Then a process

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread David Chinner
On Tue, Jan 16, 2007 at 01:53:25PM -0800, Andrew Morton wrote: > > On Mon, 15 Jan 2007 21:47:43 -0800 (PST) Christoph Lameter > > <[EMAIL PROTECTED]> wrote: > > > > Currently cpusets are not able to do proper writeback since dirty ratio > > calculations and writeback are all done for the system as

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
> On Tue, 16 Jan 2007 14:15:56 -0800 (PST) Christoph Lameter <[EMAIL > PROTECTED]> wrote: > > ... > > > > This may result in a large percentage of a cpuset > > > to become dirty without writeout being triggered. Under NFS > > > this can lead to OOM conditions. > > > > OK, a big question: is this

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Wed, 17 Jan 2007, Andi Kleen wrote: > > Secondly we modify the dirty limit calculation to be based > > on the active cpuset. > > The global dirty limit definitely seems to be a problem > in several cases, but my feeling is that the cpuset is the wrong unit > to keep track of it. Most likely

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: > > On Mon, 15 Jan 2007 21:47:43 -0800 (PST) Christoph Lameter <[EMAIL > > PROTECTED]> wrote: > > > > Currently cpusets are not able to do proper writeback since > > dirty ratio calculations and writeback are all done for the system > > as a whole. > >

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andi Kleen
> Secondly we modify the dirty limit calculation to be based > on the active cpuset. The global dirty limit definitely seems to be a problem in several cases, but my feeling is that the cpuset is the wrong unit to keep track of it. Most likely it should be more fine grained. > If we are in a

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
> On Mon, 15 Jan 2007 21:47:43 -0800 (PST) Christoph Lameter <[EMAIL > PROTECTED]> wrote: > > Currently cpusets are not able to do proper writeback since > dirty ratio calculations and writeback are all done for the system > as a whole. We _do_ do proper writeback. But it's less efficient than

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Peter Zijlstra wrote: > > B. We add a new counter NR_UNRECLAIMABLE that is subtracted > >from the available pages in a node. This allows us to > >accurately calculate the dirty ratio even if large portions > >of the node have been allocated for huge pages or for >

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Paul Jackson wrote: > > 1. The nodemask expands the inode structure significantly if the > > architecture allows a high number of nodes. This is only an issue > > for IA64. > > Should that logic be disabled if HOTPLUG is configured on? Or is > nr_node_ids a valid upper

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Paul Jackson
Christoph wrote: > Currently cpusets are not able to do proper writeback since > dirty ratio calculations and writeback are all done for the system > as a whole. Thanks for tackling this - it is sorely needed. I'm afraid my review will be mostly cosmetic; I'm not competent to comment on the

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Paul Jackson
Christoph wrote: Currently cpusets are not able to do proper writeback since dirty ratio calculations and writeback are all done for the system as a whole. Thanks for tackling this - it is sorely needed. I'm afraid my review will be mostly cosmetic; I'm not competent to comment on the really

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Paul Jackson wrote: 1. The nodemask expands the inode structure significantly if the architecture allows a high number of nodes. This is only an issue for IA64. Should that logic be disabled if HOTPLUG is configured on? Or is nr_node_ids a valid upper limit on

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Peter Zijlstra wrote: B. We add a new counter NR_UNRECLAIMABLE that is subtracted from the available pages in a node. This allows us to accurately calculate the dirty ratio even if large portions of the node have been allocated for huge pages or for slab

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
On Mon, 15 Jan 2007 21:47:43 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: Currently cpusets are not able to do proper writeback since dirty ratio calculations and writeback are all done for the system as a whole. We _do_ do proper writeback. But it's less efficient than it might

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andi Kleen
Secondly we modify the dirty limit calculation to be based on the active cpuset. The global dirty limit definitely seems to be a problem in several cases, but my feeling is that the cpuset is the wrong unit to keep track of it. Most likely it should be more fine grained. If we are in a

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: On Mon, 15 Jan 2007 21:47:43 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: Currently cpusets are not able to do proper writeback since dirty ratio calculations and writeback are all done for the system as a whole. We _do_ do

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Wed, 17 Jan 2007, Andi Kleen wrote: Secondly we modify the dirty limit calculation to be based on the active cpuset. The global dirty limit definitely seems to be a problem in several cases, but my feeling is that the cpuset is the wrong unit to keep track of it. Most likely it

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
On Tue, 16 Jan 2007 14:15:56 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: ... This may result in a large percentage of a cpuset to become dirty without writeout being triggered. Under NFS this can lead to OOM conditions. OK, a big question: is this patchset a

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread David Chinner
On Tue, Jan 16, 2007 at 01:53:25PM -0800, Andrew Morton wrote: On Mon, 15 Jan 2007 21:47:43 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: Currently cpusets are not able to do proper writeback since dirty ratio calculations and writeback are all done for the system as a whole.

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: It's a workaround for a still-unfixed NFS problem. No, it's doing proper throttling. Without this patchset there will be *no* writeback and throttling at all. F.e. let's say we have 20 nodes of 1G each and a cpuset that only spans one node. Then a process

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
On Tue, 16 Jan 2007 16:16:30 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: On Tue, 16 Jan 2007, Andrew Morton wrote: It's a workaround for a still-unfixed NFS problem. No, it's doing proper throttling. Without this patchset there will be *no* writeback and throttling at all. F.e.

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: Nope. You've completely omitted the little fact that we'll do writeback in the offending zone off the LRU. Slower, maybe. But it should work and the system should recover. If it's not doing that (it isn't) then we should fix it rather than

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
On Tue, 16 Jan 2007 17:30:26 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: Nope. You've completely omitted the little fact that we'll do writeback in the offending zone off the LRU. Slower, maybe. But it should work and the system should recover. If it's not doing that (it

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: Consider: non-exclusive cpuset A consists of mems 0-15, non-exclusive cpuset B consists of mems 0-3. A task running in cpuset A can freely dirty all of cpuset B's memory. A task running in cpuset B gets oomkilled. Consider: a 32-node machine has

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Paul Jackson
Yes this is the result of the hierarchical nature of cpusets which already causes issues with the scheduler. It is rather typical that cpusets are used to partition the memory and cpus. Overlapping cpusets seem to have mainly an administrative function. Paul? The heavy weight tasks, which

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
On Tue, 16 Jan 2007 19:40:17 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: On Tue, 16 Jan 2007, Andrew Morton wrote: Consider: non-exclusive cpuset A consists of mems 0-15, non-exclusive cpuset B consists of mems 0-3. A task running in cpuset A can freely dirty all of cpuset

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Christoph Lameter
On Tue, 16 Jan 2007, Andrew Morton wrote: Yes this is the result of the hierarchical nature of cpusets which already causes issues with the scheduler. It is rather typical that cpusets are used to partition the memory and cpus. Overlapping cpusets seem to have mainly an administrative

Re: [RFC 0/8] Cpuset aware writeback

2007-01-16 Thread Andrew Morton
On Tue, 16 Jan 2007 22:27:36 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote: On Tue, 16 Jan 2007, Andrew Morton wrote: Yes this is the result of the hierarchical nature of cpusets which already causes issues with the scheduler. It is rather typical that cpusets are used to

Re: [RFC 0/8] Cpuset aware writeback

2007-01-15 Thread Peter Zijlstra
On Mon, 2007-01-15 at 21:47 -0800, Christoph Lameter wrote: > Currently cpusets are not able to do proper writeback since > dirty ratio calculations and writeback are all done for the system > as a whole. This may result in a large percentage of a cpuset > to become dirty without writeout being

[RFC 0/8] Cpuset aware writeback

2007-01-15 Thread Christoph Lameter
Currently cpusets are not able to do proper writeback since dirty ratio calculations and writeback are all done for the system as a whole. This may result in a large percentage of a cpuset to become dirty without writeout being triggered. Under NFS this can lead to OOM conditions. Writeback will

Re: [RFC 0/8] Cpuset aware writeback

2007-01-15 Thread Peter Zijlstra
On Mon, 2007-01-15 at 21:47 -0800, Christoph Lameter wrote: Currently cpusets are not able to do proper writeback since dirty ratio calculations and writeback are all done for the system as a whole. This may result in a large percentage of a cpuset to become dirty without writeout being