Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-12 Thread Andrey Ryabinin
On 02/11/2016 11:51 PM, Andrew Morton wrote:
> On Wed, 10 Feb 2016 16:24:16 -0800 Tim Chen
> wrote:
>
>> On Wed, 2016-02-10 at 13:28 -0800, Andrew Morton wrote:
>>
>>>
>>> If a process is unmapping 4MB then it's pretty crazy for us to be
>>> hitting the percpu_counter 32 separate times for that

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-11 Thread Tim Chen
On Thu, 2016-02-11 at 12:51 -0800, Andrew Morton wrote:
> On Wed, 10 Feb 2016 16:24:16 -0800 Tim Chen
> wrote:
>
> > On Wed, 2016-02-10 at 13:28 -0800, Andrew Morton wrote:
> >
> > >
> > > If a process is unmapping 4MB then it's pretty crazy for us to be
> > > hitting the percpu_counter 32

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-11 Thread Andrew Morton
On Wed, 10 Feb 2016 16:24:16 -0800 Tim Chen wrote:
> On Wed, 2016-02-10 at 13:28 -0800, Andrew Morton wrote:
> >
> > If a process is unmapping 4MB then it's pretty crazy for us to be
> > hitting the percpu_counter 32 separate times for that single operation.
> >
> > Is there some way in

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-11 Thread Dave Hansen
On 02/11/2016 10:20 AM, Tim Chen wrote:
> The brk1 test is also somewhat pathologic. It
> does nothing but brk which is unlikely for real workload.
> So we have to be careful when we are tuning our system
> behavior for brk1 throughput. We'll need to make sure
> whatever changes we made don't

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-11 Thread Tim Chen
On Thu, 2016-02-11 at 16:54 +0300, Andrey Ryabinin wrote:
>
> On 02/11/2016 03:24 AM, Tim Chen wrote:
> > On Wed, 2016-02-10 at 13:28 -0800, Andrew Morton wrote:
> >
> >>
> >> If a process is unmapping 4MB then it's pretty crazy for us to be
> >> hitting the percpu_counter 32 separate times for

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-11 Thread Tim Chen
On Thu, 2016-02-11 at 16:36 +0300, Andrey Ryabinin wrote:
> On 02/10/2016 08:46 PM, Konstantin Khlebnikov wrote:
> > On Wed, Feb 10, 2016 at 5:52 PM, Andrey Ryabinin
> > wrote:
> >> Currently we use percpu_counter for accounting committed memory. Change
> >> of committed memory on more than

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-11 Thread Andrey Ryabinin
On 02/11/2016 03:24 AM, Tim Chen wrote:
> On Wed, 2016-02-10 at 13:28 -0800, Andrew Morton wrote:
>
>>
>> If a process is unmapping 4MB then it's pretty crazy for us to be
>> hitting the percpu_counter 32 separate times for that single operation.
>>
>> Is there some way in which we can batch up

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-11 Thread Andrey Ryabinin
On 02/10/2016 08:46 PM, Konstantin Khlebnikov wrote:
> On Wed, Feb 10, 2016 at 5:52 PM, Andrey Ryabinin
> wrote:
>> Currently we use percpu_counter for accounting committed memory. Change
>> of committed memory on more than vm_committed_as_batch pages leads to
>> grab of counter's spinlock. The

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-10 Thread Tim Chen
On Wed, 2016-02-10 at 13:28 -0800, Andrew Morton wrote:
>
> If a process is unmapping 4MB then it's pretty crazy for us to be
> hitting the percpu_counter 32 separate times for that single operation.
>
> Is there some way in which we can batch up the modifications within the
> caller and update

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-10 Thread Andrew Morton
On Wed, 10 Feb 2016 10:00:53 -0800 Tim Chen wrote:
> On Wed, 2016-02-10 at 17:52 +0300, Andrey Ryabinin wrote:
> > Currently we use percpu_counter for accounting committed memory. Change
> > of committed memory on more than vm_committed_as_batch pages leads to
> > grab of counter's spinlock. The

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-10 Thread Tim Chen
On Wed, 2016-02-10 at 17:52 +0300, Andrey Ryabinin wrote:
> Currently we use percpu_counter for accounting committed memory. Change
> of committed memory on more than vm_committed_as_batch pages leads to
> grab of counter's spinlock. The batch size is quite small - from 32 pages
> up to 0.4% of

Re: [RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-10 Thread Konstantin Khlebnikov
On Wed, Feb 10, 2016 at 5:52 PM, Andrey Ryabinin wrote:
> Currently we use percpu_counter for accounting committed memory. Change
> of committed memory on more than vm_committed_as_batch pages leads to
> grab of counter's spinlock. The batch size is quite small - from 32 pages
> up to 0.4% of the

[RFC PATCH 3/3] mm: increase scalability of global memory commitment accounting

2016-02-10 Thread Andrey Ryabinin
Currently we use percpu_counter for accounting committed memory. Change of committed memory on more than vm_committed_as_batch pages leads to grab of counter's spinlock. The batch size is quite small - from 32 pages up to 0.4% of the memory/cpu (usually several MBs even on large machines). So
