Re: [PATCH v2 1/2] free_pcppages_bulk: do not hold lock when picking pages to free

2018-02-22 Thread Aaron Lu
On Thu, Feb 15, 2018 at 04:46:44AM -0800, Matthew Wilcox wrote:
> On Thu, Jan 25, 2018 at 03:21:44PM +0800, Aaron Lu wrote:
> > When freeing a batch of pages from Per-CPU-Pages (PCP) back to buddy,
> > the zone->lock is held and then pages are chosen from the PCP's
> > migratetype lists. There is actually no need to do this 'choose' part
> > under the lock: these are PCP pages, the only CPU that can touch them
> > is us, and irqs are also disabled.
> 
> I have no objection to this patch.  If you're looking for ideas for
> future improvement though, I wonder whether using a LIST_HEAD is the
> best way to store these pages temporarily.  If you batch them into a
> pagevec and then free the entire pagevec, the CPU should be a little
> faster scanning a short array than walking a linked list.

Thanks for the suggestion.

> It would also put a hard boundary on how long zone->lock is held, as
> you'd drop it and go back for another batch after 15 pages.  That might
> be bad, of course.

Yes, that's a concern.
As Mel responded in another email, I think I'll just keep using a list
here.

> 
> Another minor change I'd like to see is free_pcppages_bulk updating
> pcp->count itself; all of the callers do it currently.

Sounds good, I'll prepare a separate patch for this, thanks!
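
For reference, a minimal sketch of what that follow-up could look like,
assuming the v4.15-era signature of free_pcppages_bulk() in
mm/page_alloc.c (illustrative only, not the actual patch):

    static void free_pcppages_bulk(struct zone *zone, int count,
                                   struct per_cpu_pages *pcp)
    {
            /* ... existing selection and freeing of 'count' pages ... */

            /*
             * Update the count here once, instead of every caller
             * doing "pcp->count -= count" after the call returns.
             */
            pcp->count -= count;
    }

Callers such as drain_pages_zone() would then drop their own adjustment
of pcp->count.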


Re: [PATCH v2 1/2] free_pcppages_bulk: do not hold lock when picking pages to free

2018-02-22 Thread Aaron Lu
On Thu, Feb 15, 2018 at 12:06:08PM +, Mel Gorman wrote:
> On Thu, Jan 25, 2018 at 03:21:44PM +0800, Aaron Lu wrote:
> > When freeing a batch of pages from Per-CPU-Pages (PCP) back to buddy,
> > the zone->lock is held and then pages are chosen from the PCP's
> > migratetype lists. There is actually no need to do this 'choose' part
> > under the lock: these are PCP pages, the only CPU that can touch them
> > is us, and irqs are also disabled.
> > 
> > Moving this part outside could reduce lock hold time and improve
> > performance. Tested with will-it-scale/page_fault1 at full load:
> > 
> > kernel       Broadwell(2S)   Skylake(2S)     Broadwell(4S)   Skylake(4S)
> > v4.15-rc4    9037332         8000124         13642741        15728686
> > this patch   9608786 +6.3%   8368915 +4.6%   14042169 +2.9%  17433559 +10.8%
> > 
> > What the test does is: start $nr_cpu processes, each of which
> > repeatedly does the following for 5 minutes:
> > 1. mmap 128M of anonymous space;
> > 2. write-access that space;
> > 3. munmap it.
> > The score is the aggregated iteration count.
> > 
> > https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault1.c
> > 
> > Acked-by: Mel Gorman 
> > Signed-off-by: Aaron Lu 
> 
> It looks like this series may have gotten lost because it was embedded
> within an existing thread, or else it was due to the proximity to the
> merge window. I suggest a rebase, retest and resubmit unless there was
> some major objection that I missed. Patch 1 is fine by me at least. I
> never explicitly acked patch 2, but I've no major objection to it; I'm
> just a tad uncomfortable with prefetch magic sauce in general.

Thanks for the suggestion.
I just got back from vacation and will send it out once I've collected
all the required data.
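
For context, the per-process loop of the page_fault1 test quoted above
boils down to roughly this (simplified from the linked page_fault1.c;
error handling and the harness's iteration counting are omitted):

    #include <sys/mman.h>
    #include <unistd.h>

    #define MEMSIZE (128UL * 1024 * 1024)   /* 128M per iteration */

    static void one_iteration(void)
    {
            long pagesize = sysconf(_SC_PAGESIZE);

            /* 1: mmap 128M of anonymous space */
            char *c = mmap(NULL, MEMSIZE, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

            /* 2: write one byte per page to fault each page in */
            for (unsigned long i = 0; i < MEMSIZE; i += pagesize)
                    c[i] = 0;

            /* 3: munmap, pushing the pages back through the PCP lists */
            munmap(c, MEMSIZE);
    }

The munmap() step is what drives free_pcppages_bulk() and hence
zone->lock contention at full load.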


Re: [PATCH v2 1/2] free_pcppages_bulk: do not hold lock when picking pages to free

2018-02-15 Thread Mel Gorman
On Thu, Feb 15, 2018 at 04:46:44AM -0800, Matthew Wilcox wrote:
> On Thu, Jan 25, 2018 at 03:21:44PM +0800, Aaron Lu wrote:
> > When freeing a batch of pages from Per-CPU-Pages (PCP) back to buddy,
> > the zone->lock is held and then pages are chosen from the PCP's
> > migratetype lists. There is actually no need to do this 'choose' part
> > under the lock: these are PCP pages, the only CPU that can touch them
> > is us, and irqs are also disabled.
> 
> I have no objection to this patch.  If you're looking for ideas for
> future improvement though, I wonder whether using a LIST_HEAD is the
> best way to store these pages temporarily.  If you batch them into a
> pagevec and then free the entire pagevec, the CPU should be a little
> faster scanning a short array than walking a linked list.
> 
> It would also put a hard boundary on how long zone->lock is held, as
> you'd drop it and go back for another batch after 15 pages.  That might
> be bad, of course.
> 

It's not a guaranteed win. You're trading a list traversal for 128 bytes
of increased stack usage (unless you put a static one in the pgdat and
incur a cache miss penalty instead, or use a per-cpu pagevec and increase
memory consumption) and two spinlock acquire/releases in the common case,
which may or may not be contended. It might make more sense if the PCPs
themselves were statically sized, but that would limit tuning options and
increase the per-cpu footprint of the pcp structures.

Maybe I'm missing something obvious and it really would be a universal
win but right now I find it hard to imagine that it's a win.
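
To make the trade-off concrete, the pagevec-style variant would look
something like the sketch below. pick_next_pcp_page() is a hypothetical
helper standing in for the migratetype-list selection, and PAGEVEC_SIZE
was 15 at the time; the lock/unlock pair per 15-page batch is the extra
lock traffic being discussed:

    while (count) {
            struct page *batch[15];         /* ~128 bytes of stack */
            int n = 0;

            /* fill one batch outside the lock */
            while (count && n < 15) {
                    batch[n++] = pick_next_pcp_page(pcp);  /* hypothetical */
                    count--;
            }

            /* one acquire/release per batch, contended or not */
            spin_lock(&zone->lock);
            while (n--)
                    __free_one_page(batch[n], page_to_pfn(batch[n]),
                                    zone, 0,
                                    get_pcppage_migratetype(batch[n]));
            spin_unlock(&zone->lock);
    }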

> Another minor change I'd like to see is free_pcppages_bulk updating
> pcp->count itself; all of the callers do it currently.

That should be reasonable, it's not even particularly difficult.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH v2 1/2] free_pcppages_bulk: do not hold lock when picking pages to free

2018-02-15 Thread Matthew Wilcox
On Thu, Jan 25, 2018 at 03:21:44PM +0800, Aaron Lu wrote:
> When freeing a batch of pages from Per-CPU-Pages (PCP) back to buddy,
> the zone->lock is held and then pages are chosen from the PCP's
> migratetype lists. There is actually no need to do this 'choose' part
> under the lock: these are PCP pages, the only CPU that can touch them
> is us, and irqs are also disabled.

I have no objection to this patch.  If you're looking for ideas for
future improvement though, I wonder whether using a LIST_HEAD is the
best way to store these pages temporarily.  If you batch them into a
pagevec and then free the entire pagevec, the CPU should be a little
faster scanning a short array than walking a linked list.

It would also put a hard boundary on how long zone->lock is held, as
you'd drop it and go back for another batch after 15 pages.  That might
be bad, of course.

Another minor change I'd like to see is free_pcppages_bulk updating
pcp->count itself; all of the callers do it currently.
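
For context, a condensed sketch of the restructuring described in the
quoted changelog above. It is simplified: the real selection loop
round-robins across the migratetype lists, and this assumes
count <= pcp->count, which the callers guarantee:

    static void free_pcppages_bulk(struct zone *zone, int count,
                                   struct per_cpu_pages *pcp)
    {
            int mt = 0;
            struct page *page, *tmp;
            LIST_HEAD(head);

            /*
             * Step 1: pick pages off the PCP lists without zone->lock.
             * Safe because the lists belong to this CPU and irqs are off.
             */
            while (count) {
                    struct list_head *list = &pcp->lists[mt];

                    if (list_empty(list)) {
                            mt = (mt + 1) % MIGRATE_PCPTYPES;
                            continue;
                    }
                    page = list_last_entry(list, struct page, lru);
                    list_move(&page->lru, &head);
                    count--;
            }

            /* Step 2: only the buddy merge itself needs the lock. */
            spin_lock(&zone->lock);
            list_for_each_entry_safe(page, tmp, &head, lru)
                    __free_one_page(page, page_to_pfn(page), zone, 0,
                                    get_pcppage_migratetype(page));
            spin_unlock(&zone->lock);
    }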



Re: [PATCH v2 1/2] free_pcppages_bulk: do not hold lock when picking pages to free

2018-02-15 Thread Mel Gorman
On Thu, Jan 25, 2018 at 03:21:44PM +0800, Aaron Lu wrote:
> When freeing a batch of pages from Per-CPU-Pages (PCP) back to buddy,
> the zone->lock is held and then pages are chosen from the PCP's
> migratetype lists. There is actually no need to do this 'choose' part
> under the lock: these are PCP pages, the only CPU that can touch them
> is us, and irqs are also disabled.
> 
> Moving this part outside could reduce lock hold time and improve
> performance. Tested with will-it-scale/page_fault1 at full load:
> 
> kernel       Broadwell(2S)   Skylake(2S)     Broadwell(4S)   Skylake(4S)
> v4.15-rc4    9037332         8000124         13642741        15728686
> this patch   9608786 +6.3%   8368915 +4.6%   14042169 +2.9%  17433559 +10.8%
> 
> What the test does is: start $nr_cpu processes, each of which
> repeatedly does the following for 5 minutes:
> 1. mmap 128M of anonymous space;
> 2. write-access that space;
> 3. munmap it.
> The score is the aggregated iteration count.
> 
> https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault1.c
> 
> Acked-by: Mel Gorman 
> Signed-off-by: Aaron Lu 

It looks like this series may have gotten lost because it was embedded
within an existing thread, or else it was due to the proximity to the
merge window. I suggest a rebase, retest and resubmit unless there was
some major objection that I missed. Patch 1 is fine by me at least. I
never explicitly acked patch 2, but I've no major objection to it; I'm
just a tad uncomfortable with prefetch magic sauce in general.

-- 
Mel Gorman
SUSE Labs
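
For readers wondering about the "prefetch magic sauce": as I read
patch 2, while pages are being picked off the PCP lists (still outside
zone->lock) the order-0 buddy of each page is prefetched, so the merge
done later under the lock does not stall on those struct page cache
misses. A sketch, not the literal patch:

    static inline void prefetch_buddy(struct page *page)
    {
            unsigned long pfn = page_to_pfn(page);
            unsigned long buddy_pfn = __find_buddy_pfn(pfn, 0);
            struct page *buddy = page + (buddy_pfn - pfn);

            prefetch(buddy);
    }

Called on each page during the selection step, this warms the cache
lines that __free_one_page() will touch once zone->lock is taken.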

