[PATCH 0/7] Improve hugepage allocation success rates under load V5

2012-08-17, Mel Gorman
Andrew, the biggest change here is that I've reshuffled the patches to
simplify merging. Please consider picking up patches 2 and 3 and merging
them for 3.6 as they fix a broken commit merged in 3.6-rc1. The rest of
the patches can be merged later.

Changelog since V4
o Rebase to latest linux-next/akpm
o Reshuffle patches for easier merging

Changelog since V3
o Add patch to backoff compaction in the event of lock contention
o Rebase to mmotm, cope with the removal of __GFP_NO_KSWAPD
o Removed RFC

Changelog since V2
o Capture !MIGRATE_MOVABLE pages where possible
o Document the treatment of MIGRATE_MOVABLE pages while capturing
o Expand changelogs

Changelog since V1
o Dropped kswapd-related patch, basically a no-op and regresses if fixed (minchan)
o Expanded changelogs a little

Allocation success rates have been far lower since 3.4 due to commit
[fe2c2a10: vmscan: reclaim at order 0 when compaction is enabled]. This
commit was introduced for good reasons and it was known in advance that
the success rates would suffer but it was justified on the grounds that
the high allocation success rates were achieved by aggressive reclaim.
Success rates are expected to suffer even more in 3.6 due to commit
[7db8889a: mm: have order > 0 compaction start off where it left] which
testing has shown to severely reduce allocation success rates under load -
to 0% in one case.

This series aims to improve the allocation success rates without regressing
the benefits of commit fe2c2a10. The series is based on latest mmotm and
takes into account that the __GFP_NO_KSWAPD flag is going away.

Patch 1 reverts the __GFP_NO_KSWAPD patch and related fixes. This is so
patches 2 and 3 can be merged before 3.6 is released. The patch is
reintroduced later in the series.

Patch 2 fixes the upstream commit [7db8889a: mm: have order > 0 compaction
start off where it left] so that compaction is enabled again.
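
As a rough sketch of the mechanism being fixed (a userspace toy model with
invented names, not the kernel code): commit 7db8889a caches the position
the compaction scanner reached so that a later attempt resumes there
instead of rescanning from the start of the zone.

/* Toy model: resume from a cached scanner position, falling back to
 * the zone start when the cached value is stale or out of range.
 */
#include <stdio.h>

static unsigned long cached_pfn;        /* persists across attempts */

static unsigned long scan_start(unsigned long zone_start,
                                unsigned long zone_end)
{
        unsigned long start = cached_pfn;

        if (start < zone_start || start >= zone_end)
                start = zone_start;     /* stale cache: rescan the zone */
        return start;
}

int main(void)
{
        cached_pfn = 4096;      /* a previous attempt stopped here */
        printf("resume at pfn %lu\n", scan_start(0, 32768));

        cached_pfn = 65536;     /* stale/out-of-range cached position */
        printf("resume at pfn %lu\n", scan_start(0, 32768));
        return 0;
}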

Patch 3 identifies when compaction is taking too long due to lock
contention and aborts. This addresses the performance problem for Jim
Schutt that commit 7db8889a was meant to fix.
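
The shape of that backoff heuristic, modelled in userspace
(lock_is_contended() and resched_pending() are invented stand-ins for the
kernel's spin_is_contended() and need_resched(); this is a sketch of the
idea, not the actual patch):

#include <stdbool.h>
#include <stdio.h>

static bool lock_is_contended(void) { return false; }
static bool resched_pending(void)   { return false; }

enum compact_result { COMPACT_CONTINUE, COMPACT_ABORT_CONTENDED };

static enum compact_result scan_block(unsigned long pfn)
{
        (void)pfn;      /* a unit of compaction work would go here */

        /* Back off rather than spin if someone else wants the lock
         * or the CPU. */
        if (lock_is_contended() || resched_pending())
                return COMPACT_ABORT_CONTENDED;
        return COMPACT_CONTINUE;
}

int main(void)
{
        for (unsigned long pfn = 0; pfn < 1024; pfn++) {
                if (scan_block(pfn) == COMPACT_ABORT_CONTENDED) {
                        puts("compaction aborted: lock contended");
                        return 0;
                }
        }
        puts("scan completed without backing off");
        return 0;
}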

Patch 4 is a comment fix.

Patch 5 is a rebased version of the __GFP_NO_KSWAPD patch with one change
in how it handles deferred_compaction.

Patch 6 updates reclaim/compaction to reclaim pages scaled on the number
of recent failures.
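
To give a sense of what the scaling means, here is a minimal userspace
model; the base target and the cap on the shift are assumptions for
illustration, not the kernel's actual constants:

#include <stdio.h>

#define MAX_DEFER_SHIFT 6       /* assumed cap on the scaling */

static unsigned long pages_for_compaction(unsigned int order,
                                          unsigned int defer_shift)
{
        unsigned long target = 1UL << order;    /* the allocation size */

        if (defer_shift > MAX_DEFER_SHIFT)
                defer_shift = MAX_DEFER_SHIFT;

        /* Each recent failure doubles how much is reclaimed up front. */
        return target << defer_shift;
}

int main(void)
{
        for (unsigned int fails = 0; fails <= 7; fails++)
                printf("order-9 target after %u failures: %lu pages\n",
                       fails, pages_for_compaction(9, fails));
        return 0;
}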

Patch 7 captures suitable high-order pages freed by compaction to reduce
races with parallel allocation requests.
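
The capture idea in miniature (struct capture_control and the function
names are invented for this sketch; the real patch threads a capture
pointer through the allocator and compaction paths):

#include <stddef.h>
#include <stdio.h>

struct page { int order; };

struct capture_control {
        int order;              /* order the allocator is waiting for */
        struct page *captured;  /* set when a suitable block is freed */
};

/* Called where compaction would otherwise return a merged high-order
 * block to the shared free lists, where a parallel allocation request
 * could steal it before the compacting thread gets to allocate it.
 */
static int try_capture(struct page *page, struct capture_control *cc)
{
        if (cc && !cc->captured && page->order >= cc->order) {
                cc->captured = page;    /* hand straight to the requester */
                return 1;
        }
        return 0;       /* no capturer: back to the free lists */
}

int main(void)
{
        struct page huge = { .order = 9 };
        struct capture_control cc = { .order = 9, .captured = NULL };

        if (try_capture(&huge, &cc))
                printf("captured an order-%d page directly\n",
                       cc.captured->order);
        return 0;
}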

I tested with a high order allocation stress test. The following kernels
were tested.

revert-v5       linux-next/mmotm based on 3.6-rc2 with patch 1 applied
contended-v5    patches 1-3 applied
capture-v5      patches 1-7 applied

STRESS-HIGHALLOC
                      revert-v5    contended-v5      capture-v5
Pass 1            0.00 ( 0.00%)  38.00 (38.00%)  45.00 (45.00%)
Pass 2            0.00 ( 0.00%)  46.00 (46.00%)  52.00 (52.00%)
while Rested     85.00 ( 0.00%)  86.00 ( 1.00%)  86.00 ( 1.00%)

From
http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__stress-highalloc-performance-ext3/hydra/comparison.html
I know that the allocation success rate in 3.3.6 was 78%, compared to
36% in the current akpm tree. At present the success rate is completely
shot, but with patches 2 and 3 applied it goes back up to 38%, which is
what I would like to see merged for 3.6. With the full series applied,
success rates go up to 45% with some variability in the results. This is
not as high a success rate as older kernels achieved, but the series does
not reclaim excessively, which is the key point.

MMTests Statistics: vmstat
                 revert-v5  contended-v5  capture-v5
Page Ins           2889316       2904472     3037020
Page Outs          8042076       8030516     8026740
Swap Ins                 0             0           0
Swap Outs                0             0           0

Note that swap in/out rates remain at 0. In 3.3.6 with 78% success rates
there were 71881 pages swapped out.

                         revert-v5  contended-v5  capture-v5
Direct pages scanned         16822        126135       39297
Kswapd pages scanned       1112284       1243865     1534553
Kswapd pages reclaimed     1106913       1203069     1499877
Direct pages reclaimed       16822        113769       26457
Kswapd efficiency              99%           96%         97%
Kswapd velocity            899.586       980.634    1218.131
Direct efficiency             100%           90%         67%
Direct velocity             13.605        99.442      31.194
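
(For reference, assuming mmtests' usual definitions: efficiency is pages
reclaimed as a percentage of pages scanned, and velocity is pages scanned
per second of test runtime. For revert-v5, Kswapd efficiency works out as
1106913 / 1112284 ~= 99%, and a velocity of 899.586 pages/sec over 1112284
scanned pages implies a test duration of roughly 1112284 / 899.586 ~= 1236
seconds.)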

Kswapd velocity increased slightly, but that is expected as __GFP_NO_KSWAPD
is removed by the full series. The velocity with the full series applied is
1218 pages/sec, whereas in kernel 3.3.6, with its high allocation success
rates, it was 8140 pages/second. Direct velocity is slightly higher, but
this is expected as a result of patch 6. Pushing direct reclaim higher
would improve success rates further, but only at the cost of the excessive
reclaim this series is trying to avoid.
