[PATCH 20/25] mm, compaction: Reduce unnecessary skipping of migration target scanner

2019-01-04 Thread Mel Gorman
and scan rates is marginal but avoiding unnecessary restarts is important. It helps later patches that are more careful about how pageblocks are treated as earlier iterations of those patches hit corner cases where the restarts were punishing and very visible. Signed-off-by: Mel Gorman --- mm

[PATCH 19/25] mm, compaction: Do not consider a need to reschedule as contention

2019-01-04 Thread Mel Gorman
94.26 ( 0.00%) 16249.30 * 20.32%* Amean fault-both-32 17450.76 ( 0.00%) 14904.71 * 14.59%* Signed-off-by: Mel Gorman --- mm/compaction.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index 1a41a2dbff24..75eb0d4

[PATCH 18/25] mm, compaction: Rework compact_should_abort as compact_check_resched

2019-01-04 Thread Mel Gorman
patches but it just makes the review slightly harder. Signed-off-by: Mel Gorman --- mm/compaction.c | 61 ++--- 1 file changed, 23 insertions(+), 38 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index be27e4fa1b40..1a41a2dbff24 100644

[PATCH 17/25] mm, compaction: Keep cached migration PFNs synced for unusable pageblocks

2019-01-04 Thread Mel Gorman
arted recently so overall the reduction in scan rates is a mere 2.8% which is borderline noise. Signed-off-by: Mel Gorman --- mm/compaction.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/mm/compaction.c b/mm/compaction.c index 921720f7a416..be27e4fa1b40 100644 ---

[PATCH 16/25] mm, compaction: Check early for huge pages encountered by the migration scanner

2019-01-04 Thread Mel Gorman
e not materially different. Signed-off-by: Mel Gorman --- mm/compaction.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index 608d274f9880..921720f7a416 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1071,6 +1071,9 @@ s

[PATCH 14/25] mm, compaction: Avoid rescanning the same pageblock multiple times

2019-01-04 Thread Mel Gorman
en in this case. When it does happen, the scan rates multiply by factors measured in the hundreds and would be misleading to present. Signed-off-by: Mel Gorman --- mm/compaction.c | 32 ++-- mm/internal.h | 1 + 2 files changed, 27 insertions(+), 6 deletions(-) di

[PATCH 15/25] mm, compaction: Finish pageblock scanning on contention

2019-01-04 Thread Mel Gorman
success rate but also by the fact that the scanners do not meet for longer when pageblocks are actually used. Overall this is justified and completing a pageblock scan is very important for later patches. Signed-off-by: Mel Gorman --- mm/compaction.c | 95

[PATCH 13/25] mm, compaction: Use free lists to quickly locate a migration target

2019-01-04 Thread Mel Gorman
35%. The 2-socket reductions for the free scanner are more dramatic which is a likely reflection that the machine has more memory. Signed-off-by: Mel Gorman --- mm/compaction.c | 203 ++-- 1 file changed, 198 insertions(+), 5 deletions(-) diff

[PATCH 11/25] mm, compaction: Use free lists to quickly locate a migration source

2019-01-04 Thread Mel Gorman
showed similar benefits. Signed-off-by: Mel Gorman --- mm/compaction.c | 179 +++- mm/internal.h | 2 + 2 files changed, 179 insertions(+), 2 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index 8f0ce44dba41..137e32e8a2f5 100644

[PATCH 12/25] mm, compaction: Keep migration source private to a single compaction instance

2019-01-04 Thread Mel Gorman
( 0.00%) 95.17 ( 5.54%) Percentage huge-32 89.72 ( 0.00%) 93.59 ( 4.32%) Compaction migrate scanned 54168306 25516488 Compaction free scanned 800530954 87603321 Migration scan rates are reduced by 52%. Signed-off-by: Mel Gorman --- mm/compaction.c | 126

[PATCH 06/25] mm, compaction: Skip pageblocks with reserved pages

2019-01-04 Thread Mel Gorman
ut it would also be considered a bug given that such a change would ruin fragmentation. On both 1-socket and 2-socket machines, scan rates are reduced slightly on workloads that intensively allocate THP while the system is fragmented. Signed-off-by: Mel Gorman --- mm/compaction.c | 16 ++

[PATCH 10/25] mm, compaction: Ignore the fragmentation avoidance boost for isolation and compaction

2019-01-04 Thread Mel Gorman
increased by less than 1% which is marginal. However, detailed tracing indicated that failures of migration due to a premature ENOMEM triggered by watermark checks were eliminated. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm

[PATCH 08/25] mm, compaction: Always finish scanning of a full pageblock

2019-01-04 Thread Mel Gorman
it is offset by future reductions in scanning. Hence, the results are not presented this time due to a misleading mix of gains/losses without any clear pattern. However, full scanning of the pageblock is important for later patches. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- mm/compact

[PATCH 09/25] mm, compaction: Use the page allocator bulk-free helper for lists of pages

2019-01-04 Thread Mel Gorman
should reduce lock contention slightly in some cases. The main benefit is removing some partially duplicated code. Signed-off-by: Mel Gorman --- include/linux/gfp.h | 7 ++- mm/compaction.c | 12 +++- mm/page_alloc.c | 10 +- 3 files changed, 18 insertions(+), 11

[PATCH 07/25] mm, migrate: Immediately fail migration of a page with no migration handler

2019-01-04 Thread Mel Gorman
.00%) 21707.05 ( 4.43%) Amean fault-both-32 21692.92 ( 0.00%) 21968.16 ( -1.27%) The 2-socket results are not materially different. Scan rates are similar as expected. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 delet

[PATCH 04/25] mm, compaction: Remove unnecessary zone parameter in some instances

2019-01-04 Thread Mel Gorman
. The change could be much deeper but this was enough to briefly clarify the flow. No functional change. Signed-off-by: Mel Gorman --- mm/compaction.c | 54 ++ 1 file changed, 26 insertions(+), 28 deletions(-) diff --git a/mm/compaction.c

[PATCH 05/25] mm, compaction: Rename map_pages to split_map_pages

2019-01-04 Thread Mel Gorman
It's non-obvious that high-order free pages are split into order-0 pages from the function name. Fix it. Signed-off-by: Mel Gorman --- mm/compaction.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index 7acb43f07303..3afa4e9
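To make the point of the rename concrete, here is a minimal userspace sketch of what "splitting" a high-order free page means: one block of 2^order pages becomes 2^order separate order-0 entries. The function name split_block and its parameters are hypothetical and only illustrate the idea; this is not the kernel's split_map_pages().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split one free block of 2^order pages, starting at start_pfn, into
 * order-0 entries. Writes at most cap PFNs into out and returns how many
 * order-0 pages the block contains. Hypothetical helper for illustration.
 */
static size_t split_block(unsigned long start_pfn, unsigned int order,
			  unsigned long *out, size_t cap)
{
	size_t nr_pages, i;

	assert(order < 8 * sizeof(size_t));	/* keep the shift defined */
	nr_pages = (size_t)1 << order;

	/* Each page of the block becomes its own order-0 entry. */
	for (i = 0; i < nr_pages && i < cap; i++)
		out[i] = start_pfn + i;

	return nr_pages;
}
```

Calling split_block(1024, 3, pfns, 8) fills pfns with the eight consecutive PFNs starting at 1024, which is the behaviour the old name map_pages did not hint at.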

[PATCH 02/25] mm, compaction: Rearrange compact_control

2019-01-04 Thread Mel Gorman
compact_control spans two cache lines with write-intensive lines on both. Rearrange so the most write-intensive fields are in the same cache line. This has a negligible impact on the overall performance of compaction and is more a tidying exercise than anything. Signed-off-by: Mel Gorman Acked
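A rough sketch of the idea, assuming a 64-byte cache line and a structure that starts on a line boundary. The structure scan_control and its fields are hypothetical stand-ins, not the real compact_control layout; the point is only that the frequently written fields are declared adjacently so they fall into one line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 64	/* assumed line size; real hardware varies */

/* Hypothetical control structure, not the kernel's compact_control. */
struct scan_control {
	/* Written on almost every scanner step: keep these adjacent. */
	unsigned long migrate_pfn;
	unsigned long free_pfn;
	unsigned int nr_migratepages;
	unsigned int nr_freepages;
	/* Mostly read after setup. */
	int order;
	int mode;
	bool whole_zone;
};

/*
 * True if a span starting at first_off and ending with a field of
 * last_len bytes at last_off fits within a single cache line.
 */
static bool same_cache_line(size_t first_off, size_t last_off, size_t last_len)
{
	assert(first_off <= last_off);
	return first_off / CACHE_LINE_BYTES ==
	       (last_off + last_len - 1) / CACHE_LINE_BYTES;
}

/* Check that the write-intensive fields of scan_control share one line. */
static bool hot_fields_share_line(void)
{
	return same_cache_line(offsetof(struct scan_control, migrate_pfn),
			       offsetof(struct scan_control, nr_freepages),
			       sizeof(unsigned int));
}
```

A check like hot_fields_share_line() can be run after any reordering of fields to confirm the hot group has not drifted across a line boundary.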

[PATCH 00/25] Increase success rates and reduce latency of compaction v2

2019-01-04 Thread Mel Gorman
This series reduces scan rates and improves success rates of compaction, primarily by using the free lists to shorten scans, better control of skip information and of whether multiple scanners can target the same block, and capturing pageblocks before they are stolen by parallel requests. The series is based o

[PATCH 01/25] mm, compaction: Shrink compact_control

2019-01-04 Thread Mel Gorman
The isolate and migrate scanners should never isolate more than a pageblock of pages so unsigned int is sufficient saving 8 bytes on a 64-bit build. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- mm/internal.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm
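The saving can be illustrated with a small userspace sketch. It assumes two page counters are narrowed from unsigned long to unsigned int; the structure and field names below are illustrative and are not copied from mm/internal.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "before" layout: page counters as unsigned long. */
struct counters_wide {
	unsigned long nr_freepages;
	unsigned long nr_migratepages;
};

/* Hypothetical "after" layout: the same counters as unsigned int. */
struct counters_narrow {
	unsigned int nr_freepages;
	unsigned int nr_migratepages;
};

/* Bytes saved by narrowing both counters. */
static size_t bytes_saved(void)
{
	/* Guard the unsigned subtraction below. */
	assert(sizeof(struct counters_wide) >= sizeof(struct counters_narrow));
	return sizeof(struct counters_wide) - sizeof(struct counters_narrow);
}
```

On a typical LP64 build, where unsigned long is 8 bytes and unsigned int is 4, two narrowed counters account for the 8 bytes mentioned in the changelog.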

[PATCH 03/25] mm, compaction: Remove last_migrated_pfn from compact_control

2019-01-04 Thread Mel Gorman
The last_migrated_pfn field is a bit dubious as to whether it really helps but either way, the information from it can be inferred without increasing the size of compact_control so remove the field. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- mm/compaction.c | 25

Re: [PATCH] mm, page_alloc: Do not wake kswapd with zone lock held

2019-01-04 Thread Mel Gorman
On Fri, Jan 04, 2019 at 09:18:38AM +0100, Vlastimil Babka wrote: > On 1/3/19 11:57 PM, Mel Gorman wrote: > > While zone->flag could have continued to be unused, there is potential > > for moving some existing fields into the flags field instead. Particularly > > re

[PATCH] mm, page_alloc: Do not wake kswapd with zone lock held

2019-01-03 Thread Mel Gorman
e zone->initialized and zone->contiguous. Reported-by: syzbot+93d94a001cfbce9e6...@syzkaller.appspotmail.com Tested-by: Qian Cai Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 6 ++ mm/page_alloc.c| 8 +++- 2 files changed, 13 insertions(+), 1 deletion(-) diff --git a/i

Re: possible deadlock in __wake_up_common_lock

2019-01-03 Thread Mel Gorman
On Thu, Jan 03, 2019 at 02:40:35PM -0500, Qian Cai wrote: > > Signed-off-by: Mel Gorman > > Tested-by: Qian Cai Thanks! -- Mel Gorman SUSE Labs

Re: possible deadlock in __wake_up_common_lock

2019-01-03 Thread Mel Gorman
isingly, unused zone flag field. The flag is read without the lock held to do the wakeup. It's possible that the flag setting context is not the same as the flag clearing context or for small races to occur. However, each race possibility is harmless and there is no visible degradation in f

Re: [RFC][PATCH v2 00/21] PMEM NUMA node and hotness accounting/migration

2019-01-03 Thread Mel Gorman
"distance" is reasonably well understood, it's not as clear to me whether distance is appropriate to describe "local-but-different-speed" memory given that accessing a remote NUMA node can saturate a single link whereas the same may not be true of local-but-different-speed memory which probably has dedicated channels. In an ideal world, application developers interested in higher-speed-memory-reserved-for-important-use and cheaper-lower-speed-memory could describe what sort of application modifications they'd be willing to do but that might be unlikely. -- Mel Gorman SUSE Labs

Re: possible deadlock in __wake_up_common_lock

2019-01-02 Thread Mel Gorman
It's not necessary to keep track of the IRQ flags as callers into that path already do things like treat IRQ disabling and the spin lock separately. 2. Use another alloc_flag in steal_suitable_fallback that is set when a wakeup is required but do the actual wakeup in rmqueue() after the zone locks are dropped and the allocation request is completed. 3. Always wake up kswapd if watermarks are boosted. I like this the least because it means doing wakeups that are unrelated to fragmentation that occurred in the current context. Any particular preference? While I recognise there is no test case available, how often does this trigger in syzbot as it would be nice to have some confirmation any patch is really fixing the problem. -- Mel Gorman SUSE Labs
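Option 2 above can be sketched in userspace as a "record under the lock, act after unlock" pattern. Everything here is a simplified assumption: the lock is modelled as a boolean, and ALLOC_WAKE_DEFERRED, zone_sim, steal_fallback() and allocate() are hypothetical names rather than the kernel's flag or functions.

```c
#include <assert.h>
#include <stdbool.h>

#define ALLOC_WAKE_DEFERRED 0x1u	/* hypothetical flag bit */

struct zone_sim {
	bool lock_held;		/* stands in for the zone spinlock */
	bool boosted;		/* a watermark boost is pending */
	int wakeups;		/* wakeups issued in total */
	int wakeups_under_lock;	/* wakeups issued while the lock was held */
};

static void wake_reclaimer(struct zone_sim *z)
{
	z->wakeups++;
	if (z->lock_held)
		z->wakeups_under_lock++;
}

/* Called with the lock held: only record that a wakeup is needed. */
static unsigned int steal_fallback(struct zone_sim *z, unsigned int alloc_flags)
{
	assert(z->lock_held);
	z->boosted = true;
	return alloc_flags | ALLOC_WAKE_DEFERRED;
}

/* Allocation path: take the lock, do the work, drop the lock, then wake. */
static void allocate(struct zone_sim *z)
{
	unsigned int alloc_flags = 0;

	z->lock_held = true;
	alloc_flags = steal_fallback(z, alloc_flags);
	z->lock_held = false;

	/* The wakeup happens only after the lock has been released. */
	if (alloc_flags & ALLOC_WAKE_DEFERRED)
		wake_reclaimer(z);
}
```

The design choice is that the code running under the lock never calls into the wakeup path, so a lock taken inside the wakeup cannot nest under the zone lock.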

Re: [PATCH] mm: compaction.c: Propagate return value upstream

2019-01-02 Thread Mel Gorman
not just ... > > Mel, Randy? You seem to have been the prime instigators on this. > Patch seems fine. Acked-by: Mel Gorman -- Mel Gorman SUSE Labs

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-22 Thread Mel Gorman
ng the skip bit to avoid picking a migration source that was previously a migration target o The exit condition for compaction is not when scanners meet but when fast_isolate_freepages cannot find any pageblock that is MIGRATE_MOVABLE && !pageblock_skip -- Mel Gorman SUSE Labs

Re: [PATCH 06/14] mm, migrate: Immediately fail migration of a page with no migration handler

2018-12-20 Thread Mel Gorman
On Thu, Dec 20, 2018 at 11:44:57AM -0800, Yang Shi wrote: > On Fri, Dec 14, 2018 at 3:03 PM Mel Gorman > wrote: > > > > Pages with no migration handler use a fallback handler which sometimes > > works and sometimes persistently fails such as blockdev pages. Migration >

Re: [PATCH 08/14] mm, compaction: Use the page allocator bulk-free helper for lists of pages

2018-12-19 Thread Mel Gorman
On Tue, Dec 18, 2018 at 10:55:31AM +0100, Vlastimil Babka wrote: > On 12/15/18 12:03 AM, Mel Gorman wrote: > > release_pages() is a simpler version of free_unref_page_list() but it > > tracks the highest PFN for caching the restart point of the compaction > > free scanner.

Re: [PATCH 09/14] mm, compaction: Ignore the fragmentation avoidance boost for isolation and compaction

2018-12-18 Thread Mel Gorman
On Tue, Dec 18, 2018 at 02:58:33PM +0100, Vlastimil Babka wrote: > On 12/18/18 2:51 PM, Mel Gorman wrote: > > On Tue, Dec 18, 2018 at 01:36:42PM +0100, Vlastimil Babka wrote: > >> On 12/15/18 12:03 AM, Mel Gorman wrote: > >>> When pageblocks get fragmented, waterma

Re: [PATCH 09/14] mm, compaction: Ignore the fragmentation avoidance boost for isolation and compaction

2018-12-18 Thread Mel Gorman
On Tue, Dec 18, 2018 at 01:36:42PM +0100, Vlastimil Babka wrote: > On 12/15/18 12:03 AM, Mel Gorman wrote: > > When pageblocks get fragmented, watermarks are artificially boosted so pages > > are reclaimed to avoid further fragmentation events. However, compaction > > is often

Re: [PATCH 06/14] mm, migrate: Immediately fail migration of a page with no migration handler

2018-12-18 Thread Mel Gorman
On Tue, Dec 18, 2018 at 10:06:31AM +0100, Vlastimil Babka wrote: > On 12/15/18 12:03 AM, Mel Gorman wrote: > > Pages with no migration handler use a fallback handler which sometimes > > works and sometimes persistently fails such as blockdev pages. Migration > > will retry

Re: [PATCH 05/14] mm, compaction: Skip pageblocks with reserved pages

2018-12-18 Thread Mel Gorman
On Tue, Dec 18, 2018 at 09:08:02AM +0100, Vlastimil Babka wrote: > On 12/15/18 12:03 AM, Mel Gorman wrote: > > Reserved pages are set at boot time, tend to be clustered and almost > > never become unreserved. When isolating pages for migrating, skip > > the entire pagebloc

Re: [PATCH 04/14] mm, compaction: Rename map_pages to split_map_pages

2018-12-17 Thread Mel Gorman
On Mon, Dec 17, 2018 at 03:06:59PM +0100, Vlastimil Babka wrote: > On 12/15/18 12:03 AM, Mel Gorman wrote: > > It's non-obvious that high-order free pages are split into order-0 > > pages from the function name. Fix it. > > That's fine, but looks like the patch h

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-14 Thread Mel Gorman
the per-zone > free_area's to determine migration targets and set a bit if it should be > considered a migration source or a migration target. If all pages for a > pageblock are not on free_areas, they are fully used. > Series has patches which implement something similar to this idea. -- Mel Gorman SUSE Labs

[PATCH 13/14] mm, compaction: Capture a page under direct compaction

2018-12-14 Thread Mel Gorman
%) 99.22 ( 3.86%) Percentage huge-32 94.94 ( 0.00%) 98.97 ( 4.25%) And scan rates are reduced Compaction migrate scanned 27634284 19002941 Compaction free scanned 55279519 46395714 Signed-off-by: Mel Gorman --- include/linux/compaction.h | 3 ++- include/linux

[PATCH 14/14] mm, compaction: Do not direct compact remote memory

2018-12-14 Thread Mel Gorman
s THP, they are forbidden at the time of writing but if __GFP_THISNODE is ever removed, then it would still be preferable to fallback to small local base pages over remote THP in the general case. kcompactd is still woken via kswapd so compaction happens eventually. Signed-off-by: Mel Gorman --

[PATCH 12/14] mm, compaction: Use free lists to quickly locate a migration target

2018-12-14 Thread Mel Gorman
0-rc6 isolmig-v1r4 findfree-v1r8 Compaction migrate scanned 25587453 27634284 Compaction free scanned 87735894 55279519 The free scan rates are reduced by 37%. Signed-off-by: Mel Gorman --- mm/compaction.c

[PATCH 11/14] mm, compaction: Keep migration source private to a single compaction instance

2018-12-14 Thread Mel Gorman
4.12%) Compaction migrate scanned 51005450 25587453 Compaction free scanned 780359464 87735894 Migration scan rates are reduced by 49%. At the time of writing, the 2-socket results are not yet available. Signed-off-by: Mel Gorman --- mm/compaction.c

[PATCH 10/14] mm, compaction: Use free lists to quickly locate a migration source

2018-12-14 Thread Mel Gorman
showing a 16% reduction in migration scanning with some mild improvements on latency. A 2-socket machine showed similar reductions of scan rates in percentage terms. Signed-off-by: Mel Gorman --- mm/compaction.c | 179 +++- mm/internal.h | 2

[PATCH 08/14] mm, compaction: Use the page allocator bulk-free helper for lists of pages

2018-12-14 Thread Mel Gorman
release_pages() is a simpler version of free_unref_page_list() but it tracks the highest PFN for caching the restart point of the compaction free scanner. This patch optionally tracks the highest PFN in the core helper and converts compaction to use it. Signed-off-by: Mel Gorman --- include
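A minimal sketch of "optionally tracks the highest PFN in the core helper", assuming pages are represented by plain PFN values. The helper free_pfn_list() and its signature are hypothetical; they are not the interface of free_unref_page_list().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical bulk-free helper. "Frees" n pages identified by PFN and,
 * when highest_pfn is non-NULL, raises *highest_pfn to the largest PFN
 * seen. Returns the number of pages processed.
 */
static size_t free_pfn_list(const unsigned long *pfns, size_t n,
			    unsigned long *highest_pfn)
{
	size_t i;

	assert(pfns != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* Tracking is optional: callers that do not care pass NULL. */
		if (highest_pfn && pfns[i] > *highest_pfn)
			*highest_pfn = pfns[i];
		/* A real helper would return the page to the allocator here. */
	}
	return n;
}
```

A compaction-style caller passes a pointer so it can cache a restart point for its free scanner; every other caller passes NULL and pays only for a pointer test.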

[PATCH 03/14] mm, compaction: Remove last_migrated_pfn from compact_control

2018-12-14 Thread Mel Gorman
The last_migrated_pfn field is a bit dubious as to whether it really helps but either way, the information from it can be inferred without increasing the size of compact_control so remove the field. Signed-off-by: Mel Gorman --- mm/compaction.c | 25 + mm/internal.h

[PATCH 07/14] mm, compaction: Always finish scanning of a full pageblock

2018-12-14 Thread Mel Gorman
f the pageblock and sometimes it is offset by future reductions in scanning. Hence, the results are not presented this time as it's a mix of gains/losses without any clear pattern. However, completing scanning of the pageblock is important for later patches. Signed-off-by: Mel Gorman --- mm/co

[PATCH 04/14] mm, compaction: Rename map_pages to split_map_pages

2018-12-14 Thread Mel Gorman
It's non-obvious that high-order free pages are split into order-0 pages from the function name. Fix it. Signed-off-by: Mel Gorman --- mm/compaction.c | 60 - 1 file changed, 29 insertions(+), 31 deletions(-) diff --git a/mm/compact

[PATCH 02/14] mm, compaction: Rearrange compact_control

2018-12-14 Thread Mel Gorman
compact_control spans two cache lines with write-intensive lines on both. Rearrange so the most write-intensive fields are in the same cache line. This has a negligible impact on the overall performance of compaction and is more a tidying exercise than anything. Signed-off-by: Mel Gorman --- mm

[PATCH 09/14] mm, compaction: Ignore the fragmentation avoidance boost for isolation and compaction

2018-12-14 Thread Mel Gorman
sensitive to timing and whether the boost was active or not. However, detailed tracing indicated that failures of migration due to a premature ENOMEM triggered by watermark checks were eliminated. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion

[PATCH 05/14] mm, compaction: Skip pageblocks with reserved pages

2018-12-14 Thread Mel Gorman
76.43 ( 0.00%) 1052.64 * 10.52%* Compaction migrate scanned 3860713 3294284 Compaction free scanned 613786341 433423502 Kcompactd migrate scanned 408711 291915 Kcompactd free scanned 242509759 217164988 Signed-off-by: Mel Gorman --- mm/compaction.

[PATCH 01/14] mm, compaction: Shrink compact_control

2018-12-14 Thread Mel Gorman
The isolate and migrate scanners should never isolate more than a pageblock of pages so unsigned int is sufficient saving 8 bytes on a 64-bit build. Signed-off-by: Mel Gorman --- mm/internal.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/internal.h b/mm/internal.h

[RFC PATCH 00/14] Increase success rates and reduce latency of compaction v1

2018-12-14 Thread Mel Gorman
This is a very preliminary RFC. I'm posting this early as the __GFP_THISNODE discussion continues and has started looking at the compaction implementation and it'd be worth looking at this series first. The cc list is based on that discussion just to make them aware it exists. A v2 will have a sign

[PATCH 06/14] mm, migrate: Immediately fail migration of a page with no migration handler

2018-12-14 Thread Mel Gorman
( 4.62%) Amean fault-both-32 22461.41 ( 0.00%) 21415.35 ( 4.66%) The 2-socket results are not materially different. Scan rates are similar as expected. Signed-off-by: Mel Gorman --- mm/migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/migrate.c b

Re: [RFC/RFT][PATCH v6] cpuidle: New timer events oriented governor for tickless systems

2018-12-07 Thread Mel Gorman
ts as a regular user, but > they seem to want to modify: > > /sys/kernel/mm/transparent_hugepage/enabled > Red herring in this case. Even if transparent hugepages are left as the default, it still tries to write it stupidly. An irritating, but harmless bug. -- Mel Gorman SUSE Labs

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-05 Thread Mel Gorman
On Wed, Dec 05, 2018 at 10:08:56AM +0100, Michal Hocko wrote: > On Tue 04-12-18 16:47:23, David Rientjes wrote: > > On Tue, 4 Dec 2018, Mel Gorman wrote: > > > > > What should also be kept in mind is that we should avoid conflating > > > locality preferences with

Re: [patch 0/2 for-4.20] mm, thp: fix remote access and allocation regressions

2018-12-05 Thread Mel Gorman
ot. It affects the level of work the system does as well as the overall success rate of operations (be it reclaim, THP allocation, compaction, whatever). This is why a reproduction case that is representative of the problem you're facing on the real workload would have been helpful: any alternative proposal could then have taken your workload into account during testing. -- Mel Gorman SUSE Labs

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-05 Thread Mel Gorman
On Tue, Dec 04, 2018 at 10:45:58AM +0000, Mel Gorman wrote: > I have *one* result of the series on a 1-socket machine running > "thpscale". It creates a file, punches holes in it to create a > very light form of fragmentation and then tries THP allocations > using madvise

Re: [PATCH 5/5] mm: Stall movable allocations until kswapd progresses during serious external fragmentation event

2018-12-05 Thread Mel Gorman
but probably worthwhile > > for long-term allocation success rates. It is possible to eliminate > > fragmentation events entirely with tuning due to this patch although that > > would require careful evaluation to determine if it's worthwhile. > > > > Signed-off-

Re: [LKP] [mm] ac5b2c1891: vm-scalability.throughput -61.3% regression

2018-12-04 Thread Mel Gorman
check first if we can defragment the memory or > whether it makes sense to free pages in case the defragmentation is > expected to help afterwards. It seemed better to put this special case > out of the main reclaim/compaction retry-with-increasing-priority loop > for non-costly-order allocations that in general can't fail. > Again, this is accurate. Scanning/compaction costs a lot. This has improved over time, but minimally it's unmapping pages, copying data and a bunch of TLB flushes. During migration, any access to the data being migrated stalls. The harm of reclaiming a little first so that the compaction is more likely to succeed incurred fewer stalls of small magnitude in general -- or at least it was the case when that behaviour was developed. -- Mel Gorman SUSE Labs

Re: [PATCH 5/5] mm: Stall movable allocations until kswapd progresses during serious external fragmentation event

2018-11-27 Thread Mel Gorman
but probably worthwhile > > for long-term allocation success rates. It is possible to eliminate > > fragmentation events entirely with tuning due to this patch although that > > would require careful evaluation to determine if it's worthwhile. > > > > Signed-off-

Re: Hackbench pipes regression bisected to PSI

2018-11-26 Thread Mel Gorman
ndicated it would) and that disabling PSI by default is reasonably close in terms of performance for this particular workload on this particular machine so; Tested-by: Mel Gorman Thanks! -- Mel Gorman SUSE Labs

Re: Hackbench pipes regression bisected to PSI

2018-11-26 Thread Mel Gorman
On Mon, Nov 26, 2018 at 12:32:18PM -0500, Johannes Weiner wrote: > On Mon, Nov 26, 2018 at 04:54:47PM +0000, Mel Gorman wrote: > > On Mon, Nov 26, 2018 at 11:07:24AM -0500, Johannes Weiner wrote: > > > @@ -509,6 +509,15 @@ config PSI > > > > > > Sa

Re: Hackbench pipes regression bisected to PSI

2018-11-26 Thread Mel Gorman
On Mon, Nov 26, 2018 at 11:07:24AM -0500, Johannes Weiner wrote: > Hi Mel, > > On Mon, Nov 26, 2018 at 01:34:20PM +0000, Mel Gorman wrote: > > Hi Johannes, > > > > PSI is a great idea but it does have overhead and if enabled by Kconfig > > then it incurs a hit

[PATCH] mm: Use alloc_flags to record if kswapd can wake -fix

2018-11-26 Thread Mel Gorman
Vlastimil Babka correctly pointed out that the ALLOC_KSWAPD flag needs to be applied in the !CONFIG_ZONE_DMA32 case. This is a fix for the mmotm patch mm-use-alloc_flags-to-record-if-kswapd-can-wake.patch Signed-off-by: Mel Gorman --- mm/page_alloc.c | 10 ++ 1 file changed, 2 insertions

Hackbench pipes regression bisected to PSI

2018-11-26 Thread Mel Gorman
ect bad 505802a53510e54ad5fbbd655a68893df83bfb91 # bad: [2ce7135adc9ad081aa3c49744144376ac74fea60] psi: cgroup support git bisect bad 2ce7135adc9ad081aa3c49744144376ac74fea60 # first bad commit: [2ce7135adc9ad081aa3c49744144376ac74fea60] psi: cgroup support -- Mel Gorman SUSE Labs

[PATCH 3/5] mm: Use alloc_flags to record if kswapd can wake

2018-11-23 Thread Mel Gorman
claimed that this has nothing to do with ALLOC_NO_FRAGMENT. That's true in this patch but is not true later so it's done now for easier review to show where the flag needs to be recorded. No functional change. Signed-off-by: Mel Gorman --- mm/internal.h | 1 + mm/page_al

[PATCH 4/5] mm: Reclaim small amounts of memory when an external fragmentation event occurs

2018-11-23 Thread Mel Gorman
ong-term allocation success rate would be higher. Signed-off-by: Mel Gorman --- Documentation/sysctl/vm.txt | 21 +++ include/linux/mm.h | 1 + include/linux/mmzone.h | 11 ++-- kernel/sysctl.c | 8 +++ mm/page_alloc.c | 43 +- mm/vms

[PATCH 2/5] mm: Move zone watermark accesses behind an accessor

2018-11-23 Thread Mel Gorman
This is a preparation patch only, no functional change. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- include/linux/mmzone.h | 9 + mm/compaction.c| 2 +- mm/page_alloc.c| 12 ++-- 3 files changed, 12 insertions(+), 11 deletions(-) diff --git a

[PATCH 1/5] mm, page_alloc: Spread allocations across zones before introducing fragmentation

2018-11-23 Thread Mel Gorman
tart on node 0 or not for this patch but the relevance is reduced later in the series. Overall, the patch reduces the number of external fragmentation causing events so the success of THP over long periods of time would be improved for this adverse workload.

[PATCH 0/5] Fragmentation avoidance improvements v5

2018-11-23 Thread Mel Gorman
There are some big changes due to both Vlastimil's review feedback on v4 and some oddities spotted while answering his review. In some respects, the series is slightly less effective but the approach is more consistent and logical overall. The overhead is also lower from the first patch and stalls

[PATCH 5/5] mm: Stall movable allocations until kswapd progresses during serious external fragmentation event

2018-11-23 Thread Mel Gorman
alls can be enough for kswapd to catch up. How much that helps is variable but probably worthwhile for long-term allocation success rates. It is possible to eliminate fragmentation events entirely with tuning due to this patch although that would require careful evaluation to determine if it's

Re: [PATCH 4/4] mm: Stall movable allocations until kswapd progresses during serious external fragmentation event

2018-11-22 Thread Mel Gorman
On Thu, Nov 22, 2018 at 06:02:10PM +0100, Vlastimil Babka wrote: > On 11/21/18 11:14 AM, Mel Gorman wrote: > > An event that potentially causes external fragmentation problems has > > already been described but there are degrees of severity. A "serious" > > event

Re: [PATCH 3/4] mm: Reclaim small amounts of memory when an external fragmentation event occurs

2018-11-22 Thread Mel Gorman
check gfp_flags I'm afraid, and that doesn't seem worth the trouble. Indeed. While it works in some cases, it'll be full of holes and while I could close them, it just turns into a subtle mess. I've prepared a preparation patch that encodes __GFP_KSWAPD_RECLAIM in alloc_flags and checks based on that. It's a lot cleaner overall, it's less of a mess than passing gfp_flags all the way through for one test and there are fewer side-effects. Thanks! -- Mel Gorman SUSE Labs
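The approach of encoding the reclaim permission once, instead of passing gfp_flags down for a single test, might look roughly like the sketch below. The bit values and the names record_kswapd() and wake_if_allowed() are assumptions made for this example and do not match the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define GFP_KSWAPD_RECLAIM_BIT	0x400u
#define ALLOC_KSWAPD_BIT	0x200u

/* Translate the caller's permission into the internal flags once, up front. */
static unsigned int record_kswapd(unsigned int gfp_mask, unsigned int alloc_flags)
{
	if (gfp_mask & GFP_KSWAPD_RECLAIM_BIT)
		alloc_flags |= ALLOC_KSWAPD_BIT;
	return alloc_flags;
}

/* Deep in the allocator only alloc_flags is consulted, never gfp_mask. */
static void wake_if_allowed(unsigned int alloc_flags, int *wakeups)
{
	assert(wakeups != NULL);
	if (alloc_flags & ALLOC_KSWAPD_BIT)
		(*wakeups)++;
}
```

With this shape, functions several calls below the entry point need no extra gfp parameter, which is the reduction in side-effects the message refers to.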

Re: [PATCH 3/4] mm: Reclaim small amounts of memory when an external fragmentation event occurs

2018-11-22 Thread Mel Gorman
ns without __GFP_KSWAPD_RECLAIM? But returning 0 here means > actually allowing the allocation go through steal_suitable_fallback()? > So should it return ALLOC_NOFRAGMENT below, or was the intent different? > I want to avoid waking kswapd in steal_suitable_fallback if waking kswapd is not allowed. If the calling context does not allow it, it does mean that fragmentation will be allowed to occur. I'm banking on it being a relatively rare case but potentially it'll be problematic. The main source of allocation requests that I expect to hit this are THP and as they are already at pageblock_order, it has limited impact from a fragmentation perspective -- particularly as pageblock_order stealing is allowed even with ALLOC_NOFRAGMENT. -- Mel Gorman SUSE Labs

Re: [PATCH 1/4] mm, page_alloc: Spread allocations across zones before introducing fragmentation

2018-11-21 Thread Mel Gorman
zoneref *z = ac->preferred_zoneref; > > struct zone *zone; > > struct pglist_data *last_pgdat_dirty_limit = NULL; > > + bool no_fallback; > > > > +retry: > > Ugh, I think 'z = ac->preferred_zoneref' should be moved here under > retry. AFAICS without that, the preference of local node to > fragmentation avoidance doesn't work? > Yup, you're right! In the event of fragmentation of both normal and dma32 zone, it doesn't restart on the local node and instead falls over to the remote node prematurely. This is obviously not desirable. I'll fix it and thanks for spotting it. -- Mel Gorman SUSE Labs
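The effect of resetting the cursor under the retry label can be shown with a simplified userspace model. The zone list is a plain array, and zone_entry, pick_zone() and the fields below are hypothetical; the real zonelist iteration is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

struct zone_entry {
	int node;		/* NUMA node the zone belongs to */
	bool would_fragment;	/* allocating here would mix pageblock types */
	bool has_free;		/* zone can satisfy the request */
};

/* Returns the index of the chosen zone, or -1 if none can allocate. */
static int pick_zone(const struct zone_entry *zones, int nr, int preferred)
{
	bool no_fallback = true;
	int z;

	assert(preferred >= 0 && preferred < nr);
retry:
	z = preferred;	/* reset the cursor on every pass */
	for (; z < nr; z++) {
		if (no_fallback) {
			/* Prefer fragmenting locally over going remote. */
			if (zones[z].node != zones[preferred].node) {
				no_fallback = false;
				goto retry;
			}
			if (zones[z].would_fragment)
				continue;
		}
		if (zones[z].has_free)
			return z;
	}
	if (no_fallback) {
		no_fallback = false;
		goto retry;
	}
	return -1;
}
```

With two local zones that would both fragment followed by a remote zone, this version returns index 0, the first local zone. If the assignment to z sat above the retry label instead, the second pass would continue from the remote zone, which is the premature fallback described in the reply.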

[PATCH 0/4] Fragmentation avoidance improvements v4

2018-11-21 Thread Mel Gorman
No major change from v3 really, mostly resending to see if there is any review reaction. It's rebased but a partial test indicated that the behaviour is similar to the previous baseline Changelog since v3 o Rebase to 4.20-rc3 o Remove a stupid warning from the last patch Changelog since v2 o Drop

[PATCH 2/4] mm: Move zone watermark accesses behind an accessor

2018-11-21 Thread Mel Gorman
This is a preparation patch only, no functional change. Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 9 + mm/compaction.c| 2 +- mm/page_alloc.c| 12 ++-- 3 files changed, 12 insertions(+), 11 deletions(-) diff --git a/include/linux/mmzone.h b

[PATCH 1/4] mm, page_alloc: Spread allocations across zones before introducing fragmentation

2018-11-21 Thread Mel Gorman
tch significantly reduces the number of external fragmentation causing events so the success of THP over long periods of time would be improved for this adverse workload. While there are large differences compared to how V1 behaved, this is almost entirely accounted for by ac5b2c18911f ("mm:

[PATCH 3/4] mm: Reclaim small amounts of memory when an external fragmentation event occurs

2018-11-21 Thread Mel Gorman
s under quite some pressure. Signed-off-by: Mel Gorman --- Documentation/sysctl/vm.txt | 19 +++ include/linux/mm.h | 1 + include/linux/mmzone.h | 11 ++-- kernel/sysctl.c | 8 +++ mm/page_alloc.c | 53 +-- mm/vms

[PATCH 4/4] mm: Stall movable allocations until kswapd progresses during serious external fragmentation event

2018-11-21 Thread Mel Gorman
limiting the fragmentation events. On the flip-side, it has been checked that setting the fragment_stall_order to 9 eliminated fragmentation events entirely. Signed-off-by: Mel Gorman --- Documentation/sysctl/vm.txt | 23 +++ include/linux/mm.h| 1 + include/linux/

Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

2018-11-10 Thread Mel Gorman
ad in advance, it might bring this the right direction and not accidentally throw Anthony down a hole working on a series that never gets ack'd. I'm not necessarily the best person to answer because my natural inclination after the fragmentation series would be to keep using thpfioscale (from the fragmentation avoidance series) and work on improving the THP allocation success rates and reduce latencies. I've tunnel vision on that for the moment. Thanks. -- Mel Gorman SUSE Labs

Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

2018-11-09 Thread Mel Gorman
On Fri, Nov 09, 2018 at 03:13:18PM +0300, Kirill A. Shutemov wrote: > On Thu, Nov 08, 2018 at 10:48:58PM -0800, Anthony Yznaga wrote: > > The basic idea as outlined by Mel Gorman in [2] is: > > > > 1) On first fault in a sufficiently sized range, allocate a huge page >

Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

2018-11-09 Thread Mel Gorman
On Thu, Nov 08, 2018 at 10:48:58PM -0800, Anthony Yznaga wrote: > The basic idea as outlined by Mel Gorman in [2] is: > > 1) On first fault in a sufficiently sized range, allocate a huge page >sized and aligned block of base pages. Map the base page >corresponding to th

Re: UBSAN: Undefined behaviour in mm/page_alloc.c

2018-11-09 Thread Mel Gorman
It's unfortunate and I know the original microoptimisation was mine but if the fast-path check ends up being a problem then I/we go back to finding ways of making the page allocator faster from a fundamental algorithmic point of view and not a microoptimisation approach. There is potential fruit there, just none that is low-hanging. -- Mel Gorman SUSE Labs

[PATCH 2/4] mm: Move zone watermark accesses behind an accessor

2018-11-08 Thread Mel Gorman
This is a preparation patch only, no functional change. Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 9 + mm/compaction.c| 2 +- mm/page_alloc.c| 12 ++-- 3 files changed, 12 insertions(+), 11 deletions(-) diff --git a/include/linux/mmzone.h b

[PATCH 4/4] mm: Stall movable allocations until kswapd progresses during serious external fragmentation event

2018-11-08 Thread Mel Gorman
limiting the fragmentation events. On the flip-side, it has been checked that setting the fragment_stall_order to 9 eliminated fragmentation events entirely. Signed-off-by: Mel Gorman --- Documentation/sysctl/vm.txt | 23 +++ include/linux/mm.h| 1 + include/linux/

[PATCH 1/4] mm, page_alloc: Spread allocations across zones before introducing fragmentation

2018-11-08 Thread Mel Gorman
tch significantly reduces the number of external fragmentation causing events so the success of THP over long periods of time would be improved for this adverse workload. While there are large differences compared to how V1 behaved, this is almost entirely accounted for by ac5b2c18911f ("mm:

[PATCH 3/4] mm: Reclaim small amounts of memory when an external fragmentation event occurs

2018-11-08 Thread Mel Gorman
s under quite some pressure. Signed-off-by: Mel Gorman --- Documentation/sysctl/vm.txt | 19 +++ include/linux/mm.h | 1 + include/linux/mmzone.h | 11 ++-- kernel/sysctl.c | 8 +++ mm/page_alloc.c | 53 +-- mm/vms

[PATCH 0/4] Fragmentation avoidance improvements v3

2018-11-08 Thread Mel Gorman
Sorry to send out a v3 so quickly. I dropped patch 5 as I'm not very happy with the approach or that it is without side-effects. I have some ideas on how it could be better achieved which can be done without delaying the other 4 patches. I've also updated patch 4 to reduce the stall timeout as lon

[PATCH 0/5] Fragmentation avoidance improvements v2

2018-11-07 Thread Mel Gorman
The 1-socket machine is different to the one used in v1 so some of the results are changed on that basis. The baseline has changed to 4.20-rc1 so the __GFP_THISNODE removal for THP is in effect which alters the behaviour on 2-socket in particular. The biggest changes are in the fourth patch, both

[PATCH 2/5] mm: Move zone watermark accesses behind an accessor

2018-11-07 Thread Mel Gorman
This is a preparation patch only, no functional change. Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 9 + mm/compaction.c| 2 +- mm/page_alloc.c| 12 ++-- 3 files changed, 12 insertions(+), 11 deletions(-) diff --git a/include/linux/mmzone.h b

[PATCH 5/5] mm: Target compaction on pageblocks that were recently fragmented

2018-11-07 Thread Mel Gorman
the case of MADV_HUGEPAGE, the allocation success rates were already high. However, it's encouraging that the THP allocation latencies were improved. Signed-off-by: Mel Gorman --- include/linux/compaction.h| 4 ++ include/linux/migrate.h | 7 +- include/linux/mmzone.h

[PATCH 3/5] mm: Reclaim small amounts of memory when an external fragmentation event occurs

2018-11-07 Thread Mel Gorman
s under quite some pressure. Signed-off-by: Mel Gorman --- Documentation/sysctl/vm.txt | 19 +++ include/linux/mm.h | 1 + include/linux/mmzone.h | 11 ++-- kernel/sysctl.c | 8 +++ mm/page_alloc.c | 53 +-- mm/vms

[PATCH 1/5] mm, page_alloc: Spread allocations across zones before introducing fragmentation

2018-11-07 Thread Mel Gorman
tch significantly reduces the number of external fragmentation causing events so the success of THP over long periods of time would be improved for this adverse workload. While there are large differences compared to how V1 behaved, this is almost entirely accounted for by ac5b2c18911f ("mm:

[PATCH 4/5] mm: Stall movable allocations until kswapd progresses during serious external fragmentation event

2018-11-07 Thread Mel Gorman
made available for analysis to see if the stall behaviour can be reduced while still limiting the fragmentation events. On the flip-side, it has been checked that setting the fragment_stall_order to 9 eliminated fragmentation events entirely. Signed-off-by: Mel Gorman --- Documentation/sysct

Re: [RFC PATCH v2 1/1] pipe: busy wait for pipe

2018-11-06 Thread Mel Gorman
On Mon, Nov 05, 2018 at 03:40:40PM -0800, Subhra Mazumdar wrote: > > On 11/5/18 2:08 AM, Mel Gorman wrote: > > Adding Al Viro as per get_maintainers.pl. > > > > On Tue, Sep 25, 2018 at 04:32:40PM -0700, subhra mazumdar wrote: > > > Introduce pipe_ll_usec field fo

Re: [RFC PATCH v2 1/1] pipe: busy wait for pipe

2018-11-05 Thread Mel Gorman
nu guesses how long it'll be in an idle state for). It's not really my area but I feel that this patch is a benchmark-specific hack and that tuning it on a system-wide basis will be a game of "win some, lose some" that is never used in practice. Worse, it might end up in a tuning guide as "always set this sysctl" without considering the capabilities of the machine or the workload and fall victim to cargo cult tuning. -- Mel Gorman SUSE Labs

Re: [PATCH 3/5] mm: Reclaim small amounts of memory when an external fragmentation event occurs

2018-10-31 Thread Mel Gorman
On Wed, Oct 31, 2018 at 04:06:43PM +, Mel Gorman wrote: > An external fragmentation event was previously described as > > When the page allocator fragments memory, it records the event using > the mm_page_alloc_extfrag event. If the fallback_order is smaller > t

[PATCH 1/5] mm, page_alloc: Spread allocations across zones before introducing fragmentation

2018-10-31 Thread Mel Gorman
er of external fragmentation causing events so the success of THP over long periods of time would be improved for this adverse workload. Signed-off-by: Mel Gorman --- mm/internal.h | 13 +--- mm/page_alloc.c | 101 ++-- 2 files changed,

[PATCH 5/5] mm: Target compaction on pageblocks that were recently fragmented

2018-10-31 Thread Mel Gorman
hemselves can increase fragmentation pressure. This is less an obvious universal win. It does control fragmentation better to some extent in that pageblocks can be found faster in some cases but the nature of the workload makes it less clear-cut. Signed-off-by: Mel Gorman --- include/linux/compaction.h

[PATCH 4/5] mm: Stall movable allocations until kswapd progresses during serious external fragmentation event

2018-10-31 Thread Mel Gorman
stall_order to 9 eliminated fragmentation events entirely on the 1-socket machine and by 99.71% on the 2-socket machine. Signed-off-by: Mel Gorman --- Documentation/sysctl/vm.txt | 23 +++ include/linux/mm.h | 1 + include/linux/mmzone.h | 2 ++ kernel/sysctl.c
