[PATCH 10/10] mm/page_alloc: Embed per_cpu_pages locking within the per-cpu structure

2021-04-19 Thread Mel Gorman
nel configurations, local_lock_t is empty and no storage is required. By embedding the lock, the memory consumption on PREEMPT_RT and CONFIG_DEBUG_LOCK_ALLOC is higher. Suggested-by: Peter Zijlstra Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 31 - mm/page_a
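
For context when skimming the archive: on !PREEMPT_RT kernels without CONFIG_DEBUG_LOCK_ALLOC, local_lock_t is a zero-size placeholder, which is why embedding it costs nothing there. A minimal, hedged sketch of the shape of the change (struct and field names are illustrative stand-ins, not the actual diff):

    #include <linux/local_lock.h>
    #include <linux/percpu-defs.h>

    /*
     * Sketch only: a trimmed-down stand-in for struct per_cpu_pages with the
     * lock embedded in the per-cpu structure itself rather than living in a
     * separate per-cpu "pagesets" variable.  On !PREEMPT_RT without lockdep,
     * sizeof(local_lock_t) is 0, so the structure does not grow.
     */
    struct pcp_sketch {
            local_lock_t lock;      /* protects everything below on this CPU */
            int count;              /* pages currently on the per-cpu lists */
            int high;               /* drain threshold */
            int batch;              /* chunk size for buddy add/remove */
    };

    static DEFINE_PER_CPU(struct pcp_sketch, pcp_sketch) = {
            .lock = INIT_LOCAL_LOCK(lock),
    };

The trade-off noted in the changelog runs the other way: with PREEMPT_RT or CONFIG_DEBUG_LOCK_ALLOC the lock has real storage, so every per-cpu pageset gets bigger.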

[PATCH 09/10] mm/page_alloc: Update PGFREE outside the zone lock in __free_pages_ok

2021-04-19 Thread Mel Gorman
VM events do not need explicit protection by disabling IRQs so update the counter with IRQs enabled in __free_pages_ok. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- mm/page_alloc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm
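
A hedged sketch of the idea (simplified control flow, invented function name, not the actual hunk): PGFREE is a per-cpu event counter, so it can be bumped after zone->lock is dropped and IRQs are re-enabled.

    #include <linux/mmzone.h>
    #include <linux/spinlock.h>
    #include <linux/vmstat.h>

    /* Illustrative only; mirrors the shape of __free_pages_ok() after the patch. */
    static void free_pages_ok_sketch(struct zone *zone, struct page *page,
                                     unsigned long pfn, unsigned int order)
    {
            unsigned long flags;

            spin_lock_irqsave(&zone->lock, flags);
            /* ... __free_one_page(page, pfn, zone, order, ...) goes here ... */
            spin_unlock_irqrestore(&zone->lock, flags);

            /* Moved outside the lock: VM event counters tolerate the race. */
            __count_vm_events(PGFREE, 1 << order);
    }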

[PATCH 08/10] mm/page_alloc: Avoid conflating IRQs disabled with zone->lock

2021-04-19 Thread Mel Gorman
operation. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- mm/page_alloc.c | 68 ++--- 1 file changed, 42 insertions(+), 26 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c6e8da942905..a9c1282d9c7b 100644 --- a/mm/page_alloc.c

[PATCH 07/10] mm/page_alloc: Explicitly acquire the zone lock in __free_pages_ok

2021-04-19 Thread Mel Gorman
. This patch explicitly acquires the lock with spin_lock_irqsave instead of relying on a helper. This removes the last instance of local_irq_save() in page_alloc.c. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- mm/page_alloc.c | 16 1 file changed, 8 insertions(+), 8 deletions
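
The underlying pattern, as a hedged before/after sketch (not the real code; the helper being removed handled the IRQ state implicitly):

    #include <linux/irqflags.h>
    #include <linux/mmzone.h>
    #include <linux/spinlock.h>

    static void zone_lock_contrast_sketch(struct zone *zone)
    {
            unsigned long flags;

            /* Before: IRQ disabling and the zone lock acquired separately. */
            local_irq_save(flags);
            spin_lock(&zone->lock);
            spin_unlock(&zone->lock);
            local_irq_restore(flags);

            /* After: one explicit operation that documents what is protected. */
            spin_lock_irqsave(&zone->lock, flags);
            spin_unlock_irqrestore(&zone->lock, flags);
    }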

[PATCH 06/10] mm/page_alloc: Reduce duration that IRQs are disabled for VM counters

2021-04-19 Thread Mel Gorman
called with IRQs disabled. While this could be moved out, it's not free on all architectures as some require IRQs to be disabled for mod_zone_page_state on !PREEMPT_RT kernels. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- mm/page_alloc.c | 12 ++-- 1 file changed, 6

[PATCH 05/10] mm/page_alloc: Batch the accounting updates in the bulk allocator

2021-04-19 Thread Mel Gorman
Now that the zone_statistics are simple counters that do not require special protection, the bulk allocator accounting updates can be batch updated without adding too much complexity with protected RMW updates or using xchg. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- include/linux
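
Roughly what "batch updated" means here, as a hedged sketch (loop and names simplified; the real patch also batches the NUMA zone_statistics() update in the same way):

    #include <linux/list.h>
    #include <linux/mm.h>
    #include <linux/vmstat.h>

    /* Illustrative only: count pages as they are handed out, account once. */
    static int bulk_account_sketch(struct zone *zone, struct list_head *pcp_list,
                                   int nr_pages, struct list_head *out)
    {
            int nr_account = 0;

            while (nr_account < nr_pages && !list_empty(pcp_list)) {
                    struct page *page = list_first_entry(pcp_list, struct page, lru);

                    list_move(&page->lru, out);
                    nr_account++;   /* no per-page vmstat update here */
            }

            /* One batched update instead of nr_account individual RMWs. */
            __count_zid_vm_events(PGALLOC, zone_idx(zone), nr_account);
            return nr_account;
    }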

[PATCH 04/10] mm/vmstat: Inline NUMA event counter updates

2021-04-19 Thread Mel Gorman
__count_numa_event is small enough to be treated similarly to __count_vm_event so inline it. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- include/linux/vmstat.h | 10 +- mm/vmstat.c| 9 - 2 files changed, 9 insertions(+), 10 deletions(-) diff --git
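
The inlined form is tiny, which is the argument for moving it into the header; a sketch of the shape (type and field names follow the series — per_cpu_zonestat, vm_numa_event — treat them as assumptions of this sketch):

    #include <linux/mmzone.h>
    #include <linux/percpu.h>

    static inline void count_numa_event_sketch(struct zone *zone,
                                               enum numa_stat_item item)
    {
            struct per_cpu_zonestat __percpu *pzstats = zone->per_cpu_zonestats;

            raw_cpu_inc(pzstats->vm_numa_event[item]);
    }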

[PATCH 03/10] mm/vmstat: Convert NUMA statistics to basic NUMA counters

2021-04-19 Thread Mel Gorman
at the node level to save space but it would have a user-visible impact due to /proc/zoneinfo. Signed-off-by: Mel Gorman --- drivers/base/node.c| 18 ++-- include/linux/mmzone.h | 13 ++- include/linux/vmstat.h | 43 +- mm/mempolicy.c | 2 +- mm/page_alloc.c| 12

[PATCH 02/10] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-04-19 Thread Mel Gorman
in the series. [l...@intel.com: Make pagesets static] Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 2 ++ mm/page_alloc.c| 50 +- 2 files changed, 37 insertions(+), 15 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index
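
The shape of the conversion, as a hedged sketch (the per-cpu structure name mirrors the "pagesets" variable mentioned above; bodies are illustrative): the lock documents exactly what is protected, and on PREEMPT_RT it can become a real per-CPU lock instead of relying on disabled IRQs.

    #include <linux/local_lock.h>
    #include <linux/percpu-defs.h>

    struct pagesets_sketch {
            local_lock_t lock;
    };
    static DEFINE_PER_CPU(struct pagesets_sketch, pagesets_sketch) = {
            .lock = INIT_LOCAL_LOCK(lock),
    };

    static void pcp_list_op_sketch(void)
    {
            unsigned long flags;

            /* Replaces a bare local_irq_save()/local_irq_restore() pair. */
            local_lock_irqsave(&pagesets_sketch.lock, flags);
            /* ... manipulate this CPU's per-cpu page lists here ... */
            local_unlock_irqrestore(&pagesets_sketch.lock, flags);
    }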

[PATCH 01/10] mm/page_alloc: Split per cpu page lists and zone stats

2021-04-19 Thread Mel Gorman
...@intel.com: Check struct per_cpu_zonestat has a non-zero size] [vba...@suse.cz: Init zone->per_cpu_zonestats properly] Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 18 include/linux/vmstat.h | 8 ++-- mm/page_alloc.c| 85 - mm/vmsta

[PATCH 00/10 v4] Use local_lock for pcp protection and reduce stat overhead

2021-04-19 Thread Mel Gorman
Some Acks from RT people are still missing that I'd like to have before trying to merge this via Andrew's tree and there is an open question whether the last patch in this series is worthwhile. It embeds local_lock within the per_cpu_pages structure to clarify the scope but it increases

[tip: sched/core] sched/numa: Allow runtime enabling/disabling of NUMA balance without SCHED_DEBUG

2021-04-16 Thread tip-bot2 for Mel Gorman
The following commit has been merged into the sched/core branch of tip: Commit-ID: b7cc6ec744b307db59568c654a8904a5928aa855 Gitweb: https://git.kernel.org/tip/b7cc6ec744b307db59568c654a8904a5928aa855 Author:Mel Gorman AuthorDate:Wed, 24 Mar 2021 13:39:16 Committer

Re: [PATCH 11/11] mm/page_alloc: Embed per_cpu_pages locking within the per-cpu structure

2021-04-15 Thread Mel Gorman
On Thu, Apr 15, 2021 at 04:53:46PM +0200, Vlastimil Babka wrote: > On 4/14/21 3:39 PM, Mel Gorman wrote: > > struct per_cpu_pages is protected by the pagesets lock but it can be > > embedded within struct per_cpu_pages at a minor cost. This is possible > > because per-

Re: [PATCH 09/11] mm/page_alloc: Avoid conflating IRQs disabled with zone->lock

2021-04-15 Thread Mel Gorman
+ + set_page_private(page, pfn); } local_lock_irqsave(, flags); -- Mel Gorman SUSE Labs

Re: [PATCH 04/11] mm/vmstat: Convert NUMA statistics to basic NUMA counters

2021-04-15 Thread Mel Gorman
On Wed, Apr 14, 2021 at 05:56:53PM +0200, Vlastimil Babka wrote: > On 4/14/21 5:18 PM, Mel Gorman wrote: > > On Wed, Apr 14, 2021 at 02:56:45PM +0200, Vlastimil Babka wrote: > >> So it seems that this intermediate assignment to zone counters (using > >> atomic_long

Re: [PATCH 04/11] mm/vmstat: Inline NUMA event counter updates

2021-04-15 Thread Mel Gorman
On Wed, Apr 14, 2021 at 06:26:25PM +0200, Vlastimil Babka wrote: > On 4/14/21 6:20 PM, Vlastimil Babka wrote: > > On 4/14/21 3:39 PM, Mel Gorman wrote: > >> __count_numa_event is small enough to be treated similarly to > >> __count_vm_event so inline it. > >

Re: [PATCH 07/11] mm/page_alloc: Remove duplicate checks if migratetype should be isolated

2021-04-15 Thread Mel Gorman
On Wed, Apr 14, 2021 at 07:21:42PM +0200, Vlastimil Babka wrote: > On 4/14/21 3:39 PM, Mel Gorman wrote: > > Both free_pcppages_bulk() and free_one_page() have very similar > > checks about whether a page's migratetype has changed under the > > zone lock. Use a common helper

Re: [PATCH 04/11] mm/vmstat: Convert NUMA statistics to basic NUMA counters

2021-04-14 Thread Mel Gorman
On Wed, Apr 14, 2021 at 02:56:45PM +0200, Vlastimil Babka wrote: > On 4/7/21 10:24 PM, Mel Gorman wrote: > > NUMA statistics are maintained on the zone level for hits, misses, foreign > > etc but nothing relies on them being perfectly accurate for functional > > correctness.

[PATCH 11/11] mm/page_alloc: Embed per_cpu_pages locking within the per-cpu structure

2021-04-14 Thread Mel Gorman
nel configurations, local_lock_t is empty and no storage is required. By embedding the lock, the memory consumption on PREEMPT_RT and CONFIG_DEBUG_LOCK_ALLOC is higher. Suggested-by: Peter Zijlstra Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 31 - mm/page_a

[PATCH 07/11] mm/page_alloc: Remove duplicate checks if migratetype should be isolated

2021-04-14 Thread Mel Gorman
Both free_pcppages_bulk() and free_one_page() have very similar checks about whether a page's migratetype has changed under the zone lock. Use a common helper. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 32 ++-- 1 file changed, 22 insertions(+), 10 deletions
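
A hedged sketch of what such a helper could look like (the name is hypothetical, not necessarily the one used in the patch); both free paths ask the same question under zone->lock and then refresh the migratetype with get_pfnblock_migratetype() when the answer is yes:

    #include <linux/mmzone.h>
    #include <linux/page-isolation.h>

    /* Hypothetical helper: is the cached migratetype possibly stale? */
    static bool migratetype_needs_recheck_sketch(struct zone *zone, int migratetype)
    {
            return unlikely(has_isolate_pageblock(zone) ||
                            is_migrate_isolate(migratetype));
    }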

[PATCH 06/11] mm/page_alloc: Reduce duration that IRQs are disabled for VM counters

2021-04-14 Thread Mel Gorman
called with IRQs disabled. While this could be moved out, it's not free on all architectures as some require IRQs to be disabled for mod_zone_page_state on !PREEMPT_RT kernels. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff

[PATCH 05/11] mm/page_alloc: Batch the accounting updates in the bulk allocator

2021-04-14 Thread Mel Gorman
Now that the zone_statistics are simple counters that do not require special protection, the bulk allocator accounting updates can be batch updated without adding too much complexity with protected RMW updates or using xchg. Signed-off-by: Mel Gorman --- include/linux/vmstat.h | 8 mm

[PATCH 10/11] mm/page_alloc: Update PGFREE outside the zone lock in __free_pages_ok

2021-04-14 Thread Mel Gorman
VM events do not need explicit protection by disabling IRQs so update the counter with IRQs enabled in __free_pages_ok. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index a0b210077178

[PATCH 08/11] mm/page_alloc: Explicitly acquire the zone lock in __free_pages_ok

2021-04-14 Thread Mel Gorman
. This patch explicitly acquires the lock with spin_lock_irqsave instead of relying on a helper. This removes the last instance of local_irq_save() in page_alloc.c. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/mm

[PATCH 09/11] mm/page_alloc: Avoid conflating IRQs disabled with zone->lock

2021-04-14 Thread Mel Gorman
operation. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 67 ++--- 1 file changed, 41 insertions(+), 26 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 6791e9361076..a0b210077178 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@

[PATCH 04/11] mm/vmstat: Inline NUMA event counter updates

2021-04-14 Thread Mel Gorman
__count_numa_event is small enough to be treated similarly to __count_vm_event so inline it. Signed-off-by: Mel Gorman --- include/linux/vmstat.h | 9 + mm/vmstat.c| 9 - 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/include/linux/vmstat.h b/include

[PATCH 01/11] mm/page_alloc: Split per cpu page lists and zone stats

2021-04-14 Thread Mel Gorman
...@intel.com: Check struct per_cpu_zonestat has a non-zero size] [vba...@suse.cz: Init zone->per_cpu_zonestats properly] Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 18 include/linux/vmstat.h | 8 ++-- mm/page_alloc.c| 85 - mm/vmsta

[PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-04-14 Thread Mel Gorman
in the series. [l...@intel.com: Make pagesets static] Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 2 ++ mm/page_alloc.c| 50 +- 2 files changed, 37 insertions(+), 15 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index

[PATCH 03/11] mm/vmstat: Convert NUMA statistics to basic NUMA counters

2021-04-14 Thread Mel Gorman
. There is a possibility that slight errors will be introduced but the overall trend as seen by userspace will be similar. Note that while these counters could be maintained at the node level, it would have a user-visible impact. Signed-off-by: Mel Gorman --- drivers/base/node.c| 18

[PATCH 0/11 v3] Use local_lock for pcp protection and reduce stat overhead

2021-04-14 Thread Mel Gorman
files changed, 370 insertions(+), 325 deletions(-) -- 2.26.2 Mel Gorman (11): mm/page_alloc: Split per cpu page lists and zone stats mm/page_alloc: Convert per-cpu list protection to local_lock mm/vmstat: Convert NUMA statistics to basic NUMA counters mm/vmstat: Inline NUMA event counter update

Re: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-04-13 Thread Mel Gorman
On Mon, Apr 12, 2021 at 11:47:00PM +0200, Thomas Gleixner wrote: > On Mon, Apr 12 2021 at 12:56, Mel Gorman wrote: > > On Fri, Apr 09, 2021 at 08:55:39PM +0200, Peter Zijlstra wrote: > > I'll update the changelog and comment accordingly. I'll decide later > > whethe

Re: [PATCH 01/11] mm/page_alloc: Split per cpu page lists and zone stats

2021-04-13 Thread Mel Gorman
On Mon, Apr 12, 2021 at 07:43:18PM +0200, Vlastimil Babka wrote: > On 4/7/21 10:24 PM, Mel Gorman wrote: > > @@ -6691,7 +6697,7 @@ static __meminit void zone_pcp_init(struct zone *zone) > > * relies on the ability of the linker to provide the > > * offset of a (sta

Re: [PATCH v2 resend] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove

2021-04-13 Thread Mel Gorman
On Tue, Apr 13, 2021 at 11:36:08AM +0200, Vlastimil Babka wrote: > On 4/12/21 4:08 PM, Mel Gorman wrote: > > On Mon, Apr 12, 2021 at 02:40:18PM +0200, Vlastimil Babka wrote: > >> On 4/12/21 2:08 PM, Mel Gorman wrote: > > > > the pageset structures in place would

Re: [PATCH v2 resend] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove

2021-04-12 Thread Mel Gorman
machine. Even if I used movable_zone to create a zone or numa=fake to create multiple fake nodes and zones, there were always either reserved or pinned pages preventing the full zone being removed. -- Mel Gorman SUSE Labs

Re: [RFC/PATCH] powerpc/smp: Add SD_SHARE_PKG_RESOURCES flag to MC sched-domain

2021-04-12 Thread Mel Gorman
h generation of Zen. The common pattern is that a single NUMA node can have multiple L3 caches and at one point I thought it might be reasonable to allow spillover to select a local idle CPU instead of stacking multiple tasks on a CPU sharing cache. I never got as far as thinking how it could be done in a way that multiple architectures would be happy with. -- Mel Gorman SUSE Labs

Re: [PATCH v2 resend] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove

2021-04-12 Thread Mel Gorman
On Mon, Apr 12, 2021 at 02:40:18PM +0200, Vlastimil Babka wrote: > On 4/12/21 2:08 PM, Mel Gorman wrote: > > zone_pcp_reset allegedly protects against a race with drain_pages > > using local_irq_save but this is bogus. local_irq_save only operates > > on the local CPU. If memo

[PATCH v2 resend] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove

2021-04-12 Thread Mel Gorman
. Signed-off-by: Mel Gorman --- Resending for email address correction and adding lists Changelog since v1 o Minimal fix mm/page_alloc.c | 4 1 file changed, 4 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 5e8aedb64b57..9bf0db982f14 100644 --- a/mm/page_alloc.c +++ b/mm

Re: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-04-12 Thread Mel Gorman
On Fri, Apr 09, 2021 at 08:55:39PM +0200, Peter Zijlstra wrote: > On Fri, Apr 09, 2021 at 02:32:56PM +0100, Mel Gorman wrote: > > That said, there are some curious users already. > > fs/squashfs/decompressor_multi_percpu.c looks like it always uses the > > local_lock in CPU

Re: [PATCH 2/9] mm/page_alloc: Add a bulk page allocator

2021-04-12 Thread Mel Gorman
On Mon, Apr 12, 2021 at 11:59:38AM +0100, Mel Gorman wrote: > > I don't understand this comment. Only alloc_flags_nofragment() sets this > > flag > > and we don't use it here? > > > > It's there as a reminder that there are non-obvious consequences > to ALLOC_NO

[PATCH] mm/page_alloc: Add a bulk page allocator -fix -fix -fix

2021-04-12 Thread Mel Gorman
Vlastimil Babka noted that a comment is wrong, fix it. This is the third fix to the mmotm patch mm-page_alloc-add-a-bulk-page-allocator.patch. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c

Re: [PATCH 2/9] mm/page_alloc: Add a bulk page allocator

2021-04-12 Thread Mel Gorman
ment. I'm waiting for a bug that can trivially trigger a case with a meaningful workload where the success rate is poor enough to affect latency before adding complexity. Ideally by then, the allocation paths would be unified a bit better. > > + gfp &= gfp_allowed_mask; > > + alloc_gfp = gfp; > >

Re: [RFC/PATCH] powerpc/smp: Add SD_SHARE_PKG_RESOURCES flag to MC sched-domain

2021-04-12 Thread Mel Gorman
On Mon, Apr 12, 2021 at 11:06:19AM +0100, Valentin Schneider wrote: > On 12/04/21 10:37, Mel Gorman wrote: > > On Mon, Apr 12, 2021 at 11:54:36AM +0530, Srikar Dronamraju wrote: > >> * Gautham R. Shenoy [2021-04-02 11:07:54]: > >> > >> > > >> &g

Re: [RFC/PATCH] powerpc/smp: Add SD_SHARE_PKG_RESOURCES flag to MC sched-domain

2021-04-12 Thread Mel Gorman
rch depth allows within the node with the LLC CPUs masked out. While there would be a latency hit because cache is not shared, it would still be a CPU local to memory that is idle. That would potentially be beneficial on Zen* as well without having to introduce new domains in the topology hierarchy. -- Mel Gorman SUSE Labs

Re: [PATCH] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove

2021-04-09 Thread Mel Gorman
> > > > zone_pcp_reset still needs to exist to drain the remaining vmstats or > > it'll break 5a883813845a ("memory-hotplug: fix zone stat > > mismatch"). > > Are you sure we are resetting vmstats in the hotremove. I do not see > anything like that. Maybe this was needed at the time. I will double > check. zone_pcp_reset calls drain_zonestat to apply the per-cpu vmstat deltas to the atomic per-zone and global stats. If anything, the minimal "fix" is to simply delete IRQ disable/enable on the grounds that IRQs protect nothing and assume the existing hotplug paths guarantee the PCP cannot be used after zone_pcp_enable(). That should be the case already because all the pages have been freed and there is nothing to even put into the PCPs but I worried that the PCP structure itself might still be reachable even if it's useless, which is why I freed the structures once they could not be reached via zonelists. -- Mel Gorman SUSE Labs
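
For readers unfamiliar with drain_zonestat(): a hedged sketch of what "apply the per-cpu vmstat deltas" amounts to (simplified from mm/vmstat.c, not the exact function):

    #include <linux/mmzone.h>
    #include <linux/vmstat.h>

    /* Fold any leftover per-cpu deltas into the atomic zone/global counters. */
    static void drain_zonestat_sketch(struct zone *zone, s8 *vm_stat_diff)
    {
            int i;

            for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
                    int v = vm_stat_diff[i];

                    if (!v)
                            continue;
                    vm_stat_diff[i] = 0;
                    /* Updates both the per-zone and the global totals. */
                    zone_page_state_add(v, zone, i);
            }
    }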

Re: [PATCH] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove

2021-04-09 Thread Mel Gorman
On Fri, Apr 09, 2021 at 02:48:12PM +0200, Michal Hocko wrote: > On Fri 09-04-21 14:42:58, Michal Hocko wrote: > > On Fri 09-04-21 13:09:57, Mel Gorman wrote: > > > zone_pcp_reset allegedly protects against a race with drain_pages > > > using local_irq_save but this is

Re: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-04-09 Thread Mel Gorman
On Fri, Apr 09, 2021 at 10:24:24AM +0200, Peter Zijlstra wrote: > On Fri, Apr 09, 2021 at 08:59:39AM +0100, Mel Gorman wrote: > > In the end I just gave up and kept it simple as there is no benefit to > > !PREEMPT_RT which just disables IRQs. Maybe it'll be worth considering when

[PATCH] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove

2021-04-09 Thread Mel Gorman
to zone_pcp_destroy to make it clear that the per-cpu structures are deleted when the function returns. Signed-off-by: Mel Gorman --- mm/internal.h | 2 +- mm/memory_hotplug.c | 10 +++--- mm/page_alloc.c | 22 -- 3 files changed, 24 insertions(+), 10 deletions(-) diff

Re: Problem in pfmemalloc skb handling in net/core/dev.c

2021-04-09 Thread Mel Gorman
On Fri, Apr 09, 2021 at 02:14:12AM -0700, Xie He wrote: > On Fri, Apr 9, 2021 at 1:44 AM Mel Gorman wrote: > > > > That would imply that the tap was communicating with a swap device to > > allocate a pfmemalloc skb which shouldn't happen. Furthermore, it would > &

Re: Problem in pfmemalloc skb handling in net/core/dev.c

2021-04-09 Thread Mel Gorman
On Fri, Apr 09, 2021 at 01:33:24AM -0700, Xie He wrote: > On Fri, Apr 9, 2021 at 12:30 AM Mel Gorman > wrote: > > > > Under what circumstances do you expect sk_memalloc_socks() to be false > > and skb_pfmemalloc() to be true that would cause a problem? > > For e

Re: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-04-09 Thread Mel Gorman
On Fri, Apr 09, 2021 at 08:39:45AM +0200, Peter Zijlstra wrote: > On Thu, Apr 08, 2021 at 06:42:44PM +0100, Mel Gorman wrote: > > On Thu, Apr 08, 2021 at 12:52:07PM +0200, Peter Zijlstra wrote: > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > > > > index

Re: Problem in pfmemalloc skb handling in net/core/dev.c

2021-04-09 Thread Mel Gorman
On Thu, Apr 08, 2021 at 11:52:01AM -0700, Xie He wrote: > Hi Mel Gorman, > > I may have found a problem in pfmemalloc skb handling in > net/core/dev.c. I see there are "if" conditions checking for > "sk_memalloc_socks() && skb_pfmemalloc(skb)", and when
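
For context, the check being discussed, as a hedged sketch (the helper name is invented; the real logic lives in __netif_receive_skb() in net/core/dev.c):

    #include <linux/skbuff.h>
    #include <net/sock.h>

    /*
     * True when the skb was allocated from emergency (pfmemalloc) reserves
     * while memalloc sockets exist; such skbs are only delivered to
     * SOCK_MEMALLOC users (e.g. swap over network) and dropped for the rest.
     */
    static bool skb_needs_pfmemalloc_path_sketch(const struct sk_buff *skb)
    {
            return sk_memalloc_socks() && skb_pfmemalloc(skb);
    }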

Re: [PATCH 0/11 v2] Use local_lock for pcp protection and reduce stat overhead

2021-04-08 Thread Mel Gorman
On Thu, Apr 08, 2021 at 12:56:01PM +0200, Peter Zijlstra wrote: > On Wed, Apr 07, 2021 at 09:24:12PM +0100, Mel Gorman wrote: > > Why local_lock? PREEMPT_RT considers the following sequence to be unsafe > > as documented in Documentation/locking/locktypes.rst > > >

Re: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-04-08 Thread Mel Gorman
make the allocator RT-safe in general, I realised that locking was broken and fixed it in patch 3 of this series. With that, the local_lock could potentially be embedded within per_cpu_pages safely at the end of this series. -- Mel Gorman SUSE Labs

[PATCH 11/11] mm/page_alloc: Update PGFREE outside the zone lock in __free_pages_ok

2021-04-07 Thread Mel Gorman
VM events do not need explicit protection by disabling IRQs so update the counter with IRQs enabled in __free_pages_ok. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 6d98d97b6cf5

[PATCH 10/11] mm/page_alloc: Avoid conflating IRQs disabled with zone->lock

2021-04-07 Thread Mel Gorman
operation. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 67 ++--- 1 file changed, 41 insertions(+), 26 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d94ec53367bd..6d98d97b6cf5 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@

[PATCH 09/11] mm/page_alloc: Explicitly acquire the zone lock in __free_pages_ok

2021-04-07 Thread Mel Gorman
the last instance of local_irq_save() in page_alloc.c. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 1bb5b522a0f9..d94ec53367bd 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c

[PATCH 08/11] mm/page_alloc: Remove duplicate checks if migratetype should be isolated

2021-04-07 Thread Mel Gorman
Both free_pcppages_bulk() and free_one_page() have very similar checks about whether a page's migratetype has changed under the zone lock. Use a common helper. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 32 ++-- 1 file changed, 22 insertions(+), 10 deletions

[PATCH 07/11] mm/page_alloc: Reduce duration that IRQs are disabled for VM counters

2021-04-07 Thread Mel Gorman
called with IRQs disabled. While this could be moved out, it's not free on all architectures as some require IRQs to be disabled for mod_zone_page_state on !PREEMPT_RT kernels. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff

[PATCH 06/11] mm/page_alloc: Batch the accounting updates in the bulk allocator

2021-04-07 Thread Mel Gorman
Now that the zone_statistics are simple counters that do not require special protection, the bulk allocator accounting updates can be batch updated without adding too much complexity with protected RMW updates or using xchg. Signed-off-by: Mel Gorman --- include/linux/vmstat.h | 8 mm

[PATCH 05/11] mm/vmstat: Inline NUMA event counter updates

2021-04-07 Thread Mel Gorman
__count_numa_event is small enough to be treated similarly to __count_vm_event so inline it. Signed-off-by: Mel Gorman --- include/linux/vmstat.h | 9 + mm/vmstat.c| 9 - 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/include/linux/vmstat.h b/include

[PATCH 04/11] mm/vmstat: Convert NUMA statistics to basic NUMA counters

2021-04-07 Thread Mel Gorman
. There is a possibility that slight errors will be introduced but the overall trend as seen by userspace will be similar. Note that while these counters could be maintained at the node level, it would have a user-visible impact. Signed-off-by: Mel Gorman --- drivers/base/node.c| 18

[PATCH 03/11] mm/memory_hotplug: Make unpopulated zones PCP structures unreachable during hot remove

2021-04-07 Thread Mel Gorman
to zone_pcp_destroy to make it clear that the per-cpu structures are deleted when the function returns. Signed-off-by: Mel Gorman --- mm/internal.h | 2 +- mm/memory_hotplug.c | 10 +++--- mm/page_alloc.c | 22 -- 3 files changed, 24 insertions(+), 10 deletions(-) diff

[PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-04-07 Thread Mel Gorman
to IRQ enabling/disabling. The scope of the lock is still wider than it should be but this is decreased later. [l...@intel.com: Make pagesets static] Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 2 ++ mm/page_alloc.c| 50 +- 2 files

[PATCH 0/11 v2] Use local_lock for pcp protection and reduce stat overhead

2021-04-07 Thread Mel Gorman
For MM people, the whole series is relevant but patch 3 needs particular attention for memory hotremove as I had problems testing it because full zone removal always failed for me. For RT people, the most interesting patches are 2, 9 and 10 with 2 being the most important. This series requires

[PATCH 01/11] mm/page_alloc: Split per cpu page lists and zone stats

2021-04-07 Thread Mel Gorman
...@intel.com: Check struct per_cpu_zonestat has a non-zero size] Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 18 include/linux/vmstat.h | 8 ++-- mm/page_alloc.c| 84 +++- mm/vmstat.c| 96 ++ 4

Re: [PATCH v3] sched/fair: bring back select_idle_smt, but differently

2021-04-07 Thread Mel Gorman
On Wed, Apr 07, 2021 at 12:15:13PM +0200, Peter Zijlstra wrote: > On Wed, Apr 07, 2021 at 10:41:06AM +0100, Mel Gorman wrote: > > > > --- a/kernel/sched/fair.c > > > +++ b/kernel/sched/fair.c > > > @@ -6112,6 +6112,27 @@ static int select_idle_co

Re: [PATCH v3] sched/fair: bring back select_idle_smt, but differently

2021-04-07 Thread Mel Gorman
On Wed, Apr 07, 2021 at 09:17:18AM +0200, Peter Zijlstra wrote: > Subject: sched/fair: Bring back select_idle_smt(), but differently > From: Rik van Riel > Date: Fri, 26 Mar 2021 15:19:32 -0400 > > From: Rik van Riel > > Mel Gorman did some nice work in 9fe1f127b91

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-04-07 Thread Mel Gorman
code for migration because even if it shows up a problem, it would be better to optimise the generic implementation than carry two similar implementations. I'm undecided on whether s390 should split+migrate rather than skip because I do not have a good overview of "typical workloads on s390 that benefit from NUMA balancing". -- Mel Gorman SUSE Labs

Re: [PATCH -V2] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault

2021-04-07 Thread Mel Gorman
is what this patch doing. > > > Thanks, I think this is ok for Andrew to pick up to see if anything bisects to this commit but it's a low risk. Reviewed-by: Mel Gorman More notes: This is not a universal win given that not all workloads exhibit the pattern where accesses occur in par

Re: [RFC] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault

2021-04-01 Thread Mel Gorman
On Wed, Mar 31, 2021 at 09:36:04AM -0700, Nadav Amit wrote: > > > > On Mar 31, 2021, at 6:16 AM, Mel Gorman wrote: > > > > On Wed, Mar 31, 2021 at 07:20:09PM +0800, Huang, Ying wrote: > >> Mel Gorman writes: > >> > >>> On M

Re: [PATCH 2/6] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-03-31 Thread Mel Gorman
On Wed, Mar 31, 2021 at 07:42:42PM +0200, Thomas Gleixner wrote: > On Wed, Mar 31 2021 at 12:01, Mel Gorman wrote: > > On Wed, Mar 31, 2021 at 11:55:56AM +0200, Thomas Gleixner wrote: > > @@ -887,13 +887,11 @@ void cpu_vm_stats_fold(int cpu) > > > >

Re: [RFC PATCH 0/6] mm: thp: use generic THP migration for NUMA hinting fault

2021-03-31 Thread Mel Gorman
er, it might be ok as an s390-specific workaround. (Note, I haven't read the rest of the series due to lack of time but this query caught my eye). -- Mel Gorman SUSE Labs

Re: [RFC] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault

2021-03-31 Thread Mel Gorman
On Wed, Mar 31, 2021 at 07:20:09PM +0800, Huang, Ying wrote: > Mel Gorman writes: > > > On Mon, Mar 29, 2021 at 02:26:51PM +0800, Huang Ying wrote: > >> For NUMA balancing, in hint page fault handler, the faulting page will > >> be migrated to the access

Re: [PATCH 2/6] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-03-31 Thread Mel Gorman
On Wed, Mar 31, 2021 at 11:55:56AM +0200, Thomas Gleixner wrote: > On Mon, Mar 29 2021 at 13:06, Mel Gorman wrote: > > There is a lack of clarity of what exactly local_irq_save/local_irq_restore > > protects in page_alloc.c . It conflates the protection of per-cpu page > >

Re: [RFC PATCH 0/6] Use local_lock for pcp protection and reduce stat overhead

2021-03-31 Thread Mel Gorman
Ingo, Thomas or Peter, is there any chance one of you could take a look at patch "[PATCH 2/6] mm/page_alloc: Convert per-cpu list protection to local_lock" from this series? It's partially motivated by PREEMPT_RT. More details below. On Mon, Mar 29, 2021 at 01:06:42PM +0100, Mel Go

Re: [RFC PATCH 0/6] Use local_lock for pcp protection and reduce stat overhead

2021-03-31 Thread Mel Gorman
On Tue, Mar 30, 2021 at 08:51:54PM +0200, Jesper Dangaard Brouer wrote: > On Mon, 29 Mar 2021 13:06:42 +0100 > Mel Gorman wrote: > > > This series requires patches in Andrew's tree so the series is also > > available at > > > > git://git.kernel.org/pub/scm/linux

Re: [RFC] NUMA balancing: reduce TLB flush via delaying mapping on hint page fault

2021-03-30 Thread Mel Gorman
IPI savings are enough to justify stalling parallel accesses that could be making forward progress. One nit below > Signed-off-by: "Huang, Ying" > Cc: Peter Zijlstra > Cc: Mel Gorman > Cc: Peter Xu > Cc: Johannes Weiner > Cc: Vlastimil Babka > Cc:

[PATCH] mm/page_alloc: Add a bulk page allocator -fix -fix

2021-03-30 Thread Mel Gorman
Colin Ian King reported the following problem (slightly edited) Author: Mel Gorman Date: Mon Mar 29 11:12:24 2021 +1100 mm/page_alloc: add a bulk page allocator ... Static analysis on linux-next with Coverity has found a potential

Re: mm/page_alloc: add a bulk page allocator

2021-03-30 Thread Mel Gorman
lized value should be ALLOC_WMARK_LOW. A value of 0 would be the same as ALLOC_WMARK_MIN and that would allow the bulk allocator to potentially consume too many pages without waking kswapd. I'll put together a patch shortly. Thanks Colin! -- Mel Gorman SUSE Labs

[PATCH 6/6] mm/page_alloc: Reduce duration that IRQs are disabled for VM counters

2021-03-29 Thread Mel Gorman
if ever called from an IRQ context. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 32c64839c145..25d9351e75d8 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c

[PATCH 2/6] mm/page_alloc: Convert per-cpu list protection to local_lock

2021-03-29 Thread Mel Gorman
-off-by: Mel Gorman --- include/linux/mmzone.h | 2 ++ mm/page_alloc.c| 43 -- mm/vmstat.c| 4 3 files changed, 31 insertions(+), 18 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index a4393ac27336

[PATCH 5/6] mm/page_alloc: Batch the accounting updates in the bulk allocator

2021-03-29 Thread Mel Gorman
Now that the zone_statistics are simple counters that do not require special protection, the bulk allocator accounting updates can be batch updated without requiring IRQs to be disabled. Signed-off-by: Mel Gorman --- include/linux/vmstat.h | 8 mm/page_alloc.c| 30

[PATCH 4/6] mm/vmstat: Inline NUMA event counter updates

2021-03-29 Thread Mel Gorman
__count_numa_event is small enough to be treated similarly to __count_vm_event so inline it. Signed-off-by: Mel Gorman --- include/linux/vmstat.h | 9 + mm/vmstat.c| 9 - 2 files changed, 9 insertions(+), 9 deletions(-) diff --git a/include/linux/vmstat.h b/include

[PATCH 3/6] mm/vmstat: Convert NUMA statistics to basic NUMA counters

2021-03-29 Thread Mel Gorman
to VM events. There is a possibility that slight errors will be introduced but the overall trend as seen by userspace will be similar. Note that while these counters could be maintained at the node level, it would have a user-visible impact. Signed-off-by: Mel Gorman --- drivers/base/node.c

[PATCH 1/6] mm/page_alloc: Split per cpu page lists and zone stats

2021-03-29 Thread Mel Gorman
...@intel.com: Check struct per_cpu_zonestat has a non-zero size] Signed-off-by: Mel Gorman --- include/linux/mmzone.h | 18 include/linux/vmstat.h | 8 ++-- mm/page_alloc.c| 84 +++- mm/vmstat.c| 96 ++ 4

[RFC PATCH 0/6] Use local_lock for pcp protection and reduce stat overhead

2021-03-29 Thread Mel Gorman
This series requires patches in Andrew's tree so the series is also available at git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-percpu-local_lock-v1r15 tldr: Jesper and Chuck, it would be nice to verify if this series helps the allocation rate of the bulk page allocator.

Re: [PATCH v3] sched/fair: bring back select_idle_smt, but differently

2021-03-28 Thread Mel Gorman
On Fri, Mar 26, 2021 at 03:19:32PM -0400, Rik van Riel wrote: > ---8<--- > sched,fair: bring back select_idle_smt, but differently > > Mel Gorman did some nice work in 9fe1f127b913 > ("sched/fair: Merge select_idle_core/cpu()"), resulting in the kernel > being mo

Re: [PATCH 0/9 v6] Introduce a bulk order-0 page allocator with two in-tree users

2021-03-25 Thread Mel Gorman
On Thu, Mar 25, 2021 at 03:06:57PM +0100, Uladzislau Rezki wrote: > > On Thu, Mar 25, 2021 at 12:50:01PM +, Matthew Wilcox wrote: > > > On Thu, Mar 25, 2021 at 11:42:19AM +0000, Mel Gorman wrote: > > > > This series introduces a bulk order-0 page allocator with sun

Re: [PATCH 0/9 v6] Introduce a bulk order-0 page allocator with two in-tree users

2021-03-25 Thread Mel Gorman
On Thu, Mar 25, 2021 at 12:50:01PM +, Matthew Wilcox wrote: > On Thu, Mar 25, 2021 at 11:42:19AM +0000, Mel Gorman wrote: > > This series introduces a bulk order-0 page allocator with sunrpc and > > the network page pool being the first users. The implementation is no

Re: [PATCH 4/9] mm/page_alloc: optimize code layout for __alloc_pages_bulk

2021-03-25 Thread Mel Gorman
On Thu, Mar 25, 2021 at 12:12:17PM +, Matthew Wilcox wrote: > On Thu, Mar 25, 2021 at 11:42:23AM +0000, Mel Gorman wrote: > > > > - if (WARN_ON_ONCE(nr_pages <= 0)) > > + if (unlikely(nr_pages <= 0)) > > return 0; > > If we made nr_page

Re: [PATCH 2/9] mm/page_alloc: Add a bulk page allocator

2021-03-25 Thread Mel Gorman
On Thu, Mar 25, 2021 at 12:05:25PM +, Matthew Wilcox wrote: > On Thu, Mar 25, 2021 at 11:42:21AM +0000, Mel Gorman wrote: > > +int __alloc_pages_bulk(gfp_t gfp, int preferred_nid, > > + nodemask_t *nodemask, int nr_pages, > > +

Re: [RFC] mm: activate access-more-than-once page via NUMA balancing

2021-03-25 Thread Mel Gorman
to migrate the hot private pages first? > I'm not sure how the hotness of pages could be ranked. At the time of a hinting fault, the page is by definition active now because it has been accessed. Prioritising which pages to migrate based on the number of faults that have been trapped would require storing that information somewhere. -- Mel Gorman SUSE Labs

[PATCH 9/9] net: page_pool: use alloc_pages_bulk in refill code path

2021-03-25 Thread Mel Gorman
by: Jesper Dangaard Brouer Signed-off-by: Mel Gorman --- include/net/page_pool.h | 2 +- net/core/page_pool.c| 82 - 2 files changed, 57 insertions(+), 27 deletions(-) diff --git a/include/net/page_pool.h b/include/net/page_pool.h index b5b195305346..6d

[PATCH 8/9] net: page_pool: refactor dma_map into own function page_pool_dma_map

2021-03-25 Thread Mel Gorman
-by: Mel Gorman --- net/core/page_pool.c | 45 +--- 1 file changed, 26 insertions(+), 19 deletions(-) diff --git a/net/core/page_pool.c b/net/core/page_pool.c index ad8b0707af04..40e1b2beaa6c 100644 --- a/net/core/page_pool.c +++ b/net/core/page_pool.c

[PATCH 6/9] SUNRPC: Set rq_page_end differently

2021-03-25 Thread Mel Gorman
_actor() was renamed nfsd_splice_actor() by commit cf8208d0eabd ("sendfile: convert nfsd to splice_direct_to_actor()"). Signed-off-by: Chuck Lever Signed-off-by: Mel Gorman --- net/sunrpc/svc_xprt.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/net/sunrpc/svc_xprt.c

[PATCH 7/9] SUNRPC: Refresh rq_pages using a bulk page allocator

2021-03-25 Thread Mel Gorman
-by: Mel Gorman --- net/sunrpc/svc_xprt.c | 31 +++ 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c index 609bda97d4ae..0c27c3291ca1 100644 --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -643,30

[PATCH 5/9] mm/page_alloc: inline __rmqueue_pcplist

2021-03-25 Thread Mel Gorman
) 30.633 ns (step:64) Signed-off-by: Jesper Dangaard Brouer Signed-off-by: Mel Gorman --- mm/page_alloc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 1ec18121268b..d900e92884b2 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c

[PATCH 4/9] mm/page_alloc: optimize code layout for __alloc_pages_bulk

2021-03-25 Thread Mel Gorman
, which confuse the I-cache prefetcher in the CPU. [mgorman: Minor changes and rebasing] Signed-off-by: Jesper Dangaard Brouer Signed-off-by: Mel Gorman --- mm/page_alloc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index

[PATCH 3/9] mm/page_alloc: Add an array-based interface to the bulk page allocator

2021-03-25 Thread Mel Gorman
storage to store the pages. Signed-off-by: Mel Gorman --- include/linux/gfp.h | 13 +++--- mm/page_alloc.c | 60 + 2 files changed, 54 insertions(+), 19 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 4a304fd39916
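
A hedged caller-side sketch of the array interface (the wrapper name alloc_pages_bulk_array() is the one that later landed upstream; treat the exact name and return convention as assumptions): the caller supplies the array, only NULL slots are populated, so a partially used array can be topped up cheaply.

    #include <linux/errno.h>
    #include <linux/gfp.h>

    /* Illustrative only: refill an array of page pointers in one call. */
    static int refill_page_array_sketch(struct page **pages, unsigned int nr)
    {
            unsigned long filled;

            filled = alloc_pages_bulk_array(GFP_KERNEL, nr, pages);

            return filled == nr ? 0 : -ENOMEM;  /* caller may retry or fall back */
    }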

[PATCH 2/9] mm/page_alloc: Add a bulk page allocator

2021-03-25 Thread Mel Gorman
is to make it available early to determine what semantics are required by different callers. Once the full semantics are nailed down, it can be refactored. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- include/linux/gfp.h | 11 + mm/page_alloc.c | 118
