nel configurations, local_lock_t is empty and no storage is
required. By embedding the lock, the memory consumption on PREEMPT_RT
and CONFIG_DEBUG_LOCK_ALLOC is higher.
Suggested-by: Peter Zijlstra
Signed-off-by: Mel Gorman
---
include/linux/mmzone.h | 31 -
mm/page_a
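For illustration, a minimal sketch of the embedding under discussion (field
names follow the mainline per-cpu structure of the time and are assumptions,
not the exact patch):

/* Sketch only: embed the local_lock_t that protects the PCP lists.
 * On !PREEMPT_RT && !CONFIG_DEBUG_LOCK_ALLOC kernels, local_lock_t is
 * an empty struct, so the embedded field adds no storage. */
struct per_cpu_pages {
	local_lock_t lock;	/* Protects the fields below */
	int count;		/* number of pages in the lists */
	int high;		/* high watermark, emptying needed */
	int batch;		/* chunk size for buddy add/remove */

	/* Lists of pages, one per migrate type stored on the pcp-lists */
	struct list_head lists[MIGRATE_PCPTYPES];
};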
VM events do not need explicit protection by disabling IRQs so
update the counter with IRQs enabled in __free_pages_ok.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/page_alloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm
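A hedged sketch of the movement described above (the body is heavily
simplified relative to the real __free_pages_ok): the PGFREE update only
needs the preemption safety that __count_vm_events() already provides, so
it can run once IRQs are enabled again.

static void __free_pages_ok(struct page *page, unsigned int order)
{
	unsigned long flags;
	unsigned long pfn = page_to_pfn(page);
	struct zone *zone = page_zone(page);
	int migratetype = get_pfnblock_migratetype(page, pfn);

	/* Zone state still needs the IRQ-disabled section... */
	spin_lock_irqsave(&zone->lock, flags);
	__free_one_page(page, pfn, zone, order, migratetype, FPI_NONE);
	spin_unlock_irqrestore(&zone->lock, flags);

	/* ...but the VM event counter does not, so update it afterwards */
	__count_vm_events(PGFREE, 1 << order);
}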
operation.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/page_alloc.c | 68 ++---
1 file changed, 42 insertions(+), 26 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c6e8da942905..a9c1282d9c7b 100644
--- a/mm/page_alloc.c
.
This patch explicitly acquires the lock with spin_lock_irqsave instead of
relying on a helper. This removes the last instance of local_irq_save()
in page_alloc.c.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/page_alloc.c | 16
1 file changed, 8 insertions(+), 8 deletions
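Roughly, the conversion swaps a pattern like the first block below for the
second (a sketch of the shape of the change, not the actual hunks):

	/* Before: IRQs disabled around a helper that takes the zone lock */
	local_irq_save(flags);
	free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
	local_irq_restore(flags);

	/* After: the zone lock itself disables IRQs for exactly its scope */
	spin_lock_irqsave(&zone->lock, flags);
	__free_one_page(page, pfn, zone, order, migratetype, FPI_NONE);
	spin_unlock_irqrestore(&zone->lock, flags);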
called with IRQs disabled. While this
could be moved out, it's not free on all architectures as some require
IRQs to be disabled for mod_zone_page_state on !PREEMPT_RT kernels.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
mm/page_alloc.c | 12 ++--
1 file changed, 6
Now that the zone_statistics are simple counters that do not require
special protection, the bulk allocator accounting updates can be batch
updated without adding too much complexity with protected RMW updates or
using xchg.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
include/linux
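A sketch of what the batching can look like once the NUMA counters are
plain per-cpu values (the count-taking signature and the
__count_numa_events() helper are assumptions based on the rest of the
series):

/* Account a whole bulk allocation in one call instead of per page */
static inline void zone_statistics(struct zone *preferred_zone,
				   struct zone *z, long nr_account)
{
#ifdef CONFIG_NUMA
	enum numa_stat_item local_stat = NUMA_LOCAL;

	if (z->node != numa_node_id())
		local_stat = NUMA_OTHER;

	if (zone_to_nid(z) == zone_to_nid(preferred_zone))
		__count_numa_events(z, NUMA_HIT, nr_account);
	else {
		__count_numa_events(z, NUMA_MISS, nr_account);
		__count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
	}
	__count_numa_events(z, local_stat, nr_account);
#endif
}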
__count_numa_event is small enough to be treated similarly to
__count_vm_event so inline it.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
include/linux/vmstat.h | 10 +-
mm/vmstat.c | 9 -
2 files changed, 9 insertions(+), 10 deletions(-)
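The inlined form plausibly mirrors __count_vm_event(); a sketch, with the
per-cpu structure name taken from elsewhere in the series:

static inline void __count_numa_event(struct zone *zone,
				      enum numa_stat_item item)
{
	struct per_cpu_zonestat __percpu *pzstats = zone->per_cpu_zonestats;

	/* A plain per-cpu increment; no IRQ or atomic protection needed */
	raw_cpu_inc(pzstats->vm_numa_event[item]);
}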
diff --git
at the node level to save space
but it would have a user-visible impact due to /proc/zoneinfo.
Signed-off-by: Mel Gorman
---
drivers/base/node.c | 18 ++--
include/linux/mmzone.h | 13 ++-
include/linux/vmstat.h | 43 +-
mm/mempolicy.c | 2 +-
mm/page_alloc.c | 12
in the series.
[l...@intel.com: Make pagesets static]
Signed-off-by: Mel Gorman
---
include/linux/mmzone.h | 2 ++
mm/page_alloc.c | 50 +-
2 files changed, 37 insertions(+), 15 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index
...@intel.com: Check struct per_cpu_zonestat has a non-zero size]
[vba...@suse.cz: Init zone->per_cpu_zonestats properly]
Signed-off-by: Mel Gorman
---
include/linux/mmzone.h | 18
include/linux/vmstat.h | 8 ++--
mm/page_alloc.c | 85 -
mm/vmsta
Some Acks from RT people are still missing that I'd like to have before
trying to merge this via Andrew's tree, and there is an open question
whether the last patch in this series is worthwhile. It embeds local_lock
within the per_cpu_pages structure to clarify the scope but it increases
The following commit has been merged into the sched/core branch of tip:
Commit-ID: b7cc6ec744b307db59568c654a8904a5928aa855
Gitweb:
https://git.kernel.org/tip/b7cc6ec744b307db59568c654a8904a5928aa855
Author: Mel Gorman
AuthorDate: Wed, 24 Mar 2021 13:39:16
Committer
On Thu, Apr 15, 2021 at 04:53:46PM +0200, Vlastimil Babka wrote:
> On 4/14/21 3:39 PM, Mel Gorman wrote:
> > struct per_cpu_pages is protected by the pagesets lock but it can be
> > embedded within struct per_cpu_pages at a minor cost. This is possible
> > because per-
+
+ set_page_private(page, pfn);
}
local_lock_irqsave(&pagesets.lock, flags);
--
Mel Gorman
SUSE Labs
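For context, the lock taken above lives in a static per-CPU container; a
sketch consistent with the "[Make pagesets static]" note elsewhere in the
thread:

struct pagesets {
	local_lock_t lock;
};
static DEFINE_PER_CPU(struct pagesets, pagesets) = {
	.lock = INIT_LOCAL_LOCK(lock),
};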
On Wed, Apr 14, 2021 at 05:56:53PM +0200, Vlastimil Babka wrote:
> On 4/14/21 5:18 PM, Mel Gorman wrote:
> > On Wed, Apr 14, 2021 at 02:56:45PM +0200, Vlastimil Babka wrote:
> >> So it seems that this intermediate assignment to zone counters (using
> >> atomic_long
On Wed, Apr 14, 2021 at 06:26:25PM +0200, Vlastimil Babka wrote:
> On 4/14/21 6:20 PM, Vlastimil Babka wrote:
> > On 4/14/21 3:39 PM, Mel Gorman wrote:
> >> __count_numa_event is small enough to be treated similarly to
> >> __count_vm_event so inline it.
> >
On Wed, Apr 14, 2021 at 07:21:42PM +0200, Vlastimil Babka wrote:
> On 4/14/21 3:39 PM, Mel Gorman wrote:
> > Both free_pcppages_bulk() and free_one_page() have very similar
> > checks about whether a page's migratetype has changed under the
> > zone lock. Use a common helper
On Wed, Apr 14, 2021 at 02:56:45PM +0200, Vlastimil Babka wrote:
> On 4/7/21 10:24 PM, Mel Gorman wrote:
> > NUMA statistics are maintained on the zone level for hits, misses, foreign
> > etc but nothing relies on them being perfectly accurate for functional
> > correctness.
Both free_pcppages_bulk() and free_one_page() have very similar
checks about whether a page's migratetype has changed under the
zone lock. Use a common helper.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 32 ++--
1 file changed, 22 insertions(+), 10 deletions
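A hypothetical shape for the shared check (the helper name is invented here
for illustration): both free paths must recheck the pageblock type under
the zone lock because page isolation can change it in the meantime.

static inline int check_migratetype(struct zone *zone, struct page *page,
				    unsigned long pfn, int migratetype)
{
	/* Recheck under zone->lock: isolation may have changed the type */
	if (unlikely(has_isolate_pageblock(zone) ||
		     is_migrate_isolate(migratetype)))
		migratetype = get_pfnblock_migratetype(page, pfn);

	return migratetype;
}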
called with IRQs disabled. While this
could be moved out, it's not free on all architectures as some require
IRQs to be disabled for mod_zone_page_state on !PREEMPT_RT kernels.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff
Now that the zone_statistics are simple counters that do not require
special protection, the bulk allocator accounting updates can be batch
updated without adding too much complexity with protected RMW updates or
using xchg.
Signed-off-by: Mel Gorman
---
include/linux/vmstat.h | 8
mm
VM events do not need explicit protection by disabling IRQs so
update the counter with IRQs enabled in __free_pages_ok.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a0b210077178
.
This patch explicitly acquires the lock with spin_lock_irqsave instead of
relying on a helper. This removes the last instance of local_irq_save()
in page_alloc.c.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 13 +
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/mm
operation.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 67 ++---
1 file changed, 41 insertions(+), 26 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6791e9361076..a0b210077178 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@
__count_numa_event is small enough to be treated similarly to
__count_vm_event so inline it.
Signed-off-by: Mel Gorman
---
include/linux/vmstat.h | 9 +
mm/vmstat.c | 9 -
2 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/linux/vmstat.h b/include
. There is a possibility that slight errors will be
introduced but the overall trend as seen by userspace will be similar.
Note that while these counters could be maintained at the node level,
it would have a user-visible impact.
Signed-off-by: Mel Gorman
---
drivers/base/node.c | 18
files changed, 370 insertions(+), 325 deletions(-)
--
2.26.2
Mel Gorman (11):
mm/page_alloc: Split per cpu page lists and zone stats
mm/page_alloc: Convert per-cpu list protection to local_lock
mm/vmstat: Convert NUMA statistics to basic NUMA counters
mm/vmstat: Inline NUMA event counter update
On Mon, Apr 12, 2021 at 11:47:00PM +0200, Thomas Gleixner wrote:
> On Mon, Apr 12 2021 at 12:56, Mel Gorman wrote:
> > On Fri, Apr 09, 2021 at 08:55:39PM +0200, Peter Zijlstra wrote:
> > I'll update the changelog and comment accordingly. I'll decide later
> > whethe
On Mon, Apr 12, 2021 at 07:43:18PM +0200, Vlastimil Babka wrote:
> On 4/7/21 10:24 PM, Mel Gorman wrote:
> > @@ -6691,7 +6697,7 @@ static __meminit void zone_pcp_init(struct zone *zone)
> > * relies on the ability of the linker to provide the
> > * offset of a (sta
On Tue, Apr 13, 2021 at 11:36:08AM +0200, Vlastimil Babka wrote:
> On 4/12/21 4:08 PM, Mel Gorman wrote:
> > On Mon, Apr 12, 2021 at 02:40:18PM +0200, Vlastimil Babka wrote:
> >> On 4/12/21 2:08 PM, Mel Gorman wrote:
> >
> > the pageset structures in place would
machine. Even if I used movable_zone to create a zone
or numa=fake to create multiple fake nodes and zones, there was always
either reserved or pinned pages preventing the full zone being removed.
--
Mel Gorman
SUSE Labs
h generation
of Zen. The common pattern is that a single NUMA node can have multiple
L3 caches and at one point I thought it might be reasonable to allow
spillover to select a local idle CPU instead of stacking multiple tasks
on a CPU sharing cache. I never got as far as thinking how it could be
done in a way that multiple architectures would be happy with.
--
Mel Gorman
SUSE Labs
On Mon, Apr 12, 2021 at 02:40:18PM +0200, Vlastimil Babka wrote:
> On 4/12/21 2:08 PM, Mel Gorman wrote:
> > zone_pcp_reset allegedly protects against a race with drain_pages
> > using local_irq_save but this is bogus. local_irq_save only operates
> > on the local CPU. If memo
.
Signed-off-by: Mel Gorman
---
Resending for email address correction and adding lists
Changelog since v1
o Minimal fix
mm/page_alloc.c | 4
1 file changed, 4 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5e8aedb64b57..9bf0db982f14 100644
--- a/mm/page_alloc.c
+++ b/mm
On Fri, Apr 09, 2021 at 08:55:39PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 09, 2021 at 02:32:56PM +0100, Mel Gorman wrote:
> > That said, there are some curious users already.
> > fs/squashfs/decompressor_multi_percpu.c looks like it always uses the
> > local_lock in CPU
On Mon, Apr 12, 2021 at 11:59:38AM +0100, Mel Gorman wrote:
> > I don't understand this comment. Only alloc_flags_nofragment() sets this
> > flag
> > and we don't use it here?
> >
>
> It's there as a reminder that there are non-obvious consequences
> to ALLOC_NO
Vlastimil Babka noted that a comment is wrong, fix it. This is the third
fix to the mmotm patch mm-page_alloc-add-a-bulk-page-allocator.patch.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
ment. I'm waiting for a bug that can trivially trigger
a case with a meaningful workload where the success rate is poor enough to
affect latency before adding complexity. Ideally by then, the allocation
paths would be unified a bit better.
> > + gfp &= gfp_allowed_mask;
> > + alloc_gfp = gfp;
> >
On Mon, Apr 12, 2021 at 11:06:19AM +0100, Valentin Schneider wrote:
> On 12/04/21 10:37, Mel Gorman wrote:
> > On Mon, Apr 12, 2021 at 11:54:36AM +0530, Srikar Dronamraju wrote:
> >> * Gautham R. Shenoy [2021-04-02 11:07:54]:
> >>
> >> >
> >> >
rch depth
allows within the node with the LLC CPUs masked out. While there would be
a latency hit because cache is not shared, it would still be a CPU local
to memory that is idle. That would potentially be beneficial on Zen*
as well without having to introduce new domains in the topology hierarchy.
--
Mel Gorman
SUSE Labs
> > > -}
> > > -
> >
> > zone_pcp_reset still needs to exist to drain the remaining vmstats or
> > it'll break 5a883813845a ("memory-hotplug: fix zone stat
> > mismatch").
>
> Are you sure we are reseting vmstats in the hotremove. I do not see
> anything like that. Maybe this was needed at the time. I will double
> check.
zone_pcp_reset calls drain_zonestat to apply the per-cpu vmstat deltas
to the atomic per-zone and global stats.
If anything, the minimal "fix" is to simply delete IRQ disable/enable on
the grounds that IRQs protect nothing and assume the existing hotplug
paths guarantee the PCP cannot be used after zone_pcp_enable(). That
should be the case already because all the pages have been freed and
there is nothing to even put into the PCPs, but I worried that the PCP
structure itself might still be reachable even if it's useless, which is
why I freed the structure once it could not be reached via zonelists.
--
Mel Gorman
SUSE Labs
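A sketch of the fold drain_zonestat performs (structure and field names
assumed): each CPU's pending deltas are added to the per-zone and global
atomics so nothing is lost when the per-cpu structures are torn down.

void drain_zonestat(struct zone *zone, struct per_cpu_zonestat *pzstats)
{
	int i;

	for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++) {
		if (pzstats->vm_stat_diff[i]) {
			int v = pzstats->vm_stat_diff[i];

			pzstats->vm_stat_diff[i] = 0;
			atomic_long_add(v, &zone->vm_stat[i]);
			atomic_long_add(v, &vm_zone_stat[i]);
		}
	}
}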
On Fri, Apr 09, 2021 at 02:48:12PM +0200, Michal Hocko wrote:
> On Fri 09-04-21 14:42:58, Michal Hocko wrote:
> > On Fri 09-04-21 13:09:57, Mel Gorman wrote:
> > > zone_pcp_reset allegedly protects against a race with drain_pages
> > > using local_irq_save but this is
On Fri, Apr 09, 2021 at 10:24:24AM +0200, Peter Zijlstra wrote:
> On Fri, Apr 09, 2021 at 08:59:39AM +0100, Mel Gorman wrote:
> > In the end I just gave up and kept it simple as there is no benefit to
> > !PREEMPT_RT which just disables IRQs. Maybe it'll be worth considering when
to zone_pcp_destroy to make it clear that the per-cpu structures
are deleted when the function returns.
Signed-off-by: Mel Gorman
---
mm/internal.h | 2 +-
mm/memory_hotplug.c | 10 +++---
mm/page_alloc.c | 22 --
3 files changed, 24 insertions(+), 10 deletions(-)
diff
On Fri, Apr 09, 2021 at 02:14:12AM -0700, Xie He wrote:
> On Fri, Apr 9, 2021 at 1:44 AM Mel Gorman wrote:
> >
> > That would imply that the tap was communicating with a swap device to
> > allocate a pfmemalloc skb which shouldn't happen. Furthermore, it would
> &
On Fri, Apr 09, 2021 at 01:33:24AM -0700, Xie He wrote:
> On Fri, Apr 9, 2021 at 12:30 AM Mel Gorman
> wrote:
> >
> > Under what circumstances do you expect sk_memalloc_socks() to be false
> > and skb_pfmemalloc() to be true that would cause a problem?
>
> For e
On Fri, Apr 09, 2021 at 08:39:45AM +0200, Peter Zijlstra wrote:
> On Thu, Apr 08, 2021 at 06:42:44PM +0100, Mel Gorman wrote:
> > On Thu, Apr 08, 2021 at 12:52:07PM +0200, Peter Zijlstra wrote:
> > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > index
On Thu, Apr 08, 2021 at 11:52:01AM -0700, Xie He wrote:
> Hi Mel Gorman,
>
> I may have found a problem in pfmemalloc skb handling in
> net/core/dev.c. I see there are "if" conditions checking for
> "sk_memalloc_socks() && skb_pfmemalloc(skb)", and when
On Thu, Apr 08, 2021 at 12:56:01PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 07, 2021 at 09:24:12PM +0100, Mel Gorman wrote:
> > Why local_lock? PREEMPT_RT considers the following sequence to be unsafe
> > as documented in Documentation/locking/locktypes.rst
> >
>
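The sequence cut off above is presumably the pattern locktypes.rst flags as
invalid: on PREEMPT_RT, spinlocks become sleeping locks and cannot be taken
inside a raw IRQ-disabled region. Sketched, with local_lock as the
replacement:

	/* Unsafe on PREEMPT_RT: spin_lock() may sleep while IRQs are off */
	local_irq_disable();
	spin_lock(&lock);

	/* RT-safe equivalent: a named scope that maps to a per-CPU
	 * spinlock on RT and to IRQ disabling on !RT (llock assumed) */
	local_lock_irq(&llock);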
make the allocator RT-safe in general, I realised
that locking was broken and fixed it in patch 3 of this series. With that,
the local_lock could potentially be embedded within per_cpu_pages safely
at the end of this series.
--
Mel Gorman
SUSE Labs
VM events do not need explicit protection by disabling IRQs so
update the counter with IRQs enabled in __free_pages_ok.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6d98d97b6cf5
operation.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 67 ++---
1 file changed, 41 insertions(+), 26 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d94ec53367bd..6d98d97b6cf5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@
the last instance of local_irq_save()
in page_alloc.c.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 13 +
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1bb5b522a0f9..d94ec53367bd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
to IRQ enabling/disabling. The scope of the
lock is still wider than it should be but this is decreased later.
[l...@intel.com: Make pagesets static]
Signed-off-by: Mel Gorman
---
include/linux/mmzone.h | 2 ++
mm/page_alloc.c | 50 +-
2 files
For MM people, the whole series is relevant but patch 3 needs particular
attention for memory hotremove as I had problems testing it because full
zone removal always failed for me. For RT people, the most interesting
patches are 2, 9 and 10 with 2 being the most important.
This series requires
...@intel.com: Check struct per_cpu_zonestat has a non-zero size]
Signed-off-by: Mel Gorman
---
include/linux/mmzone.h | 18
include/linux/vmstat.h | 8 ++--
mm/page_alloc.c | 84 +++-
mm/vmstat.c | 96 ++
4
On Wed, Apr 07, 2021 at 12:15:13PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 07, 2021 at 10:41:06AM +0100, Mel Gorman wrote:
>
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -6112,6 +6112,27 @@ static int select_idle_co
On Wed, Apr 07, 2021 at 09:17:18AM +0200, Peter Zijlstra wrote:
> Subject: sched/fair: Bring back select_idle_smt(), but differently
> From: Rik van Riel
> Date: Fri, 26 Mar 2021 15:19:32 -0400
>
> From: Rik van Riel
>
> Mel Gorman did some nice work in 9fe1f127b91
code for migration because even if it shows up a
problem, it would be better to optimise the generic implementation than
carry two similar implementations. I'm undecided on whether s390 should
split+migrate rather than skip because I do not have a good overview of
"typical workloads on s390 that benefit from NUMA balancing".
--
Mel Gorman
SUSE Labs
is what this patch is doing.
>
>
>
Thanks, I think this is ok for Andrew to pick up to see if anything
bisects to this commit but it's a low risk.
Reviewed-by: Mel Gorman
More notes;
This is not a universal win given that not all workloads exhibit the
pattern where accesses occur in par
On Wed, Mar 31, 2021 at 09:36:04AM -0700, Nadav Amit wrote:
>
>
> > On Mar 31, 2021, at 6:16 AM, Mel Gorman wrote:
> >
> > On Wed, Mar 31, 2021 at 07:20:09PM +0800, Huang, Ying wrote:
> >> Mel Gorman writes:
> >>
> >>> On M
On Wed, Mar 31, 2021 at 07:42:42PM +0200, Thomas Gleixner wrote:
> On Wed, Mar 31 2021 at 12:01, Mel Gorman wrote:
> > On Wed, Mar 31, 2021 at 11:55:56AM +0200, Thomas Gleixner wrote:
> > @@ -887,13 +887,11 @@ void cpu_vm_stats_fold(int cpu)
> >
> >
er, it might be ok as an s390-specific workaround.
(Note, I haven't read the rest of the series due to lack of time but this
query caught my eye).
--
Mel Gorman
SUSE Labs
On Wed, Mar 31, 2021 at 07:20:09PM +0800, Huang, Ying wrote:
> Mel Gorman writes:
>
> > On Mon, Mar 29, 2021 at 02:26:51PM +0800, Huang Ying wrote:
> >> For NUMA balancing, in hint page fault handler, the faulting page will
> >> be migrated to the access
On Wed, Mar 31, 2021 at 11:55:56AM +0200, Thomas Gleixner wrote:
> On Mon, Mar 29 2021 at 13:06, Mel Gorman wrote:
> > There is a lack of clarity of what exactly local_irq_save/local_irq_restore
> > protects in page_alloc.c . It conflates the protection of per-cpu page
> >
Ingo, Thomas or Peter, is there any chance one of you could take a look
at patch "[PATCH 2/6] mm/page_alloc: Convert per-cpu list protection to
local_lock" from this series? It's partially motivated by PREEMPT_RT. More
details below.
On Mon, Mar 29, 2021 at 01:06:42PM +0100, Mel Go
On Tue, Mar 30, 2021 at 08:51:54PM +0200, Jesper Dangaard Brouer wrote:
> On Mon, 29 Mar 2021 13:06:42 +0100
> Mel Gorman wrote:
>
> > This series requires patches in Andrew's tree so the series is also
> > available at
> >
> > git://git.kernel.org/pub/scm/linux
IPI savings are enough to justify stalling parallel
accesses that could be making forward progress.
One nit below
> Signed-off-by: "Huang, Ying"
> Cc: Peter Zijlstra
> Cc: Mel Gorman
> Cc: Peter Xu
> Cc: Johannes Weiner
> Cc: Vlastimil Babka
> Cc:
Colin Ian King reported the following problem (slightly edited)
Author: Mel Gorman
Date: Mon Mar 29 11:12:24 2021 +1100
mm/page_alloc: add a bulk page allocator
...
Static analysis on linux-next with Coverity has found a potential
lized value should be ALLOC_WMARK_LOW. A value of 0 would be the same
as ALLOC_WMARK_MIN and that would allow the bulk allocator to potentially
consume too many pages without waking kswapd. I'll put together a patch
shortly. Thanks Colin!
--
Mel Gorman
SUSE Labs
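The fix plausibly amounts to initialising the flags at declaration inside
the bulk allocator (a one-line sketch under that assumption):

	/* Wake kswapd at the low watermark like the normal allocator path;
	 * 0 would behave like ALLOC_WMARK_MIN and never wake kswapd */
	unsigned int alloc_flags = ALLOC_WMARK_LOW;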
if ever
called from an IRQ context.
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 22 --
1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 32c64839c145..25d9351e75d8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
-off-by: Mel Gorman
---
include/linux/mmzone.h | 2 ++
mm/page_alloc.c | 43 --
mm/vmstat.c | 4
3 files changed, 31 insertions(+), 18 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index a4393ac27336
Now that the zone_statistics are a simple counter that does not require
special protection, the bulk allocator accounting updates can be
batch updated without requiring IRQs to be disabled.
Signed-off-by: Mel Gorman
---
include/linux/vmstat.h | 8
mm/page_alloc.c | 30
to VM events. There is a possibility that slight errors will be
introduced but the overall trend as seen by userspace will be similar.
Note that while these counters could be maintained at the node level,
it would have a user-visible impact.
Signed-off-by: Mel Gorman
---
drivers/base/node.c
This series requires patches in Andrew's tree so the series is also
available at
git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git
mm-percpu-local_lock-v1r15
tldr: Jesper and Chuck, it would be nice to verify if this series helps
the allocation rate of the bulk page allocator.
On Fri, Mar 26, 2021 at 03:19:32PM -0400, Rik van Riel wrote:
> ---8<---
> sched,fair: bring back select_idle_smt, but differently
>
> Mel Gorman did some nice work in 9fe1f127b913
> ("sched/fair: Merge select_idle_core/cpu()"), resulting in the kernel
> being mo
On Thu, Mar 25, 2021 at 03:06:57PM +0100, Uladzislau Rezki wrote:
> > On Thu, Mar 25, 2021 at 12:50:01PM +, Matthew Wilcox wrote:
> > > On Thu, Mar 25, 2021 at 11:42:19AM +0000, Mel Gorman wrote:
> > > > This series introduces a bulk order-0 page allocator with sun
On Thu, Mar 25, 2021 at 12:50:01PM +, Matthew Wilcox wrote:
> On Thu, Mar 25, 2021 at 11:42:19AM +0000, Mel Gorman wrote:
> > This series introduces a bulk order-0 page allocator with sunrpc and
> > the network page pool being the first users. The implementation is no
On Thu, Mar 25, 2021 at 12:12:17PM +, Matthew Wilcox wrote:
> On Thu, Mar 25, 2021 at 11:42:23AM +0000, Mel Gorman wrote:
> >
> > - if (WARN_ON_ONCE(nr_pages <= 0))
> > + if (unlikely(nr_pages <= 0))
> > return 0;
>
> If we made nr_page
On Thu, Mar 25, 2021 at 12:05:25PM +, Matthew Wilcox wrote:
> On Thu, Mar 25, 2021 at 11:42:21AM +0000, Mel Gorman wrote:
> > +int __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
> > + nodemask_t *nodemask, int nr_pages,
> > +
to migrate the hot private pages first?
>
I'm not sure how the hotness of pages could be ranked. At the time of a
hinting fault, the page is by definition active now because it has just
been accessed. Prioritising what pages to migrate based on the number of
faults that have been trapped would require that information to be stored
somewhere.
--
Mel Gorman
SUSE Labs
by: Jesper Dangaard Brouer
Signed-off-by: Mel Gorman
---
include/net/page_pool.h | 2 +-
net/core/page_pool.c | 82 -
2 files changed, 57 insertions(+), 27 deletions(-)
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index b5b195305346..6d
-by: Mel Gorman
---
net/core/page_pool.c | 45 +---
1 file changed, 26 insertions(+), 19 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index ad8b0707af04..40e1b2beaa6c 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
_actor() was renamed nfsd_splice_actor()
by commit cf8208d0eabd ("sendfile: convert nfsd to
splice_direct_to_actor()").
Signed-off-by: Chuck Lever
Signed-off-by: Mel Gorman
---
net/sunrpc/svc_xprt.c | 7 +++
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/net/sunrpc/svc_xprt.c
-by: Mel Gorman
---
net/sunrpc/svc_xprt.c | 31 +++
1 file changed, 15 insertions(+), 16 deletions(-)
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 609bda97d4ae..0c27c3291ca1 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -643,30
) 30.633 ns (step:64)
Signed-off-by: Jesper Dangaard Brouer
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1ec18121268b..d900e92884b2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
, which confuse the
I-cache prefetcher in the CPU.
[mgorman: Minor changes and rebasing]
Signed-off-by: Jesper Dangaard Brouer
Signed-off-by: Mel Gorman
---
mm/page_alloc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index
storage to store the pages.
Signed-off-by: Mel Gorman
---
include/linux/gfp.h | 13 +++---
mm/page_alloc.c | 60 +
2 files changed, 54 insertions(+), 19 deletions(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 4a304fd39916
is to make it available early
to determine what semantics are required by different callers. Once the
full semantics are nailed down, it can be refactored.
Signed-off-by: Mel Gorman
Acked-by: Vlastimil Babka
---
include/linux/gfp.h | 11 +
mm/page_alloc.c | 118
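Based on the prototype quoted earlier in the thread, a caller might look
roughly like this (the parameters after nr_pages are truncated in that
quote, so the array argument is an assumption; error handling elided):

	struct page *pages[16];
	int nr;

	/* Ask for up to 16 order-0 pages in one call; the return value
	 * is the number of pages actually placed in the array */
	nr = __alloc_pages_bulk(GFP_KERNEL, numa_mem_id(), NULL,
				ARRAY_SIZE(pages), pages);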