Re: [PATCH] memcg: effective memory.high reclaim for remote charging

2020-05-07 Thread Michal Hocko
On Thu 07-05-20 10:00:07, Shakeel Butt wrote: > On Thu, May 7, 2020 at 9:47 AM Michal Hocko wrote: > > > > On Thu 07-05-20 09:33:01, Shakeel Butt wrote: > > [...] > > > @@ -2600,8 +2596,23 @@ static int try_charge(struct mem_cgroup

Re: [PATCH] memcg: effective memory.high reclaim for remote charging

2020-05-07 Thread Michal Hocko
schedule_work(&memcg->high_work); > break; > } > } while ((memcg = parent_mem_cgroup(memcg))); > -- > 2.26.2.526.g744177e7f7-goog > -- Michal Hocko SUSE Labs

Re: [PATCH] Documentation: update numastat explanation

2020-05-07 Thread Michal Hocko
.org/linux-mm/20200504070304.127361-1-sandi...@linux.ibm.com/T/#u > > Signed-off-by: Vlastimil Babka Acked-by: Michal Hocko Thanks! > --- > Documentation/admin-guide/numastat.rst | 31 +++--- > 1 file changed, 28 insertions(+), 3 deletions(-) > > diff --git a

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-05 Thread Michal Hocko
On Tue 05-05-20 08:35:45, Shakeel Butt wrote: > On Tue, May 5, 2020 at 8:27 AM Johannes Weiner wrote: > > > > On Mon, May 04, 2020 at 12:23:51PM -0700, Shakeel Butt wrote: > > > On Mon, May 4, 2020 at 9:06 AM Michal Hocko wrote: > > > > I really hate to re

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-05 Thread Michal Hocko
l information can we focus on the remote charging side of the problem and deal with it in a sensible way? That would make memory.high usable for your usecase and I still believe that this is what you should be using in the first place. -- Michal Hocko SUSE Labs

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-04 Thread Michal Hocko
On Mon 04-05-20 08:35:57, Shakeel Butt wrote: > On Mon, May 4, 2020 at 8:00 AM Michal Hocko wrote: > > > > On Mon 04-05-20 07:53:01, Shakeel Butt wrote: [...] > > > I am trying to see if "no eligible task" is really an issue and should > > > be warned f

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-04 Thread Michal Hocko
On Mon 04-05-20 07:53:01, Shakeel Butt wrote: > On Mon, May 4, 2020 at 7:11 AM Michal Hocko wrote: > > > > On Mon 04-05-20 06:54:40, Shakeel Butt wrote: > > > On Sun, May 3, 2020 at 11:56 PM Michal Hocko wrote: > > > > > > > > On Thu 30-04-20

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-04 Thread Michal Hocko
On Mon 04-05-20 06:54:40, Shakeel Butt wrote: > On Sun, May 3, 2020 at 11:56 PM Michal Hocko wrote: > > > > On Thu 30-04-20 11:27:12, Shakeel Butt wrote: > > > Lowering memory.max can trigger an oom-kill if the reclaim does not > > > succeed. However if o

Re: [PATCH v2 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-05-04 Thread Michal Hocko
On Thu 30-04-20 12:48:20, Srikar Dronamraju wrote: > * Michal Hocko [2020-04-29 14:22:11]: > > > On Wed 29-04-20 07:11:45, Srikar Dronamraju wrote: > > > > > > > > > > By marking, N_ONLINE as NODE_MASK_NONE, lets stop assuming that No

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-04 Thread Michal Hocko
On Mon 04-05-20 15:40:18, Yafang Shao wrote: > On Mon, May 4, 2020 at 3:35 PM Michal Hocko wrote: > > > > On Mon 04-05-20 15:26:52, Yafang Shao wrote: [...] > > > As explained above, no eligible task is different from no task. > > > If there are some candidates b

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-04 Thread Michal Hocko
On Mon 04-05-20 15:26:52, Yafang Shao wrote: > On Mon, May 4, 2020 at 3:03 PM Michal Hocko wrote: > > > > On Fri 01-05-20 09:39:24, Yafang Shao wrote: > > > On Fri, May 1, 2020 at 2:27 AM Shakeel Butt wrote: > > > > > > > > Lowering memory

Re: [PATCH 1/2] mm, memcg: Avoid stale protection values when cgroup is above protection

2020-05-04 Thread Michal Hocko
On Fri 01-05-20 07:59:57, Yafang Shao wrote: > On Thu, Apr 30, 2020 at 10:57 PM Michal Hocko wrote: > > > > On Wed 29-04-20 12:56:27, Johannes Weiner wrote: > > [...] > > > I think to address this, we need a more comprehensive solution and > > > introduce s

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-04 Thread Michal Hocko
; + break; > + > memcg_memory_event(memcg, MEMCG_OOM); > if (!mem_cgroup_out_of_memory(memcg, GFP_KERNEL, 0)) > break; I am not a great fan to be honest. The warning might be useful for other usecases when it is not clear that the memcg is empty. -- Michal Hocko SUSE Labs

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-04 Thread Michal Hocko
ter the memcg is offlined and at the moment, high > reclaim does not work for remote memcg and the usage can go till max > or global pressure. This is most probably a misconfiguration and we > might not receive the warnings in the log ever. Setting memory.max to > 0 will definitely give such warnings. Can we add a warning for the remote charging on dead memcgs? -- Michal Hocko SUSE Labs

Re: [PATCH] memcg: oom: ignore oom warnings from memory.max

2020-05-04 Thread Michal Hocko
diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index 463b3d74a64a..5ace39f6fe1e 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -1098,7 +1098,7 @@ bool out_of_memory(struct oom_control *oc) > > select_bad_process(oc); > /* Found nothing?!?! */ > - if (!oc->chosen) { > + if (!oc->chosen && !oc->no_warn) { > dump_header(oc, NULL); > pr_warn("Out of memory and no killable processes...\n"); > /* > -- > 2.26.2.526.g744177e7f7-goog -- Michal Hocko SUSE Labs

Re: [PATCH 1/2] mm, memcg: Avoid stale protection values when cgroup is above protection

2020-04-30 Thread Michal Hocko
l that much because limit reclaim > and global reclaim tend to occur in complementary > containerization/isolation strategies, not heavily simultaneously. I would expect that as well but this is always hard to tell. -- Michal Hocko SUSE Labs

Re: [PATCH 1/2] mm, memcg: Avoid stale protection values when cgroup is above protection

2020-04-29 Thread Michal Hocko
On Wed 29-04-20 10:03:30, Johannes Weiner wrote: > On Wed, Apr 29, 2020 at 12:15:10PM +0200, Michal Hocko wrote: > > On Tue 28-04-20 19:26:47, Chris Down wrote: > > > From: Yafang Shao > > > > > > A cgroup can have both memory protection and a memory limit t

Re: [PATCH v3] mm/vmscan.c: change prototype for shrink_page_list

2020-04-29 Thread Michal Hocko
by: Vaneet Narang > Signed-off-by: Maninder Singh You could have kept my ack from v1 Acked-by: Michal Hocko Thanks! > --- > v1 -> v2: position of variable changed mistakenly, thus reverted. > v2 -> v3: Don't change position of any variable, thus reverted. > if required then

Re: [PATCH] printk: Add loglevel for "do not print to consoles".

2020-04-29 Thread Michal Hocko
On Wed 29-04-20 01:23:15, Tetsuo Handa wrote: > On 2020/04/29 0:45, Michal Hocko wrote: > > On Tue 28-04-20 22:11:19, Tetsuo Handa wrote: > >> Existing KERN_$LEVEL allows a user to determine whether he/she wants that > >> message > >> to be printed on consoles

Re: (2) [PATCH 1/1] mm/vmscan.c: change prototype for shrink_page_list

2020-04-29 Thread Michal Hocko
On Wed 29-04-20 18:59:40, Vaneet Narang wrote: > Hi Michal, > > >> > > >> >Acked-by: Michal Hocko > >> > > >> >Is there any reason to move declarations here? > >> > > >> > >> "unsigned int ret" was c

Re: [PATCH 1/1] mm/vmscan.c: change prototype for shrink_page_list

2020-04-29 Thread Michal Hocko
On Wed 29-04-20 18:23:23, Maninder Singh wrote: > > Hi, > > > > >Acked-by: Michal Hocko > > > >Is there any reason to move declarations here? > > > > "unsigned int ret" was changed mistakenly, sending V2. > and "unsigned int nr_re

Re: [PATCH v2 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-04-29 Thread Michal Hocko
MA Multi node but with no CPUs and memory from node 0. Have you tested on something else than ppc? Each arch does the NUMA setup separately and this is a big mess. E.g. x86 marks even memory less nodes (see init_memory_less_node) as online. Honestly I have hard time to evaluate the effect of this

Re: [PATCH 1/1] mm/vmscan.c: change prototype for shrink_page_list

2020-04-29 Thread Michal Hocko
by: Vaneet Narang > Signed-off-by: Maninder Singh Acked-by: Michal Hocko Is there any reason to move declarations here? > -unsigned long reclaim_clean_pages_from_list(struct zone *zone, > +unsigned int reclaim_clean_pages_from_list(struct zone *zone, >

Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing soon

2020-04-29 Thread Michal Hocko
On Wed 29-04-20 19:45:07, Tetsuo Handa wrote: > On 2020/04/29 18:04, Michal Hocko wrote: > > Completely agreed! The in kernel OOM killer is to deal with situations > > when memory is desperately depleted without any sign of a forward > > progress. If there is a recla

Re: [PATCH 1/2] mm, memcg: Avoid stale protection values when cgroup is above protection

2020-04-29 Thread Michal Hocko
e more robust against races on top of that because this is likely a more tricky thing to do. > Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim") > Signed-off-by: Yafang Shao > Signed-off-by: Chris Down > Cc: Johannes Weiner > Cc: Michal Hocko

Re: [PATCH 2/2] mm, memcg: Decouple e{low,min} state mutations from protection checks

2020-04-29 Thread Michal Hocko
simply checking and > don't need to worry about that. > > Signed-off-by: Chris Down > Suggested-by: Johannes Weiner > Cc: Michal Hocko > Cc: Roman Gushchin > Cc: Yafang Shao Acked-by: Michal Hocko > --- > include/linux/memcontrol.h | 48 +++

Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing soon

2020-04-29 Thread Michal Hocko
is desperately depleted without any sign of a forward progress. If there is a reclaimable memory then we are not there yet. If a workload can benefit from early oom killing based on response time then we have facilities to achieve that (e.g. PSI). -- Michal Hocko SUSE Labs

Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing soon

2020-04-29 Thread Michal Hocko
On Wed 29-04-20 10:31:41, peter enderborg wrote: > On 4/28/20 9:43 AM, Michal Hocko wrote: > > On Mon 27-04-20 16:35:58, Andrew Morton wrote: > > [...] > >> No consumer of GFP_ATOMIC memory should consume an unbounded amount of > >> it. > >> Subsystems

Re: [PATCH] printk: Add loglevel for "do not print to consoles".

2020-04-28 Thread Michal Hocko
On Tue 28-04-20 22:11:19, Tetsuo Handa wrote: > On 2020/04/28 21:18, Michal Hocko wrote: > > On Tue 28-04-20 20:33:21, Tetsuo Handa wrote: > >> On 2020/04/27 15:21, Sergey Senozhatsky wrote: > >>>> KERN_NO_CONSOLES is for type of messages where

Re: [PATCH] printk: Add loglevel for "do not print to consoles".

2020-04-28 Thread Michal Hocko
hard-coded policy. > > But given that whether to use KERN_NO_CONSOLES is configurable via e.g. > sysctl, > KERN_NO_CONSOLES will become a user configurable parameter. What's still > wrong? How do I as a kernel developer know that KERN_NO_CONSOLES should be used? In other words, how can I assume what a user will consider important on the console? -- Michal Hocko SUSE Labs

Re: [PATCH v3 1/5] kernel/sysctl: support setting sysctl parameters from kernel command line

2020-04-28 Thread Michal Hocko
g . form, > wonder why it doesn't work, then read the doc and realize it's not > supported? Yes, I do agree. I have only recently learned that sysctl supports / as well. Most people are simply used to . notation. The copy of the arch and . -> / substitution is a trivial operation and I do not think it is a real reason to introduce unnecessarily harder to use interface. -- Michal Hocko SUSE Labs
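The ". -> /" substitution mentioned above can be sketched in a few lines of user-space C. This is only an illustration under assumptions: sysctl_name_to_path() is a hypothetical helper name, not an existing kernel or procps function, and the blind replacement ignores names whose components contain a literal dot (VLAN interface names, for instance), which a real implementation would have to consider.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not kernel code): rewrite a sysctl name given in
 * dotted form, e.g. "vm.swappiness", into the path form used below
 * /proc/sys, e.g. "vm/swappiness". The buffer is modified in place.
 */
static void sysctl_name_to_path(char *name)
{
    assert(name != NULL);

    /* Replace every '.' separator with '/'. */
    for (char *p = strchr(name, '.'); p != NULL; p = strchr(p + 1, '.'))
        *p = '/';
}
```

A name that already uses "/" passes through unchanged, so both notations could be accepted by the same parser.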

Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing soon

2020-04-28 Thread Michal Hocko
cing with the reclaim and betting on luck. The last problem was the most annoying because it is really hard to tune for. -- Michal Hocko SUSE Labs

Re: [PATCH 2/2] mm, vmstat: List total free blocks for each order in /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
printf(m, "%6lu ", area->nr_free); > + } > + seq_putc(m, '\n'); This is essentially duplicating /proc/buddyinfo. Do we really need that? -- Michal Hocko SUSE Labs

Re: [PATCH 1/2] mm, vmstat: Release zone lock more frequently when reading /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
seq_printf(m, "%s%6lu ", overflow ? ">" : "", freecount); + spin_unlock_irq(&zone->lock); + cond_resched(); + spin_lock_irq(&zone->lock); } seq_putc(m, '\n'); } I do not have a strong opinion here but I can fold this into my patch 2. -- Michal Hocko SUSE Labs
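The seq_printf() line quoted above prints a count with a ">" prefix when the walk was cut short. A rough user-space sketch of that idea follows. It assumes a plain singly linked list in place of the kernel free list, takes the cap as a parameter, and leaves out the lock drop and cond_resched() entirely. The names struct node, count_capped() and format_count() are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical list node standing in for a free-list entry. */
struct node {
    struct node *next;
};

/*
 * Count list entries but stop as soon as `cap` is reached, so the time
 * spent walking the list is bounded. *overflow is set when the walk
 * stopped at the cap, in which case the count is only a lower bound.
 */
static unsigned long count_capped(const struct node *head, unsigned long cap,
                                  bool *overflow)
{
    unsigned long count = 0;

    *overflow = false;
    for (const struct node *n = head; n != NULL; n = n->next) {
        if (++count >= cap) {
            *overflow = true;
            break;
        }
    }
    return count;
}

/* Format the count the way the quoted seq_printf() call does. */
static int format_count(char *buf, size_t len, unsigned long count,
                        bool overflow)
{
    return snprintf(buf, len, "%s%6lu ", overflow ? ">" : "", count);
}
```

Note that a list with exactly `cap` entries is also reported with the ">" prefix in this sketch, because the check fires as soon as the counter reaches the cap.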

Re: [PATCH] Add prctl support for controlling PF_MEMALLOC V2

2019-10-23 Thread Michal Hocko
On Wed 23-10-19 12:27:29, Mike Christie wrote: > On 10/23/2019 02:11 AM, Michal Hocko wrote: > > On Wed 23-10-19 07:43:44, Dave Chinner wrote: > >> On Tue, Oct 22, 2019 at 06:33:10PM +0200, Michal Hocko wrote: > > > > Thanks for more clarifiat

Re: [RFC PATCH 2/2] mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
With a brown paper bag bug fixed. I have also added a note about low number of pages being more important as per Vlastimil's feedback From 0282f604144a5c06fdf3cf0bb2df532411e7f8c9 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Wed, 23 Oct 2019 12:13:02 +0200 Subject: [PATCH] mm, vms

Re: [RFC PATCH 2/2] mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
On Wed 23-10-19 10:56:30, Waiman Long wrote: > On 10/23/19 6:27 AM, Michal Hocko wrote: > > From: Michal Hocko > > > > pagetypeinfo_showfree_print is called by zone->lock held in irq mode. > > This is not really nice because it blocks both any interrupts on that

Re: [PATCH] mm/page_alloc: fix gcc compile warning

2019-10-23 Thread Michal Hocko
* will be artificially small. >*/ > +#ifdef CONFIG_MEMORY_HOTPLUG > for_each_populated_zone(zone) > zone_pcp_update(zone); > +#endif > > /* > * We initialized the rest of the deferred pages. Permanently disable > -- > 2.7.4 -- Michal Hocko SUSE Labs

Re: [RFC PATCH 2/2] mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
On Wed 23-10-19 15:48:36, Vlastimil Babka wrote: > On 10/23/19 3:37 PM, Michal Hocko wrote: > > On Wed 23-10-19 15:32:05, Vlastimil Babka wrote: > >> On 10/23/19 12:27 PM, Michal Hocko wrote: > >>> From: Michal Hocko > >>> > >>> pagetypeinfo_s

Re: [PATCH 7/8] mm: vmscan: split shrink_node() into node part and memcgs part

2019-10-23 Thread Michal Hocko
up vmpressure notifications > > Signed-off-by: Johannes Weiner Acked-by: Michal Hocko > --- > mm/vmscan.c | 28 ++-- > 1 file changed, 18 insertions(+), 10 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index db073b40c432..65baa89740

Re: [PATCH 6/8] mm: vmscan: turn shrink_node_memcg() into shrink_lruvec()

2019-10-23 Thread Michal Hocko
to access node or cgroup properties can look > them up if necessary, but there are only a few cases. > > Signed-off-by: Johannes Weiner Acked-by: Michal Hocko > --- > mm/vmscan.c | 21 ++--- > 1 file changed, 10 insertions(+), 11 deletions(-) > >

Re: [PATCH 5/8] mm: vmscan: replace shrink_node() loop with a retry jump

2019-10-23 Thread Michal Hocko
y memcg will stall in page writeback so avoid forcibly > + * stalling in wait_iff_congested(). > + */ > + if (cgroup_reclaim(sc) && writeback_throttling_sane(sc) && > + sc->nr.dirty && sc->nr.dirty == sc->nr.congested) > + set_memcg_congestion(pgdat, root, true); > + > + /* > + * Stall direct reclaim for IO completions if underlying BDIs > + * and node is congested. Allow kswapd to continue until it > + * starts encountering unqueued dirty pages or cycling through > + * the LRU too quickly. > + */ > + if (!sc->hibernation_mode && !current_is_kswapd() && > + current_may_throttle() && pgdat_memcg_congested(pgdat, root)) > + wait_iff_congested(BLK_RW_ASYNC, HZ/10); > > - } while (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed, > - sc)); > + if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed, > + sc)) > + goto again; > > /* >* Kswapd gives up on balancing particular nodes after too > -- > 2.23.0 -- Michal Hocko SUSE Labs

Re: [PATCH 4/8] mm: vmscan: naming fixes: global_reclaim() and sane_reclaim()

2019-10-23 Thread Michal Hocko
f you insist on having sane in the name then I won't object but it just raises a question whether we have some levels of throttling with a different level of sanity. > Signed-off-by: Johannes Weiner Acked-by: Michal Hocko > --- > mm/vmscan.c | 38 ++ >

Re: [PATCH 3/8] mm: vmscan: move inactive_list_is_low() swap check to the caller

2019-10-23 Thread Michal Hocko
ck. Add it there. > > Then delete the swap check from inactive_list_is_low(). > > Signed-off-by: Johannes Weiner OK, makes sense to me. Acked-by: Michal Hocko > --- > mm/vmscan.c | 9 + > 1 file changed, 1 insertion(+), 8 deletions(-) > > diff --git a/mm/vmscan.c b/mm/

Re: [PATCH 2/8] mm: clean up and clarify lruvec lookup procedure

2019-10-23 Thread Michal Hocko
le in this area, swap the mem_cgroup_lruvec() argument order. The > name suggests a memcg operation, yet it takes a pgdat first and a > memcg second. I have to double take every time I call this. Fix that. > > Signed-off-by: Johannes Weiner I do agree that node_lruvec() adds confusion and i

Re: [PATCH 1/8] mm: vmscan: simplify lruvec_lru_size()

2019-10-23 Thread Michal Hocko
. The original intention was to optimize this for GFP_KERNEL like allocations by reducing the number of zones to reduce. But considering this is not called from hot paths I do agree that a simpler code is more preferable. > Signed-off-by: Johannes Weiner Acked-by: Michal Hocko > --

Re: [RFC PATCH 2/2] mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
On Wed 23-10-19 15:32:05, Vlastimil Babka wrote: > On 10/23/19 12:27 PM, Michal Hocko wrote: > > From: Michal Hocko > > > > pagetypeinfo_showfree_print is called by zone->lock held in irq mode. > > This is not really nice because it blocks both any interrupt

Re: [RFC v1] mm: add page preemption

2019-10-23 Thread Michal Hocko
On Wed 23-10-19 19:53:50, Hillf Danton wrote: > > On Wed, 23 Oct 2019 10:17:29 +0200 Michal Hocko wrote: [...] > > This doesn't really answer my question. > > Why cannot you use memcgs as they are now. > > No prio provided. > > > Why exactly do you need a fix

[RFC PATCH 0/2] mm/vmstat: Reduce zone lock hold time when reading /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
On Wed 23-10-19 10:56:08, Mel Gorman wrote: > On Wed, Oct 23, 2019 at 11:04:22AM +0200, Michal Hocko wrote: > > So can we go with this to address the security aspect of this and have > > something trivial to backport. > > > > Yes. Ok, pat

[RFC PATCH 2/2] mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
From: Michal Hocko pagetypeinfo_showfree_print is called by zone->lock held in irq mode. This is not really nice because it blocks both any interrupts on that cpu and the page allocator. On large machines this might even trigger the hard lockup detector. Considering the pagetypei

[RFC PATCH 1/2] mm, vmstat: hide /proc/pagetypeinfo from normal users

2019-10-23 Thread Michal Hocko
From: Michal Hocko /proc/pagetypeinfo is a debugging tool to examine internal page allocator state wrt to fragmentation. It is not very useful for any other use so normal users really do not need to read this file. Waiman Long has noticed that reading this file can have negative side effects

Re: [PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0

2019-10-23 Thread Michal Hocko
(PageOffline() + refcount == 0)? Simply skip over PageOffline pages. Reference count should never be != 0 at this stage. > In summary, is what you suggest simply delaying setting the reference count > to 0 > in MEM_GOING_OFFLINE instead of right away when the driver unplugs the pages? Yes > What's the big benefit you see and I fail to see? Apart from no hooks into __put_page it is also an explicit control over the page via reference counting. Do you see any downsides? -- Michal Hocko SUSE Labs

Re: [PATCH] mm/vmstat: Reduce zone lock hold time when reading /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
On Wed 23-10-19 09:31:43, Mel Gorman wrote: > On Tue, Oct 22, 2019 at 06:57:45PM +0200, Michal Hocko wrote: > > [Cc Mel] > > > > On Tue 22-10-19 12:21:56, Waiman Long wrote: > > > The pagetypeinfo_showfree_print() function prints out the number of > > >

Re: [RFC v1] mm: add page preemption

2019-10-23 Thread Michal Hocko
On Tue 22-10-19 22:28:02, Hillf Danton wrote: > > On Tue, 22 Oct 2019 14:42:41 +0200 Michal Hocko wrote: > > > > On Tue 22-10-19 20:14:39, Hillf Danton wrote: > > > > > > On Mon, 21 Oct 2019 14:27:28 +0200 Michal Hocko wrote: > > [...] > > >

Re: [RFC v1] memcg: add memcg lru for page reclaiming

2019-10-23 Thread Michal Hocko
On Wed 23-10-19 12:44:48, Hillf Danton wrote: > > On Tue, 22 Oct 2019 15:58:32 +0200 Michal Hocko wrote: > > > > On Tue 22-10-19 21:30:50, Hillf Danton wrote: [...] > > > in this RFC after ripping pages off > > > the first victim, the work finishes w

Re: [PATCH 2/2] mm: memcontrol: try harder to set a new memory.high

2019-10-23 Thread Michal Hocko
+ reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high, > + GFP_KERNEL, true); > + > + if (!reclaimed && !nr_retries--) > + break; > + } > > - memcg_wb_domain_size_changed(memcg); > return nbytes; > } > > -- > 2.23.0 -- Michal Hocko SUSE Labs
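The visible part of the hunk above spends the retry budget only on reclaim attempts that free nothing. A user-space sketch of that loop shape, under stated assumptions, might look as follows. The reclaim callback, struct fake_reclaimer and shrink_to_high() are hypothetical stand-ins; the real kernel loop re-reads the usage counter on each pass and may check conditions that the truncated quote does not show.

```c
#include <stddef.h>

/* Hypothetical callback: try to free up to `want` pages, return pages freed. */
typedef unsigned long (*reclaim_fn)(unsigned long want, void *ctx);

/* Hypothetical stand-in for a reclaimer that frees `step` pages per call. */
struct fake_reclaimer {
    unsigned long step;
    unsigned int calls;
};

static unsigned long fake_reclaim(unsigned long want, void *ctx)
{
    struct fake_reclaimer *r = ctx;

    r->calls++;
    return want < r->step ? want : r->step;
}

/*
 * Reclaim until usage drops to `high`. Attempts that make progress are
 * free; each attempt that reclaims nothing consumes one retry, and the
 * loop gives up once the budget is exhausted. Returns the final usage.
 */
static unsigned long shrink_to_high(unsigned long usage, unsigned long high,
                                    int nr_retries, reclaim_fn reclaim,
                                    void *ctx)
{
    while (usage > high) {
        unsigned long reclaimed = reclaim(usage - high, ctx);

        /* Guard against a callback that reports more than was in use. */
        usage = reclaimed < usage ? usage - reclaimed : 0;

        /* nr_retries is decremented only when nothing was reclaimed. */
        if (!reclaimed && !nr_retries--)
            break;
    }
    return usage;
}
```

With a budget of N the loop tolerates N + 1 fruitless attempts in total, because the post-decrement test only fires once the counter has already reached zero.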

Re: [PATCH 1/2] mm: memcontrol: remove dead code from memory_max_write()

2019-10-23 Thread Michal Hocko
Weiner Acked-by: Michal Hocko > --- > mm/memcontrol.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 055975b0b3a3..ff90d4e7df37 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -6122,10 +6122,8

Re: [PATCH] mm: memcontrol: fix network errors from failing __GFP_ATOMIC charges

2019-10-23 Thread Michal Hocko
burden of reclaim on regular allocation requests > + * and let these go through as privileged allocations. > + */ > + if (gfp_mask & __GFP_ATOMIC) > + goto force; > + > /* >* Unlike in global OOM situations, memcg is not in a physical >* memory shortage. Allow dying and OOM-killed tasks to > -- > 2.23.0 > -- Michal Hocko SUSE Labs

Re: [PATCH] mm/vmstat: Reduce zone lock hold time when reading /proc/pagetypeinfo

2019-10-23 Thread Michal Hocko
from Mel and Vlastimil how would they feel about making free_list fully migrate type aware (including nr_free). > Why are we actually holding zone->lock so much? Can we get away with > holding it across the list_for_each() loop and nothing else? If so, > this still isn't a bulletproof fix. Maybe just terminate the list > walk if freecount reaches 1024. Would anyone really care? > > Sigh. I wonder if anyone really uses this thing for anything > important. Can we just remove it all? Vlastimil would know much better but I have seen this being used for fragmentation related debugging. That should imply that 0400 should be sufficient and a quick and easily backportable fix for the most pressing immediate problem. -- Michal Hocko SUSE Labs

Re: [PATCH] mm: fix comments based on per-node memcg

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 15:06:18, Hao Lee wrote: > These comments should be updated as memcg limit enforcement has been moved > from zones to nodes. > > Signed-off-by: Hao Lee Acked-by: Michal Hocko > --- > include/linux/memcontrol.h | 5 ++--- > 1 file changed, 2 inser

Re: [PATCH] mm/vmstat: Reduce zone lock hold time when reading /proc/pagetypeinfo

2019-10-22 Thread Michal Hocko
< MIGRATE_TYPES; mtype++) { > + seq_printf(m, "Node %4d, zone %8s, type %12s ", > + pgdat->node_id, > + zone->name, > + migratetype_names[mtype]); > + for (order = 0; order < MAX_ORDER; ++order) > + seq_printf(m, "%6lu ", nfree[order][mtype]); > seq_putc(m, '\n'); > } > } > -- > 2.18.1 -- Michal Hocko SUSE Labs

Re: [RFC v1] memcg: add memcg lru for page reclaiming

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 21:30:50, Hillf Danton wrote: > > On Mon, 21 Oct 2019 14:14:53 +0200 Michal Hocko wrote: > > > > On Mon 21-10-19 19:56:54, Hillf Danton wrote: > > > > > > Currently soft limit reclaim is frozen, see > > > Documentation/admin-guide/cg

Re: [PATCH 0/4] [RFC] Migrate Pages in lieu of discard

2019-10-22 Thread Michal Hocko
On Fri 18-10-19 07:54:20, Dave Hansen wrote: > On 10/18/19 12:44 AM, Michal Hocko wrote: > > How does this compare to > > http://lkml.kernel.org/r/1560468577-101178-1-git-send-email-yang@linux.alibaba.com > > It's a _bit_ more tied to persistent memory and it appears a b

Re: [PATCH 00/16] The new slab memory controller

2019-10-22 Thread Michal Hocko
> things generally simpler. What is the performance impact? Also what is the effect on the memory reclaim side and the isolation. I would expect that mixing objects from different cgroups would have a negative/unpredictable impact on the memcg slab shrinking. -- Michal Hocko SUSE Labs

Re: [PATCH 00/16] The new slab memory controller

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 15:22:06, Michal Hocko wrote: > On Thu 17-10-19 17:28:04, Roman Gushchin wrote: > [...] > > Using a drgn* script I've got an estimation of slab utilization on > > a number of machines running different production workloads. In most > > cases it was between 4

Re: [PATCH 00/16] The new slab memory controller

2019-10-22 Thread Michal Hocko
ific caches that tend to utilize much worse than others? -- Michal Hocko SUSE Labs

Re: [RFC v1] mm: add page preemption

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 20:14:39, Hillf Danton wrote: > > On Mon, 21 Oct 2019 14:27:28 +0200 Michal Hocko wrote: [...] > > Why do we care and which workloads would benefit and how much. > > Page preemption, disabled by default, should be turned on by those > who wish

Re: [PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0

2019-10-22 Thread Michal Hocko
On Fri 18-10-19 14:35:06, David Hildenbrand wrote: > On 18.10.19 13:20, Michal Hocko wrote: > > On Fri 18-10-19 10:50:24, David Hildenbrand wrote: > > > On 18.10.19 10:15, Michal Hocko wrote: [...] > > > > for that - MEM_GOING_OFFLINE notification.

Re: [RFC PATCH v2 10/16] mm,hwpoison: Rework soft offline for free pages

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 11:58:52, Oscar Salvador wrote: > On Tue, Oct 22, 2019 at 11:22:56AM +0200, Michal Hocko wrote: > > Hmm, that might be a misunderstanding on my end. I thought that it is > > the MCE handler to say whether the failure is recoverable or not. If yes > > then we

Re: [PATCH v2 0/2] mm: Memory offlining + page isolation cleanups

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 11:17:24, David Hildenbrand wrote: > On 22.10.19 11:14, Michal Hocko wrote: > > On Tue 22-10-19 10:32:11, David Hildenbrand wrote: > > [...] > > > E.g., arch/x86/kvm/mmu.c:kvm_is_mmio_pfn() > > > > Thanks for these references. I am not real

Re: [RFC PATCH v2 10/16] mm,hwpoison: Rework soft offline for free pages

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 10:35:17, Oscar Salvador wrote: > On Tue, Oct 22, 2019 at 10:26:11AM +0200, Michal Hocko wrote: > > On Tue 22-10-19 09:46:20, Oscar Salvador wrote: > > [...] > > > So, opposite to hard-offline, in soft-offline we do not fiddle with pages > >

Re: [PATCH v2 0/2] mm: Memory offlining + page isolation cleanups

2019-10-22 Thread Michal Hocko
we do care about holes in RAM (from the early boot), those should be reserved already AFAIR. So we are left with hotplugged memory with holes and I am not really sure we should bother with this until there is a clear usecase in sight. -- Michal Hocko SUSE Labs

Re: [PATCH v1 1/2] mm/page_alloc.c: Don't set pages PageReserved() when offlining

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 10:23:37, David Hildenbrand wrote: > On 22.10.19 10:20, Michal Hocko wrote: > > On Mon 21-10-19 17:54:35, David Hildenbrand wrote: > > > On 21.10.19 17:47, Michal Hocko wrote: > > > > On Mon 21-10-19 17:39:36, David Hildenbrand wrote: > > > >

Re: [RFC PATCH v2 11/16] mm,hwpoison: Rework soft offline for in-use pages

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 09:56:27, Oscar Salvador wrote: > On Mon, Oct 21, 2019 at 04:06:19PM +0200, Michal Hocko wrote: > > On Mon 21-10-19 15:48:48, Oscar Salvador wrote: > > > We can only perform actions on LRU/Movable pages or hugetlb pages. > > > > What would preve

Re: [RFC PATCH v2 10/16] mm,hwpoison: Rework soft offline for free pages

2019-10-22 Thread Michal Hocko
enttly from MCE (hard-offline)? -- Michal Hocko SUSE Labs

Re: [PATCH v2 0/2] mm: Memory offlining + page isolation cleanups

2019-10-22 Thread Michal Hocko
On Tue 22-10-19 10:15:07, David Hildenbrand wrote: > On 22.10.19 10:08, Michal Hocko wrote: > > On Tue 22-10-19 08:52:28, David Hildenbrand wrote: > > > On 21.10.19 19:23, David Hildenbrand wrote: > > > > Two cleanups that popped up while working on (and d

Re: [PATCH v1 1/2] mm/page_alloc.c: Don't set pages PageReserved() when offlining

2019-10-22 Thread Michal Hocko
On Mon 21-10-19 17:54:35, David Hildenbrand wrote: > On 21.10.19 17:47, Michal Hocko wrote: > > On Mon 21-10-19 17:39:36, David Hildenbrand wrote: > > > On 21.10.19 16:43, Michal Hocko wrote: > > [...] > > > > We still set PageReserved before onlining pages

Re: [PATCH v2 0/2] mm: Memory offlining + page isolation cleanups

2019-10-22 Thread Michal Hocko
to_online_page() check already). But of course, there might be special > cases I remember Alexander didn't want to change the PageReserved handling because he was worried about unforeseeable side effects. I have a vague recollection he (or maybe Dan) has promised some follow up clean ups which didn't seem to materialize. -- Michal Hocko SUSE Labs

Re: [PATCH v1 1/2] mm/page_alloc.c: Don't set pages PageReserved() when offlining

2019-10-21 Thread Michal Hocko
On Mon 21-10-19 17:39:36, David Hildenbrand wrote: > On 21.10.19 16:43, Michal Hocko wrote: [...] > > We still set PageReserved before onlining pages and that one should be > > good to go as well (memmap_init_zone). > > Thanks! > > memmap_init_zone() is called when onli

Re: [RFC PATCH v2 10/16] mm,hwpoison: Rework soft offline for free pages

2019-10-21 Thread Michal Hocko
On Mon 21-10-19 14:58:49, Oscar Salvador wrote: > On Fri, Oct 18, 2019 at 02:06:15PM +0200, Michal Hocko wrote: > > On Thu 17-10-19 16:21:17, Oscar Salvador wrote: > > [...] > > > +bool take_page_off_buddy(struct page *page) > > > + { > > > + struct zone

Re: [PATCH v1 2/2] mm/page_isolation.c: Convert SKIP_HWPOISON to MEMORY_OFFLINE

2019-10-21 Thread Michal Hocko
e such > memory. > > Let's generalize the approach so we can special case other types of > pages we want to skip over in case we offline memory. While at it, also > pass the same flags to test_pages_isolated(). > > Cc: Michal Hocko > Cc: Oscar Salvador > Cc: Andrew Morton

Re: [PATCH v1 1/2] mm/page_alloc.c: Don't set pages PageReserved() when offlining

2019-10-21 Thread Michal Hocko
ges were set > PageReserved so re-onlining would work as expected). > > Cc: Andrew Morton > Cc: Michal Hocko > Cc: Vlastimil Babka > Cc: Oscar Salvador > Cc: Mel Gorman > Cc: Mike Rapoport > Cc: Dan Williams > Cc: Wei Yang > Cc: Alexander Duyck > Cc: Anshuman Kha

Re: [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes

2019-10-21 Thread Michal Hocko
t_sleep+0x334/0x370 > [ 15.590588][ T658] [c0003d8cfbb0] [c094a784] > __mutex_lock+0x84/0xb20 > [ 15.590643][ T658] [c0003d8cfcc0] [c0954038] > zone_pcp_update+0x34/0x64 > [ 15.590689][ T658] [c0003d8cfcf0] [c0b9e6bc] > deferred_init_memmap+0x1b8/0x26c > [ 15.590739][ T658] [c0003d8cfdb0] [c0149528] > kthread+0x1a8/0x1b0 > [ 15.590790][ T658] [c0003d8cfe20] [c000b748] > ret_from_kernel_thread+0x5c/0x74 -- Michal Hocko SUSE Labs

Re: [RFC PATCH v2 11/16] mm,hwpoison: Rework soft offline for in-use pages

2019-10-21 Thread Michal Hocko
On Mon 21-10-19 15:48:48, Oscar Salvador wrote: > On Fri, Oct 18, 2019 at 02:39:01PM +0200, Michal Hocko wrote: > > > > I am sorry but I got lost in the above description and I cannot really > > make much sense from the code either. Let me try to outline the way

Re: [RFC v1] mm: add page preemption

2019-10-21 Thread Michal Hocko
re compared when deactivating lru > pages, and skip page if it is higher on prio. > > V1 is based on next-20191018. > > Changes since v0 > - s/page->nice/page->prio/ > - drop the role of kswapd's reclaiming priority in prio comparison > - add pgdat->kswapd_prio &g

Re: [RFC PATCH v2 02/16] mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED

2019-10-21 Thread Michal Hocko
On Mon 21-10-19 07:02:55, Naoya Horiguchi wrote: > On Fri, Oct 18, 2019 at 01:52:27PM +0200, Michal Hocko wrote: > > On Thu 17-10-19 16:21:09, Oscar Salvador wrote: > > > From: Naoya Horiguchi > > > > > > The call to get_user_pages_fast is only to ge

Re: [RFC PATCH v2 01/16] mm,hwpoison: cleanup unused PageHuge() check

2019-10-21 Thread Michal Hocko
On Mon 21-10-19 07:00:46, Naoya Horiguchi wrote: > On Fri, Oct 18, 2019 at 01:48:32PM +0200, Michal Hocko wrote: > > On Thu 17-10-19 16:21:08, Oscar Salvador wrote: > > > From: Naoya Horiguchi > > > > > > Drop the PageHuge check since memory_failure fork

Re: [RFC v1] memcg: add memcg lru for page reclaiming

2019-10-21 Thread Michal Hocko
it/Kconfig > - drop changes in mm/vmscan.c > - make memcg lru work in parallel to slr > > Cc: Chris Down > Cc: Tejun Heo > Cc: Roman Gushchin > Cc: Michal Hocko > Cc: Johannes Weiner > Cc: Shakeel Butt > Cc: Matthew Wilcox > Cc: Minchan Kim > Cc: Mel Go

Re: [PATCH 1/3] mm, meminit: Recalculate pcpu batch and high limits after init completes

2019-10-21 Thread Michal Hocko
batch: 1 > 768 batch: 63 > 256 high: 0 > 768 high: 378 > > Cc: sta...@vger.kernel.org # v4.1+ > Signed-off-by: Mel Gorman Acked-by: Michal Hocko > --- > mm/page_alloc.c | 8 > 1 file changed,

Re: [patch 07/26] mm/memunmap: don't access uninitialized memmap in memunmap_pages()

2019-10-21 Thread Michal Hocko
On Mon 21-10-19 10:28:16, David Hildenbrand wrote: > On 21.10.19 10:26, Michal Hocko wrote: > > Has this been properly reviewed? I do not see any Acks nor Reviewed-bys. > > > > As I modified this patch while carrying it along, it at least has my > implicit Ack/RB. OK,

Re: [patch 06/26] mm/memory_hotplug: don't access uninitialized memmaps in shrink_pgdat_span()

2019-10-21 Thread Michal Hocko
_start_pfn)
> +			node_start_pfn = zone->zone_start_pfn;
> 	}
>
> -	/* The pgdat has no valid section */
> -	pgdat->node_start_pfn = 0;
> -	pgdat->node_spanned_pages = 0;
> +	pgdat->node_start_pfn = node_start_pfn;
> +	pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
>  }
>
>  static void __remove_zone(struct zone *zone, unsigned long start_pfn,
> @@ -507,7 +465,7 @@ static void __remove_zone(struct zone *z
>
> 	pgdat_resize_lock(zone->zone_pgdat, &flags);
> 	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
> -	shrink_pgdat_span(pgdat, start_pfn, start_pfn + nr_pages);
> +	update_pgdat_span(pgdat);
> 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
>  }
>
> _

-- 
Michal Hocko
SUSE Labs

Re: [patch 07/26] mm/memunmap: don't access uninitialized memmap in memunmap_pages()

2019-10-21 Thread Michal Hocko
ogan Gunthorpe > Cc: Ira Weiny > Cc: Damian Tometzki > Cc: Alexander Duyck > Cc: Alexander Potapenko > Cc: Andy Lutomirski > Cc: Anshuman Khandual > Cc: Benjamin Herrenschmidt > Cc: Borislav Petkov > Cc: Catalin Marinas > Cc: Christian Borntraeger > Cc: Ch

Re: [patch for-5.3 0/4] revert immediate fallback to remote hugepages

2019-10-18 Thread Michal Hocko
ensed form On Tue 01-10-19 10:37:43, Michal Hocko wrote: > I have split out my kvm machine into two nodes to get at least some > idea how these patches behave > $ numactl -H > available: 2 nodes (0-1) > node 0 cpus: 0 2 > node 0 size: 475 MB > node 0 free: 432 MB > node 1 cpus

Re: [PATCH 3/3] mm, pcpu: Make zone pcp updates and reset internal to the mm

2019-10-18 Thread Michal Hocko
On Fri 18-10-19 11:56:06, Mel Gorman wrote: > Memory hotplug needs to be able to reset and reinit the pcpu allocator > batch and high limits but this action is internal to the VM. Move > the declaration to internal.h > > Signed-off-by: Mel Gorman Acked-by: Michal Hocko > --

Re: [PATCH 2/3] mm, meminit: Recalculate pcpu batch and high limits after init completes

2019-10-18 Thread Michal Hocko
sta...@vger.kernel.org # v4.15+ Hmm, are you sure about 4.15? Doesn't this go all the way down to deferred initialization? I do not see any recent changes on when setup_per_cpu_pageset is called. > Signed-off-by: Mel Gorman Acked-by: Michal Hocko > --- > mm/page_alloc.c |

Re: [PATCH 1/3] mm, pcp: Share common code between memory hotplug and percpu sysctl handler

2019-10-18 Thread Michal Hocko
> > Signed-off-by: Mel Gorman Acked-by: Michal Hocko > --- > mm/page_alloc.c | 23 --- > 1 file changed, 12 insertions(+), 11 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index c0b2e0306720..cafe568d36f6 100644 > --- a/mm/page_al

Re: [RFC PATCH v2 11/16] mm,hwpoison: Rework soft offline for in-use pages

2019-10-18 Thread Michal Hocko
This
> - * test is performed under the zone lock to prevent a race against page
> - * allocation.
> - */
> -bool set_hwpoison_free_buddy_page(struct page *page)
> -{
> -	struct zone *zone = page_zone(page);
> -	unsigned long pfn = page_to_pfn(page);
> -	unsigned long flags;
> -	unsigned int order;
> -	bool hwpoisoned = false;
> -
> -	spin_lock_irqsave(&zone->lock, flags);
> -	for (order = 0; order < MAX_ORDER; order++) {
> -		struct page *page_head = page - (pfn & ((1 << order) - 1));
> -
> -		if (PageBuddy(page_head) && page_order(page_head) >= order) {
> -			if (!TestSetPageHWPoison(page))
> -				hwpoisoned = true;
> -			break;
> -		}
> -	}
> -	spin_unlock_irqrestore(&zone->lock, flags);
> -
> -	return hwpoisoned;
> -}
>  #endif
> --
> 2.12.3

-- 
Michal Hocko
SUSE Labs

Re: [RFC PATCH v2 10/16] mm,hwpoison: Rework soft offline for free pages

2019-10-18 Thread Michal Hocko
pin_unlock_irqrestore(&zone->lock, flags);
> +	return ret;
> + }
> +
> +/*
>   * Set PG_hwpoison flag if a given page is confirmed to be a free page. This
>   * test is performed under the zone lock to prevent a race against page
>   * allocation.
> --
> 2.12.3

-- 
Michal Hocko
SUSE Labs

Re: [RFC PATCH v2 02/16] mm,madvise: call soft_offline_page() without MF_COUNT_INCREASED

2019-10-18 Thread Michal Hocko
reference taken by get_user_pages_fast(). In
> -	 * the absence of MF_COUNT_INCREASED the memory_failure()
> -	 * routine is responsible for pinning the page to prevent it
> -	 * from being released back to the page allocator.
> -	 */
> -	put_page(page);
> 	ret = memory_failure(pfn, 0);
> 	if (ret)
> 		return ret;
> --
> 2.12.3
>

-- 
Michal Hocko
SUSE Labs

Re: [RFC PATCH v2 01/16] mm,hwpoison: cleanup unused PageHuge() check

2019-10-18 Thread Michal Hocko
> + page_flags = p->flags; > > /* >* unpoison always clear PG_hwpoison inside page lock > -- > 2.12.3 -- Michal Hocko SUSE Labs
