Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-10-07 Thread Michal Hocko
On Wed 07-10-20 00:25:29, Uladzislau Rezki wrote: > On Mon, Oct 05, 2020 at 05:41:00PM +0200, Michal Hocko wrote: > > On Mon 05-10-20 17:08:01, Uladzislau Rezki wrote: > > > On Fri, Oct 02, 2020 at 11:05:07AM +0200, Michal Hocko wrote: > > > > On Fri 02-10

Re: [RFC][PATCH 00/12] mm: tweak page cache migration

2020-10-07 Thread Michal Hocko
> Cc: linux...@kvack.org > Cc: linux-kernel@vger.kernel.org -- Michal Hocko SUSE Labs

Re: [PATCH v2 3/5] mm/page_alloc: move pages to tail in move_to_free_list()

2020-10-06 Thread Michal Hocko
lvador > Acked-by: Pankaj Gupta > Reviewed-by: Wei Yang > Cc: Andrew Morton > Cc: Alexander Duyck > Cc: Mel Gorman > Cc: Michal Hocko > Cc: Dave Hansen > Cc: Vlastimil Babka > Cc: Wei Yang > Cc: Oscar Salvador > Cc: Mike Rapoport > Cc: Scott C

Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-10-06 Thread Michal Hocko
getting. > I just do not get why hugetlbfs is so special that it can have pagesize > fine control when normal pages cannot get. The “it should be invisible > to userspace” argument suddenly does not hold for hugetlbfs. In short it provides a guarantee. Does the above clarify it a bit? [1] this is not entirely true though because there is a non-trivial admin interface around THP. Mostly because they turned out to be too transparent and many people do care about internal fragmentation, allocation latency, locality (small page on a local node or a large one on a slightly further one?) or simply follow a cargo cult - just have a look how many admin guides recommend disabling THPs. We got seriously burned by 2MB THP because of the way they were enforced on users. -- Michal Hocko SUSE Labs

Re: [PATCH 9/9] mm, page_alloc: optionally disable pcplists during page isolation

2020-10-06 Thread Michal Hocko
On Tue 06-10-20 10:40:23, David Hildenbrand wrote: > On 06.10.20 10:34, Michal Hocko wrote: > > On Tue 22-09-20 16:37:12, Vlastimil Babka wrote: > >> Page isolation can race with process freeing pages to pcplists in a way > >> that > >> a page from isol

Re: [PATCH 9/9] mm, page_alloc: optionally disable pcplists during page isolation

2020-10-06 Thread Michal Hocko
(). > > [1] > https://lore.kernel.org/linux-mm/20200903140032.380431-1-pasha.tatas...@soleen.com/ > > Suggested-by: David Hildenbrand > Suggested-by: Michal Hocko > Signed-off-by: Vlastimil Babka > --- > include/linux/mmzone.h | 2 ++ > include/linux/page-isol

Re: [RFC V2] mm/vmstat: Add events for HugeTLB migration

2020-10-06 Thread Michal Hocko
On Tue 06-10-20 08:26:35, Anshuman Khandual wrote: > > > On 10/05/2020 11:35 AM, Michal Hocko wrote: > > On Mon 05-10-20 07:59:12, Anshuman Khandual wrote: > >> > >> > >> On 10/02/2020 05:34 PM, Michal Hocko wrote: > >>> On Wed 30-09-20 11:30

Re: [PATCH 9/9] mm, page_alloc: optionally disable pcplists during page isolation

2020-10-05 Thread Michal Hocko
On Mon 05-10-20 16:22:46, Vlastimil Babka wrote: > On 10/5/20 4:05 PM, Michal Hocko wrote: > > On Fri 25-09-20 13:10:05, Vlastimil Babka wrote: > >> On 9/25/20 12:54 PM, David Hildenbrand wrote: > >> > >> Hmm that temporary write lock would still block ne

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-10-05 Thread Michal Hocko
On Mon 05-10-20 17:08:01, Uladzislau Rezki wrote: > On Fri, Oct 02, 2020 at 11:05:07AM +0200, Michal Hocko wrote: > > On Fri 02-10-20 09:50:14, Mel Gorman wrote: > > > On Fri, Oct 02, 2020 at 09:11:23AM +0200, Michal Hocko wrote: > > > > On Thu 01-10-20 21

Re: [PATCH 9/9] mm, page_alloc: optionally disable pcplists during page isolation

2020-10-05 Thread Michal Hocko
f (atomic_inc_return == 1) > // atomic_cmpxchg from 0 to 1; if that fails, goto retry > > Tricky, but races could only lead to unnecessary duplicated updates + flushing > but nothing worse? > > Or add another spinlock to cover this part instead of the temp write lock... Do you plan to post a new version or should I review this one? -- Michal Hocko SUSE Labs

Re: [PATCH 8/9] mm, page_alloc: drain all pcplists during memory offline

2020-10-05 Thread Michal Hocko
NATION > > > > /* > > > > Interesting race. Instead of this ugly __drain_all_pages() with a > boolean parameter, can we have two properly named functions to be used > in !page_alloc.c code without scratching your head what the difference is? I tend to agree h

Re: [PATCH 7/9] mm, page_alloc: move draining pcplists to page isolation users

2020-10-05 Thread Michal Hocko
the > current imperfect draining to the callers also as a preparation step. > > Suggested-by: Pavel Tatashin > Signed-off-by: Vlastimil Babka Acked-by: Michal Hocko > --- > mm/memory_hotplug.c | 11 ++- > mm/page_alloc.c | 2 ++ > mm/page_isolation.c |

Re: [PATCH 6/9] mm, page_alloc: cache pageset high and batch in struct zone

2020-10-05 Thread Michal Hocko
per cpu variable into the per cpu area. >*/ > zone->pageset = &boot_pageset; > + zone->pageset_high = BOOT_PAGESET_HIGH; > + zone->pageset_batch = BOOT_PAGESET_BATCH; > > if (populated_zone(zone)) > printk(KERN_DEBUG " %s zone: %lu pages, LIFO batch:%u\n", > -- > 2.28.0 -- Michal Hocko SUSE Labs

Re: [PATCH 5/9] mm, page_alloc: make per_cpu_pageset accessible only after init

2020-10-05 Thread Michal Hocko
> + new_pageset = alloc_percpu(struct per_cpu_pageset); > for_each_possible_cpu(cpu) { > - p = per_cpu_ptr(zone->pageset, cpu); > + p = per_cpu_ptr(new_pageset, cpu); > pageset_init(p); > } > > + smp_store_release(&zone->pageset, new_pageset); > zone_set_pageset_high_and_batch(zone); > } > > -- > 2.28.0 -- Michal Hocko SUSE Labs

Re: [PATCH 4/9] mm, page_alloc: simplify pageset_update()

2020-10-05 Thread Michal Hocko
this should be safe AFAICS. I believe the original intention was good but it didn't go all the way to do the thing properly. I have to admit I have stumbled over this weirdness a few times and never found enough motivation to think that through. Acked-by: Michal Hocko > --- > mm/page_allo

Re: [PATCH 2/9] mm, page_alloc: calculate pageset high and batch once per zone

2020-10-05 Thread Michal Hocko
to all per-cpu pagesets of the zone. > > This also allows removing the zone_pageset_init() and __zone_pcp_update() > wrappers. > > No functional change. > > Signed-off-by: Vlastimil Babka > Reviewed-by: Oscar Salvador > Reviewed-by: David Hildenbrand I like this.

Re: [PATCH 3/9] mm, page_alloc: remove setup_pageset()

2020-10-05 Thread Michal Hocko
. Isn't this more about early zone initialization rather than boot pagesets? Or am I misreading the patch? > + */ > + pcp->high = 0; > + pcp->batch = 1; > } > > /* > -- > 2.28.0 -- Michal Hocko SUSE Labs

Re: [PATCH 1/9] mm, page_alloc: clean up pageset high and batch update

2020-10-05 Thread Michal Hocko
nstead. > > No functional change. > > Signed-off-by: Vlastimil Babka > Reviewed-by: Oscar Salvador yes this looks better, the original code was really hard to follow. Acked-by: Michal Hocko > --- > mm/page_alloc.c | 49 - > 1

Re: [PATCH] mm: optionally disable brk()

2020-10-05 Thread Michal Hocko
On Mon 05-10-20 11:13:48, David Hildenbrand wrote: > On 05.10.20 08:12, Michal Hocko wrote: > > On Sat 03-10-20 00:44:09, Topi Miettinen wrote: > >> On 2.10.2020 20.52, David Hildenbrand wrote: > >>> On 02.10.20 19:19, Topi Miettinen wrote: > >>>>

Re: [PATCH RFC v2] Opportunistic memory reclaim

2020-10-05 Thread Michal Hocko
A similar thing has been proposed recently by Shakeel http://lkml.kernel.org/r/20200909215752.1725525-1-shake...@google.com Please have a look at the follow up discussion. -- Michal Hocko SUSE Labs

Re: [PATCH] mm: optionally disable brk()

2020-10-05 Thread Michal Hocko
ompatibility with legacy software is more important than any hardening. I believe we already do have means to filter syscalls from userspace for security hardened environments. Or is there any reason to duplicate that and control it at configuration time? -- Michal Hocko SUSE Labs

Re: [PATCH] mm/util.c: Add error logs for commitment overflow

2020-10-05 Thread Michal Hocko
On Fri 02-10-20 21:53:37, pi...@codeaurora.org wrote: > On 2020-10-02 17:47, Michal Hocko wrote: > > > > __vm_enough_memory: commitment overflow: ppid:150, pid:164, > > > pages:62451 > > > fork failed[count:0]: Cannot allocate memory > > > > While I u

Re: [PATCH v1 3/5] mm/page_alloc: always move pages to the tail of the freelist in unset_migratetype_isolate()

2020-10-05 Thread Michal Hocko
On Fri 02-10-20 17:20:09, David Hildenbrand wrote: > On 02.10.20 15:24, Michal Hocko wrote: > > On Mon 28-09-20 20:21:08, David Hildenbrand wrote: > >> Page isolation doesn't actually touch the pages, it simply isolates > >> pageblocks and moves all free pages to t

Re: [PATCH] mm: optionally disable brk()

2020-10-05 Thread Michal Hocko
onfigure? How do I know that something won't break? brk() is one of those syscalls that has been here for ever and a lot of userspace might depend on it. I haven't checked but the code size is very unlikely to be shrunk much as this is mostly a tiny wrapper around mmap code. We are not going to get rid of any complexity. So what is the point? -- Michal Hocko SUSE Labs

Re: [RFC V2] mm/vmstat: Add events for HugeTLB migration

2020-10-05 Thread Michal Hocko
On Mon 05-10-20 07:59:12, Anshuman Khandual wrote: > > > On 10/02/2020 05:34 PM, Michal Hocko wrote: > > On Wed 30-09-20 11:30:49, Anshuman Khandual wrote: > >> Add following new vmstat events which will track HugeTLB page migration. > >> > >

Re: [RFC PATCH 1/1] vmscan: Support multiple kswapd threads per node

2020-10-02 Thread Michal Hocko
On Fri 02-10-20 09:53:05, Rik van Riel wrote: > On Fri, 2020-10-02 at 09:03 +0200, Michal Hocko wrote: > > On Thu 01-10-20 18:18:10, Sebastiaan Meijer wrote: > > > (Apologies for messing up the mailing list thread, Gmail had fooled > > > me into > > >

Re: [PATCH v1 5/5] mm/memory_hotplug: update comment regarding zone shuffling

2020-10-02 Thread Michal Hocko
c: Alexander Duyck > Cc: Mel Gorman > Cc: Michal Hocko > Cc: Dave Hansen > Cc: Vlastimil Babka > Cc: Wei Yang > Cc: Oscar Salvador > Cc: Mike Rapoport > Signed-off-by: David Hildenbrand Acked-by: Michal Hocko > --- > mm/memory_hotplug.c | 11 --- >

Re: [PATCH v1 4/5] mm/page_alloc: place pages to tail in __free_pages_core()

2020-10-02 Thread Michal Hocko
do not expect this to make a huge difference but who knows. It makes some sense to add pages in the order they show up in the physical address ordering. > Reviewed-by: Vlastimil Babka > Reviewed-by: Oscar Salvador > Cc: Andrew Morton > Cc: Alexander Duyck > Cc: Mel Gorman > Cc:

Re: [PATCH v1 3/5] mm/page_alloc: always move pages to the tail of the freelist in unset_migratetype_isolate()

2020-10-02 Thread Michal Hocko
Salvador > Cc: Andrew Morton > Cc: Alexander Duyck > Cc: Mel Gorman > Cc: Michal Hocko > Cc: Dave Hansen > Cc: Vlastimil Babka > Cc: Wei Yang > Cc: Oscar Salvador > Cc: Mike Rapoport > Cc: Scott Cheloha > Cc: Michael Ellerman > Signed-off-by:

Re: [PATCH v1 2/5] mm/page_alloc: place pages to tail in __putback_isolated_page()

2020-10-02 Thread Michal Hocko
oing isolation of larger ranges, and after > free_contig_range(). > > Reviewed-by: Alexander Duyck > Reviewed-by: Oscar Salvador > Cc: Andrew Morton > Cc: Alexander Duyck > Cc: Mel Gorman > Cc: Michal Hocko > Cc: Dave Hansen > Cc: Vlastimil Babka > Cc: Wei

Re: [PATCH v1 1/5] mm/page_alloc: convert "report" flag of __free_one_page() to a proper flag

2020-10-02 Thread Michal Hocko
good enough for internal purposes. > > Reviewed-by: Alexander Duyck > Reviewed-by: Vlastimil Babka > Reviewed-by: Oscar Salvador > Cc: Andrew Morton > Cc: Alexander Duyck > Cc: Mel Gorman > Cc: Michal Hocko > Cc: Dave Hansen > Cc: Vlastimil Babka > C

Re: [PATCH] mm/util.c: Add error logs for commitment overflow

2020-10-02 Thread Michal Hocko
es); > > + pr_err_once("%s: commitment overflow: ppid:%d, pid:%d, pages:%ld\n", > + __func__, current->parent->pid, current->pid, pages); > + > return -ENOMEM; > } > > -- > Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., > is a member of Code Aurora Forum, a Linux Foundation Collaborative Project. -- Michal Hocko SUSE Labs

Re: [RFC V2] mm/vmstat: Add events for HugeTLB migration

2020-10-02 Thread Michal Hocko
> + count_vm_events(HUGETLB_MIGRATION_SUCCESS, nr_hugetlb_succeeded); > + count_vm_events(HUGETLB_MIGRATION_FAIL, nr_hugetlb_failed); > trace_mm_migrate_pages(nr_succeeded, nr_failed, nr_thp_succeeded, > -nr_thp_failed, nr_thp_split, mode, reason); > +nr_thp_failed, nr_thp_split, > nr_hugetlb_succeeded, > +nr_hugetlb_failed, mode, reason); > > if (!swapwrite) > current->flags &= ~PF_SWAPWRITE; > diff --git a/mm/vmstat.c b/mm/vmstat.c > index 79e5cd0abd0e..12fd35ba135f 100644 > --- a/mm/vmstat.c > +++ b/mm/vmstat.c > @@ -1286,6 +1286,8 @@ const char * const vmstat_text[] = { > "thp_migration_success", > "thp_migration_fail", > "thp_migration_split", > + "hugetlb_migration_success", > + "hugetlb_migration_fail", > #endif > #ifdef CONFIG_COMPACTION > "compact_migrate_scanned", > -- > 2.20.1 > -- Michal Hocko SUSE Labs

Re: [v5] mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged

2020-10-02 Thread Michal Hocko
may help with the > next message. Auto tuning and user provided override is quite tricky to get sensible. Especially in the case here. Admin has provided an override but has the potential memory hotplug been considered? Or to make it even more complicated, consider that the hotplug happens without admin involvement - e.g. memory gets hotremoved due to HW problems. Is the admin provided value still meaningful? To be honest I do not have a good answer and I am not sure we should care all that much until we see practical problems. -- Michal Hocko SUSE Labs

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-10-02 Thread Michal Hocko
On Fri 02-10-20 09:50:14, Mel Gorman wrote: > On Fri, Oct 02, 2020 at 09:11:23AM +0200, Michal Hocko wrote: > > On Thu 01-10-20 21:26:26, Uladzislau Rezki wrote: > > > > > > > > No, I meant going back to idea of new gfp flag, but adjust the > > >

Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-10-02 Thread Michal Hocko
y do not want to be very explicit about that. E.g. an interface for address space defragmentation without any more specifics sounds like a useful feature to me. It will be up to the kernel to decide which huge pages to use. -- Michal Hocko SUSE Labs

Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-10-02 Thread Michal Hocko
On Thu 01-10-20 11:14:14, Zi Yan wrote: > On 30 Sep 2020, at 7:55, Michal Hocko wrote: > > > On Mon 28-09-20 13:53:58, Zi Yan wrote: > >> From: Zi Yan > >> > >> Hi all, > >> > >> This patchset adds support for 1GB PUD THP on x86

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-10-02 Thread Michal Hocko
f a new gfp flag gains a sufficient traction and support I am _strongly_ opposed against consuming another flag for that. Bit space is limited. Besides that we certainly do not want to allow craziness like __GFP_NO_LOCK | __GFP_RECLAIM (and similar), do we? -- Michal Hocko SUSE Labs

Re: [RFC PATCH 1/1] vmscan: Support multiple kswapd threads per node

2020-10-02 Thread Michal Hocko
On Thu 01-10-20 18:18:10, Sebastiaan Meijer wrote: > (Apologies for messing up the mailing list thread, Gmail had fooled me into > believing that it properly picked up the thread) > > On Thu, 1 Oct 2020 at 14:30, Michal Hocko wrote: > > > > On Wed 30-09-20 21:27:12,

Re: [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible

2020-10-02 Thread Michal Hocko
This commit applies the __GFP_NOMEMALLOC gfp flag to memory allocations > carried out by the single-argument variant of kvfree_rcu(), thus avoiding > this can-sleep code path from dipping into the emergency reserves. > > Suggested-by: Michal Hocko > Signed-off-by: Paul E. McKe

Re: [RFC PATCH 1/1] vmscan: Support multiple kswapd threads per node

2020-10-01 Thread Michal Hocko
t in a context outside of the reclaim? My recollection of the particular patch is dim but I do remember it tried to add more kswapd threads which would just paper over the problem you are seeing rather than solve it. -- Michal Hocko SUSE Labs

Re: [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible

2020-10-01 Thread Michal Hocko
On Wed 30-09-20 16:21:54, Paul E. McKenney wrote: > On Wed, Sep 30, 2020 at 10:41:39AM +0200, Michal Hocko wrote: > > On Tue 29-09-20 18:53:27, Paul E. McKenney wrote: [...] > > > No argument on it being confusing, and I hope that the added header > > > comment helps.

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-30 Thread Michal Hocko
On Wed 30-09-20 13:03:29, Joel Fernandes wrote: > On Wed, Sep 30, 2020 at 12:48 PM Michal Hocko wrote: > > > > On Wed 30-09-20 11:25:17, Joel Fernandes wrote: > > > On Fri, Sep 25, 2020 at 05:47:41PM +0200, Michal Hocko wrote: > > > > On Fri 25-09

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-30 Thread Michal Hocko
On Wed 30-09-20 11:25:17, Joel Fernandes wrote: > On Fri, Sep 25, 2020 at 05:47:41PM +0200, Michal Hocko wrote: > > On Fri 25-09-20 17:31:29, Uladzislau Rezki wrote: > > > > > > > > > > > > > > All good points! > > > > > > >

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-30 Thread Michal Hocko
On Wed 30-09-20 15:39:54, Uladzislau Rezki wrote: > On Wed, Sep 30, 2020 at 02:44:13PM +0200, Michal Hocko wrote: > > On Wed 30-09-20 14:35:35, Uladzislau Rezki wrote: > > > On Wed, Sep 30, 2020 at 11:27:32AM +0200, Michal Hocko wrote: > > > > On Tue 29-09-20 18

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-30 Thread Michal Hocko
On Wed 30-09-20 14:35:35, Uladzislau Rezki wrote: > On Wed, Sep 30, 2020 at 11:27:32AM +0200, Michal Hocko wrote: > > On Tue 29-09-20 18:25:14, Uladzislau Rezki wrote: > > > > > I look at it in scope of GFP_ATOMIC/GFP_NOWAIT issues, i.e. inability > > > > > t

Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64

2020-09-30 Thread Michal Hocko
do we need some sort of access control or privilege check as some THPs would be a really scarce (like those that require pre-reservation). -- Michal Hocko SUSE Labs

Re: [PATCH v3] mm: memcontrol: reword obsolete comment of mem_cgroup_unmark_under_oom()

2020-09-30 Thread Michal Hocko
ense > here because mem_cgroup_oom_lock() does not operate on under_oom field. So > we reword the comment as this would be helpful. > [Thanks Michal Hocko for rewording this comment.] > > Signed-off-by: Miaohe Lin > Cc: Johannes Weiner > Cc: Michal Hocko > Cc: Vladimir Davydov Ac

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-30 Thread Michal Hocko
e want users to be aware of internal implementation details like pcp caches, migrate types or others. While pcp caches are here for years and unlikely to change in a foreseeable future many details are changing on regular basis. -- Michal Hocko SUSE Labs

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-30 Thread Michal Hocko
FP_ATOMIC users can not sleep and need the allocation to succeed. A %lower > > > should be rephrased, IMHO. Any suggestions? Or more specifics about which part is conflicting? It tries to say that there is a higher demand to succeed even though the context cannot sleep to take active measures to achieve that. So the only way to achieve that is to break the watermarks to a certain degree, which makes them a "higher class" than other allocations. -- Michal Hocko SUSE Labs
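The point about breaking the watermarks can be illustrated with a toy userspace model. This is only a sketch, not the kernel's actual check: the names toy_watermark_ok, ALLOC_CLASS_NORMAL and ALLOC_CLASS_ATOMIC are hypothetical, and the "half of the watermark" fraction is an assumption chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical allocation classes for this toy model; not kernel identifiers. */
enum alloc_class {
	ALLOC_CLASS_NORMAL,	/* may sleep and reclaim, so it honours the full watermark */
	ALLOC_CLASS_ATOMIC,	/* cannot sleep, so it may dip below the watermark */
};

/*
 * Return true if a request for nr_pages may proceed without reclaim.
 * The fraction by which the atomic class lowers the watermark (one half)
 * is an illustrative assumption only.
 */
static bool toy_watermark_ok(unsigned long free_pages, unsigned long nr_pages,
			     unsigned long min_wmark, enum alloc_class cls)
{
	unsigned long min = min_wmark;

	if (cls == ALLOC_CLASS_ATOMIC)
		min -= min / 2;	/* atomic callers may consume part of the reserve */

	if (free_pages < nr_pages)
		return false;	/* not enough free pages at all */

	return free_pages - nr_pages >= min;
}
```

With 100 free pages and a watermark of 128, a one-page request in the normal class fails this check (and would have to reclaim), while the same request in the atomic class passes because it cannot take any other measure to succeed.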

Re: [PATCH v2] mm: memcontrol: remove obsolete comment of mem_cgroup_unmark_under_oom()

2020-09-30 Thread Michal Hocko
On Wed 30-09-20 01:34:25, linmiaohe wrote: > Michal Hocko wrote: > > On Thu 17-09-20 06:59:00, Miaohe Lin wrote: > >> Since commit 79dfdaccd1d5 ("memcg: make oom_lock 0 and 1 based rather > >> than counter"), the mem_cgroup_unmark_under

Re: [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible

2020-09-30 Thread Michal Hocko
On Tue 29-09-20 18:53:27, Paul E. McKenney wrote: > On Tue, Sep 29, 2020 at 02:07:56PM +0200, Michal Hocko wrote: > > On Mon 28-09-20 16:31:01, paul...@kernel.org wrote: > > [...] > > Apologies for the delay, but today has not been boring. > > > > This

Re: [PATCH] memcg: introduce per-memcg reclaim interface

2020-09-29 Thread Michal Hocko
cause that tends to be tricky from the configuration POV as you mentioned above. But a new limit (memory.middle for a lack of a better name) to define the background reclaim sounds like a good fit with above points. -- Michal Hocko SUSE Labs

Re: [v4] mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged

2020-09-29 Thread Michal Hocko
ult set when THP enabled is lost. This change restores min_free_kbytes > as expected for THP consumers. > > Fixes: f000565adb77 ("thp: set recommended min free kbytes") > > Signed-off-by: Vijay Balakrishna > Cc: sta...@vger.kernel.org > Reviewed-by: Pavel Tatashin

Re: [patch 00/13] preempt: Make preempt count unconditional

2020-09-29 Thread Michal Hocko
On Tue 29-09-20 11:00:03, Daniel Vetter wrote: > On Tue, Sep 29, 2020 at 10:19:38AM +0200, Michal Hocko wrote: > > On Wed 16-09-20 23:43:02, Daniel Vetter wrote: > > > I can > > > then figure out whether it's better to risk not spotting issues with > > > call_

Re: [PATCH v2] mm: memcontrol: remove obsolete comment of mem_cgroup_unmark_under_oom()

2020-09-29 Thread Michal Hocko
oom, > - * mem_cgroup_oom_lock() may not be called. Watch for underflow. > - */ > spin_lock(&memcg_oom_lock); > for_each_mem_cgroup_tree(iter, memcg) > if (iter->under_oom > 0) > -- > 2.19.1 -- Michal Hocko SUSE Labs

[PATCH] mm: clarify usage of GFP_ATOMIC in !preemptible contexts

2020-09-29 Thread Michal Hocko
From: Michal Hocko There is a general understanding that GFP_ATOMIC/GFP_NOWAIT are to be used from atomic contexts. E.g. from within a spin lock or from the IRQ context. This is correct but there are some atomic contexts where the above doesn't hold. One of them would be an NMI context. Page

Re: [PATCH tip/core/rcu 14/15] rcu/tree: Allocate a page when caller is preemptible

2020-09-29 Thread Michal Hocko
>bkvhead[idx] || > + (*krcp)->bkvhead[idx]->nr_records == > KVFREE_BULK_MAX_ENTR) { > + bnode = get_cached_bnode(*krcp); > + if (!bnode && can_alloc_page) { > + krc_this_cpu_unlock(*krcp, *flags); > + bnode = kmalloc(PAGE_SIZE, gfp); What is the point of calling kmalloc for a PAGE_SIZE object? Wouldn't using the page allocator directly be better? -- Michal Hocko SUSE Labs

Re: [PATCH v2 for v5.9] mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs

2020-09-29 Thread Michal Hocko
On Tue 29-09-20 17:38:43, Joonsoo Kim wrote: > 2020년 9월 29일 (화) 오후 5:08, Michal Hocko 님이 작성: > > > > On Mon 28-09-20 17:50:46, Joonsoo Kim wrote: > > > From: Joonsoo Kim > > > > > > memalloc_nocma_{save/restore} APIs can be used to skip page allocation

Re: [patch 00/13] preempt: Make preempt count unconditional

2020-09-29 Thread Michal Hocko
as to carefully consider failure. This is not a random allocation mode. -- Michal Hocko SUSE Labs

Re: [PATCH v2 for v5.9] mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore} APIs

2020-09-29 Thread Michal Hocko
age = __rmqueue_smallest(zone, order, > MIGRATE_HIGHATOMIC); > if (page) > trace_mm_page_alloc_zone_locked(page, order, > migratetype); But this condition is not clear to me. __rmqueue_smallest doesn't access pcp lists. Maybe I have missed the point in the original discussion but this deserves a comment at least. > -- > 2.7.4 -- Michal Hocko SUSE Labs

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-25 Thread Michal Hocko
to allow access to "atomic reserves" + * watermark is applied to allow access to "atomic reserves". + * The current implementation doesn't support NMI and other non-preemptive context + * (e.g. raw_spin_lock). * * %GFP_KERNEL is typical for kernel-internal allocations. The caller requires * %ZONE_NORMAL or a lower zone for direct access but can direct reclaim. [...] -- Michal Hocko SUSE Labs

Re: Ways to deprecate /sys/devices/system/memory/memoryX/phys_device ?

2020-09-25 Thread Michal Hocko
an find or somebody will show them. Really, deprecation has never really worked. The only thing that worked was to remove the functionality and then wait for somebody to complain and revert or somehow allow the functionality without the need to alter userspace. As much as I would like to remove as much crud as possible I strongly suspect that the existing hotplug interface is just a lost cause and putting lipstick on a pig is not the best use of time. Even if we remove this particular interface we are not going to get rid of a lot of code, nor will we gain any more sensible semantics, right? -- Michal Hocko SUSE Labs

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-25 Thread Michal Hocko
am afraid that we are going in circles here. We do not have any meaningful numbers to claim memory footprint problems. There is a clear opposition to hook into page allocator for reasons already mentioned. You are looking for a dedicated memory pool and it should be quite trivial to develop one and fine tune it for your specific usecase. All that on top of page allocator. Unless this is seen as completely unfeasible based on some solid arguments then we can start talking about the page allocator itself. -- Michal Hocko SUSE Labs

Re: [v3 1/2] mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged

2020-09-25 Thread Michal Hocko
2070726.dlw24lf3wd3p2...@black.fi.intel.com -- Michal Hocko SUSE Labs

Re: [v3 1/2] mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged

2020-09-25 Thread Michal Hocko
ith this patch. I am not sure this is worth backporting to stable trees because this is not a functional bug. Surprising behavior, yes, but not much more than that. Acked-by: Michal Hocko One minor comment below [...] > @@ -857,6 +858,7 @@ int __ref online_pages(unsigned long pfn, unsigned

Re: [PATCH 1/2] vmalloc: Free pages as a batch

2020-09-23 Thread Michal Hocko
page *page = area->pages[i]; > - > - BUG_ON(!page); > - __free_pages(page, 0); > - } > + release_pages(area->pages, area->nr_pages); > atomic_long_sub(area->nr_pages, &nr_vmalloc_pages); > - > kvfree(area->pages); > } > > -- > 2.28.0 -- Michal Hocko SUSE Labs

Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

2020-09-23 Thread Michal Hocko
On Mon 21-09-20 18:06:44, Michal Hocko wrote: [...] > Thanks a lot for this clarification! So I believe the only existing bug > is in documentation which should be explicit that the cgroup fd read > access is not sufficient because it also requires to have a write access > for

Re: [PATCH] memcg: introduce per-memcg reclaim interface

2020-09-22 Thread Michal Hocko
On Tue 22-09-20 11:10:17, Shakeel Butt wrote: > On Tue, Sep 22, 2020 at 9:55 AM Michal Hocko wrote: [...] > > Last but not least the memcg > > background reclaim is something that should be possible without a new > > interface. > > So, it comes down to adding m

Re: [PATCH] memcg: introduce per-memcg reclaim interface

2020-09-22 Thread Michal Hocko
On Tue 22-09-20 11:10:17, Shakeel Butt wrote: > On Tue, Sep 22, 2020 at 9:55 AM Michal Hocko wrote: [...] > > So far I have learned that you are primarily working around an > > implementation detail in the zswap which is doing the swapout path > > directly in the pageout pa

Re: Machine lockups on extreme memory pressure

2020-09-22 Thread Michal Hocko
On Tue 22-09-20 09:51:30, Shakeel Butt wrote: > On Tue, Sep 22, 2020 at 9:34 AM Michal Hocko wrote: > > > > On Tue 22-09-20 09:29:48, Shakeel Butt wrote: [...] > > > Anyways, what do you think of the in-kernel PSI based > > > oom-kill trigger. I think Johannes ha

Re: [PATCH] memcg: introduce per-memcg reclaim interface

2020-09-22 Thread Michal Hocko
On Tue 22-09-20 08:54:25, Shakeel Butt wrote: > On Tue, Sep 22, 2020 at 4:49 AM Michal Hocko wrote: > > > > On Mon 21-09-20 10:50:14, Shakeel Butt wrote: [...] > > > Let me add one more point. Even if the high limit reclaim is swift, it > > > can still tak

Re: Machine lockups on extreme memory pressure

2020-09-22 Thread Michal Hocko
On Tue 22-09-20 09:29:48, Shakeel Butt wrote: > On Tue, Sep 22, 2020 at 8:16 AM Michal Hocko wrote: > > > > On Tue 22-09-20 06:37:02, Shakeel Butt wrote: [...] > > > I talked about this problem with Johannes at LPC 2019 and I think we > > > talked abo

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-22 Thread Michal Hocko
ld you please elaborate? Do not want to speculate :) It threw a 501 at me. lkml.org is quite unreliable. It works now. I will read through that. Please use lore or lkml.kernel.org/r/$msg in the future. -- Michal Hocko SUSE Labs

Re: Machine lockups on extreme memory pressure

2020-09-22 Thread Michal Hocko
he second one > might help. Why does your oomd depend on memory allocation? -- Michal Hocko SUSE Labs

Re: [PATCH] memcg: introduce per-memcg reclaim interface

2020-09-22 Thread Michal Hocko
On Mon 21-09-20 10:50:14, Shakeel Butt wrote: > On Mon, Sep 21, 2020 at 9:30 AM Michal Hocko wrote: > > > > On Wed 09-09-20 14:57:52, Shakeel Butt wrote: > > > Introduce an memcg interface to trigger memory reclaim on a memory cgroup. > > > > > > Use

Re: Machine lockups on extreme memory pressure

2020-09-22 Thread Michal Hocko
much memory > pressure. > > I am wondering if anyone else has seen a similar situation in production > and if there is a recommended way to resolve this situation. I would recommend focusing on tracking down who is blocking further progress. -- Michal Hocko SUSE Labs

Re: [v4] mm: khugepaged: avoid overriding min_free_kbytes set by user

2020-09-22 Thread Michal Hocko
to tuned value is to be expected. The primary problem is that hot-adding memory after boot (without any user configured value) will effectively decrease the value because khugepaged tuning (set_recommended_min_free_kbytes) is not called. -- Michal Hocko SUSE Labs

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-22 Thread Michal Hocko
On Tue 22-09-20 16:06:31, Yafang Shao wrote: > On Tue, Sep 22, 2020 at 3:27 PM Michal Hocko wrote: [...] > > What is the latency triggered by the memory reclaim? It should be mostly > > a clean page cache right as drop_caches only drops clean pages. Or is > > this more ab

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-22 Thread Michal Hocko
On Mon 21-09-20 20:35:53, Paul E. McKenney wrote: > On Mon, Sep 21, 2020 at 06:03:18PM +0200, Michal Hocko wrote: > > On Mon 21-09-20 08:45:58, Paul E. McKenney wrote: > > > On Mon, Sep 21, 2020 at 09:47:16AM +0200, Michal Hocko wrote: > > > > On Fri 18-09-20 21

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-22 Thread Michal Hocko
eed to reclaim. Otherwise they are constantly refilled/rebalanced on demand. The fact that you are refilling them from outside just suggests that you are operating on the wrong layer. Really, create your own pool of pages and rebalance them based on the workload. > Could you please specify a real test case or workload you are talking about? I am not a performance expert but essentially any memory allocator heavy workload might notice. I am pretty sure Mel would tell you more. -- Michal Hocko SUSE Labs

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-22 Thread Michal Hocko
On Tue 22-09-20 12:20:52, Yafang Shao wrote: > On Mon, Sep 21, 2020 at 7:36 PM Michal Hocko wrote: > > > > On Mon 21-09-20 19:23:01, Yafang Shao wrote: > > > On Mon, Sep 21, 2020 at 7:05 PM Michal Hocko wrote: > > > > > > > > On Mon 21-09-20 18:55

Re: [PATCH] memcg: introduce per-memcg reclaim interface

2020-09-21 Thread Michal Hocko
ds like something too easy to use incorrectly (remember drop_caches). I am also a bit worried about corner cases which would be easier to hit - e.g. fill up the swap limit and turn anonymous memory into unreclaimable and who knows what else. -- Michal Hocko SUSE Labs

Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

2020-09-21 Thread Michal Hocko
On Mon 21-09-20 17:04:50, Christian Brauner wrote: > On Mon, Sep 21, 2020 at 04:55:37PM +0200, Michal Hocko wrote: > > On Mon 21-09-20 16:43:55, Christian Brauner wrote: > > > On Mon, Sep 21, 2020 at 10:38:47AM -0400, Tejun Heo wrote: > > > > Hello, > > > &

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-21 Thread Michal Hocko
On Mon 21-09-20 08:45:58, Paul E. McKenney wrote: > On Mon, Sep 21, 2020 at 09:47:16AM +0200, Michal Hocko wrote: > > On Fri 18-09-20 21:48:15, Uladzislau Rezki (Sony) wrote: > > [...] > > > Proposal > > > > > > Introduce a lock-free function

Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

2020-09-21 Thread Michal Hocko
On Mon 21-09-20 16:41:34, Christian Brauner wrote: > On Mon, Sep 21, 2020 at 03:42:00PM +0200, Michal Hocko wrote: > > [Cc Tejun and Christian - this is a part of a larger discussion which is > > not directly related to this particular question so let me trim the > > origi

Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

2020-09-21 Thread Michal Hocko
On Mon 21-09-20 16:43:55, Christian Brauner wrote: > On Mon, Sep 21, 2020 at 10:38:47AM -0400, Tejun Heo wrote: > > Hello, > > > > On Mon, Sep 21, 2020 at 04:28:34PM +0200, Michal Hocko wrote: > > > Fundamentally CLONE_INTO_CGROUP is similar to regular fork + move to

Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

2020-09-21 Thread Michal Hocko
On Mon 21-09-20 10:18:30, Peter Xu wrote: > Hi, Michal, > > On Mon, Sep 21, 2020 at 03:42:00PM +0200, Michal Hocko wrote: [...] > > I have only now > > learned about this feature so I am not deeply familiar with all the > > details and I might be easily wrong. No

Re: [PATCH 1/4] mm: Trial do_wp_page() simplification

2020-09-21 Thread Michal Hocko
ces bound to child's lifetime but accounted to the parent's memcg which can lead to all sorts of interesting problems (e.g. unreclaimable memory - even by the oom killer). Christian, Tejun, is this the expected semantic or am I just misreading the code? -- Michal Hocko SUSE Labs

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-21 Thread Michal Hocko
On Mon 21-09-20 19:23:01, Yafang Shao wrote: > On Mon, Sep 21, 2020 at 7:05 PM Michal Hocko wrote: > > > > On Mon 21-09-20 18:55:40, Yafang Shao wrote: > > > On Mon, Sep 21, 2020 at 4:12 PM Michal Hocko wrote: > > > > > > > > On Mon 21

Re: [PATCH 02/13] mm: use page_off_lru()

2020-09-21 Thread Michal Hocko
On Fri 18-09-20 12:53:58, Yu Zhao wrote: > On Fri, Sep 18, 2020 at 01:09:14PM +0200, Michal Hocko wrote: > > On Fri 18-09-20 04:27:13, Yu Zhao wrote: > > > On Fri, Sep 18, 2020 at 09:37:00AM +0200, Michal Hocko wrote: > > > > On Thu 17-09-20 21:00:40, Yu Zhao wrote:

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-21 Thread Michal Hocko
On Mon 21-09-20 18:55:40, Yafang Shao wrote: > On Mon, Sep 21, 2020 at 4:12 PM Michal Hocko wrote: > > > > On Mon 21-09-20 16:02:55, zangchun...@bytedance.com wrote: > > > From: Chunxin Zang > > > > > > In the cgroup v1, we have 'force_empty' interfac

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-21 Thread Michal Hocko
> { > + .name = "drop_cache", > + .flags = CFTYPE_NOT_ON_ROOT, > + .write = mem_cgroup_force_empty_write, > + }, > + { > .name = "events", > .flags = CFTYPE_NOT_ON_ROOT, > .file_offset = offsetof(struct mem_cgroup, events_file), > -- > 2.11.0 -- Michal Hocko SUSE Labs

Re: [RFC-PATCH 2/4] mm: Add __rcu_alloc_page_lockless() func.

2020-09-21 Thread Michal Hocko
going to do any good for long term maintainability. -- Michal Hocko SUSE Labs

Re: [PATCH v9 3/3] mm/madvise: introduce process_madvise() syscall: an external memory hinting API

2020-09-21 Thread Michal Hocko
re asking for a long time. This functionality shouldn't be much different from the standard memory reclaim. It has some limitations (e.g. it can only handle mapped memory) but allows to pro-actively swap out or reclaim disk based memory based on a specific knowledge of the workload. Kernel is not able to do the same. [1] http://lkml.kernel.org/r/20200117115225.gv19...@dhcp22.suse.cz -- Michal Hocko SUSE Labs

Re: [[PATCH]] mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged

2020-09-21 Thread Michal Hocko
On Fri 18-09-20 08:32:13, Vijay Balakrishna wrote: > > > On 9/17/2020 10:45 PM, Michal Hocko wrote: > > On Thu 17-09-20 11:03:56, Vijay Balakrishna wrote: > > [...] > > > > > The auto tuned value is incorrect post hotplug memory operation, in > >

Re: [PATCH 02/13] mm: use page_off_lru()

2020-09-18 Thread Michal Hocko
On Fri 18-09-20 04:27:13, Yu Zhao wrote: > On Fri, Sep 18, 2020 at 09:37:00AM +0200, Michal Hocko wrote: [...] > And I have asked this before: why does 'the compound page situation' > even matter here? Perhaps if you could give a concrete example related > to the code change and help m

Re: [PATCH 02/13] mm: use page_off_lru()

2020-09-18 Thread Michal Hocko
On Fri 18-09-20 04:27:13, Yu Zhao wrote: > On Fri, Sep 18, 2020 at 09:37:00AM +0200, Michal Hocko wrote: > > On Thu 17-09-20 21:00:40, Yu Zhao wrote: > > > This patch replaces the only open-coded __ClearPageActive() with > > > page_off_lru(). There is no open-code

Re: [PATCH 00/13] mm: clean up some lru related pieces

2020-09-18 Thread Michal Hocko
other might think differently but as it is not clear what is your actual goal here it is hard to judge pros and cons. -- Michal Hocko SUSE Labs

Re: [PATCH 03/13] mm: move __ClearPageLRU() into page_off_lru()

2020-09-18 Thread Michal Hocko
@@ static unsigned noinline_for_stack > move_pages_to_lru(struct lruvec *lruvec, > add_page_to_lru_list(page, lruvec, lru); > > if (put_page_testzero(page)) { > - __ClearPageLRU(page); > del_page_from_lru_list(page, lruvec, > page_off_lru(page)); > > if (unlikely(PageCompound(page))) { > -- > 2.28.0.681.g6f77f65b4e-goog -- Michal Hocko SUSE Labs
