Joonsoo worked in the past and I believe following up on that
work was recommended the last time a similar/same approach to this
patch was proposed.
--
Michal Hocko
SUSE Labs
se anything
but from what I understand the conversion should be pretty
straightforward, albeit noisy.
One thing that was really strange to me when seeing the concept for the
first time was the choice of naming (no I do not want to start any
bikeshedding) because it hasn't really resonated with the underlying
concept. Maybe it's just me as a non-native speaker... page_head would have
been so much more straightforward but not something I really care about.
--
Michal Hocko
SUSE Labs
return -ENOMEM;
> ret = single_open(file, show, data);
> if (ret) {
> - kvfree(buf);
> + vfree(buf);
> return ret;
> }
> ((struct seq_file *)file->private_data)->buf = buf;
> --
> 2.25.1
--
Michal Hocko
SUSE Labs
k_struct *task,
> void *arg)
> static void select_bad_process(struct oom_control *oc)
> {
> oc->chosen_points = LONG_MIN;
> + oc->chosen = NULL;
>
> if (is_memcg_oom(oc))
> mem_cgroup_scan_tasks(oc->memcg, oom_evaluate_task, oc);
> --
> 1.8.3.1
>
--
Michal Hocko
SUSE Labs
operation happens only once. This is also in
> line with the pcp allocator caches which are disabled for the offlining as
> well.
>
> Signed-off-by: Minchan Kim
Looks good to me
Acked-by: Michal Hocko
Thanks
--
Michal Hocko
SUSE Labs
e: migrate_prep_local in compaction.c changed into lru_add_drain
> to avoid CPU schedule cost with involving many other CPUs to keep
> old behavior.
>
> Signed-off-by: Minchan Kim
Acked-by: Michal Hocko
Btw. that migrate_prep_local likely needs revisiting. I really fail to
see why
On Thu 11-03-21 14:53:08, Mike Kravetz wrote:
> On 3/11/21 9:59 AM, Mike Kravetz wrote:
> > On 3/11/21 4:17 AM, Michal Hocko wrote:
> >>> Yeah per cpu preempt counting shouldn't be noticeable but I have to
> >>> confess I haven't benchmarked it.
> >>
&
On Thu 11-03-21 10:21:39, Johannes Weiner wrote:
> On Thu, Mar 11, 2021 at 09:37:02AM +0100, Michal Hocko wrote:
> > Johannes, Hugh,
> >
> > what do you think about this approach? If we want to stick with
> > split_page approach then we need to update the missing place
date_end
> should *not* be called.
Yes, this is what I remember when introducing the nonblock interface. So I
agree with Jason that this patch is not correct. The interface is subtle but
I remember we couldn't come up with something more robust and still allow
memory with notifiers to be reapable.
--
Michal Hocko
SUSE Labs
less optimal in normal case (e.g.
hugetlb is almost never freed from an atomic context - one has to be
really creative to achieve that). So where do we draw the line?
--
Michal Hocko
SUSE Labs
On Thu 11-03-21 09:40:57, Michal Hocko wrote:
> On Wed 10-03-21 15:28:51, Paul E. McKenney wrote:
> > On Wed, Mar 10, 2021 at 02:10:12PM -0800, Mike Kravetz wrote:
> > > On 3/10/21 1:49 PM, Paul E. McKenney wrote:
> > > > On Wed, Mar 10, 2021 at 10:11:
w
more of these. Hugetlb code paths shouldn't really think about the size of
struct page.
--
Michal Hocko
SUSE Labs
On Thu 11-03-21 12:36:51, Peter Zijlstra wrote:
> On Thu, Mar 11, 2021 at 12:09:15PM +0100, Michal Hocko wrote:
>
> > Sorry for being dense but I do not follow. You have provided the
> > following example
> > spin_lock();
> >
> > spin_lock();
&
On Thu 11-03-21 10:52:50, Peter Zijlstra wrote:
> On Thu, Mar 11, 2021 at 10:44:56AM +0100, Michal Hocko wrote:
> > On Thu 11-03-21 10:32:24, Peter Zijlstra wrote:
> > > The whole changelog reads like a trainwreck, but akpm already commented
> > > on that. I
On Thu 11-03-21 10:32:24, Peter Zijlstra wrote:
> On Thu, Mar 11, 2021 at 10:01:22AM +0100, Michal Hocko wrote:
> > On Thu 11-03-21 09:46:30, Peter Zijlstra wrote:
> > > On Wed, Mar 10, 2021 at 06:13:21PM -0800, Mike Kravetz wrote:
> > > > from irq context. Chang
On Thu 11-03-21 17:08:34, Muchun Song wrote:
> On Thu, Mar 11, 2021 at 4:55 PM Michal Hocko wrote:
> >
> > On Thu 11-03-21 15:33:20, Muchun Song wrote:
> > > On Wed, Mar 10, 2021 at 11:41 PM Michal Hocko wrote:
> > > >
> > > > On Mon 08-0
not even sure whether we
need to care about irq disabled regions without any locks held that
wouldn't be covered by in_atomic. But it would be safer to add an
irqs_disabled() check as well.
--
Michal Hocko
SUSE Labs
On Thu 11-03-21 15:33:20, Muchun Song wrote:
> On Wed, Mar 10, 2021 at 11:41 PM Michal Hocko wrote:
> >
> > On Mon 08-03-21 18:28:07, Muchun Song wrote:
> > > When the "struct page size" crosses page boundaries we cannot
> > > make use of thi
On Thu 11-03-21 16:45:51, Muchun Song wrote:
> On Thu, Mar 11, 2021 at 10:58 AM Muchun Song wrote:
> >
> > On Wed, Mar 10, 2021 at 10:14 PM Michal Hocko wrote:
> > >
> > > [I am sorry for a late review]
> >
> > Thanks for your review.
> >
>
On Thu 11-03-21 14:34:04, Muchun Song wrote:
> On Wed, Mar 10, 2021 at 11:28 PM Michal Hocko wrote:
> >
> > On Mon 08-03-21 18:28:03, Muchun Song wrote:
> > > Because we reuse the first tail vmemmap page frame and remap it
> > > with read-only, we cannot set th
On Thu 11-03-21 12:26:32, Muchun Song wrote:
> On Wed, Mar 10, 2021 at 11:19 PM Michal Hocko wrote:
> >
> > On Mon 08-03-21 18:28:02, Muchun Song wrote:
[...]
> > > @@ -1771,8 +1813,12 @@ int dissolve_free_huge_page(struct page *page)
> > >
On Wed 10-03-21 15:28:51, Paul E. McKenney wrote:
> On Wed, Mar 10, 2021 at 02:10:12PM -0800, Mike Kravetz wrote:
> > On 3/10/21 1:49 PM, Paul E. McKenney wrote:
> > > On Wed, Mar 10, 2021 at 10:11:22PM +0100, Michal Hocko wrote:
> > >> On Wed 10-03-21 10:56:08, Mike
Johannes, Hugh,
what do you think about this approach? If we want to stick with
split_page approach then we need to update the missing place Matthew has
pointed out.
On Tue 09-03-21 14:03:36, Michal Hocko wrote:
> On Tue 09-03-21 12:32:55, Matthew Wilcox wrote:
> > On Tue, Mar 09, 2021
On Thu 11-03-21 09:26:03, Michal Hocko wrote:
> On Wed 10-03-21 18:13:21, Mike Kravetz wrote:
> > put_page does not correctly handle all calling contexts for hugetlb
> > pages. This was recently discussed in the threads [1] and [2].
> >
> > free_huge_page is the r
..@google.com/
> [2] https://lore.kernel.org/linux-mm/yejji9oawhuza...@dhcp22.suse.cz/
> [3] https://lore.kernel.org/linux-mm/ydzaawk41k4gd...@dhcp22.suse.cz/
>
> Suggested-by: Michal Hocko
> Signed-off-by: Mike Kravetz
While not an ideal solution I believe this is the most straightf
hugetlb specific knowledge.
> Or something else?
>
> Is anyone looking onto fixing this for real?
Mike said he would be looking into making hugetlb_lock irq safe but
it is a non-trivial path there and this would not be a great candidate
for backporting.
Btw. RCU already wants to have a reliable in_atomic as well and that
effectively means enabling PREEMPT_COUNT for everybody. The overhead of
the per-cpu preempt counter should be pretty much invisible AFAIK.
--
Michal Hocko
SUSE Labs
On Wed 10-03-21 10:56:08, Mike Kravetz wrote:
> On 3/10/21 7:19 AM, Michal Hocko wrote:
> > On Mon 08-03-21 18:28:02, Muchun Song wrote:
> > [...]
> >> @@ -1447,7 +1486,7 @@ void free_huge_page(struct page *page)
> >>/*
> >> * Defer freeing if
On Wed 10-03-21 11:46:57, Zi Yan wrote:
> On 10 Mar 2021, at 11:23, Michal Hocko wrote:
>
> > On Mon 08-03-21 16:18:52, Mike Kravetz wrote:
> > [...]
> >> Converting larger to smaller hugetlb pages can be accomplished today by
> >> first freeing th
On Wed 10-03-21 08:05:36, Minchan Kim wrote:
> On Wed, Mar 10, 2021 at 02:07:05PM +0100, Michal Hocko wrote:
[...]
> > That is a lot of churn indeed. Have you considered adding $FOO_lglvl
> > variants for those so that you can use them for your particular case
> > without affec
?
Is this all really worth the additional code in something as tricky as
the hugetlb code base?
> include/linux/hugetlb.h | 8 ++
> mm/hugetlb.c| 199 +++-
> 2 files changed, 204 insertions(+), 3 deletions(-)
>
> --
> 2.29.2
>
--
Michal Hocko
SUSE Labs
> + * This check aims to let the compiler help us optimize the code as
> + * much as possible.
> + */
> + if (!is_power_of_2(sizeof(struct page)))
> + return 0;
> return h->nr_free_vmemmap_pages;
> }
> #else
> --
> 2.11.0
>
--
Michal Hocko
SUSE Labs
Salvador
> Reviewed-by: Miaohe Lin
> Tested-by: Chen Huang
> Tested-by: Bodeddula Balasubramaniam
Acked-by: Michal Hocko
> ---
> include/linux/hugetlb.h| 24 ++--
> include/linux/hugetlb_cgroup.h | 19 +++
> mm/hugetlb.c
to corner cases under heavy
memory pressure.
> +
> + on: enable the feature
> + off: disable the feature
> +
> hung_task_panic=
> [KNL] Should the hung task detector generate panics.
> Format: 0 | 1
--
Michal Hocko
SUSE Labs
ad, page);
> list_del(&head->lru);
> h->free_huge_pages--;
> h->free_huge_pages_node[nid]--;
> @@ -1818,6 +1881,7 @@ int dissolve_free_huge_page(struct page *page)
> h->surplus_huge_pages--;
> h->surplus_huge_pages_node[nid]--;
> h->max_huge_pages++;
> + hwpoison_subpage_clear(h, head);
> }
> }
> out:
> --
> 2.11.0
>
--
Michal Hocko
SUSE Labs
--;
> h->max_huge_pages--;
> - update_and_free_page(h, head);
> - rc = 0;
> + rc = update_and_free_page(h, head);
> + if (rc) {
> + h->surplus_huge_pages--;
> + h->surplus_huge_pages_node[nid]--;
> + h->max_huge_pages++;
This is quite ugly and confusing. update_and_free_page is careful to do
the proper counter accounting and now you just override it partially.
Why can we not rely on update_and_free_page to do the right thing?
--
Michal Hocko
SUSE Labs
de directly. Talking about struct pages backing struct
pages (vmemmap) is usually a good recipe for headache but those diagrams
make it easy to follow the reasoning.
Anyway
Acked-by: Michal Hocko
> ---
> include/linux/bootmem_info.h | 27 +-
> include/linux/mm.h | 3 +
> m
SPARSE.
BOOTMEM_INFO_NODE is really an odd thing to depend on here. There is
some functionality which requires the node info but that can be gated
specifically. Or what is the thinking behind this?
This doesn't matter right now because it seems that the *_page_bootmem
is only used by x86 outside of
On Wed 10-03-21 13:26:23, Matthew Wilcox wrote:
> On Tue, Mar 09, 2021 at 10:32:51AM +0100, Michal Hocko wrote:
> > Apart from the above, do we have to warn for something that is a
> > debugging aid? A similar concern wrt dump_page which uses pr_warn and
> > page owner i
mm/kfence/report.c | 3 ++-
> mm/kmemleak.c | 2 +-
> mm/memory.c | 2 +-
> mm/memory_hotplug.c | 4 ++--
> mm/page_alloc.c | 4 ++--
> mm/page_isolation.c | 2 +-
> mm/page_owner.c | 24 +++---
> 25 files changed, 88 insertions(+), 69 deletions(-)
That is a lot of churn indeed. Have you considered adding $FOO_lglvl
variants for those so that you can use them for your particular case
without affecting most of existing users? Something similar we have
discussed in other email thread regarding lru_add_drain_all?
--
Michal Hocko
SUSE Labs
On Tue 09-03-21 09:27:51, Minchan Kim wrote:
> On Tue, Mar 09, 2021 at 05:32:08PM +0100, Michal Hocko wrote:
> > On Tue 09-03-21 08:15:41, Minchan Kim wrote:
> > > On Tue, Mar 09, 2021 at 10:32:51AM +0100, Michal Hocko wrote:
> > > > On Mon 08-03-
nk it would be better to have a
proper allocation flags in the initial patch which implements the
fallback.
> + }
> +
> page = __alloc_pages_nodemask(gfp_mask, order,
> policy_node(gfp, pol, preferred_nid),
>
On Tue 09-03-21 08:29:21, Minchan Kim wrote:
> On Tue, Mar 09, 2021 at 12:03:08PM +0100, Michal Hocko wrote:
[...]
> > Sorry for nit picking but I think the additional abstraction for
> > migrate_prep is not really needed and we can remove some more code.
> > Maybe w
On Tue 09-03-21 08:15:41, Minchan Kim wrote:
> On Tue, Mar 09, 2021 at 10:32:51AM +0100, Michal Hocko wrote:
> > On Mon 08-03-21 12:20:47, Minchan Kim wrote:
> > > alloc_contig_range is usually used on cma area or movable zone.
> > > It's critical if the page migrat
On Tue 09-03-21 12:32:55, Matthew Wilcox wrote:
> On Tue, Mar 09, 2021 at 10:02:00AM +0100, Michal Hocko wrote:
[...]
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 913c2b9e5c72..d44dea2b8d22 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
>
 * local_lock or preemption disabled would be ordered by that.
+ * The atomic operation doesn't need to have stronger ordering
+ * requirements because that is enforced by the scheduling
+ * guarantees.
+ */
+ __lru_add_drain_all(true);
+#else
+ lru_add_drain();
+#endif
+}
+
/**
* release_pages - batched put_page()
* @pages: array of pages to release
--
Michal Hocko
SUSE Labs
age(page, "migration failure");
> + }
Apart from the above, do we have to warn for something that is a
debugging aid? A similar concern wrt dump_page which uses pr_warn and
page owner is using even pr_alert.
Would it make sense to add a loglevel parameter both into __dump_page
and dump_page_owner?
--
Michal Hocko
SUSE Labs
*/
+ for (i = 1; i < nr_pages; i++) {
+ 	struct page *p = page + i;
+ 	p->memcg_data = page->memcg_data;
+ }
return 0;
}
css_put(>css);
--
Michal Hocko
SUSE Labs
;
In practice this will be a power of 2 but why should we bother to sanitize
that?
--
Michal Hocko
SUSE Labs
d by people who are more
familiar with that framework than me.
--
Michal Hocko
SUSE Labs
On Mon 08-03-21 15:13:35, David Hildenbrand wrote:
> On 08.03.21 15:11, Michal Hocko wrote:
> > On Mon 08-03-21 14:22:12, David Hildenbrand wrote:
> > > On 08.03.21 13:49, Michal Hocko wrote:
> > [...]
> > > > Earlier in the discussion I have
On Mon 08-03-21 14:22:12, David Hildenbrand wrote:
> On 08.03.21 13:49, Michal Hocko wrote:
[...]
> > Earlier in the discussion I have suggested dynamic debugging facility.
> > Documentation/admin-guide/dynamic-debug-howto.rst. Have you tried to
> > look into that direct
accidentally used or returned to caller.
> */
> - ret = __alloc_contig_migrate_range(, start, end);
> + ret = __alloc_contig_migrate_range(, start, end,
> + migratetype == CMA ||
> + zone_idx(cc.zone) == ZONE_MOVABLE);
> if (ret && ret != -EBUSY)
> goto done;
> ret = 0;
--
Michal Hocko
SUSE Labs
On Fri 05-03-21 15:58:40, Andrew Morton wrote:
> On Fri, 5 Mar 2021 12:52:52 +0100 Michal Hocko wrote:
>
> > On Thu 04-03-21 07:40:53, Zhou Guanghui wrote:
> > > As described in the split_page function comment, for the non-compound
> > > high order page, the sub-pa
On Fri 05-03-21 11:07:59, Tim Chen wrote:
>
>
> On 3/5/21 1:11 AM, Michal Hocko wrote:
> > On Thu 04-03-21 09:35:08, Tim Chen wrote:
> >>
> >>
> >> On 2/18/21 11:13 AM, Michal Hocko wrote:
> >>
> >>>
> >>> Fixes
On Wed 03-03-21 12:23:22, Minchan Kim wrote:
> On Wed, Mar 03, 2021 at 01:49:36PM +0100, Michal Hocko wrote:
> > On Tue 02-03-21 13:09:48, Minchan Kim wrote:
> > > LRU pagevec holds refcount of pages until the pagevec are drained.
> > > It could prevent migration sin
so that it is clear this is not just a theoretical issue.
> Signed-off-by: Zhou Guanghui
Acked-by: Michal Hocko
> ---
> mm/page_alloc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3e4b29ee2b1e..3ed783e25c3c 100644
>
y: Zhou Guanghui
Acked-by: Michal Hocko
> ---
> include/linux/memcontrol.h | 6 ++
> mm/huge_memory.c | 2 +-
> mm/memcontrol.c| 15 ++-
> 3 files changed, 9 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/memc
On Thu 04-03-21 09:35:08, Tim Chen wrote:
>
>
> On 2/18/21 11:13 AM, Michal Hocko wrote:
>
> >
> > Fixes: 4e41695356fb ("memory controller: soft limit reclaim on contention")
> > Acked-by: Michal Hocko
> >
> > Thanks!
> >>
On Tue 02-03-21 19:59:22, Shakeel Butt wrote:
> On Tue, Mar 2, 2021 at 1:19 PM Mike Kravetz wrote:
> >
> > On 3/2/21 6:29 AM, Michal Hocko wrote:
> > > On Tue 02-03-21 06:11:51, Shakeel Butt wrote:
> > >> On Tue, Mar 2, 2021 at 1:44 AM Michal Hocko wrote:
>
On Thu 04-03-21 16:14:14, Feng Tang wrote:
> On Wed, Mar 03, 2021 at 09:22:50AM -0800, Ben Widawsky wrote:
> > On 21-03-03 18:14:30, Michal Hocko wrote:
> > > On Wed 03-03-21 08:31:41, Ben Widawsky wrote:
> > > > On 21-03-03 14:59:35, Michal Hocko wrote:
> >
On Wed 03-03-21 09:22:50, Ben Widawsky wrote:
> On 21-03-03 18:14:30, Michal Hocko wrote:
> > On Wed 03-03-21 08:31:41, Ben Widawsky wrote:
> > > On 21-03-03 14:59:35, Michal Hocko wrote:
> > > > On Wed 03-03-21 21:46:44, Feng Tang wrote:
> > > > > O
On Wed 03-03-21 09:59:45, Paul E. McKenney wrote:
> On Wed, Mar 03, 2021 at 09:03:27AM +0100, Michal Hocko wrote:
[...]
> > Paul what is the current plan with in_atomic to be usable for !PREEMPT
> > configurations?
>
> Ah, thank you for the reminder! I have rebased that ser
On Wed 03-03-21 08:31:41, Ben Widawsky wrote:
> On 21-03-03 14:59:35, Michal Hocko wrote:
> > On Wed 03-03-21 21:46:44, Feng Tang wrote:
> > > On Wed, Mar 03, 2021 at 09:18:32PM +0800, Tang, Feng wrote:
> > > > On Wed, Mar 03, 2021 at 01:32:11PM +0100, Michal Hocko wr
On Wed 03-03-21 21:18:32, Feng Tang wrote:
> On Wed, Mar 03, 2021 at 01:32:11PM +0100, Michal Hocko wrote:
> > On Wed 03-03-21 20:18:33, Feng Tang wrote:
> > > On Wed, Mar 03, 2021 at 08:07:17PM +0800, Tang, Feng wrote:
> > > > Hi Michal,
> > > >
> &
On Wed 03-03-21 21:27:24, Muchun Song wrote:
> On Wed, Mar 3, 2021 at 6:25 PM Michal Hocko wrote:
> >
> > On Wed 03-03-21 17:39:56, Muchun Song wrote:
> > > For simplification 991e7673859e ("mm: memcontrol: account kernel stack
> > > per node") has chan
On Wed 03-03-21 21:46:44, Feng Tang wrote:
> On Wed, Mar 03, 2021 at 09:18:32PM +0800, Tang, Feng wrote:
> > On Wed, Mar 03, 2021 at 01:32:11PM +0100, Michal Hocko wrote:
> > > On Wed 03-03-21 20:18:33, Feng Tang wrote:
[...]
> > > > One thing I tr
On Wed 03-03-21 20:18:33, Feng Tang wrote:
> On Wed, Mar 03, 2021 at 08:07:17PM +0800, Tang, Feng wrote:
> > Hi Michal,
> >
> > On Wed, Mar 03, 2021 at 12:39:57PM +0100, Michal Hocko wrote:
> > > On Wed 03-03-21 18:20:58, Feng Tang wrote:
> > > > When
ith something like this instead
/*
* lru_add_drain_all in the force mode will schedule draining on
* all online CPUs so any calls of lru_cache_disabled wrapped by
* local_lock or preemption disabled would be ordered by that.
* The atomic operation doesn't need to have
nd 'fallback-nmask', and they will be tried in turn if not NULL, with
> it we can call __alloc_pages_nodemask() only once.
Yes, it is very much disliked. Is there any reason why you cannot use
GFP_NOWAIT for that purpose?
--
Michal Hocko
SUSE Labs
859e ("mm: memcontrol: account kernel stack per node")
The Fixes tag might make somebody assume this is worth backporting but I
highly doubt it is.
> Signed-off-by: Muchun Song
> Reviewed-by: Shakeel Butt
Anyway
Acked-by: Michal Hocko
as the patch is correct with one comment below
On Tue 02-03-21 17:56:07, Johannes Weiner wrote:
> On Tue, Mar 02, 2021 at 12:24:41PM -0800, Hugh Dickins wrote:
> > On Tue, 2 Mar 2021, Michal Hocko wrote:
> > > [Cc Johannes for awareness and fixup Nick's email]
> > >
> > > On Tue 02-03-21 01:34:51, Zhou Gua
[Add Paul]
On Tue 02-03-21 13:19:34, Mike Kravetz wrote:
> On 3/2/21 6:29 AM, Michal Hocko wrote:
> > On Tue 02-03-21 06:11:51, Shakeel Butt wrote:
> >> On Tue, Mar 2, 2021 at 1:44 AM Michal Hocko wrote:
> >>>
> >>> On Mon 01-03-21 17:16:29, Mike Krave
y sure it is an actual
problem worth complicating the code. I am pretty sure this would grow
into a more tricky problem quite quickly (e.g. proper memory policy
handling).
--
Michal Hocko
SUSE Labs
On Tue 02-03-21 10:17:03, Michal Hocko wrote:
> [Cc Johannes for awareness and fixup Nick's email]
>
> On Tue 02-03-21 01:34:51, Zhou Guanghui wrote:
> > When split page, the memory cgroup info recorded in first page is
> > not copied to tail pages. In this case, when the tai
> time.
> But your copy_page_memcg does not do css_get for split subpages. Will it cause
> memcg->css underflow when subpages are uncharged?
yes, well spotted. I have completely missed that. This will also discard
my comment on testing the memcg.
--
Michal Hocko
SUSE Labs
On Tue 02-03-21 06:11:51, Shakeel Butt wrote:
> On Tue, Mar 2, 2021 at 1:44 AM Michal Hocko wrote:
> >
> > On Mon 01-03-21 17:16:29, Mike Kravetz wrote:
> > > On 3/1/21 9:23 AM, Michal Hocko wrote:
> > > > On Mon 01-03-21 08:39:22, Shakeel Butt wrote:
>
.
Address the problem by accounting each vmalloc backing page to its own
node.
"
--
Michal Hocko
SUSE Labs
On Mon 01-03-21 17:16:29, Mike Kravetz wrote:
> On 3/1/21 9:23 AM, Michal Hocko wrote:
> > On Mon 01-03-21 08:39:22, Shakeel Butt wrote:
> >> On Mon, Mar 1, 2021 at 7:57 AM Michal Hocko wrote:
> > [...]
> >>> Then how come this can ever be a problem? in_
On Tue 02-03-21 17:23:42, Muchun Song wrote:
> On Tue, Mar 2, 2021 at 4:44 PM Michal Hocko wrote:
> >
> > On Tue 02-03-21 15:37:33, Muchun Song wrote:
> > > The alloc_thread_stack_node() cannot guarantee that allocated stack pages
> > > are in the same node when
is not a theoretical one. Both
users (arm64 and s390 kvm) are quite recent AFAICS. split_page is also
used in the dma allocator but I got lost in the indirection so I have no idea
whether there are any users there.
The patch itself looks reasonable to me.
> Signed-off-by: Zhou Guanghui
Acked-by:
> + } else {
> + /* All stack pages are in the same node. */
> mod_lruvec_kmem_state(stack, NR_KERNEL_STACK_KB,
> account * (THREAD_SIZE / 1024));
> + }
> }
>
> static int memcg_charge_kernel_stack(struct task_struct *tsk)
> --
> 2.11.0
--
Michal Hocko
SUSE Labs
On Mon 01-03-21 08:39:22, Shakeel Butt wrote:
> On Mon, Mar 1, 2021 at 7:57 AM Michal Hocko wrote:
[...]
> > Then how come this can ever be a problem? in_task() should exclude soft
> > irq context unless I am mistaken.
> >
>
> If I take the following example of
On Mon 01-03-21 07:10:11, Shakeel Butt wrote:
> On Mon, Mar 1, 2021 at 4:12 AM Michal Hocko wrote:
> >
> > On Fri 26-02-21 16:00:30, Shakeel Butt wrote:
> > > On Fri, Feb 26, 2021 at 3:14 PM Mike Kravetz
> > > wrote:
> > > >
> > > > Cc
eason why
> alloc_contig_range() failed on specific pages.
>
> Cc: Andrew Morton
> Cc: Minchan Kim
> Cc: Oscar Salvador
> Cc: Michal Hocko
> Cc: Vlastimil Babka
> Signed-off-by: David Hildenbrand
Acked-by: Michal Hocko
> ---
> mm/page_alloc.c | 2 --
>
On Fri 26-02-21 11:19:51, Yang Shi wrote:
> On Fri, Feb 26, 2021 at 8:42 AM Yang Shi wrote:
> >
> > On Thu, Feb 25, 2021 at 11:30 PM Michal Hocko wrote:
> > >
> > > On Thu 25-02-21 18:12:54, Yang Shi wrote:
> > > > When debugging an oom
On Fri 26-02-21 08:42:29, Yang Shi wrote:
> On Thu, Feb 25, 2021 at 11:30 PM Michal Hocko wrote:
> >
> > On Thu 25-02-21 18:12:54, Yang Shi wrote:
> > > When debugging an oom issue, I found the oom_kill counter of memcg is
> > > confusing. At the first glance wit
ith kernels which have c77c0a8ac4c5
applied?
Btw. making hugetlb lock irq safe has been already discussed and it
seems to be much harder than expected as some heavy operations are done
under the lock. This is really bad. Postponing the whole freeing
operation into a worker context is certainly possible but I would
consider it rather unfortunate. We would have to add some sync mechanism
to wait for hugetlb pages in flight to prevent external observability
to userspace, e.g. when shrinking the pool.
--
Michal Hocko
SUSE Labs
On Fri 26-02-21 16:56:28, Tim Chen wrote:
>
>
> On 2/26/21 12:52 AM, Michal Hocko wrote:
>
> >>
> >> Michal,
> >>
> >> Let's take an extreme case where memcg 1 always generate the
> >> first event and memcg 2 generates the rest of
On Fri 26-02-21 11:24:29, Oscar Salvador wrote:
> On Fri, Feb 26, 2021 at 09:46:57AM +0100, Michal Hocko wrote:
> > On Mon 22-02-21 14:51:37, Oscar Salvador wrote:
> > [...]
> > > @@ -2394,9 +2397,19 @@ bool isolate_or_dissolve_huge_page(
On Fri 26-02-21 10:45:14, Oscar Salvador wrote:
> On Fri, Feb 26, 2021 at 09:35:09AM +0100, Michal Hocko wrote:
> > I think it would be helpful to call out that specific case explicitly
> > here. I can see only one scenario (are there more?)
>
On Thu 25-02-21 14:48:58, Tim Chen wrote:
>
>
> On 2/24/21 3:53 AM, Michal Hocko wrote:
> > On Mon 22-02-21 11:48:37, Tim Chen wrote:
> >>
> >>
> >> On 2/22/21 11:09 AM, Michal Hocko wrote:
> >>
> >>>>
> >>>> I a
try_again = false;
> + goto retry;
> + }
Is this retry-once logic really needed? Does it really give us any real
benefit? alloc_and_dissolve_huge_page already retries when the page is
being freed.
--
Michal Hocko
SUSE Labs
On Fri 26-02-21 09:35:10, Michal Hocko wrote:
> On Mon 22-02-21 14:51:36, Oscar Salvador wrote:
> > alloc_contig_range will fail if it ever sees a HugeTLB page within the
> > range we are trying to allocate, even when that page is free and can be
> > easily reallocated
the page belongs
> to with __GFP_THISNODE, meaning we do not fallback on other node's zones.
>
> Note that gigantic hugetlb pages are fenced off since there is a cyclic
> dependency between them and alloc_contig_range.
>
> Signed-off-by: Oscar Salvador
there would be no victim counted for the
above mentioned memcg ooms.
> The cgroup v2 documents it, but the description is missed for cgroup v1.
>
> Signed-off-by: Yang Shi
Acked-by: Michal Hocko
> ---
> Documentation/admin-guide/cgroup-v1/memory.rst | 3 +++
> 1 file changed, 3
On Mon 22-02-21 11:48:37, Tim Chen wrote:
>
>
> On 2/22/21 11:09 AM, Michal Hocko wrote:
>
> >>
> >> I actually have tried adjusting the threshold but found that it doesn't
> >> work well for
> >> the case with unenven memory access frequency b
On Wed 24-02-21 19:10:42, Muchun Song wrote:
> On Wed, Feb 24, 2021 at 5:43 PM Michal Hocko wrote:
> >
> > On Tue 23-02-21 13:55:44, Mike Kravetz wrote:
> > > Gerald Schaefer reported a panic on s390 in hugepage_subpool_put_pages()
> > > with linux-next
ized by the allocator.
> Fix by initializing hugetlb page subpool pointer in prep_new_huge_page().
>
> Fixes: f1280272ae4d ("hugetlb: use page.private for hugetlb specific page
> flags")
This is not a stable SHA to refer to as it comes from linux-next.
> Reported-by: G
On Tue 23-02-21 12:56:25, Shakeel Butt wrote:
> Replace the implicit checking of root memcg with explicit root memcg
> checking i.e. !css->parent with mem_cgroup_is_root().
>
> Signed-off-by: Shakeel Butt
Acked-by: Michal Hocko
Thanks!
> ---
> mm/memcontrol.c | 4 ++--
The patch is correct, I just do not follow why 1f14c1ac19aa4 is really
relevant here. The nomem label wouldn't make any difference for
__GFP_NOFAIL requests. The code has changed quite a lot since then.
> Signed-off-by: Shakeel Butt
This is a clear oversight from when I moved the oom handling