Re: [PATCH 2/2] core-api/memory-hotplug.rst: divide Locking Internal section by different locks

2018-12-05 Thread Michal Hocko
hotplug (e.g. access to global/zone > -variables). > +variables). Currently, we take advantage of this to serialise sparsemem's > +mem_section handling in sparse_add_one_section() and > +sparse_remove_one_section(). > > In addition, mem_hotplug_lock (in contrast to device_hotplug_lock) in read > mode allows for a quite efficient get_online_mems/put_online_mems > -- > 2.15.1 > -- Michal Hocko SUSE Labs

Re: [PATCH 1/2] admin-guide/memory-hotplug.rst: remove locking internal part from admin-guide

2018-12-05 Thread Michal Hocko
uide. It is a pure implementation detail nobody should be relying on. -- Michal Hocko SUSE Labs

Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg

2018-05-24 Thread Michal Hocko
On Thu 24-05-18 21:58:49, TSUKADA Koutaro wrote: > On 2018/05/24 17:20, Michal Hocko wrote: > > On Thu 24-05-18 13:39:59, TSUKADA Koutaro wrote: > >> On 2018/05/23 3:54, Michal Hocko wrote: > > [...] > >>> I am also quite confused why you keep distinguish

Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg

2018-05-24 Thread Michal Hocko
will allow users to exhaust > the entire memory of the system. Of course, this can be prevented by the > hugetlb cgroup, but even if we set the limit for memcg and hugetlb cgroup > respectively, as I asked in the first mail(set limit to 10GB), the > control will not work. -

Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg

2018-05-24 Thread Michal Hocko
On Thu 24-05-18 13:39:59, TSUKADA Koutaro wrote: > On 2018/05/23 3:54, Michal Hocko wrote: [...] > > I am also quite confused why you keep distinguishing surplus hugetlb > > pages from regular preallocated ones. Being a surplus page is an > > implementation detail that w

Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg

2018-05-22 Thread Michal Hocko
ugetlb pages into the memcg mix. They just do not belong there. Try to look at previous discussions why it has been decided to have a separate hugetlb pages at all. I am also quite confused why you keep distinguishing surplus hugetlb pages from regular preallocated ones. Being a surplus

Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg

2018-05-22 Thread Michal Hocko
ing it difficult to use. HugeTLBfs > may support multiple huge page sizes, and in such a special environment > there is a desire to use HugeTLBfs. Well, then I would argue that you shouldn't use 64kB pages for your setup or allow THP for smaller sizes. Really hugetlb pages are by no means a substitute h

Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Michal Hocko
On Wed 11-04-18 12:32:07, Laurent Dufour wrote: [...] > Andrew, should I send a v4 or could you wipe the 2 __maybe_unsued when > applying > the patch ? A follow $patch-fix should be better rather than post this again and spam people with more emails. -- Michal Hocko SUSE Labs

Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Michal Hocko
On Wed 11-04-18 10:41:23, Laurent Dufour wrote: > On 11/04/2018 10:33, Michal Hocko wrote: > > On Wed 11-04-18 10:03:36, Laurent Dufour wrote: > >> @@ -881,7 +876,8 @@ struct page *_vm_normal_page(struct vm_area_struct > >> *vma, unsigned long addr, >

Re: [PATCH v3 1/2] mm: introduce ARCH_HAS_PTE_SPECIAL

2018-04-11 Thread Michal Hocko
ted-by: Jerome Glisse <jglisse@redhat> > Reviewed-by: Jerome Glisse <jglisse@redhat> > Acked-by: David Rientjes <rient...@google.com> > Signed-off-by: Laurent Dufour <lduf...@linux.vnet.ibm.com> Looks good to me. I have checked x86 and the generic code and it looks

Re: [PATCH v3 2/2] mm: remove odd HAVE_PTE_SPECIAL

2018-04-11 Thread Michal Hocko
page tables. >* eg. VDSO mappings can cause them to exist. >*/ > -out: > +out: __maybe_unused > return pfn_to_page(pfn); Why do we need this ugliness all of the sudden? -- Michal Hocko SUSE Labs

Re: [PATCH 0/3] move __HAVE_ARCH_PTE_SPECIAL in Kconfig

2018-04-09 Thread Michal Hocko
| 4 ++-- > mm/memory.c | 2 +- > 23 files changed, 18 insertions(+), 24 deletions(-) > > -- > 2.7.4 -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-23 Thread Michal Hocko
compaction changes over time. So I would really prefer to kill the tuning than try to "fix" it. -- Michal Hocko SUSE Labs

Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-23 Thread Michal Hocko
On Mon 19-02-18 14:30:36, Robert Harris wrote: > > > > On 19 Feb 2018, at 12:39, Michal Hocko <mho...@kernel.org> wrote: > > > > On Mon 19-02-18 12:14:26, Robert Harris wrote: > >> > >> > >>> On 19 Feb 2018, at 08:26, Michal Hocko

Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Michal Hocko
On Mon 19-02-18 12:14:26, Robert Harris wrote: > > > > On 19 Feb 2018, at 08:26, Michal Hocko <mho...@kernel.org> wrote: > > > > On Sun 18-02-18 16:47:55, robert.m.har...@oracle.com wrote: > >> From: "Robert M. Harris" <robert.m.har...@oracl

Re: [PATCH 1/1] mm, compaction: correct the bounds of __fragmentation_index()

2018-02-19 Thread Michal Hocko
* has the more useful range of 0 < F <= 1. In order to inhibit > + * compaction in the event of a pathological shortfall of memory this > + * function truncates and returns > + * > + * F - 1/info->free_blocks_total >*/ > -

Re: [patch 1/2] mm, page_alloc: extend kernelcore and movablecore for percent

2018-02-15 Thread Michal Hocko
On Wed 14-02-18 02:28:38, David Rientjes wrote: > On Wed, 14 Feb 2018, Michal Hocko wrote: > > > I do not have any objections regarding the extension. What I am more > > interested in is _why_ people are still using this command line > > parameter at all these days

Re: [patch 1/2] mm, page_alloc: extend kernelcore and movablecore for percent

2018-02-14 Thread Michal Hocko
e lowmem issues from 32b days. I can see the CMA/Hotplug usecases for ZONE_MOVABLE but those have their own ways to define zone movable. I was tempted to simply remove the kernelcore already. Could you be more specific what is your usecase which triggered a need of an easier scaling of

Re: [patch -mm v2 1/3] mm, memcg: introduce per-memcg oom policy tunable

2018-01-31 Thread Michal Hocko
On Tue 30-01-18 14:38:40, David Rientjes wrote: > On Tue, 30 Jan 2018, Michal Hocko wrote: > > > > > So what is the actual semantic and scope of this policy. Does it apply > > > > only down the hierarchy. Also how do you compare cgroups with different > &

Re: [PATCH] Documentation: Fix 'file_mapped' -> 'mapped_file'

2018-01-30 Thread Michal Hocko
@neclab.eu> Acked-by: Michal Hocko <mho...@suse.com> Thanks for catching this. > --- > Documentation/cgroup-v1/memory.txt | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/Documentation/cgroup-v1/memory.txt > b/Documentation/cgroup-v1/memory.

Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Michal Hocko
On Tue 30-01-18 11:58:51, Roman Gushchin wrote: > On Tue, Jan 30, 2018 at 09:54:45AM +0100, Michal Hocko wrote: > > On Mon 29-01-18 11:11:39, Tejun Heo wrote: > > Hello, Michal! > > > diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt > > in

Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-30 Thread Michal Hocko
On Mon 29-01-18 11:11:39, Tejun Heo wrote: > Hello, Michal. > > On Mon, Jan 29, 2018 at 11:46:57AM +0100, Michal Hocko wrote: > > @@ -1292,7 +1292,11 @@ the memory controller considers only cgroups > > belonging to the sub-tree > > of the OOM'ing cgroup. > >

Re: [patch -mm v2 1/3] mm, memcg: introduce per-memcg oom policy tunable

2018-01-30 Thread Michal Hocko
On Mon 29-01-18 14:38:02, David Rientjes wrote: > On Fri, 26 Jan 2018, Michal Hocko wrote: > > > > The cgroup aware oom killer is needlessly declared for the entire system > > > by a mount option. It's unnecessary to force the system into a single > > >

Re: [patch -mm v2 2/3] mm, memcg: replace cgroup aware oom killer mount option with tunable

2018-01-29 Thread Michal Hocko
es are nice. We can cc:stable them too, so no huge > hurry. What about this? From c02d8bc1396d5ab3785d01f577e2ee74e5dd985e Mon Sep 17 00:00:00 2001 From: Michal Hocko <mho...@suse.com> Date: Mon, 29 Jan 2018 11:42:59 +0100 Subject: [PATCH] oom, memcg: clarify root memcg oom accounting

Re: [patch -mm v2 1/3] mm, memcg: introduce per-memcg oom policy tunable

2018-01-27 Thread Michal Hocko
ensible this is actually. How do we place priorities on top? > Signed-off-by: David Rientjes <rient...@google.com> -- Michal Hocko SUSE Labs

Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-26 Thread Michal Hocko
On Thu 25-01-18 15:27:29, David Rientjes wrote: > On Thu, 25 Jan 2018, Michal Hocko wrote: > > > > As a result, this would remove patch 3/4 from the series. Do you have > > > any > > > other feedback regarding the remainder of this patch series before I >

Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-25 Thread Michal Hocko
n API hazard AFAICS. -- Michal Hocko SUSE Labs

Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-25 Thread Michal Hocko
On Wed 24-01-18 13:44:02, David Rientjes wrote: > On Wed, 24 Jan 2018, Michal Hocko wrote: > > > > The current implementation of memory.oom_group is based on top of a > > > selection implementation that is broken in three ways I have listed for > > > m

Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-24 Thread Michal Hocko
On Tue 23-01-18 14:22:07, David Rientjes wrote: > On Tue, 23 Jan 2018, Michal Hocko wrote: > > > > It can't, because the current patchset locks the system into a single > > > selection criteria that is unnecessary and the mount option would become > > > a

Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-23 Thread Michal Hocko
ubject to changes in future. Current implementation doesn't provide any externally controllable selection policy and therefore the default can be assumed. Whatever that default means now or in future. The only contract added here is the kill full memcg if selected and that can be implemented on _any_ sele

Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-23 Thread Michal Hocko
On Wed 17-01-18 14:18:33, David Rientjes wrote: > On Wed, 17 Jan 2018, Michal Hocko wrote: > > > Absolutely agreed! And moreover, there are not all that many ways what > > to do as an action. You just kill a logical entity - be it a process or > > a logical group of pro

Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable

2018-01-17 Thread Michal Hocko
lection and the action is a no go and a wrong API. This is why I've said that what you (David) outlined yesterday is probably going to suffer from a much longer discussion and most likely to be not acceptable. Your patchset proves me correct... -- Michal Hocko SUSE Labs

Re: [PATCH v13 0/7] cgroup-aware OOM killer

2018-01-16 Thread Michal Hocko
On Tue 16-01-18 13:36:21, David Rientjes wrote: > On Mon, 15 Jan 2018, Michal Hocko wrote: > > > > No, this isn't how kernel features get introduced. We don't design a new > > > kernel feature with its own API for a highly specialized usecase and then > > > c

Re: [PATCH v13 0/7] cgroup-aware OOM killer

2018-01-15 Thread Michal Hocko
gain_ a form of obstructing the current patchset which is what you have been doing for quite some time. I will leave the final decision for merging to Andrew. If you want to build a more fine grained control on top, you are free to do so. I will be reviewing those like any other upstream oom changes. --

Re: [PATCH v13 0/7] cgroup-aware OOM killer

2018-01-11 Thread Michal Hocko
today. We > > > need to implement that heuristic and introduce userspace influence over > > > oom kill selection now rather than later because its implementation > > > changes how this patchset is implemented. > > > > > > I can implement these changes,

Re: [RFC PATCH] mm: memcontrol: memory+swap accounting for cgroup-v2

2017-12-19 Thread Michal Hocko
roaches but I am not convinced we should convolute the user API for the usecase you describe. > Signed-off-by: Shakeel Butt <shake...@google.com> -- Michal Hocko SUSE Labs

Re: [PATCH v13 6/7] mm, oom, docs: describe the cgroup-aware OOM killer

2017-12-01 Thread Michal Hocko
controller tries to make the best > choice of a victim, looking for a memory cgroup with the largest > memory footprint, considering leaf cgroups and cgroups with the Looks good to me Thanks! -- Michal Hocko SUSE Labs

Re: [PATCH] mm, oom: simplify alloc_pages_before_oomkill handling

2017-12-01 Thread Michal Hocko
On Fri 01-12-17 13:32:15, Roman Gushchin wrote: > Hi, Michal! > > I totally agree that out_of_memory() function deserves some refactoring. > > But I think there is an issue with your patch (see below): > > On Fri, Dec 01, 2017 at 10:14:25AM +0100, Michal Hocko wrot

Re: [PATCH v13 5/7] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

2017-12-01 Thread Michal Hocko
On Fri 01-12-17 13:15:38, Roman Gushchin wrote: [...] > So, maybe we just need to return -EAGAIN (or may be -ENOTSUP) on any > read/write > attempt if option is not enabled? Yes, that would work as well. ENOTSUP sounds better to me. -- Michal Hocko SUSE Labs

[PATCH] mm, oom: simplify alloc_pages_before_oomkill handling

2017-12-01 Thread Michal Hocko
Recently added alloc_pages_before_oomkill gained a new caller with this patchset and I think it has just grown to deserve a simpler code flow. What do you think about this on top of the series? --- From f1f6035ea0df65e7619860b013f2fabdda65233e Mon Sep 17 00:00:00 2001 From: Michal Hocko

Re: [PATCH v13 6/7] mm, oom, docs: describe the cgroup-aware OOM killer

2017-12-01 Thread Michal Hocko
. > > +OOM Killer > +~~ > + > +Cgroup v2 memory controller implements a cgroup-aware OOM killer. > +It means that it treats cgroups as first class OOM entities. This should mention groupoom mount option to enable this functionality. Other than that looks ok to me Acked-by:

Re: [PATCH v13 5/7] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

2017-12-01 Thread Michal Hocko
remounting the cgroupfs. Is it ok to create oom_group if the option is not enabled? This looks confusing. I forgot all the details about how cgroup core creates file so I do not have a good idea how to fix this. > Signed-off-by: Roman Gushchin <g...@fb.com> > Cc: Michal Hocko <m

Re: [PATCH v13 3/7] mm, oom: cgroup-aware OOM killer

2017-12-01 Thread Michal Hocko
tion a special approximation > is used for estimating oom_score of root memory cgroup: we sum > oom_score of the belonging processes (or, to be more precise, > tasks owning their mm structures). > > Signed-off-by: Roman Gushchin <g...@fb.com> > Cc: Michal Hocko <mho...
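The root-cgroup approximation quoted above (summing the oom_score of the tasks that own their mm) can be illustrated with a minimal user-space sketch. It is not the kernel code from the patch: `struct demo_task`, its fields and `demo_root_score()` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task; not the kernel's task_struct. */
struct demo_task {
	long oom_score;	/* per-task badness estimate */
	bool owns_mm;	/* true only for the task that owns the mm */
};

/*
 * Approximate the root cgroup's score as the sum of the scores of the
 * tasks that own their mm, so threads sharing an mm are counted once.
 */
static long demo_root_score(const struct demo_task *tasks, size_t n)
{
	long sum = 0;

	assert(tasks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (tasks[i].owns_mm)
			sum += tasks[i].oom_score;
	}
	return sum;
}
```

Skipping tasks that do not own their mm is the assumption that keeps a multi-threaded process from being counted several times; how the real patch identifies those tasks is not shown in the snippet.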

Re: [PATCH] mm:Add watermark slope for high mark

2017-11-24 Thread Michal Hocko
On Fri 24-11-17 14:12:56, peter enderborg wrote: > On 11/24/2017 11:14 AM, Michal Hocko wrote: > > On Fri 24-11-17 11:07:07, Peter Enderborg wrote: > >> When tuning the watermark_scale_factor to reduce stalls and compactions > >> the high mark is also changed, it

Re: [PATCH] mm:Add watermark slope for high mark

2017-11-24 Thread Michal Hocko
>lock, flags); > tmp = (u64)pages_min * zone->managed_pages; > @@ -7026,7 +7028,9 @@ static void __setup_per_zone_wmarks(void) > watermark_scale_factor, 1)); > > zone->watermark[WMARK_LOW] = min_wmark_pages(zo

Re: [RESEND v12 0/6] cgroup-aware OOM killer

2017-11-01 Thread Michal Hocko
On Tue 31-10-17 15:21:23, David Rientjes wrote: > On Tue, 31 Oct 2017, Michal Hocko wrote: > > > > I'm not ignoring them, I have stated that we need the ability to protect > > > important cgroups on the system without oom disabling all attached > > > p

Re: [RESEND v12 3/6] mm, oom: cgroup-aware OOM killer

2017-10-31 Thread Michal Hocko
On Tue 31-10-17 20:06:44, Michal Hocko wrote: > On Tue 31-10-17 16:29:23, Michal Hocko wrote: > > On Tue 31-10-17 08:04:19, Shakeel Butt wrote: > > > > + > > > > +static void select_victim_memcg(struct mem_cgroup *root, struct > > > > oom_control *oc)

Re: [RESEND v12 3/6] mm, oom: cgroup-aware OOM killer

2017-10-31 Thread Michal Hocko
On Tue 31-10-17 16:29:23, Michal Hocko wrote: > On Tue 31-10-17 08:04:19, Shakeel Butt wrote: > > > + > > > +static void select_victim_memcg(struct mem_cgroup *root, struct > > > oom_control *oc) > > > +{ > > > + struct mem_cgroup *ite

Re: [RESEND v12 3/6] mm, oom: cgroup-aware OOM killer

2017-10-31 Thread Michal Hocko
for charge migration in v2. To be honest I wasn't completely happy about removing this functionality altogether in v2 but there was a strong pushback back then that relying on the charge migration doesn't have any sound usecase. Anyway, I agree that documentation should be explicit about that. -- Michal Hocko

Re: [RESEND v12 0/6] cgroup-aware OOM killer

2017-10-31 Thread Michal Hocko
life-line when the user-space for some reason fails. > > So I guess quite a few will have this problem. Could you be more specific please? We are _not_ removing possibility of the user space influenced oom victim selection. You can still use the _current_ oom selection heuristic. The patch adds a new selection method which is opt-in so only those who want to opt-in will not be allowed to have any influence on the victim selection. And as it has been pointed out this can be implemented later so it is not like "this won't be possible anymore in future" -- Michal Hocko SUSE Labs

Re: [RESEND v12 0/6] cgroup-aware OOM killer

2017-10-31 Thread Michal Hocko
an be implemented without changing user visible behavior as an add-on. If you disagree then you better come with a solid proof that all of us are wrong and reasonable semantic cannot be achieved that way. -- Michal Hocko SUSE Labs

Re: [RESEND v12 0/6] cgroup-aware OOM killer

2017-10-23 Thread Michal Hocko
rom the global memory killer by spawning new processes. > (3) the inability of userspace to effectively control oom victim > selection. this is not requested by the current usecase and it has been pointed out that this will be possible to implement on top of the foundation of

Re: [RESEND v12 0/6] cgroup-aware OOM killer

2017-10-19 Thread Michal Hocko
ehavior" > Tejun also wasn't convinced > of the risk for regression, and too would prefer cgroup-awareness to > be the default in cgroup2. I would ask for patch 5/6 to be dropped. -- Michal Hocko SUSE Labs

Re: [RESEND v12 3/6] mm, oom: cgroup-aware OOM killer

2017-10-19 Thread Michal Hocko
tion a special approximation > is used for estimating oom_score of root memory cgroup: we sum > oom_score of the belonging processes (or, to be more precise, > tasks owning their mm structures). > > Signed-off-by: Roman Gushchin <g...@fb.com> > Acked-by: Michal Hocko

Re: [PATCH 1/2] mm, thp: introduce dedicated transparent huge page allocation interfaces

2017-10-19 Thread Michal Hocko
On Wed 18-10-17 19:00:26, Du, Changbin wrote: > Hi Hocko, > > On Tue, Oct 17, 2017 at 12:20:52PM +0200, Michal Hocko wrote: > > [CC Kirill] > > > > On Mon 16-10-17 17:19:16, changbin...@intel.com wrote: > > > From: Changbin Du <changbin...@intel.com>

Re: [PATCH 2/2] mm: rename page dtor functions to {compound,huge,transhuge}_page__dtor

2017-10-17 Thread Michal Hocko
On Tue 17-10-17 14:22:14, Kirill A. Shutemov wrote: > On Tue, Oct 17, 2017 at 12:22:03PM +0200, Michal Hocko wrote: > > On Mon 16-10-17 17:19:17, changbin...@intel.com wrote: > > > From: Changbin Du <changbin...@intel.com> > > > > > > The current

Re: [PATCH 1/2] mm, thp: introduce dedicated transparent huge page allocation interfaces

2017-10-17 Thread Michal Hocko
On Tue 17-10-17 12:20:52, Michal Hocko wrote: > [CC Kirill] now for real > On Mon 16-10-17 17:19:16, changbin...@intel.com wrote: > > From: Changbin Du <changbin...@intel.com> > > > > This patch introduced 4 new interfaces to allocate a prep

Re: [PATCH 2/2] mm: rename page dtor functions to {compound,huge,transhuge}_page__dtor

2017-10-17 Thread Michal Hocko
* the reservation was consumed when the page was allocated. >* We clear the PagePrivate flag now so that the global > - * reserve count will not be incremented in free_huge_page. > + * reserve count will not be incremented in huge_page_dtor. >* The reservation map will still indicate the reservation >* was consumed and possibly prevent later page allocation. >* This is better than leaking a global reservation. If no > -- > 2.7.4 > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majord...@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . -- Michal Hocko SUSE Labs

Re: [PATCH 1/2] mm, thp: introduce dedicated transparent huge page allocation interfaces

2017-10-17 Thread Michal Hocko
; - HPAGE_PMD_ORDER); > + new_page = alloc_transhuge_page_node(node, > + (GFP_TRANSHUGE_LIGHT | __GFP_THISNODE)); > if (!new_page) > goto out_fail; > - prep_transhuge_page(new_page); > > isolated = numamigrate_isolate_page(pgd

Re: [v11 3/6] mm, oom: cgroup-aware OOM killer

2017-10-12 Thread Michal Hocko
On Wed 11-10-17 13:27:44, David Rientjes wrote: > On Wed, 11 Oct 2017, Michal Hocko wrote: > > > > For these reasons: unfair comparison of root mem cgroup usage to bias > > > against that mem cgroup from oom kill in system oom conditions, the > > > ability of

Re: [v11 3/6] mm, oom: cgroup-aware OOM killer

2017-10-11 Thread Michal Hocko
s been stated several times already that future improvements are possible and cover what you have described already. -- Michal Hocko SUSE Labs

Re: [v11 3/6] mm, oom: cgroup-aware OOM killer

2017-10-10 Thread Michal Hocko
delay = true; > > + goto out; > > + } > > + > > select_bad_process(oc); > > This is racy because mem_cgroup_select_oom_victim() found an eligible > oc->chosen_memcg that is not INFLIGHT_VICTIM with at least one eligible > process but mem_

Re: [v11 4/6] mm, oom: introduce memory.oom_group

2017-10-06 Thread Michal Hocko
On Fri 06-10-17 13:04:35, Roman Gushchin wrote: > On Thu, Oct 05, 2017 at 04:31:04PM +0200, Michal Hocko wrote: > > Btw. here is how I would do the recursive oom badness. The diff is not > > the nicest one because there is some code moving but the resulting code > > is sm

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
knob on top of that if you need. -- Michal Hocko SUSE Labs

Re: [v10 5/6] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
On Thu 05-10-17 10:54:01, Johannes Weiner wrote: > On Thu, Oct 05, 2017 at 03:14:19PM +0200, Michal Hocko wrote: > > On Wed 04-10-17 16:04:53, Johannes Weiner wrote: > > [...] > > > That will silently ignore what the user writes to the memory.oom_group > > >

Re: [v11 4/6] mm, oom: introduce memory.oom_group

2017-10-05 Thread Michal Hocko
oc->chosen_memcg = group; + if (score > oc->chosen_points) { + oc->chosen_points = score; + oc->chosen_memcg = iter; } } -- Michal Hocko SUSE Labs
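The fragment above keeps the memcg with the highest score seen so far. A minimal, self-contained sketch of that selection pattern follows. The field names `chosen_points` and `chosen_memcg` mirror the fragment; everything else (`struct demo_memcg`, `struct demo_oom_control`, `demo_select_victim()` and the flat array standing in for the cgroup hierarchy walk) is an assumption made for illustration and is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's structures. */
struct demo_memcg {
	const char *name;
	long score;	/* precomputed badness of this group */
};

struct demo_oom_control {
	long chosen_points;
	const struct demo_memcg *chosen_memcg;
};

/*
 * Walk all candidate groups and remember the one with the largest score.
 * The strict '>' means the first group wins a tie, and a group with a
 * score of 0 or less is never selected.
 */
static void demo_select_victim(const struct demo_memcg *groups, size_t n,
			       struct demo_oom_control *oc)
{
	assert(oc != NULL);
	oc->chosen_points = 0;
	oc->chosen_memcg = NULL;

	for (size_t i = 0; i < n; i++) {
		const struct demo_memcg *iter = &groups[i];

		if (iter->score > oc->chosen_points) {
			oc->chosen_points = iter->score;
			oc->chosen_memcg = iter;
		}
	}
}
```

A caller would check `oc->chosen_memcg` for NULL afterwards, since no group is chosen when every score is zero or the list is empty.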

Re: [v11 4/6] mm, oom: introduce memory.oom_group

2017-10-05 Thread Michal Hocko
er defined configuration might lead to data corruptions or other > misbehavior. > > The default value is 0. I still believe that oc->chosen_task == INFLIGHT_VICTIM check in oom_kill_memcg_victim should go away. > > Signed-off-by: Roman Gushchin <g...@fb.com> > Cc: Mi

Re: [v11 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
. > > The root cgroup is treated as a leaf memory cgroup, so its score > is compared with leaf memory cgroups. > Due to memcg statistics implementation a special algorithm > is used for estimating its oom_score: we define it as maximum > oom_score of the belonging tasks.

Re: [v10 5/6] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
On Thu 05-10-17 14:41:13, Roman Gushchin wrote: > On Thu, Oct 05, 2017 at 03:14:19PM +0200, Michal Hocko wrote: > > On Wed 04-10-17 16:04:53, Johannes Weiner wrote: > > [...] > > > That will silently ignore what the user writes to the memory.oom_group > > > control

Re: [v10 5/6] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
the interface part but I disagree with making it default just because v2 is not largely adopted yet. -- Michal Hocko SUSE Labs

Re: [v10 4/6] mm, oom: introduce memory.oom_group

2017-10-05 Thread Michal Hocko
On Thu 05-10-17 13:32:14, Roman Gushchin wrote: > On Thu, Oct 05, 2017 at 02:06:49PM +0200, Michal Hocko wrote: > > On Wed 04-10-17 16:46:36, Roman Gushchin wrote: > > > The cgroup-aware OOM killer treats leaf memory cgroups as memory > > > consumption entities and pe

Re: [v10 4/6] mm, oom: introduce memory.oom_group

2017-10-05 Thread Michal Hocko
uate_task, oc); > + > + if (oc->chosen_task == NULL || > + oc->chosen_task == INFLIGHT_VICTIM) > + goto out; How can this happen? There shouldn't be any INFLIGHT_VICTIM in our memcg because we have checked for that already. I can

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
anks for separating the group_oom part. This is getting in the mergeable state. I will ack it once the suggested fixes are folded in. There is some clean up potential on top (I find the oc->chosen_memcg quite ugly and will post a patch on top of yours) but that can be done later. > Signed-of

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
ext patch would probably make sense. Although, us reviewers have > been made aware of this now, so I don't feel strongly about it. Won't > make much of a difference once the patches are merged. I think it would be better to move it because it will be less confusing that way. Especially for those who are going to read git history in order to understand why this is needed. -- Michal Hocko SUSE Labs

Re: [v10 3/6] mm, oom: cgroup-aware OOM killer

2017-10-05 Thread Michal Hocko
all > > oom_evaluate_memcg() for offlined memcgs. > > Sounds like a good optimization, which can be done on top of the current > patchset. You could achieve this by checking whether a memcg has tasks rather than explicitly checking for children memcgs as I've suggested already. --

Re: [v9 3/5] mm, oom: cgroup-aware OOM killer

2017-10-04 Thread Michal Hocko
On Tue 03-10-17 07:35:59, Tejun Heo wrote: > Hello, Michal. > > On Tue, Oct 03, 2017 at 04:22:46PM +0200, Michal Hocko wrote: > > On Tue 03-10-17 15:08:41, Roman Gushchin wrote: > > > On Tue, Oct 03, 2017 at 03:36:23PM +0200, Michal Hocko wrote: > > [...] >

Re: [v9 3/5] mm, oom: cgroup-aware OOM killer

2017-10-03 Thread Michal Hocko
On Tue 03-10-17 15:38:08, Roman Gushchin wrote: > On Tue, Oct 03, 2017 at 04:22:46PM +0200, Michal Hocko wrote: > > On Tue 03-10-17 15:08:41, Roman Gushchin wrote: > > > On Tue, Oct 03, 2017 at 03:36:23PM +0200, Michal Hocko wrote: > > [...] > > > >

Re: [v9 3/5] mm, oom: cgroup-aware OOM killer

2017-10-03 Thread Michal Hocko
On Tue 03-10-17 15:08:41, Roman Gushchin wrote: > On Tue, Oct 03, 2017 at 03:36:23PM +0200, Michal Hocko wrote: [...] > > I guess we want to inherit the value on the memcg creation but I agree > > that enforcing parent setting is weird. I will think about it some more

Re: [v9 4/5] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

2017-10-03 Thread Michal Hocko
On Tue 03-10-17 13:49:36, Roman Gushchin wrote: > On Tue, Oct 03, 2017 at 01:50:36PM +0200, Michal Hocko wrote: > > On Wed 27-09-17 14:09:35, Roman Gushchin wrote: > > > Add a "groupoom" cgroup v2 mount option to enable the cgroup-aware > > > OOM killer. If no

Re: [v9 3/5] mm, oom: cgroup-aware OOM killer

2017-10-03 Thread Michal Hocko
On Tue 03-10-17 13:37:21, Roman Gushchin wrote: > On Tue, Oct 03, 2017 at 01:48:48PM +0200, Michal Hocko wrote: [...] > > Wrt. to the implicit inheritance you brought up in a separate email > > thread [1]. Let me quote > > : after some additional thinking I don't think

Re: [v9 4/5] mm, oom: add cgroup v2 mount option for cgroup-aware OOM killer

2017-10-03 Thread Michal Hocko
line argument would fit better IMHO. > Signed-off-by: Roman Gushchin <g...@fb.com> > Cc: Michal Hocko <mho...@kernel.org> > Cc: Vladimir Davydov <vdavydov@gmail.com> > Cc: Johannes Weiner <han...@cmpxchg.org> > Cc: Tetsuo Handa <penguin-ker...@i-love

Re: [v9 3/5] mm, oom: cgroup-aware OOM killer

2017-10-03 Thread Michal Hocko
footprint */ > + oc->chosen_points = 0; > + oc->chosen_task = NULL; > + mem_cgroup_scan_tasks(oc->chosen_memcg, oom_evaluate_task, oc); > + > + if (oc->chosen_task == NULL || oc->chosen_task == INFLIGHT_VICTIM) > + goto out; > + >

Re: [v9 2/5] mm: implement mem_cgroup_scan_tasks() for the root memory cgroup

2017-10-03 Thread Michal Hocko
is just a preparatory work for later changes. > Signed-off-by: Roman Gushchin <g...@fb.com> > Cc: Michal Hocko <mho...@kernel.org> > Cc: Vladimir Davydov <vdavydov@gmail.com> > Cc: Johannes Weiner <han...@cmpxchg.org> > Cc: Tetsuo Handa <penguin-ker...

Re: [PATCH] mm,hugetlb,migration: don't migrate kernelcore hugepages

2017-10-03 Thread Michal Hocko
On Tue 03-10-17 07:42:25, Alexandru Moise wrote: > On Mon, Oct 02, 2017 at 06:15:00PM +0200, Michal Hocko wrote: [...] > > I really fail to see why kernel vs. movable zones play any role here. > > Zones should be mostly an implementation detail which userspace > > should

Re: [v8 0/4] cgroup-aware OOM killer

2017-10-02 Thread Michal Hocko
On Mon 02-10-17 13:24:25, Shakeel Butt wrote: > On Mon, Oct 2, 2017 at 12:56 PM, Michal Hocko <mho...@kernel.org> wrote: > > On Mon 02-10-17 12:45:18, Shakeel Butt wrote: > >> > I am sorry to cut the rest of your proposal because it simply goes over > >> > t

Re: [v8 0/4] cgroup-aware OOM killer

2017-10-02 Thread Michal Hocko
understand the proposal (from > reading thread, not patch) it does not. No it doesn't. It allows you to kill A (recursively) as the largest memory consumer. So, no, it cannot be used for prioritization, but again this is not yet the scope of the proposed solution. -- Michal Hocko SUSE Labs

Re: [v8 0/4] cgroup-aware OOM killer

2017-10-02 Thread Michal Hocko
would really appreciate focusing on getting step 1 done before diverging into details about potential improvements and better control over the selection. This whole thing is opt-in, so there is no risk of a regression. -- Michal Hocko SUSE Labs

Re: [PATCH] mm,hugetlb,migration: don't migrate kernelcore hugepages

2017-10-02 Thread Michal Hocko
On Mon 02-10-17 17:06:38, Alexandru Moise wrote: > On Mon, Oct 02, 2017 at 04:27:17PM +0200, Michal Hocko wrote: > > On Mon 02-10-17 16:06:33, Alexandru Moise wrote: > > > On Mon, Oct 02, 2017 at 02:54:32PM +0200, Michal Hocko wrote: > > > > On Mon 02-10-17 0

Re: [v8 0/4] cgroup-aware OOM killer

2017-10-02 Thread Michal Hocko
On Mon 02-10-17 13:47:12, Roman Gushchin wrote: > On Mon, Oct 02, 2017 at 02:24:34PM +0200, Michal Hocko wrote: [...] > > I believe the latest version (v9) looks sensible from the semantic point > > of view and we should focus on making it into a mergeable shape. > > The onl

Re: [PATCH] mm,hugetlb,migration: don't migrate kernelcore hugepages

2017-10-02 Thread Michal Hocko
On Mon 02-10-17 16:06:33, Alexandru Moise wrote: > On Mon, Oct 02, 2017 at 02:54:32PM +0200, Michal Hocko wrote: > > On Mon 02-10-17 00:51:11, Alexandru Moise wrote: > > > This attempts to bring more flexibility to how hugepages are allocated > > > by making it possibl

Re: [PATCH] mm,hugetlb,migration: don't migrate kernelcore hugepages

2017-10-02 Thread Michal Hocko
s are not really migratable which should be the only criterion. Hugetlb pages are no different from other migratable pages in that regard. -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [v8 0/4] cgroup-aware OOM killer

2017-10-02 Thread Michal Hocko
module based selection. But let's start simple, with the most basic scenario first and the most sensible semantics implemented. I believe the latest version (v9) looks sensible from the semantic point of view and we should focus on getting it into a mergeable shape. -- Michal Hocko SUSE Labs

Re: [v8 0/4] cgroup-aware OOM killer

2017-09-27 Thread Michal Hocko
ounter argument for that example yet. -- Michal Hocko SUSE Labs

Re: [v8 0/4] cgroup-aware OOM killer

2017-09-27 Thread Michal Hocko
On Tue 26-09-17 14:04:41, David Rientjes wrote: > On Tue, 26 Sep 2017, Michal Hocko wrote: > > > > No, I agree that we shouldn't compare sibling memory cgroups based on > > > different criteria depending on whether group_oom is set or not. > > > > >

Re: [v8 0/4] cgroup-aware OOM killer

2017-09-26 Thread Michal Hocko
On Tue 26-09-17 13:13:00, Roman Gushchin wrote: > On Tue, Sep 26, 2017 at 01:21:34PM +0200, Michal Hocko wrote: > > On Tue 26-09-17 11:59:25, Roman Gushchin wrote: > > > On Mon, Sep 25, 2017 at 10:25:21PM +0200, Michal Hocko wrote: > > > > On Mon 25-09-17

Re: [v8 0/4] cgroup-aware OOM killer

2017-09-26 Thread Michal Hocko
g semantic. I can see priorities being very useful on killable entities for sure. I am not entirely sure what the best approach would be yet, and that is why I've suggested postponing that until after we settle on a simple approach first. Bringing priorities back to the discussion again will not hel

Re: [v8 0/4] cgroup-aware OOM killer

2017-09-25 Thread Michal Hocko
vious example will > be a task with oom_score_adj set to any non-extreme (other than 0 and -1000) > value, but it can also happen in case of constrained alloc, for instance. I am not sure I understand. Are you talking about the root memcg compared to other memcgs? -- Michal Hocko SUSE Labs

Re: [v8 0/4] cgroup-aware OOM killer

2017-09-25 Thread Michal Hocko
I would really appreciate some feedback from Tejun, Johannes here. On Wed 20-09-17 14:53:41, Roman Gushchin wrote: > On Mon, Sep 18, 2017 at 08:14:05AM +0200, Michal Hocko wrote: > > On Fri 15-09-17 08:23:01, Roman Gushchin wrote: > > > On Fri, Sep 15, 2017 at 12:58:26PM +0200,

Re: [v8 0/4] cgroup-aware OOM killer

2017-09-18 Thread Michal Hocko
are on different levels > (or in different subtrees). Well, I have given you one that doesn't sound completely insane to me in another email. You may need an intermediate level for controllers other than memcg. The whole concept of significance of the hierarchy level seems really odd to me. Or am I wrong here? -- Michal Hocko SUSE Labs

Re: [v8 0/4] cgroup-aware OOM killer

2017-09-18 Thread Michal Hocko
; cgroups and they can own memory.oom_priority for their own subcontainers, > this becomes quite powerful so they can define their own oom priorities. > Otherwise, they can easily override the oom priorities of other cgroups. Could you be more specific about your use case? What would
