Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-18 Thread Michal Hocko
[] handle_pte_fault+0x84/0x940 > [] handle_mm_fault+0x16a/0x320 > [] do_page_fault+0x13b/0x490 > [] page_fault+0x1f/0x30 > [] 0x This is the direct reclaim path. You are simply running out of memory globally. There is no memcg specific code in that trace. -- Michal Hoc

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-18 Thread Michal Hocko
[] __do_fault+0x78/0x5a0 > >> [] handle_pte_fault+0x84/0x940 > >> [] handle_mm_fault+0x16a/0x320 > >> [] do_page_fault+0x13b/0x490 > >> [] page_fault+0x1f/0x30 > >> [] 0xffff > > > >This is the direct reclaim path. You are simpl

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-18 Thread Michal Hocko
ubmit+0x21/0x30 > >> >> [] filemap_fault+0x380/0x4f0 > >> >> [] __do_fault+0x78/0x5a0 > >> >> [] handle_pte_fault+0x84/0x940 > >> >> [] handle_mm_fault+0x16a/0x320 > >> >> [] do_page_fault+0x13b/0x490 > >> >> []

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-04 Thread Michal Hocko
_signal+0x3d/0x7b > [] 0x [...] This task is sitting in the refrigerator which means it has been frozen by the freezer cgroup most probably. I am not familiar with the implementation but my recollection is that you have to thaw that group in order the killed process can

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-04 Thread Michal Hocko
>by the freezer cgroup most probably. I am not familiar with the > >implementation but my recollection is that you have to thaw that group > >in order the killed process can pass away. > > Yes, my script is freezing the cgroup before killing processes inside > it. Stack

Re: [PATCH v5] Soft limit rework

2013-09-04 Thread Michal Hocko
On Tue 03-09-13 12:15:50, Johannes Weiner wrote: > > On Tue 20-08-13 10:13:39, Johannes Weiner wrote: > > > On Tue, Aug 20, 2013 at 11:14:14AM +0200, Michal Hocko wrote: > > > > On Mon 19-08-13 12:35:12, Johannes Weiner wrote: > > > > > On Tue, Jun 18, 20

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread Michal Hocko
> [] pagefault_out_of_memory+0xe/0x120 > [] mm_fault_error+0x9e/0x150 > [] do_page_fault+0x404/0x490 > [] page_fault+0x1f/0x30 > [] 0x -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord..

Re: memcg creates an unkillable task in 3.11-rc2

2013-09-05 Thread Michal Hocko
It seems that this one fell through the cracks? On Thu 01-08-13 11:06:20, Michal Hocko wrote: > On Wed 31-07-13 15:09:16, Eric W. Biederman wrote: > > Michal Hocko writes: > > > > > [I am CCing David here as well] > > > > > > On Tue 30-07-13 09:37:46, Er

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread Michal Hocko
(apache2) score 1000 or sacrifice child And this doesn't list any of the tasks sleeping and waiting for oom resolving so they must have been created after this OOM. Is this the same group? -- Michal Hocko SUSE Labs

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread Michal Hocko
s why i'm sending stacks here, i simply cannot tell if > there was or wasn't a problem. On the other hand if those processes would be stuck waiting for somebody to resolve the OOM for a long time without any change then yes we have a problem. Just to be sure I got you right. You have

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread Michal Hocko
handle_mm_fault(struct mm_struct *mm, struct > vm_area_struct *vma, > if (flags & FAULT_FLAG_USER) > mem_cgroup_disable_oom(); > > - if (WARN_ON(task_in_memcg_oom(current) && !(ret & VM_FAULT_OOM))) { > - printk("

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread Michal Hocko
>those are killable by definition. > > Yes, my script killed all of that processes right after taking > stack. OK, _after_ part is important. Has the group gone away after then? -- Michal Hocko SUSE Labs

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-05 Thread Michal Hocko
m_cgroup_disable_oom branch to reduce an overhead for in-kernel faults. The overhead shouldn't be noticeable so I am not sure this is that important. > Signed-off-by: Johannes Weiner I do not see any easier way to fix this without returning back to the old behavior which is much wo

Re: [patch] mm, memcg: store memcg name for oom kill log consistency

2013-09-05 Thread Michal Hocko
On Thu 29-08-13 15:30:32, Michal Hocko wrote: > On Wed 28-08-13 23:03:54, David Rientjes wrote: > > A shared buffer is currently used for the name of the oom memcg and the > > memcg of the killed process. There is no serialization of memcg oom > > kills, so this buffer can

Re: memcg creates an unkillable task in 3.11-rc2

2013-09-09 Thread Michal Hocko
On Fri 06-09-13 11:09:21, Eric W. Biederman wrote: > Michal Hocko writes: > > > It seems that this one fell through the cracks? > > Not completely, but it happened just as I was doing my initial triage of > memcg problems and I haven't quite made it back to this.

Re: [PATCH] vmpressure: fix divide-by-0 in vmpressure_work_fn

2013-09-09 Thread Michal Hocko
03.596003080 -0700 > @@ -187,6 +187,9 @@ static void vmpressure_work_fn(struct wo > vmpr->reclaimed = 0; > spin_unlock(&vmpr->sr_lock); > > + if (!scanned) > + return; > + > do { > if (vmpressure_event(vm

Re: [patch] mm, memcg: store memcg name for oom kill log consistency

2013-09-09 Thread Michal Hocko
On Mon 09-09-13 02:00:26, David Rientjes wrote: > On Thu, 5 Sep 2013, Michal Hocko wrote: [...] > > Reported-by: David Rientjes > > Remove this. OK. Is there any other way how to give you a credit for discovering/reporting this issue? -- Michal Hocko SUSE Labs -- To unsubscribe

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-09 Thread Michal Hocko
ut_of_memory where it is expected. > Reported-by: Reported-by: azurIt > Debugged-by: Michal Hocko > Not-yet-Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Thanks! > --- > include/linux/memcontrol.h | 17 > include/linux/sched.h

Re: [patch 0/7] improve memcg oom killer robustness v2

2013-09-09 Thread Michal Hocko
[Adding Glauber - the full patch is here https://lkml.org/lkml/2013/9/5/319] On Mon 09-09-13 14:36:25, Michal Hocko wrote: > On Thu 05-09-13 12:18:17, Johannes Weiner wrote: > [...] > > From: Johannes Weiner > > Subject: [patch] mm: memcg: do not trap chargers with ful

Re: [PATCH] vmpressure: fix divide-by-0 in vmpressure_work_fn

2013-09-11 Thread Michal Hocko
On Wed 11-09-13 08:40:57, Anton Vorontsov wrote: > On Mon, Sep 09, 2013 at 01:08:47PM +0200, Michal Hocko wrote: > > On Fri 06-09-13 22:59:16, Hugh Dickins wrote: > > > Hit divide-by-0 in vmpressure_work_fn(): checking vmpr->scanned before > > > taking the lock is not

Re: [PATCH] vmpressure: fix divide-by-0 in vmpressure_work_fn

2013-09-12 Thread Michal Hocko
On Wed 11-09-13 13:04:33, Hugh Dickins wrote: > On Wed, 11 Sep 2013, Michal Hocko wrote: [...] > > From 888745909da34f8aee8a208a82d467236b828d0d Mon Sep 17 00:00:00 2001 > > From: Michal Hocko > > Date: Wed, 11 Sep 2013 17:48:10 +0200 > > Subject: [PATCH] vmp

Re: [PATCH v5] Soft limit rework

2013-09-13 Thread Michal Hocko
On Fri 06-09-13 15:23:11, Johannes Weiner wrote: > On Wed, Sep 04, 2013 at 06:38:23PM +0200, Michal Hocko wrote: [...] > > To handle overcommit situations more gracefully. As the documentation > > states: > > " > > 7. Soft limits > > > > Soft limits a

Re: [PATCH 5/5] mm/cgroup: use N_MEMORY instead of N_HIGH_MEMORY

2013-08-30 Thread Michal Hocko
, N_MEMORY)) > addr = vzalloc_node(size, nid); > else > addr = vzalloc(size); > -- > 1.7.1 > > -- Michal Hocko SUSE Labs

Re: [PATCH] memcg: fix multiple large threshold notifications

2013-09-03 Thread Michal Hocko
Done leaking pages. > > Patched v3.11-rc7 properly notifies: > Leaking... > 4096 listener:2013:8:31:14:13:36 > Done leaking pages. > > The fixed bug is old. It appears to date back to the introduction of > memcg threshold notifications in v2.6.34-rc1-116-g2e72b634

Re: [PATCH v2] memcg: first step towards hierarchical controller

2012-09-05 Thread Michal Hocko
On Wed 05-09-12 12:14:12, Glauber Costa wrote: > On 09/04/2012 08:25 PM, Michal Hocko wrote: > > On Tue 04-09-12 18:54:08, Glauber Costa wrote: > > [...] > >>>> I'd personally believe merging both our patches together would achieve a > >>>> good

Re: [PATCH v2] memcg: first step towards hierarchical controller

2012-09-06 Thread Michal Hocko
On Wed 05-09-12 13:12:38, Tejun Heo wrote: > Hello, Michal. > > On Wed, Sep 05, 2012 at 04:49:42PM +0200, Michal Hocko wrote: > > Can we settle on the following 3 steps? > > 1) warn about "flat" hierarchies (give it X releases) - I will push it > >to

Re: [PATCH v2] memcg: first step towards hierarchical controller

2012-09-06 Thread Michal Hocko
On Thu 06-09-12 16:09:20, Glauber Costa wrote: > On 09/06/2012 04:06 PM, Michal Hocko wrote: > > On Wed 05-09-12 13:12:38, Tejun Heo wrote: > >> Hello, Michal. > >> > >> On Wed, Sep 05, 2012 at 04:49:42PM +0200, Michal Hocko wrote: > >>> Can we sett

Re: mmotm 2012-09-06-16-46 uploaded

2012-09-07 Thread Michal Hocko
[CCing Wu Fengguang so that he can update the link to the new tree location] On Thu 06-09-12 16:47:34, Andrew Morton wrote: [...] > A git tree which contains the memory management portion of this tree is > maintained at https://github.com/mstsxfx/memcg-devel.git by Michal Hocko. I have f

Re: mmotm 2012-09-06-16-46 uploaded

2012-09-07 Thread Michal Hocko
On Fri 07-09-12 23:22:51, Wu Fengguang wrote: > On Fri, Sep 07, 2012 at 05:12:47PM +0200, Michal Hocko wrote: > > [CCing Wu Fengguang so that he can update the link to the new tree > > location] > > > $ git remote add mm > > git://git.kernel.org/pub/scm/linux/kernel

Re: + mm-memblock-reduce-overhead-in-binary-search.patch added to -mm tree

2012-09-10 Thread Michal Hocko
ides that, if this kind of optimization is really worth it, why don't we do the same thing for memblock_is_reserved and memblock_is_region_memory as well? So, while the patch seems correct, I do not see how much it helps while it definitely adds code to maintain. > Signed-off-by: Wanpeng Li > C

Re: + mm-memblock-reduce-overhead-in-binary-search.patch added to -mm tree

2012-09-10 Thread Michal Hocko
On Mon 10-09-12 17:46:04, Wanpeng Li wrote: > On Mon, Sep 10, 2012 at 10:22:39AM +0200, Michal Hocko wrote: > >[Sorry for the late reply] > > > >On Fri 07-09-12 16:50:57, Andrew Morton wrote: > >> > >> The patch titled > >> Subject: mm/memblock

Re: + mm-memblock-reduce-overhead-in-binary-search.patch added to -mm tree

2012-09-10 Thread Michal Hocko
On Mon 10-09-12 19:30:51, Wanpeng Li wrote: > On Mon, Sep 10, 2012 at 01:05:50PM +0200, Michal Hocko wrote: > >On Mon 10-09-12 17:46:04, Wanpeng Li wrote: > >> On Mon, Sep 10, 2012 at 10:22:39AM +0200, Michal Hocko wrote: > >> >[Sorry for the late reply] > >

Re: [PATCH v4 05/14] Add a __GFP_KMEMCG flag

2012-10-09 Thread Michal Hocko
. > > [ v4: make flag unconditional, also declare it in trace code ] > > Signed-off-by: Glauber Costa > CC: Christoph Lameter > CC: Pekka Enberg > CC: Michal Hocko > CC: Suleiman Souhlal > Acked-by: Johannes Weiner > Acked-by: Rik van Riel > Acked-by: Me

Re: [PATCH v4 08/14] res_counter: return amount of charges after res_counter_uncharge

2012-10-09 Thread Michal Hocko
cked(c, val); > + r = res_counter_uncharge_locked(c, val); > + if (c == counter) > + ret = r; > spin_unlock(&c->lock); > } > local_irq_restore(flags); > + return ret; As I have already mentioned in my previous feedback th

Re: [PATCH v4 08/14] res_counter: return amount of charges after res_counter_uncharge

2012-10-09 Thread Michal Hocko
On Tue 09-10-12 19:14:57, Glauber Costa wrote: > On 10/09/2012 07:08 PM, Michal Hocko wrote: > > As I have already mentioned in my previous feedback this is certainly not > > atomic as the lock protects only one group in the hierarchy. How is > > the return value from this

Re: [patch for-linus] memcg, kmem: fix build error when CONFIG_INET is disabled

2012-10-10 Thread Michal Hocko
ng) build testing. > As a matter of fact, I just tested, and it indeed start failing after > that patch. > > Michal, since it is just a cleanup patch, I'd prefer just reverting if > you are okay with it. I think that taking David's patch makes more sense. -- Michal Hocko

Re: [patch for-linus] memcg, kmem: fix build error when CONFIG_INET is disabled

2012-10-10 Thread Michal Hocko
ned reference to `sock_update_memcg' > > sock_update_memcg() is only defined when CONFIG_INET is enabled, so fix it > by defining the dummy function without this option. > > Reported-by: Randy Dunlap > Signed-off-by: David Rientjes Acked-by: Michal Hocko Thanks! > --- >

Re: [PATCH v4 08/14] res_counter: return amount of charges after res_counter_uncharge

2012-10-10 Thread Michal Hocko
On Wed 10-10-12 13:03:39, Glauber Costa wrote: > On 10/09/2012 07:35 PM, Michal Hocko wrote: > > On Tue 09-10-12 19:14:57, Glauber Costa wrote: > >> On 10/09/2012 07:08 PM, Michal Hocko wrote: > >>> As I have already mentioned in my previous feedback this is certain

Re: [PATCH v4 08/14] res_counter: return amount of charges after res_counter_uncharge

2012-10-10 Thread Michal Hocko
only > users appearing from now on will be checking this value. > > Signed-off-by: Glauber Costa > CC: Michal Hocko > CC: Johannes Weiner > CC: Suleiman Souhlal > CC: Kamezawa Hiroyuki Reviewed-by: Michal Hocko > --- > Documentation/cgroups/resource_counter.txt |

[RFC PATCH] memcg: oom: fix totalpages calculation for swappiness==0

2012-10-10 Thread Michal Hocko
. --- From 445c2ced957cd77cbfca44d0e3f5056fed252a34 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Wed, 10 Oct 2012 15:46:54 +0200 Subject: [PATCH] memcg: oom: fix totalpages calculation for swappiness==0 oom_badness takes totalpages argument which says how many pages are available and it uses it

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread Michal Hocko
group (called > 'mysql') but with no limits to any resources. Where is that group in the hierarchy? > > azurIt > -- > To unsubscribe from this list: send the line "unsubscribe cgroups" in > the body of a message to majord...@vger.kernel.org > More majordom

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread Michal Hocko
t you are co-mounting with cpuset. If this happens to be a NUMA machine have you made the access to all nodes available? Also what does /proc/sys/vm/zone_reclaim_mode say? -- Michal Hocko SUSE Labs

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread Michal Hocko
On Fri 30-11-12 15:44:31, Michal Hocko wrote: > On Fri 30-11-12 14:44:27, azurIt wrote: > > >Anyway your system is under both global and local memory pressure. You > > >didn't see apache going down previously because it was probably the one > > >which was stuck a

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread Michal Hocko
On Fri 30-11-12 16:03:47, Michal Hocko wrote: [...] > Anyway, the more interesting thing is gfp_mask is GFP_NOWAIT allocation > from the page fault? Huh this shouldn't happen - ever. OK, it starts making sense now. The message came from pagefault_out_of_memory which doesn't

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread Michal Hocko
/en.wikipedia.org/wiki/Non-Uniform_Memory_Access > # cat /proc/sys/vm/zone_reclaim_mode > cat: /proc/sys/vm/zone_reclaim_mode: No such file or directory OK, so NUMA is not enabled. -- Michal Hocko SUSE Labs

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread Michal Hocko
e memcg OOM already), file backed page faults (aka __do_fault) use mem_cgroup_newpage_charge which doesn't disable OOM. This is a real head scratcher. Could you also post your complete containers configuration, maybe there is something strange in there (basically grep . -r YOUR_CGROUP_MNT excep

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-11-30 Thread Michal Hocko
memory.limit_in_bytes:157286400 68 memory.limit_in_bytes:209715200 10 memory.limit_in_bytes:262144000 28 memory.limit_in_bytes:314572800 1 memory.limit_in_bytes:346030080 1 memory.limit_in_bytes:524288000 2 memory.limit_in_bytes:9223372036854775807 -- Michal Hocko SU

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-03 Thread Michal Hocko
On Fri 30-11-12 17:19:23, Michal Hocko wrote: [...] > The important question is why you see VM_FAULT_OOM and whether memcg > charging failure can trigger that. I do not see how this could happen > right now because __GFP_NORETRY is not used for user pages (except for > THP which disab

Re: [PATCHSET cgroup/for-3.8] cpuset: decouple cpuset locking from cgroup core

2012-12-03 Thread Michal Hocko
Hi Tejun, I have glanced through the series and spotted nothing obviously wrong. I do not feel I could give my r-b because I am not familiar with cpusets internals enough and some patches look quite scary (like #8). Anyway the resulting outcome seems nice. Thanks! -- Michal Hocko SUSE Labs

Re: [PATCHSET cgroup/for-3.8] cpuset: drop cpuset->stack_list and ->parent

2012-12-03 Thread Michal Hocko
y generic conditions when the whole subtree might be skipped at the moment. Maybe it will turn out being useful for the soft limit reclaim but I haven't thought about it more. [...] -- Michal Hocko SUSE Labs

Re: [PATCH 1/3] cpuset: implement cgroup_rightmost_descendant()

2012-12-03 Thread Michal Hocko
off-by: Tejun Heo > Cc: Michal Hocko Acked-by: Michal Hocko Just a nit below [...] > +/** > + * cgroup_rightmost_descendant - return the rightmost descendant of a cgroup > + * @cgrp: cgroup of interest > + * > + * Return the rightmost descendant of @cgrp. If there's no des

Re: [PATCH 2/3] cpuset: replace cpuset->stack_list with cpuset_for_each_descendant_pre()

2012-12-03 Thread Michal Hocko
On Wed 28-11-12 14:27:00, Tejun Heo wrote: > Implement cpuset_for_each_descendant_pre() and replace the > cpuset-specific tree walking using cpuset->stack_list with it. > > Signed-off-by: Tejun Heo Reviewed-by: Michal Hocko > --- >

Re: [PATCH 3/3] cpuset: remove cpuset->parent

2012-12-03 Thread Michal Hocko
On Wed 28-11-12 14:27:01, Tejun Heo wrote: > cgroup already tracks the hierarchy. Follow cgroup->parent to find > the parent and drop cpuset->parent. > > Signed-off-by: Tejun Heo Yes, makes total sense. Reviewed-by: Michal Hocko > --- > kernel/cpuset.c | 28 +

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-05 Thread Michal Hocko
sults only in an order-0 page being allocated and charged to the memcg which has a higher likelihood to succeed. This is expensive because the hugepage must be split in the page fault handler, but it is much better than unnecessarily oom killing a process. Signed-off-by: David Rientjes Cc: Andrea

[PATCH 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-17 Thread Michal Hocko
d up in a later patch because it needs help from cgroup core. Signed-off-by: Michal Hocko --- mm/memcontrol.c | 52 +++- 1 file changed, 27 insertions(+), 25 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 9ce24b7..f57ba4c

[PATCH 6/6] hugetlb: do not fail in hugetlb_cgroup_pre_destroy

2012-10-17 Thread Michal Hocko
Now that pre_destroy callbacks are called from within cgroup_lock and the cgroup has been checked to be empty without any children then there is no other way to fail. Signed-off-by: Michal Hocko --- mm/hugetlb_cgroup.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff

[PATCH 5/6] memcg: make mem_cgroup_reparent_charges non failing

2012-10-17 Thread Michal Hocko
Signed-off-by: Michal Hocko --- mm/memcontrol.c | 18 ++ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f57ba4c..7c75da3 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3738,14 +3738,12 @@ static void mem_cgroup_f

[RFC] memcg/cgroup: do not fail on pre_destroy callbacks

2012-10-17 Thread Michal Hocko
then we are safe as well. The last two patches are trivial follow ups for the cgroups core change because now we know that nobody will interfere with us so we can drop those empty && no child condition. Comments, thoughts? Michal Hocko (6): memcg: split mem_cgroup_force_empty into re

[PATCH 1/6] memcg: split mem_cgroup_force_empty into reclaiming and reparenting parts

2012-10-17 Thread Michal Hocko
t have any functional changes. Signed-off-by: Michal Hocko --- mm/memcontrol.c | 72 --- 1 file changed, 42 insertions(+), 30 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index e4e9b18..f25e9c0 100644 --- a/mm/memcontrol.

[PATCH 4/6] cgroups: forbid pre_destroy callback to fail

2012-10-17 Thread Michal Hocko
also called from within cgroup_lock to guarantee that no new tasks show up. We could theoretically call them outside of the lock but then we have to move after CGRP_REMOVED flag is set. Signed-off-by: Michal Hocko --- kernel/cgroup.c | 30 +- 1 file changed, 9 inser

[PATCH 2/6] memcg: root_cgroup cannot reach mem_cgroup_move_parent

2012-10-17 Thread Michal Hocko
ume it can always move charges upwards. Signed-off-by: Michal Hocko --- mm/memcontrol.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f25e9c0..9ce24b7 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2709,9 +2709,7 @@

Re: [PATCH v5 09/14] memcg: kmem accounting lifecycle management

2012-10-17 Thread Michal Hocko
need to test&set atomically. Also once a group becomes active it is always marked that way until it goes away. -- Michal Hocko SUSE Labs

Re: [PATCH 5/6] memcg: make mem_cgroup_reparent_charges non failing

2012-10-18 Thread Michal Hocko
cg); > > + return 0; > > } > > > > Why don't you make pre_destroy() return void? Yes I plan to do that later after I have feedback for this RFC. I am especially interested whether the cgroup core patch is OK, resp. has to be reworked to pull pre_destroy outside

Re: [PATCH] oom, memcg: handle sysctl oom_kill_allocating_task while memcg oom happening

2012-10-18 Thread Michal Hocko
On Wed 17-10-12 01:14:48, Sha Zhengju wrote: > On Tuesday, October 16, 2012, Michal Hocko wrote: [...] > > Could you be more specific about the motivation for this patch? Is it > > "let's be consistent with the global oom" or you have a real use case > > for th

Re: [PATCH] oom, memcg: handle sysctl oom_kill_allocating_task while memcg oom happening

2012-10-18 Thread Michal Hocko
On Thu 18-10-12 21:51:57, Sha Zhengju wrote: > On 10/18/2012 07:56 PM, Michal Hocko wrote: > >On Wed 17-10-12 01:14:48, Sha Zhengju wrote: > >>On Tuesday, October 16, 2012, Michal Hocko wrote: > >[...] > >>>Could you be more specific about the motivation

Re: [PATCH] oom, memcg: handle sysctl oom_kill_allocating_task while memcg oom happening

2012-10-19 Thread Michal Hocko
On Fri 19-10-12 12:11:52, Sha Zhengju wrote: > On 10/18/2012 11:32 PM, Michal Hocko wrote: > >On Thu 18-10-12 21:51:57, Sha Zhengju wrote: > >>On 10/18/2012 07:56 PM, Michal Hocko wrote: > >>>On Wed 17-10-12 01:14:48, Sha Zhengju wrote: > >>>>On Tu

Re: [PATCH 4/6] cgroups: forbid pre_destroy callback to fail

2012-10-19 Thread Michal Hocko
On Fri 19-10-12 17:33:18, Li Zefan wrote: > On 2012/10/17 21:30, Michal Hocko wrote: > > Now that mem_cgroup_pre_destroy callback doesn't fail finally we can > > safely move on and forbid all the callbacks to fail. The last missing > > piece is moving cgr

Re: [PATCH 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-19 Thread Michal Hocko
On Thu 18-10-12 15:16:54, Tejun Heo wrote: > Hello, Michal. > > On Wed, Oct 17, 2012 at 03:30:45PM +0200, Michal Hocko wrote: > > mem_cgroup_force_empty_list currently tries to remove all pages from > > the given LRU. To prevent from temporary failur

Re: [PATCH 4/6] cgroups: forbid pre_destroy callback to fail

2012-10-19 Thread Michal Hocko
On Thu 18-10-12 15:41:48, Tejun Heo wrote: > Hello, Michal. > > On Wed, Oct 17, 2012 at 03:30:46PM +0200, Michal Hocko wrote: > > Now that mem_cgroup_pre_destroy callback doesn't fail finally we can > > safely move on and forbid all the callbacks to fail. The last m

Re: [PATCH 4/6] cgroups: forbid pre_destroy callback to fail

2012-10-19 Thread Michal Hocko
the rest of memcg changes. I can do the cleanup on top of this > whole series, but please do drop .__DEPRECATED_clear_css_refs from > memcg. OK I will drop that one. > Acked-by: Tejun Heo Do you still agree with the v2 based on Li's feedback? Thanks -- Michal Hocko SUSE La

Re: [PATCH 5/6] memcg: make mem_cgroup_reparent_charges non failing

2012-10-19 Thread Michal Hocko
:00 2001 From: Michal Hocko Date: Wed, 17 Oct 2012 14:15:09 +0200 Subject: [PATCH] memcg: make mem_cgroup_reparent_charges non failing Now that pre_destroy callbacks are called from within cgroup_lock and the cgroup has been checked to be empty without any children then there is no othe

Re: process hangs on do_exit when oom happens

2012-10-19 Thread Michal Hocko
> [] int_signal+0x12/0x17 > [] 0x This looks strange because this is just an exit part which shouldn't deadlock or anything. Is this stack stable? Have you tried to check it more times? -- Michal Hocko SUSE Labs

Re: [PATCH 4/6] cgroups: forbid pre_destroy callback to fail

2012-10-22 Thread Michal Hocko
On Fri 19-10-12 13:24:05, Tejun Heo wrote: > Hello, Michal. > > On Fri, Oct 19, 2012 at 03:32:45PM +0200, Michal Hocko wrote: > > On Thu 18-10-12 15:41:48, Tejun Heo wrote: > > > Hello, Michal. > > > > > > On Wed, Oct 17, 2012 at 03:30:46PM +

Re: [PATCH v5 06/14] memcg: kmem controller infrastructure

2012-10-22 Thread Michal Hocko
heir pages back, how would you > >> feel about the following test: > >> > >> may_oom = (gfp & GFP_KERNEL) && !(gfp & __GFP_NORETRY) ? > >> > > > > I would simply copy the logic from the page allocator and only trigger oom > > for _

Re: process hangs on do_exit when oom happens

2012-10-22 Thread Michal Hocko
the worker process hangs there. > > Actually, if we didn't set the worker process into the cpu cgroup, this > will never happens. Strange and it smells like a misconfiguration. Could you provide the complete setting for both controllers? grep . -r /cgroup/ > On Sat, Oct 20, 201

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-06 Thread Michal Hocko
heavy workloads but this will give us better traces - I hope). Anyway do you see the same problem if transparent huge pages are disabled? (echo never > /sys/kernel/mm/transparent_hugepage/enabled) --- From 93a30140b50d8474a047b91c698f4880149635db Mon Sep 17 00:00:00 2001 From: Michal Hocko Da

Re: [PATCHSET cgroup/for-3.8] cpuset: decouple cpuset locking from cgroup core

2012-12-06 Thread Michal Hocko
On Thu 06-12-12 14:25:03, Li Zefan wrote: > On 2012/12/4 0:53, Tejun Heo wrote: > > Hello, Michal. > > > > On Mon, Dec 03, 2012 at 04:22:05PM +0100, Michal Hocko wrote: > >> I have glanced through the series and spotted nothing obviously wrong. I > >> do no

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-06 Thread Michal Hocko
e > >debugging will tell us something (the inlining has been reduced for thp > >paths which can reduce performance in thp page fault heavy workloads but > >this will give us better traces - I hope). > > > Should i apply all patches together? (fix for this bug, more log >

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-07 Thread Michal Hocko
On Thu 06-12-12 19:43:52, Ying Han wrote: [...] > Forgot to mention, I was testing 3.7-rc6 with the two cgroup changes : Could you give a try to -mm tree as well? There are some changes for memcgs removal in that tree which are not in Linus's tree. -- Michal Hocko SUSE Labs

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-07 Thread Michal Hocko
visited. Didn't follow why we made that change, but after > restoring the behavior a bit seems passed my test. Hmm, strange. css reference counting should be stronger than mem_cgroup one because it pins css thus cgroup which in turn keeps memcg alive. > Here is the patch I applied on

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-07 Thread Michal Hocko
On Fri 07-12-12 09:12:25, Ying Han wrote: > On Fri, Dec 7, 2012 at 12:58 AM, Michal Hocko wrote: > > On Thu 06-12-12 19:43:52, Ying Han wrote: > > [...] > >> Forgot to mention, I was testing 3.7-rc6 with the two cgroup changes : > > > > Could you give a try

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-07 Thread Michal Hocko
On Fri 07-12-12 11:16:23, Ying Han wrote: > On Fri, Dec 7, 2012 at 9:27 AM, Michal Hocko wrote: > > On Fri 07-12-12 09:12:25, Ying Han wrote: > >> On Fri, Dec 7, 2012 at 12:58 AM, Michal Hocko wrote: > >> > On Thu 06-12-12 19:43:52, Ying Han wrote: > >> &

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-10 Thread Michal Hocko
at. If my current understanding is correct then this is related to transparent huge pages (and leaking charge to the page fault handler). Do you see the same problem if you disable THP before you start your workload? (echo never > /sys/kernel/mm/transparent_hugepage/enabled) -- Michal Hocko

Re: [PATCH for 3.2.34] memcg: do not trigger OOM from add_to_page_cache_locked

2012-12-10 Thread Michal Hocko
fault > >handler). Do you see the same problem if you disable THP before you > >start your workload? (echo never > > >/sys/kernel/mm/transparent_hugepage/enabled) > > # cat /sys/kernel/mm/transparent_hugepage/enabled > cat: /sys/kernel/mm/transparent_hugepage/enabled: No

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-11 Thread Michal Hocko
On Sun 09-12-12 08:59:54, Ying Han wrote: > On Mon, Nov 26, 2012 at 10:47 AM, Michal Hocko wrote: [...] > > + /* > > +* Even if we found a group we have to make sure it is > > alive. > > +* css && !memcg means th

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-11 Thread Michal Hocko
On Sun 09-12-12 11:39:50, Ying Han wrote: > On Mon, Nov 26, 2012 at 10:47 AM, Michal Hocko wrote: [...] > > if (reclaim) { > > - iter->position = id; > > + struct mem_cgroup *curr = memcg; > > +

Re: [patch v2 4/6] memcg: simplify mem_cgroup_iter

2012-12-11 Thread Michal Hocko
On Sun 09-12-12 09:01:48, Ying Han wrote: > On Mon, Nov 26, 2012 at 10:47 AM, Michal Hocko wrote: > > Current implementation of mem_cgroup_iter has to consider both css and > > memcg to find out whether no group has been found (css==NULL - aka the > > loop is completed)

Re: [patch v2 4/6] memcg: simplify mem_cgroup_iter

2012-12-11 Thread Michal Hocko
On Mon 10-12-12 20:35:20, Ying Han wrote: > On Mon, Nov 26, 2012 at 10:47 AM, Michal Hocko wrote: > > Current implementation of mem_cgroup_iter has to consider both css and > > memcg to find out whether no group has been found (css==NULL - aka the > > loop is completed)

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-11 Thread Michal Hocko
On Tue 11-12-12 16:50:25, Michal Hocko wrote: > On Sun 09-12-12 08:59:54, Ying Han wrote: > > On Mon, Nov 26, 2012 at 10:47 AM, Michal Hocko wrote: > [...] > > > + /* > > > +* Even if we found a group we have t

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-11 Thread Michal Hocko
On Tue 11-12-12 17:15:59, Michal Hocko wrote: > On Tue 11-12-12 16:50:25, Michal Hocko wrote: > > On Sun 09-12-12 08:59:54, Ying Han wrote: > > > On Mon, Nov 26, 2012 at 10:47 AM, Michal Hocko wrote: > > [...] > > > > + /* > > > > +

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-12 Thread Michal Hocko
On Tue 11-12-12 14:43:37, Ying Han wrote: > On Tue, Dec 11, 2012 at 8:15 AM, Michal Hocko wrote: > > On Tue 11-12-12 16:50:25, Michal Hocko wrote: > >> On Sun 09-12-12 08:59:54, Ying Han wrote: > >> > On Mon, Nov 26, 2012 at

Re: [patch v2 3/6] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-12-12 Thread Michal Hocko
On Tue 11-12-12 14:36:10, Ying Han wrote: > On Tue, Dec 11, 2012 at 7:54 AM, Michal Hocko wrote: > > On Sun 09-12-12 11:39:50, Ying Han wrote: > >> On Mon, Nov 26, 2012 at 10:47 AM, Michal Hocko wrote: > > [...] > >> > if (reclaim) { > >

[RFC 5/5] cgroup: remove css_get_next

2012-11-13 Thread Michal Hocko
Now that we have generic and well-ordered cgroup tree walkers there is no need to keep css_get_next in place. Signed-off-by: Michal Hocko --- include/linux/cgroup.h |7 --- kernel/cgroup.c| 49 2 files changed, 56 deletions

[RFC 3/5] memcg: simplify mem_cgroup_iter

2012-11-13 Thread Michal Hocko
a simple invariant that memcg is always alive when non-NULL and all nodes have been visited otherwise. We could get rid of the surrounding while loop but keep it for now for an easier review. It will go away in the next patch. Signed-off-by: Michal Hocko --- mm/memcontrol.

[RFC 1/5] memcg: synchronize per-zone iterator access by a spinlock

2012-11-13 Thread Michal Hocko
will be replaced by cgroup generic iteration, which requires storing a mem_cgroup pointer in the iterator; that in turn requires reference counting, so concurrent access will be a problem. Signed-off-by: Michal Hocko --- mm/memcontrol.c | 12 +++- 1 file changed, 11 insertions(+), 1 delet

[RFC 4/5] memcg: clean up mem_cgroup_iter

2012-11-13 Thread Michal Hocko
Get rid of while(!memcg) loop as it is no longer needed because there will always be at least one group that should be visited (root). This patch doesn't add any change to the implementation but it is separate to make a review easier. Signed-off-by: Michal Hocko --- mm/memcontrol.c |

[RFC 2/5] memcg: rework mem_cgroup_iter to use cgroup iterators

2012-11-13 Thread Michal Hocko
rogress is still possible). iter_lock will make sure that only one reclaimer will see the last_visited group and the reference count game around it. Signed-off-by: Michal Hocko --- mm/memcontrol.c | 64 ++- 1 file changed, 49 insertions(+), 15 de

[RFC] rework mem_cgroup iterator

2012-11-13 Thread Michal Hocko
75 insertions(+), 91 deletions(-) Michal Hocko (5): memcg: synchronize per-zone iterator access by a spinlock memcg: rework mem_cgroup_iter to use cgroup iterators memcg: simplify mem_cgroup_iter memcg: clean up mem_cgroup_iter cgroup: remove css_get_next -- To unsubscribe

Re: [RFC] rework mem_cgroup iterator

2012-11-14 Thread Michal Hocko
On Wed 14-11-12 09:55:08, Li Zefan wrote: > On 2012/11/13 23:30, Michal Hocko wrote: > > Hi all, > > this patch set tries to make mem_cgroup_iter saner in the way how it > > walks hierarchies. css->id based traversal is far from being ideal as it > > is not determini

Re: [RFC] rework mem_cgroup iterator

2012-11-14 Thread Michal Hocko
based ones. Memcg iterator, however, still needs its own iterator on top because we have to handle the parallel reclaimers. [...] -- Michal Hocko SUSE Labs
