[Devel] Re: [PATCH v2 02/29] slub: fix slab_state for slub

2012-05-16 Thread Glauber Costa
On 05/16/2012 01:55 AM, David Rientjes wrote: On Fri, 11 May 2012, Glauber Costa wrote: When the slub code wants to know if the sysfs state has already been initialized, it tests for slab_state == SYSFS. This is quite fragile, since new state can be added in the future (it is, in fact

[Devel] Re: [PATCH v2 01/29] slab: dup name string

2012-05-16 Thread Glauber Costa
On 05/16/2012 02:04 AM, David Rientjes wrote: On Fri, 11 May 2012, Glauber Costa wrote: diff --git a/mm/slab.c b/mm/slab.c index e901a36..91b9c13 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -2118,6 +2118,7 @@ static void __kmem_cache_destroy(struct kmem_cache *cachep

[Devel] Re: [PATCH v2 11/29] cgroups: ability to stop res charge propagation on bounded ancestor

2012-05-16 Thread Glauber Costa
On 05/15/2012 06:59 AM, KAMEZAWA Hiroyuki wrote: (2012/05/12 2:44), Glauber Costa wrote: From: Frederic Weisbecker fweis...@gmail.com Moving a task from one cgroup to another may require subtracting its resource charge from the old cgroup and adding it to the new one. For this to happen

[Devel] Re: [PATCH v2 19/29] skip memcg kmem allocations in specified code regions

2012-05-16 Thread Glauber Costa
On 05/15/2012 06:46 AM, KAMEZAWA Hiroyuki wrote: (2012/05/12 2:44), Glauber Costa wrote: This patch creates a mechanism that skips memcg allocations during certain pieces of our core code. It basically works in the same way as preempt_disable()/preempt_enable(): By marking a region under

[Devel] Re: [PATCH v2 18/29] memcg: kmem controller charge/uncharge infrastructure

2012-05-16 Thread Glauber Costa
On 05/15/2012 06:57 AM, KAMEZAWA Hiroyuki wrote: +#ifdef CONFIG_CGROUP_MEM_RES_CTLR_KMEM +int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, s64 delta) +{ + struct res_counter *fail_res; + struct mem_cgroup *_memcg; + int may_oom, ret; + bool nofail = false; + +

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Glauber Costa
On 05/14/2012 05:38 AM, Li Zefan wrote: +static void disarm_static_keys(struct mem_cgroup *memcg) +{ +#ifdef CONFIG_INET +if (memcg->tcp_mem.cg_proto.activated) +static_key_slow_dec(&memcg_socket_limit_enabled); +#endif +} Move this inside the ifdef/endif below?

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Glauber Costa
On 05/16/2012 10:03 AM, Glauber Costa wrote: BTW, what is the relationship between 1/2 and 2/2 ? Can't do jump label patching inside an interrupt handler. They need to happen when we free the structure, and I was about to add a worker myself when I found out we already have one: just we don't

[Devel] Re: [PATCH v2 18/29] memcg: kmem controller charge/uncharge infrastructure

2012-05-16 Thread Glauber Costa
On 05/16/2012 12:18 PM, KAMEZAWA Hiroyuki wrote: If at this point the memcg hits a NOFAIL allocation worth 2 pages, by the method I am using, the memcg will be at 4M + 4k after the allocation. Charging it to the root memcg will leave it at 4M - 4k. This means that to be able to

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Glauber Costa
On 05/16/2012 12:28 PM, KAMEZAWA Hiroyuki wrote: For the record, I compile-tested it many times, and the problem that Li wondered about seems not to exist. Ah... Hmm. I guess dependency problem will be found in -mm if any rather than netdev... Yes. As I said, this only touches stuff

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Glauber Costa
On 05/16/2012 12:28 PM, KAMEZAWA Hiroyuki wrote: (2012/05/16 16:04), Glauber Costa wrote: On 05/16/2012 10:03 AM, Glauber Costa wrote: BTW, what is the relationship between 1/2 and 2/2 ? Can't do jump label patching inside an interrupt handler. They need to happen when we free

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Glauber Costa
On 05/17/2012 01:06 AM, Andrew Morton wrote: On Fri, 11 May 2012 17:11:17 -0300 Glauber Costa glom...@parallels.com wrote: We call the destroy function when a cgroup starts to be removed, such as by a rmdir event. However, because of our reference counters, some objects are still inflight.

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Glauber Costa
On 05/17/2012 01:13 AM, Andrew Morton wrote: On Fri, 11 May 2012 17:11:17 -0300 Glauber Costa glom...@parallels.com wrote: We call the destroy function when a cgroup starts to be removed, such as by a rmdir event. However, because of our reference counters, some objects are still inflight.

[Devel] [PATCH v2 00/29] kmem limitation for memcg

2012-05-11 Thread Glauber Costa
ancestor Glauber Costa (24): slab: dup name string slub: fix slab_state for slub memcg: Always free struct memcg through schedule_work() slub: always get the cache from its page in kfree slab: rename gfpflags to allocflags slab: use obj_size field of struct kmem_cache when not debugging

[Devel] [PATCH v2 03/29] memcg: Always free struct memcg through schedule_work()

2012-05-11 Thread Glauber Costa
in a separate thread. The goal is to have a stable place to call the upcoming jump label destruction function outside the realm of the complicated and quite far-reaching cgroup lock (that can't be held when calling neither the cpu_hotplug.lock nor the jump_label_mutex) Signed-off-by: Glauber Costa

[Devel] [PATCH v2 07/29] memcg: Reclaim when more than one page needed.

2012-05-11 Thread Glauber Costa
-by: Suleiman Souhlal sulei...@google.com Signed-off-by: Glauber Costa glom...@parallels.com Reviewed-by: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com --- mm/memcontrol.c | 18 +++--- 1 files changed, 11 insertions(+), 7 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c

[Devel] [PATCH v2 09/29] memcg: change defines to an enum

2012-05-11 Thread Glauber Costa
This is just a cleanup patch for clarity of expression. In earlier submissions, people asked it to be in a separate patch, so here it is. Signed-off-by: Glauber Costa glom...@parallels.com CC: Michal Hocko mho...@suse.cz CC: Johannes Weiner han...@cmpxchg.org Acked-by: Kamezawa Hiroyuki

[Devel] [PATCH v2 01/29] slab: dup name string

2012-05-11 Thread Glauber Costa
copies would be better. So here it is. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi --- mm/slab.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/mm/slab.c b/mm/slab.c index e901a36

[Devel] [PATCH v2 02/29] slub: fix slab_state for slub

2012-05-11 Thread Glauber Costa
does. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi --- mm/slub.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index ffe13fd..226e053 100644 --- a/mm/slub.c +++ b/mm

[Devel] [PATCH v2 06/29] memcg: Make it possible to use the stock for more than one page.

2012-05-11 Thread Glauber Costa
From: Suleiman Souhlal ssouh...@freebsd.org Signed-off-by: Suleiman Souhlal sulei...@google.com Signed-off-by: Glauber Costa glom...@parallels.com Acked-by: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com --- mm/memcontrol.c | 18 +- 1 files changed, 9 insertions(+), 9

[Devel] [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Glauber Costa
struct page already have this information. If we start chaining caches, this information will always be more trustworthy than whatever is passed into the function Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi

[Devel] [PATCH v2 12/29] kmem slab accounting basic infrastructure

2012-05-11 Thread Glauber Costa
. People who want to track kernel memory but not limit it, can set this limit to a very high number (like RESOURCE_MAX - 1page - that no one will ever hit, or equal to the user memory) Signed-off-by: Glauber Costa glom...@parallels.com CC: Michal Hocko mho...@suse.cz CC: Johannes Weiner han

[Devel] [PATCH v2 05/29] slab: rename gfpflags to allocflags

2012-05-11 Thread Glauber Costa
A consistent name with slub saves us an accessor function. In both caches, this field represents the same thing. We would like to use it from the mem_cgroup code. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi

[Devel] [PATCH v2 11/29] cgroups: ability to stop res charge propagation on bounded ancestor

2012-05-11 Thread Glauber Costa
. To solve this, provide a pair of new API that can charge/uncharge a resource counter until we reach a given ancestor. Signed-off-by: Frederic Weisbecker fweis...@gmail.com Acked-by: Paul Menage p...@paulmenage.org Acked-by: Glauber Costa glom...@parallels.com Cc: Li Zefan l...@cn.fujitsu.com Cc

[Devel] [PATCH v2 08/29] slab: use obj_size field of struct kmem_cache when not debugging

2012-05-11 Thread Glauber Costa
-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi --- include/linux/slab_def.h |4 +++- mm/slab.c| 37 ++--- 2 files changed, 29 insertions(+), 12 deletions(-) diff --git

[Devel] [PATCH v2 13/29] slab/slub: struct memcg_params

2012-05-11 Thread Glauber Costa
For the kmem slab controller, we need to record some extra information in the kmem_cache structure. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir

[Devel] [PATCH v2 14/29] slub: consider a memcg parameter in kmem_create_cache

2012-05-11 Thread Glauber Costa
their index right away. This index mechanism was developed by Suleiman Souhlal. Changed to a idr/ida based approach based on suggestion from Kamezawa. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko

[Devel] [PATCH v2 15/29] slab: pass memcg parameter to kmem_cache_create

2012-05-11 Thread Glauber Costa
of simplifications Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner han...@cmpxchg.org CC: Suleiman Souhlal sulei

[Devel] [PATCH v2 16/29] slub: create duplicate cache

2012-05-11 Thread Glauber Costa
count is increased if the cache creation succeeds. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner han...@cmpxchg.org

[Devel] [PATCH v2 25/29] memcg: Track all the memcg children of a kmem_cache.

2012-05-11 Thread Glauber Costa
From: Suleiman Souhlal ssouh...@freebsd.org This enables us to remove all the children of a kmem_cache being destroyed, if for example the kernel module it's being used in gets unloaded. Otherwise, the children will still point to the destroyed parent. We also use this to propagate

[Devel] [PATCH v2 17/29] slab: create duplicate cache

2012-05-11 Thread Glauber Costa
Souhlal, with some adaptations and simplifications by me. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner han

[Devel] [PATCH v2 26/29] memcg: Per-memcg memory.kmem.slabinfo file.

2012-05-11 Thread Glauber Costa
From: Suleiman Souhlal ssouh...@freebsd.org This file shows all the kmem_caches used by a memcg. Signed-off-by: Suleiman Souhlal sulei...@google.com --- include/linux/slab.h |1 + mm/memcontrol.c | 17 ++ mm/slab.c| 87

[Devel] [PATCH v2 18/29] memcg: kmem controller charge/uncharge infrastructure

2012-05-11 Thread Glauber Costa
is inspired by the code written by Suleiman Souhlal, but heavily changed. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner

[Devel] [PATCH v2 19/29] skip memcg kmem allocations in specified code regions

2012-05-11 Thread Glauber Costa
cache creation, when we allocate data using caches that are not necessarily created already. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir

[Devel] [PATCH v2 20/29] slub: charge allocation to a memcg

2012-05-11 Thread Glauber Costa
to return as soon as we realize we are not a memcg cache. The charge/uncharge functions are heavier, but are only called for new page allocations. The kmalloc_no_account variant is patched so the base function is used and we don't even try to do cache selection. Signed-off-by: Glauber Costa glom

[Devel] [PATCH v2 21/29] slab: per-memcg accounting of slab caches

2012-05-11 Thread Glauber Costa
to return as soon as we realize we are not a memcg cache. The charge/uncharge functions are heavier, but are only called for new page allocations. Code is heavily inspired by Suleiman's, with adaptations to the patchset and minor simplifications by me. Signed-off-by: Glauber Costa glom

[Devel] [PATCH v2 22/29] memcg: disable kmem code when not in use.

2012-05-11 Thread Glauber Costa
that no mischarges are applied. Jump label decrement happens when the last reference count from the memcg dies. This will only happen when the caches are all dead. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC

[Devel] [PATCH v2 23/29] memcg: destroy memcg caches

2012-05-11 Thread Glauber Costa
from the cache code. Caches are only destroyed in process context, so we queue them up for later processing in the general case. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC

[Devel] [PATCH v2 24/29] memcg/slub: shrink dead caches

2012-05-11 Thread Glauber Costa
cache reorganization, and then all references to empty pages will be removed. An unlikely branch is used to make sure this case does not affect performance in the usual slab_free path. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb

[Devel] [PATCH v2 27/29] slub: create slabinfo file for memcg

2012-05-11 Thread Glauber Costa
This patch implements mem_cgroup_slabinfo() for the slub. With that, we can also probe the used caches for it. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki

[Devel] [PATCH v2 28/29] slub: track all children of a kmem cache

2012-05-11 Thread Glauber Costa
given all memcg caches are expected to be empty - even though they are likely to be hanging around in the system, we just need to scan a list of sibling caches, and destroy each one of them. This is very similar to the work done by Suleiman for the slab. Signed-off-by: Glauber Costa glom

[Devel] [PATCH v2 29/29] Documentation: add documentation for slab tracker for memcg

2012-05-11 Thread Glauber Costa
In a separate patch, to aid reviewers. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner han...@cmpxchg.org CC

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Glauber Costa
On 05/11/2012 02:53 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: struct page already have this information. If we start chaining caches, this information will always be more trustworthy than whatever is passed into the function Other allocators may not have

[Devel] Re: [PATCH v2 00/29] kmem limitation for memcg

2012-05-11 Thread Glauber Costa
On 05/11/2012 02:44 PM, Glauber Costa wrote: Hello All, This is my new take for the memcg kmem accounting. At this point, I consider the series pretty mature - although of course, bugs are always there... As a disclaimer, however, I must say that the slub code is much more stressed by me

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Glauber Costa
On 05/11/2012 03:06 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: Adding a VM_BUG_ON may be useful to make sure that kmem_cache_free is always passed the correct slab cache. Well, problem is, it isn't always passed the correct slab cache. At least not after

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Glauber Costa
On 05/11/2012 03:17 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: On 05/11/2012 03:06 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: Adding a VM_BUG_ON may be useful to make sure that kmem_cache_free is always passed the correct slab cache

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Glauber Costa
On 05/11/2012 03:32 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: Thank you in advance for your time reviewing this! Where do I find the rationale for all of this? Trouble is that pages can contain multiple objects f.e. so accounting of pages to groups is a bit fuzzy

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Glauber Costa
On 05/11/2012 03:56 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: So we don't mix pages from multiple memcgs in the same cache - we believe that would be too confusing. Well subsystem create caches and other things that are shared between multiple processes. How can

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Glauber Costa
On 05/11/2012 04:09 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: On 05/11/2012 03:56 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: So we don't mix pages from multiple memcgs in the same cache - we believe that would be too confusing. Well

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Glauber Costa
On 05/11/2012 04:20 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: I see that. But there are other subsystems from slab allocators that do the same. There are also objects that may be used by multiple processes. This is also true for normal user pages. And then, we do

[Devel] [PATCH v5 0/2] fix static_key disabling problem in memcg

2012-05-11 Thread Glauber Costa
() call, effectively guaranteeing the behavior we need. Glauber Costa (2): Always free struct memcg through schedule_work() decrement static keys on real destroy time include/net/sock.h|9 mm/memcontrol.c | 50 +--- net

[Devel] [PATCH v5 1/2] Always free struct memcg through schedule_work()

2012-05-11 Thread Glauber Costa
in a separate thread. The goal is to have a stable place to call the upcoming jump label destruction function outside the realm of the complicated and quite far-reaching cgroup lock (that can't be held when calling neither the cpu_hotplug.lock nor the jump_label_mutex) Signed-off-by: Glauber Costa

[Devel] [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-11 Thread Glauber Costa
interface ] Signed-off-by: Glauber Costa glom...@parallels.com CC: Tejun Heo t...@kernel.org CC: Li Zefan lize...@huawei.com CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner han...@cmpxchg.org CC: Michal Hocko mho...@suse.cz --- include/net/sock.h|9 + mm

[Devel] Re: [RFC] slub: show dead memcg caches in a separate file

2012-05-08 Thread Glauber Costa
On 05/08/2012 02:42 AM, Pekka Enberg wrote: On Tue, May 8, 2012 at 6:30 AM, Glauber Costa glom...@parallels.com wrote: But there is another aspect: those dead caches have one thing in common, which is the fact that no new objects will ever be allocated on them. You can't tune them, or do

[Devel] Re: [RFC] alternative mechanism to skip memcg kmem allocations

2012-05-08 Thread Glauber Costa
On 05/08/2012 05:47 PM, Suleiman Souhlal wrote: On Mon, May 7, 2012 at 8:37 PM, Glauber Costa glom...@parallels.com wrote: Since Kame expressed the wish to see a context-based method to skip accounting for caches, I came up with the following proposal for your appreciation. It basically works

[Devel] Re: [RFC] slub: show dead memcg caches in a separate file

2012-05-07 Thread Glauber Costa
On 05/07/2012 07:04 PM, Suleiman Souhlal wrote: On Thu, May 3, 2012 at 11:47 AM, Glauber Costa glom...@parallels.com wrote: One of the very few things that still unsettles me in the kmem controller for memcg, is how badly we mess up with the /proc/slabinfo file. It is alright to have the

[Devel] [RFC] alternative mechanism to skip memcg kmem allocations

2012-05-07 Thread Glauber Costa
always set this in the task_struct... Let me know what you think of it. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes

[Devel] [RFC] slub: show dead memcg caches in a separate file

2012-05-03 Thread Glauber Costa
this separately to collect opinions from all of you. I can either implement a version of this for the slab, or follow any other route. Thanks Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC

[Devel] Re: [PATCH 00/23] slab+slub accounting for memcg

2012-05-02 Thread Glauber Costa
On 04/30/2012 06:43 PM, Suleiman Souhlal wrote: I am leaving destruction of caches out of the series, although most of the infrastructure for that is here, since we did it in earlier series. This is basically because right now Kame is reworking it for user memcg, and I like the new

[Devel] Re: [PATCH 09/23] kmem slab accounting basic infrastructure

2012-05-02 Thread Glauber Costa
@@ -3951,8 +3966,26 @@ static int mem_cgroup_write(struct cgroup *cont, struct cftype *cft, break; if (type == _MEM) ret = mem_cgroup_resize_limit(memcg, val); - else + else if (type == _MEMSWAP)

[Devel] Re: [PATCH 11/23] slub: consider a memcg parameter in kmem_create_cache

2012-05-02 Thread Glauber Costa
On 04/30/2012 04:51 PM, Suleiman Souhlal wrote: On Fri, Apr 20, 2012 at 2:57 PM, Glauber Costa glom...@parallels.com wrote: Allow a memcg parameter to be passed during cache creation. The slub allocator will only merge caches that belong to the same memcg. Default function is created as a

[Devel] Re: [PATCH 17/23] kmem controller charge/uncharge infrastructure

2012-05-02 Thread Glauber Costa
On 04/30/2012 05:56 PM, Suleiman Souhlal wrote: + +static void kmem_cache_destroy_work_func(struct work_struct *w) +{ + struct kmem_cache *cachep; + char *name; + + spin_lock_irq(&cache_queue_lock); + while (!list_empty(&destroyed_caches)) { + cachep =

[Devel] Re: [PATCH 19/23] slab: per-memcg accounting of slab caches

2012-05-02 Thread Glauber Costa
@@ -3834,11 +3866,15 @@ static inline void __cache_free(struct kmem_cache *cachep, void *objp, */ void *kmem_cache_alloc(struct kmem_cache *cachep, gfp_t flags) { - void *ret = __cache_alloc(cachep, flags, __builtin_return_address(0)); + void *ret; + + rcu_read_lock();

[Devel] Re: [PATCH v4 1/3] make jump_labels wait while updates are in place

2012-04-27 Thread Glauber Costa
On 04/27/2012 10:53 AM, Jason Baron wrote: On Thu, Apr 26, 2012 at 08:43:06PM -0400, Steven Rostedt wrote: On Thu, Apr 26, 2012 at 07:51:05PM -0300, Glauber Costa wrote: In mem cgroup, we need to guarantee that two concurrent updates of the jump_label interface wait for each other. IOW, we

[Devel] [PATCH v3 2/2] decrement static keys on real destroy time

2012-04-26 Thread Glauber Costa
, only limited memcgs will have its sockets accounted. [v2: changed a tcp limited flag for a generic proto limited flag ] [v3: update the current active flag only after the static_key update ] [v4: disarm_static_keys() inside free_work ] Signed-off-by: Glauber Costa glom...@parallels.com

[Devel] [PATCH v3 0/2] fix problem with static_branch() for sock memcg

2012-04-26 Thread Glauber Costa
by Kame. Let me know if this is acceptable. Thanks Glauber Costa (2): Always free struct memcg through schedule_work() decrement static keys on real destroy time include/net/sock.h|9 ++ mm/memcontrol.c | 54 ++ net/ipv4

[Devel] [PATCH v3 1/2] Always free struct memcg through schedule_work()

2012-04-26 Thread Glauber Costa
in a separate thread. The goal is to have a stable place to call the upcoming jump label destruction function outside the realm of the complicated and quite far-reaching cgroup lock (that can't be held when calling neither the cpu_hotplug.lock nor the jump_label_mutex) Signed-off-by: Glauber Costa

[Devel] Re: [PATCH v3 2/2] decrement static keys on real destroy time

2012-04-26 Thread Glauber Costa
On 04/26/2012 06:39 PM, Tejun Heo wrote: Hello, Glauber. Overall, I like this approach much better. Just some nits below. On Thu, Apr 26, 2012 at 06:24:23PM -0300, Glauber Costa wrote: @@ -4836,6 +4851,18 @@ static void free_work(struct work_struct *work) int size = sizeof(struct

[Devel] Re: [PATCH v3 2/2] decrement static keys on real destroy time

2012-04-26 Thread Glauber Costa
No, what I mean is that why can't you do about the same mutexed activated inside static_key API function instead of requiring every user to worry about the function returning asynchronously. ie. synchronize inside static_key API instead of in the callers. Like this? diff --git

[Devel] Re: [PATCH v3 2/2] decrement static keys on real destroy time

2012-04-26 Thread Glauber Costa
On 04/26/2012 07:22 PM, Tejun Heo wrote: Hello, On Thu, Apr 26, 2012 at 3:17 PM, Glauber Costa glom...@parallels.com wrote: No, what I mean is that why can't you do about the same mutexed activated inside static_key API function instead of requiring every user to worry about the function

[Devel] [PATCH v4 0/3] fix problem with static_branch() for sock memcg

2012-04-26 Thread Glauber Costa
me know if this is acceptable. Thanks Glauber Costa (3): make jump_labels wait while updates are in place Always free struct memcg through schedule_work() decrement static keys on real destroy time include/net/sock.h|9 kernel/jump_label.c | 13 --- mm

[Devel] [PATCH v4 1/3] make jump_labels wait while updates are in place

2012-04-26 Thread Glauber Costa
to be called in any fast path, otherwise it would be expected to have quite a different name. Therefore the mutex + atomic combination instead of just an atomic should not kill us. Signed-off-by: Glauber Costa glom...@parallels.com CC: Tejun Heo t...@kernel.org CC: Li Zefan lize...@huawei.com CC

[Devel] [PATCH v4 2/3] Always free struct memcg through schedule_work()

2012-04-26 Thread Glauber Costa
in a separate thread. The goal is to have a stable place to call the upcoming jump label destruction function outside the realm of the complicated and quite far-reaching cgroup lock (that can't be held when calling neither the cpu_hotplug.lock nor the jump_label_mutex) Signed-off-by: Glauber Costa

[Devel] [PATCH v4 3/3] decrement static keys on real destroy time

2012-04-26 Thread Glauber Costa
interface ] Signed-off-by: Glauber Costa glom...@parallels.com CC: Tejun Heo t...@kernel.org CC: Li Zefan lize...@huawei.com CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner han...@cmpxchg.org CC: Michal Hocko mho...@suse.cz --- include/net/sock.h|9 + mm

[Devel] Re: [PATCH 16/23] slab: provide kmalloc_no_account

2012-04-25 Thread Glauber Costa
On 04/24/2012 10:44 PM, KAMEZAWA Hiroyuki wrote: (2012/04/23 8:53), Glauber Costa wrote: Some allocations need to be accounted to the root memcg regardless of their context. One trivial example, is the allocations we do during the memcg slab cache creation themselves. Strictly speaking

[Devel] Re: [PATCH 11/23] slub: consider a memcg parameter in kmem_create_cache

2012-04-25 Thread Glauber Costa
On 04/24/2012 10:38 PM, KAMEZAWA Hiroyuki wrote: (2012/04/21 6:57), Glauber Costa wrote: Allow a memcg parameter to be passed during cache creation. The slub allocator will only merge caches that belong to the same memcg. Default function is created as a wrapper, passing NULL to the memcg

[Devel] Re: [PATCH 09/23] kmem slab accounting basic infrastructure

2012-04-25 Thread Glauber Costa
On 04/24/2012 10:32 PM, KAMEZAWA Hiroyuki wrote: (2012/04/21 6:57), Glauber Costa wrote: This patch adds the basic infrastructure for the accounting of the slab caches. To control that, the following files are created: * memory.kmem.usage_in_bytes * memory.kmem.limit_in_bytes

[Devel] Re: [PATCH 17/23] kmem controller charge/uncharge infrastructure

2012-04-25 Thread Glauber Costa
On 04/24/2012 07:54 PM, David Rientjes wrote: On Tue, 24 Apr 2012, Glauber Costa wrote: Yes, for user memory, I see charging to p->mm->owner as allowing that process to eventually move and be charged to a different memcg and there's no way to do proper accounting if the charge is split amongst

[Devel] Re: [PATCH 17/23] kmem controller charge/uncharge infrastructure

2012-04-25 Thread Glauber Costa
About kmem, if we count task_struct, page tables, etc...which can be freed by OOM-Killer i.e. it's allocated for 'process', should be aware of OOM problem. Using mm->owner makes sense to me until someone finds a great idea to handle OOM situation rather than task killing. noted, will update.

[Devel] Re: [PATCH v2 5/5] decrement static keys on real destroy time

2012-04-24 Thread Glauber Costa
On 04/23/2012 11:40 PM, KAMEZAWA Hiroyuki wrote: (2012/04/24 4:37), Glauber Costa wrote: We call the destroy function when a cgroup starts to be removed, such as by a rmdir event. However, because of our reference counters, some objects are still inflight. Right now, we are decrementing

[Devel] Re: [PATCH v2 4/5] don't take cgroup_mutex in destroy()

2012-04-24 Thread Glauber Costa
On 04/23/2012 11:31 PM, KAMEZAWA Hiroyuki wrote: (2012/04/24 4:37), Glauber Costa wrote: Most of the destroy functions are only doing very simple things like freeing memory. The ones who goes through lists and such, already use its own locking for those. * The cgroup itself won't go away

[Devel] Re: [PATCH 11/23] slub: consider a memcg parameter in kmem_create_cache

2012-04-24 Thread Glauber Costa
On 04/24/2012 11:03 AM, Frederic Weisbecker wrote: On Fri, Apr 20, 2012 at 06:57:19PM -0300, Glauber Costa wrote: diff --git a/mm/slub.c b/mm/slub.c index 2652e7c..86e40cc 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -32,6 +32,7 @@ #include <linux/prefetch.h> #include <trace/events/kmem.h>

[Devel] Re: [PATCH 13/23] slub: create duplicate cache

2012-04-24 Thread Glauber Costa
On 04/24/2012 11:18 AM, Frederic Weisbecker wrote: On Sun, Apr 22, 2012 at 08:53:30PM -0300, Glauber Costa wrote: This patch provides kmem_cache_dup(), that duplicates a cache for a memcg, preserving its creation properties. Object size, alignment and flags are all respected. When a duplicate

[Devel] Re: [PATCH 17/23] kmem controller charge/uncharge infrastructure

2012-04-24 Thread Glauber Costa
On 04/24/2012 11:22 AM, Frederic Weisbecker wrote: On Mon, Apr 23, 2012 at 03:25:59PM -0700, David Rientjes wrote: On Sun, 22 Apr 2012, Glauber Costa wrote: +/* + * Return the kmem_cache we're supposed to use for a slab allocation. + * If we are in interrupt context or otherwise have

[Devel] Re: [PATCH v2 3/5] change number_of_cpusets to an atomic

2012-04-24 Thread Glauber Costa
On 04/24/2012 12:02 PM, Christoph Lameter wrote: On Mon, 23 Apr 2012, Glauber Costa wrote: This will allow us to call destroy() without holding the cgroup_mutex(). Other important updates inside update_flags() are protected by the callback_mutex. We could protect this variable

[Devel] Re: [PATCH v2 3/5] change number_of_cpusets to an atomic

2012-04-24 Thread Glauber Costa
On 04/24/2012 01:24 PM, Christoph Lameter wrote: On Tue, 24 Apr 2012, Glauber Costa wrote: Would this not also be a good case to introduce static branching? number_of_cpusets is used to avoid going through unnecessary processing should there be no cpusets in use. static branches comes

[Devel] Re: [PATCH 17/23] kmem controller charge/uncharge infrastructure

2012-04-24 Thread Glauber Costa
On 04/24/2012 05:25 PM, David Rientjes wrote: On Tue, 24 Apr 2012, Glauber Costa wrote: I think memcg is not necessarily wrong. That is because threads in a process share an address space, and you will eventually need to map a page to deliver it to userspace. The mm struct points you

[Devel] Re: [PATCH 2/3] don't take cgroup_mutex in destroy()

2012-04-23 Thread Glauber Costa
On 04/21/2012 03:47 AM, Li Zefan wrote: Glauber Costa wrote: On 04/19/2012 07:57 PM, Tejun Heo wrote: On Thu, Apr 19, 2012 at 07:49:17PM -0300, Glauber Costa wrote: Most of the destroy functions are only doing very simple things like freeing memory. The ones that go through lists

[Devel] [PATCH v2 0/5] Fix problem with static_key decrement

2012-04-23 Thread Glauber Costa
. I am ready to make any further modifications on this that you guys deem necessary. Thanks Glauber Costa (5): don't attach a task to a dead cgroup blkcg: protect blkcg->policy_list change number_of_cpusets to an atomic don't take cgroup_mutex in destroy() decrement static keys on real

[Devel] [PATCH v2 2/5] blkcg: protect blkcg->policy_list

2012-04-23 Thread Glauber Costa
policy_list walks are protected with blkcg->lock everywhere else in the code. In destroy(), they are not. Because destroy is usually protected with the cgroup_mutex(), this is usually not a problem. But it would be a lot better not to assume this. Signed-off-by: Glauber Costa glom...@parallels.com
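
The idea in this patch description, walking a list under the object's own lock instead of relying on an outer mutex held by the caller, can be sketched in user-space C. This is only an illustration under stated assumptions: `struct group`, `struct policy_node`, `group_lock()`, `group_unlock()` and `group_destroy_policies()` are hypothetical names, and a C11 `atomic_flag` spinlock stands in for the kernel's lock. It is not the code from the patch.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's blkcg types. */
struct policy_node {
    int id;
    struct policy_node *next;
};

struct group {
    atomic_flag lock;                /* plays the role of the per-object lock */
    struct policy_node *policy_list; /* singly linked list guarded by lock */
};

/* Minimal spinlock built on atomic_flag (assumption: short critical sections). */
static void group_lock(struct group *g)
{
    while (atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void group_unlock(struct group *g)
{
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
}

/*
 * Detach every node from the list while holding the group's own lock,
 * so correctness does not depend on any outer mutex held by the caller.
 * Returns the number of nodes detached.
 */
static int group_destroy_policies(struct group *g)
{
    int detached = 0;

    group_lock(g);
    while (g->policy_list != NULL) {
        struct policy_node *p = g->policy_list;

        g->policy_list = p->next;
        p->next = NULL;
        detached++;
    }
    group_unlock(g);

    return detached;
}
```

In this sketch the destroy path takes the same lock as every other list walker, which is the property the description asks for.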

[Devel] [PATCH v2 1/5] don't attach a task to a dead cgroup

2012-04-23 Thread Glauber Costa
-off-by: Glauber Costa glom...@parallels.com CC: Tejun Heo t...@kernel.org CC: Li Zefan lize...@huawei.com CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com --- kernel/cgroup.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/kernel/cgroup.c b/kernel/cgroup.c index

[Devel] [PATCH v2 4/5] don't take cgroup_mutex in destroy()

2012-04-23 Thread Glauber Costa
* There are no more tasks in the cgroup, and the cgroup is declared dead (cgroup_is_removed() == true) [v2: don't cgroup_lock the freezer and blkcg ] Signed-off-by: Glauber Costa glom...@parallels.com CC: Tejun Heo t...@kernel.org CC: Li Zefan lize...@huawei.com CC: Kamezawa Hiroyuki kamezawa.hir

[Devel] [PATCH v2 5/5] decrement static keys on real destroy time

2012-04-23 Thread Glauber Costa
, only limited memcgs will have their sockets accounted. [v2: changed a tcp limited flag for a generic proto limited flag ] [v3: update the current active flag only after the static_key update ] Signed-off-by: Glauber Costa glom...@parallels.com --- include/net/sock.h|9 mm

[Devel] [PATCH v2 3/5] change number_of_cpusets to an atomic

2012-04-23 Thread Glauber Costa
by that mutex at all times, and some of its updates happen inside the cgroup_mutex - which means we would deadlock. An atomic variable is not expensive, since it is seldom updated, and protects us well. Signed-off-by: Glauber Costa glom...@parallels.com --- include/linux/cpuset.h |6 +++--- kernel

[Devel] [PATCH 12/23] slab: pass memcg parameter to kmem_cache_create

2012-04-22 Thread Glauber Costa
of simplifications Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner han...@cmpxchg.org CC: Suleiman Souhlal sulei

[Devel] [PATCH 21/23] memcg: Track all the memcg children of a kmem_cache.

2012-04-22 Thread Glauber Costa
From: Suleiman Souhlal ssouh...@freebsd.org This enables us to remove all the children of a kmem_cache being destroyed, if for example the kernel module it's being used in gets unloaded. Otherwise, the children will still point to the destroyed parent. We also use this to propagate

[Devel] [PATCH 13/23] slub: create duplicate cache

2012-04-22 Thread Glauber Costa
count is increased if the cache creation succeeds. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner han...@cmpxchg.org

[Devel] [PATCH 14/23] slub: provide kmalloc_no_account

2012-04-22 Thread Glauber Costa
kmalloc allocations are allowed to be bypassed. The function is not exported, because driver code should always be accounted. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa

[Devel] [PATCH 22/23] memcg: Per-memcg memory.kmem.slabinfo file.

2012-04-22 Thread Glauber Costa
From: Suleiman Souhlal ssouh...@freebsd.org This file shows all the kmem_caches used by a memcg. Signed-off-by: Suleiman Souhlal sulei...@google.com --- include/linux/slab.h |1 + mm/memcontrol.c | 17 ++ mm/slab.c| 88

[Devel] [PATCH 15/23] slab: create duplicate cache

2012-04-22 Thread Glauber Costa
Souhlal, with some adaptations and simplifications by me. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb...@cs.helsinki.fi CC: Michal Hocko mho...@suse.cz CC: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com CC: Johannes Weiner han

[Devel] [PATCH 16/23] slab: provide kmalloc_no_account

2012-04-22 Thread Glauber Costa
kmalloc allocations are allowed to be bypassed. The function is not exported, because driver code should always be accounted. This code is mostly written by Suleiman Souhlal. Signed-off-by: Glauber Costa glom...@parallels.com CC: Christoph Lameter c...@linux.com CC: Pekka Enberg penb
