Re: [Devel] [RFC] how should we deal with dead memcgs' kmem caches?

2014-04-21 Thread Christoph Lameter
On Sun, 20 Apr 2014, Vladimir Davydov wrote: * Way #1 - prevent dead kmem caches from caching slabs on free * We can modify sl[au]b implementation so that it won't cache any objects on free if the kmem cache belongs to a dead memcg. Then it'd be enough to drain per-cpu pools of all dead kmem

Re: [Devel] [PATCH -mm 1/4] memcg, slab: do not schedule cache destruction when last page goes away

2014-04-15 Thread Christoph Lameter
On Tue, 15 Apr 2014, Vladimir Davydov wrote: 2) When freeing an object of a dead memcg cache, initiate thorough check if the cache is really empty and destroy it then. That could be implemented by poking the reaping thread on kfree, and actually does not require the schedule_work in

Re: [Devel] [PATCH -mm 1/4] memcg, slab: do not schedule cache destruction when last page goes away

2014-04-15 Thread Christoph Lameter
On Tue, 15 Apr 2014, Vladimir Davydov wrote: There is already logic in both slub and slab that does that on cache close. Yeah, but here the question is when we should close caches left after memcg offline. Obviously we should do it after all objects of such a cache have gone, but when

Re: [Devel] [PATCH -mm] slab: document kmalloc_order

2014-04-11 Thread Christoph Lameter
On Fri, 11 Apr 2014, Vladimir Davydov wrote: diff --git a/mm/slab_common.c b/mm/slab_common.c index cab4c49b3e8c..3ffd2e76b5d2 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -573,6 +573,11 @@ void __init create_kmalloc_caches(unsigned long flags) } #endif /* !CONFIG_SLOB */

Re: [Devel] [PATCH -mm v2.2] mm: get rid of __GFP_KMEMCG

2014-04-11 Thread Christoph Lameter
On Thu, 3 Apr 2014, Vladimir Davydov wrote: --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -358,16 +358,7 @@ kmem_cache_alloc_node_trace(struct kmem_cache *s, #include <linux/slub_def.h> #endif -static __always_inline void * -kmalloc_order(size_t size, gfp_t flags, unsigned int

Re: [Devel] [PATCH 2/2] slub: do not drop slab_mutex for sysfs_slab_add

2014-02-10 Thread Christoph Lameter
On Sun, 9 Feb 2014, Vladimir Davydov wrote: Fortunately, recently kobject_uevent was patched to call the usermode helper with the UMH_NO_WAIT flag, making the deadlock impossible. Great. If you can get that other patch merged then Acked-by: Christoph Lameter c...@linux.com

Re: [Devel] [PATCH RFC] slub: do not drop slab_mutex for sysfs_slab_{add, remove}

2014-02-06 Thread Christoph Lameter
On Thu, 6 Feb 2014, Vladimir Davydov wrote: When creating/destroying a kmem cache, we do a lot of work holding the slab_mutex, but we drop it for sysfs_slab_{add,remove} for some reason. Since __kmem_cache_create and __kmem_cache_shutdown are extremely rare, I propose to simplify locking by

Re: [Devel] [PATCH RFC] slub: do not drop slab_mutex for sysfs_slab_{add, remove}

2014-02-06 Thread Christoph Lameter
On Thu, 6 Feb 2014, Vladimir Davydov wrote: Hmm... IIUC the only function of concern is kobject_uevent() - everything else called from sysfs_slab_{add,remove} is a mix of kmalloc, kfree, mutex_lock/unlock - in short, nothing dangerous. There we do call_usermodehelper(), but we do it with

[Devel] Re: [PATCH v5 16/18] slab: propagate tunables values

2012-10-23 Thread Christoph Lameter
On Mon, 22 Oct 2012, Glauber Costa wrote: On 10/19/2012 11:51 PM, Christoph Lameter wrote: On Fri, 19 Oct 2012, Glauber Costa wrote: SLAB allows us to tune a particular cache behavior with tunables. When creating a new memcg cache copy, we'd like to preserve any tunables the parent

[Devel] Re: [PATCH v5 10/18] sl[au]b: always get the cache from its page in kfree

2012-10-19 Thread Christoph Lameter
On Fri, 19 Oct 2012, Glauber Costa wrote: struct page already has this information. If we start chaining caches, this information will always be more trustworthy than whatever is passed into the function. Yes it does, but the information is not standardized between the allocators yet. Could you

[Devel] Re: [PATCH v5 11/18] sl[au]b: Allocate objects from memcg cache

2012-10-19 Thread Christoph Lameter
On Fri, 19 Oct 2012, Glauber Costa wrote: We are able to match a cache allocation to a particular memcg. If the task doesn't change groups during the allocation itself - a rare event, this will give us a good picture about who is the first group to touch a cache page. Now that the

[Devel] Re: [PATCH v5 14/18] memcg/sl[au]b: shrink dead caches

2012-10-19 Thread Christoph Lameter
On Fri, 19 Oct 2012, Glauber Costa wrote: An unlikely branch is used to make sure this case does not affect performance in the usual slab_free path. The slab allocator has a time based reaper that would eventually get rid of the objects, but we can also call it explicitly, since dead caches

[Devel] Re: [PATCH v5 15/18] Aggregate memcg cache values in slabinfo

2012-10-19 Thread Christoph Lameter
On Fri, 19 Oct 2012, Glauber Costa wrote: + +/* + * We use suffixes to the name in memcg because we can't have caches + * created in the system with the same name. But when we print them + * locally, better refer to them with the base name + */ +static inline const char *cache_name(struct

[Devel] Re: [PATCH v5 07/14] mm: Allocate kernel pages to the right memcg

2012-10-16 Thread Christoph Lameter
On Tue, 16 Oct 2012, Glauber Costa wrote: To avoid adding markers to the page - and a kmem flag that would necessarily follow, as much as doing page_cgroup lookups for no reason, whoever is marking its allocations with __GFP_KMEMCG flag is responsible for telling the page allocator that this

[Devel] Re: [PATCH v5 14/14] Add documentation about the kmem controller

2012-10-16 Thread Christoph Lameter
On Tue, 16 Oct 2012, Glauber Costa wrote: + memory.kmem.limit_in_bytes # set/show hard limit for kernel memory + memory.kmem.usage_in_bytes # show current kernel memory allocation + memory.kmem.failcnt # show the number of kernel memory usage hits limits +

[Devel] Re: [PATCH v5 14/14] Add documentation about the kmem controller

2012-10-16 Thread Christoph Lameter
On Tue, 16 Oct 2012, Glauber Costa wrote: A limitation of kernel memory use would be good, for example, to prevent abuse from non-trusted containers in a high density, shared, container environment. But that would be against intentional abuse by someone who has code that causes the kernel to

[Devel] Re: [PATCH v3 06/16] memcg: infrastructure to match an allocation to the right cache

2012-09-25 Thread Christoph Lameter
On Tue, 25 Sep 2012, Glauber Costa wrote: 1) Do like the events mechanism and allocate this in a separate structure. Add a pointer chase in the access, and I don't think it helps much because it gets allocated anyway. But we could at least defer it to the time when we limit the cache.

[Devel] Re: [PATCH v3 05/16] consider a memcg parameter in kmem_create_cache

2012-09-24 Thread Christoph Lameter
On Mon, 24 Sep 2012, Glauber Costa wrote: But that is orthogonal, isn't it? People will still expect to see it in the old slabinfo file. The current scheme for memory statistics is /proc/meminfo contains global counters /sys/devices/system/node/nodeX/meminfo contains node specific counters.

[Devel] Re: [PATCH v3 05/16] consider a memcg parameter in kmem_create_cache

2012-09-24 Thread Christoph Lameter
On Mon, 24 Sep 2012, Glauber Costa wrote: The reason I say it is orthogonal, is that people will still want to see their caches in /proc/slabinfo, regardless of wherever else they'll be. It was a requirement from Pekka in one of the first times I posted this, IIRC. They want to see total

[Devel] Re: [PATCH v3 05/16] consider a memcg parameter in kmem_create_cache

2012-09-24 Thread Christoph Lameter
On Mon, 24 Sep 2012, Glauber Costa wrote: So Christoph is proposing that the new caches appear somewhere under the cgroups directory and /proc/slabinfo includes aggregated counts, right? I'm certainly OK with that. Just for clarification, I am not sure about the aggregate counts -

[Devel] Re: [PATCH v3 05/16] consider a memcg parameter in kmem_create_cache

2012-09-24 Thread Christoph Lameter
On Mon, 24 Sep 2012, Pekka Enberg wrote: So Christoph is proposing that the new caches appear somewhere under the cgroups directory and /proc/slabinfo includes aggregated counts, right? I'm certainly OK with that. Caches would appear either in cgroup/slabinfo (which would have the same format

[Devel] Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag

2012-09-19 Thread Christoph Lameter
On Wed, 19 Sep 2012, Glauber Costa wrote: On 09/18/2012 07:06 PM, Christoph Lameter wrote: On Tue, 18 Sep 2012, Glauber Costa wrote: +++ b/include/linux/gfp.h @@ -35,6 +35,11 @@ struct vm_area_struct; #else #define ___GFP_NOTRACK 0 #endif +#ifdef CONFIG_MEMCG_KMEM

[Devel] Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-19 Thread Christoph Lameter
On Wed, 19 Sep 2012, Glauber Costa wrote: This is an extremely hot path of the kernel and you are adding significant processing. Check how the benchmarks are influenced by this change. virt_to_cache can be a bit expensive. Would it be enough for you to have a separate code path for

[Devel] Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote: +++ b/include/linux/gfp.h @@ -35,6 +35,11 @@ struct vm_area_struct; #else #define ___GFP_NOTRACK 0 #endif +#ifdef CONFIG_MEMCG_KMEM +#define ___GFP_KMEMCG 0x40u +#else +#define ___GFP_KMEMCG 0

[Devel] Re: [PATCH v3 03/16] slab: Ignore the cflgs bit in cache creation

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote: No cache should ever pass that as a creation flag, since this bit is used to mark an internal decision of the slab about object placement. We can just ignore this bit if it happens to be passed (such as when duplicating a cache in the kmem memcg

[Devel] Re: [PATCH v3 04/16] provide a common place for initcall processing in kmem_cache

2012-09-18 Thread Christoph Lameter
an empty placeholder for the SLOB. Acked-by: Christoph Lameter c...@linux.com ___ Devel mailing list Devel@openvz.org https://openvz.org/mailman/listinfo/devel

[Devel] Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote: index f2d760c..18de3f6 100644 --- a/mm/slab.c +++ b/mm/slab.c @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc); * Free an object which was previously allocated from this * cache. */ -void kmem_cache_free(struct kmem_cache *cachep, void *objp)

[Devel] Re: [PATCH v3 13/16] slab: slab-specific propagation changes.

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote: When a parent cache does tune_cpucache, we need to propagate that to the children as well. For that, we unfortunately need to tap into the slab core. One of the todo list items for the common stuff is to have actually a common kmem_cache structure. If

[Devel] Re: [PATCH v3 15/16] memcg/sl[au]b: shrink dead caches

2012-09-18 Thread Christoph Lameter
Why doesn't slab need that too? It keeps a number of free pages on the per node lists until shrink is called.

[Devel] Re: [PATCH v2 04/11] kmem accounting basic infrastructure

2012-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2012, Michal Hocko wrote: That is not what the kernel does, in general. We assume that if he wants that memory and we can serve it, we should. Also, not all kernel memory is unreclaimable. We can shrink the slabs, for instance. Ying Han claims she has patches for that

[Devel] Re: [PATCH v2 04/11] kmem accounting basic infrastructure

2012-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2012, Glauber Costa wrote: On 08/15/2012 06:47 PM, Christoph Lameter wrote: On Wed, 15 Aug 2012, Michal Hocko wrote: That is not what the kernel does, in general. We assume that if he wants that memory and we can serve it, we should. Also, not all kernel memory

[Devel] Re: [PATCH v2 04/11] kmem accounting basic infrastructure

2012-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2012, Greg Thelen wrote: You can already shrink the reclaimable slabs (dentries / inodes) via calls to the subsystem specific shrinkers. Did Ying Han do anything to go beyond that? cc: Ying. The Google shrinker patches enhance prune_dcache_sb() to limit dentry pressure to

[Devel] Re: [PATCH v2 04/11] kmem accounting basic infrastructure

2012-08-15 Thread Christoph Lameter
On Wed, 15 Aug 2012, Glauber Costa wrote: Remember we copy over the metadata and create copies of the caches per-memcg. Therefore, a dentry belongs to a memcg if it was allocated from the slab pertaining to that memcg. The dentry could be used by other processes in the system though. F.e.

[Devel] Re: [PATCH v4 24/25] memcg/slub: shrink dead caches

2012-07-25 Thread Christoph Lameter
On Fri, 20 Jul 2012, Glauber Costa wrote: This is the same btw in SLAB which keeps objects in per cpu caches and keeps empty slab pages on special queues. This patch marks all memcg caches as dead. kmem_cache_shrink is called for the ones who are not yet dead - this will force internal

[Devel] Re: [PATCH 05/10] slab: allow enable_cpu_cache to use preset values for its tunables

2012-07-25 Thread Christoph Lameter
On Wed, 25 Jul 2012, Glauber Costa wrote: SLAB allows us to tune a particular cache behavior with tunables. When creating a new memcg cache copy, we'd like to preserve any tunables the parent cache already had. So does SLUB but I do not see a patch for that allocator.

[Devel] Re: [PATCH 09/10] slab: slab-specific propagation changes.

2012-07-25 Thread Christoph Lameter
On Wed, 25 Jul 2012, Glauber Costa wrote: When a parent cache does tune_cpucache, we need to propagate that to the children as well. For that, we unfortunately need to tap into the slab core. Slub also has tunables.

[Devel] Re: [PATCH 10/10] memcg/sl[au]b: shrink dead caches

2012-07-25 Thread Christoph Lameter
On Wed, 25 Jul 2012, Glauber Costa wrote: In the slub allocator, when the last object of a page goes away, we don't necessarily free it - there is not necessarily a test for empty page in any slab_free path. That is true for the slab allocator as well. In either case calling

[Devel] Re: [PATCH 05/10] slab: allow enable_cpu_cache to use preset values for its tunables

2012-07-25 Thread Christoph Lameter
On Wed, 25 Jul 2012, Glauber Costa wrote: It is certainly not through the same method as SLAB, right? Writing to /proc/slabinfo gives me an I/O error. I assume it is something through sysfs, but skimming through the code now, I can't find any per-cache tunables. Would you mind pointing

[Devel] Re: [PATCH] provide a common place for initcall processing in kmem_cache

2012-07-24 Thread Christoph Lameter
On Mon, 23 Jul 2012, Glauber Costa wrote: This patch moves that to slab_common.c, while creating an empty placeholder for the SLOB. Acked-by: Christoph Lameter c...@linux.com

[Devel] Re: [PATCH 4/4] make CFLGS_OFF_SLAB visible for all slabs

2012-06-14 Thread Christoph Lameter
On Thu, 14 Jun 2012, Glauber Costa wrote: Since we're now moving towards a unified slab allocator interface, make CFLGS_OFF_SLAB visible to all allocators, even though SLAB keeps being its only user. Also, make the name consistent with the other flags, which start with SLAB_xx. What is the

[Devel] Re: [PATCH 4/4] make CFLGS_OFF_SLAB visible for all slabs

2012-06-14 Thread Christoph Lameter
On Thu, 14 Jun 2012, Glauber Costa wrote: I want to mask that out in kmem-specific slab creation. Since I am copying the original flags, and that flag is embedded in the slab saved flags, it will be carried to the new slab if I don't mask it out. I thought you intercepted slab creation? You

[Devel] Re: [PATCH 2/4] Add a __GFP_SLABMEMCG flag

2012-06-11 Thread Christoph Lameter
On Sat, 9 Jun 2012, James Bottomley wrote: On Fri, 2012-06-08 at 14:31 -0500, Christoph Lameter wrote: On Fri, 8 Jun 2012, Glauber Costa wrote: */ #define __GFP_NOTRACK_FALSE_POSITIVE (__GFP_NOTRACK) -#define __GFP_BITS_SHIFT 25 /* Room for N __GFP_FOO bits */ +#define

[Devel] Re: [PATCH 2/4] Add a __GFP_SLABMEMCG flag

2012-06-08 Thread Christoph Lameter
On Fri, 8 Jun 2012, Glauber Costa wrote: */ #define __GFP_NOTRACK_FALSE_POSITIVE (__GFP_NOTRACK) -#define __GFP_BITS_SHIFT 25 /* Room for N __GFP_FOO bits */ +#define __GFP_BITS_SHIFT 26 /* Room for N __GFP_FOO bits */ #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) -

[Devel] Re: [PATCH v3 13/28] slub: create duplicate cache

2012-05-30 Thread Christoph Lameter
On Wed, 30 May 2012, Tejun Heo wrote: Yeah, I prefer your per-cg cache approach but do hope that it stays as far from actual allocator code as possible. Christoph, would it be acceptable if the cg logic is better separated? Certainly anything that would allow this to be separated out would be

[Devel] Re: [PATCH v3 05/28] memcg: Reclaim when more than one page needed.

2012-05-29 Thread Christoph Lameter
On Fri, 25 May 2012, Glauber Costa wrote: From: Suleiman Souhlal ssouh...@freebsd.org mem_cgroup_do_charge() was written before slab accounting, and expects three cases: being called for 1 page, being called for a stock of 32 pages, or being called for a hugepage. If we call for 2 pages

[Devel] Re: [PATCH v3 12/28] slab: pass memcg parameter to kmem_cache_create

2012-05-29 Thread Christoph Lameter
On Fri, 25 May 2012, Glauber Costa wrote: index 06e4a3e..7c0cdd6 100644 --- a/include/linux/slab_def.h +++ b/include/linux/slab_def.h @@ -102,6 +102,13 @@ struct kmem_cache { */ }; +static inline void store_orig_align(struct kmem_cache *cachep, int orig_align) +{ +#ifdef

[Devel] Re: [PATCH v3 13/28] slub: create duplicate cache

2012-05-29 Thread Christoph Lameter
On Fri, 25 May 2012, Glauber Costa wrote: index dacd1fb..4689034 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -467,6 +467,23 @@ struct cg_proto *tcp_proto_cgroup(struct mem_cgroup *memcg) EXPORT_SYMBOL(tcp_proto_cgroup); #endif /* CONFIG_INET */ +char

[Devel] Re: [PATCH v3 15/28] slub: always get the cache from its page in kfree

2012-05-29 Thread Christoph Lameter
On Fri, 25 May 2012, Glauber Costa wrote: struct page already has this information. If we start chaining caches, this information will always be more trustworthy than whatever is passed into the function. Yes but the lookup of the page struct also costs some cycles. SLAB in !NUMA mode and

[Devel] Re: [PATCH v3 16/28] memcg: kmem controller charge/uncharge infrastructure

2012-05-29 Thread Christoph Lameter
On Fri, 25 May 2012, Glauber Costa wrote: --- a/init/Kconfig +++ b/init/Kconfig @@ -696,7 +696,7 @@ config CGROUP_MEM_RES_CTLR_SWAP_ENABLED then swapaccount=0 does the trick). config CGROUP_MEM_RES_CTLR_KMEM bool Memory Resource Controller Kernel Memory accounting

[Devel] Re: [PATCH v3 18/28] slub: charge allocation to a memcg

2012-05-29 Thread Christoph Lameter
On Fri, 25 May 2012, Glauber Costa wrote: This patch charges allocation of a slab object to a particular memcg. I am wondering why you need all the other patches. The simplest approach would just to hook into page allocation and freeing from the slab allocators as done here and charge to the

[Devel] Re: [PATCH v3 19/28] slab: per-memcg accounting of slab caches

2012-05-29 Thread Christoph Lameter
On Fri, 25 May 2012, Glauber Costa wrote: This patch charges allocation of a slab object to a particular memcg. Ok so a requirement is to support tracking of individual slab objects to cgroups? That is going to be quite expensive since it will touch the hotpaths.

[Devel] Re: [PATCH v3 00/28] kmem limitation for memcg

2012-05-29 Thread Christoph Lameter
On Mon, 28 May 2012, Glauber Costa wrote: It would be best to merge these with my patchset to extract common code from the allocators. The modifications of individual slab allocators would then be not necessary anymore and it would save us a lot of work. Some of them would not, some of

[Devel] Re: [PATCH v3 00/28] kmem limitation for memcg

2012-05-29 Thread Christoph Lameter
On Tue, 29 May 2012, Glauber Costa wrote: I think it may be simplest to only account for the pages used by a slab in a memcg. That code could be added to the functions in the slab allocators that interface with the page allocators. Those are not that performance critical and would do not

[Devel] Re: [PATCH v3 12/28] slab: pass memcg parameter to kmem_cache_create

2012-05-29 Thread Christoph Lameter
On Tue, 29 May 2012, Glauber Costa wrote: Ok this only duplicates the kmalloc arrays. Why not the others? It does duplicate the others. First it does a while loop over the kmalloc caches, then a list_for_each_entry in the rest. You probably missed it. There is no need to separately

[Devel] Re: [PATCH v3 13/28] slub: create duplicate cache

2012-05-29 Thread Christoph Lameter
On Tue, 29 May 2012, Glauber Costa wrote: Accounting pages seems just crazy to me. If new allocators come in the future, organizing the pages in a different way, instead of patching it here and there, we need to totally rewrite this. Quite to the contrary. We could either pass a

[Devel] Re: [PATCH v3 12/28] slab: pass memcg parameter to kmem_cache_create

2012-05-29 Thread Christoph Lameter
On Tue, 29 May 2012, Glauber Costa wrote: How do you detect that someone is touching it? kmem_alloc_cache will create mem_cgroup_get_kmem_cache. (protected by static_branches, so won't happen if you don't have at least non-root memcg using it) * Then it detects which memcg the calling

[Devel] Re: [PATCH v3 13/28] slub: create duplicate cache

2012-05-29 Thread Christoph Lameter
On Tue, 29 May 2012, Glauber Costa wrote: I will try to at least have the page accounting done in a consistent way. How about that? Ok. What do you mean by consistent? Since objects and pages can be used in a shared way and since accounting in many areas of the kernel is intentionally fuzzy to

[Devel] Re: [PATCH v3 13/28] slub: create duplicate cache

2012-05-29 Thread Christoph Lameter
On Tue, 29 May 2012, Glauber Costa wrote: But we really need a page to be filled with objects from the same cgroup, and the non-shared objects to be accounted to the right place. No other subsystem has such a requirement. Even the NUMA nodes are mostly suggestions and can be ignored by the

[Devel] Re: [PATCH v3 13/28] slub: create duplicate cache

2012-05-29 Thread Christoph Lameter
On Tue, 29 May 2012, Glauber Costa wrote: I don't know about cpusets in details, but at least with NUMA, this is not an apple-to-apple comparison. a NUMA node is not meant to contain you. A container is, and that is why it is called a container. Cpusets contain sets of nodes. A cpuset

[Devel] Re: [PATCH v3 13/28] slub: create duplicate cache

2012-05-29 Thread Christoph Lameter
On Wed, 30 May 2012, Glauber Costa wrote: Well, I'd have to dive in the code a bit more, but that the impression that the documentation gives me, by saying: Cpusets constrain the CPU and Memory placement of tasks to only the resources within a task's current cpuset. is that you can't

[Devel] Re: [PATCH v3 00/28] kmem limitation for memcg

2012-05-25 Thread Christoph Lameter
On Fri, 25 May 2012, Michal Hocko wrote: On Fri 25-05-12 17:03:20, Glauber Costa wrote: I believe some of the early patches here are already in some trees around. I don't know who should pick this, so if everyone agrees with what's in here, please just ack them and tell me which tree I

[Devel] Re: [PATCH v2 02/29] slub: fix slab_state for slub

2012-05-11 Thread Christoph Lameter
Acked-by: Christoph Lameter c...@linux.com

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Christoph Lameter
On Fri, 11 May 2012, Glauber Costa wrote: struct page already has this information. If we start chaining caches, this information will always be more trustworthy than whatever is passed into the function. Other allocators may not have that information and this patch may cause bugs to go

[Devel] Re: [PATCH v2 05/29] slab: rename gfpflags to allocflags

2012-05-11 Thread Christoph Lameter
On Fri, 11 May 2012, Glauber Costa wrote: A consistent name with slub saves us an acessor function. In both caches, this field represents the same thing. We would like to use it from the mem_cgroup code. Acked-by: Christoph Lameter c...@linux.com

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Christoph Lameter
On Fri, 11 May 2012, Glauber Costa wrote: Adding a VM_BUG_ON may be useful to make sure that kmem_cache_free is always passed the correct slab cache. Well, problem is, it isn't always passed the correct slab cache. At least not after this series, since we'll have child caches associated

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Christoph Lameter
On Fri, 11 May 2012, Glauber Costa wrote: On 05/11/2012 03:06 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: Adding a VM_BUG_ON may be useful to make sure that kmem_cache_free is always passed the correct slab cache. Well, problem is, it isn't always

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Christoph Lameter
On Fri, 11 May 2012, Glauber Costa wrote: Thank you in advance for your time reviewing this! Where do I find the rationale for all of this? Trouble is that pages can contain multiple objects f.e. so accounting of pages to groups is a bit fuzzy. I have not followed memcg too much since it is not

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Christoph Lameter
On Fri, 11 May 2012, Glauber Costa wrote: So we don't mix pages from multiple memcgs in the same cache - we believe that would be too confusing. Well subsystem create caches and other things that are shared between multiple processes. How can you track that? /proc/slabinfo reflects this

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Christoph Lameter
On Fri, 11 May 2012, Glauber Costa wrote: On 05/11/2012 03:56 PM, Christoph Lameter wrote: On Fri, 11 May 2012, Glauber Costa wrote: So we don't mix pages from multiple memcgs in the same cache - we believe that would be too confusing. Well subsystem create caches and other

[Devel] Re: [PATCH v2 04/29] slub: always get the cache from its page in kfree

2012-05-11 Thread Christoph Lameter
On Fri, 11 May 2012, Glauber Costa wrote: I see that. But there are other subsystems from slab allocators that do the same. There are also objects that may be used by multiple processes. This is also true for normal user pages. And then, we do what memcg does: first one to touch, gets

[Devel] Re: [PATCH v2 3/5] change number_of_cpusets to an atomic

2012-04-24 Thread Christoph Lameter
On Mon, 23 Apr 2012, Glauber Costa wrote: This will allow us to call destroy() without holding the cgroup_mutex(). Other important updates inside update_flags() are protected by the callback_mutex. We could protect this variable with the callback_mutex as well, as suggested by Li Zefan, but

[Devel] Re: [PATCH v2 3/5] change number_of_cpusets to an atomic

2012-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2012, Glauber Costa wrote: Would this not also be a good case to introduce static branching? number_of_cpusets is used to avoid going through unnecessary processing should there be no cpusets in use. static branches come with a set of problems themselves, so I usually

[Devel] Re: [PATCH v2 3/5] change number_of_cpusets to an atomic

2012-04-24 Thread Christoph Lameter
On Tue, 24 Apr 2012, Glauber Costa wrote: It doesn't seem to be the case here. How did you figure that? number_of_cpusets was introduced exactly because the functions are used in places where we do not pay the cost of calling __cpuset_node_allowed_soft/hardwall. Have a look at these.

[Devel] Re: [PATCH] slub: don't create a copy of the name string in kmem_cache_create

2012-04-16 Thread Christoph Lameter
On Fri, 13 Apr 2012, Glauber Costa wrote: When creating a cache, slub keeps a copy of the cache name through strdup. The slab, however, doesn't do that. This means that everyone registering caches has to keep a copy themselves anyway, since code needs to work on all allocators. Having slab

[Devel] Re: [PATCH] cgroup: Avoid a memset by using vzalloc

2011-02-24 Thread Christoph Lameter
On Mon, 1 Nov 2010, Jesper Juhl wrote: On Sun, 31 Oct 2010, Balbir Singh wrote: There are so many places that need vzalloc. Thanks, Jesper. Could we avoid this painful exercise with a semantic patch? ___ Containers mailing list

[Devel] Re: [PATCH] cgroup: Avoid a memset by using vzalloc

2011-02-24 Thread Christoph Lameter
On Wed, 3 Nov 2010, jovi zhang wrote: On Wed, Nov 3, 2010 at 10:38 PM, Christoph Lameter c...@linux.com wrote: Could we avoid this painful exercise with a semantic patch? Can we make a grep script to walk all files to find vzalloc usage like this? No need to send patch mail one by one like

[Devel] Re: [PATCH] cgroup: Avoid a memset by using vzalloc

2011-02-24 Thread Christoph Lameter
On Wed, 3 Nov 2010, Joe Perches wrote: On Wed, 2010-11-03 at 23:20 +0800, jovi zhang wrote: On Wed, Nov 3, 2010 at 10:38 PM, Christoph Lameter c...@linux.com wrote: On Mon, 1 Nov 2010, Jesper Juhl wrote: On Sun, 31 Oct 2010, Balbir Singh wrote: There are so many placed need

[Devel] Re: [patch 0/7] cpuset writeback throttling

2008-11-07 Thread Christoph Lameter
On Tue, 4 Nov 2008, Andrew Morton wrote: What are the alternatives here? What do we need to do to make throttling a per-memcg thing? Add statistics to the memcg lru and then you need some kind of sets of memcgs that are represented by bitmaps or so attached to an inode. The patchset is

[Devel] Re: [patch 0/7] cpuset writeback throttling

2008-11-07 Thread Christoph Lameter
On Tue, 4 Nov 2008, Andrew Morton wrote: To fix this with a memcg-based throttling, the operator would need to be able to create memcg's which have pages only from particular nodes. (That's a bit indirect relative to what they want to do, but is presumably workable). The system would need to

[Devel] Re: [patch 0/7] cpuset writeback throttling

2008-11-07 Thread Christoph Lameter
On Tue, 4 Nov 2008, Andrew Morton wrote: In a memcg implementation what we would implement is throttle page-dirtying tasks in this memcg when the memcg's dirty memory reaches 40% of its total. Right that is similar to what this patch does for cpusets. A memcg implementation would need to

[Devel] Re: [patch 0/7] cpuset writeback throttling

2008-11-07 Thread Christoph Lameter
On Tue, 4 Nov 2008, Andrew Morton wrote: That is one aspect. When performing writeback then we need to figure out which inodes have dirty pages in the memcg and we need to start writeout on those inodes and not on others that have their dirty pages elsewhere. There are two components of this

[Devel] Re: [patch 0/7] cpuset writeback throttling

2008-11-07 Thread Christoph Lameter
On Wed, 5 Nov 2008, Andrew Morton wrote: That means running reclaim. But we are only interested in getting rid of dirty pages. Plus the filesystem guys have repeatedly pointed out that page sized I/O to random places in a file is not a good thing to do. There was actually talk of stopping

[Devel] Re: [patch 0/7] cpuset writeback throttling

2008-11-07 Thread Christoph Lameter
On Wed, 5 Nov 2008, Andrew Morton wrote: Doable. lru->page->mapping->host is a good start. The block layer has a list of inodes that are dirty. From that we need to select ones that will improve the situation from the cpuset/memcg. How does the LRU come into this? In the simplest case,

[Devel] Re: [patch 0/7] cpuset writeback throttling

2008-11-07 Thread Christoph Lameter
On Wed, 5 Nov 2008, Andrew Morton wrote: See, here's my problem: we have a pile of new code which fixes some problem. But the problem seems to be fairly small - it only affects a small number of sophisticated users and they already have workarounds in place. Well yes... Great situation with

[Devel] Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter

2007-11-28 Thread Christoph Lameter
On Thu, 29 Nov 2007, KAMEZAWA Hiroyuki wrote: ok, just use N_HIGH_MEMORY here and add a comment that hotplugging support is not there yet. Christoph-san, Lee-san, could you confirm the following? - when SLAB is used, kmalloc_node() against an offline node will succeed. - when SLUB is used,

[Devel] Re: [PATCH 2/5] Generic notifiers for SLUB events

2007-10-01 Thread Christoph Lameter
On Mon, 1 Oct 2007, Pavel Emelyanov wrote: Should the default be on? Shouldn't it depend on KMEM? Well... I think that it should be N by default and has nothing to do with KMEM :) Thanks for noticing. Right. +#ifdef CONFIG_SLUB_NOTIFY + srcu_init_notifier_head(&slub_nb); Can

[Devel] Re: [PATCH 3/5] Switch caches notification dynamically

2007-10-01 Thread Christoph Lameter
On Mon, 1 Oct 2007, Balbir Singh wrote: Is this documented somewhere or is this interpreted from looking at the code of other file handlers? Documentation/vm/slub.txt

[Devel] Re: [PATCH 5/5] Account for the slub objects

2007-10-01 Thread Christoph Lameter
On Mon, 1 Oct 2007, Pavel Emelyanov wrote: + Quick check, slub_free_notify() and slab_alloc_notify() are called from serialized contexts, right? Yup. How is it serialized?

[Devel] Re: [PATCH 3/5] Switch caches notification dynamically

2007-09-26 Thread Christoph Lameter
On Wed, 26 Sep 2007, Pavel Emelyanov wrote: Is it necessary to mark all the existing slabs with SLAB_DEBUG? Would it Yup. Otherwise we can never receive a single event e.g. if we make alloc/free in a loop, or similar, so that new slabs simply are not created. Right but on the other

[Devel] Re: [PATCH 1/5] Add notification about some major slab events

2007-09-26 Thread Christoph Lameter
On Wed, 26 Sep 2007, Pavel Emelyanov wrote: True, but we mark the slubs as notifyable at runtime, after they are merged. However, once someone decides to make his slab notifyable from the very beginning this makes sense, thanks. This also makes sense if a device driver later creates a new

[Devel] Re: [PATCH 1/5] Add notification about some major slab events

2007-09-25 Thread Christoph Lameter
On Tue, 25 Sep 2007, Pavel Emelyanov wrote: @@ -28,6 +28,7 @@ #define SLAB_DESTROY_BY_RCU 0x00080000UL /* Defer freeing slabs to RCU */ #define SLAB_MEM_SPREAD 0x00100000UL /* Spread some memory over cpuset */ #define SLAB_TRACE 0x00200000UL /* Trace

[Devel] Re: [PATCH 3/5] Switch caches notification dynamically

2007-09-25 Thread Christoph Lameter
On Tue, 25 Sep 2007, Pavel Emelyanov wrote: + for_each_node_state(n, N_NORMAL_MEMORY) { + struct kmem_cache_node *node; + struct page *pg; + + node = get_node(s, n); +

[Devel] Re: [PATCH 1/4] Add notification about some major slab events

2007-09-24 Thread Christoph Lameter
On Fri, 21 Sep 2007, Pavel Emelyanov wrote: @@ -1486,7 +1597,7 @@ load_freelist: object = c->page->freelist; if (unlikely(!object)) goto another_slab; - if (unlikely(SlabDebug(c->page))) + if (unlikely(SlabDebug(c->page)) || (s->flags & SLAB_NOTIFY))

[Devel] Re: [PATCH 3/5] Switch caches notification dynamically

2007-09-24 Thread Christoph Lameter
On Fri, 21 Sep 2007, Pavel Emelyanov wrote: The /sys/slab/name/cache_notify attribute controls whether the cache name is to be accounted or not. For the reasons described before, kmalloc caches cannot be turned on. It looks like the patch forbids turning the notification off? On is

[Devel] Re: [PATCH 1/4] Add notification about some major slab events

2007-09-19 Thread Christoph Lameter
On Wed, 19 Sep 2007, Pavel Emelyanov wrote: so the fast path is still fast, and we have two ways: 1. we keep the checks on the fastpath and have 0 overhead for unaccounted caches and some overhead for accounted; This stuff accumulates. I have a bad experience from SLAB. We are counting

[Devel] Re: [PATCH 2/4] Switch caches notification dynamically

2007-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2007, Pavel Emelyanov wrote: I meant that we cannot find the pages that are full of objects to notify others that these ones are no longer tracked. I know that we can do it by tracking these pages with some performance penalty, but is it worth having the ability to turn

[Devel] Re: [PATCH 1/4] Add notification about some major slab events

2007-09-17 Thread Christoph Lameter
On Mon, 17 Sep 2007, Pavel Emelyanov wrote: @@ -1036,7 +1121,10 @@ static struct page *allocate_slab(struct page = alloc_pages_node(node, flags, s->order); if (!page) - return NULL; + goto out; + + if (slub_newpage_notify(s, page, flags)

[Devel] Re: [PATCH 0/4] Kernel memory accounting container (v3)

2007-09-17 Thread Christoph Lameter
On Mon, 17 Sep 2007, Pavel Emelyanov wrote: As I have already told kmalloc caches cannot be accounted easily so turning the accounting on for them will fail with -EINVAL. Turning the accounting off is possible only if the cache has no objects. This is done so because turning accounting off

[Devel] Re: [PATCH 2/4] Switch caches notification dynamically

2007-09-17 Thread Christoph Lameter
On Mon, 17 Sep 2007, Pavel Emelyanov wrote: If we turn accounting on on some cache and this cache is merged with some other, this other will be notified as well. We can solve this by disabling of cache merging, but maybe we can do it some other way. You could write a 1 to slub_nomerge

[Devel] Re: [PATCH 2/4] Switch caches notification dynamically

2007-09-17 Thread Christoph Lameter
On Mon, 17 Sep 2007, Pavel Emelyanov wrote: struct kmem_cache kmalloc_caches[PAGE_SHIFT] __cacheline_aligned; EXPORT_SYMBOL(kmalloc_caches); +static inline int is_kmalloc_cache(struct kmem_cache *s) +{ + int km_idx; + + km_idx = s - kmalloc_caches; + return km_idx >= 0
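The `is_kmalloc_cache()` helper in the snippet above works by pointer arithmetic: a cache is a kmalloc cache exactly when its address lies inside the static `kmalloc_caches[]` array. A self-contained userland illustration of the same technique (not the kernel code; `NR_KMALLOC` stands in for the `PAGE_SHIFT` bound in the patch, and the struct is a stub):

```c
#include <stdbool.h>
#include <stddef.h>

/* Stub standing in for the kernel's struct kmem_cache. */
struct kmem_cache { int dummy; };

#define NR_KMALLOC 12	/* stands in for PAGE_SHIFT in the snippet */
static struct kmem_cache kmalloc_caches[NR_KMALLOC];

/* A cache is a "kmalloc cache" iff its address falls inside the
 * static array: subtracting the array base yields its index, which
 * must be in [0, NR_KMALLOC). */
static bool is_kmalloc_cache(const struct kmem_cache *s)
{
	ptrdiff_t idx = s - kmalloc_caches;
	return idx >= 0 && idx < NR_KMALLOC;
}
```

Note this only works because `kmalloc_caches[]` is a statically sized array of the caches themselves (not of pointers), so membership reduces to a range check on the index.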
