Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Tejun Heo
On Fri, Sep 21, 2012 at 11:14:45PM +0300, Pekka Enberg wrote:
> On Fri, Sep 21, 2012 at 11:07 PM, Tejun Heo  wrote:
> > Not necessarily disagreeing, but I don't think it's helpful to set the
> > bar impossibly high.  Even static_key doesn't have "exactly *zero*"
> > impact.  Let's stick to as minimal as possible when not in use and
> > reasonable in use.
> 
> For !CONFIG_MEMCG_KMEM, it should be exactly zero. No need to play
> games with static_key.

Ah, misread, though you were talking about MEMCG_KMEM && not in use.
Yeap, if turned off in Kconfig, it shouldn't have any performance
overhead.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Pekka Enberg
On Fri, Sep 21, 2012 at 11:07 PM, Tejun Heo  wrote:
> Not necessarily disagreeing, but I don't think it's helpful to set the
> bar impossibly high.  Even static_key doesn't have "exactly *zero*"
> impact.  Let's stick to as minimal as possible when not in use and
> reasonable in use.

For !CONFIG_MEMCG_KMEM, it should be exactly zero. No need to play
games with static_key.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Tejun Heo
Hello,

On Fri, Sep 21, 2012 at 12:41:52PM +0300, Pekka Enberg wrote:
> > I am already using static keys extensively in this patchset, and that is
> > how I intend to handle this particular case.
> 
> Cool.
> 
> The key point here is that !CONFIG_MEMCG_KMEM should have exactly *zero* 
> performance impact and CONFIG_MEMCG_KMEM disabled at runtime should have 
> absolute minimal impact.

Not necessarily disagreeing, but I don't think it's helpful to set the
bar impossibly high.  Even static_key doesn't have "exactly *zero*"
impact.  Let's stick to as minimal as possible when not in use and
reasonable in use.

And, yeah, this one can be easily solved by using static_key.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Pekka Enberg
On Fri, 21 Sep 2012, Glauber Costa wrote:
> > We should assume that most distributions enable CONFIG_MEMCG_KMEM,
> > right? Therfore, any performance impact should be dependent on whether
> > or not kmem memcg is *enabled* at runtime or not.
> > 
> > Can we use the "static key" thingy introduced by tracing folks for this?
>
> Yes.
> 
> I am already using static keys extensively in this patchset, and that is
> how I intend to handle this particular case.

Cool.

The key point here is that !CONFIG_MEMCG_KMEM should have exactly *zero* 
performance impact and CONFIG_MEMCG_KMEM disabled at runtime should have 
absolute minimal impact.

Pekka
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Glauber Costa
On 09/21/2012 01:33 PM, Pekka Enberg wrote:
> On Wed, Sep 19, 2012 at 10:42 AM, Glauber Costa  wrote:
 index f2d760c..18de3f6 100644
 --- a/mm/slab.c
 +++ b/mm/slab.c
 @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
   * Free an object which was previously allocated from this
   * cache.
   */
 -void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 +void kmem_cache_free(struct kmem_cache *s, void *objp)
  {
  unsigned long flags;
 +struct kmem_cache *cachep = virt_to_cache(objp);
 +
 +VM_BUG_ON(!slab_equal_or_parent(cachep, s));
>>>
>>> This is an extremely hot path of the kernel and you are adding significant
>>> processing. Check how the benchmarks are influenced by this change.
>>> virt_to_cache can be a bit expensive.
>>
>> Would it be enough for you to have a separate code path for
>> !CONFIG_MEMCG_KMEM?
>>
>> I don't really see another way to do it, aside from deriving the cache
>> from the object in our case. I am open to suggestions if you do.
> 
> We should assume that most distributions enable CONFIG_MEMCG_KMEM,
> right? Therfore, any performance impact should be dependent on whether
> or not kmem memcg is *enabled* at runtime or not.
> 
> Can we use the "static key" thingy introduced by tracing folks for this?
> 
Yes.

I am already using static keys extensively in this patchset, and that is
how I intend to handle this particular case.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Pekka Enberg
On Wed, Sep 19, 2012 at 10:42 AM, Glauber Costa  wrote:
>>> index f2d760c..18de3f6 100644
>>> --- a/mm/slab.c
>>> +++ b/mm/slab.c
>>> @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
>>>   * Free an object which was previously allocated from this
>>>   * cache.
>>>   */
>>> -void kmem_cache_free(struct kmem_cache *cachep, void *objp)
>>> +void kmem_cache_free(struct kmem_cache *s, void *objp)
>>>  {
>>>  unsigned long flags;
>>> +struct kmem_cache *cachep = virt_to_cache(objp);
>>> +
>>> +VM_BUG_ON(!slab_equal_or_parent(cachep, s));
>>
>> This is an extremely hot path of the kernel and you are adding significant
>> processing. Check how the benchmarks are influenced by this change.
>> virt_to_cache can be a bit expensive.
>
> Would it be enough for you to have a separate code path for
> !CONFIG_MEMCG_KMEM?
>
> I don't really see another way to do it, aside from deriving the cache
> from the object in our case. I am open to suggestions if you do.

We should assume that most distributions enable CONFIG_MEMCG_KMEM,
right? Therfore, any performance impact should be dependent on whether
or not kmem memcg is *enabled* at runtime or not.

Can we use the "static key" thingy introduced by tracing folks for this?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Pekka Enberg
On Wed, Sep 19, 2012 at 10:42 AM, Glauber Costa glom...@parallels.com wrote:
 index f2d760c..18de3f6 100644
 --- a/mm/slab.c
 +++ b/mm/slab.c
 @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
   * Free an object which was previously allocated from this
   * cache.
   */
 -void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 +void kmem_cache_free(struct kmem_cache *s, void *objp)
  {
  unsigned long flags;
 +struct kmem_cache *cachep = virt_to_cache(objp);
 +
 +VM_BUG_ON(!slab_equal_or_parent(cachep, s));

 This is an extremely hot path of the kernel and you are adding significant
 processing. Check how the benchmarks are influenced by this change.
 virt_to_cache can be a bit expensive.

 Would it be enough for you to have a separate code path for
 !CONFIG_MEMCG_KMEM?

 I don't really see another way to do it, aside from deriving the cache
 from the object in our case. I am open to suggestions if you do.

We should assume that most distributions enable CONFIG_MEMCG_KMEM,
right? Therfore, any performance impact should be dependent on whether
or not kmem memcg is *enabled* at runtime or not.

Can we use the static key thingy introduced by tracing folks for this?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Glauber Costa
On 09/21/2012 01:33 PM, Pekka Enberg wrote:
 On Wed, Sep 19, 2012 at 10:42 AM, Glauber Costa glom...@parallels.com wrote:
 index f2d760c..18de3f6 100644
 --- a/mm/slab.c
 +++ b/mm/slab.c
 @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
   * Free an object which was previously allocated from this
   * cache.
   */
 -void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 +void kmem_cache_free(struct kmem_cache *s, void *objp)
  {
  unsigned long flags;
 +struct kmem_cache *cachep = virt_to_cache(objp);
 +
 +VM_BUG_ON(!slab_equal_or_parent(cachep, s));

 This is an extremely hot path of the kernel and you are adding significant
 processing. Check how the benchmarks are influenced by this change.
 virt_to_cache can be a bit expensive.

 Would it be enough for you to have a separate code path for
 !CONFIG_MEMCG_KMEM?

 I don't really see another way to do it, aside from deriving the cache
 from the object in our case. I am open to suggestions if you do.
 
 We should assume that most distributions enable CONFIG_MEMCG_KMEM,
 right? Therfore, any performance impact should be dependent on whether
 or not kmem memcg is *enabled* at runtime or not.
 
 Can we use the static key thingy introduced by tracing folks for this?
 
Yes.

I am already using static keys extensively in this patchset, and that is
how I intend to handle this particular case.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Pekka Enberg
On Fri, 21 Sep 2012, Glauber Costa wrote:
  We should assume that most distributions enable CONFIG_MEMCG_KMEM,
  right? Therfore, any performance impact should be dependent on whether
  or not kmem memcg is *enabled* at runtime or not.
  
  Can we use the static key thingy introduced by tracing folks for this?

 Yes.
 
 I am already using static keys extensively in this patchset, and that is
 how I intend to handle this particular case.

Cool.

The key point here is that !CONFIG_MEMCG_KMEM should have exactly *zero* 
performance impact and CONFIG_MEMCG_KMEM disabled at runtime should have 
absolute minimal impact.

Pekka
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Tejun Heo
Hello,

On Fri, Sep 21, 2012 at 12:41:52PM +0300, Pekka Enberg wrote:
  I am already using static keys extensively in this patchset, and that is
  how I intend to handle this particular case.
 
 Cool.
 
 The key point here is that !CONFIG_MEMCG_KMEM should have exactly *zero* 
 performance impact and CONFIG_MEMCG_KMEM disabled at runtime should have 
 absolute minimal impact.

Not necessarily disagreeing, but I don't think it's helpful to set the
bar impossibly high.  Even static_key doesn't have exactly *zero*
impact.  Let's stick to as minimal as possible when not in use and
reasonable in use.

And, yeah, this one can be easily solved by using static_key.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Pekka Enberg
On Fri, Sep 21, 2012 at 11:07 PM, Tejun Heo t...@kernel.org wrote:
 Not necessarily disagreeing, but I don't think it's helpful to set the
 bar impossibly high.  Even static_key doesn't have exactly *zero*
 impact.  Let's stick to as minimal as possible when not in use and
 reasonable in use.

For !CONFIG_MEMCG_KMEM, it should be exactly zero. No need to play
games with static_key.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-21 Thread Tejun Heo
On Fri, Sep 21, 2012 at 11:14:45PM +0300, Pekka Enberg wrote:
 On Fri, Sep 21, 2012 at 11:07 PM, Tejun Heo t...@kernel.org wrote:
  Not necessarily disagreeing, but I don't think it's helpful to set the
  bar impossibly high.  Even static_key doesn't have exactly *zero*
  impact.  Let's stick to as minimal as possible when not in use and
  reasonable in use.
 
 For !CONFIG_MEMCG_KMEM, it should be exactly zero. No need to play
 games with static_key.

Ah, misread, though you were talking about MEMCG_KMEM  not in use.
Yeap, if turned off in Kconfig, it shouldn't have any performance
overhead.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-19 Thread Christoph Lameter
On Wed, 19 Sep 2012, Glauber Costa wrote:

> > This is an extremely hot path of the kernel and you are adding significant
> > processing. Check how the benchmarks are influenced by this change.
> > virt_to_cache can be a bit expensive.
> Would it be enough for you to have a separate code path for
> !CONFIG_MEMCG_KMEM?

Yes, at least add an #ifdef around this.

> I don't really see another way to do it, aside from deriving the cache
> from the object in our case. I am open to suggestions if you do.

Rethink the whole memcg approach and find some other way to do it? This
whole scheme is very intrusive and is likely to increase instability in
the VM given the explosion of lru lists, duplication of slab caches and
significantly more complex processing throughout the VM.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-19 Thread Glauber Costa
On 09/18/2012 07:28 PM, Christoph Lameter wrote:
> On Tue, 18 Sep 2012, Glauber Costa wrote:
> 
>> index f2d760c..18de3f6 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
>>   * Free an object which was previously allocated from this
>>   * cache.
>>   */
>> -void kmem_cache_free(struct kmem_cache *cachep, void *objp)
>> +void kmem_cache_free(struct kmem_cache *s, void *objp)
>>  {
>>  unsigned long flags;
>> +struct kmem_cache *cachep = virt_to_cache(objp);
>> +
>> +VM_BUG_ON(!slab_equal_or_parent(cachep, s));
>>
> 
> This is an extremely hot path of the kernel and you are adding significant
> processing. Check how the benchmarks are influenced by this change.
> virt_to_cache can be a bit expensive.
> 

Would it be enough for you to have a separate code path for
!CONFIG_MEMCG_KMEM?

I don't really see another way to do it, aside from deriving the cache
from the object in our case. I am open to suggestions if you do.

>> diff --git a/mm/slub.c b/mm/slub.c
>> index 4778548..a045dfc 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2604,7 +2604,9 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
>>
>>  page = virt_to_head_page(x);
>>
>> -slab_free(s, page, x, _RET_IP_);
>> +VM_BUG_ON(!slab_equal_or_parent(page->slab, s));
>> +
>> +slab_free(page->slab, page, x, _RET_IP_);
>>
> 
> Less of a problem here but you are eroding one advantage that slab has had
> in the past over slub in terms of freeing objects.
> 
likewise.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-19 Thread Glauber Costa
On 09/18/2012 07:28 PM, Christoph Lameter wrote:
 On Tue, 18 Sep 2012, Glauber Costa wrote:
 
 index f2d760c..18de3f6 100644
 --- a/mm/slab.c
 +++ b/mm/slab.c
 @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
   * Free an object which was previously allocated from this
   * cache.
   */
 -void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 +void kmem_cache_free(struct kmem_cache *s, void *objp)
  {
  unsigned long flags;
 +struct kmem_cache *cachep = virt_to_cache(objp);
 +
 +VM_BUG_ON(!slab_equal_or_parent(cachep, s));

 
 This is an extremely hot path of the kernel and you are adding significant
 processing. Check how the benchmarks are influenced by this change.
 virt_to_cache can be a bit expensive.
 

Would it be enough for you to have a separate code path for
!CONFIG_MEMCG_KMEM?

I don't really see another way to do it, aside from deriving the cache
from the object in our case. I am open to suggestions if you do.

 diff --git a/mm/slub.c b/mm/slub.c
 index 4778548..a045dfc 100644
 --- a/mm/slub.c
 +++ b/mm/slub.c
 @@ -2604,7 +2604,9 @@ void kmem_cache_free(struct kmem_cache *s, void *x)

  page = virt_to_head_page(x);

 -slab_free(s, page, x, _RET_IP_);
 +VM_BUG_ON(!slab_equal_or_parent(page-slab, s));
 +
 +slab_free(page-slab, page, x, _RET_IP_);

 
 Less of a problem here but you are eroding one advantage that slab has had
 in the past over slub in terms of freeing objects.
 
likewise.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-19 Thread Christoph Lameter
On Wed, 19 Sep 2012, Glauber Costa wrote:

  This is an extremely hot path of the kernel and you are adding significant
  processing. Check how the benchmarks are influenced by this change.
  virt_to_cache can be a bit expensive.
 Would it be enough for you to have a separate code path for
 !CONFIG_MEMCG_KMEM?

Yes, at least add an #ifdef around this.

 I don't really see another way to do it, aside from deriving the cache
 from the object in our case. I am open to suggestions if you do.

Rethink the whole memcg approach and find some other way to do it? This
whole scheme is very intrusive and is likely to increase instability in
the VM given the explosion of lru lists, duplication of slab caches and
significantly more complex processing throughout the VM.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote:

> index f2d760c..18de3f6 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
>   * Free an object which was previously allocated from this
>   * cache.
>   */
> -void kmem_cache_free(struct kmem_cache *cachep, void *objp)
> +void kmem_cache_free(struct kmem_cache *s, void *objp)
>  {
>   unsigned long flags;
> + struct kmem_cache *cachep = virt_to_cache(objp);
> +
> + VM_BUG_ON(!slab_equal_or_parent(cachep, s));
>

This is an extremely hot path of the kernel and you are adding significant
processing. Check how the benchmarks are influenced by this change.
virt_to_cache can be a bit expensive.

> diff --git a/mm/slub.c b/mm/slub.c
> index 4778548..a045dfc 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2604,7 +2604,9 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
>
>   page = virt_to_head_page(x);
>
> - slab_free(s, page, x, _RET_IP_);
> + VM_BUG_ON(!slab_equal_or_parent(page->slab, s));
> +
> + slab_free(page->slab, page, x, _RET_IP_);
>

Less of a problem here but you are eroding one advantage that slab has had
in the past over slub in terms of freeing objects.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-18 Thread Glauber Costa
struct page already have this information. If we start chaining
caches, this information will always be more trustworthy than
whatever is passed into the function

A parent pointer is added to the slub structure, so we can make sure
the freeing comes from either the right slab, or from its rightful
parent.

[ v3: added parent testing with VM_BUG_ON ]

Signed-off-by: Glauber Costa 
CC: Christoph Lameter 
CC: Pekka Enberg 
---
 mm/slab.c |  5 -
 mm/slab.h | 11 +++
 mm/slub.c |  4 +++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index f2d760c..18de3f6 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
  * Free an object which was previously allocated from this
  * cache.
  */
-void kmem_cache_free(struct kmem_cache *cachep, void *objp)
+void kmem_cache_free(struct kmem_cache *s, void *objp)
 {
unsigned long flags;
+   struct kmem_cache *cachep = virt_to_cache(objp);
+
+   VM_BUG_ON(!slab_equal_or_parent(cachep, s));
 
local_irq_save(flags);
debug_check_no_locks_freed(objp, cachep->object_size);
diff --git a/mm/slab.h b/mm/slab.h
index 6f2a34d..f2501ab 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -60,11 +60,22 @@ static inline bool cache_match_memcg(struct kmem_cache 
*cachep,
return cachep->memcg_params.memcg == memcg;
 }
 
+static inline bool slab_equal_or_parent(struct kmem_cache *s,
+   struct kmem_cache *p)
+{
+   return (p == s) || (p == s->memcg_params.parent);
+}
 #else
 static inline bool cache_match_memcg(struct kmem_cache *cachep,
 struct mem_cgroup *memcg)
 {
return true;
 }
+
+static inline bool slab_equal_or_parent(struct kmem_cache *s,
+   struct kmem_cache *p)
+{
+   return true;
+}
 #endif
 #endif
diff --git a/mm/slub.c b/mm/slub.c
index 4778548..a045dfc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2604,7 +2604,9 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
 
page = virt_to_head_page(x);
 
-   slab_free(s, page, x, _RET_IP_);
+   VM_BUG_ON(!slab_equal_or_parent(page->slab, s));
+
+   slab_free(page->slab, page, x, _RET_IP_);
 
trace_kmem_cache_free(_RET_IP_, x);
 }
-- 
1.7.11.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-18 Thread Glauber Costa
struct page already have this information. If we start chaining
caches, this information will always be more trustworthy than
whatever is passed into the function

A parent pointer is added to the slub structure, so we can make sure
the freeing comes from either the right slab, or from its rightful
parent.

[ v3: added parent testing with VM_BUG_ON ]

Signed-off-by: Glauber Costa glom...@parallels.com
CC: Christoph Lameter c...@linux.com
CC: Pekka Enberg penb...@cs.helsinki.fi
---
 mm/slab.c |  5 -
 mm/slab.h | 11 +++
 mm/slub.c |  4 +++-
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index f2d760c..18de3f6 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
  * Free an object which was previously allocated from this
  * cache.
  */
-void kmem_cache_free(struct kmem_cache *cachep, void *objp)
+void kmem_cache_free(struct kmem_cache *s, void *objp)
 {
unsigned long flags;
+   struct kmem_cache *cachep = virt_to_cache(objp);
+
+   VM_BUG_ON(!slab_equal_or_parent(cachep, s));
 
local_irq_save(flags);
debug_check_no_locks_freed(objp, cachep-object_size);
diff --git a/mm/slab.h b/mm/slab.h
index 6f2a34d..f2501ab 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -60,11 +60,22 @@ static inline bool cache_match_memcg(struct kmem_cache 
*cachep,
return cachep-memcg_params.memcg == memcg;
 }
 
+static inline bool slab_equal_or_parent(struct kmem_cache *s,
+   struct kmem_cache *p)
+{
+   return (p == s) || (p == s-memcg_params.parent);
+}
 #else
 static inline bool cache_match_memcg(struct kmem_cache *cachep,
 struct mem_cgroup *memcg)
 {
return true;
 }
+
+static inline bool slab_equal_or_parent(struct kmem_cache *s,
+   struct kmem_cache *p)
+{
+   return true;
+}
 #endif
 #endif
diff --git a/mm/slub.c b/mm/slub.c
index 4778548..a045dfc 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2604,7 +2604,9 @@ void kmem_cache_free(struct kmem_cache *s, void *x)
 
page = virt_to_head_page(x);
 
-   slab_free(s, page, x, _RET_IP_);
+   VM_BUG_ON(!slab_equal_or_parent(page-slab, s));
+
+   slab_free(page-slab, page, x, _RET_IP_);
 
trace_kmem_cache_free(_RET_IP_, x);
 }
-- 
1.7.11.4

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 09/16] sl[au]b: always get the cache from its page in kfree

2012-09-18 Thread Christoph Lameter
On Tue, 18 Sep 2012, Glauber Costa wrote:

 index f2d760c..18de3f6 100644
 --- a/mm/slab.c
 +++ b/mm/slab.c
 @@ -3938,9 +3938,12 @@ EXPORT_SYMBOL(__kmalloc);
   * Free an object which was previously allocated from this
   * cache.
   */
 -void kmem_cache_free(struct kmem_cache *cachep, void *objp)
 +void kmem_cache_free(struct kmem_cache *s, void *objp)
  {
   unsigned long flags;
 + struct kmem_cache *cachep = virt_to_cache(objp);
 +
 + VM_BUG_ON(!slab_equal_or_parent(cachep, s));


This is an extremely hot path of the kernel and you are adding significant
processing. Check how the benchmarks are influenced by this change.
virt_to_cache can be a bit expensive.

 diff --git a/mm/slub.c b/mm/slub.c
 index 4778548..a045dfc 100644
 --- a/mm/slub.c
 +++ b/mm/slub.c
 @@ -2604,7 +2604,9 @@ void kmem_cache_free(struct kmem_cache *s, void *x)

   page = virt_to_head_page(x);

 - slab_free(s, page, x, _RET_IP_);
 + VM_BUG_ON(!slab_equal_or_parent(page-slab, s));
 +
 + slab_free(page-slab, page, x, _RET_IP_);


Less of a problem here but you are eroding one advantage that slab has had
in the past over slub in terms of freeing objects.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/