Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-27 Thread Thomas Garnier
On Thu, Oct 27, 2016 at 12:25 AM, Michal Hocko  wrote:
> The patch is marked for memcg but I do not see any direct relation.
> I am not familiar with this code enough probably but if this really is
> memcg kmem related, please do not forget to CC Vladimir
>

Yes, the next iteration should be closer to memcg. I will CC Vladimir.

Thanks for the heads-up.

> On Wed 26-10-16 10:41:28, Thomas Garnier wrote:
>> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
>> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
>> CFLGS_OBJFREELIST_SLAB.
>>
>> The original kmem_cache is created early making OFF_SLAB not possible.
>> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
>> is enabled it will try to enable it first under certain conditions.
>> Given kmem_cache(sys) reuses the original flag, you can have both flags
>> at the same time resulting in allocation failures and odd behaviors.
>>
>> The proposed fix removes these flags by default at the entrance of
>> __kmem_cache_create. This way the function will define which way the
>> freelist should be handled at this stage for the new cache.
>>
>> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, 
>> OBJFREELIST_SLAB")
>> Signed-off-by: Thomas Garnier 
>> Signed-off-by: Greg Thelen 
>> ---
>> Based on next-20161025
>> ---
>>  mm/slab.c | 8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 3c83c29..efe280a 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -2027,6 +2027,14 @@ __kmem_cache_create (struct kmem_cache *cachep, 
>> unsigned long flags)
>>   int err;
>>   size_t size = cachep->size;
>>
>> + /*
>> +  * memcg re-creates caches with the flags of the originals. Remove
>> +  * the freelist related flags to ensure they are re-defined at this
>> +  * stage. Prevent having both flags on edge cases like with pagealloc
>> +  * if the original cache was created too early to be OFF_SLAB.
>> +  */
>> + flags &= ~(CFLGS_OBJFREELIST_SLAB|CFLGS_OFF_SLAB);
>> +
>>  #if DEBUG
>>  #if FORCED_DEBUG
>>   /*
>> --
>> 2.8.0.rc3.226.g39d4020
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majord...@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
>
> --
> Michal Hocko
> SUSE Labs



-- 
Thomas


Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-27 Thread Thomas Garnier
On Thu, Oct 27, 2016 at 12:25 AM, Michal Hocko  wrote:
> The patch is marked for memcg but I do not see any direct relation.
> I am not familiar with this code enough probably but if this really is
> memcg kmem related, please do not forget to CC Vladimir
>

Yes, the next iteration should be closer to memcg. I will CC Vladimir.

Thanks for the heads-up.

> On Wed 26-10-16 10:41:28, Thomas Garnier wrote:
>> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
>> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
>> CFLGS_OBJFREELIST_SLAB.
>>
>> The original kmem_cache is created early making OFF_SLAB not possible.
>> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
>> is enabled it will try to enable it first under certain conditions.
>> Given kmem_cache(sys) reuses the original flag, you can have both flags
>> at the same time resulting in allocation failures and odd behaviors.
>>
>> The proposed fix removes these flags by default at the entrance of
>> __kmem_cache_create. This way the function will define which way the
>> freelist should be handled at this stage for the new cache.
>>
>> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, 
>> OBJFREELIST_SLAB")
>> Signed-off-by: Thomas Garnier 
>> Signed-off-by: Greg Thelen 
>> ---
>> Based on next-20161025
>> ---
>>  mm/slab.c | 8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 3c83c29..efe280a 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -2027,6 +2027,14 @@ __kmem_cache_create (struct kmem_cache *cachep, 
>> unsigned long flags)
>>   int err;
>>   size_t size = cachep->size;
>>
>> + /*
>> +  * memcg re-creates caches with the flags of the originals. Remove
>> +  * the freelist related flags to ensure they are re-defined at this
>> +  * stage. Prevent having both flags on edge cases like with pagealloc
>> +  * if the original cache was created too early to be OFF_SLAB.
>> +  */
>> + flags &= ~(CFLGS_OBJFREELIST_SLAB|CFLGS_OFF_SLAB);
>> +
>>  #if DEBUG
>>  #if FORCED_DEBUG
>>   /*
>> --
>> 2.8.0.rc3.226.g39d4020
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majord...@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
>
> --
> Michal Hocko
> SUSE Labs



-- 
Thomas


Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-27 Thread Michal Hocko
The patch is marked for memcg but I do not see any direct relation.
I am not familiar with this code enough probably but if this really is
memcg kmem related, please do not forget to CC Vladimir

On Wed 26-10-16 10:41:28, Thomas Garnier wrote:
> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
> CFLGS_OBJFREELIST_SLAB.
> 
> The original kmem_cache is created early making OFF_SLAB not possible.
> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
> is enabled it will try to enable it first under certain conditions.
> Given kmem_cache(sys) reuses the original flag, you can have both flags
> at the same time resulting in allocation failures and odd behaviors.
> 
> The proposed fix removes these flags by default at the entrance of
> __kmem_cache_create. This way the function will define which way the
> freelist should be handled at this stage for the new cache.
> 
> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, 
> OBJFREELIST_SLAB")
> Signed-off-by: Thomas Garnier 
> Signed-off-by: Greg Thelen 
> ---
> Based on next-20161025
> ---
>  mm/slab.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 3c83c29..efe280a 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2027,6 +2027,14 @@ __kmem_cache_create (struct kmem_cache *cachep, 
> unsigned long flags)
>   int err;
>   size_t size = cachep->size;
>  
> + /*
> +  * memcg re-creates caches with the flags of the originals. Remove
> +  * the freelist related flags to ensure they are re-defined at this
> +  * stage. Prevent having both flags on edge cases like with pagealloc
> +  * if the original cache was created too early to be OFF_SLAB.
> +  */
> + flags &= ~(CFLGS_OBJFREELIST_SLAB|CFLGS_OFF_SLAB);
> +
>  #if DEBUG
>  #if FORCED_DEBUG
>   /*
> -- 
> 2.8.0.rc3.226.g39d4020
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 

-- 
Michal Hocko
SUSE Labs


Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-27 Thread Michal Hocko
The patch is marked for memcg but I do not see any direct relation.
I am not familiar with this code enough probably but if this really is
memcg kmem related, please do not forget to CC Vladimir

On Wed 26-10-16 10:41:28, Thomas Garnier wrote:
> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
> CFLGS_OBJFREELIST_SLAB.
> 
> The original kmem_cache is created early making OFF_SLAB not possible.
> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
> is enabled it will try to enable it first under certain conditions.
> Given kmem_cache(sys) reuses the original flag, you can have both flags
> at the same time resulting in allocation failures and odd behaviors.
> 
> The proposed fix removes these flags by default at the entrance of
> __kmem_cache_create. This way the function will define which way the
> freelist should be handled at this stage for the new cache.
> 
> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, 
> OBJFREELIST_SLAB")
> Signed-off-by: Thomas Garnier 
> Signed-off-by: Greg Thelen 
> ---
> Based on next-20161025
> ---
>  mm/slab.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index 3c83c29..efe280a 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2027,6 +2027,14 @@ __kmem_cache_create (struct kmem_cache *cachep, 
> unsigned long flags)
>   int err;
>   size_t size = cachep->size;
>  
> + /*
> +  * memcg re-creates caches with the flags of the originals. Remove
> +  * the freelist related flags to ensure they are re-defined at this
> +  * stage. Prevent having both flags on edge cases like with pagealloc
> +  * if the original cache was created too early to be OFF_SLAB.
> +  */
> + flags &= ~(CFLGS_OBJFREELIST_SLAB|CFLGS_OFF_SLAB);
> +
>  #if DEBUG
>  #if FORCED_DEBUG
>   /*
> -- 
> 2.8.0.rc3.226.g39d4020
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 

-- 
Michal Hocko
SUSE Labs


Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-26 Thread Christoph Lameter
On Wed, 26 Oct 2016, Thomas Garnier wrote:

> Okay, I think for SLAB we can allow everything except the two flags
> mentioned here.

No no no. Just allow the flags already defined in include/linux/slab.h
that can be specd by subsystems when they call into the slab allocators.

> Should I deny certain flags for SLUB? I can allow everything for now.

All allocator should just allow flags defined in include/linux/slab.h be
passed to kmem_cache_create(). That is the API that all allocators need to 
support.
If someone wants to add new flags then we need to make sure that all
allocators can handle it.


The flags are (from include/linux/slab.h)
/*
 * Flags to pass to kmem_cache_create().
 */
#define SLAB_CONSISTENCY_CHECKS 0x0100UL/* DEBUG: Perform (expensive) 
checks on alloc/free */
#define SLAB_RED_ZONE   0x0400UL/* DEBUG: Red zone objs in a 
cache */
#define SLAB_POISON 0x0800UL/* DEBUG: Poison objects */
#define SLAB_HWCACHE_ALIGN  0x2000UL/* Align objs on cache lines */
#define SLAB_CACHE_DMA  0x4000UL/* Use GFP_DMA memory */
#define SLAB_STORE_USER 0x0001UL/* DEBUG: Store the last owner 
for bug hunting */
#define SLAB_PANIC  0x0004UL/* Panic if kmem_cache_create() 
fails */
#define SLAB_DESTROY_BY_RCU 0x0008UL/* Defer freeing slabs to RCU */
#define SLAB_MEM_SPREAD 0x0010UL/* Spread some memory over 
cpuset */
#define SLAB_TRACE  0x0020UL/* Trace allocations and frees 
*/
#define SLAB_DEBUG_OBJECTS  0x0040UL
#define SLAB_NOLEAKTRACE0x0080UL/* Avoid kmemleak tracing
#define SLAB_NOTRACK0x0100UL
#define SLAB_FAILSLAB   0x0200UL/* Fault injection mark */
#define SLAB_ACCOUNT0x0400UL/* Account to memcg */
#define SLAB_KASAN  0x0800UL
#define SLAB_RECLAIM_ACCOUNT0x0002UL/* Objects are 
reclaimable */



Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-26 Thread Christoph Lameter
On Wed, 26 Oct 2016, Thomas Garnier wrote:

> Okay, I think for SLAB we can allow everything except the two flags
> mentioned here.

No no no. Just allow the flags already defined in include/linux/slab.h
that can be specd by subsystems when they call into the slab allocators.

> Should I deny certain flags for SLUB? I can allow everything for now.

All allocator should just allow flags defined in include/linux/slab.h be
passed to kmem_cache_create(). That is the API that all allocators need to 
support.
If someone wants to add new flags then we need to make sure that all
allocators can handle it.


The flags are (from include/linux/slab.h)
/*
 * Flags to pass to kmem_cache_create().
 */
#define SLAB_CONSISTENCY_CHECKS 0x0100UL/* DEBUG: Perform (expensive) 
checks on alloc/free */
#define SLAB_RED_ZONE   0x0400UL/* DEBUG: Red zone objs in a 
cache */
#define SLAB_POISON 0x0800UL/* DEBUG: Poison objects */
#define SLAB_HWCACHE_ALIGN  0x2000UL/* Align objs on cache lines */
#define SLAB_CACHE_DMA  0x4000UL/* Use GFP_DMA memory */
#define SLAB_STORE_USER 0x0001UL/* DEBUG: Store the last owner 
for bug hunting */
#define SLAB_PANIC  0x0004UL/* Panic if kmem_cache_create() 
fails */
#define SLAB_DESTROY_BY_RCU 0x0008UL/* Defer freeing slabs to RCU */
#define SLAB_MEM_SPREAD 0x0010UL/* Spread some memory over 
cpuset */
#define SLAB_TRACE  0x0020UL/* Trace allocations and frees 
*/
#define SLAB_DEBUG_OBJECTS  0x0040UL
#define SLAB_NOLEAKTRACE0x0080UL/* Avoid kmemleak tracing
#define SLAB_NOTRACK0x0100UL
#define SLAB_FAILSLAB   0x0200UL/* Fault injection mark */
#define SLAB_ACCOUNT0x0400UL/* Account to memcg */
#define SLAB_KASAN  0x0800UL
#define SLAB_RECLAIM_ACCOUNT0x0002UL/* Objects are 
reclaimable */



Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-26 Thread Thomas Garnier
On Wed, Oct 26, 2016 at 12:08 PM, Christoph Lameter  wrote:
> Hmmm...Doesnt this belong into memcg_create_kmem_cache() or into
> kmem_cache_create() in mm/slab_common.h? Definitely not in an allocator
> specific function since this is an issue for all allocators.
>
> memcg_create_kmem_cache() simply assumes that it can pass flags from the
> kmem_cache structure to kmem_cache_create(). However, those flags may
> contain slab specific options.
>
> kmem_cache_create() could filter out flags that cannot be specified.

That make sense.

>
> Maybe create SLAB_FLAGS_PERMITTED in linux/mm/slab.h and mask other bits
> out in kmem_cache_create()?
>
> Slub also has internal flags and those also should not be passed to
> kmem_cache_create(). If we define the valid ones we can mask them out.
>
> The cleanest approach would be if kmem_cache_create() would reject invalid
> flags and fail and if memcg_create_kmem_cache() would mask out the invalid
> flags using SLAB_FLAGS_PERMITTED or so.

Okay, I think for SLAB we can allow everything except the two flags
mentioned here.

Should I deny certain flags for SLUB? I can allow everything for now.

>
>
>
> On Wed, 26 Oct 2016, Thomas Garnier wrote:
>
>> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
>> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
>> CFLGS_OBJFREELIST_SLAB.
>>
>> The original kmem_cache is created early making OFF_SLAB not possible.
>> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
>> is enabled it will try to enable it first under certain conditions.
>> Given kmem_cache(sys) reuses the original flag, you can have both flags
>> at the same time resulting in allocation failures and odd behaviors.
>>
>> The proposed fix removes these flags by default at the entrance of
>> __kmem_cache_create. This way the function will define which way the
>> freelist should be handled at this stage for the new cache.
>>
>> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, 
>> OBJFREELIST_SLAB")
>> Signed-off-by: Thomas Garnier 
>> Signed-off-by: Greg Thelen 
>> ---
>> Based on next-20161025
>> ---
>>  mm/slab.c | 8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 3c83c29..efe280a 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -2027,6 +2027,14 @@ __kmem_cache_create (struct kmem_cache *cachep, 
>> unsigned long flags)
>>   int err;
>>   size_t size = cachep->size;
>>
>> + /*
>> +  * memcg re-creates caches with the flags of the originals. Remove
>> +  * the freelist related flags to ensure they are re-defined at this
>> +  * stage. Prevent having both flags on edge cases like with pagealloc
>> +  * if the original cache was created too early to be OFF_SLAB.
>> +  */
>> + flags &= ~(CFLGS_OBJFREELIST_SLAB|CFLGS_OFF_SLAB);
>> +
>>  #if DEBUG
>>  #if FORCED_DEBUG
>>   /*
>>



-- 
Thomas


Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-26 Thread Thomas Garnier
On Wed, Oct 26, 2016 at 12:08 PM, Christoph Lameter  wrote:
> Hmmm...Doesnt this belong into memcg_create_kmem_cache() or into
> kmem_cache_create() in mm/slab_common.h? Definitely not in an allocator
> specific function since this is an issue for all allocators.
>
> memcg_create_kmem_cache() simply assumes that it can pass flags from the
> kmem_cache structure to kmem_cache_create(). However, those flags may
> contain slab specific options.
>
> kmem_cache_create() could filter out flags that cannot be specified.

That make sense.

>
> Maybe create SLAB_FLAGS_PERMITTED in linux/mm/slab.h and mask other bits
> out in kmem_cache_create()?
>
> Slub also has internal flags and those also should not be passed to
> kmem_cache_create(). If we define the valid ones we can mask them out.
>
> The cleanest approach would be if kmem_cache_create() would reject invalid
> flags and fail and if memcg_create_kmem_cache() would mask out the invalid
> flags using SLAB_FLAGS_PERMITTED or so.

Okay, I think for SLAB we can allow everything except the two flags
mentioned here.

Should I deny certain flags for SLUB? I can allow everything for now.

>
>
>
> On Wed, 26 Oct 2016, Thomas Garnier wrote:
>
>> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
>> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
>> CFLGS_OBJFREELIST_SLAB.
>>
>> The original kmem_cache is created early making OFF_SLAB not possible.
>> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
>> is enabled it will try to enable it first under certain conditions.
>> Given kmem_cache(sys) reuses the original flag, you can have both flags
>> at the same time resulting in allocation failures and odd behaviors.
>>
>> The proposed fix removes these flags by default at the entrance of
>> __kmem_cache_create. This way the function will define which way the
>> freelist should be handled at this stage for the new cache.
>>
>> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, 
>> OBJFREELIST_SLAB")
>> Signed-off-by: Thomas Garnier 
>> Signed-off-by: Greg Thelen 
>> ---
>> Based on next-20161025
>> ---
>>  mm/slab.c | 8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 3c83c29..efe280a 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -2027,6 +2027,14 @@ __kmem_cache_create (struct kmem_cache *cachep, 
>> unsigned long flags)
>>   int err;
>>   size_t size = cachep->size;
>>
>> + /*
>> +  * memcg re-creates caches with the flags of the originals. Remove
>> +  * the freelist related flags to ensure they are re-defined at this
>> +  * stage. Prevent having both flags on edge cases like with pagealloc
>> +  * if the original cache was created too early to be OFF_SLAB.
>> +  */
>> + flags &= ~(CFLGS_OBJFREELIST_SLAB|CFLGS_OFF_SLAB);
>> +
>>  #if DEBUG
>>  #if FORCED_DEBUG
>>   /*
>>



-- 
Thomas


Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-26 Thread Christoph Lameter
Hmmm...Doesnt this belong into memcg_create_kmem_cache() or into
kmem_cache_create() in mm/slab_common.h? Definitely not in an allocator
specific function since this is an issue for all allocators.

memcg_create_kmem_cache() simply assumes that it can pass flags from the
kmem_cache structure to kmem_cache_create(). However, those flags may
contain slab specific options.

kmem_cache_create() could filter out flags that cannot be specified.

Maybe create SLAB_FLAGS_PERMITTED in linux/mm/slab.h and mask other bits
out in kmem_cache_create()?

Slub also has internal flags and those also should not be passed to
kmem_cache_create(). If we define the valid ones we can mask them out.

The cleanest approach would be if kmem_cache_create() would reject invalid
flags and fail and if memcg_create_kmem_cache() would mask out the invalid
flags using SLAB_FLAGS_PERMITTED or so.



On Wed, 26 Oct 2016, Thomas Garnier wrote:

> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
> CFLGS_OBJFREELIST_SLAB.
>
> The original kmem_cache is created early making OFF_SLAB not possible.
> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
> is enabled it will try to enable it first under certain conditions.
> Given kmem_cache(sys) reuses the original flag, you can have both flags
> at the same time resulting in allocation failures and odd behaviors.
>
> The proposed fix removes these flags by default at the entrance of
> __kmem_cache_create. This way the function will define which way the
> freelist should be handled at this stage for the new cache.
>
> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, 
> OBJFREELIST_SLAB")
> Signed-off-by: Thomas Garnier 
> Signed-off-by: Greg Thelen 
> ---
> Based on next-20161025
> ---
>  mm/slab.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/mm/slab.c b/mm/slab.c
> index 3c83c29..efe280a 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2027,6 +2027,14 @@ __kmem_cache_create (struct kmem_cache *cachep, 
> unsigned long flags)
>   int err;
>   size_t size = cachep->size;
>
> + /*
> +  * memcg re-creates caches with the flags of the originals. Remove
> +  * the freelist related flags to ensure they are re-defined at this
> +  * stage. Prevent having both flags on edge cases like with pagealloc
> +  * if the original cache was created too early to be OFF_SLAB.
> +  */
> + flags &= ~(CFLGS_OBJFREELIST_SLAB|CFLGS_OFF_SLAB);
> +
>  #if DEBUG
>  #if FORCED_DEBUG
>   /*
>


Re: [PATCH v1] memcg: Prevent caches to be both OFF_SLAB & OBJFREELIST_SLAB

2016-10-26 Thread Christoph Lameter
Hmmm...Doesnt this belong into memcg_create_kmem_cache() or into
kmem_cache_create() in mm/slab_common.h? Definitely not in an allocator
specific function since this is an issue for all allocators.

memcg_create_kmem_cache() simply assumes that it can pass flags from the
kmem_cache structure to kmem_cache_create(). However, those flags may
contain slab specific options.

kmem_cache_create() could filter out flags that cannot be specified.

Maybe create SLAB_FLAGS_PERMITTED in linux/mm/slab.h and mask other bits
out in kmem_cache_create()?

Slub also has internal flags and those also should not be passed to
kmem_cache_create(). If we define the valid ones we can mask them out.

The cleanest approach would be if kmem_cache_create() would reject invalid
flags and fail and if memcg_create_kmem_cache() would mask out the invalid
flags using SLAB_FLAGS_PERMITTED or so.



On Wed, 26 Oct 2016, Thomas Garnier wrote:

> While testing OBJFREELIST_SLAB integration with pagealloc, we found a
> bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
> CFLGS_OBJFREELIST_SLAB.
>
> The original kmem_cache is created early making OFF_SLAB not possible.
> When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
> is enabled it will try to enable it first under certain conditions.
> Given kmem_cache(sys) reuses the original flag, you can have both flags
> at the same time resulting in allocation failures and odd behaviors.
>
> The proposed fix removes these flags by default at the entrance of
> __kmem_cache_create. This way the function will define which way the
> freelist should be handled at this stage for the new cache.
>
> Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, 
> OBJFREELIST_SLAB")
> Signed-off-by: Thomas Garnier 
> Signed-off-by: Greg Thelen 
> ---
> Based on next-20161025
> ---
>  mm/slab.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/mm/slab.c b/mm/slab.c
> index 3c83c29..efe280a 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2027,6 +2027,14 @@ __kmem_cache_create (struct kmem_cache *cachep, 
> unsigned long flags)
>   int err;
>   size_t size = cachep->size;
>
> + /*
> +  * memcg re-creates caches with the flags of the originals. Remove
> +  * the freelist related flags to ensure they are re-defined at this
> +  * stage. Prevent having both flags on edge cases like with pagealloc
> +  * if the original cache was created too early to be OFF_SLAB.
> +  */
> + flags &= ~(CFLGS_OBJFREELIST_SLAB|CFLGS_OFF_SLAB);
> +
>  #if DEBUG
>  #if FORCED_DEBUG
>   /*
>