Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-12 Thread Christopher Lameter
On Thu, 12 Apr 2018, Matthew Wilcox wrote:

> > Thus the next invocation of the fastpath will find that c->freelist is
> > NULL and go to the slowpath. ...
>
> _ah_.  I hadn't figured out that c->page was always NULL in the debugging
> case too, so ___slab_alloc() always hits the 'new_slab' case.  Thanks!

Also note that you can have SLUB also do the build with all debugging on
without having to use a command like parameter (like SLAB). That requires
CONFIG_SLUB_DEBUG_ON to be set. CONFIG_SLUB_DEBUG is set by default for
all distro builds. It only includes the debug code but does not activate
them by default. Kernel command line parameters allow you to selectively
switch on debugging features for specific slab caches so that you can
limit the latency variations introduced by debugging into the production
kernel. Thus subtle races may be found.





Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-12 Thread Matthew Wilcox
On Thu, Apr 12, 2018 at 09:10:23AM -0500, Christopher Lameter wrote:
> On Wed, 11 Apr 2018, Matthew Wilcox wrote:
> > I don't see how that works ... can you explain a little more?
>
> c->freelist is NULL and thus ___slab_alloc (slowpath) is called.
> ___slab_alloc populates c->freelist and gets the new object pointer.
> 
> if debugging is on then c->freelist is set to NULL at the end of
> ___slab_alloc because deactivate_slab() is called.
> 
> Thus the next invocation of the fastpath will find that c->freelist is
> NULL and go to the slowpath. ...

_ah_.  I hadn't figured out that c->page was always NULL in the debugging
case too, so ___slab_alloc() always hits the 'new_slab' case.  Thanks!


Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-12 Thread Christopher Lameter
On Wed, 11 Apr 2018, Matthew Wilcox wrote:

>
> I don't see how that works ... can you explain a little more?
>
> I see ___slab_alloc() is called from __slab_alloc().  And I see
> slab_alloc_node does this:
>
> object = c->freelist;
> page = c->page;
> if (unlikely(!object || !node_match(page, node))) {
> object = __slab_alloc(s, gfpflags, node, addr, c);
> stat(s, ALLOC_SLOWPATH);
>
> But I don't see how slub_debug leads to c->freelist always being NULL.
> It looks like it gets repopulated from page->freelist in ___slab_alloc()
> at the load_freelist label.

c->freelist is NULL and thus ___slab_alloc (slowpath) is called.
___slab_alloc populates c->freelist and gets the new object pointer.

if debugging is on then c->freelist is set to NULL at the end of
___slab_alloc because deactivate_slab() is called.

Thus the next invocation of the fastpath will find that c->freelist is
NULL and go to the slowpath. ...




Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-11 Thread Matthew Wilcox
On Wed, Apr 11, 2018 at 04:11:17PM -0500, Christopher Lameter wrote:
> On Wed, 11 Apr 2018, Matthew Wilcox wrote:
> > > Please put this in a code path that is enabled by specifying
> > >
> > > slub_debug
> > >
> > > on the kernel command line.
> >
> > I don't understand.  First, I had:
> >
> > if (unlikely(gfpflags & __GFP_ZERO) && object && !WARN_ON_ONCE(s->ctor))
> >
> > and you didn't like that because it was putting checking into a (semi)fast
> > path.  Now you want me to add a check for slub_debug somewhere?  I dont
> > see an existing one I can leverage that will hit on every allocation.
> > Perhaps I'm missing something.
> 
> The WARN_ON is only enabled when you configure and build the kernel with
> debugging enabled (CONFIG_VM_DEBUG). That is a compile time debugging
> feature like supported by SLAB.

Yes.  I want to have an option to check *every single* allocation.

> "slub_debug" enables kmem_cache->flags & SLAB_DEBUG and that forces all
> fastpath processing to be disabled. Thus you can check reliably in the
> slow path only for the GFP_ZERO problem.
> 
> Add the check to the other debug stuff already there. F.e. in
> alloc_debug_processing() or after
> 
> if (kmem_cache_debug(s) ...
> 
> in slab_alloc()

I don't see how that works ... can you explain a little more?

I see ___slab_alloc() is called from __slab_alloc().  And I see
slab_alloc_node does this:

object = c->freelist;
page = c->page;
if (unlikely(!object || !node_match(page, node))) {
object = __slab_alloc(s, gfpflags, node, addr, c);
stat(s, ALLOC_SLOWPATH);

But I don't see how slub_debug leads to c->freelist always being NULL.
It looks like it gets repopulated from page->freelist in ___slab_alloc()
at the load_freelist label.


Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-11 Thread Christopher Lameter
On Wed, 11 Apr 2018, Matthew Wilcox wrote:

> > >   slab_post_alloc_hook(s, gfpflags, 1, &object);
> >
> > Please put this in a code path that is enabled by specifying
> >
> > slub_debug
> >
> > on the kernel command line.
>
> I don't understand.  First, I had:
>
>   if (unlikely(gfpflags & __GFP_ZERO) && object && !WARN_ON_ONCE(s->ctor))
>
> and you didn't like that because it was putting checking into a (semi)fast
> path.  Now you want me to add a check for slub_debug somewhere?  I dont
> see an existing one I can leverage that will hit on every allocation.
> Perhaps I'm missing something.

The WARN_ON is only enabled when you configure and build the kernel with
debugging enabled (CONFIG_VM_DEBUG). That is a compile time debugging
feature like supported by SLAB.

SLUB debugging is different because we had problems isolating memory
corruption bugs in the production kernels for years. The debug code is
always included in the build but kept out of the hotpaths.


The debug can be enabled when needed to find memory corruption errors
without the need to rebuild a kernel for a prod environment (which may
change race conditions etc) because we only then need to add a kernel
parameter.

"slub_debug" enables kmem_cache->flags & SLAB_DEBUG and that forces all
fastpath processing to be disabled. Thus you can check reliably in the
slow path only for the GFP_ZERO problem.

Add the check to the other debug stuff already there. F.e. in
alloc_debug_processing() or after

if (kmem_cache_debug(s) ...

in slab_alloc()





Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-11 Thread Matthew Wilcox
On Wed, Apr 11, 2018 at 08:44:23AM -0500, Christopher Lameter wrote:
> > +++ b/mm/slub.c
> > @@ -2725,7 +2726,7 @@ static __always_inline void *slab_alloc_node(struct 
> > kmem_cache *s,
> > stat(s, ALLOC_FASTPATH);
> > }
> >
> > -   if (unlikely(gfpflags & __GFP_ZERO) && object)
> > +   if (unlikely(gfpflags & __GFP_ZERO) && object && slab_no_ctor(s))
> > memset(object, 0, s->object_size);
> >
> > slab_post_alloc_hook(s, gfpflags, 1, &object);
> 
> Please put this in a code path that is enabled by specifying
> 
> slub_debug
> 
> on the kernel command line.

I don't understand.  First, I had:

if (unlikely(gfpflags & __GFP_ZERO) && object && !WARN_ON_ONCE(s->ctor))

and you didn't like that because it was putting checking into a (semi)fast
path.  Now you want me to add a check for slub_debug somewhere?  I dont
see an existing one I can leverage that will hit on every allocation.
Perhaps I'm missing something.


Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-11 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote:

> diff --git a/mm/slab.h b/mm/slab.h
> index 3cd4677953c6..896818c7b30a 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -515,6 +515,13 @@ static inline void dump_unreclaimable_slab(void)
>
>  void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr);
>
> +static inline bool slab_no_ctor(struct kmem_cache *s)
> +{
> + if (IS_ENABLED(CONFIG_DEBUG_VM))
> + return !WARN_ON_ONCE(s->ctor);
> + return true;
> +}
> +
>  #ifdef CONFIG_SLAB_FREELIST_RANDOM
>  int cache_random_seq_create(struct kmem_cache *cachep, unsigned int count,
>   gfp_t gfp);

Move that to mm/slab.c? Debugging is runtime enabled with SLUB not compile
time as with SLAB.

> +++ b/mm/slub.c
> @@ -2725,7 +2726,7 @@ static __always_inline void *slab_alloc_node(struct 
> kmem_cache *s,
>   stat(s, ALLOC_FASTPATH);
>   }
>
> - if (unlikely(gfpflags & __GFP_ZERO) && object)
> + if (unlikely(gfpflags & __GFP_ZERO) && object && slab_no_ctor(s))
>   memset(object, 0, s->object_size);
>
>   slab_post_alloc_hook(s, gfpflags, 1, &object);

Please put this in a code path that is enabled by specifying

slub_debug

on the kernel command line.


Re: [PATCH v2 2/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Michal Hocko
On Tue 10-04-18 23:03:20, Matthew Wilcox wrote:
> diff --git a/mm/slab.c b/mm/slab.c
> index 58c8cecc26ab..9ad85fd9fca8 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2661,6 +2661,7 @@ static struct page *cache_grow_begin(struct kmem_cache 
> *cachep,
>   invalid_mask, &invalid_mask, flags, &flags);
>   dump_stack();
>   }
> + BUG_ON(cachep->ctor && (flags & __GFP_ZERO));

NAK. We really do not want to blow the whole kernel just because
somebody is doing something stupid. Make it WARN_ON_ONCE and fix up the
flag.

> +static inline bool slab_no_ctor(struct kmem_cache *s)
> +{
> + if (IS_ENABLED(CONFIG_DEBUG_VM))
> + return !WARN_ON_ONCE(s->ctor);
> + return true;
> +}

I do realize that you want to keep the hotpath without additional checks
but if for nothing else this is a really bad misnomer.
debug_slab_no_ctor()? I can clearly see how somebody uses this blindly
for a different purpose.
[...]
> diff --git a/mm/slub.c b/mm/slub.c
> index a28488643603..9f8f38a552e5 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1576,6 +1576,7 @@ static struct page *allocate_slab(struct kmem_cache *s, 
> gfp_t flags, int node)
>  
>   if (gfpflags_allow_blocking(flags))
>   local_irq_enable();
> + BUG_ON(s->ctor && (flags & __GFP_ZERO));

No no on this as well.

Othe than that. Once those are fixed, feel free to add
Acked-by: Michal Hocko 
-- 
Michal Hocko
SUSE Labs