Re: [PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-07-01 Thread Vladimir Davydov
On Mon, Jun 30, 2014 at 10:49:03AM -0500, Christoph Lameter wrote:
> On Fri, 27 Jun 2014, Joonsoo Kim wrote:
> 
> > Christoph,
> > Is this result tolerable for a large-scale system? Or do we need to
> > find another solution?
> 
> 
> The overhead is pretty intense but then this is a rare event I guess?

Yes, provided cgroups are created/destroyed rarely.

> It seems that it is much easier on the code and much faster to do the
> periodic reaping. Why not simply go with that?

A bad thing about periodic reaping is that the time it may take isn't
predictable, because the number of dead caches is, in fact, limited only
by the amount of RAM.

We can have hundreds, if not thousands, of copies of dcaches/icaches left
over from cgroups destroyed some time ago. The dead caches will hang
around until memory pressure evicts all the objects they host, which may
take quite long on systems with a lot of memory.

With periodic reaping, we will have to iterate over all dead caches
trying to drain per cpu/node arrays each time, which might therefore
result in slowing down the whole system unexpectedly.

I'm not quite sure if such slowdowns are really a threat though.
Actually, cache_reap will only do something (take locks, drain
arrays/lists) if there are free objects in the cache. Otherwise it will,
in fact, only check cpu_cache->avail, alien->avail, shared->avail,
and node->free_list, which shouldn't take much time, should it?
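
For reference, the check I mean is roughly the fast path of drain_array
(a simplified sketch, not the verbatim mm/slab.c code):

static void drain_array_sketch(struct kmem_cache *cachep,
			       struct kmem_cache_node *n,
			       struct array_cache *ac, int force, int node)
{
	if (!ac || !ac->avail)
		return;			/* nothing cached: no lock is taken */
	if (ac->touched && !force) {
		ac->touched = 0;	/* used recently: leave it alone */
		return;
	}
	/* only at this point do we take n->list_lock and free_block()
	 * part of the array */
}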

Thanks.


Re: [PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-06-30 Thread Christoph Lameter
On Fri, 27 Jun 2014, Joonsoo Kim wrote:

> Christoph,
> Is this result tolerable for a large-scale system? Or do we need to
> find another solution?


The overhead is pretty intense but then this is a rare event I guess?

It seems that it is much easier on the code and much faster to do the
periodic reaping. Why not simply go with that?


Re: [PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-06-27 Thread Joonsoo Kim
On Wed, Jun 25, 2014 at 05:45:45PM +0400, Vladimir Davydov wrote:
> On Tue, Jun 24, 2014 at 04:38:41PM +0900, Joonsoo Kim wrote:
> > On Fri, Jun 13, 2014 at 12:38:22AM +0400, Vladimir Davydov wrote:
> > And, you said that this way of implementation would be slow because
> > there could be many objects in dead caches and this implementation
> > needs the node spin_lock on each object freeing. Is that not a
> > problem now?
> > 
> > If you have any performance data about this implementation and the
> > alternative one, could you share it?
> 
> I ran some tests on a 2 CPU x 6 core x 2 HT box. The kernel was compiled
> with a config taken from a popular distro, so it had most of the debug
> options turned off.
> 
> ---
> 
> TEST #1: Each logical CPU executes a task that frees 1M objects
>  allocated from the same cache. All frees are node-local.
> 
> RESULTS:
> 
> objsize (bytes) | cache is dead? | objects free time (ms)
> ----------------+----------------+------------------------
>              64 |       -        |        373 +- 5
>               - |       +        |       1300 +- 6
>                 |                |
>             128 |       -        |        387 +- 6
>               - |       +        |       1337 +- 6
>                 |                |
>             256 |       -        |        484 +- 4
>               - |       +        |       1407 +- 6
>                 |                |
>             512 |       -        |        686 +- 5
>               - |       +        |       1561 +- 18
>                 |                |
>            1024 |       -        |       1073 +- 11
>               - |       +        |       1897 +- 12
> 
> TEST #2: Each logical CPU executes a task that removes 1M empty files
>  from its own RAMFS mount. All frees are node-local.
> 
> RESULTS:
> 
> cache is dead? | files removal time (s)
> ---------------+---------------------------
>        -       |  15.57 +- 0.55   (base)
>        +       |  16.80 +- 0.62   (base + 8%)
> 
> ---
> 
> So, according to TEST #1 the relative slowdown introduced by zapping per
> cpu arrays is really dreadful - it can be up to 4x! However, the
> absolute numbers aren't that huge - ~1 second for 24 million objects.
> If we do something other than kfree, the slowdown shouldn't be that
> visible IMO.
> 
> TEST #2 is an attempt to estimate how zapping of per cpu arrays will
> affect FS object destruction, which is the most common case of dead
> cache usage. To avoid disk-bound operations it uses RAMFS. From the
> test results it follows that the relative slowdown of massive file
> deletion is within 2 stdev, which looks decent.
> 
> Anyway, the alternative approach (reaping dead caches periodically)
> won't have this kfree slowdown at all. However, periodic reaping can
> become a real disaster as the system evolves and the number of dead
> caches grows. Currently I don't know how we can estimate real life
> effects of this. If you have any ideas, please let me know.
> 

Hello,

I have no idea here. I don't have much experience with large-scale
systems. But the current implementation would also be in big trouble if
the system is larger than yours.

I think that Christoph can say something about this result.

Christoph,
Is this result tolerable for a large-scale system? Or do we need to find
another solution?

Thanks.


Re: [PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-06-25 Thread Vladimir Davydov
On Tue, Jun 24, 2014 at 04:38:41PM +0900, Joonsoo Kim wrote:
> On Fri, Jun 13, 2014 at 12:38:22AM +0400, Vladimir Davydov wrote:
> And, you said that this way of implementation would be slow because
> there could be many objects in dead caches and this implementation
> needs the node spin_lock on each object freeing. Is that not a
> problem now?
> 
> If you have any performance data about this implementation and the
> alternative one, could you share it?

I ran some tests on a 2 CPU x 6 core x 2 HT box. The kernel was compiled
with a config taken from a popular distro, so it had most of the debug
options turned off.

---

TEST #1: Each logical CPU executes a task that frees 1M objects
 allocated from the same cache. All frees are node-local.

RESULTS:

objsize (bytes) | cache is dead? | objects free time (ms)
----------------+----------------+------------------------
             64 |       -        |        373 +- 5
              - |       +        |       1300 +- 6
                |                |
            128 |       -        |        387 +- 6
              - |       +        |       1337 +- 6
                |                |
            256 |       -        |        484 +- 4
              - |       +        |       1407 +- 6
                |                |
            512 |       -        |        686 +- 5
              - |       +        |       1561 +- 18
                |                |
           1024 |       -        |       1073 +- 11
              - |       +        |       1897 +- 12

TEST #2: Each logical CPU executes a task that removes 1M empty files
 from its own RAMFS mount. All frees are node-local.

RESULTS:

cache is dead? | files removal time (s)
---------------+---------------------------
       -       |  15.57 +- 0.55   (base)
       +       |  16.80 +- 0.62   (base + 8%)

---

So, according to TEST #1 the relative slowdown introduced by zapping per
cpu arrays is really dreadful - it can be up to 4x! However, the
absolute numbers aren't that huge - ~1 second for 24 million objects.
If we do something other than kfree, the slowdown shouldn't be that
visible IMO.
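
The harness itself isn't included here. Just to illustrate the shape of
the TEST #1 free loop, a minimal, purely hypothetical sketch (single
threaded, invented names; the real runs used one such loop per logical
CPU and, for the "dead" rows, a cache whose memcg had already been
offlined) could look like this:

/* slab_free_bench.c - hypothetical sketch of a TEST #1-style free loop */
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/jiffies.h>

#define NR_OBJS		(1 << 20)	/* 1M objects, as in TEST #1 */
#define OBJ_SIZE	256

static int __init bench_init(void)
{
	struct kmem_cache *cache;
	void **objs;
	unsigned long start;
	int i, n;

	cache = kmem_cache_create("bench_cache", OBJ_SIZE, 0, 0, NULL);
	if (!cache)
		return -ENOMEM;

	objs = vmalloc(NR_OBJS * sizeof(void *));
	if (!objs) {
		kmem_cache_destroy(cache);
		return -ENOMEM;
	}

	for (n = 0; n < NR_OBJS; n++) {
		objs[n] = kmem_cache_alloc(cache, GFP_KERNEL);
		if (!objs[n])
			break;
	}

	/* the part that is actually timed: free everything back */
	start = jiffies;
	for (i = 0; i < n; i++)
		kmem_cache_free(cache, objs[i]);
	pr_info("freed %d objects in %u ms\n", n,
		jiffies_to_msecs(jiffies - start));

	vfree(objs);
	kmem_cache_destroy(cache);
	return -EAGAIN;		/* fail the load so the module doesn't stay */
}
module_init(bench_init);

MODULE_LICENSE("GPL");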

TEST #2 is an attempt to estimate how zapping of per cpu arrays will
affect FS object destruction, which is the most common case of dead
cache usage. To avoid disk-bound operations it uses RAMFS. From the
test results it follows that the relative slowdown of massive file
deletion is within 2 stdev, which looks decent.

Anyway, the alternative approach (reaping dead caches periodically)
won't have this kfree slowdown at all. However, periodic reaping can
become a real disaster as the system evolves and the number of dead
caches grows. Currently I don't know how we can estimate real life
effects of this. If you have any ideas, please let me know.

Thanks.


Re: [PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-06-24 Thread Vladimir Davydov
On Tue, Jun 24, 2014 at 04:38:41PM +0900, Joonsoo Kim wrote:
> On Fri, Jun 13, 2014 at 12:38:22AM +0400, Vladimir Davydov wrote:
> > @@ -3462,6 +3474,17 @@ static inline void __cache_free(struct kmem_cache 
> > *cachep, void *objp,
> >  
> > kmemcheck_slab_free(cachep, objp, cachep->object_size);
> >  
> > +#ifdef CONFIG_MEMCG_KMEM
> > +   if (unlikely(!ac)) {
> > +   int nodeid = page_to_nid(virt_to_page(objp));
> > +
> > +   spin_lock(&cachep->node[nodeid]->list_lock);
> > +   free_block(cachep, &objp, 1, nodeid);
> > +   spin_unlock(&cachep->node[nodeid]->list_lock);
> > +   return;
> > +   }
> > +#endif
> > +
> 
> And, please document the intention of this code. :)

Sure.
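
Something along these lines, perhaps (wording is only a draft), right
above the if (unlikely(!ac)) check:

	/*
	 * A dead memcg cache has no per-cpu array: it was freed when the
	 * cache was shrunk on memcg offline (see do_drain).  Free the
	 * object directly to the slab lists so that the cache can be
	 * destroyed as soon as it becomes empty.
	 */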

> And, you said that this way of implementation would be slow because
> there could be many objects in dead caches and this implementation
> needs the node spin_lock on each object freeing. Is that not a
> problem now?

It may be :(

> If you have any performance data about this implementation and the
> alternative one, could you share it?

I haven't (shame on me!). I'll do some testing today and send you the
results.

Thanks.


Re: [PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-06-24 Thread Vladimir Davydov
Hi,

On Tue, Jun 24, 2014 at 04:25:54PM +0900, Joonsoo Kim wrote:
> On Fri, Jun 13, 2014 at 12:38:22AM +0400, Vladimir Davydov wrote:
> > @@ -3368,7 +3379,8 @@ static void free_block(struct kmem_cache *cachep, 
> > void **objpp, int nr_objects,
> >  
> > /* fixup slab chains */
> > if (page->active == 0) {
> > -   if (n->free_objects > n->free_limit) {
> > +   if (n->free_objects > n->free_limit ||
> > +   memcg_cache_dead(cachep)) {
> 
> I'd like to set free_limit to 0 in __kmem_cache_shrink() rather than
> test memcg_cache_dead() here, because memcg_cache_dead() is more
> expensive. Is there any problem with this approach?

We'd have to be careful on cpu hotplug then, because it may update the
free_limit. Not a big problem though. Will fix.
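
Roughly what I have in mind (an untested sketch just to illustrate the
idea; the final patch may end up looking different):

static void zero_free_limit(struct kmem_cache *cachep)
{
	struct kmem_cache_node *n;
	int node;

	for_each_online_node(node) {
		n = cachep->node[node];
		if (!n)
			continue;
		spin_lock_irq(&n->list_lock);
		/* make free_block() discard every empty slab from now on */
		n->free_limit = 0;
		spin_unlock_irq(&n->list_lock);
	}
}

__kmem_cache_shrink() would call it for dead caches, and
cpuup_prepare()/cpuup_canceled() would have to be taught not to bump
free_limit back up for them.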

Thanks.


Re: [PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-06-24 Thread Joonsoo Kim
On Fri, Jun 13, 2014 at 12:38:22AM +0400, Vladimir Davydov wrote:
> Since a dead memcg cache is destroyed only after the last slab allocated
> to it is freed, we must disable caching of free objects/slabs for such
> caches, otherwise they will be hanging around forever.
> 
> For SLAB that means we must disable per cpu free object arrays and make
> free_block always discard empty slabs irrespective of node's free_limit.
> 
> To disable per cpu arrays, we free them on kmem_cache_shrink (see
> drain_cpu_caches -> do_drain) and make __cache_free fall back to
> free_block if there is no per cpu array. Also, we have to disable
> allocation of per cpu arrays on cpu hotplug for dead caches (see
> cpuup_prepare, __do_tune_cpucache).
> 
> After we disabled free objects/slabs caching, there is no need to reap
> those caches periodically. Moreover, it will only result in slowdown. So
> we also make cache_reap skip them.
> 
> Signed-off-by: Vladimir Davydov 
> ---
>  mm/slab.c |   31 ++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index b3af82419251..7e91f5f1341d 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1210,6 +1210,9 @@ static int cpuup_prepare(long cpu)
>   struct array_cache *shared = NULL;
>   struct array_cache **alien = NULL;
>  
> + if (memcg_cache_dead(cachep))
> + continue;
> +
>   nc = alloc_arraycache(node, cachep->limit,
>   cachep->batchcount, GFP_KERNEL);
>   if (!nc)
> @@ -2411,10 +2414,18 @@ static void do_drain(void *arg)
>  
>   check_irq_off();
>   ac = cpu_cache_get(cachep);
> + if (!ac)
> + return;
> +
>   spin_lock(>node[node]->list_lock);
>   free_block(cachep, ac->entry, ac->avail, node);
>   spin_unlock(>node[node]->list_lock);
>   ac->avail = 0;
> +
> + if (memcg_cache_dead(cachep)) {
> + cachep->array[smp_processor_id()] = NULL;
> + kfree(ac);
> + }
>  }
>  
>  static void drain_cpu_caches(struct kmem_cache *cachep)
> @@ -3368,7 +3379,8 @@ static void free_block(struct kmem_cache *cachep, void 
> **objpp, int nr_objects,
>  
>   /* fixup slab chains */
>   if (page->active == 0) {
> - if (n->free_objects > n->free_limit) {
> + if (n->free_objects > n->free_limit ||
> + memcg_cache_dead(cachep)) {
>   n->free_objects -= cachep->num;
>   /* No need to drop any previously held
>* lock here, even if we have a off-slab slab
> @@ -3462,6 +3474,17 @@ static inline void __cache_free(struct kmem_cache 
> *cachep, void *objp,
>  
>   kmemcheck_slab_free(cachep, objp, cachep->object_size);
>  
> +#ifdef CONFIG_MEMCG_KMEM
> + if (unlikely(!ac)) {
> + int nodeid = page_to_nid(virt_to_page(objp));
> +
> + spin_lock(>node[nodeid]->list_lock);
> + free_block(cachep, , 1, nodeid);
> + spin_unlock(>node[nodeid]->list_lock);
> + return;
> + }
> +#endif
> +

And, please document the intention of this code. :)

And, you said that this way of implementation would be slow because
there could be many objects in dead caches and this implementation
needs the node spin_lock on each object freeing. Is that not a
problem now?

If you have any performance data about this implementation and the
alternative one, could you share it?

Thanks.


Re: [PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-06-24 Thread Joonsoo Kim
On Fri, Jun 13, 2014 at 12:38:22AM +0400, Vladimir Davydov wrote:
> Since a dead memcg cache is destroyed only after the last slab allocated
> to it is freed, we must disable caching of free objects/slabs for such
> caches, otherwise they will be hanging around forever.
> 
> For SLAB that means we must disable per cpu free object arrays and make
> free_block always discard empty slabs irrespective of node's free_limit.
> 
> To disable per cpu arrays, we free them on kmem_cache_shrink (see
> drain_cpu_caches -> do_drain) and make __cache_free fall back to
> free_block if there is no per cpu array. Also, we have to disable
> allocation of per cpu arrays on cpu hotplug for dead caches (see
> cpuup_prepare, __do_tune_cpucache).
> 
> After we disabled free objects/slabs caching, there is no need to reap
> those caches periodically. Moreover, it will only result in slowdown. So
> we also make cache_reap skip them.
> 
> Signed-off-by: Vladimir Davydov 
> ---
>  mm/slab.c |   31 ++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/slab.c b/mm/slab.c
> index b3af82419251..7e91f5f1341d 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -1210,6 +1210,9 @@ static int cpuup_prepare(long cpu)
>   struct array_cache *shared = NULL;
>   struct array_cache **alien = NULL;
>  
> + if (memcg_cache_dead(cachep))
> + continue;
> +
>   nc = alloc_arraycache(node, cachep->limit,
>   cachep->batchcount, GFP_KERNEL);
>   if (!nc)
> @@ -2411,10 +2414,18 @@ static void do_drain(void *arg)
>  
>   check_irq_off();
>   ac = cpu_cache_get(cachep);
> + if (!ac)
> + return;
> +
>   spin_lock(>node[node]->list_lock);
>   free_block(cachep, ac->entry, ac->avail, node);
>   spin_unlock(>node[node]->list_lock);
>   ac->avail = 0;
> +
> + if (memcg_cache_dead(cachep)) {
> + cachep->array[smp_processor_id()] = NULL;
> + kfree(ac);
> + }
>  }
>  
>  static void drain_cpu_caches(struct kmem_cache *cachep)
> @@ -3368,7 +3379,8 @@ static void free_block(struct kmem_cache *cachep, void 
> **objpp, int nr_objects,
>  
>   /* fixup slab chains */
>   if (page->active == 0) {
> - if (n->free_objects > n->free_limit) {
> + if (n->free_objects > n->free_limit ||
> + memcg_cache_dead(cachep)) {

Hello, Vladimir.

I'd like to set free_limit to 0 in __kmem_cache_shrink() rather than
test memcg_cache_dead() here, because memcg_cache_dead() is more
expensive. Is there any problem with this approach?

Thanks.


Re: [PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-06-12 Thread Vladimir Davydov
On Fri, Jun 13, 2014 at 12:38:22AM +0400, Vladimir Davydov wrote:
> Since a dead memcg cache is destroyed only after the last slab allocated
> to it is freed, we must disable caching of free objects/slabs for such
> caches, otherwise they will be hanging around forever.
> 
> For SLAB that means we must disable per cpu free object arrays and make
> free_block always discard empty slabs irrespective of node's free_limit.

An alternative to this could be making cache_reap, which drains per cpu
arrays and drops free slabs periodically for all caches, shrink dead
caches aggressively. The patch doing this is attached.

This approach has its pros and cons compared to disabling per cpu
arrays.

Pros:
 - Less intrusive: it only requires modification of cache_reap.
 - Doesn't impact performance: free path isn't touched.

Cons:
 - Delays dead cache destruction: the lag between the moment the last
   object is freed and the moment the cache is destroyed isn't constant.
   It depends on the number of kmem-active memcgs and the number of dead
   caches (the more of them, the longer it'll take to shrink dead
   caches). Also, on NUMA machines the upper bound will be proportional
   to the number of NUMA nodes, because alien caches are reaped one at a
   time (see reap_alien).
 - If there are a lot of dead caches, periodic shrinking will be slowed
   down even for active caches (see cache_reap).

--

diff --git a/mm/slab.c b/mm/slab.c
index 9ca3b87edabc..811fdb214b9e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3980,6 +3980,11 @@ static void cache_reap(struct work_struct *w)
goto out;
 
list_for_each_entry(searchp, &slab_caches, list) {
+   int force = 0;
+
+   if (memcg_cache_dead(searchp))
+   force = 1;
+
check_irq_on();
 
/*
@@ -3991,7 +3996,7 @@ static void cache_reap(struct work_struct *w)
 
reap_alien(searchp, n);
 
-   drain_array(searchp, n, cpu_cache_get(searchp), 0, node);
+   drain_array(searchp, n, cpu_cache_get(searchp), force, node);
 
/*
 * These are racy checks but it does not matter
@@ -4002,15 +4007,17 @@ static void cache_reap(struct work_struct *w)
 
n->next_reap = jiffies + REAPTIMEOUT_NODE;
 
-   drain_array(searchp, n, n->shared, 0, node);
+   drain_array(searchp, n, n->shared, force, node);
 
if (n->free_touched)
n->free_touched = 0;
else {
-   int freed;
+   int freed, tofree;
+
+   tofree = force ? slabs_tofree(searchp, n) :
+   DIV_ROUND_UP(n->free_limit, 5 * searchp->num);
 
-   freed = drain_freelist(searchp, n, (n->free_limit +
-   5 * searchp->num - 1) / (5 * searchp->num));
+   freed = drain_freelist(searchp, n, tofree);
STATS_ADD_REAPED(searchp, freed);
}
 next:


[PATCH -mm v3 8/8] slab: do not keep free objects/slabs on dead memcg caches

2014-06-12 Thread Vladimir Davydov
Since a dead memcg cache is destroyed only after the last slab allocated
to it is freed, we must disable caching of free objects/slabs for such
caches, otherwise they will be hanging around forever.

For SLAB that means we must disable per cpu free object arrays and make
free_block always discard empty slabs irrespective of node's free_limit.

To disable per cpu arrays, we free them on kmem_cache_shrink (see
drain_cpu_caches -> do_drain) and make __cache_free fall back to
free_block if there is no per cpu array. Also, we have to disable
allocation of per cpu arrays on cpu hotplug for dead caches (see
cpuup_prepare, __do_tune_cpucache).

After we disabled free objects/slabs caching, there is no need to reap
those caches periodically. Moreover, it will only result in slowdown. So
we also make cache_reap skip them.

Signed-off-by: Vladimir Davydov 
---
 mm/slab.c |   31 ++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/mm/slab.c b/mm/slab.c
index b3af82419251..7e91f5f1341d 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1210,6 +1210,9 @@ static int cpuup_prepare(long cpu)
struct array_cache *shared = NULL;
struct array_cache **alien = NULL;
 
+   if (memcg_cache_dead(cachep))
+   continue;
+
nc = alloc_arraycache(node, cachep->limit,
cachep->batchcount, GFP_KERNEL);
if (!nc)
@@ -2411,10 +2414,18 @@ static void do_drain(void *arg)
 
check_irq_off();
ac = cpu_cache_get(cachep);
+   if (!ac)
+   return;
+
spin_lock(&cachep->node[node]->list_lock);
free_block(cachep, ac->entry, ac->avail, node);
spin_unlock(&cachep->node[node]->list_lock);
ac->avail = 0;
+
+   if (memcg_cache_dead(cachep)) {
+   cachep->array[smp_processor_id()] = NULL;
+   kfree(ac);
+   }
 }
 
 static void drain_cpu_caches(struct kmem_cache *cachep)
@@ -3368,7 +3379,8 @@ static void free_block(struct kmem_cache *cachep, void 
**objpp, int nr_objects,
 
/* fixup slab chains */
if (page->active == 0) {
-   if (n->free_objects > n->free_limit) {
+   if (n->free_objects > n->free_limit ||
+   memcg_cache_dead(cachep)) {
n->free_objects -= cachep->num;
/* No need to drop any previously held
 * lock here, even if we have a off-slab slab
@@ -3462,6 +3474,17 @@ static inline void __cache_free(struct kmem_cache 
*cachep, void *objp,
 
kmemcheck_slab_free(cachep, objp, cachep->object_size);
 
+#ifdef CONFIG_MEMCG_KMEM
+   if (unlikely(!ac)) {
+   int nodeid = page_to_nid(virt_to_page(objp));
+
+   spin_lock(&cachep->node[nodeid]->list_lock);
+   free_block(cachep, &objp, 1, nodeid);
+   spin_unlock(&cachep->node[nodeid]->list_lock);
+   return;
+   }
+#endif
+
/*
 * Skip calling cache_free_alien() when the platform is not numa.
 * This will avoid cache misses that happen while accessing slabp (which
@@ -3803,6 +3826,9 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, 
int limit,
struct ccupdate_struct *new;
int i;
 
+   if (memcg_cache_dead(cachep))
+   return 0;
+
new = kzalloc(sizeof(*new) + nr_cpu_ids * sizeof(struct array_cache *),
  gfp);
if (!new)
@@ -3988,6 +4014,9 @@ static void cache_reap(struct work_struct *w)
list_for_each_entry(searchp, &slab_caches, list) {
check_irq_on();
 
+   if (memcg_cache_dead(searchp))
+   continue;
+
/*
 * We only take the node lock if absolutely necessary and we
 * have established with reasonable certainty that
-- 
1.7.10.4
