Thank you for the answer.
On Oct 19, 2012, at 6:24 PM, Peter Zijlstra wrote:
it's a quick hack similar to existing hacks done for rt; preferably we'd
do smarter things though.
If you have any ideas how to fix this in a better way, please share.
If several tasks in different cpu cgroups are contending for the same resource
(e.g. a semaphore) and one of those task groups is cpu limited (using cfs
bandwidth control), the priority inversion problem is likely to arise: if a cpu
limited task goes to sleep holding the resource (e.g. trying to
There is an error in the test script: I forgot to initialize cpuset.mems of
the test cgroups; without it, it is impossible to add a task to a cpuset cgroup.
Sorry for that.
Fixed version of the test script is attached.
On Oct 18, 2012, at 11:32 AM, Vladimir Davydov wrote:
If several tasks
# modprobe nf_conntrack
# time modprobe -r nf_conntrack
real 0m10.337s
user 0m0.000s
sys 0m0.376s
with the patch
# modprobe nf_conntrack
# time modprobe -r nf_conntrack
real 0m5.661s
user 0m0.000s
sys 0m0.216s
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Patrick
mnt_drop_write() must be called only if mnt_want_write() succeeded,
otherwise the mnt_writers counter will diverge.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Doug Ledford dledf...@redhat.com
Cc: Andrew Morton a...@linux-foundation.org
Cc: KOSAKI Motohiro kosaki.motoh
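As a minimal sketch of the pairing rule stated above (an illustrative caller, not the patched code itself; mnt_want_write() and mnt_drop_write() are the real VFS helpers):

	static int do_protected_write(struct vfsmount *mnt)
	{
		int err = mnt_want_write(mnt);

		if (err)
			return err;	/* do NOT call mnt_drop_write() here */

		/* ... perform the modification ... */

		mnt_drop_write(mnt);	/* drop only after a successful want */
		return 0;
	}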
On Mar 20, 2013, at 1:09 AM, Andrew Morton a...@linux-foundation.org
wrote:
On Tue, 19 Mar 2013 13:31:18 +0400 Vladimir Davydov vdavy...@parallels.com
wrote:
mnt_drop_write() must be called only if mnt_want_write() succeeded,
otherwise the mnt_writers counter will diverge
above.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
kernel/sched/core.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 26058d0..c7a078f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7686,7
On Feb 8, 2013, at 6:46 PM, Paul Turner p...@google.com wrote:
On Fri, Feb 08, 2013 at 11:10:46AM +0400, Vladimir Davydov wrote:
If cfs_rq->runtime_remaining is <= 0 then either
- cfs_rq is throttled and waiting for quota redistribution, or
- cfs_rq is currently executing and will be throttled
On Feb 8, 2013, at 7:26 PM, Vladimir Davydov vdavy...@parallels.com wrote:
On Feb 8, 2013, at 6:46 PM, Paul Turner p...@google.com wrote:
On Fri, Feb 08, 2013 at 11:10:46AM +0400, Vladimir Davydov wrote:
If cfs_rq->runtime_remaining is <= 0 then either
- cfs_rq is throttled and waiting
.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
block/blk-exec.c |4 ++--
block/blk-flush.c |2 +-
block/blk-lib.c |6 +++---
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 74638ec..f634de7 100644
--- a/block/blk
accounting when the
completion struct is actually used for waiting for IO (e.g. completion
of a bio request in the block layer).
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
include/linux/completion.h |3 ++
kernel/sched/core.c| 57
The patch introduces nf_conntrack_cleanup_list(), which cleanups
nf_conntracks for a list of netns and calls synchronize_net() only
once for them all.
---
include/net/netfilter/nf_conntrack_core.h | 10 +-
net/netfilter/nf_conntrack_core.c | 21 +
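A rough sketch of the batching idea described above, assuming the interface named in the message (the body is illustrative, not the actual patch):

	void nf_conntrack_cleanup_list(struct list_head *net_exit_list)
	{
		struct net *net;

		/* One grace period for the whole batch instead of one per netns. */
		synchronize_net();

		list_for_each_entry(net, net_exit_list, exit_list) {
			/* ... free the per-netns conntrack state of @net ... */
		}
	}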
It is more convenient to write 'clearcpuid=147,148,...' than
'clearcpuid=147 clearcpuid=148 ...'
---
arch/x86/kernel/cpu/common.c |8
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 6b9333b..8ffe1b9
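A hedged sketch of how the comma-separated form above could be parsed in the existing setup_disablecpuid() hook using get_option(); the actual patch may differ:

	static int __init setup_disablecpuid(char *arg)
	{
		int bit, ret;

		do {
			ret = get_option(&arg, &bit);
			if (!ret)
				return 0;
			if (bit >= 0 && bit < NCAPINTS * 32)
				setup_clear_cpu_cap(bit);
		} while (ret == 2);	/* 2: an integer followed by a comma */

		return 1;
	}
	__setup("clearcpuid=", setup_disablecpuid);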
If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be
reported in /proc/cpuinfo and used by the kernel. However, if a
userspace process checks CPU features directly using the cpuid
instruction, it will see all features supported by the CPU
irrespective of what
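For illustration, userspace can bypass /proc/cpuinfo entirely with the cpuid instruction, e.g. via GCC's <cpuid.h> helper (a minimal self-contained example, not part of the patch):

	#include <cpuid.h>
	#include <stdio.h>

	int main(void)
	{
		unsigned int eax, ebx, ecx, edx;

		if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
			/* EDX bit 26 of leaf 1 is SSE2 */
			printf("SSE2 %s\n", (edx & (1 << 26)) ? "supported" : "absent");
		return 0;
	}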
On Jul 20, 2012, at 9:20 PM, H. Peter Anvin wrote:
On 07/20/2012 09:37 AM, Vladimir Davydov wrote:
If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be
reported in /proc/cpuinfo and used by the kernel. However, if a
userspace process checks CPU features directly using
On Jul 21, 2012, at 12:19 AM, H. Peter Anvin wrote:
On 07/20/2012 11:21 AM, Vladimir Davydov wrote:
I am a bit concerned about this patch:
1. it silently changes existing behavior.
Yes, but who needs the current implementation of 'clearcpuid' which,
in fact, just hides flags in /proc
On 07/21/2012 02:37 PM, Borislav Petkov wrote:
(+ Andre who's been doing some cross vendor stuff)
On Fri, Jul 20, 2012 at 08:37:33PM +0400, Vladimir Davydov wrote:
If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be
reported in /proc/cpuinfo and used by the kernel. However
On 07/24/2012 12:14 PM, Andre Przywara wrote:
On 07/24/2012 09:06 AM, Vladimir Davydov wrote:
On 07/21/2012 02:37 PM, Borislav Petkov wrote:
(+ Andre who's been doing some cross vendor stuff)
On Fri, Jul 20, 2012 at 08:37:33PM +0400, Vladimir Davydov wrote:
If 'clearcpuid=N' is specified
On 07/24/2012 02:10 PM, Borislav Petkov wrote:
On Tue, Jul 24, 2012 at 12:29:19PM +0400, Vladimir Davydov wrote:
I guess that when the more advanced features become widely-used,
vendors will offer new MSRs and/or CPUID faulting.
And this right there is the dealbreaker:
So what are you doing
On 07/25/2012 04:57 AM, H. Peter Anvin wrote:
On 07/24/2012 04:09 AM, Vladimir Davydov wrote:
We have not encountered this situation in our environments and I hope we
won't :-)
But look, these CPUID functions cover the majority of CPU features, don't
they? So, most normal apps inside a VM
On 07/24/2012 04:34 PM, Andre Przywara wrote:
On 07/24/2012 01:09 PM, Vladimir Davydov wrote:
On 07/24/2012 02:10 PM, Borislav Petkov wrote:
On Tue, Jul 24, 2012 at 12:29:19PM +0400, Vladimir Davydov wrote:
I guess that when the more advanced features become widely-used,
vendors will offer
On 07/24/2012 04:44 PM, Alan Cox wrote:
This approach does not need any kernel support (except for the
/proc/cpuinfo filtering). Does this address the issues you have?
You can do the /proc/cpuinfo filtering in user space too
How?
On 07/25/2012 02:58 PM, Andre Przywara wrote:
On 07/25/2012 12:31 PM, Vladimir Davydov wrote:
On 07/24/2012 04:44 PM, Alan Cox wrote:
This approach does not need any kernel support (except for the
/proc/cpuinfo filtering). Does this address the issues you have?
You can do the /proc/cpuinfo
On 07/25/2012 02:43 PM, Borislav Petkov wrote:
On Wed, Jul 25, 2012 at 02:31:23PM +0400, Vladimir Davydov wrote:
So, you prefer adding some filtering of /proc/cpuinfo into the
mainstream kernel
That's already there, right? And your 1/2 patch was making toggling those
bits easier.
(not now
On 07/25/2012 03:17 PM, Andre Przywara wrote:
On 07/25/2012 01:02 PM, Vladimir Davydov wrote:
On 07/25/2012 02:58 PM, Andre Przywara wrote:
On 07/25/2012 12:31 PM, Vladimir Davydov wrote:
On 07/24/2012 04:44 PM, Alan Cox wrote:
This approach does not need any kernel support (except
On 07/25/2012 03:31 PM, Alan Cox wrote:
On Wed, 25 Jul 2012 14:31:30 +0400
Vladimir Davydovvdavy...@parallels.com wrote:
On 07/24/2012 04:44 PM, Alan Cox wrote:
This approach does not need any kernel support (except for the
/proc/cpuinfo filtering). Does this address the issues you have?
On 07/25/2012 04:57 AM, H. Peter Anvin wrote:
On 07/24/2012 04:09 AM, Vladimir Davydov wrote:
We have not encountered this situation in our environments and I hope we
won't :-)
But look, these CPUID functions cover the majority of CPU features, don't
they? So, most normal apps inside a VM
On 07/20/2012 09:10 PM, Andi Kleen wrote:
+ unsigned int *msr_ext_cpuid_mask)
+{
+ unsigned int msr, msr_ext;
+
+ msr = msr_ext = 0;
+
+ switch (c->x86_model) {
You have to check the family too.
+
+ return msr;
+}
+
+static void
loads,
which would probably lead to the cpuidle governor making wrong decisions due to
overestimating the system load.
So, this seems to be another reason to use a different performance
multiplier in the cpuidle governor.
On Jun 4, 2012, at 2:24 PM, Vladimir Davydov vdavy...@parallels.com wrote
Hi,
We want to propose a way to upgrade a kernel on a machine without
restarting all the user-space services. This is to be done with CRIU
project, but we need help from the kernel to preserve some data in
memory while doing kexec.
The key point of our implementation is leaving process memory
On 07/27/2013 07:41 PM, Marco Stornelli wrote:
Il 26/07/2013 14:29, Vladimir Davydov ha scritto:
Hi,
We want to propose a way to upgrade a kernel on a machine without
restarting all the user-space services. This is to be done with CRIU
project, but we need help from the kernel to preserve some
On 07/27/2013 09:37 PM, Marco Stornelli wrote:
Il 27/07/2013 19:35, Vladimir Davydov ha scritto:
On 07/27/2013 07:41 PM, Marco Stornelli wrote:
Il 26/07/2013 14:29, Vladimir Davydov ha scritto:
Hi,
We want to propose a way to upgrade a kernel on a machine without
restarting all the user
On 07/28/2013 03:02 PM, Marco Stornelli wrote:
Il 28/07/2013 12:05, Vladimir Davydov ha scritto:
On 07/27/2013 09:37 PM, Marco Stornelli wrote:
Il 27/07/2013 19:35, Vladimir Davydov ha scritto:
On 07/27/2013 07:41 PM, Marco Stornelli wrote:
Il 26/07/2013 14:29, Vladimir Davydov ha scritto
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
kernel/sched/fair.c | 56 ++
kernel/sched/sched.h |7 +++
2 files changed, 28 insertions(+), 35 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f77f9c5
On 07/15/2013 12:28 PM, Peter Zijlstra wrote:
OK, fair enough. It does somewhat rely on us getting the single
rq->clock update thing right, but that should be ok.
Frankly, I doubt that rq->clock is the right thing to use here, because
it can be updated very frequently under some conditions, so
Changes in v2:
* use jiffies instead of rq->clock for last_h_load_update.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
kernel/sched/fair.c | 58 +++---
kernel/sched/sched.h |7 +++---
2 files changed, 30 insertions(+), 35 deletions
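A minimal sketch of the jiffies-based rate limiting this change describes (field and function names are illustrative, not the exact patch):

	static void update_cfs_rq_h_load(struct cfs_rq *cfs_rq)
	{
		unsigned long now = jiffies;

		/* Refresh the hierarchical load at most once per jiffy rather than
		 * keying off rq->clock, which may be updated far more often. */
		if (cfs_rq->last_h_load_update == now)
			return;
		cfs_rq->last_h_load_update = now;

		/* ... walk up the task_group hierarchy and recompute h_load ... */
	}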
runnable tasks there instead. Fix it.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
kernel/sched/fair.c |2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9b3fe1c..13abc29 100644
--- a/kernel/sched/fair.c
+++ b/kernel
can be caught by running 2*N cpuhogs pinned to two logical cpus
belonging to different cores on an HT-enabled machine with N logical
cpus: just look at se.nr_migrations growth.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
kernel/sched/fair.c |4 ++--
1 file changed, 2 insertions
to two logical cpus
belonging to different cores on an HT-enabled machine with N logical
cpus: just look at se.nr_migrations growth.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
kernel/sched/fair.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/sched
Currently new_dst_cpu is prevented from being reselected actually, not
dst_cpu. This can result in attempting to pull tasks to this_cpu twice.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
kernel/sched/fair.c |6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git
handling 'some
pinned' case when pulling tasks from a new busiest cpu.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
kernel/sched/fair.c | 12 ++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cd59640..d840e51
On 09/16/2013 09:52 AM, Peter Zijlstra wrote:
On Sun, Sep 15, 2013 at 05:49:13PM +0400, Vladimir Davydov wrote:
In the busiest->group_imb case we can come to calculate_imbalance() with
local->avg_load >= busiest->avg_load >= sds->avg_load. This can result
in imbalance overflow, because it is calculated
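The overflow is the usual unsigned wraparound; a standalone illustration (variable names hypothetical, not the scheduler code):

	#include <stdio.h>

	int main(void)
	{
		unsigned long sds_avg_load = 100, local_avg_load = 120;

		/* With unsigned arithmetic the subtraction wraps around instead of
		 * going negative, yielding a huge bogus imbalance. */
		unsigned long diff = sds_avg_load - local_avg_load;

		printf("%lu\n", diff);	/* a value near ULONG_MAX, not -20 */
		return 0;
	}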
On 09/16/2013 09:43 AM, Peter Zijlstra wrote:
On Sun, Sep 15, 2013 at 09:30:14PM +0400, Vladimir Davydov wrote:
Firstly, reset env.dst_cpu/dst_rq to this_cpu/this_rq, because it could
have changed in the 'some pinned' case. Otherwise, should_we_balance() can
stop balancing prematurely.
Secondly
-by: Vladimir Davydov vdavy...@parallels.com
Cc: Tushar Dave tushar.n.d...@intel.com
Cc: Patrick McHardy ka...@trash.net
Cc: David S. Miller da...@davemloft.net
---
drivers/net/ethernet/intel/e1000/e1000.h |2 --
drivers/net/ethernet/intel/e1000/e1000_main.c | 36 +++--
2
moves cancel_delayed_work_sync(watchdog_task) to the beginning
of e1000_down_and_stop(), thus ensuring the race is impossible.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Tushar Dave tushar.n.d...@intel.com
Cc: Patrick McHardy ka...@trash.net
Cc: David S. Miller da...@davemloft.net
From: Glauber Costa glom...@openvz.org
I need to move this up a bit, and I am doing it in a separate patch just to
reduce churn in the patch that needs it.
Signed-off-by: Glauber Costa glom...@openvz.org
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Hugh Dickins
From: Glauber Costa glom...@openvz.org
I caught myself doing something like the following outside memcg core:
	memcg_id = -1;
	if (memcg && memcg_kmem_is_active(memcg))
		memcg_id = memcg_cache_id(memcg);
to be able to handle all possible memcgs in a sane manner. In
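A hedged sketch of the consolidation being proposed: let memcg_cache_id() sanitize its argument itself so callers can drop the boilerplate above (illustrative; the field name follows the memcg code of that era):

	int memcg_cache_id(struct mem_cgroup *memcg)
	{
		if (!memcg || !memcg_kmem_is_active(memcg))
			return -1;
		return memcg->kmemcg_id;
	}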
From: Glauber Costa glom...@openvz.org
Each caller of memcg_cache_id ends up sanitizing its parameters in its own way.
Now that the memcg_cache_id itself is more robust, we can consolidate this.
Also, as suggested by Michal, a special helper memcg_cache_idx is used when the
result is expected to
From: Glauber Costa glom...@openvz.org
When we delete kmem-enabled memcgs, they can still be zombieing
around for a while. The reason is that the objects may still be alive,
and we won't be able to delete them at destruction time.
The only entry point for that, though, are the shrinkers. The
From: Glauber Costa glom...@openvz.org
During the past weeks, it became clear to us that the shrinker interface
we have right now works very well for some particular types of users,
but not that well for others. The latter are usually people interested in
one-shot notifications, who were forced
From: Glauber Costa glom...@openvz.org
When a memcg is destroyed, it won't be immediately released until all
objects are gone. This means that if a memcg is restarted with the very
same workload - a very common case, the objects already cached won't be
billed to the new memcg. This is mostly
and inode, which seems to be too costly.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Glauber Costa glom...@openvz.org
Cc: Dave Chinner dchin...@redhat.com
Cc: Mel Gorman mgor...@suse.de
Cc: Rik van Riel r...@redhat.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
the pointer to the
appropriate list_lru object from a memcg or a kmem ptr, which should be
further operated with conventional list_lru methods.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Glauber Costa glom...@openvz.org
Cc: Dave Chinner dchin...@redhat.com
Cc: Mel Gorman mgor...@suse.de
Cc
From: Glauber Costa glom...@openvz.org
The userspace memory limit can be freely resized down. Upon attempt,
reclaim will be called to flush the pages away until we either reach the
limit we want or give up.
It wasn't possible so far with the kmem limit, since we had no way to
shrink the kmem
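A rough sketch of the resize-and-reclaim loop being introduced (res_counter_set_limit() and MEM_CGROUP_RECLAIM_RETRIES are from the memcg code of that period; try_to_free_mem_cgroup_kmem() is a hypothetical name for the kmem-only reclaimer):

	static int memcg_update_kmem_limit(struct mem_cgroup *memcg, u64 val)
	{
		int retries = MEM_CGROUP_RECLAIM_RETRIES;
		int ret;

		do {
			ret = res_counter_set_limit(&memcg->kmem, val);
			if (!ret)
				break;			/* limit lowered successfully */
			/* Push kernel objects out and retry, or give up. */
			if (!try_to_free_mem_cgroup_kmem(memcg, GFP_KERNEL))
				break;
		} while (--retries);

		return ret;
	}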
list_lru_walk(), but
shrink_dcache_sb(), which is going to be the only user of this function,
does not need it.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Glauber Costa glom...@openvz.org
Cc: Dave Chinner dchin...@redhat.com
Cc: Mel Gorman mgor...@suse.de
Cc: Rik van Riel r...@redhat.com
Cc
From: Glauber Costa glom...@openvz.org
In very low free kernel memory situations, it may be the case that we
have less objects to free than our initial batch size. If this is the
case, it is better to shrink those and open space for the new workload
than to keep them and fail the new
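In shrink_slab_node() terms, the change amounts to also entering the scan loop when fewer objects than one batch are freeable; a hedged excerpt (names follow the shrinker code of that era, not necessarily the patch):

	while (total_scan >= batch_size ||
	       total_scan >= max_pass) {	/* also scan small remainders */
		unsigned long nr_to_scan = min(batch_size, total_scan);

		shrinkctl->nr_to_scan = nr_to_scan;
		freed += shrinker->scan_objects(shrinker, shrinkctl);

		total_scan -= nr_to_scan;
		cond_resched();
	}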
From: Glauber Costa glom...@openvz.org
If the kernel limit is smaller than the user limit, we will have
situations in which our allocations fail but freeing user pages will buy
us nothing. In those, we would like to call a specialized memcg
reclaimer that only frees kernel memory and leave the
From: Glauber Costa glom...@openvz.org
Those structures are only used for memcgs that are effectively using
kmemcg. However, in a later patch I intend to scan that list
unconditionally (an empty list meaning no kmem caches are present), which
simplifies the code a lot.
So move the initialization to
From: Glauber Costa glom...@openvz.org
Without the surrounding infrastructure, this patch is a bit of a hammer:
it will basically shrink objects from all memcgs under memcg pressure.
At least, however, we will keep the scan limited to the shrinkers marked
as per-memcg.
Future patches will
From: Glauber Costa glom...@openvz.org
When reaching shrink_slab, we should descend into children memcgs searching
for objects that could be shrunk. This is true even if the memcg does
not have kmem limits on, since the kmem res_counter will also be billed
against the user res_counter of the parent.
memcg: reap dead memcgs upon global memory pressure
memcg: flush memcg items upon memcg destruction
Vladimir Davydov (3):
memcg,list_lru: add per-memcg LRU list infrastructure
memcg,list_lru: add function walking over all lists of a per-memcg
LRU
super: make icache, dcache shrinkers
Hi,
Thank you for the review. I agree with all your comments and I'll resend
the fixed version soon.
If anyone still has something to say about the patchset, I'd be glad to
hear from them.
On 11/25/2013 09:41 PM, Johannes Weiner wrote:
I ran out of steam reviewing these because there were
On 11/26/2013 10:47 AM, Vladimir Davydov wrote:
Hi,
Thank you for the review. I agree with all your comments and I'll
resend the fixed version soon.
If anyone still has something to say about the patchset, I'd be glad
to hear from them.
On 11/25/2013 09:41 PM, Johannes Weiner wrote:
I
On 11/27/2013 02:47 AM, Dave Chinner wrote:
On Tue, Nov 26, 2013 at 10:47:00AM +0400, Vladimir Davydov wrote:
Hi,
Thank you for the review. I agree with all your comments and I'll
resend the fixed version soon.
If anyone still has something to say about the patchset, I'd be glad
to hear from
-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Balbir Singh bsinghar...@gmail.com
Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
---
mm/memcontrol.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git
This function is not used outside of memcontrol.c so make it static.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Balbir Singh bsinghar...@gmail.com
Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
---
mm
On 11/29/2013 01:45 PM, Michal Hocko wrote:
On Wed 27-11-13 19:46:01, Vladimir Davydov wrote:
We should start kmem accounting for a memory cgroup only after both its
kmem limit is set (KMEM_ACCOUNTED_ACTIVE) and related call sites are
patched (KMEM_ACCOUNTED_ACTIVATED).
This should be vice
to early kmem creation.
Signed-off-by: Glauber Costa glom...@openvz.org
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Balbir Singh bsinghar...@gmail.com
Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
---
mm
the nr_deferred per-shrinker counter to avoid
memory cgroup isolation issues. Ideally, this counter should be made
per-memcg.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Dave Chinner dchin...@redhat.com
Cc: Andrew Morton
to be used directly as an array index, to make sure we never
access a negative index.
Signed-off-by: Glauber Costa glom...@openvz.org
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Balbir Singh bsinghar
to shrink_zones(). So let's move
shrink_control initialization to shrink_zones().
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Andrew Morton a...@linux-foundation.org
Cc: Mel Gorman mgor...@suse.de
Cc: Rik van Riel
the
limit we want or give up.
Signed-off-by: Glauber Costa glom...@openvz.org
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Balbir Singh bsinghar...@gmail.com
Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
---
mm
From: Glauber Costa glom...@openvz.org
During the past weeks, it became clear to us that the shrinker interface
we have right now works very well for some particular types of users,
but not that well for others. The latter are usually people interested in
one-shot notifications, that were forced
, we
have no option but to fail all GFP_NOFS allocations when we are
close to the kmem limit. The best thing we can do in such a situation is
to spawn the reclaimer in a background process, hoping the next allocations
will succeed.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc
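A hedged sketch of deferring reclaim to a worker when the GFP_NOFS context cannot reclaim itself (the work item field and the kmem-only reclaimer name are hypothetical):

	static void memcg_kmem_reclaim_func(struct work_struct *work)
	{
		struct mem_cgroup *memcg = container_of(work, struct mem_cgroup,
							kmem_reclaim_work);

		/* Shrink kernel objects charged to @memcg until we drop below
		 * the kmem limit or nothing more can be freed. */
		try_to_free_mem_cgroup_kmem(memcg, GFP_KERNEL);
	}

	/* Called on a failed GFP_NOFS charge close to the limit: */
	static void memcg_kmem_defer_reclaim(struct mem_cgroup *memcg)
	{
		schedule_work(&memcg->kmem_reclaim_work);
	}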
the pointer to the
appropriate list_lru object from a memcg or a kmem ptr, which should be
further operated with conventional list_lru methods.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Dave Chinner dchin...@redhat.com
-aware shrinker.
I would appreciate it if you could look at the new version and share your
thoughts on it.
Thank you.
On 12/02/2013 03:19 PM, Vladimir Davydov wrote:
Hi,
This is the 12th iteration of Glauber Costa's patchset implementing targeted
shrinking for memory cgroups when kmem
list_lru_walk(), but
shrink_dcache_sb(), which is going to be the only user of this function,
does not need it.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Dave Chinner dchin...@redhat.com
Cc: Andrew Morton a...@linux
and inode, which seems to be too costly.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Dave Chinner dchin...@redhat.com
Cc: Andrew Morton a...@linux-foundation.org
Cc: Al Viro v...@zeniv.linux.org.uk
Cc: Balbir Singh
assume that a memcg that goes away most likely indicates an
isolated workload that is terminated.
Signed-off-by: Glauber Costa glom...@openvz.org
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Balbir Singh bsinghar
memcg: reap dead memcgs upon global memory pressure
memcg: flush memcg items upon memcg destruction
Vladimir Davydov (11):
memcg: move several kmemcg functions upper
fs: do not use destroy_super() in alloc_super() fail path
vmscan: rename shrink_slab() args to make it more generic
vmscan
From: Glauber Costa glom...@openvz.org
When we delete kmem-enabled memcgs, they can still be zombieing
around for a while. The reason is that the objects may still be alive,
and we won't be able to delete them at destruction time.
The only entry point for that, though, are the shrinkers. The
with the direct reclaim case for memcg.
Although this same technique can be applied to other situations just as
well, we will start conservative and apply it for that case, which is
the one that matters the most.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han
, we will have to make up phony values for
nr_pages_scanned and lru_pages again when doing kmem-only reclaim for a
memory cgroup, which is possible if the cgroup has its kmem limit less
than the total memory limit.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han
.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Balbir Singh bsinghar...@gmail.com
Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
---
mm/memcontrol.c | 92
This reduces the indentation level of do_try_to_free_pages() and removes
extra loop over all eligible zones counting the number of on-LRU pages.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Andrew Morton
memcg_cache_id and make sure it always returns a meaningful value.
Signed-off-by: Glauber Costa glom...@openvz.org
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Balbir Singh bsinghar...@gmail.com
Cc: KAMEZAWA Hiroyuki
inline appropriate snippets from destroy_super() to alloc_super() fail
path instead of using the whole function there.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Al Viro v...@zeniv.linux.org.uk
---
fs/super.c |9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff
() is called under the set_limit_mutex, but the
leftover from the above-mentioned commit is still here. Let's remove it.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Johannes Weiner han...@cmpxchg.org
Cc: Michal Hocko mho...@suse.cz
Cc: Balbir Singh bsinghar...@gmail.com
Cc: KAMEZAWA Hiroyuki
On 12/02/2013 10:26 PM, Glauber Costa wrote:
On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote:
[CCing Glauber - please do so in other posts for kmem related changes]
On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
The KMEM_ACCOUNTED_ACTIVATED was introduced by commit
On 12/02/2013 10:15 PM, Michal Hocko wrote:
[CCing Glauber - please do so in other posts for kmem related changes]
On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg:
use static branches when code not in use) in order
On 12/03/2013 11:56 AM, Glauber Costa wrote:
On Mon, Dec 2, 2013 at 11:21 PM, Vladimir Davydov
vdavy...@parallels.com wrote:
On 12/02/2013 10:26 PM, Glauber Costa wrote:
On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote:
[CCing Glauber - please do so in other posts for kmem
On 12/03/2013 01:00 PM, Dave Chinner wrote:
On Mon, Dec 02, 2013 at 03:19:40PM +0400, Vladimir Davydov wrote:
Using destroy_super() in the alloc_super() fail path is bad, because:
* It will trigger WARN_ON(!list_empty(&s->s_mounts)) since s_mounts is
initialized only after several 'goto fail's.
So
On 12/03/2013 01:33 PM, Dave Chinner wrote:
kmemcg reclaim is introduced, we will have to make up phony values for
nr_pages_scanned and lru_pages again when doing kmem-only reclaim for a
memory cgroup, which is possible if the cgroup has its kmem limit less
than the total memory limit.
I'm
On 12/03/2013 02:48 PM, Dave Chinner wrote:
@@ -236,11 +236,17 @@ shrink_slab_node(struct shrink_control *shrinkctl,
struct shrinker *shrinker,
return 0;
/*
- * copy the current shrinker scan count into a local variable
- * and zero it so that other concurrent
On 12/03/2013 03:18 PM, Dave Chinner wrote:
On Mon, Dec 02, 2013 at 03:19:45PM +0400, Vladimir Davydov wrote:
FS-shrinkers, which shrink dcaches and icaches, keep dentries and inodes
in list_lru structures in order to evict least recently used objects.
With per-memcg kmem shrinking
On 12/03/2013 03:45 PM, Dave Chinner wrote:
On Mon, Dec 02, 2013 at 03:19:47PM +0400, Vladimir Davydov wrote:
Using the per-memcg LRU infrastructure introduced by previous patches,
this patch makes dcache and icache shrinkers memcg-aware. To achieve
that, it converts s_dentry_lru
On 12/03/2013 05:37 PM, Al Viro wrote:
On Tue, Dec 03, 2013 at 01:23:01PM +0400, Vladimir Davydov wrote:
Actually, I'm not going to modify the list_lru structure, because I
think it's good as it is. I'd like to substitute it with a new
structure, memcg_list_lru, only in those places where
On 12/04/2013 08:51 AM, Dave Chinner wrote:
On Tue, Dec 03, 2013 at 04:15:57PM +0400, Vladimir Davydov wrote:
On 12/03/2013 02:48 PM, Dave Chinner wrote:
@@ -236,11 +236,17 @@ shrink_slab_node(struct shrink_control *shrinkctl,
struct shrinker *shrinker,
return 0
On 12/04/2013 02:38 AM, Glauber Costa wrote:
In memcg_update_kmem_limit() we do the whole process of limit
initialization under a mutex so the situation we need protection from in
tcp_update_limit() is impossible. BTW once set, the 'activated' flag is
never cleared and never checked alone,
On the fail path, alloc_super() calls destroy_super(), which issues a warning
if list_empty() returns false for the s_mounts field. That said, s_mounts
should be initialized in alloc_super() before any possible failure.
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Cc: Al Viro v
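A minimal sketch of the ordering fix: initialize the list head before the first failure point, so destroy_super() never sees an uninitialized s_mounts (an illustrative excerpt, not the full alloc_super()):

	static struct super_block *alloc_super(struct file_system_type *type, int flags)
	{
		struct super_block *s = kzalloc(sizeof(struct super_block), GFP_USER);

		if (!s)
			return NULL;

		INIT_LIST_HEAD(&s->s_mounts);	/* before any possible failure */

		if (security_sb_alloc(s))
			goto fail;
		/* ... further initialization that may 'goto fail' ... */
		return s;

	fail:
		destroy_super(s);
		return NULL;
	}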