Re: [PATCH RFC] sched: boost throttled entities on wakeups

2012-10-19 Thread Vladimir Davydov
Thank you for the answer. On Oct 19, 2012, at 6:24 PM, Peter Zijlstra wrote: its a quick hack similar to existing hacks done for rt, preferably we'd do smarter things though. If you have any ideas how to fix this in a better way, please share.

[PATCH RFC] sched: boost throttled entities on wakeups

2012-10-18 Thread Vladimir Davydov
If several tasks in different cpu cgroups are contending for the same resource (e.g. a semaphore) and one of those task groups is cpu limited (using cfs bandwidth control), the priority inversion problem is likely to arise: if a cpu limited task goes to sleep holding the resource (e.g. trying to

Re: [Devel] [PATCH RFC] sched: boost throttled entities on wakeups

2012-10-18 Thread Vladimir Davydov
There is an error in the test script: I forgot to initialize cpuset.mems of test cgroups - without it it is impossible to add a task into a cpuset cgroup. Sorry for that. Fixed version of the test script is attached. On Oct 18, 2012, at 11:32 AM, Vladimir Davydov wrote: If several tasks

[PATCH] netfilter: nf_conntrack: Batch cleanup

2013-03-14 Thread Vladimir Davydov
# modprobe nf_conntrack # time modprobe -r nf_conntrack real 0m10.337s user 0m0.000s sys 0m0.376s with the patch # modprobe nf_conntrack # time modprobe -r nf_conntrack real 0m5.661s user 0m0.000s sys 0m0.216s Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Patrick

[PATCH] mqueue: sys_mq_open: do not call mnt_drop_write() if read-only

2013-03-19 Thread Vladimir Davydov
mnt_drop_write() must be called only if mnt_want_write() succeeded, otherwise the mnt_writers counter will diverge. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Doug Ledford dledf...@redhat.com Cc: Andrew Morton a...@linux-foundation.org Cc: KOSAKI Motohiro kosaki.motoh
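The pairing rule described above can be sketched in user space. This is only a model under stated assumptions: `struct mount_model`, `want_write()`, `drop_write()` and `open_queue()` are hypothetical stand-ins and not the kernel's mnt_want_write()/mnt_drop_write() implementation. It shows why the drop has to be conditional on the earlier call having succeeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mount: a writer count and a read-only flag. */
struct mount_model {
    int writers;
    bool read_only;
};

/* Models the "want write" step: takes a writer reference unless read-only. */
static int want_write(struct mount_model *m)
{
    if (m->read_only)
        return -1;
    m->writers++;
    return 0;
}

/* Models the "drop write" step: releases a reference that must exist. */
static void drop_write(struct mount_model *m)
{
    assert(m->writers > 0); /* an unpaired drop would make the count diverge */
    m->writers--;
}

/* Open path: the drop happens only if the earlier want succeeded. */
static int open_queue(struct mount_model *m)
{
    int ro = want_write(m); /* nonzero means no write reference was taken */

    /* ... look up or create the queue here ... */

    if (!ro)
        drop_write(m);
    return ro;
}
```

With an unconditional drop_write() at the end of open_queue(), a read-only mount would trip the assertion, which is the model's version of the counter diverging.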

Re: [PATCH] mqueue: sys_mq_open: do not call mnt_drop_write() if read-only

2013-03-19 Thread Vladimir Davydov
On Mar 20, 2013, at 1:09 AM, Andrew Morton a...@linux-foundation.org wrote: On Tue, 19 Mar 2013 13:31:18 +0400 Vladimir Davydov vdavy...@parallels.com wrote: mnt_drop_write() must be called only if mnt_want_write() succeeded, otherwise the mnt_writers counter will diverge

[PATCH] sched: initialize runtime to non-zero on cfs bw set

2013-02-07 Thread Vladimir Davydov
above. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/sched/core.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 26058d0..c7a078f 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -7686,7

Re: [PATCH] sched: initialize runtime to non-zero on cfs bw set

2013-02-08 Thread Vladimir Davydov
On Feb 8, 2013, at 6:46 PM, Paul Turner p...@google.com wrote: On Fri, Feb 08, 2013 at 11:10:46AM +0400, Vladimir Davydov wrote: If cfs_rq->runtime_remaining is <= 0 then either - cfs_rq is throttled and waiting for quota redistribution, or - cfs_rq is currently executing and will be throttled

Re: [PATCH] sched: initialize runtime to non-zero on cfs bw set

2013-02-08 Thread Vladimir Davydov
On Feb 8, 2013, at 7:26 PM, Vladimir Davydov vdavy...@parallels.com wrote: On Feb 8, 2013, at 6:46 PM, Paul Turner p...@google.com wrote: On Fri, Feb 08, 2013 at 11:10:46AM +0400, Vladimir Davydov wrote: If cfs_rq->runtime_remaining is <= 0 then either - cfs_rq is throttled and waiting

[PATCH 2/2] block: account iowait time when waiting for completion of IO request

2013-02-14 Thread Vladimir Davydov
. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- block/blk-exec.c |4 ++-- block/blk-flush.c |2 +- block/blk-lib.c |6 +++--- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/block/blk-exec.c b/block/blk-exec.c index 74638ec..f634de7 100644 --- a/block/blk

[PATCH 1/2] sched: add wait_for_completion_io[_timeout]

2013-02-14 Thread Vladimir Davydov
accounting when the completion struct is actually used for waiting for IO (e.g. completion of a bio request in the block layer). Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include/linux/completion.h |3 ++ kernel/sched/core.c| 57

[PATCH] net: batch nf_conntrack_net_exit

2012-07-30 Thread Vladimir Davydov
The patch introduces nf_conntrack_cleanup_list(), which cleans up nf_conntracks for a list of netns and calls synchronize_net() only once for them all. --- include/net/netfilter/nf_conntrack_core.h | 10 +- net/netfilter/nf_conntrack_core.c | 21 +

[PATCH 1/2] cpu: common: make clearcpuid option take bits list

2012-07-20 Thread Vladimir Davydov
It is more convenient to write 'clearcpuid=147,148,...' than 'clearcpuid=147 clearcpuid=148 ...' --- arch/x86/kernel/cpu/common.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 6b9333b..8ffe1b9
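As an illustration of the comma-separated form, here is a minimal user-space sketch of parsing a bit list such as '147,148' into a bitmap. The function names `parse_bit_list()` and `bit_is_set()` and the `NBITS` limit are assumptions made for this example; the snippet above is cut off before the diff body, so this is not the code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define NBITS 320 /* hypothetical upper bound on bit numbers */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NWORDS ((NBITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Parse "N,M,..." and set the matching bits; returns 0 on success, -1 on bad input. */
static int parse_bit_list(const char *arg, unsigned long *mask)
{
    const char *p = arg;

    assert(arg != NULL && mask != NULL);
    while (*p != '\0') {
        char *end;
        unsigned long bit;

        errno = 0;
        bit = strtoul(p, &end, 10);
        if (end == p || errno != 0 || bit >= NBITS)
            return -1; /* not a number, or out of range */
        mask[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);

        if (*end == ',')
            p = end + 1; /* move past the separator */
        else if (*end == '\0')
            p = end;     /* end of the list */
        else
            return -1;   /* unexpected character after a number */
    }
    return 0;
}

/* Report whether a given bit is set in the bitmap. */
static int bit_is_set(const unsigned long *mask, unsigned long bit)
{
    return (mask[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}
```

A caller would zero an array of `NWORDS` words, pass it in with the option string, and then test individual bits with `bit_is_set()`.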

[PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-20 Thread Vladimir Davydov
If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be reported in /proc/cpuinfo and used by the kernel. However, if a userspace process checks CPU features directly using the cpuid instruction, it will be reported about all features supported by the CPU irrespective of what

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-20 Thread Vladimir Davydov
On Jul 20, 2012, at 9:20 PM, H. Peter Anvin wrote: On 07/20/2012 09:37 AM, Vladimir Davydov wrote: If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be reported in /proc/cpuinfo and used by the kernel. However, if a userspace process checks CPU features directly using

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-20 Thread Vladimir Davydov
On Jul 21, 2012, at 12:19 AM, H. Peter Anvin wrote: On 07/20/2012 11:21 AM, Vladimir Davydov wrote: I am a bit concerned about this patch: 1. it silently changes existing behavior. Yes, but who needs the current implementation of 'clearcpuid' which, in fact, just hides flags in /proc

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-24 Thread Vladimir Davydov
On 07/21/2012 02:37 PM, Borislav Petkov wrote: (+ Andre who's been doing some cross vendor stuff) On Fri, Jul 20, 2012 at 08:37:33PM +0400, Vladimir Davydov wrote: If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be reported in /proc/cpuinfo and used by the kernel. However

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-24 Thread Vladimir Davydov
On 07/24/2012 12:14 PM, Andre Przywara wrote: On 07/24/2012 09:06 AM, Vladimir Davydov wrote: On 07/21/2012 02:37 PM, Borislav Petkov wrote: (+ Andre who's been doing some cross vendor stuff) On Fri, Jul 20, 2012 at 08:37:33PM +0400, Vladimir Davydov wrote: If 'clearcpuid=N' is specified

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-24 Thread Vladimir Davydov
On 07/24/2012 02:10 PM, Borislav Petkov wrote: On Tue, Jul 24, 2012 at 12:29:19PM +0400, Vladimir Davydov wrote: I guess that when the more advanced features become widely-used, vendors will offer new MSRs and/or CPUID faulting. And this right there is the dealbreaker: So what are you doing

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 04:57 AM, H. Peter Anvin wrote: On 07/24/2012 04:09 AM, Vladimir Davydov wrote: We have not encountered this situation in our environments and I hope we won't :-) But look, these CPUID functions cover majority of CPU features, don't they? So, most of normal apps inside VM

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/24/2012 04:34 PM, Andre Przywara wrote: On 07/24/2012 01:09 PM, Vladimir Davydov wrote: On 07/24/2012 02:10 PM, Borislav Petkov wrote: On Tue, Jul 24, 2012 at 12:29:19PM +0400, Vladimir Davydov wrote: I guess that when the more advanced features become widely-used, vendors will offer

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/24/2012 04:44 PM, Alan Cox wrote: This approach does not need any kernel support (except for the /proc/cpuinfo filtering). Does this address the issues you have? You can do the /proc/cpuinfo filtering in user space too How?

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 02:58 PM, Andre Przywara wrote: On 07/25/2012 12:31 PM, Vladimir Davydov wrote: On 07/24/2012 04:44 PM, Alan Cox wrote: This approach does not need any kernel support (except for the /proc/cpuinfo filtering). Does this address the issues you have? You can do the /proc/cpuinfo

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 02:43 PM, Borislav Petkov wrote: On Wed, Jul 25, 2012 at 02:31:23PM +0400, Vladimir Davydov wrote: So, you prefer adding some filtering of /proc/cpuinfo into the mainstream kernel That's already there right? And your 1/2 patch was making toggling those bits easier. (not now

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 03:17 PM, Andre Przywara wrote: On 07/25/2012 01:02 PM, Vladimir Davydov wrote: On 07/25/2012 02:58 PM, Andre Przywara wrote: On 07/25/2012 12:31 PM, Vladimir Davydov wrote: On 07/24/2012 04:44 PM, Alan Cox wrote: This approach does not need any kernel support (except

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 03:31 PM, Alan Cox wrote: On Wed, 25 Jul 2012 14:31:30 +0400 Vladimir Davydov vdavy...@parallels.com wrote: On 07/24/2012 04:44 PM, Alan Cox wrote: This approach does not need any kernel support (except for the /proc/cpuinfo filtering). Does this address the issues you have?

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 04:57 AM, H. Peter Anvin wrote: On 07/24/2012 04:09 AM, Vladimir Davydov wrote: We have not encountered this situation in our environments and I hope we won't :-) But look, these CPUID functions cover majority of CPU features, don't they? So, most of normal apps inside VM

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/20/2012 09:10 PM, Andi Kleen wrote: + unsigned int *msr_ext_cpuid_mask) +{ + unsigned int msr, msr_ext; + + msr = msr_ext = 0; + + switch (c->x86_model) { You have to check the family too. + + return msr; +} + +static void

Re: [PATCH] cpuidle: menu: use nr_running instead of cpuload for calculating perf mult

2012-11-27 Thread Vladimir Davydov
loads, which would probably lead to the cpuidle governor making wrong decisions due to overestimating the system load. So, this seems to be another reason to use some different performance multiplier in cpuidle governor. On Jun 4, 2012, at 2:24 PM, Vladimir Davydov vdavy...@parallels.com wrote

[PATCH RFC] pram: persistent over-kexec memory file system

2013-07-26 Thread Vladimir Davydov
Hi, We want to propose a way to upgrade a kernel on a machine without restarting all the user-space services. This is to be done with CRIU project, but we need help from the kernel to preserve some data in memory while doing kexec. The key point of our implementation is leaving process memory

Re: [PATCH RFC] pram: persistent over-kexec memory file system

2013-07-27 Thread Vladimir Davydov
On 07/27/2013 07:41 PM, Marco Stornelli wrote: Il 26/07/2013 14:29, Vladimir Davydov ha scritto: Hi, We want to propose a way to upgrade a kernel on a machine without restarting all the user-space services. This is to be done with CRIU project, but we need help from the kernel to preserve some

Re: [PATCH RFC] pram: persistent over-kexec memory file system

2013-07-28 Thread Vladimir Davydov
On 07/27/2013 09:37 PM, Marco Stornelli wrote: Il 27/07/2013 19:35, Vladimir Davydov ha scritto: On 07/27/2013 07:41 PM, Marco Stornelli wrote: Il 26/07/2013 14:29, Vladimir Davydov ha scritto: Hi, We want to propose a way to upgrade a kernel on a machine without restarting all the user

Re: [PATCH RFC] pram: persistent over-kexec memory file system

2013-07-28 Thread Vladimir Davydov
On 07/28/2013 03:02 PM, Marco Stornelli wrote: Il 28/07/2013 12:05, Vladimir Davydov ha scritto: On 07/27/2013 09:37 PM, Marco Stornelli wrote: Il 27/07/2013 19:35, Vladimir Davydov ha scritto: On 07/27/2013 07:41 PM, Marco Stornelli wrote: Il 26/07/2013 14:29, Vladimir Davydov ha scritto

[PATCH RFC] sched: move h_load calculation to task_h_load

2013-07-13 Thread Vladimir Davydov
Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/sched/fair.c | 56 ++ kernel/sched/sched.h |7 +++ 2 files changed, 28 insertions(+), 35 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f77f9c5

Re: [PATCH RFC] sched: move h_load calculation to task_h_load

2013-07-15 Thread Vladimir Davydov
On 07/15/2013 12:28 PM, Peter Zijlstra wrote: OK, fair enough. It does somewhat rely on us getting the single rq->clock update thing right, but that should be ok. Frankly, I doubt that rq->clock is the right thing to use here, because it can be updated very frequently under some conditions, so

[PATCH v2] sched: move h_load calculation to task_h_load

2013-07-15 Thread Vladimir Davydov
Changes in v2: * use jiffies instead of rq->clock for last_h_load_update. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/sched/fair.c | 58 +++--- kernel/sched/sched.h |7 +++--- 2 files changed, 30 insertions(+), 35 deletions

[PATCH] sched: Fix task_h_load calculation

2013-09-14 Thread Vladimir Davydov
runnable tasks there instead. Fix it. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/sched/fair.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 9b3fe1c..13abc29 100644 --- a/kernel/sched/fair.c +++ b/kernel

[PATCH 2/2] sched: fix_small_imbalance: Fix local->avg_load > busiest->avg_load case

2013-09-15 Thread Vladimir Davydov
can be caught by running 2*N cpuhogs pinned to two logical cpus belonging to different cores on an HT-enabled machine with N logical cpus: just look at se.nr_migrations growth. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/sched/fair.c |4 ++-- 1 file changed, 2 insertions

[PATCH 1/2] sched: calculate_imbalance: Fix local->avg_load > sds->avg_load case

2013-09-15 Thread Vladimir Davydov
to two logical cpus belonging to different cores on an HT-enabled machine with N logical cpus: just look at se.nr_migrations growth. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/sched/fair.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/sched

[PATCH 1/2] sched: load_balance: Prevent reselect prev dst_cpu if some pinned

2013-09-15 Thread Vladimir Davydov
Currently new_dst_cpu is prevented from being reselected actually, not dst_cpu. This can result in attempting to pull tasks to this_cpu twice. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/sched/fair.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git

[PATCH 2/2] sched: load_balance: Reset env when going to redo due to all pinned

2013-09-15 Thread Vladimir Davydov
handling 'some pinned' case when pulling tasks from a new busiest cpu. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/sched/fair.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index cd59640..d840e51

Re: [PATCH 1/2] sched: calculate_imbalance: Fix local->avg_load > sds->avg_load case

2013-09-16 Thread Vladimir Davydov
On 09/16/2013 09:52 AM, Peter Zijlstra wrote: On Sun, Sep 15, 2013 at 05:49:13PM +0400, Vladimir Davydov wrote: In busiest->group_imb case we can come to calculate_imbalance() with local->avg_load >= busiest->avg_load >= sds->avg_load. This can result in imbalance overflow, because it is calculated
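The message above is cut off before the actual formula, so the following is only a generic illustration of the failure class it names, not the scheduler's calculation. It assumes the imbalance involves an unsigned difference of the form "average load minus local load"; the helper names `naive_gap()`, `guarded_gap()` and `check_gaps()` are hypothetical. When the local load is not below the average, the naive subtraction wraps around to a huge value.

```c
#include <assert.h>
#include <limits.h>

/* Naive difference: wraps around when local exceeds avg. */
static unsigned long naive_gap(unsigned long avg, unsigned long local)
{
    return avg - local;
}

/* Guarded difference: a local load at or above the average leaves no gap. */
static unsigned long guarded_gap(unsigned long avg, unsigned long local)
{
    return local >= avg ? 0 : avg - local;
}

/* Small self-check showing the two behaviours side by side. */
static void check_gaps(void)
{
    /* 100 - 120 in unsigned arithmetic is ULONG_MAX - 19, not -20. */
    assert(naive_gap(100, 120) == ULONG_MAX - 19);
    assert(guarded_gap(100, 120) == 0);
    assert(guarded_gap(120, 100) == 20);
}
```

Whether the real fix guards the subtraction or returns early is not visible in the truncated snippet; the sketch only shows why the ordering of the operands matters.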

Re: [PATCH 2/2] sched: load_balance: Reset env when going to redo due to all pinned

2013-09-16 Thread Vladimir Davydov
On 09/16/2013 09:43 AM, Peter Zijlstra wrote: On Sun, Sep 15, 2013 at 09:30:14PM +0400, Vladimir Davydov wrote: Firstly, reset env.dst_cpu/dst_rq to this_cpu/this_rq, because it could have changed in 'some pinned' case. Otherwise, should_we_balance() can stop balancing beforehand. Secondly

[PATCH 1/2] e1000: fix lockdep warning in e1000_reset_task

2013-11-22 Thread Vladimir Davydov
-by: Vladimir Davydov vdavy...@parallels.com Cc: Tushar Dave tushar.n.d...@intel.com Cc: Patrick McHardy ka...@trash.net Cc: David S. Miller da...@davemloft.net --- drivers/net/ethernet/intel/e1000/e1000.h |2 -- drivers/net/ethernet/intel/e1000/e1000_main.c | 36 +++-- 2

[PATCH 2/2] e1000: fix possible reset_task running after adapter down

2013-11-22 Thread Vladimir Davydov
moves cancel_delayed_work_sync(watchdog_task) at the beginning of e1000_down_and_stop() thus ensuring the race is impossible. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Tushar Dave tushar.n.d...@intel.com Cc: Patrick McHardy ka...@trash.net Cc: David S. Miller da...@davemloft.net

[PATCH v11 05/15] memcg: move stop and resume accounting functions

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org I need to move this up a bit, and I am doing it in a separate patch just to reduce churn in the patch that needs it. Signed-off-by: Glauber Costa glom...@openvz.org Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Hugh Dickins

[PATCH v11 01/15] memcg: make cache index determination more robust

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org I caught myself doing something like the following outside memcg core: memcg_id = -1; if (memcg && memcg_kmem_is_active(memcg)) memcg_id = memcg_cache_id(memcg); to be able to handle all possible memcgs in a sane manner. In

[PATCH v11 02/15] memcg: consolidate callers of memcg_cache_id

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org Each caller of memcg_cache_id ends up sanitizing its parameters in its own way. Now that the memcg_cache_id itself is more robust, we can consolidate this. Also, as suggested by Michal, a special helper memcg_cache_idx is used when the result is expected to
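The idea of a lookup that sanitizes its own argument, plus a stricter helper for array indexing, can be sketched as follows. This is a user-space model under assumptions: `struct memcg_model`, `cache_id()` and `cache_idx()` are hypothetical names chosen for the example and do not reproduce the kernel's memcg_cache_id or memcg_cache_idx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup with kmem accounting state. */
struct memcg_model {
    bool kmem_active;
    int kmemcg_id;
};

/* Robust lookup: every caller gets -1 for a missing or inactive group. */
static int cache_id(const struct memcg_model *memcg)
{
    if (memcg == NULL || !memcg->kmem_active)
        return -1;
    return memcg->kmemcg_id;
}

/* Strict variant for array indexing: the id must be non-negative here. */
static size_t cache_idx(const struct memcg_model *memcg)
{
    int id = cache_id(memcg);

    assert(id >= 0); /* never index an array with a negative id */
    return (size_t)id;
}
```

Callers that can tolerate "no cache" use `cache_id()` and check for -1; callers that are about to index an array use `cache_idx()` so a bad id is caught at the call site.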

[PATCH v11 14/15] memcg: reap dead memcgs upon global memory pressure

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org When we delete kmem-enabled memcgs, they can still be zombieing around for a while. The reason is that the objects may still be alive, and we won't be able to delete them at destruction time. The only entry point for that, though, are the shrinkers. The

[PATCH v11 13/15] vmpressure: in-kernel notifications

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org During the past weeks, it became clear to us that the shrinker interface we have right now works very well for some particular types of users, but not that well for others. The latter are usually people interested in one-shot notifications, that were forced

[PATCH v11 15/15] memcg: flush memcg items upon memcg destruction

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org When a memcg is destroyed, it won't be immediately released until all objects are gone. This means that if a memcg is restarted with the very same workload - a very common case, the objects already cached won't be billed to the new memcg. This is mostly

[PATCH v11 11/15] super: make icache, dcache shrinkers memcg-aware

2013-11-25 Thread Vladimir Davydov
and inode, which seems to be too costly. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Glauber Costa glom...@openvz.org Cc: Dave Chinner dchin...@redhat.com Cc: Mel Gorman mgor...@suse.de Cc: Rik van Riel r...@redhat.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz

[PATCH v11 09/15] memcg,list_lru: add per-memcg LRU list infrastructure

2013-11-25 Thread Vladimir Davydov
the pointer to the appropriate list_lru object from a memcg or a kmem ptr, which should be further operated with conventional list_lru methods. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Glauber Costa glom...@openvz.org Cc: Dave Chinner dchin...@redhat.com Cc: Mel Gorman mgor...@suse.de Cc

[PATCH v11 12/15] memcg: allow kmem limit to be resized down

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org The userspace memory limit can be freely resized down. Upon attempt, reclaim will be called to flush the pages away until we either reach the limit we want or give up. It wasn't possible so far with the kmem limit, since we had no way to shrink the kmem

[PATCH v11 10/15] memcg,list_lru: add function walking over all lists of a per-memcg LRU

2013-11-25 Thread Vladimir Davydov
list_lru_walk(), but shrink_dcache_sb(), which is going to be the only user of this function, does not need it. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Glauber Costa glom...@openvz.org Cc: Dave Chinner dchin...@redhat.com Cc: Mel Gorman mgor...@suse.de Cc: Rik van Riel r...@redhat.com Cc

[PATCH v11 08/15] vmscan: take at least one pass with shrinkers

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org In very low free kernel memory situations, it may be the case that we have fewer objects to free than our initial batch size. If this is the case, it is better to shrink those, and open space for the new workload than to keep them and fail the new

[PATCH v11 06/15] memcg: per-memcg kmem shrinking

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org If the kernel limit is smaller than the user limit, we will have situations in which our allocations fail but freeing user pages will buy us nothing. In those, we would like to call a specialized memcg reclaimer that only frees kernel memory and leave the

[PATCH v11 04/15] memcg: move initialization to memcg creation

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org Those structures are only used for memcgs that are effectively using kmemcg. However, in a later patch I intend to scan that list unconditionally (list empty meaning no kmem caches present), which simplifies the code a lot. So move the initialization to

[PATCH v11 03/15] vmscan: also shrink slab in memcg pressure

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org Without the surrounding infrastructure, this patch is a bit of a hammer: it will basically shrink objects from all memcgs under memcg pressure. At least, however, we will keep the scan limited to the shrinkers marked as per-memcg. Future patches will

[PATCH v11 07/15] memcg: scan cache objects hierarchically

2013-11-25 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org When reaching shrink_slab, we should descend into children memcgs searching for objects that could be shrunk. This is true even if the memcg does not have kmem limits on, since the kmem res_counter will also be billed against the user res_counter of the parent.

[PATCH v11 00/15] kmemcg shrinkers

2013-11-25 Thread Vladimir Davydov
memcg: reap dead memcgs upon global memory pressure memcg: flush memcg items upon memcg destruction Vladimir Davydov (3): memcg,list_lru: add per-memcg LRU list infrastructure memcg,list_lru: add function walking over all lists of a per-memcg LRU super: make icache, dcache shrinkers

Re: [PATCH v11 00/15] kmemcg shrinkers

2013-11-25 Thread Vladimir Davydov
Hi, Thank you for the review. I agree with all your comments and I'll resend the fixed version soon. If anyone still has something to say about the patchset, I'd be glad to hear from them. On 11/25/2013 09:41 PM, Johannes Weiner wrote: I ran out of steam reviewing these because there were

Re: [Devel] [PATCH v11 00/15] kmemcg shrinkers

2013-11-26 Thread Vladimir Davydov
On 11/26/2013 10:47 AM, Vladimir Davydov wrote: Hi, Thank you for the review. I agree with all your comments and I'll resend the fixed version soon. If anyone still has something to say about the patchset, I'd be glad to hear from them. On 11/25/2013 09:41 PM, Johannes Weiner wrote: I

Re: [PATCH v11 00/15] kmemcg shrinkers

2013-11-26 Thread Vladimir Davydov
On 11/27/2013 02:47 AM, Dave Chinner wrote: On Tue, Nov 26, 2013 at 10:47:00AM +0400, Vladimir Davydov wrote: Hi, Thank you for the review. I agree with all your comments and I'll resend the fixed version soon. If anyone still has something to say about the patchset, I'd be glad to hear from

[PATCH] memcg: fix kmem_account_flags check in memcg_can_account_kmem()

2013-11-27 Thread Vladimir Davydov
-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Balbir Singh bsinghar...@gmail.com Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com --- mm/memcontrol.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git

[PATCH] memcg: make memcg_update_cache_sizes() static

2013-11-27 Thread Vladimir Davydov
This function is not used outside of memcontrol.c so make it static. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Balbir Singh bsinghar...@gmail.com Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com --- mm

Re: [PATCH] memcg: fix kmem_account_flags check in memcg_can_account_kmem()

2013-11-29 Thread Vladimir Davydov
On 11/29/2013 01:45 PM, Michal Hocko wrote: On Wed 27-11-13 19:46:01, Vladimir Davydov wrote: We should start kmem accounting for a memory cgroup only after both its kmem limit is set (KMEM_ACCOUNTED_ACTIVE) and related call sites are patched (KMEM_ACCOUNTED_ACTIVATED). This should be vice

[PATCH v12 03/18] memcg: move initialization to memcg creation

2013-12-02 Thread Vladimir Davydov
to early kmem creation. Signed-off-by: Glauber Costa glom...@openvz.org Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Balbir Singh bsinghar...@gmail.com Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com --- mm

[PATCH v12 09/18] vmscan: shrink slab on memcg pressure

2013-12-02 Thread Vladimir Davydov
the nr_deferred per-shrinker counter to avoid memory cgroup isolation issues. Ideally, this counter should be made per-memcg. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Dave Chinner dchin...@redhat.com Cc: Andrew Morton

[PATCH v12 02/18] memcg: consolidate callers of memcg_cache_id

2013-12-02 Thread Vladimir Davydov
to be used directly as an array index to make sure we never access a negative index. Signed-off-by: Glauber Costa glom...@openvz.org Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Balbir Singh bsinghar

[PATCH v12 08/18] vmscan: do_try_to_free_pages(): remove shrink_control argument

2013-12-02 Thread Vladimir Davydov
to shrink_zones(). So let's move shrink_control initialization to shrink_zones(). Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Andrew Morton a...@linux-foundation.org Cc: Mel Gorman mgor...@suse.de Cc: Rik van Riel

[PATCH v12 15/18] memcg: allow kmem limit to be resized down

2013-12-02 Thread Vladimir Davydov
the limit we want or give up. Signed-off-by: Glauber Costa glom...@openvz.org Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Balbir Singh bsinghar...@gmail.com Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com --- mm

[PATCH v12 16/18] vmpressure: in-kernel notifications

2013-12-02 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org During the past weeks, it became clear to us that the shrinker interface we have right now works very well for some particular types of users, but not that well for others. The latter are usually people interested in one-shot notifications, that were forced

[PATCH v12 13/18] memcg: per-memcg kmem shrinking

2013-12-02 Thread Vladimir Davydov
, we have no option other than failing all GFP_NOFS allocations when we are close to the kmem limit. The best thing we can do in such a situation is to spawn the reclaimer in a background process hoping next allocations will succeed. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc

[PATCH v12 10/18] memcg,list_lru: add per-memcg LRU list infrastructure

2013-12-02 Thread Vladimir Davydov
the pointer to the appropriate list_lru object from a memcg or a kmem ptr, which should be further operated with conventional list_lru methods. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Dave Chinner dchin...@redhat.com

Re: [PATCH v12 00/18] kmemcg shrinkers

2013-12-02 Thread Vladimir Davydov
-aware shrinker. I would appreciate if you could look at the new version and share your attitude toward it. Thank you. On 12/02/2013 03:19 PM, Vladimir Davydov wrote: Hi, This is the 12th iteration of Glauber Costa's patchset implementing targeted shrinking for memory cgroups when kmem

[PATCH v12 11/18] memcg,list_lru: add function walking over all lists of a per-memcg LRU

2013-12-02 Thread Vladimir Davydov
list_lru_walk(), but shrink_dcache_sb(), which is going to be the only user of this function, does not need it. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Dave Chinner dchin...@redhat.com Cc: Andrew Morton a...@linux

[PATCH v12 12/18] fs: make icache, dcache shrinkers memcg-aware

2013-12-02 Thread Vladimir Davydov
and inode, which seems to be too costly. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Dave Chinner dchin...@redhat.com Cc: Andrew Morton a...@linux-foundation.org Cc: Al Viro v...@zeniv.linux.org.uk Cc: Balbir Singh

[PATCH v12 18/18] memcg: flush memcg items upon memcg destruction

2013-12-02 Thread Vladimir Davydov
assume that a memcg that goes away most likely indicates an isolated workload that is terminated. Signed-off-by: Glauber Costa glom...@openvz.org Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Balbir Singh bsinghar

[PATCH v12 00/18] kmemcg shrinkers

2013-12-02 Thread Vladimir Davydov
memcg: reap dead memcgs upon global memory pressure memcg: flush memcg items upon memcg destruction Vladimir Davydov (11): memcg: move several kmemcg functions upper fs: do not use destroy_super() in alloc_super() fail path vmscan: rename shrink_slab() args to make it more generic vmscan

[PATCH v12 17/18] memcg: reap dead memcgs upon global memory pressure

2013-12-02 Thread Vladimir Davydov
From: Glauber Costa glom...@openvz.org When we delete kmem-enabled memcgs, they can still be zombieing around for a while. The reason is that the objects may still be alive, and we won't be able to delete them at destruction time. The only entry point for that, though, are the shrinkers. The

[PATCH v12 14/18] vmscan: take at least one pass with shrinkers

2013-12-02 Thread Vladimir Davydov
with the direct reclaim case for memcg. Although this same technique can be applied to other situations just as well, we will start conservative and apply it for that case, which is the one that matters the most. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han

[PATCH v12 06/18] vmscan: rename shrink_slab() args to make it more generic

2013-12-02 Thread Vladimir Davydov
, we will have to make up phony values for nr_pages_scanned and lru_pages again when doing kmem-only reclaim for a memory cgroup, which is possible if the cgroup has its kmem limit less than the total memory limit. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han

[PATCH v12 04/18] memcg: move several kmemcg functions upper

2013-12-02 Thread Vladimir Davydov
. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Balbir Singh bsinghar...@gmail.com Cc: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com --- mm/memcontrol.c | 92

[PATCH v12 07/18] vmscan: move call to shrink_slab() to shrink_zones()

2013-12-02 Thread Vladimir Davydov
This reduces the indentation level of do_try_to_free_pages() and removes extra loop over all eligible zones counting the number of on-LRU pages. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Andrew Morton

[PATCH v12 01/18] memcg: make cache index determination more robust

2013-12-02 Thread Vladimir Davydov
memcg_cache_id and make sure it always returns a meaningful value. Signed-off-by: Glauber Costa glom...@openvz.org Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Balbir Singh bsinghar...@gmail.com Cc: KAMEZAWA Hiroyuki

[PATCH v12 05/18] fs: do not use destroy_super() in alloc_super() fail path

2013-12-02 Thread Vladimir Davydov
inline appropriate snippets from destroy_super() to alloc_super() fail path instead of using the whole function there. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Al Viro v...@zeniv.linux.org.uk --- fs/super.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff

[PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov
() is called under the set_limit_mutex, but the leftover from the above-mentioned commit is still here. Let's remove it. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Johannes Weiner han...@cmpxchg.org Cc: Michal Hocko mho...@suse.cz Cc: Balbir Singh bsinghar...@gmail.com Cc: KAMEZAWA Hiroyuki

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov
On 12/02/2013 10:26 PM, Glauber Costa wrote: On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote: [CCing Glauber - please do so in other posts for kmem related changes] On Mon 02-12-13 17:08:13, Vladimir Davydov wrote: The KMEM_ACCOUNTED_ACTIVATED was introduced by commit

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov
On 12/02/2013 10:15 PM, Michal Hocko wrote: [CCing Glauber - please do so in other posts for kmem related changes] On Mon 02-12-13 17:08:13, Vladimir Davydov wrote: The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg: use static branches when code not in use) in order

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-03 Thread Vladimir Davydov
On 12/03/2013 11:56 AM, Glauber Costa wrote: On Mon, Dec 2, 2013 at 11:21 PM, Vladimir Davydov vdavy...@parallels.com wrote: On 12/02/2013 10:26 PM, Glauber Costa wrote: On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote: [CCing Glauber - please do so in other posts for kmem

Re: [PATCH v12 05/18] fs: do not use destroy_super() in alloc_super() fail path

2013-12-03 Thread Vladimir Davydov
On 12/03/2013 01:00 PM, Dave Chinner wrote: On Mon, Dec 02, 2013 at 03:19:40PM +0400, Vladimir Davydov wrote: Using destroy_super() in alloc_super() fail path is bad, because: * It will trigger WARN_ON(!list_empty(&s->s_mounts)) since s_mounts is initialized after several 'goto fail's. So

Re: [PATCH v12 06/18] vmscan: rename shrink_slab() args to make it more generic

2013-12-03 Thread Vladimir Davydov
On 12/03/2013 01:33 PM, Dave Chinner wrote: kmemcg reclaim is introduced, we will have to make up phony values for nr_pages_scanned and lru_pages again when doing kmem-only reclaim for a memory cgroup, which is possible if the cgroup has its kmem limit less than the total memory limit. I'm

Re: [PATCH v12 09/18] vmscan: shrink slab on memcg pressure

2013-12-03 Thread Vladimir Davydov
On 12/03/2013 02:48 PM, Dave Chinner wrote: @@ -236,11 +236,17 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker, return 0; /* - * copy the current shrinker scan count into a local variable - * and zero it so that other concurrent

Re: [PATCH v12 10/18] memcg,list_lru: add per-memcg LRU list infrastructure

2013-12-03 Thread Vladimir Davydov
On 12/03/2013 03:18 PM, Dave Chinner wrote: On Mon, Dec 02, 2013 at 03:19:45PM +0400, Vladimir Davydov wrote: FS-shrinkers, which shrink dcaches and icaches, keep dentries and inodes in list_lru structures in order to evict least recently used objects. With per-memcg kmem shrinking

Re: [PATCH v12 12/18] fs: make icache, dcache shrinkers memcg-aware

2013-12-03 Thread Vladimir Davydov
On 12/03/2013 03:45 PM, Dave Chinner wrote: On Mon, Dec 02, 2013 at 03:19:47PM +0400, Vladimir Davydov wrote: Using the per-memcg LRU infrastructure introduced by previous patches, this patch makes dcache and icache shrinkers memcg-aware. To achieve that, it converts s_dentry_lru

Re: [PATCH v12 05/18] fs: do not use destroy_super() in alloc_super() fail path

2013-12-03 Thread Vladimir Davydov
On 12/03/2013 05:37 PM, Al Viro wrote: On Tue, Dec 03, 2013 at 01:23:01PM +0400, Vladimir Davydov wrote: Actually, I'm not going to modify the list_lru structure, because I think it's good as it is. I'd like to substitute it with a new structure, memcg_list_lru, only in those places where

Re: [PATCH v12 09/18] vmscan: shrink slab on memcg pressure

2013-12-03 Thread Vladimir Davydov
On 12/04/2013 08:51 AM, Dave Chinner wrote: On Tue, Dec 03, 2013 at 04:15:57PM +0400, Vladimir Davydov wrote: On 12/03/2013 02:48 PM, Dave Chinner wrote: @@ -236,11 +236,17 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker, return 0

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-03 Thread Vladimir Davydov
On 12/04/2013 02:38 AM, Glauber Costa wrote: In memcg_update_kmem_limit() we do the whole process of limit initialization under a mutex so the situation we need protection from in tcp_update_limit() is impossible. BTW once set, the 'activated' flag is never cleared and never checked alone,

[PATCH] fs: fix WARN on alloc_super() fail path

2013-12-04 Thread Vladimir Davydov
On the fail path, alloc_super() calls destroy_super(), which issues a warning if list_empty() returns false on the s_mounts field. Therefore, s_mounts should be initialized in alloc_super() before any possible failure. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Al Viro v
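
The ordering problem described in this message can be illustrated with a minimal userspace sketch. It is not the kernel code: struct sb_sketch, alloc_sb_sketch(), destroy_sb_sketch() and warn_count are hypothetical names made up for illustration, and the list helpers only imitate the kernel's list_head. The assumption is that the object is zero-allocated, so a list head that was never initialized does not point to itself and therefore does not look empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* An empty list is one whose head points back to itself. */
static bool list_is_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Hypothetical, trimmed-down superblock: only the field under discussion. */
struct sb_sketch {
    struct list_head s_mounts;
};

/* Counts how often the sketch's stand-in for WARN_ON() fired. */
static int warn_count;

static void destroy_sb_sketch(struct sb_sketch *s)
{
    assert(s != NULL);
    /* A zeroed, never-initialized head has next == NULL, so it is "not empty". */
    if (!list_is_empty(&s->s_mounts))
        warn_count++;
    free(s);
}

/*
 * init_early = true models the fixed ordering (list initialized before any
 * failure point); fail_early = true simulates a setup step that fails.
 */
static struct sb_sketch *alloc_sb_sketch(bool init_early, bool fail_early)
{
    struct sb_sketch *s = calloc(1, sizeof(*s));

    if (!s)
        return NULL;
    if (init_early)
        init_list_head(&s->s_mounts);
    if (fail_early)
        goto fail;
    if (!init_early)
        init_list_head(&s->s_mounts);
    return s;

fail:
    destroy_sb_sketch(s);
    return NULL;
}
```

Calling alloc_sb_sketch(false, true) takes the failure path with an uninitialized list and bumps warn_count, while alloc_sb_sketch(true, true) fails without a warning, which is the behavior the patch description asks for.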
