[PATCH -mm v2 1/3] slub: never fail to shrink cache

2015-01-28 Thread Vladimir Davydov
on them. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- mm/slub.c | 57 ++--- 1 file changed, 30 insertions(+), 27 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index 1562955fe099..dbf9334b6a5c 100644 --- a/mm/slub.c +++ b/mm

Re: [PATCH -mm v2 2/3] slub: fix kmem_cache_shrink return value

2015-01-28 Thread Vladimir Davydov
On Wed, Jan 28, 2015 at 10:33:50AM -0600, Christoph Lameter wrote: On Wed, 28 Jan 2015, Vladimir Davydov wrote: @@ -3419,6 +3420,9 @@ int __kmem_cache_shrink(struct kmem_cache *s) for (i = SHRINK_PROMOTE_MAX - 1; i >= 0; i--) list_splice_init(promote + i, n

[PATCH -mm v2 0/3] slub: make dead caches discard free slabs immediately

2015-01-28 Thread Vladimir Davydov
between put_cpu_partial reading ->cpu_partial and kmem_cache_shrink updating it as proposed by Joonsoo v1: https://lkml.org/lkml/2015/1/26/317 Thanks, Vladimir Davydov (3): slub: never fail to shrink cache slub: fix kmem_cache_shrink return value slub: make dead caches discard free slabs

Re: [PATCH -mm v2 1/3] slub: never fail to shrink cache

2015-01-28 Thread Vladimir Davydov
On Wed, Jan 28, 2015 at 10:37:09AM -0600, Christoph Lameter wrote: On Wed, 28 Jan 2015, Vladimir Davydov wrote: + /* We do not keep full slabs on the list */ + BUG_ON(free <= 0); Well sorry we do actually keep a number of empty slabs on the partial

Re: [PATCH -mm 1/3] slub: don't fail kmem_cache_shrink if slab placement optimization fails

2015-01-28 Thread Vladimir Davydov
On Tue, Jan 27, 2015 at 11:02:12AM -0600, Christoph Lameter wrote: What you could do is simply put all slab pages with more than 32 objects available at the end of the list. OK, got it, will redo. Thanks! -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a
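The ordering Christoph suggests can be modelled in a few lines of userspace Python. This is only an illustrative sketch, not the code from the patch: the function name `shrink_partial_list`, the `(objects, inuse)` tuple layout, and the `PROMOTE_MAX` constant of 32 (taken from the "more than 32 objects" wording above) are assumptions made for the example.

```python
PROMOTE_MAX = 32  # assumed threshold, from "more than 32 objects available"


def shrink_partial_list(partial):
    """Reorder a node's partial list; return (new_partial, discarded).

    Each slab is an (objects, inuse) tuple. Empty slabs are discarded,
    slabs with at most PROMOTE_MAX free objects are placed fullest-first
    at the head, and all other slabs keep their order at the tail.
    """
    buckets = [[] for _ in range(PROMOTE_MAX)]  # buckets[free - 1]
    tail, discarded = [], []
    for slab in partial:
        objects, inuse = slab
        free = objects - inuse
        if free <= 0:
            raise ValueError("full slabs are not kept on the partial list")
        if free == objects:
            discarded.append(slab)          # completely empty slab
        elif free <= PROMOTE_MAX:
            buckets[free - 1].append(slab)  # fewer free objects sorts earlier
        else:
            tail.append(slab)               # many free objects: end of list
    head = [slab for bucket in buckets for slab in bucket]
    return head + tail, discarded
```

Because slabs with many free objects are simply appended to the tail, no step in this model needs an allocation that could fail, which is presumably the point of the suggestion.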

[PATCH -mm v2 3/3] slub: make dead caches discard free slabs immediately

2015-01-28 Thread Vladimir Davydov
Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- mm/slab.c|4 ++-- mm/slab.h|2 +- mm/slab_common.c | 15 +-- mm/slob.c|2 +- mm/slub.c| 31 ++- 5 files changed, 43 insertions(+), 11 deletions(-) diff

[PATCH -mm v2 2/3] slub: fix kmem_cache_shrink return value

2015-01-28 Thread Vladimir Davydov
It is supposed to return 0 if the cache has no remaining objects and 1 otherwise, while currently it always returns 0. Fix it. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- mm/slub.c |6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/mm/slub.c b/mm/slub.c

Re: [PATCH -mm v2 1/3] slub: never fail to shrink cache

2015-01-29 Thread Vladimir Davydov
On Wed, Jan 28, 2015 at 01:57:52PM -0800, Andrew Morton wrote: On Wed, 28 Jan 2015 19:22:49 +0300 Vladimir Davydov vdavy...@parallels.com wrote: @@ -3375,51 +3376,56 @@ int __kmem_cache_shrink(struct kmem_cache *s) struct kmem_cache_node *n; struct page *page; struct page *t

Re: [PATCH -mm v2 1/3] slub: never fail to shrink cache

2015-01-29 Thread Vladimir Davydov
On Thu, Jan 29, 2015 at 10:22:16AM -0600, Christoph Lameter wrote: On Thu, 29 Jan 2015, Vladimir Davydov wrote: Yeah, but the tool just writes 1 to /sys/kernel/slab/cache/shrink, i.e. invokes shrink_store(), and I don't propose to remove slab placement optimization from there. What I

Re: [PATCH -mm v2 1/3] slub: never fail to shrink cache

2015-01-29 Thread Vladimir Davydov
On Thu, Jan 29, 2015 at 09:55:56AM -0600, Christoph Lameter wrote: On Thu, 29 Jan 2015, Vladimir Davydov wrote: Come to think of it, do we really need to optimize slab placement in kmem_cache_shrink? None of its users except shrink_store expects it - they just want to purge the cache

[PATCH -mm] slab: update_memcg_params: explicitly check that old array != NULL

2015-01-26 Thread Vladimir Davydov
' (see line 162) git remote add mmotm git://git.cmpxchg.org/linux-mmotm.git git remote update mmotm git checkout 5d06629c100b942a51f02b4d886c116ba3afb32a vim +/old +166 mm/slab_common.c 5d06629c Vladimir Davydov 2015-01-24 156 lockdep_is_held(slab_mutex

Re: [PATCH -mm] slab: update_memcg_params: explicitly check that old array != NULL

2015-01-26 Thread Vladimir Davydov
On Mon, Jan 26, 2015 at 01:23:05PM +0300, Dan Carpenter wrote: On Mon, Jan 26, 2015 at 01:01:19PM +0300, Vladimir Davydov wrote: This warning is false-positive, because @old equals NULL iff @memcg_nr_cache_ids equals 0. I don't see how it could be a false positive. The old pointer

[PATCH -mm] slab: suppress warnings caused by expansion of for_each_memcg_cache if !MEMCG_KMEM

2015-01-24 Thread Vladimir Davydov
/slab_common.c:603:2: note: in expansion of macro 'for_each_memcg_cache_safe' for_each_memcg_cache_safe(c, c2, s) { ^ fixes: slab-link-memcg-caches-of-the-same-kind-into-a-list Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- mm/slab.h |4 ++-- 1 file changed, 2 insertions(+), 2

Re: [PATCH -mm 1/3] slub: don't fail kmem_cache_shrink if slab placement optimization fails

2015-01-27 Thread Vladimir Davydov
On Mon, Jan 26, 2015 at 01:53:32PM -0600, Christoph Lameter wrote: On Mon, 26 Jan 2015, Vladimir Davydov wrote: We could do that, but IMO that would only complicate the code w/o yielding any real benefits. This function is slow and called rarely anyway, so I don't think there is any point

Re: [Regression] 3.19-rc3 : memcg: Hang in mount memcg

2015-01-10 Thread Vladimir Davydov
On Fri, Jan 09, 2015 at 05:43:17PM +, Suzuki K. Poulose wrote: Hi We have hit a hang on ARM64 defconfig, while running LTP tests on 3.19-rc3. We are in the process of a git bisect and will update the results as and when we find the commit. During the ksm ltp run, the test hangs

Re: [PATCH cgroup/for-3.19-fixes] cgroup: implement cgroup_subsys->unbind() callback

2015-01-12 Thread Vladimir Davydov
On Sun, Jan 11, 2015 at 03:55:43PM -0500, Johannes Weiner wrote: On Sat, Jan 10, 2015 at 04:43:16PM -0500, Tejun Heo wrote: Maybe we should kill the ref counter to the memory controller root in cgroup_kill_sb only if there are no children at all, neither online nor offline. Ah,

[PATCH -mm 2/2] mm: vmscan: init reclaim_state in do_try_to_free_pages

2015-01-12 Thread Vladimir Davydov
All users of do_try_to_free_pages() want to have current->reclaim_state set in order to account reclaimed slab pages. So instead of duplicating the reclaim_state initialization code in each call site, let's do it directly in do_try_to_free_pages(). Signed-off-by: Vladimir Davydov vdavy

[PATCH -mm 1/2] mm: vmscan: account slab pages on memcg reclaim

2015-01-12 Thread Vladimir Davydov
Since try_to_free_mem_cgroup_pages() can now call slab shrinkers, we should initialize reclaim_state and account reclaimed slab pages in scan_control->nr_reclaimed. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- mm/vmscan.c | 33 ++--- 1 file changed, 22

[PATCH -mm] fs: shrinker: always scan at least one object of each type

2015-01-12 Thread Vladimir Davydov
() will scan at least one object of each type if any. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/super.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/super.c b/fs/super.c index 482b4071f4de..63136156867e 100644 --- a/fs/super.c +++ b/fs/super.c @@ -92,13

Re: [patch 1/3] mm: memcontrol: remove unnecessary soft limit tree node test

2015-01-12 Thread Vladimir Davydov
On Fri, Jan 09, 2015 at 09:13:59PM -0500, Johannes Weiner wrote: kzalloc_node() automatically falls back to nodes with suitable memory. Signed-off-by: Johannes Weiner han...@cmpxchg.org Reviewed-by: Vladimir Davydov vdavy...@parallels.com

Re: [patch 3/3] mm: memcontrol: consolidate swap controller code

2015-01-12 Thread Vladimir Davydov
...@cmpxchg.org I was always wondering why it had to be scattered all over the place. I guess we'll have to do the same for the kmem part. Reviewed-by: Vladimir Davydov vdavy...@parallels.com

Re: [patch 2/3] mm: memcontrol: consolidate memory controller initialization

2015-01-12 Thread Vladimir Davydov
, + drain_local_stock); + + for_each_node(nid) { + struct mem_cgroup_tree_per_node *rtpn; + int zone; + + rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL, nid); I'd like to see BUG_ON(!rtpn) here, just for clarity. Not critical though. Reviewed-by: Vladimir Davydov vdavy

Re: [PATCH cgroup/for-3.19-fixes] cgroup: implement cgroup_subsys->unbind() callback

2015-01-12 Thread Vladimir Davydov
On Mon, Jan 12, 2015 at 06:28:45AM -0500, Tejun Heo wrote: On Mon, Jan 12, 2015 at 11:01:14AM +0300, Vladimir Davydov wrote: Come to think of it, I wonder how many users actually want to mount different controllers subset after unmount. Because we could allow It wouldn't be a common use

[PATCH v2] vmscan: force scan offline memory cgroups

2015-01-09 Thread Vladimir Davydov
this by unconditionally forcing scanning dead lruvecs from kswapd. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- Changes in v2: - code style fixes (Johannes) include/linux/memcontrol.h |6 ++ mm/memcontrol.c| 14 ++ mm/vmscan.c|8

Re: [PATCH -mm v3 3/9] vmscan: per memory cgroup slab shrinkers

2015-01-09 Thread Vladimir Davydov
On Fri, Jan 09, 2015 at 02:33:46PM +0800, Hillf Danton wrote: @@ -2318,16 +2357,22 @@ static bool shrink_zone(struct zone *zone, struct scan_control *sc, memcg = mem_cgroup_iter(root, NULL, reclaim); do { - unsigned long lru_pages; +

Re: [PATCH -mm] fs: shrinker: always scan at least one object of each type

2015-01-13 Thread Vladimir Davydov
On Tue, Jan 13, 2015 at 03:56:39PM -0800, Andrew Morton wrote: On Mon, 12 Jan 2015 13:20:46 +0300 Vladimir Davydov vdavy...@parallels.com wrote: In super_cache_scan() we divide the number of objects of particular type by the total number of objects in order to distribute pressure among
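The rounding problem described here can be reproduced with plain integer arithmetic. The sketch below is an illustration only, not the code in fs/super.c: `split_scan` and its `at_least_one` flag are hypothetical names, and adding one to each non-empty type's share is just one possible way to guarantee a minimum of one object per type.

```python
def split_scan(nr_to_scan, counts, at_least_one=True):
    """Split nr_to_scan among object types in proportion to their counts.

    counts maps a type name to the number of cached objects of that type.
    With at_least_one set, every non-empty type gets a share of at least 1.
    """
    total = sum(counts.values())
    shares = {}
    for name, count in counts.items():
        # Integer division rounds small proportions down, possibly to zero.
        share = nr_to_scan * count // total if total else 0
        if at_least_one and count > 0:
            share += 1  # a tiny type is never rounded down to zero
        shares[name] = share
    return shares
```

With a scan budget of 10, 1000 dentries and a single inode, the plain proportional split gives the inode type a share of zero, so that lone object would never be scanned; the adjusted split gives it a share of one.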

Re: [PATCH -mm 2/2] mm: vmscan: init reclaim_state in do_try_to_free_pages

2015-01-12 Thread Vladimir Davydov
On Mon, Jan 12, 2015 at 05:26:34PM -0500, Johannes Weiner wrote: On Mon, Jan 12, 2015 at 12:30:38PM +0300, Vladimir Davydov wrote: All users of do_try_to_free_pages() want to have current->reclaim_state set in order to account reclaimed slab pages. So instead of duplicating the reclaim_state

Re: [PATCH -mm 1/2] mm: vmscan: account slab pages on memcg reclaim

2015-01-12 Thread Vladimir Davydov
On Mon, Jan 12, 2015 at 05:18:39PM -0500, Johannes Weiner wrote: On Mon, Jan 12, 2015 at 12:30:37PM +0300, Vladimir Davydov wrote: Since try_to_free_mem_cgroup_pages() can now call slab shrinkers, we should initialize reclaim_state and account reclaimed slab pages in scan_control

[RFC] A question about memcg/kmem

2015-01-13 Thread Vladimir Davydov
Hi, There's one thing about kmemcg implementation that's bothering me. It's about arrays holding per-memcg data (e.g. kmem_cache->memcg_params->memcg_caches). On kmalloc or list_lru_{add,del} we want to quickly lookup the copy of kmem_cache or list_lru corresponding to the current cgroup.

Re: [RFC] A question about memcg/kmem

2015-01-13 Thread Vladimir Davydov
On Tue, Jan 13, 2015 at 09:25:44AM -0500, Johannes Weiner wrote: On Tue, Jan 13, 2015 at 12:24:24PM +0300, Vladimir Davydov wrote: Hi, There's one thing about kmemcg implementation that's bothering me. It's about arrays holding per-memcg data (e.g. kmem_cache->memcg_params->memcg_caches

Re: [patch 1/2] mm: page_counter: pull -1 handling out of page_counter_memparse()

2015-01-13 Thread Vladimir Davydov
...@cmpxchg.org Reviewed-by: Vladimir Davydov vdavy...@parallels.com

[PATCH -mm] vmscan: move reclaim_state handling to shrink_slab

2015-01-14 Thread Vladimir Davydov
...@cmpxchg.org Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- mm/page_alloc.c |4 --- mm/vmscan.c | 73 --- 2 files changed, 27 insertions(+), 50 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e1963ea0684a

Re: [patch] mm: memcontrol: fold move_anon() and move_file()

2015-01-14 Thread Vladimir Davydov
...@cmpxchg.org Reviewed-by: Vladimir Davydov vdavy...@parallels.com

[PATCH -mm] slub: kmem_cache_shrink: init discard list after freeing slabs

2015-02-11 Thread Vladimir Davydov
[81c3ca7c] ret_from_fork+0x7c/0xb0 [81c1f31a] ? rest_init+0x13e/0x13e fixes: slub-never-fail-to-shrink-cache Signed-off-by: Vladimir Davydov vdavy...@parallels.com Reported-by: Huang Ying ying.hu...@intel.com Cc: Christoph Lameter c...@linux.com Cc: Pekka Enberg penb...@kernel.org

[PATCH -mm v3 2/9] fs: consolidate {nr,free}_cached_objects args in shrink_control

2015-01-08 Thread Vladimir Davydov
will be added to it when we introduce memcg-aware vmscan, let us consolidate the methods' arguments in this structure to keep things clean. Suggested-by: Dave Chinner da...@fromorbit.com Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/super.c | 12 ++-- fs/xfs

[PATCH -mm v3 0/9] Per memcg slab shrinkers

2015-01-08 Thread Vladimir Davydov
to the list_lru structure, and finally patch 9 marks fs shrinkers as memcg aware. Thanks, Vladimir Davydov (9): list_lru: introduce list_lru_shrink_{count,walk} fs: consolidate {nr,free}_cached_objects args in shrink_control vmscan: per memory cgroup slab shrinkers memcg: rename some cache id

[PATCH -mm v3 3/9] vmscan: per memory cgroup slab shrinkers

2015-01-08 Thread Vladimir Davydov
shrinkers are only called on global pressure with memcg=NULL. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/drop_caches.c | 14 include/linux/memcontrol.h |7 include/linux/mm.h |5 ++- include/linux/shrinker.h |6 +++- mm

[PATCH -mm v3 7/9] list_lru: organize all list_lrus to list

2015-01-08 Thread Vladimir Davydov
this patch this was guaranteed by kfree, but now we need an explicit check there. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/super.c |8 include/linux/list_lru.h |3 +++ mm/list_lru.c| 34 ++ 3 files

[PATCH -mm v3 9/9] fs: make shrinker memcg aware

2015-01-08 Thread Vladimir Davydov
, but since they reclaim objects that are shared among different cgroups, there is no point making them memcg aware. It's a big question whether we should account them to memcg at all. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/super.c |6 +++--- 1 file changed, 3 insertions

[PATCH -mm v3 6/9] list_lru: get rid of ->active_nodes

2015-01-08 Thread Vladimir Davydov
list_lru per-memcg. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include/linux/list_lru.h |5 ++--- mm/list_lru.c| 10 +++--- 2 files changed, 5 insertions(+), 10 deletions(-) diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h index f500a2e39b13

[PATCH -mm v3 5/9] memcg: add rwsem to synchronize against memcg_caches arrays relocation

2015-01-08 Thread Vladimir Davydov
the slab_mutex, so right now there's not much point in using rwsem instead of mutex. However, once list_lru is made per-memcg it will allow list_lru initializations to proceed concurrently. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include/linux/memcontrol.h | 12 ++-- mm

[PATCH -mm v3 8/9] list_lru: introduce per-memcg lists

2015-01-08 Thread Vladimir Davydov
and per cgroup) on the node. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include/linux/list_lru.h | 52 -- include/linux/memcontrol.h | 14 ++ mm/list_lru.c | 374 +--- mm/memcontrol.c| 20 +++ 4 files

[PATCH -mm v3 4/9] memcg: rename some cache id related variables

2015-01-08 Thread Vladimir Davydov
kmem_limited_groups to memcg_cache_ida. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include/linux/memcontrol.h |4 ++-- mm/memcontrol.c| 19 +-- mm/slab_common.c |4 ++-- 3 files changed, 13 insertions(+), 14 deletions(-) diff --git a/include

[PATCH -mm v3 1/9] list_lru: introduce list_lru_shrink_{count,walk}

2015-01-08 Thread Vladimir Davydov
the target memcg and make list_lru_shrink_{count,walk} handle this appropriately. Suggested-by: Dave Chinner da...@fromorbit.com Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/dcache.c | 14 ++ fs/gfs2/quota.c |6 +++--- fs/inode.c

Re: [patch] mm: memcontrol: track move_lock state internally

2015-01-05 Thread Vladimir Davydov
-by: Johannes Weiner han...@cmpxchg.org Reviewed-by: Vladimir Davydov vdavy...@parallels.com

[PATCH] vmscan: force scan offline memory cgroups

2015-01-08 Thread Vladimir Davydov
this by unconditionally forcing scanning dead lruvecs from kswapd. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include/linux/memcontrol.h |6 ++ mm/memcontrol.c| 14 ++ mm/vmscan.c|3 ++- 3 files changed, 22 insertions(+), 1

[PATCH 0/3] idle memory tracking

2015-03-18 Thread Vladimir Davydov
scan_interval % sys.argv[0] exit(1) cg_path = sys.argv[1] scan_interval = int(sys.argv[2]) while True: set_idle() time.sleep(scan_interval) clear_refs(cg_path) print count_idle(cg_path) END SCRIPT Thanks, Vladimir Davydov (3): memcg: add page_cgroup_ino helper proc

[PATCH 1/3] memcg: add page_cgroup_ino helper

2015-03-18 Thread Vladimir Davydov
on CONFIG_MEMCG_SWAP initially). Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include/linux/memcontrol.h |8 ++ mm/hwpoison-inject.c |5 +--- mm/memcontrol.c| 61 ++-- mm/memory-failure.c| 16 ++-- 4

[PATCH 2/3] proc: add kpagecgroup file

2015-03-18 Thread Vladimir Davydov
/proc/kpagecgroup contains a 64-bit inode number of the memory cgroup each page is charged to, indexed by PFN. Having this information is useful for estimating a cgroup working set size. The file is present if CONFIG_PROC_PAGE_MONITOR && CONFIG_MEMCG. Signed-off-by: Vladimir Davydov vdavy
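Given that layout, the entry for a page frame sits at byte offset `pfn * 8`, and a chunk of the file decodes as a run of 64-bit integers. Below is a minimal sketch of that decoding, working on an in-memory buffer so that it runs anywhere. The helper name `decode_cgroup_inos` is hypothetical, and native byte order is an assumption.

```python
import struct

ENTRY_SIZE = 8  # one 64-bit inode number per page frame


def decode_cgroup_inos(buf, first_pfn=0):
    """Map PFN -> memory cgroup inode number for a chunk of the file.

    buf holds the bytes read starting at offset first_pfn * ENTRY_SIZE.
    Trailing bytes that do not form a whole entry are ignored.
    """
    count = len(buf) // ENTRY_SIZE
    # "=" selects native byte order with standard sizes, so Q is 8 bytes.
    inos = struct.unpack("=%dQ" % count, buf[:count * ENTRY_SIZE])
    return {first_pfn + i: ino for i, ino in enumerate(inos)}
```

To use it against the real file, one would open it in binary mode, seek to `first_pfn * ENTRY_SIZE`, and pass the bytes read to this helper; counting how often each inode number occurs then gives a per-cgroup page count.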

[PATCH 3/3] mm: idle memory tracking

2015-03-18 Thread Vladimir Davydov
the PG_young flag in addition to PG_idle. The PG_young flag is set if the ACCESS/YOUNG bit is cleared at step 3. page_referenced() returns >= 1 if the page has the PG_young flag set. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- Documentation/filesystems/proc.txt |3 ++ Documentation/vm

[PATCH] memcg: remove obsolete comment

2015-03-17 Thread Vladimir Davydov
Low and high watermarks, as they are defined in the TODO to the mem_cgroup struct, have already been implemented by Johannes, so remove the stale comment. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- mm/memcontrol.c |5 - 1 file changed, 5 deletions(-) diff --git a/mm

Re: [PATCH 3/3] mm: idle memory tracking

2015-03-19 Thread Vladimir Davydov
On Thu, Mar 19, 2015 at 01:12:05PM +0300, Cyrill Gorcunov wrote: On Wed, Mar 18, 2015 at 11:44:36PM +0300, Vladimir Davydov wrote: +static void set_mem_idle(void) +{ + int nid; + + for_each_online_node(nid) + set_mem_idle_node(nid); +} Vladimir, might we need

Re: [PATCH 0/3] idle memory tracking

2015-03-19 Thread Vladimir Davydov
On Thu, Mar 19, 2015 at 11:13:37AM +0900, Minchan Kim wrote: On Wed, Mar 18, 2015 at 11:44:33PM +0300, Vladimir Davydov wrote: 1. Write 1 to /proc/sys/vm/set_idle. This will set the IDLE flag for all user pages. The IDLE flag is cleared when the page is read or the ACCESS/YOUNG

[PATCH] signal: improve warning about using SI_TKILL in rt_[tg]sigqueueinfo

2015-03-19 Thread Vladimir Davydov
therefore substitutes the WARN_ON_ONCE with a pr_warn_once. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/signal.c | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/kernel/signal.c b/kernel/signal.c index a390499943e4..3cbcd94457af 100644

Re: [PATCH] signal: improve warning about using SI_TKILL in rt_[tg]sigqueueinfo

2015-03-19 Thread Vladimir Davydov
On Thu, Mar 19, 2015 at 02:00:46PM +0100, Oleg Nesterov wrote: On 03/19, Vladimir Davydov wrote: Sending SI_TKILL from rt_[tg]sigqueueinfo was deprecated, so now we issue a warning on the first attempt of doing it. We use WARN_ON_ONCE, which is not informative and, what is worse, taints

Re: [PATCH 0/4] cleancache: remove limit on the number of cleancache enabled filesystems

2015-03-06 Thread Vladimir Davydov
On Fri, Mar 06, 2015 at 10:14:26AM -0500, Konrad Rzeszutek Wilk wrote: Would you be willing to fold in the description in the patch #4 and repost it? Andrew - are you OK picking it up or would you prefer me as the maintainer to feed it to Linus? [either option is fine with me] AFAICS Andrew

[PATCH -mm] memcg: zap mem_cgroup_lookup

2015-03-13 Thread Vladimir Davydov
for any id <= 0. Since mem_cgroup_from_id is only called from mem_cgroup_lookup, let us zap mem_cgroup_lookup, substituting calls to it with mem_cgroup_from_id and moving the check if id > 0 to css_from_id. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- kernel/cgroup.c |2 +- mm

[PATCH] Documentation/memcg: update memcg/kmem status

2015-04-01 Thread Vladimir Davydov
Memcg/kmem reclaim support has been finally merged. Reflect this in the documentation. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- Documentation/cgroups/memory.txt |8 +++- init/Kconfig |6 -- 2 files changed, 3 insertions(+), 11 deletions

Re: [PATCH] Documentation/memcg: update memcg/kmem status

2015-04-01 Thread Vladimir Davydov
On Wed, Apr 01, 2015 at 04:44:31PM +0200, Jonathan Corbet wrote: On Wed, 1 Apr 2015 17:30:36 +0300 Vladimir Davydov vdavy...@parallels.com wrote: Memcg/kmem reclaim support has been finally merged. Reflect this in the documentation. So the text you've removed says not to select kmem

Re: [PATCH 0/4] cleancache: remove limit on the number of cleancache enabled filesystems

2015-03-05 Thread Vladimir Davydov
On Wed, Mar 04, 2015 at 04:22:30PM -0500, Konrad Rzeszutek Wilk wrote: On Tue, Feb 24, 2015 at 01:34:06PM +0300, Vladimir Davydov wrote: On Mon, Feb 23, 2015 at 11:12:22AM -0500, Konrad Rzeszutek Wilk wrote: Thank you for posting these patches. I was wondering if you had run through some

Re: [PATCH -next] cpuset: initialize cpuset a bit early

2015-03-04 Thread Vladimir Davydov
cgroup_init() and cpuset_init(). Cc: Vladimir Davydov vdavy...@parallels.com Fixes: 295458e67284 (cgroup: call cgroup_subsys->bind on cgroup subsys initialization) Reported by: Ming Lei tom.leim...@gmail.com Signed-off-by: Zefan Li lize...@huawei.com Acked-by: Vladimir Davydov vdavy

Re: [PATCH 4/4] cleancache: remove limit on the number of cleancache enabled filesystems

2015-02-23 Thread Vladimir Davydov
Rechecking this patch, I find it rather difficult to review, because it not only gets rid of fake_pool_id, but also rearranges code of cleancache methods. Here is an updated patch, which attempts to be less intrusive: --- From: Vladimir Davydov vdavy...@parallels.com Subject: [PATCH v2] cleancache

[PATCH 4/4] cleancache: remove limit on the number of cleancache enabled filesystems

2015-02-22 Thread Vladimir Davydov
newer super blocks will receive it in cleancache_init_fs. This patch therefore removes the maps and hence the artificial limit on the number of cleancache enabled filesystems. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/super.c |2 +- include/linux/cleancache.h

[PATCH 0/4] cleancache: remove limit on the number of cleancache enabled filesystems

2015-02-22 Thread Vladimir Davydov
. Patches 1-3 prepare the code for this change. Thanks, Vladimir Davydov (4): ocfs2: copy fs uuid to superblock cleancache: zap uuid arg of cleancache_init_shared_fs cleancache: forbid overriding cleancache_ops cleancache: remove limit on the number of cleancache enabled filesystems

[PATCH 3/4] cleancache: forbid overriding cleancache_ops

2015-02-22 Thread Vladimir Davydov
to the code outside the cleancache core. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- Documentation/vm/cleancache.txt |4 +--- drivers/xen/tmem.c | 16 +--- include/linux/cleancache.h |3 +-- mm/cleancache.c | 12 +++- 4

[PATCH 2/4] cleancache: zap uuid arg of cleancache_init_shared_fs

2015-02-22 Thread Vladimir Davydov
Use super_block->s_uuid instead. Every shared filesystem using cleancache must now initialize super_block->s_uuid before calling cleancache_init_shared_fs. The only one on the tree, ocfs2, already meets this requirement. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/ocfs2/super.c

[PATCH 1/4] ocfs2: copy fs uuid to superblock

2015-02-22 Thread Vladimir Davydov
This will allow us to remove the uuid argument from cleancache_init_shared_fs. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/ocfs2/super.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c index 26675185b886..43f5a9e71b35 100644 --- a/fs

Re: [PATCH 0/4] cleancache: remove limit on the number of cleancache enabled filesystems

2015-02-24 Thread Vladimir Davydov
On Mon, Feb 23, 2015 at 11:12:22AM -0500, Konrad Rzeszutek Wilk wrote: Thank you for posting these patches. I was wondering if you had run through some of the different combinations that you can load the filesystems/tmem drivers in random order? The #4 patch deleted a nice chunk of

Re: [PATCH 0/3] idle memory tracking

2015-03-24 Thread Vladimir Davydov
On Wed, Mar 18, 2015 at 11:44:33PM +0300, Vladimir Davydov wrote: Usage: 1. Write 1 to /proc/sys/vm/set_idle. This will set the IDLE flag for all user pages. The IDLE flag is cleared when the page is read or the ACCESS/YOUNG bit is cleared in any PTE pointing to the page

Re: [cgroup] WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/common.c:1439 warn_pre_alternatives()

2015-03-05 Thread Vladimir Davydov
Hi, This bug should have been fixed by [PATCH -next] cpuset: initialize cpuset a bit early: http://www.spinics.net/lists/cgroups/msg12599.html Thanks, Vladimir On Fri, Mar 06, 2015 at 01:57:58PM +0800, Fengguang Wu wrote: [0.021989] [ cut here ] [0.021989]

[PATCH v3 3/3] proc: add kpageidle file

2015-04-28 Thread Vladimir Davydov
when compiled on 32 bit. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- Documentation/vm/pagemap.txt | 10 ++- fs/proc/page.c | 154 ++ fs/proc/task_mmu.c |4 +- include/linux/mm.h | 88

[PATCH v3 1/3] memcg: add page_cgroup_ino helper

2015-04-28 Thread Vladimir Davydov
on CONFIG_MEMCG instead of CONFIG_MEMCG_SWAP (I've no idea why it was made dependent on CONFIG_MEMCG_SWAP initially). Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include/linux/memcontrol.h |8 ++--- mm/hwpoison-inject.c |5 +-- mm/memcontrol.c| 73

[PATCH v3 0/3] idle memory tracking

2015-04-28 Thread Vladimir Davydov
accesses its working set, then press Enter) print Counting idle pages... nidle = count_idle() for dir, subdirs, files in os.walk(CGROUP_MOUNT): ino = os.stat(dir)[stat.ST_INO] print dir + : + str(nidle.get(ino, 0)) END SCRIPT Comments are more than welcome. Thanks, Vladimir

[PATCH v3 2/3] proc: add kpagecgroup file

2015-04-28 Thread Vladimir Davydov
/proc/kpagecgroup contains a 64-bit inode number of the memory cgroup each page is charged to, indexed by PFN. Having this information is useful for estimating a cgroup working set size. The file is present if CONFIG_PROC_PAGE_MONITOR && CONFIG_MEMCG. Signed-off-by: Vladimir Davydov vdavy

Re: [PATCH v3 0/3] idle memory tracking

2015-04-29 Thread Vladimir Davydov
Hi Minchan, Thank you for taking a look at this patch set. On Wed, Apr 29, 2015 at 12:57:22PM +0900, Minchan Kim wrote: On Tue, Apr 28, 2015 at 03:24:39PM +0300, Vladimir Davydov wrote: * /proc/kpageidle. For each page this file contains a 64-bit number, which equals 1 if the page

Re: [PATCH v3 3/3] proc: add kpageidle file

2015-04-29 Thread Vladimir Davydov
On Wed, Apr 29, 2015 at 01:57:59PM +0900, Minchan Kim wrote: On Tue, Apr 28, 2015 at 03:24:42PM +0300, Vladimir Davydov wrote: @@ -69,6 +69,14 @@ There are four components to pagemap: memory cgroup each page is charged to, indexed by PFN. Only available when CONFIG_MEMCG is set

Re: [PATCH v3 3/3] proc: add kpageidle file

2015-04-29 Thread Vladimir Davydov
On Wed, Apr 29, 2015 at 01:35:36PM +0900, Minchan Kim wrote: On Tue, Apr 28, 2015 at 03:24:42PM +0300, Vladimir Davydov wrote: diff --git a/fs/proc/page.c b/fs/proc/page.c index 70d23245dd43..cfc55ba7fee6 100644 --- a/fs/proc/page.c +++ b/fs/proc/page.c @@ -275,6 +275,156 @@ static

Re: [PATCH v3 3/3] proc: add kpageidle file

2015-05-04 Thread Vladimir Davydov
On Mon, May 04, 2015 at 12:17:22PM +0900, Minchan Kim wrote: On Thu, Apr 30, 2015 at 05:50:55PM +0300, Vladimir Davydov wrote: On Thu, Apr 30, 2015 at 05:25:31PM +0900, Minchan Kim wrote: On Wed, Apr 29, 2015 at 12:12:48PM +0300, Vladimir Davydov wrote: On Wed, Apr 29, 2015 at 01:35:36PM

Re: [PATCH v3 3/3] proc: add kpageidle file

2015-04-30 Thread Vladimir Davydov
On Thu, Apr 30, 2015 at 05:25:31PM +0900, Minchan Kim wrote: On Wed, Apr 29, 2015 at 12:12:48PM +0300, Vladimir Davydov wrote: On Wed, Apr 29, 2015 at 01:35:36PM +0900, Minchan Kim wrote: On Tue, Apr 28, 2015 at 03:24:42PM +0300, Vladimir Davydov wrote: +#ifdef CONFIG_IDLE_PAGE_TRACKING

Re: [PATCH 1/2] gfp: add __GFP_NOACCOUNT

2015-05-06 Thread Vladimir Davydov
On Wed, May 06, 2015 at 01:59:41PM +0200, Michal Hocko wrote: On Tue 05-05-15 12:45:42, Vladimir Davydov wrote: Not all kmem allocations should be accounted to memcg. The following patch gives an example when accounting of a certain type of allocations to memcg can effectively result

Re: [PATCH 1/2] gfp: add __GFP_NOACCOUNT

2015-05-06 Thread Vladimir Davydov
On Wed, May 06, 2015 at 02:35:41PM +0200, Michal Hocko wrote: On Wed 06-05-15 15:24:31, Vladimir Davydov wrote: On Wed, May 06, 2015 at 01:59:41PM +0200, Michal Hocko wrote: On Tue 05-05-15 12:45:42, Vladimir Davydov wrote: Not all kmem allocations should be accounted to memcg

Re: [PATCH 1/2] gfp: add __GFP_NOACCOUNT

2015-05-06 Thread Vladimir Davydov
On Wed, May 06, 2015 at 03:55:20PM +0200, Michal Hocko wrote: On Wed 06-05-15 16:25:10, Vladimir Davydov wrote: On Wed, May 06, 2015 at 02:35:41PM +0200, Michal Hocko wrote: [...] NOACCOUNT doesn't imply kmem at all so it is not clear who is in charge of the accounting. IMO

[PATCH v4 1/3] memcg: add page_cgroup_ino helper

2015-05-07 Thread Vladimir Davydov
on CONFIG_MEMCG instead of CONFIG_MEMCG_SWAP (I've no idea why it was made dependent on CONFIG_MEMCG_SWAP initially). Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include/linux/memcontrol.h |8 ++--- mm/hwpoison-inject.c |5 +-- mm/memcontrol.c| 73

[PATCH v4 0/3] idle memory tracking

2015-05-07 Thread Vladimir Davydov
, 0) * 4) + KB END SCRIPT Comments are more than welcome. Thanks, Vladimir Davydov (3): memcg: add page_cgroup_ino helper proc: add kpagecgroup file proc: add kpageidle file Documentation/vm/pagemap.txt | 16 ++- fs/proc/Kconfig |5 +- fs/proc/page.c

[PATCH v4 3/3] proc: add kpageidle file

2015-05-07 Thread Vladimir Davydov
when compiled on 32 bit. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- Documentation/vm/pagemap.txt | 12 ++- fs/proc/page.c | 171 ++ fs/proc/task_mmu.c |4 +- include/linux/mm.h | 88

[PATCH v4 2/3] proc: add kpagecgroup file

2015-05-07 Thread Vladimir Davydov
/proc/kpagecgroup contains a 64-bit inode number of the memory cgroup each page is charged to, indexed by PFN. Having this information is useful for estimating a cgroup working set size. The file is present if CONFIG_PROC_PAGE_MONITOR && CONFIG_MEMCG. Signed-off-by: Vladimir Davydov vdavy

Re: [PATCH v3 3/3] proc: add kpageidle file

2015-05-08 Thread Vladimir Davydov
On Mon, May 04, 2015 at 07:54:59PM +0900, Minchan Kim wrote: So, I guess once below compiler optimization happens in __page_set_anon_rmap, it could be corrupt in page_referenced. __page_set_anon_rmap: page->mapping = (struct address_space *) anon_vma; page->mapping = (struct

[PATCH v2] gfp: add __GFP_NOACCOUNT

2015-05-06 Thread Vladimir Davydov
-off-by: Vladimir Davydov vdavy...@parallels.com Cc: sta...@vger.kernel.org # 4.0 --- Changes in v2: - explain drawbacks of per kmem cache flag disabling accounting as a possible alternative to a GFP flag in commit message (Michal) - warn if __GFP_NOACCOUNT is passed to mem_cgroup_try_charge

Re: [PATCH v4 0/3] idle memory tracking

2015-05-08 Thread Vladimir Davydov
On Thu, May 07, 2015 at 05:09:39PM +0300, Vladimir Davydov wrote: SCRIPT FOR COUNTING IDLE PAGES PER CGROUP Oops, this script is stale. The correct one is here: --- #! /usr/bin/python # import os import stat import errno import struct CGROUP_MOUNT = "/sys/fs/cgroup/memory" BUFSIZE = 8

Re: [PATCH v4 3/3] proc: add kpageidle file

2015-05-08 Thread Vladimir Davydov
Oops, this patch is stale, the correct one is here: --- From: Vladimir Davydov vdavy...@parallels.com Subject: [PATCH] proc: add kpageidle file Knowing the portion of memory that is not used by a certain application or memory cgroup (idle memory) can be useful for partitioning the system

[RFC] rmap: fix race between do_wp_page and shrink_active_list

2015-05-11 Thread Vladimir Davydov
Hi, I've been arguing with Minchan for a while about whether store-tearing is possible while setting page->mapping in __page_set_anon_rmap and friends, see http://thread.gmane.org/gmane.linux.kernel.mm/131949/focus=132132 This patch is intended to draw attention to this discussion. It fixes a

Re: [PATCH 2/2] kernfs: do not account ino_ida allocations to memcg

2015-05-05 Thread Vladimir Davydov
On Tue, May 05, 2015 at 09:45:21AM -0400, Tejun Heo wrote: On Tue, May 05, 2015 at 12:45:43PM +0300, Vladimir Davydov wrote: root->ino_ida is used for kernfs inode number allocations. Since IDA has a layered structure, different IDs can reside on the same layer, which is currently accounted

Re: [PATCH v3 3/3] proc: add kpageidle file

2015-05-10 Thread Vladimir Davydov
On Sun, May 10, 2015 at 12:12:38AM +0900, Minchan Kim wrote: On Fri, May 08, 2015 at 12:56:04PM +0300, Vladimir Davydov wrote: On Mon, May 04, 2015 at 07:54:59PM +0900, Minchan Kim wrote: So, I guess once below compiler optimization happens in __page_set_anon_rmap, it could be corrupt

[PATCH 1/2] gfp: add __GFP_NOACCOUNT

2015-05-05 Thread Vladimir Davydov
to go through the root cgroup. It will be used by the next patch. Note, since in case of kmemleak enabled each kmalloc implies yet another allocation from the kmemleak_object cache, we add __GFP_NOACCOUNT to gfp_kmemleak_mask. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- include

[PATCH 2/2] kernfs: do not account ino_ida allocations to memcg

2015-05-05 Thread Vladimir Davydov
), an easy way to reproduce this issue is by creating network namespace(s) from inside a kmem-active memory cgroup. Signed-off-by: Vladimir Davydov vdavy...@parallels.com --- fs/kernfs/dir.c |9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c

Re: [RFC] rmap: fix race between do_wp_page and shrink_active_list

2015-05-12 Thread Vladimir Davydov
On Mon, May 11, 2015 at 04:59:27PM +0800, yalin wang wrote: i am confused about your analysis, for the race stack: CPU0 CPU1 do_wp_page shrink_active_list lock_page

Re: [RFC] rmap: fix race between do_wp_page and shrink_active_list

2015-05-12 Thread Vladimir Davydov
On Mon, May 11, 2015 at 07:24:02AM -0700, Paul E. McKenney wrote: On Mon, May 11, 2015 at 10:51:17AM +0300, Vladimir Davydov wrote: Hi, I've been arguing with Minchan for a while about whether store-tearing is possible while setting page->mapping in __page_set_anon_rmap and friends, see

Re: [PATCH v3 3/3] proc: add kpageidle file

2015-05-12 Thread Vladimir Davydov
On Sun, May 10, 2015 at 01:34:29PM +0300, Vladimir Davydov wrote: On Sun, May 10, 2015 at 12:12:38AM +0900, Minchan Kim wrote: Yeb, I might be paranoid but my point is it might work now on most of arch but it seem to be buggy/fragile/subtle because we couldn't prove all arch/compiler don't

[PATCH v2] rmap: fix theoretical race between do_wp_page and shrink_active_list

2015-05-12 Thread Vladimir Davydov
of WRITE_ONCE. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Cc: Paul E. McKenney paul...@linux.vnet.ibm.com Cc: Kirill A. Shutemov kir...@shutemov.name Cc: Rik van Riel r...@redhat.com Cc: Hugh Dickins hu...@google.com --- Changes in v2: - do not add READ_ONCE to PageAnon and WRITE_ONCE

Re: [RFC] rmap: fix race between do_wp_page and shrink_active_list

2015-05-12 Thread Vladimir Davydov
On Mon, May 11, 2015 at 12:36:52PM +0300, Kirill A. Shutemov wrote: On Mon, May 11, 2015 at 10:51:17AM +0300, Vladimir Davydov wrote: diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 5e7c4f50a644..a529e0a35fe9 100644 --- a/include/linux/page-flags.h +++ b/include
