[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes: On Mon, 11 Oct 2010 17:24:21 -0700 Greg Thelen gthe...@google.com wrote: Is your motivation to increase performance with the same functionality? If so, then would a 'static inline' be performance equivalent to a preprocessor macro yet be safer to use? Ah, if lockdep finds this as a bug, I think other parts will hit this, too. Like this:

static struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
{
	struct mem_cgroup *mem = NULL;

	if (!mm)
		return NULL;
	/*
	 * Because we have no locks, mm->owner may be being moved to another
	 * cgroup. We use css_tryget() here even if this looks
	 * pessimistic (rather than adding locks here).
	 */
	rcu_read_lock();
	do {
		mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
		if (unlikely(!mem))
			break;
	} while (!css_tryget(&mem->css));
	rcu_read_unlock();
	return mem;
}

mem_cgroup_from_task() calls task_subsys_state(), which calls task_subsys_state_check(). task_subsys_state_check() will be happy if rcu_read_lock is held. yes. I don't think that this will fail lockdep, because rcu_read_lock_held() is true when calling mem_cgroup_from_task() within try_get_mem_cgroup_from_mm(). agreed. mem_cgroup_from_task() is designed to be used like this. If defined as a macro, I think it will not be caught. I do not understand how making mem_cgroup_from_task() a macro would change its behavior w.r.t. lockdep assertion checking. I assume that as a macro mem_cgroup_from_task() would still call task_subsys_state(), which requires either: a) rcu read lock held, b) task->alloc_lock held, or c) cgroup lock held. Hmm. Maybe I was wrong. Maybe it makes more sense to find a way to perform this check in mem_cgroup_has_dirty_limit() without needing to grab the rcu lock. I think this lock grab is unneeded. I am still collecting performance data, but suspect that this may be making the code slower than it needs to be. Hmm. css_set[] itself is freed by RCU... what idea do you have to remove rcu_read_lock()?
Adding some flags? It seems like a shame to need a lock to determine if current is in the root cgroup, especially given that as soon as mem_cgroup_has_dirty_limit() returns, the task could be moved in-to/out-of the root cgroup, thereby invalidating the answer. So the answer is just a sample that may be wrong. Yes. But it's not a bug, it's a specification. But I think you are correct. We will need the rcu read lock in mem_cgroup_has_dirty_limit(). yes. Ah... I noticed that you should do mem = mem_cgroup_from_task(current->mm->owner); to check has_dirty_limit... What are the cases where current->mm->owner->cgroups != current->cgroups? In that case, assume cgroups A and B. thread(1) belongs to cgroup A (thread(1) is mm->owner). thread(2) belongs to cgroup B, and a page is charged to cgroup A. Then, thread(2) makes the page dirty, which is under cgroup A. In this case, if the page's dirty_pages accounting is added to cgroup B, cgroup B's statistics may show dirty_pages > all_lru_pages. This is a bug. I agree that in this case the dirty_pages accounting should be added to cgroup A because that is where the page was charged. This will happen because pc->mem_cgroup was set to A when the page was charged. The mark-page-dirty code will check pc->mem_cgroup to determine which cgroup to add the dirty page to. I think that the current vs current->mm->owner decision is in areas of the code that are used to query the dirty limits. These routines do not use this data to determine which cgroup to charge for dirty pages. The use of either mem_cgroup_from_task(current->mm->owner) or mem_cgroup_from_task(current) in mem_cgroup_has_dirty_limit() does not determine which cgroup is charged for dirty_pages. mem_cgroup_has_dirty_limit() is only used to determine if the process has a dirty limit. As discussed, this is a momentary answer that may be wrong by the time decisions are made, because the task may be migrated in-to/out-of the root cgroup while mem_cgroup_has_dirty_limit() runs.
If the process has a dirty limit, then the process's memcg is used to compute dirty limits. Using your example, I assume that thread(1) and thread(2) will get dirty limits from cgroup(A) and cgroup(B) respectively. Are you thinking that when accounting for a dirty page (by incrementing pc->mem_cgroup->stat->count[MEM_CGROUP_STAT_FILE_DIRTY]) we should check the pc->mem_cgroup dirty limit? I was hoping to avoid having to add even more logic into mem_cgroup_has_dirty_limit() to handle the case where current->mm is NULL. Please check current->mm. We can't limit the work of kernel threads by this; let's consider it later if necessary. Presumably the newly proposed vm_dirty_param(), mem_cgroup_has_dirty_limit(), and mem_cgroup_page_stat() routines all need to use the same logic.
[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
On Tue, 12 Oct 2010 00:32:33 -0700 Greg Thelen gthe...@google.com wrote: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes: What are the cases where current->mm->owner->cgroups != current->cgroups? In that case, assume cgroups A and B. thread(1) belongs to cgroup A (thread(1) is mm->owner). thread(2) belongs to cgroup B, and a page is charged to cgroup A. Then, thread(2) makes the page dirty, which is under cgroup A. In this case, if the page's dirty_pages accounting is added to cgroup B, cgroup B's statistics may show dirty_pages > all_lru_pages. This is a bug. I agree that in this case the dirty_pages accounting should be added to cgroup A because that is where the page was charged. This will happen because pc->mem_cgroup was set to A when the page was charged. The mark-page-dirty code will check pc->mem_cgroup to determine which cgroup to add the dirty page to. I think that the current vs current->mm->owner decision is in areas of the code that are used to query the dirty limits. These routines do not use this data to determine which cgroup to charge for dirty pages. The use of either mem_cgroup_from_task(current->mm->owner) or mem_cgroup_from_task(current) in mem_cgroup_has_dirty_limit() does not determine which cgroup is charged for dirty_pages. mem_cgroup_has_dirty_limit() is only used to determine if the process has a dirty limit. As discussed, this is a momentary answer that may be wrong by the time decisions are made, because the task may be migrated in-to/out-of the root cgroup while mem_cgroup_has_dirty_limit() runs. If the process has a dirty limit, then the process's memcg is used to compute dirty limits. Using your example, I assume that thread(1) and thread(2) will get dirty limits from cgroup(A) and cgroup(B) respectively. Ok, thank you for the clarification. Throttling a thread based on its own cgroup, not based on mm->owner, makes sense. Could you add a brief comment in the code?
Thanks, -Kame ___ Containers mailing list contain...@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/containers ___ Devel mailing list Devel@openvz.org https://openvz.org/mailman/listinfo/devel
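The split agreed on in this thread — dirty pages are *accounted* to the cgroup that charged them (pc->mem_cgroup), while the dirtying task is *throttled* against its own cgroup's limits — can be sketched as a small userspace mock. Every struct and function name below is illustrative, not the kernel's API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace mock of the accounting-vs-throttling split. */

struct mem_cgroup {
	long nr_file_dirty;        /* per-cgroup dirty-page counter */
	unsigned long dirty_bytes; /* this cgroup's dirty limit */
};

struct page_cgroup {
	struct mem_cgroup *mem_cgroup; /* set when the page was charged */
};

struct task {
	struct mem_cgroup *cgroup;     /* cgroup the task currently runs in */
};

/* Accounting: the charge-time owner gets the statistic,
 * regardless of which task dirties the page. */
void account_page_dirtied(struct page_cgroup *pc)
{
	pc->mem_cgroup->nr_file_dirty++;
}

/* Throttling: the dirtying task is limited by its own cgroup's parameters. */
unsigned long dirty_limit_for(const struct task *t)
{
	return t->cgroup->dirty_bytes;
}
```

With a page charged to cgroup A and a dirtying thread in cgroup B, the statistic lands in A while the limit comes from B, matching the thread(1)/thread(2) example above.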
[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes: On Wed, 06 Oct 2010 17:27:13 -0700 Greg Thelen gthe...@google.com wrote: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes: On Tue, 05 Oct 2010 12:00:17 -0700 Greg Thelen gthe...@google.com wrote: Andrea Righi ari...@develer.com writes: On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote: Extend mem_cgroup to contain dirty page limits. Also add routines allowing the kernel to query the dirty usage of a memcg. These interfaces are not used by the kernel yet. A subsequent commit will add kernel calls to utilize these new routines. A small note below. Signed-off-by: Greg Thelen gthe...@google.com Signed-off-by: Andrea Righi ari...@develer.com --- include/linux/memcontrol.h | 44 +++ mm/memcontrol.c | 180 +++- 2 files changed, 223 insertions(+), 1 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 6303da1..dc8952d 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -19,6 +19,7 @@ #ifndef _LINUX_MEMCONTROL_H #define _LINUX_MEMCONTROL_H +#include <linux/writeback.h> #include <linux/cgroup.h> struct mem_cgroup; struct page_cgroup; @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item { MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */ }; +/* Cgroup memory statistics items exported to the kernel */ +enum mem_cgroup_read_page_stat_item { + MEMCG_NR_DIRTYABLE_PAGES, + MEMCG_NR_RECLAIM_PAGES, + MEMCG_NR_WRITEBACK, + MEMCG_NR_DIRTY_WRITEBACK_PAGES, +}; + +/* Dirty memory parameters */ +struct vm_dirty_param { + int dirty_ratio; + int dirty_background_ratio; + unsigned long dirty_bytes; + unsigned long dirty_background_bytes; +}; + +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param) +{ + param->dirty_ratio = vm_dirty_ratio; + param->dirty_bytes = vm_dirty_bytes; + param->dirty_background_ratio = dirty_background_ratio; + param->dirty_background_bytes = dirty_background_bytes; +} + extern unsigned long
mem_cgroup_isolate_pages(unsigned long nr_to_scan, struct list_head *dst, unsigned long *scanned, int order, @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page, mem_cgroup_update_page_stat(page, idx, -1); } +bool mem_cgroup_has_dirty_limit(void); +void get_vm_dirty_param(struct vm_dirty_param *param); +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item); + unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask); u64 mem_cgroup_get_limit(struct mem_cgroup *mem); @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page, { } +static inline bool mem_cgroup_has_dirty_limit(void) +{ + return false; +} + +static inline void get_vm_dirty_param(struct vm_dirty_param *param) +{ + get_global_vm_dirty_param(param); +} + +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item) +{ + return -ENOSYS; +} + static inline unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f40839f..6ec2625 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -233,6 +233,10 @@ struct mem_cgroup { atomic_t refcnt; unsigned int swappiness; + + /* control memory cgroup dirty pages */ + struct vm_dirty_param dirty_param; + /* OOM-Killer disable */ int oom_kill_disable; @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg) return swappiness; } +/* + * Returns a snapshot of the current dirty limits which is not synchronized with + * the routines that change the dirty limits. If this routine races with an + * update to the dirty bytes/ratio value, then the caller must handle the case + * where both dirty_[background_]_ratio and _bytes are set.
+ */ +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param, + struct mem_cgroup *mem) +{ + if (mem && !mem_cgroup_is_root(mem)) { + param->dirty_ratio = mem->dirty_param.dirty_ratio; + param->dirty_bytes = mem->dirty_param.dirty_bytes; + param->dirty_background_ratio = + mem->dirty_param.dirty_background_ratio; + param->dirty_background_bytes = + mem->dirty_param.dirty_background_bytes; + } else { + get_global_vm_dirty_param(param); + } +}
[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
On Mon, 11 Oct 2010 17:24:21 -0700 Greg Thelen gthe...@google.com wrote: Is your motivation to increase performance with the same functionality? If so, then would a 'static inline' be performance equivalent to a preprocessor macro yet be safer to use? Ah, if lockdep finds this as a bug, I think other parts will hit this, too. Like this:

static struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
{
	struct mem_cgroup *mem = NULL;

	if (!mm)
		return NULL;
	/*
	 * Because we have no locks, mm->owner may be being moved to another
	 * cgroup. We use css_tryget() here even if this looks
	 * pessimistic (rather than adding locks here).
	 */
	rcu_read_lock();
	do {
		mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
		if (unlikely(!mem))
			break;
	} while (!css_tryget(&mem->css));
	rcu_read_unlock();
	return mem;
}

mem_cgroup_from_task() calls task_subsys_state(), which calls task_subsys_state_check(). task_subsys_state_check() will be happy if rcu_read_lock is held. yes. I don't think that this will fail lockdep, because rcu_read_lock_held() is true when calling mem_cgroup_from_task() within try_get_mem_cgroup_from_mm(). agreed. mem_cgroup_from_task() is designed to be used like this. If defined as a macro, I think it will not be caught. I do not understand how making mem_cgroup_from_task() a macro would change its behavior w.r.t. lockdep assertion checking. I assume that as a macro mem_cgroup_from_task() would still call task_subsys_state(), which requires either: a) rcu read lock held, b) task->alloc_lock held, or c) cgroup lock held. Hmm. Maybe I was wrong. Maybe it makes more sense to find a way to perform this check in mem_cgroup_has_dirty_limit() without needing to grab the rcu lock. I think this lock grab is unneeded. I am still collecting performance data, but suspect that this may be making the code slower than it needs to be. Hmm. css_set[] itself is freed by RCU... what idea do you have to remove rcu_read_lock()? Adding some flags?
It seems like a shame to need a lock to determine if current is in the root cgroup, especially given that as soon as mem_cgroup_has_dirty_limit() returns, the task could be moved in-to/out-of the root cgroup, thereby invalidating the answer. So the answer is just a sample that may be wrong. Yes. But it's not a bug, it's a specification. But I think you are correct. We will need the rcu read lock in mem_cgroup_has_dirty_limit(). yes. Ah... I noticed that you should do mem = mem_cgroup_from_task(current->mm->owner); to check has_dirty_limit... What are the cases where current->mm->owner->cgroups != current->cgroups? In that case, assume cgroups A and B. thread(1) belongs to cgroup A (thread(1) is mm->owner). thread(2) belongs to cgroup B, and a page is charged to cgroup A. Then, thread(2) makes the page dirty, which is under cgroup A. In this case, if the page's dirty_pages accounting is added to cgroup B, cgroup B's statistics may show dirty_pages > all_lru_pages. This is a bug. I was hoping to avoid having to add even more logic into mem_cgroup_has_dirty_limit() to handle the case where current->mm is NULL. Please check current->mm. We can't limit the work of kernel threads by this; let's consider it later if necessary. Presumably the newly proposed vm_dirty_param(), mem_cgroup_has_dirty_limit(), and mem_cgroup_page_stat() routines all need to use the same logic. I assume they should all be consistently using current->mm->owner or current. please. Thanks, -Kame
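The css_tryget() retry loop quoted earlier can be illustrated with a minimal userspace mock. Only the semantics matter here — a tryget fails once the css has started dying, so the lookup must be retried against the (possibly changed) owner — and every name below is a stand-in, not the kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock css: a refcount plus a "dying" flag set at cgroup destruction. */
struct css {
	int refcnt;
	bool dying;
};

/* A tryget only succeeds on a live css; on failure the caller retries. */
bool css_tryget(struct css *c)
{
	if (c->dying)
		return false; /* lost the race with cgroup removal */
	c->refcnt++;
	return true;
}

/* Retry idiom from try_get_mem_cgroup_from_mm(): re-resolve the css on
 * each iteration, since mm->owner may concurrently move cgroups. */
struct css *get_css_retrying(struct css *(*lookup)(void))
{
	struct css *c;

	do {
		c = lookup();
		if (!c)
			return NULL;
	} while (!css_tryget(c));
	return c;
}
```

In the kernel this loop runs under rcu_read_lock(), which is also what satisfies the lockdep check inside mem_cgroup_from_task(); the mock omits RCU entirely.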
[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
On Tue, 05 Oct 2010 12:00:17 -0700 Greg Thelen gthe...@google.com wrote: Andrea Righi ari...@develer.com writes: On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote: Extend mem_cgroup to contain dirty page limits. Also add routines allowing the kernel to query the dirty usage of a memcg. These interfaces are not used by the kernel yet. A subsequent commit will add kernel calls to utilize these new routines. A small note below. Signed-off-by: Greg Thelen gthe...@google.com Signed-off-by: Andrea Righi ari...@develer.com --- include/linux/memcontrol.h | 44 +++ mm/memcontrol.c | 180 +++- 2 files changed, 223 insertions(+), 1 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 6303da1..dc8952d 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -19,6 +19,7 @@ #ifndef _LINUX_MEMCONTROL_H #define _LINUX_MEMCONTROL_H +#include <linux/writeback.h> #include <linux/cgroup.h> struct mem_cgroup; struct page_cgroup; @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item { MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */ }; +/* Cgroup memory statistics items exported to the kernel */ +enum mem_cgroup_read_page_stat_item { + MEMCG_NR_DIRTYABLE_PAGES, + MEMCG_NR_RECLAIM_PAGES, + MEMCG_NR_WRITEBACK, + MEMCG_NR_DIRTY_WRITEBACK_PAGES, +}; + +/* Dirty memory parameters */ +struct vm_dirty_param { + int dirty_ratio; + int dirty_background_ratio; + unsigned long dirty_bytes; + unsigned long dirty_background_bytes; +}; + +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param) +{ + param->dirty_ratio = vm_dirty_ratio; + param->dirty_bytes = vm_dirty_bytes; + param->dirty_background_ratio = dirty_background_ratio; + param->dirty_background_bytes = dirty_background_bytes; +} + extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan, struct list_head *dst, unsigned long *scanned, int order, @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
mem_cgroup_update_page_stat(page, idx, -1); } +bool mem_cgroup_has_dirty_limit(void); +void get_vm_dirty_param(struct vm_dirty_param *param); +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item); + unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask); u64 mem_cgroup_get_limit(struct mem_cgroup *mem); @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page, { } +static inline bool mem_cgroup_has_dirty_limit(void) +{ + return false; +} + +static inline void get_vm_dirty_param(struct vm_dirty_param *param) +{ + get_global_vm_dirty_param(param); +} + +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item) +{ + return -ENOSYS; +} + static inline unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f40839f..6ec2625 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -233,6 +233,10 @@ struct mem_cgroup { atomic_t refcnt; unsigned int swappiness; + + /* control memory cgroup dirty pages */ + struct vm_dirty_param dirty_param; + /* OOM-Killer disable */ int oom_kill_disable; @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg) return swappiness; } +/* + * Returns a snapshot of the current dirty limits which is not synchronized with + * the routines that change the dirty limits. If this routine races with an + * update to the dirty bytes/ratio value, then the caller must handle the case + * where both dirty_[background_]_ratio and _bytes are set.
+ */ +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param, + struct mem_cgroup *mem) +{ + if (mem && !mem_cgroup_is_root(mem)) { + param->dirty_ratio = mem->dirty_param.dirty_ratio; + param->dirty_bytes = mem->dirty_param.dirty_bytes; + param->dirty_background_ratio = + mem->dirty_param.dirty_background_ratio; + param->dirty_background_bytes = + mem->dirty_param.dirty_background_bytes; + } else { + get_global_vm_dirty_param(param); + } +} + +/* + * Get dirty memory parameters of the current memcg or global values (if memory + * cgroups are disabled or querying the root cgroup). + */ +void get_vm_dirty_param(struct vm_dirty_param *param) +{ + struct mem_cgroup *memcg; +
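The selection logic in __mem_cgroup_get_dirty_param() above — per-memcg parameters for a non-root memcg, global sysctl values for the root cgroup or when memcg is disabled — can be sketched as a userspace mock. The globals and the helper name below are stand-ins for vm_dirty_ratio and friends, not the patch's actual symbols:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct vm_dirty_param {
	int dirty_ratio;
	int dirty_background_ratio;
	unsigned long dirty_bytes;
	unsigned long dirty_background_bytes;
};

/* Stand-in for the global vm.dirty_* sysctls. */
static const struct vm_dirty_param global_param = { 20, 10, 0, 0 };

struct mem_cgroup {
	bool is_root;
	struct vm_dirty_param dirty_param;
};

/* Non-root memcg: use its own limits; root, disabled, or no memcg
 * (NULL): fall back to the global parameters. */
void mock_get_dirty_param(struct vm_dirty_param *param,
			  const struct mem_cgroup *mem)
{
	if (mem && !mem->is_root)
		*param = mem->dirty_param;
	else
		*param = global_param;
}
```

The NULL case mirrors the `mem &&` half of the kernel check; the `is_root` flag mirrors mem_cgroup_is_root().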
[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes: On Tue, 05 Oct 2010 12:00:17 -0700 Greg Thelen gthe...@google.com wrote: Andrea Righi ari...@develer.com writes: On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote: Extend mem_cgroup to contain dirty page limits. Also add routines allowing the kernel to query the dirty usage of a memcg. These interfaces are not used by the kernel yet. A subsequent commit will add kernel calls to utilize these new routines. A small note below. Signed-off-by: Greg Thelen gthe...@google.com Signed-off-by: Andrea Righi ari...@develer.com --- include/linux/memcontrol.h | 44 +++ mm/memcontrol.c | 180 +++- 2 files changed, 223 insertions(+), 1 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 6303da1..dc8952d 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -19,6 +19,7 @@ #ifndef _LINUX_MEMCONTROL_H #define _LINUX_MEMCONTROL_H +#include <linux/writeback.h> #include <linux/cgroup.h> struct mem_cgroup; struct page_cgroup; @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item { MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */ }; +/* Cgroup memory statistics items exported to the kernel */ +enum mem_cgroup_read_page_stat_item { + MEMCG_NR_DIRTYABLE_PAGES, + MEMCG_NR_RECLAIM_PAGES, + MEMCG_NR_WRITEBACK, + MEMCG_NR_DIRTY_WRITEBACK_PAGES, +}; + +/* Dirty memory parameters */ +struct vm_dirty_param { + int dirty_ratio; + int dirty_background_ratio; + unsigned long dirty_bytes; + unsigned long dirty_background_bytes; +}; + +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param) +{ + param->dirty_ratio = vm_dirty_ratio; + param->dirty_bytes = vm_dirty_bytes; + param->dirty_background_ratio = dirty_background_ratio; + param->dirty_background_bytes = dirty_background_bytes; +} + extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan, struct list_head *dst, unsigned long *scanned, int order, @@ -145,6 +170,10 @@ static inline void
mem_cgroup_dec_page_stat(struct page *page, mem_cgroup_update_page_stat(page, idx, -1); } +bool mem_cgroup_has_dirty_limit(void); +void get_vm_dirty_param(struct vm_dirty_param *param); +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item); + unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask); u64 mem_cgroup_get_limit(struct mem_cgroup *mem); @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page, { } +static inline bool mem_cgroup_has_dirty_limit(void) +{ + return false; +} + +static inline void get_vm_dirty_param(struct vm_dirty_param *param) +{ + get_global_vm_dirty_param(param); +} + +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item) +{ + return -ENOSYS; +} + static inline unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f40839f..6ec2625 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -233,6 +233,10 @@ struct mem_cgroup { atomic_t refcnt; unsigned int swappiness; + + /* control memory cgroup dirty pages */ + struct vm_dirty_param dirty_param; + /* OOM-Killer disable */ int oom_kill_disable; @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg) return swappiness; } +/* + * Returns a snapshot of the current dirty limits which is not synchronized with + * the routines that change the dirty limits. If this routine races with an + * update to the dirty bytes/ratio value, then the caller must handle the case + * where both dirty_[background_]_ratio and _bytes are set.
+ */ +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param, + struct mem_cgroup *mem) +{ + if (mem && !mem_cgroup_is_root(mem)) { + param->dirty_ratio = mem->dirty_param.dirty_ratio; + param->dirty_bytes = mem->dirty_param.dirty_bytes; + param->dirty_background_ratio = + mem->dirty_param.dirty_background_ratio; + param->dirty_background_bytes = + mem->dirty_param.dirty_background_bytes; + } else { + get_global_vm_dirty_param(param); + } +} + +/* + * Get dirty memory parameters of the current memcg or global values (if memory + * cgroups are disabled or querying the root cgroup). + */ +void get_vm_dirty_param(struct vm_dirty_param *param) +{ + struct
[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
On Wed, 06 Oct 2010 17:27:13 -0700 Greg Thelen gthe...@google.com wrote: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes: On Tue, 05 Oct 2010 12:00:17 -0700 Greg Thelen gthe...@google.com wrote: Andrea Righi ari...@develer.com writes: On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote: Extend mem_cgroup to contain dirty page limits. Also add routines allowing the kernel to query the dirty usage of a memcg. These interfaces are not used by the kernel yet. A subsequent commit will add kernel calls to utilize these new routines. A small note below. Signed-off-by: Greg Thelen gthe...@google.com Signed-off-by: Andrea Righi ari...@develer.com --- include/linux/memcontrol.h | 44 +++ mm/memcontrol.c | 180 +++- 2 files changed, 223 insertions(+), 1 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 6303da1..dc8952d 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -19,6 +19,7 @@ #ifndef _LINUX_MEMCONTROL_H #define _LINUX_MEMCONTROL_H +#include <linux/writeback.h> #include <linux/cgroup.h> struct mem_cgroup; struct page_cgroup; @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item { MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */ }; +/* Cgroup memory statistics items exported to the kernel */ +enum mem_cgroup_read_page_stat_item { + MEMCG_NR_DIRTYABLE_PAGES, + MEMCG_NR_RECLAIM_PAGES, + MEMCG_NR_WRITEBACK, + MEMCG_NR_DIRTY_WRITEBACK_PAGES, +}; + +/* Dirty memory parameters */ +struct vm_dirty_param { + int dirty_ratio; + int dirty_background_ratio; + unsigned long dirty_bytes; + unsigned long dirty_background_bytes; +}; + +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param) +{ + param->dirty_ratio = vm_dirty_ratio; + param->dirty_bytes = vm_dirty_bytes; + param->dirty_background_ratio = dirty_background_ratio; + param->dirty_background_bytes = dirty_background_bytes; +} + extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan, struct list_head *dst, unsigned
long *scanned, int order, @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page, mem_cgroup_update_page_stat(page, idx, -1); } +bool mem_cgroup_has_dirty_limit(void); +void get_vm_dirty_param(struct vm_dirty_param *param); +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item); + unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask); u64 mem_cgroup_get_limit(struct mem_cgroup *mem); @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page, { } +static inline bool mem_cgroup_has_dirty_limit(void) +{ + return false; +} + +static inline void get_vm_dirty_param(struct vm_dirty_param *param) +{ + get_global_vm_dirty_param(param); +} + +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item) +{ + return -ENOSYS; +} + static inline unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f40839f..6ec2625 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -233,6 +233,10 @@ struct mem_cgroup { atomic_t refcnt; unsigned int swappiness; + + /* control memory cgroup dirty pages */ + struct vm_dirty_param dirty_param; + /* OOM-Killer disable */ int oom_kill_disable; @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg) return swappiness; } +/* + * Returns a snapshot of the current dirty limits which is not synchronized with + * the routines that change the dirty limits. If this routine races with an + * update to the dirty bytes/ratio value, then the caller must handle the case + * where both dirty_[background_]_ratio and _bytes are set. + */ +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param, + struct mem_cgroup *mem) +{ + if (mem && !mem_cgroup_is_root(mem)) { + param->dirty_ratio = mem->dirty_param.dirty_ratio; + param->dirty_bytes = mem->dirty_param.dirty_bytes; + param->dirty_background_ratio = + mem->dirty_param.dirty_background_ratio; + param->dirty_background_bytes = + mem->dirty_param.dirty_background_bytes; + } else { + get_global_vm_dirty_param(param); + } +}
[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
On Sun, 3 Oct 2010 23:58:02 -0700 Greg Thelen gthe...@google.com wrote: Extend mem_cgroup to contain dirty page limits. Also add routines allowing the kernel to query the dirty usage of a memcg. These interfaces are not used by the kernel yet. A subsequent commit will add kernel calls to utilize these new routines. Signed-off-by: Greg Thelen gthe...@google.com Signed-off-by: Andrea Righi ari...@develer.com Seems nice. Acked-by: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote: Extend mem_cgroup to contain dirty page limits. Also add routines allowing the kernel to query the dirty usage of a memcg. These interfaces are not used by the kernel yet. A subsequent commit will add kernel calls to utilize these new routines. A small note below. Signed-off-by: Greg Thelen gthe...@google.com Signed-off-by: Andrea Righi ari...@develer.com --- include/linux/memcontrol.h | 44 +++ mm/memcontrol.c | 180 +++- 2 files changed, 223 insertions(+), 1 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 6303da1..dc8952d 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -19,6 +19,7 @@ #ifndef _LINUX_MEMCONTROL_H #define _LINUX_MEMCONTROL_H +#include <linux/writeback.h> #include <linux/cgroup.h> struct mem_cgroup; struct page_cgroup; @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item { MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */ }; +/* Cgroup memory statistics items exported to the kernel */ +enum mem_cgroup_read_page_stat_item { + MEMCG_NR_DIRTYABLE_PAGES, + MEMCG_NR_RECLAIM_PAGES, + MEMCG_NR_WRITEBACK, + MEMCG_NR_DIRTY_WRITEBACK_PAGES, +}; + +/* Dirty memory parameters */ +struct vm_dirty_param { + int dirty_ratio; + int dirty_background_ratio; + unsigned long dirty_bytes; + unsigned long dirty_background_bytes; +}; + +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param) +{ + param->dirty_ratio = vm_dirty_ratio; + param->dirty_bytes = vm_dirty_bytes; + param->dirty_background_ratio = dirty_background_ratio; + param->dirty_background_bytes = dirty_background_bytes; +} + extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan, struct list_head *dst, unsigned long *scanned, int order, @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page, mem_cgroup_update_page_stat(page, idx, -1); } +bool mem_cgroup_has_dirty_limit(void); +void get_vm_dirty_param(struct vm_dirty_param
*param); +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item); + unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask); u64 mem_cgroup_get_limit(struct mem_cgroup *mem); @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page, { } +static inline bool mem_cgroup_has_dirty_limit(void) +{ + return false; +} + +static inline void get_vm_dirty_param(struct vm_dirty_param *param) +{ + get_global_vm_dirty_param(param); +} + +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item) +{ + return -ENOSYS; +} + static inline unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order, gfp_t gfp_mask) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index f40839f..6ec2625 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -233,6 +233,10 @@ struct mem_cgroup { atomic_t refcnt; unsigned int swappiness; + + /* control memory cgroup dirty pages */ + struct vm_dirty_param dirty_param; + /* OOM-Killer disable */ int oom_kill_disable; @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg) return swappiness; } +/* + * Returns a snapshot of the current dirty limits which is not synchronized with + * the routines that change the dirty limits. If this routine races with an + * update to the dirty bytes/ratio value, then the caller must handle the case + * where both dirty_[background_]_ratio and _bytes are set.
+ */ +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param, + struct mem_cgroup *mem) +{ + if (mem && !mem_cgroup_is_root(mem)) { + param->dirty_ratio = mem->dirty_param.dirty_ratio; + param->dirty_bytes = mem->dirty_param.dirty_bytes; + param->dirty_background_ratio = + mem->dirty_param.dirty_background_ratio; + param->dirty_background_bytes = + mem->dirty_param.dirty_background_bytes; + } else { + get_global_vm_dirty_param(param); + } +} + +/* + * Get dirty memory parameters of the current memcg or global values (if memory + * cgroups are disabled or querying the root cgroup). + */ +void get_vm_dirty_param(struct vm_dirty_param *param) +{ + struct mem_cgroup *memcg; + + if (mem_cgroup_disabled()) { + get_global_vm_dirty_param(param); + return; + } + + /* + * It's possible
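The snapshot comment above warns that a racing update can leave both the ratio and bytes fields of a vm_dirty_param set. One way a caller can resolve such a snapshot, mirroring the global precedence where a nonzero vm.dirty_bytes overrides vm.dirty_ratio (the helper below is hypothetical, not from the patch):

```c
#include <assert.h>

struct vm_dirty_param {
	int dirty_ratio;
	unsigned long dirty_bytes;
};

#define PAGE_SIZE 4096UL

/* Compute the dirty threshold in pages for a given amount of dirtyable
 * memory; when both fields are set, the absolute byte value wins, as it
 * does for the global vm.dirty_bytes sysctl. */
unsigned long dirty_threshold_pages(const struct vm_dirty_param *p,
				    unsigned long dirtyable_pages)
{
	if (p->dirty_bytes)
		return p->dirty_bytes / PAGE_SIZE;
	return dirtyable_pages * p->dirty_ratio / 100;
}
```

This keeps the snapshot routine itself lock-free: the caller tolerates a transiently inconsistent pair rather than synchronizing with updaters.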
[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup
Andrea Righi <ari...@develer.com> writes:

On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote:

Extend mem_cgroup to contain dirty page limits. Also add routines allowing the
kernel to query the dirty usage of a memcg. These interfaces are not used by
the kernel yet. A subsequent commit will add kernel calls to utilize these new
routines.

A small note below.

Signed-off-by: Greg Thelen <gthe...@google.com>
Signed-off-by: Andrea Righi <ari...@develer.com>
---
 include/linux/memcontrol.h |   44 +++
 mm/memcontrol.c            |  180 +++-
 2 files changed, 223 insertions(+), 1 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 6303da1..dc8952d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -19,6 +19,7 @@
 #ifndef _LINUX_MEMCONTROL_H
 #define _LINUX_MEMCONTROL_H
+#include <linux/writeback.h>
 #include <linux/cgroup.h>
 struct mem_cgroup;
 struct page_cgroup;
@@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item {
 	MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */
 };
+
+/* Cgroup memory statistics items exported to the kernel */
+enum mem_cgroup_read_page_stat_item {
+	MEMCG_NR_DIRTYABLE_PAGES,
+	MEMCG_NR_RECLAIM_PAGES,
+	MEMCG_NR_WRITEBACK,
+	MEMCG_NR_DIRTY_WRITEBACK_PAGES,
+};
+
+/* Dirty memory parameters */
+struct vm_dirty_param {
+	int dirty_ratio;
+	int dirty_background_ratio;
+	unsigned long dirty_bytes;
+	unsigned long dirty_background_bytes;
+};
+
+static inline void get_global_vm_dirty_param(struct vm_dirty_param *param)
+{
+	param->dirty_ratio = vm_dirty_ratio;
+	param->dirty_bytes = vm_dirty_bytes;
+	param->dirty_background_ratio = dirty_background_ratio;
+	param->dirty_background_bytes = dirty_background_bytes;
+}
+
 extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
 					struct list_head *dst,
 					unsigned long *scanned, int order,
@@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
 	mem_cgroup_update_page_stat(page, idx, -1);
 }
+
+bool mem_cgroup_has_dirty_limit(void);
+void get_vm_dirty_param(struct vm_dirty_param *param);
+s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item);
+
 unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
 					    gfp_t gfp_mask);
 u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
@@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
 {
 }
+
+static inline bool mem_cgroup_has_dirty_limit(void)
+{
+	return false;
+}
+
+static inline void get_vm_dirty_param(struct vm_dirty_param *param)
+{
+	get_global_vm_dirty_param(param);
+}
+
+static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item)
+{
+	return -ENOSYS;
+}
+
 static inline unsigned long
 mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
 			      gfp_t gfp_mask)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f40839f..6ec2625 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -233,6 +233,10 @@ struct mem_cgroup {
 	atomic_t	refcnt;
 	unsigned int	swappiness;
+
+	/* control memory cgroup dirty pages */
+	struct vm_dirty_param dirty_param;
+
 	/* OOM-Killer disable */
 	int	oom_kill_disable;
@@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg)
 	return swappiness;
 }
+
+/*
+ * Returns a snapshot of the current dirty limits which is not synchronized with
+ * the routines that change the dirty limits. If this routine races with an
+ * update to the dirty bytes/ratio value, then the caller must handle the case
+ * where both dirty_[background_]_ratio and _bytes are set.
+ */
+static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param,
+					 struct mem_cgroup *mem)
+{
+	if (mem && !mem_cgroup_is_root(mem)) {
+		param->dirty_ratio = mem->dirty_param.dirty_ratio;
+		param->dirty_bytes = mem->dirty_param.dirty_bytes;
+		param->dirty_background_ratio =
+			mem->dirty_param.dirty_background_ratio;
+		param->dirty_background_bytes =
+			mem->dirty_param.dirty_background_bytes;
+	} else {
+		get_global_vm_dirty_param(param);
+	}
+}
+
+/*
+ * Get dirty memory parameters of the current memcg or global values (if memory
+ * cgroups are disabled or querying the root cgroup).
+ */
+void get_vm_dirty_param(struct vm_dirty_param *param)
+{
+	struct mem_cgroup *memcg;
+
+	if (mem_cgroup_disabled()) {
+		get_global_vm_dirty_param(param);
+		return;
+	}
+
+	/*
+	 * It's possible