[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-12 Thread Greg Thelen
KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes:

 On Mon, 11 Oct 2010 17:24:21 -0700
 Greg Thelen gthe...@google.com wrote:

  Is your motivation to increase performance with the same functionality?
  If so, then would a 'static inline' be performance equivalent to a
  preprocessor macro yet be safer to use?
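
A minimal sketch of the safety point raised here, using hypothetical names that are not from the patch: a function-like macro may evaluate its argument more than once, while a static inline function evaluates it exactly once and type-checks it.

```c
#include <assert.h>

/* Hypothetical userspace model, not kernel code. */
struct task_model {
    int cgroup_id;
};

static int lookups; /* counts how often the argument expression is evaluated */

static struct task_model *get_task(struct task_model *t)
{
    lookups++;
    return t;
}

/* Macro form: the argument appears twice, so it is evaluated twice. */
#define CGROUP_ID_MACRO(t) ((t) ? (t)->cgroup_id : -1)

/* Inline form: the argument is evaluated once and must be a task_model pointer. */
static inline int cgroup_id_inline(struct task_model *t)
{
    return t ? t->cgroup_id : -1;
}
```

Passing get_task(&task) to each form and comparing the lookups counter afterwards shows the difference in evaluation count.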
  
  Ah, if lockdep finds this as a bug, I think other parts will hit this,
  too, like this.
  static struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
  {
          struct mem_cgroup *mem = NULL;

          if (!mm)
                  return NULL;
          /*
           * Because we have no locks, mm->owner's may be being moved to other
           * cgroup. We use css_tryget() here even if this looks
           * pessimistic (rather than adding locks here).
           */
          rcu_read_lock();
          do {
                  mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
                  if (unlikely(!mem))
                          break;
          } while (!css_tryget(&mem->css));
          rcu_read_unlock();
          return mem;
  }
 
 mem_cgroup_from_task() calls task_subsys_state() calls
 task_subsys_state_check().  task_subsys_state_check() will be happy if
 rcu_read_lock is held.
 
 yes.

 I don't think that this will fail lockdep, because rcu_read_lock_held()
 is true when calling mem_cgroup_from_task() within
 try_get_mem_cgroup_from_mm()..
 
 agreed.

  mem_cgroup_from_task() is designed to be used like this.
  If defined as a macro, I think it will not be caught.
 
 I do not understand how making mem_cgroup_from_task() a macro will
 change its behavior wrt. lockdep assertion checking.  I assume that
 as a macro mem_cgroup_from_task() would still call task_subsys_state(),
 which requires either:
 a) rcu read lock held
 b) task->alloc_lock held
 c) cgroup lock held
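
As a rough illustration of the argument above (a userspace sketch with hypothetical names, not the kernel's lockdep machinery): whether the lookup is written as a macro or as an inline function, it reaches the same check, and that check passes only if at least one of the three conditions holds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three lock states listed above. */
struct lock_state {
    bool rcu_read_lock_held;
    bool task_alloc_lock_held;
    bool cgroup_lock_held;
};

/* Models the assertion: true when any accepted condition holds. */
static bool subsys_state_access_ok(const struct lock_state *s)
{
    return s->rcu_read_lock_held ||
           s->task_alloc_lock_held ||
           s->cgroup_lock_held;
}

/* Macro form of the lookup: still ends up in the same check. */
#define LOOKUP_MACRO(s) (subsys_state_access_ok(s))

/* Inline form of the lookup: identical check, plus type checking. */
static inline bool lookup_inline(const struct lock_state *s)
{
    return subsys_state_access_ok(s);
}
```

In this model the two forms always agree, which is the reason the macro/inline choice should not change what an assertion of this kind reports.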
 

 Hmm. Maybe I was wrong.

 
  Maybe it makes more sense to find a way to perform this check in
  mem_cgroup_has_dirty_limit() without needing to grab the rcu lock.  I
  think this lock grab is unneeded.  I am still collecting performance
  data, but suspect that this may be making the code slower than it needs
  to be.
  
 
  Hmm. css_set[] itself is freed by RCU. What idea do you have for
  removing rcu_read_lock()? Adding some flags?
 
 It seems like a shame to need a lock to determine if current is in the
 root cgroup.  Especially given that as soon as
 mem_cgroup_has_dirty_limit() returns, the task could be moved
 in-to/out-of the root cgroup thereby invalidating the answer.  So the
 answer is just a sample that may be wrong. 

 Yes. But it's not a bug but a specification.

 But I think you are correct.
 We will need the rcu read lock in mem_cgroup_has_dirty_limit().
 

 yes.


  Ah...I noticed that you should do
 
   mem = mem_cgroup_from_task(current->mm->owner);
 
  to check has_dirty_limit...
 
 What are the cases where current->mm->owner->cgroups !=
 current->cgroups?
 
 In that case, assume group A and B.

thread(1) -> belongs to cgroup A  (thread(1) is mm->owner)
thread(2) -> belongs to cgroup B
 and
a page    -> charged to cgroup A

 Then, thread(2) makes the page dirty which is under cgroup A.

 In this case, if page's dirty_pages accounting is added to cgroup B,
 cgroup B's statistics may show dirty_pages > all_lru_pages. This is
 a bug.

I agree that in this case the dirty_pages accounting should be added to
cgroup A because that is where the page was charged.  This will happen
because pc->mem_cgroup was set to A when the page was charged.  The
mark-page-dirty code will check pc->mem_cgroup to determine which cgroup
to add the dirty page to.

I think that the current vs current->mm->owner decision is in areas of
the code that are used to query the dirty limits.  These routines do not
use this data to determine which cgroup to charge for dirty pages.  The
usage of either mem_cgroup_from_task(current->mm->owner) or
mem_cgroup_from_task(current) in mem_cgroup_has_dirty_limit() does not
determine which cgroup is added for dirty_pages.
mem_cgroup_has_dirty_limit() is only used to determine if the process
has a dirty limit.  As discussed, this is a momentary answer that may be
wrong by the time decisions are made because the task may be migrated
in-to/out-of root cgroup while mem_cgroup_has_dirty_limit() runs.  If
the process has a dirty limit, then the process's memcg is used to
compute dirty limits.  Using your example, I assume that thread(1) and
thread(2) will get dirty limits from cgroup(A) and cgroup(B)
respectively.
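
The distinction between accounting and limit lookup can be sketched in a small userspace model. All type and function names below are hypothetical stand-ins chosen for illustration; only the idea of a per-page pointer to the charged cgroup (pc->mem_cgroup) comes from this thread.

```c
#include <assert.h>

/* Hypothetical model of a memory cgroup. */
struct memcg_model {
    long nr_dirty;          /* dirty pages accounted to this group */
    long dirty_limit_pages; /* this group's dirty limit */
};

/* Models the per-page record: set once, when the page is charged. */
struct page_cgroup_model {
    struct memcg_model *mem_cgroup;
};

/* Models a thread and the cgroup it currently belongs to. */
struct task_model {
    struct memcg_model *memcg;
};

/* Accounting follows the page's charge, not the thread that dirtied it. */
static void account_dirty_page(struct page_cgroup_model *pc)
{
    pc->mem_cgroup->nr_dirty++;
}

/* The limit query follows the calling thread's own cgroup. */
static long task_dirty_limit(const struct task_model *t)
{
    return t->memcg->dirty_limit_pages;
}
```

With a page charged to A and dirtied by a thread in B, the model adds the dirty page to A while the thread is throttled against B's limit.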

Are you thinking that when accounting for a dirty page (by incrementing
pc->mem_cgroup->stat->count[MEM_CGROUP_STAT_FILE_DIRTY]) that we should
check the pc->mem_cgroup dirty limit?

 I was hoping to avoid having to add even more logic into
 mem_cgroup_has_dirty_limit() to handle the case where current->mm is
 NULL.
 

 Please check current->mm. We can't limit the work of kernel threads by this; let's
 consider it later if necessary.

 Presumably the newly proposed 

[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-12 Thread KAMEZAWA Hiroyuki
On Tue, 12 Oct 2010 00:32:33 -0700
Greg Thelen gthe...@google.com wrote:

 KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes:

  What are the cases where current->mm->owner->cgroups !=
  current->cgroups?
  
  In that case, assume group A and B.
 
 thread(1) -> belongs to cgroup A  (thread(1) is mm->owner)
 thread(2) -> belongs to cgroup B
  and
 a page    -> charged to cgroup A
 
  Then, thread(2) makes the page dirty which is under cgroup A.
 
  In this case, if page's dirty_pages accounting is added to cgroup B,
  cgroup B's statistics may show dirty_pages > all_lru_pages. This is
  a bug.
 
 I agree that in this case the dirty_pages accounting should be added to
 cgroup A because that is where the page was charged.  This will happen
 because pc->mem_cgroup was set to A when the page was charged.  The
 mark-page-dirty code will check pc->mem_cgroup to determine which cgroup
 to add the dirty page to.
 
 I think that the current vs current->mm->owner decision is in areas of
 the code that are used to query the dirty limits.  These routines do not
 use this data to determine which cgroup to charge for dirty pages.  The
 usage of either mem_cgroup_from_task(current->mm->owner) or
 mem_cgroup_from_task(current) in mem_cgroup_has_dirty_limit() does not
 determine which cgroup is added for dirty_pages.
 mem_cgroup_has_dirty_limit() is only used to determine if the process
 has a dirty limit.  As discussed, this is a momentary answer that may be
 wrong by the time decisions are made because the task may be migrated
 in-to/out-of root cgroup while mem_cgroup_has_dirty_limit() runs.  If
 the process has a dirty limit, then the process's memcg is used to
 compute dirty limits.  Using your example, I assume that thread(1) and
 thread(2) will get dirty limits from cgroup(A) and cgroup(B)
 respectively.
 

Ok, thank you for the clarification. Throttling a thread based on its own
cgroup, not on mm->owner, makes sense. Could you add a brief comment in
the code?

Thanks,
-Kame

___
Containers mailing list
contain...@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

___
Devel mailing list
Devel@openvz.org
https://openvz.org/mailman/listinfo/devel


[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-11 Thread Greg Thelen
KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes:

 On Wed, 06 Oct 2010 17:27:13 -0700
 Greg Thelen gthe...@google.com wrote:

 KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes:
 
  On Tue, 05 Oct 2010 12:00:17 -0700
  Greg Thelen gthe...@google.com wrote:
 
  Andrea Righi ari...@develer.com writes:
  
   On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote:
   Extend mem_cgroup to contain dirty page limits.  Also add routines
   allowing the kernel to query the dirty usage of a memcg.
   
   These interfaces are not used by the kernel yet.  A subsequent commit
   will add kernel calls to utilize these new routines.
  
   A small note below.
  
   
   Signed-off-by: Greg Thelen gthe...@google.com
   Signed-off-by: Andrea Righi ari...@develer.com
   ---
include/linux/memcontrol.h |   44 +++
mm/memcontrol.c            |  180 +++-
2 files changed, 223 insertions(+), 1 deletions(-)
   
   diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
   index 6303da1..dc8952d 100644
   --- a/include/linux/memcontrol.h
   +++ b/include/linux/memcontrol.h
   @@ -19,6 +19,7 @@

#ifndef _LINUX_MEMCONTROL_H
#define _LINUX_MEMCONTROL_H
   +#include <linux/writeback.h>
#include <linux/cgroup.h>
struct mem_cgroup;
struct page_cgroup;
   @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item {
  MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */
};

   +/* Cgroup memory statistics items exported to the kernel */
   +enum mem_cgroup_read_page_stat_item {
   +  MEMCG_NR_DIRTYABLE_PAGES,
   +  MEMCG_NR_RECLAIM_PAGES,
   +  MEMCG_NR_WRITEBACK,
   +  MEMCG_NR_DIRTY_WRITEBACK_PAGES,
   +};
   +
   +/* Dirty memory parameters */
   +struct vm_dirty_param {
   +  int dirty_ratio;
   +  int dirty_background_ratio;
   +  unsigned long dirty_bytes;
   +  unsigned long dirty_background_bytes;
   +};
   +
   +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param)
   +{
   +  param->dirty_ratio = vm_dirty_ratio;
   +  param->dirty_bytes = vm_dirty_bytes;
   +  param->dirty_background_ratio = dirty_background_ratio;
   +  param->dirty_background_bytes = dirty_background_bytes;
   +}
   +
extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
  struct list_head *dst,
  unsigned long *scanned, int order,
   @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
  mem_cgroup_update_page_stat(page, idx, -1);
}

   +bool mem_cgroup_has_dirty_limit(void);
   +void get_vm_dirty_param(struct vm_dirty_param *param);
   +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item);
   +
unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
  gfp_t gfp_mask);
u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
   @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
{
}

   +static inline bool mem_cgroup_has_dirty_limit(void)
   +{
   +  return false;
   +}
   +
   +static inline void get_vm_dirty_param(struct vm_dirty_param *param)
   +{
   +  get_global_vm_dirty_param(param);
   +}
   +
   +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item)
   +{
   +  return -ENOSYS;
   +}
   +
static inline
unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
  gfp_t gfp_mask)
   diff --git a/mm/memcontrol.c b/mm/memcontrol.c
   index f40839f..6ec2625 100644
   --- a/mm/memcontrol.c
   +++ b/mm/memcontrol.c
   @@ -233,6 +233,10 @@ struct mem_cgroup {
  atomic_t refcnt;

  unsigned int swappiness;
   +
   +  /* control memory cgroup dirty pages */
   +  struct vm_dirty_param dirty_param;
   +
  /* OOM-Killer disable */
  int oom_kill_disable;

   @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg)
  return swappiness;
}

   +/*
   + * Returns a snapshot of the current dirty limits which is not synchronized with
   + * the routines that change the dirty limits.  If this routine races with an
   + * update to the dirty bytes/ratio value, then the caller must handle the case
   + * where both dirty_[background_]ratio and _bytes are set.
   + */
   +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param,
   +   struct mem_cgroup *mem)
   +{
   +  if (mem && !mem_cgroup_is_root(mem)) {
   +    param->dirty_ratio = mem->dirty_param.dirty_ratio;
   +    param->dirty_bytes = mem->dirty_param.dirty_bytes;
   +    param->dirty_background_ratio =
   +  

[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-11 Thread KAMEZAWA Hiroyuki
On Mon, 11 Oct 2010 17:24:21 -0700
Greg Thelen gthe...@google.com wrote:

  Is your motivation to increase performance with the same functionality?
  If so, then would a 'static inline' be performance equivalent to a
  preprocessor macro yet be safer to use?
  
  Ah, if lockdep finds this as a bug, I think other parts will hit this,
  too, like this.
  static struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
  {
          struct mem_cgroup *mem = NULL;

          if (!mm)
                  return NULL;
          /*
           * Because we have no locks, mm->owner's may be being moved to other
           * cgroup. We use css_tryget() here even if this looks
           * pessimistic (rather than adding locks here).
           */
          rcu_read_lock();
          do {
                  mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
                  if (unlikely(!mem))
                          break;
          } while (!css_tryget(&mem->css));
          rcu_read_unlock();
          return mem;
  }
 
 mem_cgroup_from_task() calls task_subsys_state() calls
 task_subsys_state_check().  task_subsys_state_check() will be happy if
 rcu_read_lock is held.
 
yes.

 I don't think that this will fail lockdep, because rcu_read_lock_held()
 is true when calling mem_cgroup_from_task() within
 try_get_mem_cgroup_from_mm()..
 
agreed.

  mem_cgroup_from_task() is designed to be used like this.
  If defined as a macro, I think it will not be caught.
 
 I do not understand how making mem_cgroup_from_task() a macro will
 change its behavior wrt. lockdep assertion checking.  I assume that
 as a macro mem_cgroup_from_task() would still call task_subsys_state(),
 which requires either:
 a) rcu read lock held
 b) task->alloc_lock held
 c) cgroup lock held
 

Hmm. Maybe I was wrong.

 
  Maybe it makes more sense to find a way to perform this check in
  mem_cgroup_has_dirty_limit() without needing to grab the rcu lock.  I
  think this lock grab is unneeded.  I am still collecting performance
  data, but suspect that this may be making the code slower than it needs
  to be.
  
 
  Hmm. css_set[] itself is freed by RCU. What idea do you have for
  removing rcu_read_lock()? Adding some flags?
 
 It seems like a shame to need a lock to determine if current is in the
 root cgroup.  Especially given that as soon as
 mem_cgroup_has_dirty_limit() returns, the task could be moved
 in-to/out-of the root cgroup thereby invalidating the answer.  So the
 answer is just a sample that may be wrong. 

Yes. But it's not a bug but a specification.

 But I think you are correct.
 We will need the rcu read lock in mem_cgroup_has_dirty_limit().
 

yes.


  Ah...I noticed that you should do
 
   mem = mem_cgroup_from_task(current->mm->owner);
 
  to check has_dirty_limit...
 
 What are the cases where current->mm->owner->cgroups !=
 current->cgroups?
 
In that case, assume group A and B.

   thread(1) -> belongs to cgroup A  (thread(1) is mm->owner)
   thread(2) -> belongs to cgroup B
and
   a page    -> charged to cgroup A

Then, thread(2) makes the page dirty which is under cgroup A.

In this case, if page's dirty_pages accounting is added to cgroup B, cgroup B's
statistics may show dirty_pages > all_lru_pages. This is a bug.


 I was hoping to avoid having to add even more logic into
 mem_cgroup_has_dirty_limit() to handle the case where current->mm is
 NULL.
 

Please check current->mm. We can't limit the work of kernel threads by this; let's
consider it later if necessary.

 Presumably the newly proposed vm_dirty_param(),
 mem_cgroup_has_dirty_limit(), and mem_cgroup_page_stat() routines all
 need to use the same logic.  I assume they should all be consistently
 using current->mm->owner or current.
 

please.

Thanks,
-Kame





[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-06 Thread KAMEZAWA Hiroyuki
On Tue, 05 Oct 2010 12:00:17 -0700
Greg Thelen gthe...@google.com wrote:

 Andrea Righi ari...@develer.com writes:
 
  On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote:
  Extend mem_cgroup to contain dirty page limits.  Also add routines
  allowing the kernel to query the dirty usage of a memcg.
  
  These interfaces are not used by the kernel yet.  A subsequent commit
  will add kernel calls to utilize these new routines.
 
  A small note below.
 
  
  Signed-off-by: Greg Thelen gthe...@google.com
  Signed-off-by: Andrea Righi ari...@develer.com
  ---
   include/linux/memcontrol.h |   44 +++
   mm/memcontrol.c            |  180 +++-
   2 files changed, 223 insertions(+), 1 deletions(-)
  
  diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
  index 6303da1..dc8952d 100644
  --- a/include/linux/memcontrol.h
  +++ b/include/linux/memcontrol.h
  @@ -19,6 +19,7 @@
   
   #ifndef _LINUX_MEMCONTROL_H
   #define _LINUX_MEMCONTROL_H
  +#include <linux/writeback.h>
   #include <linux/cgroup.h>
   struct mem_cgroup;
   struct page_cgroup;
  @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item {
 MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */
   };
   
  +/* Cgroup memory statistics items exported to the kernel */
  +enum mem_cgroup_read_page_stat_item {
  +  MEMCG_NR_DIRTYABLE_PAGES,
  +  MEMCG_NR_RECLAIM_PAGES,
  +  MEMCG_NR_WRITEBACK,
  +  MEMCG_NR_DIRTY_WRITEBACK_PAGES,
  +};
  +
  +/* Dirty memory parameters */
  +struct vm_dirty_param {
  +  int dirty_ratio;
  +  int dirty_background_ratio;
  +  unsigned long dirty_bytes;
  +  unsigned long dirty_background_bytes;
  +};
  +
  +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param)
  +{
  +  param->dirty_ratio = vm_dirty_ratio;
  +  param->dirty_bytes = vm_dirty_bytes;
  +  param->dirty_background_ratio = dirty_background_ratio;
  +  param->dirty_background_bytes = dirty_background_bytes;
  +}
  +
   extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
 struct list_head *dst,
 unsigned long *scanned, int order,
  @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
 mem_cgroup_update_page_stat(page, idx, -1);
   }
   
  +bool mem_cgroup_has_dirty_limit(void);
  +void get_vm_dirty_param(struct vm_dirty_param *param);
  +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item);
  +
   unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
 gfp_t gfp_mask);
   u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
  @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
   {
   }
   
  +static inline bool mem_cgroup_has_dirty_limit(void)
  +{
  +  return false;
  +}
  +
  +static inline void get_vm_dirty_param(struct vm_dirty_param *param)
  +{
  +  get_global_vm_dirty_param(param);
  +}
  +
  +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item)
  +{
  +  return -ENOSYS;
  +}
  +
   static inline
   unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
 gfp_t gfp_mask)
  diff --git a/mm/memcontrol.c b/mm/memcontrol.c
  index f40839f..6ec2625 100644
  --- a/mm/memcontrol.c
  +++ b/mm/memcontrol.c
  @@ -233,6 +233,10 @@ struct mem_cgroup {
 atomic_t refcnt;
   
 unsigned int swappiness;
  +
  +  /* control memory cgroup dirty pages */
  +  struct vm_dirty_param dirty_param;
  +
 /* OOM-Killer disable */
 int oom_kill_disable;
   
  @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg)
 return swappiness;
   }
   
  +/*
  + * Returns a snapshot of the current dirty limits which is not synchronized with
  + * the routines that change the dirty limits.  If this routine races with an
  + * update to the dirty bytes/ratio value, then the caller must handle the case
  + * where both dirty_[background_]ratio and _bytes are set.
  + */
  +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param,
  +   struct mem_cgroup *mem)
  +{
  +  if (mem && !mem_cgroup_is_root(mem)) {
  +    param->dirty_ratio = mem->dirty_param.dirty_ratio;
  +    param->dirty_bytes = mem->dirty_param.dirty_bytes;
  +    param->dirty_background_ratio =
  +      mem->dirty_param.dirty_background_ratio;
  +    param->dirty_background_bytes =
  +      mem->dirty_param.dirty_background_bytes;
  +  } else {
  +    get_global_vm_dirty_param(param);
  +  }
  +}
  +
  +/*
  + * Get dirty memory parameters of the current memcg or global values (if memory
  + * cgroups are disabled or querying the root cgroup).
  + */
  +void get_vm_dirty_param(struct vm_dirty_param *param)
  +{
  +  struct mem_cgroup *memcg;
  +
  + 
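
The snapshot comment quoted above notes that a caller may observe both a ratio and a byte value set at once. One policy a caller might adopt is to let a non-zero byte value take precedence over the ratio. The sketch below is a userspace model under that assumption; the struct, function, and parameter names are hypothetical and are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical subset of the dirty parameters discussed in the patch. */
struct dirty_param_model {
    int dirty_ratio;           /* percent of available pages */
    unsigned long dirty_bytes; /* absolute limit in bytes; 0 means unset */
};

/*
 * Resolve a limit in pages. Assumption: when both fields are set,
 * the byte value wins and is rounded up to whole pages.
 */
static unsigned long dirty_limit_pages(const struct dirty_param_model *p,
                                       unsigned long available_pages,
                                       unsigned long page_size)
{
    if (p->dirty_bytes)
        return (p->dirty_bytes + page_size - 1) / page_size;
    return (unsigned long)p->dirty_ratio * available_pages / 100;
}
```

Whatever policy is chosen, resolving the ambiguity in one helper keeps every caller consistent when it races with an update.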

[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-06 Thread Greg Thelen
KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes:

 On Tue, 05 Oct 2010 12:00:17 -0700
 Greg Thelen gthe...@google.com wrote:

 Andrea Righi ari...@develer.com writes:
 
  On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote:
  Extend mem_cgroup to contain dirty page limits.  Also add routines
  allowing the kernel to query the dirty usage of a memcg.
  
  These interfaces are not used by the kernel yet.  A subsequent commit
  will add kernel calls to utilize these new routines.
 
  A small note below.
 
  
  Signed-off-by: Greg Thelen gthe...@google.com
  Signed-off-by: Andrea Righi ari...@develer.com
  ---
   include/linux/memcontrol.h |   44 +++
   mm/memcontrol.c            |  180 +++-
   2 files changed, 223 insertions(+), 1 deletions(-)
  
  diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
  index 6303da1..dc8952d 100644
  --- a/include/linux/memcontrol.h
  +++ b/include/linux/memcontrol.h
  @@ -19,6 +19,7 @@
   
   #ifndef _LINUX_MEMCONTROL_H
   #define _LINUX_MEMCONTROL_H
  +#include <linux/writeback.h>
   #include <linux/cgroup.h>
   struct mem_cgroup;
   struct page_cgroup;
  @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item {
MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */
   };
   
  +/* Cgroup memory statistics items exported to the kernel */
  +enum mem_cgroup_read_page_stat_item {
  + MEMCG_NR_DIRTYABLE_PAGES,
  + MEMCG_NR_RECLAIM_PAGES,
  + MEMCG_NR_WRITEBACK,
  + MEMCG_NR_DIRTY_WRITEBACK_PAGES,
  +};
  +
  +/* Dirty memory parameters */
  +struct vm_dirty_param {
  + int dirty_ratio;
  + int dirty_background_ratio;
  + unsigned long dirty_bytes;
  + unsigned long dirty_background_bytes;
  +};
  +
  +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param)
  +{
  + param->dirty_ratio = vm_dirty_ratio;
  + param->dirty_bytes = vm_dirty_bytes;
  + param->dirty_background_ratio = dirty_background_ratio;
  + param->dirty_background_bytes = dirty_background_bytes;
  +}
  +
   extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
struct list_head *dst,
unsigned long *scanned, int order,
  @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
mem_cgroup_update_page_stat(page, idx, -1);
   }
   
  +bool mem_cgroup_has_dirty_limit(void);
  +void get_vm_dirty_param(struct vm_dirty_param *param);
  +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item);
  +
   unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
gfp_t gfp_mask);
   u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
  @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
   {
   }
   
  +static inline bool mem_cgroup_has_dirty_limit(void)
  +{
  + return false;
  +}
  +
  +static inline void get_vm_dirty_param(struct vm_dirty_param *param)
  +{
  + get_global_vm_dirty_param(param);
  +}
  +
  +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item)
  +{
  + return -ENOSYS;
  +}
  +
   static inline
   unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
gfp_t gfp_mask)
  diff --git a/mm/memcontrol.c b/mm/memcontrol.c
  index f40839f..6ec2625 100644
  --- a/mm/memcontrol.c
  +++ b/mm/memcontrol.c
  @@ -233,6 +233,10 @@ struct mem_cgroup {
atomic_t refcnt;
   
unsigned int swappiness;
  +
  + /* control memory cgroup dirty pages */
  + struct vm_dirty_param dirty_param;
  +
/* OOM-Killer disable */
int oom_kill_disable;
   
  @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg)
return swappiness;
   }
   
  +/*
  + * Returns a snapshot of the current dirty limits which is not synchronized with
  + * the routines that change the dirty limits.  If this routine races with an
  + * update to the dirty bytes/ratio value, then the caller must handle the case
  + * where both dirty_[background_]ratio and _bytes are set.
  + */
  +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param,
  +  struct mem_cgroup *mem)
  +{
  + if (mem && !mem_cgroup_is_root(mem)) {
  +   param->dirty_ratio = mem->dirty_param.dirty_ratio;
  +   param->dirty_bytes = mem->dirty_param.dirty_bytes;
  +   param->dirty_background_ratio =
  +     mem->dirty_param.dirty_background_ratio;
  +   param->dirty_background_bytes =
  +     mem->dirty_param.dirty_background_bytes;
  + } else {
  +   get_global_vm_dirty_param(param);
  + }
  +}
  +
  +/*
  + * Get dirty memory parameters of the current memcg or global values (if memory
  + * cgroups are disabled or querying the root cgroup).
  + */
  +void get_vm_dirty_param(struct vm_dirty_param *param)
  +{
  + struct 

[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-06 Thread KAMEZAWA Hiroyuki
On Wed, 06 Oct 2010 17:27:13 -0700
Greg Thelen gthe...@google.com wrote:

 KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com writes:
 
  On Tue, 05 Oct 2010 12:00:17 -0700
  Greg Thelen gthe...@google.com wrote:
 
  Andrea Righi ari...@develer.com writes:
  
   On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote:
   Extend mem_cgroup to contain dirty page limits.  Also add routines
   allowing the kernel to query the dirty usage of a memcg.
   
   These interfaces are not used by the kernel yet.  A subsequent commit
   will add kernel calls to utilize these new routines.
  
   A small note below.
  
   
   Signed-off-by: Greg Thelen gthe...@google.com
   Signed-off-by: Andrea Righi ari...@develer.com
   ---
include/linux/memcontrol.h |   44 +++
mm/memcontrol.c            |  180 +++-
2 files changed, 223 insertions(+), 1 deletions(-)
   
   diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
   index 6303da1..dc8952d 100644
   --- a/include/linux/memcontrol.h
   +++ b/include/linux/memcontrol.h
   @@ -19,6 +19,7 @@

#ifndef _LINUX_MEMCONTROL_H
#define _LINUX_MEMCONTROL_H
   +#include <linux/writeback.h>
#include <linux/cgroup.h>
struct mem_cgroup;
struct page_cgroup;
   @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item {
   MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */
};

   +/* Cgroup memory statistics items exported to the kernel */
   +enum mem_cgroup_read_page_stat_item {
   +   MEMCG_NR_DIRTYABLE_PAGES,
   +   MEMCG_NR_RECLAIM_PAGES,
   +   MEMCG_NR_WRITEBACK,
   +   MEMCG_NR_DIRTY_WRITEBACK_PAGES,
   +};
   +
   +/* Dirty memory parameters */
   +struct vm_dirty_param {
   +   int dirty_ratio;
   +   int dirty_background_ratio;
   +   unsigned long dirty_bytes;
   +   unsigned long dirty_background_bytes;
   +};
   +
   +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param)
   +{
   +   param->dirty_ratio = vm_dirty_ratio;
   +   param->dirty_bytes = vm_dirty_bytes;
   +   param->dirty_background_ratio = dirty_background_ratio;
   +   param->dirty_background_bytes = dirty_background_bytes;
   +}
   +
extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
   struct list_head *dst,
   unsigned long *scanned, int order,
   @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
   mem_cgroup_update_page_stat(page, idx, -1);
}

   +bool mem_cgroup_has_dirty_limit(void);
   +void get_vm_dirty_param(struct vm_dirty_param *param);
   +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item);
   +
unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
   gfp_t gfp_mask);
u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
   @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
{
}

   +static inline bool mem_cgroup_has_dirty_limit(void)
   +{
   +   return false;
   +}
   +
   +static inline void get_vm_dirty_param(struct vm_dirty_param *param)
   +{
   +   get_global_vm_dirty_param(param);
   +}
   +
   +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item)
   +{
   +   return -ENOSYS;
   +}
   +
static inline
unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
   gfp_t gfp_mask)
   diff --git a/mm/memcontrol.c b/mm/memcontrol.c
   index f40839f..6ec2625 100644
   --- a/mm/memcontrol.c
   +++ b/mm/memcontrol.c
   @@ -233,6 +233,10 @@ struct mem_cgroup {
   atomic_t refcnt;

   unsigned int swappiness;
   +
   +   /* control memory cgroup dirty pages */
   +   struct vm_dirty_param dirty_param;
   +
   /* OOM-Killer disable */
   int oom_kill_disable;

   @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg)
   return swappiness;
}

   +/*
   + * Returns a snapshot of the current dirty limits which is not synchronized with
   + * the routines that change the dirty limits.  If this routine races with an
   + * update to the dirty bytes/ratio value, then the caller must handle the case
   + * where both dirty_[background_]ratio and _bytes are set.
   + */
   +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param,
   +struct mem_cgroup *mem)
   +{
   +   if (mem && !mem_cgroup_is_root(mem)) {
   +      param->dirty_ratio = mem->dirty_param.dirty_ratio;
   +      param->dirty_bytes = mem->dirty_param.dirty_bytes;
   +      param->dirty_background_ratio =
   +   

[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-05 Thread KAMEZAWA Hiroyuki
On Sun,  3 Oct 2010 23:58:02 -0700
Greg Thelen gthe...@google.com wrote:

 Extend mem_cgroup to contain dirty page limits.  Also add routines
 allowing the kernel to query the dirty usage of a memcg.
 
 These interfaces are not used by the kernel yet.  A subsequent commit
 will add kernel calls to utilize these new routines.
 
 Signed-off-by: Greg Thelen gthe...@google.com
 Signed-off-by: Andrea Righi ari...@develer.com

Seems nice.
Acked-by: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com



[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-05 Thread Andrea Righi
On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote:
 Extend mem_cgroup to contain dirty page limits.  Also add routines
 allowing the kernel to query the dirty usage of a memcg.
 
 These interfaces are not used by the kernel yet.  A subsequent commit
 will add kernel calls to utilize these new routines.

A small note below.

 
 Signed-off-by: Greg Thelen gthe...@google.com
 Signed-off-by: Andrea Righi ari...@develer.com
 ---
  include/linux/memcontrol.h |   44 +++
  mm/memcontrol.c            |  180 +++-
  2 files changed, 223 insertions(+), 1 deletions(-)
 
 diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
 index 6303da1..dc8952d 100644
 --- a/include/linux/memcontrol.h
 +++ b/include/linux/memcontrol.h
 @@ -19,6 +19,7 @@
  
  #ifndef _LINUX_MEMCONTROL_H
  #define _LINUX_MEMCONTROL_H
 +#include <linux/writeback.h>
  #include <linux/cgroup.h>
  struct mem_cgroup;
  struct page_cgroup;
 @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item {
   MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */
  };
  
 +/* Cgroup memory statistics items exported to the kernel */
 +enum mem_cgroup_read_page_stat_item {
 + MEMCG_NR_DIRTYABLE_PAGES,
 + MEMCG_NR_RECLAIM_PAGES,
 + MEMCG_NR_WRITEBACK,
 + MEMCG_NR_DIRTY_WRITEBACK_PAGES,
 +};
 +
 +/* Dirty memory parameters */
 +struct vm_dirty_param {
 + int dirty_ratio;
 + int dirty_background_ratio;
 + unsigned long dirty_bytes;
 + unsigned long dirty_background_bytes;
 +};
 +
 +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param)
 +{
 + param->dirty_ratio = vm_dirty_ratio;
 + param->dirty_bytes = vm_dirty_bytes;
 + param->dirty_background_ratio = dirty_background_ratio;
 + param->dirty_background_bytes = dirty_background_bytes;
 +}
 +
  extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
   struct list_head *dst,
   unsigned long *scanned, int order,
 @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
   mem_cgroup_update_page_stat(page, idx, -1);
  }
  
 +bool mem_cgroup_has_dirty_limit(void);
 +void get_vm_dirty_param(struct vm_dirty_param *param);
 +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item);
 +
  unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
   gfp_t gfp_mask);
  u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
 @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
  {
  }
  
 +static inline bool mem_cgroup_has_dirty_limit(void)
 +{
 + return false;
 +}
 +
 +static inline void get_vm_dirty_param(struct vm_dirty_param *param)
 +{
 + get_global_vm_dirty_param(param);
 +}
 +
 +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item)
 +{
 + return -ENOSYS;
 +}
 +
  static inline
  unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
   gfp_t gfp_mask)
 diff --git a/mm/memcontrol.c b/mm/memcontrol.c
 index f40839f..6ec2625 100644
 --- a/mm/memcontrol.c
 +++ b/mm/memcontrol.c
 @@ -233,6 +233,10 @@ struct mem_cgroup {
   atomic_t refcnt;
  
   unsigned int swappiness;
 +
 + /* control memory cgroup dirty pages */
 + struct vm_dirty_param dirty_param;
 +
   /* OOM-Killer disable */
   int oom_kill_disable;
  
 @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg)
   return swappiness;
  }
  
 +/*
 + * Returns a snapshot of the current dirty limits which is not synchronized
 + * with the routines that change the dirty limits.  If this routine races with
 + * an update to the dirty bytes/ratio value, then the caller must handle the
 + * case where both dirty_[background_]_ratio and _bytes are set.
 + */
 +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param,
 +  struct mem_cgroup *mem)
 +{
 + if (mem && !mem_cgroup_is_root(mem)) {
 + param->dirty_ratio = mem->dirty_param.dirty_ratio;
 + param->dirty_bytes = mem->dirty_param.dirty_bytes;
 + param->dirty_background_ratio =
 + mem->dirty_param.dirty_background_ratio;
 + param->dirty_background_bytes =
 + mem->dirty_param.dirty_background_bytes;
 + } else {
 + get_global_vm_dirty_param(param);
 + }
 +}
 +
 +/*
 + * Get dirty memory parameters of the current memcg or global values (if
 + * memory cgroups are disabled or querying the root cgroup).
 + */
 +void get_vm_dirty_param(struct vm_dirty_param *param)
 +{
 + struct mem_cgroup *memcg;
 +
 + if (mem_cgroup_disabled()) {
 + get_global_vm_dirty_param(param);
 + return;
 + }
 +
 + /*
 +  * It's possible 
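
For readers following the thread, here is a minimal userspace sketch of the two ideas in the comments quoted above: the snapshot falls back to the global values for a NULL or root group, and the caller has to cope with a snapshot in which both the ratio and the byte value are set. Everything prefixed with sketch_ or SKETCH_ is hypothetical and not part of the patch; the global defaults and the 4 KiB page size are assumptions. Letting a nonzero byte value win over the ratio is only one way a caller might resolve the race, chosen here because it gives a single deterministic answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same layout as the struct introduced by the patch. */
struct vm_dirty_param {
	int dirty_ratio;
	int dirty_background_ratio;
	unsigned long dirty_bytes;
	unsigned long dirty_background_bytes;
};

/* Hypothetical stand-in for struct mem_cgroup; only what the sketch needs. */
struct sketch_memcg {
	bool is_root;
	struct vm_dirty_param dirty_param;
};

/* Assumed global settings, standing in for vm_dirty_ratio and friends. */
static const struct vm_dirty_param sketch_global = {
	.dirty_ratio = 20,
	.dirty_background_ratio = 10,
};

/*
 * Unsynchronized snapshot: a non-root group supplies its own values,
 * everything else (NULL or root) falls back to the global ones.
 */
static void sketch_get_dirty_param(struct vm_dirty_param *param,
				   const struct sketch_memcg *mem)
{
	if (mem && !mem->is_root)
		*param = mem->dirty_param;
	else
		*param = sketch_global;
}

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/*
 * Hypothetical caller-side helper: convert one (bytes, ratio) pair to a
 * page count.  A nonzero byte value takes precedence, so a snapshot that
 * raced with an update and has both fields set still gives one answer.
 */
static unsigned long sketch_limit_pages(unsigned long bytes, int ratio,
					unsigned long dirtyable_pages)
{
	assert(ratio >= 0 && ratio <= 100);
	if (bytes)
		return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	return dirtyable_pages * (unsigned long)ratio / 100;
}
```

A caller would take one snapshot with sketch_get_dirty_param() and then pass the foreground pair and the background pair through sketch_limit_pages() separately, so that both limits are derived from the same snapshot.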

[Devel] Re: [PATCH 07/10] memcg: add dirty limits to mem_cgroup

2010-10-05 Thread Greg Thelen
Andrea Righi ari...@develer.com writes:

 On Sun, Oct 03, 2010 at 11:58:02PM -0700, Greg Thelen wrote:
 Extend mem_cgroup to contain dirty page limits.  Also add routines
 allowing the kernel to query the dirty usage of a memcg.
 
 These interfaces are not used by the kernel yet.  A subsequent commit
 will add kernel calls to utilize these new routines.

 A small note below.

 
 Signed-off-by: Greg Thelen gthe...@google.com
 Signed-off-by: Andrea Righi ari...@develer.com
 ---
  include/linux/memcontrol.h |   44 +++
  mm/memcontrol.c            |  180 +++-
  2 files changed, 223 insertions(+), 1 deletions(-)
 
 diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
 index 6303da1..dc8952d 100644
 --- a/include/linux/memcontrol.h
 +++ b/include/linux/memcontrol.h
 @@ -19,6 +19,7 @@
  
  #ifndef _LINUX_MEMCONTROL_H
  #define _LINUX_MEMCONTROL_H
 +#include <linux/writeback.h>
  #include <linux/cgroup.h>
  struct mem_cgroup;
  struct page_cgroup;
 @@ -33,6 +34,30 @@ enum mem_cgroup_write_page_stat_item {
  MEMCG_NR_FILE_UNSTABLE_NFS, /* # of NFS unstable pages */
  };
  
 +/* Cgroup memory statistics items exported to the kernel */
 +enum mem_cgroup_read_page_stat_item {
 +MEMCG_NR_DIRTYABLE_PAGES,
 +MEMCG_NR_RECLAIM_PAGES,
 +MEMCG_NR_WRITEBACK,
 +MEMCG_NR_DIRTY_WRITEBACK_PAGES,
 +};
 +
 +/* Dirty memory parameters */
 +struct vm_dirty_param {
 +int dirty_ratio;
 +int dirty_background_ratio;
 +unsigned long dirty_bytes;
 +unsigned long dirty_background_bytes;
 +};
 +
 +static inline void get_global_vm_dirty_param(struct vm_dirty_param *param)
 +{
 +param->dirty_ratio = vm_dirty_ratio;
 +param->dirty_bytes = vm_dirty_bytes;
 +param->dirty_background_ratio = dirty_background_ratio;
 +param->dirty_background_bytes = dirty_background_bytes;
 +}
 +
  extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
  struct list_head *dst,
  unsigned long *scanned, int order,
 @@ -145,6 +170,10 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
  mem_cgroup_update_page_stat(page, idx, -1);
  }
  
 +bool mem_cgroup_has_dirty_limit(void);
 +void get_vm_dirty_param(struct vm_dirty_param *param);
 +s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item);
 +
  unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
  gfp_t gfp_mask);
  u64 mem_cgroup_get_limit(struct mem_cgroup *mem);
 @@ -326,6 +355,21 @@ static inline void mem_cgroup_dec_page_stat(struct page *page,
  {
  }
  
 +static inline bool mem_cgroup_has_dirty_limit(void)
 +{
 +return false;
 +}
 +
 +static inline void get_vm_dirty_param(struct vm_dirty_param *param)
 +{
 +get_global_vm_dirty_param(param);
 +}
 +
 +static inline s64 mem_cgroup_page_stat(enum mem_cgroup_read_page_stat_item item)
 +{
 +return -ENOSYS;
 +}
 +
  static inline
  unsigned long mem_cgroup_soft_limit_reclaim(struct zone *zone, int order,
  gfp_t gfp_mask)
 diff --git a/mm/memcontrol.c b/mm/memcontrol.c
 index f40839f..6ec2625 100644
 --- a/mm/memcontrol.c
 +++ b/mm/memcontrol.c
 @@ -233,6 +233,10 @@ struct mem_cgroup {
  atomic_t refcnt;
  
  unsigned int swappiness;
 +
 +/* control memory cgroup dirty pages */
 +struct vm_dirty_param dirty_param;
 +
  /* OOM-Killer disable */
  int oom_kill_disable;
  
 @@ -1132,6 +1136,172 @@ static unsigned int get_swappiness(struct mem_cgroup *memcg)
  return swappiness;
  }
  
 +/*
 + * Returns a snapshot of the current dirty limits which is not synchronized
 + * with the routines that change the dirty limits.  If this routine races with
 + * an update to the dirty bytes/ratio value, then the caller must handle the
 + * case where both dirty_[background_]_ratio and _bytes are set.
 + */
 +static void __mem_cgroup_get_dirty_param(struct vm_dirty_param *param,
 + struct mem_cgroup *mem)
 +{
 +if (mem && !mem_cgroup_is_root(mem)) {
 +param->dirty_ratio = mem->dirty_param.dirty_ratio;
 +param->dirty_bytes = mem->dirty_param.dirty_bytes;
 +param->dirty_background_ratio =
 +mem->dirty_param.dirty_background_ratio;
 +param->dirty_background_bytes =
 +mem->dirty_param.dirty_background_bytes;
 +} else {
 +get_global_vm_dirty_param(param);
 +}
 +}
 +
 +/*
 + * Get dirty memory parameters of the current memcg or global values (if
 + * memory cgroups are disabled or querying the root cgroup).
 + */
 +void get_vm_dirty_param(struct vm_dirty_param *param)
 +{
 +struct mem_cgroup *memcg;
 +
 +if (mem_cgroup_disabled()) {
 +get_global_vm_dirty_param(param);
 +return;
 +}
 +
 +/*
 + * It's possible