Re: [PATCH 1/2] cgroup map files: Add cgroup map data type
> The map type is printed in a similar format to /proc/meminfo or > /proc//status, i.e. "$key: $value\n" this description doesn't seem to match with the code. YAMAMOTO Takashi > +static int cgroup_map_add(struct cgroup_map_cb *cb, const char *key, u64 > value) > +{ > + struct seq_file *sf = cb->state; > + return seq_printf(sf, "%s %llu\n", key, value); > +} -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] cgroup map files: Add cgroup map data type
The map type is printed in a similar format to /proc/meminfo or /proc/pid/status, i.e. $key: $value\n this description doesn't seem to match with the code. YAMAMOTO Takashi +static int cgroup_map_add(struct cgroup_map_cb *cb, const char *key, u64 value) +{ + struct seq_file *sf = cb-state; + return seq_printf(sf, %s %llu\n, key, value); +} -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/2] cgroup map files: Add a key/value map file type to cgroups
> On Feb 19, 2008 9:48 PM, YAMAMOTO Takashi <[EMAIL PROTECTED]> wrote: > > > > it changes the format from "%s %lld" to "%s: %llu", right? > > why? > > > > The colon for consistency with maps in /proc. I think it also makes it > slightly more readable. can you be a little more specific? i object against the colon because i want to use the same parser for /proc/vmstat, which doesn't have colons. btw, when making ABI changes like this, can you please mention it explicitly in the patch descriptions? > For %lld versus %llu - I think that cgroup resource APIs are much more > likely to need to report unsigned rather than signed values. In the > case of the memory.stat file, that's certainly the case. > > But I guess there's an argument to be made that nothing's likely to > need the final 64th bit of an unsigned value, whereas the ability to > report negative numbers could potentially be useful for some cgroups. > > Paul i don't have any strong opinions about signedness. YAMAMOTO Takashi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/2] cgroup map files: Add a key/value map file type to cgroups
> These patches add a new cgroup control file output type - a map from > strings to u64 values - and make use of it for the memory controller > "stat" file. > > It is intended for use when the subsystem wants to return a collection > of values that are related in some way, for which a separate control > file for each value would make the reporting unwieldy. > > The advantages of this are: > > - more standardized output from control files that report > similarly-structured data > > - less boilerplate required in cgroup subsystems > > - simplifies transition to a future efficient cgroups binary API > > Signed-off-by: Paul Menage <[EMAIL PROTECTED]> it changes the format from "%s %lld" to "%s: %llu", right? why? YAMAMOTO Takashi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/2] cgroup map files: Add a key/value map file type to cgroups
These patches add a new cgroup control file output type - a map from strings to u64 values - and make use of it for the memory controller stat file. It is intended for use when the subsystem wants to return a collection of values that are related in some way, for which a separate control file for each value would make the reporting unwieldy. The advantages of this are: - more standardized output from control files that report similarly-structured data - less boilerplate required in cgroup subsystems - simplifies transition to a future efficient cgroups binary API Signed-off-by: Paul Menage [EMAIL PROTECTED] it changes the format from %s %lld to %s: %llu, right? why? YAMAMOTO Takashi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/2] cgroup map files: Add a key/value map file type to cgroups
On Feb 19, 2008 9:48 PM, YAMAMOTO Takashi [EMAIL PROTECTED] wrote: it changes the format from %s %lld to %s: %llu, right? why? The colon for consistency with maps in /proc. I think it also makes it slightly more readable. can you be a little more specific? i object against the colon because i want to use the same parser for /proc/vmstat, which doesn't have colons. btw, when making ABI changes like this, can you please mention it explicitly in the patch descriptions? For %lld versus %llu - I think that cgroup resource APIs are much more likely to need to report unsigned rather than signed values. In the case of the memory.stat file, that's certainly the case. But I guess there's an argument to be made that nothing's likely to need the final 64th bit of an unsigned value, whereas the ability to report negative numbers could potentially be useful for some cgroups. Paul i don't have any strong opinions about signedness. YAMAMOTO Takashi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] memcgroup: fix typo in VM_BUG_ON()
> No need for VM_BUG_ON(pc), since 'pc' is the list entry. This should > be VM_BUG_ON(page). > > Signed-off-by: Li Zefan <[EMAIL PROTECTED]> > Acked-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> > --- > mm/memcontrol.c |2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 6bded84..c2959ee 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -534,7 +534,7 @@ unsigned long mem_cgroup_isolate_pages(unsigned long > nr_to_scan, > if (scan >= nr_to_scan) > break; > page = pc->page; > - VM_BUG_ON(!pc); > + VM_BUG_ON(!page); can't page be NULL here if mem_cgroup_uncharge clears pc->page behind us? ie. bug. YAMAMOTO Takashi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] memcgroup: fix typo in VM_BUG_ON()
> Li Zefan wrote: > > No need for VM_BUG_ON(pc), since 'pc' is the list entry. This should > > be VM_BUG_ON(page). > > > > Signed-off-by: Li Zefan <[EMAIL PROTECTED]> > > Acked-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> > > pc is of type page_cgroup and we use list_for_each_entry_safe_reverse. Not > sure > why we can't bug on pc. pc is dereferenced before this VM_BUG_ON. YAMAMOTO Takashi > > > > -- > Warm Regards, > Balbir Singh > Linux Technology Center > IBM, ISTL > ___ > Containers mailing list > [EMAIL PROTECTED] > https://lists.linux-foundation.org/mailman/listinfo/containers -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] memcgroup: fix typo in VM_BUG_ON()
Li Zefan wrote: No need for VM_BUG_ON(pc), since 'pc' is the list entry. This should be VM_BUG_ON(page). Signed-off-by: Li Zefan [EMAIL PROTECTED] Acked-by: KAMEZAWA Hiroyuki [EMAIL PROTECTED] pc is of type page_cgroup and we use list_for_each_entry_safe_reverse. Not sure why we can't bug on pc. pc is dereferenced before this VM_BUG_ON. YAMAMOTO Takashi -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL ___ Containers mailing list [EMAIL PROTECTED] https://lists.linux-foundation.org/mailman/listinfo/containers -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/3] memcgroup: fix typo in VM_BUG_ON()
No need for VM_BUG_ON(pc), since 'pc' is the list entry. This should be VM_BUG_ON(page). Signed-off-by: Li Zefan [EMAIL PROTECTED] Acked-by: KAMEZAWA Hiroyuki [EMAIL PROTECTED] --- mm/memcontrol.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 6bded84..c2959ee 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -534,7 +534,7 @@ unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan, if (scan = nr_to_scan) break; page = pc-page; - VM_BUG_ON(!pc); + VM_BUG_ON(!page); can't page be NULL here if mem_cgroup_uncharge clears pc-page behind us? ie. bug. YAMAMOTO Takashi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH 0/2] dm-band: The I/O bandwidth controller: Overview
> Hi, > > > > > On Wed, Jan 23, 2008 at 09:53:50PM +0900, Ryo Tsuruta wrote: > > > > > Dm-band gives bandwidth to each job according to its weight, > > > > > which each job can set its own value to. > > > > > At this time, a job is a group of processes with the same pid or pgrp > > > > > or uid. > > > > > > > > It seems to rely on 'current' to classify bios and doesn't do it until > > > > the map > > > > function is called, possibly in a different process context, so it won't > > > > always identify the original source of the I/O correctly: > > > > > > Yes, this should be mentioned in the document with the current > > > implementation > > > as you pointed out. > > > > > > By the way, I think once a memory controller of cgroup is introduced, it > > > will > > > help to track down which cgroup is the original source. > > > > do you mean to make this a part of the memory subsystem? > > I just think if the memory subsystem is in front of us, we don't need to > reinvent the wheel. > > But I don't have a concrete image how the interface between dm-band and > the memory subsystem should be designed yet. I'd be appreciate if some of > the cgroup developers give some ideas about it. the current implementation of memory subsystem associates pages to cgroups directly, rather than via tasks. so it isn't straightforward to use the information for other classification mechanisms like yours which might not share the view of "hierarchy" with the memory subsystem. YAMAMOTO Takashi > > Thanks, > Hirokazu Takahashi. > > > > YAMAMOTO Takashi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH 0/2] dm-band: The I/O bandwidth controller: Overview
Hi, On Wed, Jan 23, 2008 at 09:53:50PM +0900, Ryo Tsuruta wrote: Dm-band gives bandwidth to each job according to its weight, which each job can set its own value to. At this time, a job is a group of processes with the same pid or pgrp or uid. It seems to rely on 'current' to classify bios and doesn't do it until the map function is called, possibly in a different process context, so it won't always identify the original source of the I/O correctly: Yes, this should be mentioned in the document with the current implementation as you pointed out. By the way, I think once a memory controller of cgroup is introduced, it will help to track down which cgroup is the original source. do you mean to make this a part of the memory subsystem? I just think if the memory subsystem is in front of us, we don't need to reinvent the wheel. But I don't have a concrete image how the interface between dm-band and the memory subsystem should be designed yet. I'd be appreciate if some of the cgroup developers give some ideas about it. the current implementation of memory subsystem associates pages to cgroups directly, rather than via tasks. so it isn't straightforward to use the information for other classification mechanisms like yours which might not share the view of hierarchy with the memory subsystem. YAMAMOTO Takashi Thanks, Hirokazu Takahashi. YAMAMOTO Takashi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH 0/2] dm-band: The I/O bandwidth controller: Overview
> Hi, > > > On Wed, Jan 23, 2008 at 09:53:50PM +0900, Ryo Tsuruta wrote: > > > Dm-band gives bandwidth to each job according to its weight, > > > which each job can set its own value to. > > > At this time, a job is a group of processes with the same pid or pgrp or > > > uid. > > > > It seems to rely on 'current' to classify bios and doesn't do it until the > > map > > function is called, possibly in a different process context, so it won't > > always identify the original source of the I/O correctly: > > Yes, this should be mentioned in the document with the current implementation > as you pointed out. > > By the way, I think once a memory controller of cgroup is introduced, it will > help to track down which cgroup is the original source. do you mean to make this a part of the memory subsystem? YAMAMOTO Takashi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [dm-devel] [PATCH 0/2] dm-band: The I/O bandwidth controller: Overview
Hi, On Wed, Jan 23, 2008 at 09:53:50PM +0900, Ryo Tsuruta wrote: Dm-band gives bandwidth to each job according to its weight, which each job can set its own value to. At this time, a job is a group of processes with the same pid or pgrp or uid. It seems to rely on 'current' to classify bios and doesn't do it until the map function is called, possibly in a different process context, so it won't always identify the original source of the I/O correctly: Yes, this should be mentioned in the document with the current implementation as you pointed out. By the way, I think once a memory controller of cgroup is introduced, it will help to track down which cgroup is the original source. do you mean to make this a part of the memory subsystem? YAMAMOTO Takashi -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
> +static inline struct mem_cgroup_per_zone * > +mem_cgroup_zoneinfo(struct mem_cgroup *mem, int nid, int zid) > +{ > + if (!mem->info.nodeinfo[nid]) can this be true? YAMAMOTO Takashi > + return NULL; > + return >info.nodeinfo[nid]->zoneinfo[zid]; > +} > + - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
> @@ -651,10 +758,11 @@ > /* Avoid race with charge */ > atomic_set(>ref_cnt, 0); > if (clear_page_cgroup(page, pc) == pc) { > + int active; > css_put(>css); > + active = pc->flags & PAGE_CGROUP_FLAG_ACTIVE; > res_counter_uncharge(>res, PAGE_SIZE); > - list_del_init(>lru); > - mem_cgroup_charge_statistics(mem, pc->flags, false); > + __mem_cgroup_remove_list(pc); > kfree(pc); > } else /* being uncharged ? ...do relax */ > break; 'active' seems unused. YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
@@ -651,10 +758,11 @@ /* Avoid race with charge */ atomic_set(pc-ref_cnt, 0); if (clear_page_cgroup(page, pc) == pc) { + int active; css_put(mem-css); + active = pc-flags PAGE_CGROUP_FLAG_ACTIVE; res_counter_uncharge(mem-res, PAGE_SIZE); - list_del_init(pc-lru); - mem_cgroup_charge_statistics(mem, pc-flags, false); + __mem_cgroup_remove_list(pc); kfree(pc); } else /* being uncharged ? ...do relax */ break; 'active' seems unused. YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH][for -mm] per-zone and reclaim enhancements for memory controller take 3 [3/10] per-zone active inactive counter
+static inline struct mem_cgroup_per_zone * +mem_cgroup_zoneinfo(struct mem_cgroup *mem, int nid, int zid) +{ + if (!mem-info.nodeinfo[nid]) can this be true? YAMAMOTO Takashi + return NULL; + return mem-info.nodeinfo[nid]-zoneinfo[zid]; +} + - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 5/10] Memory controller task migration (v7)
> YAMAMOTO Takashi wrote: > >> Allow tasks to migrate from one container to the other. We migrate > >> mm_struct's mem_container only when the thread group id migrates. > > > >> + /* > >> + * Only thread group leaders are allowed to migrate, the mm_struct is > >> + * in effect owned by the leader > >> + */ > >> + if (p->tgid != p->pid) > >> + goto out; > > > > does it mean that you can't move a process between containers > > once its thread group leader exited? > > > > YAMAMOTO Takashi > > > Hi, > > Good catch! Currently, we treat the mm as owned by the thread group leader. > But this policy can be easily adapted to any other desired policy. > Would you like to see it change to something else? > > -- > Warm Regards, > Balbir Singh > Linux Technology Center > IBM, ISTL although i have no good idea right now, something which allows to move a process with its thread group leader dead would be better. YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 5/10] Memory controller task migration (v7)
YAMAMOTO Takashi wrote: Allow tasks to migrate from one container to the other. We migrate mm_struct's mem_container only when the thread group id migrates. + /* + * Only thread group leaders are allowed to migrate, the mm_struct is + * in effect owned by the leader + */ + if (p-tgid != p-pid) + goto out; does it mean that you can't move a process between containers once its thread group leader exited? YAMAMOTO Takashi Hi, Good catch! Currently, we treat the mm as owned by the thread group leader. But this policy can be easily adapted to any other desired policy. Would you like to see it change to something else? -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL although i have no good idea right now, something which allows to move a process with its thread group leader dead would be better. YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 5/10] Memory controller task migration (v7)
> Allow tasks to migrate from one container to the other. We migrate > mm_struct's mem_container only when the thread group id migrates. > + /* > + * Only thread group leaders are allowed to migrate, the mm_struct is > + * in effect owned by the leader > + */ > + if (p->tgid != p->pid) > + goto out; does it mean that you can't move a process between containers once its thread group leader exited? YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 5/10] Memory controller task migration (v7)
Allow tasks to migrate from one container to the other. We migrate mm_struct's mem_container only when the thread group id migrates. + /* + * Only thread group leaders are allowed to migrate, the mm_struct is + * in effect owned by the leader + */ + if (p-tgid != p-pid) + goto out; does it mean that you can't move a process between containers once its thread group leader exited? YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Memory controller Add Documentation
> +echo 1 > /proc/sys/vm/drop_pages will help get rid of some of the pages > +cached in the container (page cache pages). drop_caches YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Memory controller Add Documentation
+echo 1 /proc/sys/vm/drop_pages will help get rid of some of the pages +cached in the container (page cache pages). drop_caches YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 4/9] Memory controller memory accounting (v4)
> YAMAMOTO Takashi wrote: > >> + lock_meta_page(page); > >> + /* > >> + * Check if somebody else beat us to allocating the meta_page > >> + */ > >> + race_mp = page_get_meta_page(page); > >> + if (race_mp) { > >> + kfree(mp); > >> + mp = race_mp; > >> + atomic_inc(>ref_cnt); > >> + res_counter_uncharge(>res, 1); > >> + goto done; > >> + } > > > > i think you need css_put here. > > Thats correct. We do need css_put in this path. > > Thanks, > Vaidy v5 still seems to have the problem. YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 4/9] Memory controller memory accounting (v4)
YAMAMOTO Takashi wrote: + lock_meta_page(page); + /* + * Check if somebody else beat us to allocating the meta_page + */ + race_mp = page_get_meta_page(page); + if (race_mp) { + kfree(mp); + mp = race_mp; + atomic_inc(mp-ref_cnt); + res_counter_uncharge(mem-res, 1); + goto done; + } i think you need css_put here. Thats correct. We do need css_put in this path. Thanks, Vaidy v5 still seems to have the problem. YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 8/9] Memory controller add switch to control what type of pages to limit (v4)
> YAMAMOTO Takashi wrote: > >> Choose if we want cached pages to be accounted or not. By default both > >> are accounted for. A new set of tunables are added. > >> > >> echo -n 1 > mem_control_type > >> > >> switches the accounting to account for only mapped pages > >> > >> echo -n 2 > mem_control_type > >> > >> switches the behaviour back > > > > MEM_CONTAINER_TYPE_ALL is 3, not 2. > > > > Thanks, I'll fix the comment on the top. > > > YAMAMOTO Takashi > > > >> +enum { > >> + MEM_CONTAINER_TYPE_UNSPEC = 0, > >> + MEM_CONTAINER_TYPE_MAPPED, > >> + MEM_CONTAINER_TYPE_CACHED, what's MEM_CONTAINER_TYPE_CACHED, btw? it seems that nothing distinguishes it from MEM_CONTAINER_TYPE_MAPPED. YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 8/9] Memory controller add switch to control what type of pages to limit (v4)
YAMAMOTO Takashi wrote: Choose if we want cached pages to be accounted or not. By default both are accounted for. A new set of tunables are added. echo -n 1 mem_control_type switches the accounting to account for only mapped pages echo -n 2 mem_control_type switches the behaviour back MEM_CONTAINER_TYPE_ALL is 3, not 2. Thanks, I'll fix the comment on the top. YAMAMOTO Takashi +enum { + MEM_CONTAINER_TYPE_UNSPEC = 0, + MEM_CONTAINER_TYPE_MAPPED, + MEM_CONTAINER_TYPE_CACHED, what's MEM_CONTAINER_TYPE_CACHED, btw? it seems that nothing distinguishes it from MEM_CONTAINER_TYPE_MAPPED. YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 8/9] Memory controller add switch to control what type of pages to limit (v4)
> Choose if we want cached pages to be accounted or not. By default both > are accounted for. A new set of tunables are added. > > echo -n 1 > mem_control_type > > switches the accounting to account for only mapped pages > > echo -n 2 > mem_control_type > > switches the behaviour back MEM_CONTAINER_TYPE_ALL is 3, not 2. YAMAMOTO Takashi > +enum { > + MEM_CONTAINER_TYPE_UNSPEC = 0, > + MEM_CONTAINER_TYPE_MAPPED, > + MEM_CONTAINER_TYPE_CACHED, > + MEM_CONTAINER_TYPE_ALL, > + MEM_CONTAINER_TYPE_MAX, > +} mem_control_type; > + > +static struct mem_container init_mem_container; > + mem = rcu_dereference(mm->mem_container); > + if (mem->control_type == MEM_CONTAINER_TYPE_ALL) > + return mem_container_charge(page, mm); > + else > + return 0; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 8/9] Memory controller add switch to control what type of pages to limit (v4)
Choose if we want cached pages to be accounted or not. By default both are accounted for. A new set of tunables are added. echo -n 1 mem_control_type switches the accounting to account for only mapped pages echo -n 2 mem_control_type switches the behaviour back MEM_CONTAINER_TYPE_ALL is 3, not 2. YAMAMOTO Takashi +enum { + MEM_CONTAINER_TYPE_UNSPEC = 0, + MEM_CONTAINER_TYPE_MAPPED, + MEM_CONTAINER_TYPE_CACHED, + MEM_CONTAINER_TYPE_ALL, + MEM_CONTAINER_TYPE_MAX, +} mem_control_type; + +static struct mem_container init_mem_container; + mem = rcu_dereference(mm-mem_container); + if (mem-control_type == MEM_CONTAINER_TYPE_ALL) + return mem_container_charge(page, mm); + else + return 0; - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 6/9] Memory controller add per container LRU and reclaim (v4)
> +unsigned long mem_container_isolate_pages(unsigned long nr_to_scan, > + struct list_head *dst, > + unsigned long *scanned, int order, > + int mode, struct zone *z, > + struct mem_container *mem_cont, > + int active) > +{ > + unsigned long nr_taken = 0; > + struct page *page; > + unsigned long scan; > + LIST_HEAD(mp_list); > + struct list_head *src; > + struct meta_page *mp; > + > + if (active) > + src = _cont->active_list; > + else > + src = _cont->inactive_list; > + > + for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) { > + mp = list_entry(src->prev, struct meta_page, lru); what prevents another thread from freeing mp here? > + spin_lock(_cont->lru_lock); > + if (mp) > + page = mp->page; > + spin_unlock(_cont->lru_lock); > + if (!mp) > + continue; YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 4/9] Memory controller memory accounting (v4)
> + lock_meta_page(page); > + /* > + * Check if somebody else beat us to allocating the meta_page > + */ > + race_mp = page_get_meta_page(page); > + if (race_mp) { > + kfree(mp); > + mp = race_mp; > + atomic_inc(>ref_cnt); > + res_counter_uncharge(>res, 1); > + goto done; > + } i think you need css_put here. YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 4/9] Memory controller memory accounting (v4)
+ lock_meta_page(page); + /* + * Check if somebody else beat us to allocating the meta_page + */ + race_mp = page_get_meta_page(page); + if (race_mp) { + kfree(mp); + mp = race_mp; + atomic_inc(mp-ref_cnt); + res_counter_uncharge(mem-res, 1); + goto done; + } i think you need css_put here. YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 6/9] Memory controller add per container LRU and reclaim (v4)
+unsigned long mem_container_isolate_pages(unsigned long nr_to_scan, + struct list_head *dst, + unsigned long *scanned, int order, + int mode, struct zone *z, + struct mem_container *mem_cont, + int active) +{ + unsigned long nr_taken = 0; + struct page *page; + unsigned long scan; + LIST_HEAD(mp_list); + struct list_head *src; + struct meta_page *mp; + + if (active) + src = mem_cont-active_list; + else + src = mem_cont-inactive_list; + + for (scan = 0; scan nr_to_scan !list_empty(src); scan++) { + mp = list_entry(src-prev, struct meta_page, lru); what prevents another thread from freeing mp here? + spin_lock(mem_cont-lru_lock); + if (mp) + page = mp-page; + spin_unlock(mem_cont-lru_lock); + if (!mp) + continue; YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/10] Task Containers(V11): Basic task container framework
> +extern void container_init_smp(void); > +static inline void container_init_smp(void) {} stale prototypes? YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/10] Task Containers(V11): Basic task container framework
+extern void container_init_smp(void); +static inline void container_init_smp(void) {} stale prototypes? YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/10] Task Containers(V11): Basic task container framework
> +Other fields in the container_subsys object include: > +- hierarchy: an index indicating which hierarchy, if any, this > + subsystem is currently attached to. If this is -1, then the > + subsystem is not attached to any hierarchy, and all tasks should be > + considered to be members of the subsystem's top_container. It should > + be initialized to -1. stale info? > +struct container { > + struct containerfs_root *root; > + struct container *top_container; > +}; can cont->top_container be different from than >root.top_container? YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 01/10] Task Containers(V11): Basic task container framework
+Other fields in the container_subsys object include: +- hierarchy: an index indicating which hierarchy, if any, this + subsystem is currently attached to. If this is -1, then the + subsystem is not attached to any hierarchy, and all tasks should be + considered to be members of the subsystem's top_container. It should + be initialized to -1. stale info? +struct container { + struct containerfs_root *root; + struct container *top_container; +}; can cont-top_container be different from than cont-root.top_container? YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][-mm PATCH 6/8] Memory controller add per container LRU and reclaim (v3)
hi, > +unsigned long mem_container_isolate_pages(unsigned long nr_to_scan, > + struct list_head *dst, > + unsigned long *scanned, int order, > + int mode, struct zone *z, > + struct mem_container *mem_cont, > + int active) > +{ > + unsigned long nr_taken = 0; > + struct page *page; > + unsigned long scan; > + LIST_HEAD(mp_list); > + struct list_head *src; > + struct meta_page *mp; > + > + if (active) > + src = _cont->active_list; > + else > + src = _cont->inactive_list; > + > + for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) { > + mp = list_entry(src->prev, struct meta_page, lru); > + page = mp->page; > + - is it safe to pick the lists without mem_cont->lru_lock held? - what prevents mem_container_uncharge from freeing this meta_page behind us? YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][-mm PATCH 6/8] Memory controller add per container LRU and reclaim (v3)
hi, +unsigned long mem_container_isolate_pages(unsigned long nr_to_scan, + struct list_head *dst, + unsigned long *scanned, int order, + int mode, struct zone *z, + struct mem_container *mem_cont, + int active) +{ + unsigned long nr_taken = 0; + struct page *page; + unsigned long scan; + LIST_HEAD(mp_list); + struct list_head *src; + struct meta_page *mp; + + if (active) + src = mem_cont-active_list; + else + src = mem_cont-inactive_list; + + for (scan = 0; scan nr_to_scan !list_empty(src); scan++) { + mp = list_entry(src-prev, struct meta_page, lru); + page = mp-page; + - is it safe to pick the lists without mem_cont-lru_lock held? - what prevents mem_container_uncharge from freeing this meta_page behind us? YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 4/8] Memory controller memory accounting (v2)
> On 7/10/07, YAMAMOTO Takashi <[EMAIL PROTECTED]> wrote: > > hi, > > > > > diff -puN mm/memory.c~mem-control-accounting mm/memory.c > > > --- linux-2.6.22-rc6/mm/memory.c~mem-control-accounting 2007-07-05 > > > 13:45:18.0 -0700 > > > +++ linux-2.6.22-rc6-balbir/mm/memory.c 2007-07-05 > > > 13:45:18.0 -0700 > > > > > @@ -1731,6 +1736,9 @@ gotten: > > > cow_user_page(new_page, old_page, address, vma); > > > } > > > > > > + if (mem_container_charge(new_page, mm)) > > > + goto oom; > > > + > > > /* > > >* Re-check the pte - we dropped the lock > > >*/ > > > > it seems that the page will be leaked on error. > > You mean meta_page right? no. i meant 'new_page'. YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 6/8] Memory controller add per container LRU and reclaim (v2)
> Add the meta_page to the per container LRU. The reclaim algorithm has been > modified to make the isolate_lru_pages() as a pluggable component. The > scan_control data structure now accepts the container on behalf of which > reclaims are carried out. try_to_free_pages() has been extended to become > container aware. > > Signed-off-by: Balbir Singh <[EMAIL PROTECTED]> it seems that the number of pages to scan (nr_active/nr_inactive in shrink_zone) is calculated from NR_ACTIVE and NR_INACTIVE of the zone, even in the case of per-container reclaim. is it intended? YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 4/8] Memory controller memory accounting (v2)
hi, > diff -puN mm/memory.c~mem-control-accounting mm/memory.c > --- linux-2.6.22-rc6/mm/memory.c~mem-control-accounting 2007-07-05 > 13:45:18.0 -0700 > +++ linux-2.6.22-rc6-balbir/mm/memory.c 2007-07-05 13:45:18.0 > -0700 > @@ -1731,6 +1736,9 @@ gotten: > cow_user_page(new_page, old_page, address, vma); > } > > + if (mem_container_charge(new_page, mm)) > + goto oom; > + > /* >* Re-check the pte - we dropped the lock >*/ it seems that the page will be leaked on error. > @@ -2188,6 +2196,11 @@ static int do_swap_page(struct mm_struct > } > > delayacct_clear_flag(DELAYACCT_PF_SWAPIN); > + if (mem_container_charge(page, mm)) { > + ret = VM_FAULT_OOM; > + goto out; > + } > + > mark_page_accessed(page); > lock_page(page); > ditto. > @@ -2264,6 +2278,9 @@ static int do_anonymous_page(struct mm_s > if (!page) > goto oom; > > + if (mem_container_charge(page, mm)) > + goto oom; > + > entry = mk_pte(page, vma->vm_page_prot); > entry = maybe_mkwrite(pte_mkdirty(entry), vma); > ditto. can you check the rest of the patch by yourself? thanks. YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 4/8] Memory controller memory accounting (v2)
hi, diff -puN mm/memory.c~mem-control-accounting mm/memory.c --- linux-2.6.22-rc6/mm/memory.c~mem-control-accounting 2007-07-05 13:45:18.0 -0700 +++ linux-2.6.22-rc6-balbir/mm/memory.c 2007-07-05 13:45:18.0 -0700 @@ -1731,6 +1736,9 @@ gotten: cow_user_page(new_page, old_page, address, vma); } + if (mem_container_charge(new_page, mm)) + goto oom; + /* * Re-check the pte - we dropped the lock */ it seems that the page will be leaked on error. @@ -2188,6 +2196,11 @@ static int do_swap_page(struct mm_struct } delayacct_clear_flag(DELAYACCT_PF_SWAPIN); + if (mem_container_charge(page, mm)) { + ret = VM_FAULT_OOM; + goto out; + } + mark_page_accessed(page); lock_page(page); ditto. @@ -2264,6 +2278,9 @@ static int do_anonymous_page(struct mm_s if (!page) goto oom; + if (mem_container_charge(page, mm)) + goto oom; + entry = mk_pte(page, vma-vm_page_prot); entry = maybe_mkwrite(pte_mkdirty(entry), vma); ditto. can you check the rest of the patch by yourself? thanks. YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 6/8] Memory controller add per container LRU and reclaim (v2)
Add the meta_page to the per container LRU. The reclaim algorithm has been modified to make the isolate_lru_pages() as a pluggable component. The scan_control data structure now accepts the container on behalf of which reclaims are carried out. try_to_free_pages() has been extended to become container aware. Signed-off-by: Balbir Singh [EMAIL PROTECTED] it seems that the number of pages to scan (nr_active/nr_inactive in shrink_zone) is calculated from NR_ACTIVE and NR_INACTIVE of the zone, even in the case of per-container reclaim. is it intended? YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [-mm PATCH 4/8] Memory controller memory accounting (v2)
On 7/10/07, YAMAMOTO Takashi [EMAIL PROTECTED] wrote: hi, diff -puN mm/memory.c~mem-control-accounting mm/memory.c --- linux-2.6.22-rc6/mm/memory.c~mem-control-accounting 2007-07-05 13:45:18.0 -0700 +++ linux-2.6.22-rc6-balbir/mm/memory.c 2007-07-05 13:45:18.0 -0700 @@ -1731,6 +1736,9 @@ gotten: cow_user_page(new_page, old_page, address, vma); } + if (mem_container_charge(new_page, mm)) + goto oom; + /* * Re-check the pte - we dropped the lock */ it seems that the page will be leaked on error. You mean meta_page right? no. i meant 'new_page'. YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] /proc/cpumem
hi, > + while (addr < end) { > + if (valid_phys_addr_range(addr, )) { > + if (!found) { > + found = 1; > + p->addr = addr; > + p->size = size; > + } else { > + p->size += size; > + } > + addr += size; > + size = 0xf000; > + } else { > + if (found) { > + return p; > + } > + addr += PAGE_SIZE; > + } > + } doesn't this loop take very long time if you have a large hole? i'd suggest to change valid_phys_addr_range to fill even when it returns false, so that caller can skip the hole efficiently. YAMAMOTO Takashi - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] /proc/cpumem
hi, + while (addr end) { + if (valid_phys_addr_range(addr, size)) { + if (!found) { + found = 1; + p-addr = addr; + p-size = size; + } else { + p-size += size; + } + addr += size; + size = 0xf000; + } else { + if (found) { + return p; + } + addr += PAGE_SIZE; + } + } doesn't this loop take very long time if you have a large hole? i'd suggest to change valid_phys_addr_range to fill size even when it returns false, so that caller can skip the hole efficiently. YAMAMOTO Takashi - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/