[Devel] [PATCH -mm 0/5] swapcgroup (v3)

2008-07-04 Thread Daisuke Nishimura
Hi. This is new version of swapcgroup. Major changes from previous version - Rebased on 2.6.26-rc5-mm3. The new -mm has been released, but these patches can be applied on 2.6.26-rc8-mm1 too with only some offset warnings. I tested these patches on 2.6.26-rc5-mm3 with some fixes about

[Devel] [PATCH -mm 1/5] swapcgroup (v3): add cgroup files

2008-07-04 Thread Daisuke Nishimura
Even when memory usage is limited by the memory cgroup, swap space is shared, so resource isolation is incomplete. If one group uses up most of the swap space, it can still affect other groups. The purpose of swapcgroup is to limit swap usage per group, just as the memory cgroup limits RSS memory usage. It's

[Devel] [PATCH -mm 2/5] swapcgroup (v3): add a member to swap_info_struct

2008-07-04 Thread Daisuke Nishimura
This patch adds a member to swap_info_struct for cgroup. This member, an array of pointers to mem_cgroup, records the cgroup to which each swap entry is charged. The memory for this array of pointers is allocated on swapon, and freed on swapoff. Change log v2-v3 - Rebased on
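The bookkeeping described in this entry can be pictured with a small user-space sketch. It is not the patch's code: `struct swap_info_sketch`, `struct mem_cgroup_sketch`, and the `swap_owner_*` helpers are hypothetical names, and `calloc`/`free` stand in for whatever allocator the kernel patch uses at swapon and swapoff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real kernel structures are much larger. */
struct mem_cgroup_sketch {
    int id;
};

struct swap_info_sketch {
    size_t max;                        /* number of swap slots on this device */
    struct mem_cgroup_sketch **owner;  /* owner[i]: group charged for slot i, or NULL */
};

/* "swapon": allocate one zeroed pointer per slot. Returns 0, or -1 on failure. */
int swap_owner_alloc(struct swap_info_sketch *si, size_t nr_slots)
{
    si->owner = calloc(nr_slots, sizeof(*si->owner));
    if (!si->owner)
        return -1;
    si->max = nr_slots;
    return 0;
}

/* "swapoff": release the array and reset the bookkeeping. */
void swap_owner_free(struct swap_info_sketch *si)
{
    free(si->owner);
    si->owner = NULL;
    si->max = 0;
}

/* Remember which group a slot is charged to (NULL clears the record). */
void swap_owner_set(struct swap_info_sketch *si, size_t slot,
                    struct mem_cgroup_sketch *memcg)
{
    assert(slot < si->max);
    si->owner[slot] = memcg;
}

/* Look up the group a slot is charged to. */
struct mem_cgroup_sketch *swap_owner_get(const struct swap_info_sketch *si,
                                         size_t slot)
{
    assert(slot < si->max);
    return si->owner[slot];
}
```

One pointer per slot costs memory proportional to the size of the swap device, which is presumably why the array exists only between swapon and swapoff.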

[Devel] [PATCH -mm 3/5] swapcgroup (v3): implement charge and uncharge

2008-07-04 Thread Daisuke Nishimura
This patch implements charge and uncharge of swapcgroup. - What is charged? The number of swap entries, in bytes. - When is it charged/uncharged? Charged at get_swap_entry(), uncharged at swap_entry_free(). - To which group is the swap entry charged? To determine to what mem_cgroup
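A minimal sketch of the accounting this entry describes, under stated assumptions: one swap entry is taken to back one 4 KiB page, and `struct swap_counter_sketch` with its two helpers is a hypothetical user-space stand-in, not the counter type used by the patch. The charge helper models what would happen when a swap entry is obtained, and the uncharge helper what would happen when it is freed.

```c
#include <assert.h>

/* Assumption: one swap entry backs one 4 KiB page. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical per-group counter, in bytes. */
struct swap_counter_sketch {
    unsigned long usage;
    unsigned long limit;
};

/* Charge one swap entry. Returns 0 on success, -1 if the limit would be exceeded. */
int swap_charge_sketch(struct swap_counter_sketch *c)
{
    if (c->usage > c->limit || c->limit - c->usage < SKETCH_PAGE_SIZE)
        return -1;
    c->usage += SKETCH_PAGE_SIZE;
    return 0;
}

/* Uncharge one swap entry that was previously charged. */
void swap_uncharge_sketch(struct swap_counter_sketch *c)
{
    assert(c->usage >= SKETCH_PAGE_SIZE);
    c->usage -= SKETCH_PAGE_SIZE;
}
```

With a limit of 8192 bytes, two charges succeed and a third fails until one entry is uncharged. A failed charge is the case the later patches in the series try to make rarer.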

[Devel] [PATCH -mm 4/5] swapcgroup (v3): modify vm_swap_full()

2008-07-04 Thread Daisuke Nishimura
This patch modifies vm_swap_full() to calculate the rate of swap usage per cgroup. The purpose of this change is to free freeable swap caches (that is, swap entries) per cgroup, so that swap_cgroup_charge() fails less frequently. I tested whether this patch can reduce the swap usage or not, by
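The idea of a per-group "swap half full" test might look roughly like the sketch below. It combines the existing global heuristic (free swap below half of the total) with a per-group check; the function name, the parameters, and the choice of "usage above half of the group's limit" as the per-group rule are assumptions made for illustration, not the patch's actual definition.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical per-group variant of the "swap 50% full" heuristic.
 * nr_free and nr_total are system-wide swap page counts;
 * group_usage and group_limit are one group's swap accounting.
 */
bool swap_full_sketch(unsigned long nr_free, unsigned long nr_total,
                      unsigned long group_usage, unsigned long group_limit)
{
    assert(nr_free <= nr_total);

    /* Global rule: fewer than half of all swap pages are still free. */
    bool global_half_full = nr_free * 2 < nr_total;

    /* Assumed per-group rule: the group uses more than half of its limit. */
    bool group_half_full = group_usage > group_limit / 2;

    return global_half_full || group_half_full;
}
```

When either condition holds, the caller would release swap cache more aggressively, so that a group close to its limit frees entries before its next charge fails.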

[Devel] [PATCH -mm 5/5] swapcgroup (v3): implement force_empty

2008-07-04 Thread Daisuke Nishimura
This patch implements force_empty of swapcgroup. Currently, it simply uncharges all the charges from the group. I think there can be other implementations. The options I considered are: - move all the charges to its parent. - unuse (swap in) all the swap charged to the group. But in any case, I think

[Devel] Re: [PATCH 01/15] kobject: Cleanup kobject_rename and !CONFIG_SYSFS

2008-07-04 Thread Tejun Heo
Eric W. Biederman wrote: It finally dawned on me what the clean fix to sysfs_rename_dir calling kobject_set_name is. Move the work into kobject_rename where it belongs. The callers serialize us anyway so this is safe. Signed-off-by: Eric W. Biederman [EMAIL PROTECTED] Nice clean up.

[Devel] Re: [PATCH 06/15] Introduce sysfs_sd_setattr and fix sysfs_chmod

2008-07-04 Thread Tejun Heo
Eric W. Biederman wrote: Currently sysfs_chmod calls sys_setattr which in turn calls inode_change_ok which checks to see if it is ok for the current user space process to change the attributes. Since sysfs_chmod_file has only kernel mode clients denying them permission if user space is the

[Devel] Re: [PATCH 07/15] sysfs: sysfs_chmod_file handle multiple superblocks

2008-07-04 Thread Tejun Heo
Eric W. Biederman wrote: Teach sysfs_chmod_file how to handle multiple sysfs superblocks. Since we only have one inode per sd the only thing we have to deal with is multiple dentries for sending fs notifications. This might dup the inode notifications oh well. Signed-off-by: Eric W.

[Devel] Re: [PATCH 08/15] sysfs: Make sysfs_mount static once again.

2008-07-04 Thread Tejun Heo
Eric W. Biederman wrote: Accessing the internal sysfs_mount is error prone in the context of multiple super blocks, and nothing needs it. Not even the sysfs crash debugging patch (although it did in an earlier version). Signed-off-by: Eric W. Biederman [EMAIL PROTECTED] Acked-by: Tejun Heo

[Devel] Re: [PATCH -mm 5/5] swapcgroup (v3): implement force_empty

2008-07-04 Thread YAMAMOTO Takashi
hi, +/* + * uncharge all the entries that are charged to the group. + */ +void __swap_cgroup_force_empty(struct mem_cgroup *mem) +{ + struct swap_info_struct *p; + int type; + + spin_lock(&swap_lock); + for (type = swap_list.head; type >= 0; type = swap_info[type].next) {

[Devel] Re: [PATCH -mm 5/5] swapcgroup (v3): implement force_empty

2008-07-04 Thread Daisuke Nishimura
Hi, Yamamoto-san. Thank you for your comment. On Fri, 4 Jul 2008 15:54:31 +0900 (JST), [EMAIL PROTECTED] (YAMAMOTO Takashi) wrote: hi, +/* + * uncharge all the entries that are charged to the group. + */ +void __swap_cgroup_force_empty(struct mem_cgroup *mem) +{ + struct

[Devel] Re: [PATCH -mm 5/5] swapcgroup (v3): implement force_empty

2008-07-04 Thread YAMAMOTO Takashi
Hi, Yamamoto-san. Thank you for your comment. On Fri, 4 Jul 2008 15:54:31 +0900 (JST), [EMAIL PROTECTED] (YAMAMOTO Takashi) wrote: hi, +/* + * uncharge all the entries that are charged to the group. + */ +void __swap_cgroup_force_empty(struct mem_cgroup *mem) +{ +

[Devel] Re: [PATCH 12/15] driver core: Implement tagged directory support for device classes.

2008-07-04 Thread Tejun Heo
Eric W. Biederman wrote: This patch enables tagging on every class directory if struct class has a tag_type. In addition device_del and device_rename were modified to use sysfs_delete_link and sysfs_rename_link respectively to ensure when these operations happen on devices whose classes

[Devel] Re: [PATCH -mm 5/5] swapcgroup (v3): implement force_empty

2008-07-04 Thread Daisuke Nishimura
On Fri, 4 Jul 2008 16:48:28 +0900 (JST), [EMAIL PROTECTED] (YAMAMOTO Takashi) wrote: Hi, Yamamoto-san. Thank you for your comment. On Fri, 4 Jul 2008 15:54:31 +0900 (JST), [EMAIL PROTECTED] (YAMAMOTO Takashi) wrote: hi, +/* + * uncharge all the entries that are

[Devel] [PATCH 09/15] sysfs: Implement sysfs tagged directory support.

2008-07-04 Thread Eric W. Biederman
The problem. When implementing a network namespace I need to be able to have multiple network devices with the same name. Currently this is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and potentially a few other directories of the form /sys/ ... /net/*. What this patch does is

[Devel] Re: [PATCH -mm 0/5] swapcgroup (v3)

2008-07-04 Thread KAMEZAWA Hiroyuki
On Fri, 4 Jul 2008 15:15:36 +0900 Daisuke Nishimura [EMAIL PROTECTED] wrote: Hi. This is new version of swapcgroup. Major changes from previous version - Rebased on 2.6.26-rc5-mm3. The new -mm has been released, but these patches can be applied on 2.6.26-rc8-mm1 too with only some

[Devel] Re: [PATCH -mm 4/5] swapcgroup (v3): modify vm_swap_full()

2008-07-04 Thread KAMEZAWA Hiroyuki
On Fri, 4 Jul 2008 15:22:44 +0900 Daisuke Nishimura [EMAIL PROTECTED] wrote: /* Swap 50% full? Release swapcache more aggressively.. */ -#define vm_swap_full() (nr_swap_pages*2 < total_swap_pages) +#define vm_swap_full(memcg) ((nr_swap_pages*2 < total_swap_pages) \ +

[Devel] Re: Network namespaces without isolation

2008-07-04 Thread Eric W. Biederman
Andreas B Aaen [EMAIL PROTECTED] writes: Hi, I am looking into the network namespace implementation because I need an IP stack that is capable of talking with a number of separate IP nets with possible overlapping IP addresses. My connection to each separate IP-net is through a tunnel

[Devel] Re: [PATCH -mm 4/5] swapcgroup (v3): modify vm_swap_full()

2008-07-04 Thread Daisuke Nishimura
Hi, Kamezawa-san. On Fri, 4 Jul 2008 18:58:45 +0900, KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote: On Fri, 4 Jul 2008 15:22:44 +0900 Daisuke Nishimura [EMAIL PROTECTED] wrote: /* Swap 50% full? Release swapcache more aggressively.. */ -#define vm_swap_full() (nr_swap_pages*2

[Devel] Re: [PATCH -mm 0/5] swapcgroup (v3)

2008-07-04 Thread Daisuke Nishimura
On Fri, 4 Jul 2008 18:40:33 +0900, KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote: On Fri, 4 Jul 2008 15:15:36 +0900 Daisuke Nishimura [EMAIL PROTECTED] wrote: Hi. This is new version of swapcgroup. Major changes from previous version - Rebased on 2.6.26-rc5-mm3. The new -mm has

[Devel] Re: [RFC PATCH 0/5] Resend - Use procfs to change a syscall behavior

2008-07-04 Thread Nadia Derbey
Pavel Machek wrote: Hi! This patchset is a part of an effort to change some syscalls behavior for checkpoint restart. When restarting an object that has previously been checkpointed, its state should be unchanged compared to the checkpointed image. For example, a restarted process should

[Devel] Re: [PATCH -mm 5/5] swapcgroup (v3): implement force_empty

2008-07-04 Thread Daisuke Nishimura
On Fri, 4 Jul 2008 19:16:38 +0900, KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote: On Fri, 4 Jul 2008 15:24:23 +0900 Daisuke Nishimura [EMAIL PROTECTED] wrote: This patch implements force_empty of swapcgroup. Currently, it simply uncharges all the charges from the group. I think there

[Devel] [PATCH net-next 8/9] netns: place rt_genid into struct net

2008-07-04 Thread Denis V. Lunev
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED] --- include/net/netns/ipv4.h |1 + net/ipv4/route.c | 76 ++ 2 files changed, 44 insertions(+), 33 deletions(-) diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index

[Devel] [PATCH net-next 2/9] net: add fib_rules_ops to flush_cache method

2008-07-04 Thread Denis V. Lunev
This is required to pass namespace context into rt_cache_flush called from ->flush_cache. Signed-off-by: Denis V. Lunev [EMAIL PROTECTED] --- include/net/fib_rules.h |2 +- net/core/fib_rules.c|2 +- net/decnet/dn_rules.c |2 +- net/ipv4/fib_rules.c|4 ++-- 4 files

[Devel] [PATCH net-next 3/9] ipv4: remove static flush_delay variable

2008-07-04 Thread Denis V. Lunev
flush_delay is used as external storage for the net.ipv4.route.flush sysctl entry. It is write-only. The ctl_table->data for this entry is used once. Fix this case to point to the stack to remove the global variable. Do this to avoid an additional variable on struct net in the next patch. Possible race

[Devel] [PATCH net-next 1/9] netns: add namespace parameter to rt_cache_flush

2008-07-04 Thread Denis V. Lunev
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED] --- include/net/route.h |2 +- net/ipv4/arp.c |2 +- net/ipv4/devinet.c |8 +--- net/ipv4/fib_frontend.c | 17 + net/ipv4/fib_hash.c |6 +++--- net/ipv4/fib_rules.c|2 +-

[Devel] [PATCH net-next 7/9] ipv4: pass current value of rt_genid into rt_hash

2008-07-04 Thread Denis V. Lunev
Basically, there is no difference between calling atomic_read internally and passing the value as a parameter, as rt_hash is inline. Signed-off-by: Denis V. Lunev [EMAIL PROTECTED] --- net/ipv4/route.c | 28 +--- 1 files changed, 17 insertions(+), 11 deletions(-) diff --git a/net/ipv4/route.c

[Devel] [PATCH net-next 6/9] netns: add struct net parameter to rt_cache_invalidate

2008-07-04 Thread Denis V. Lunev
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED] --- net/ipv4/route.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index f99d9db..9725223 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -778,7 +778,7 @@ static void

[Devel] [PATCH net-next 5/9] netns: make rt_secret_rebuild timer per namespace

2008-07-04 Thread Denis V. Lunev
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED] --- include/net/netns/ipv4.h |2 ++ net/ipv4/route.c | 40 ++-- 2 files changed, 32 insertions(+), 10 deletions(-) diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index

[Devel] [PATCH net-next 0/9] selective (per/namespace) flush of rt_cache

2008-07-04 Thread Denis V. Lunev
Hello, Dave! This series of patches implements selective rt cache flushing, so that one namespace cannot affect the performance of another from user space. Regards, Den

[Devel] [PATCH net-next 4/9] netns: register net.ipv4.route.flush in each namespace

2008-07-04 Thread Denis V. Lunev
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED] --- include/net/netns/ipv4.h |1 + net/ipv4/route.c | 79 -- 2 files changed, 70 insertions(+), 10 deletions(-) diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index

[Devel] [PATCH net-next 9/9] netns: selective flush of rt_cache

2008-07-04 Thread Denis V. Lunev
The dst cache is marked as expired on a per-namespace basis by the previous patch. Now we have to implement selective cache shrinking. This procedure has been ported from an older OpenVz codebase. Signed-off-by: Denis V. Lunev [EMAIL PROTECTED] --- net/ipv4/route.c | 31

[Devel] Re: [PATCH 12/15] driver core: Implement tagged directory support for device classes.

2008-07-04 Thread Eric W. Biederman
Thank you for your opinion. Incremental patches to make things more beautiful are welcome. Please remember we are not building lisp. The goal is code that works today. Since we are not talking about correctness of the code. Since we are not talking about interfaces with user space. Since we

[Devel] Re: [PATCH 12/15] driver core: Implement tagged directory support for device classes.

2008-07-04 Thread Tejun Heo
Hello, Eric. Eric W. Biederman wrote: Thank you for your opinion. Incremental patches to make things more beautiful are welcome. Please remember we are not building lisp. The goal is code that works today. Since we are not talking about correctness of the code. Since we are not

[Devel] [PATCH 0/3] cgroup: block device i/o bandwidth controller (v4)

2008-07-04 Thread Andrea Righi
The objective of the i/o bandwidth controller is to improve i/o performance predictability of different cgroups sharing the same block devices. Compared with other priority/weight-based solutions, the approach used by this controller is to explicitly choke applications' requests that directly (or

[Devel] [PATCH 2/3] i/o bandwidth controller infrastructure

2008-07-04 Thread Andrea Righi
This is the core io-throttle kernel infrastructure. It creates the basic interfaces to cgroups and implements the I/O measurement and throttling functions. Signed-off-by: Andrea Righi [EMAIL PROTECTED] --- block/Makefile |2 + block/blk-io-throttle.c | 529

[Devel] [PATCH 1/3] i/o bandwidth controller documentation

2008-07-04 Thread Andrea Righi
Documentation of the block device I/O bandwidth controller: description, usage, advantages and design. Signed-off-by: Andrea Righi [EMAIL PROTECTED] --- Documentation/controllers/io-throttle.txt | 265 + 1 files changed, 265 insertions(+), 0 deletions(-) create mode

[Devel] [PATCH 3/3] i/o accounting and control

2008-07-04 Thread Andrea Righi
Apply the io-throttle controller to the appropriate kernel functions. Both accounting and throttling are performed by cgroup_io_throttle(). Signed-off-by: Andrea Righi [EMAIL PROTECTED] --- block/blk-core.c|2 ++ fs/buffer.c | 10 ++ fs/direct-io.c |

[Devel] Re: Network namespaces without isolation

2008-07-04 Thread Andreas B Aaen
On Friday 04 July 2008 11:52, Eric W. Biederman wrote: Andreas B Aaen [EMAIL PROTECTED] writes: Answering part of your question. As currently designed you can use multiple network namespaces in a single task, and you can place each vlan interface in different network namespace. However the

[Devel] Re: [PATCH 12/15] driver core: Implement tagged directory support for device classes.

2008-07-04 Thread Greg KH
On Fri, Jul 04, 2008 at 10:57:15PM +0900, Tejun Heo wrote: Hello, Eric. Eric W. Biederman wrote: Thank you for your opinion. Incremental patches to make things more beautiful are welcome. Please remember we are not building lisp. The goal is code that works today. Since we

[Devel] Re: [PATCH 12/15] driver core: Implement tagged directory support for device classes.

2008-07-04 Thread Eric W. Biederman
Greg KH [EMAIL PROTECTED] writes: Sorry, Greg is walking out the door in 30 minutes for a much needed week long vacation and can't look into this right now :( I'll be able to review it next weekend, sorry for the delay. Understood and no problem. Eric

[Devel] Re: [PATCH 12/15] driver core: Implement tagged directory support for device classes.

2008-07-04 Thread Eric W. Biederman
Tejun Heo [EMAIL PROTECTED] writes: Yeah, it seems we should agree to disagree here. I think using callback for static values is a really bad idea. It obfuscates the code and opens up a big hole for awful misuses. Greg, what do you think? The misuse argument is small because currently all

[Devel] Re: Network namespaces without isolation

2008-07-04 Thread Eric W. Biederman
Andreas B Aaen [EMAIL PROTECTED] writes: How do you actually use multiple name spaces in the current implementation in the same task if you refer to them indirectly through pids? So if I need 500 network namespaces then I need to fork 500 processes. Currently sockets are explicitly tagged

[Devel] Re: [PATCH 2/3] i/o bandwidth controller infrastructure

2008-07-04 Thread Li Zefan
+/** + * struct iothrottle_node - throttling rule of a single block device + * @node: list of per block device throttling rules + * @dev: block device number, used as key in the list + * @iorate: max i/o bandwidth (in bytes/s) + * @strategy: throttling strategy (0 = leaky bucket, 1 = token

[Devel] Re: [PATCH -mm 5/5] swapcgroup (v3): implement force_empty

2008-07-04 Thread KAMEZAWA Hiroyuki
On Fri, 4 Jul 2008 21:33:01 +0900 Daisuke Nishimura [EMAIL PROTECTED] wrote: On Fri, 4 Jul 2008 19:16:38 +0900, KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote: On Fri, 4 Jul 2008 15:24:23 +0900 Daisuke Nishimura [EMAIL PROTECTED] wrote: This patch implements force_empty of swapcgroup.

[Devel] Query on host swapping of guest pages.

2008-07-04 Thread Arn
Hi, I need to know the following for some experiments I'm doing as part of a school project: Does OpenVZ support host swapping of guest pages? That is: if a host experiences memory pressure, can it swap out some of a guest's (guest vm) pages and use them? I assume that if this is possible