Hi.
This is a new version of swapcgroup.
Major changes from previous version
- Rebased on 2.6.26-rc5-mm3.
The new -mm has been released, but these patches
can be applied on 2.6.26-rc8-mm1 too with only some offset warnings.
I tested these patches on 2.6.26-rc5-mm3 with some fixes about
Even when memory usage is limited by the memory cgroup,
swap space is still shared, so resource isolation is incomplete.
If one group uses up most of the swap space, it can affect
other groups anyway.
The purpose of swapcgroup is to limit swap usage per group,
just as the memory cgroup limits RSS memory usage.
It's
This patch adds a member to swap_info_struct for cgroup.
This member, an array of pointers to mem_cgroup, is used to
remember which cgroup each swap entry is charged to.
The memory for this array of pointers is allocated on swapon,
and freed on swapoff.
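The bookkeeping described above can be modelled in user space roughly as
follows. This is only a sketch, not the code from the patch: the
*_model structs, the model_* functions and the 4096-byte MODEL_PAGE_SIZE
are hypothetical stand-ins for swap_info_struct, mem_cgroup and the real
swapon/swapoff paths.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for mem_cgroup; not the kernel definition. */
struct mem_cgroup_model {
	unsigned long swap_usage;	/* bytes of swap charged to this group */
};

/* Hypothetical stand-in for swap_info_struct with the added member. */
struct swap_info_model {
	unsigned long max;			/* number of swap slots */
	struct mem_cgroup_model **memcg;	/* owner of each slot, NULL if none */
};

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/* "swapon": allocate one zero-filled owner pointer per swap slot. */
int model_swapon(struct swap_info_model *si, unsigned long nr_slots)
{
	assert(nr_slots > 0);
	si->memcg = calloc(nr_slots, sizeof(*si->memcg));
	if (!si->memcg)
		return -1;
	si->max = nr_slots;
	return 0;
}

/* "swapoff": release the owner array. */
void model_swapoff(struct swap_info_model *si)
{
	free(si->memcg);
	si->memcg = NULL;
	si->max = 0;
}

/* Remember which group a slot is charged to and account one page. */
void model_record_owner(struct swap_info_model *si, unsigned long offset,
			struct mem_cgroup_model *cg)
{
	assert(offset < si->max);
	si->memcg[offset] = cg;
	cg->swap_usage += MODEL_PAGE_SIZE;
}

/* Look up the owner when the slot is freed, uncharge it, clear the slot. */
struct mem_cgroup_model *model_clear_owner(struct swap_info_model *si,
					   unsigned long offset)
{
	struct mem_cgroup_model *cg;

	assert(offset < si->max);
	cg = si->memcg[offset];
	if (cg) {
		cg->swap_usage -= MODEL_PAGE_SIZE;
		si->memcg[offset] = NULL;
	}
	return cg;
}
```

The point of keeping one pointer per slot is that the owner can be found
at free time from the swap offset alone, at the cost of one pointer of
memory per swap entry for as long as the device is active.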
Change log
v2-v3
- Rebased on
This patch implements charge and uncharge of swapcgroup.
- what will be charged?
charge the number of swap entries in bytes.
- when to charge/uncharge?
charge at get_swap_entry(), and uncharge at swap_entry_free().
- which group is the swap entry charged to?
To determine to what mem_cgroup
This patch modifies vm_swap_full() to calculate
the rate of swap usage per cgroup.
The purpose of this change is to free freeable swap caches
(that is, swap entries) per cgroup, so that swap_cgroup_charge()
fails less frequently.
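A rough user-space model of the idea, for illustration only. The
swap_group_model struct and the model_* functions are hypothetical, and
the 50% per-group threshold is an assumption that simply mirrors the
existing system-wide check; the patch itself may use a different rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-group swap accounting; not the kernel structure. */
struct swap_group_model {
	unsigned long usage;	/* bytes of swap currently charged */
	unsigned long limit;	/* bytes of swap the group may use */
};

/* System-wide check: fewer than half of all swap pages are still free. */
bool model_global_swap_full(unsigned long nr_free_pages,
			    unsigned long total_pages)
{
	return nr_free_pages * 2 < total_pages;
}

/* Assumed per-group check: the group uses more than half of its limit. */
bool model_group_swap_full(const struct swap_group_model *g)
{
	assert(g->limit > 0);
	return g->usage > g->limit / 2;
}

/* Combined check: swap is "full" if either condition holds. */
bool model_vm_swap_full(unsigned long nr_free_pages,
			unsigned long total_pages,
			const struct swap_group_model *g)
{
	return model_global_swap_full(nr_free_pages, total_pages) ||
	       (g != NULL && model_group_swap_full(g));
}

/* Charge fails with -1 when it would push the group over its limit. */
int model_swap_charge(struct swap_group_model *g, unsigned long bytes)
{
	assert(g->usage <= g->limit);
	if (bytes > g->limit - g->usage)
		return -1;
	g->usage += bytes;
	return 0;
}
```

With a check like this, a group that is close to its own limit starts
releasing freeable swap cache even while the system as a whole still has
plenty of free swap, which is what lets later charges succeed.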
I tested whether this patch can reduce the swap usage or not,
by
This patch implements force_empty of swapcgroup.
Currently, it simply uncharges all the charges from the group.
I think there can be other implementations.
The options I have considered are:
- move all the charges to its parent.
- unuse (swap in) all the swap charged to the group.
But in any case, I think
Eric W. Biederman wrote:
It finally dawned on me what the clean fix to sysfs_rename_dir
calling kobject_set_name is. Move the work into kobject_rename
where it belongs. The callers serialize us anyway so this is
safe.
Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
Nice clean up.
Eric W. Biederman wrote:
Currently sysfs_chmod calls sys_setattr which in turn calls
inode_change_ok which checks to see if it is ok for the current user
space process to change the attributes. Since sysfs_chmod_file has
only kernel mode clients denying them permission if user space is the
Eric W. Biederman wrote:
Teach sysfs_chmod_file how to handle multiple sysfs superblocks.
Since we only have one inode per sd the only thing we have to deal
with is multiple dentries for sending fs notifications. This might
duplicate the inode notifications, but oh well.
Signed-off-by: Eric W.
Eric W. Biederman wrote:
Accessing the internal sysfs_mount is error prone in the context
of multiple super blocks, and nothing needs it. Not even the
sysfs crash debugging patch (although it did in an earlier version).
Signed-off-by: Eric W. Biederman [EMAIL PROTECTED]
Acked-by: Tejun Heo
hi,
+/*
+ * uncharge all the entries that are charged to the group.
+ */
+void __swap_cgroup_force_empty(struct mem_cgroup *mem)
+{
+ struct swap_info_struct *p;
+ int type;
+
+ spin_lock(&swap_lock);
+ for (type = swap_list.head; type >= 0; type = swap_info[type].next) {
Hi, Yamamoto-san.
Thank you for your comment.
On Fri, 4 Jul 2008 15:54:31 +0900 (JST), [EMAIL PROTECTED] (YAMAMOTO Takashi)
wrote:
hi,
+/*
+ * uncharge all the entries that are charged to the group.
+ */
+void __swap_cgroup_force_empty(struct mem_cgroup *mem)
+{
+ struct
Hi, Yamamoto-san.
Thank you for your comment.
On Fri, 4 Jul 2008 15:54:31 +0900 (JST), [EMAIL PROTECTED] (YAMAMOTO
Takashi) wrote:
hi,
+/*
+ * uncharge all the entries that are charged to the group.
+ */
+void __swap_cgroup_force_empty(struct mem_cgroup *mem)
+{
+
Eric W. Biederman wrote:
This patch enables tagging on every class directory if struct class
has a tag_type.
In addition device_del and device_rename were modified to use
sysfs_delete_link and sysfs_rename_link respectively to ensure
when these operations happen on devices whose classes
On Fri, 4 Jul 2008 16:48:28 +0900 (JST), [EMAIL PROTECTED] (YAMAMOTO Takashi)
wrote:
Hi, Yamamoto-san.
Thank you for your comment.
On Fri, 4 Jul 2008 15:54:31 +0900 (JST), [EMAIL PROTECTED] (YAMAMOTO
Takashi) wrote:
hi,
+/*
+ * uncharge all the entries that are
The problem. When implementing a network namespace I need to be able
to have multiple network devices with the same name. Currently this
is a problem for /sys/class/net/*, /sys/devices/virtual/net/*, and
potentially a few other directories of the form /sys/ ... /net/*.
What this patch does is
On Fri, 4 Jul 2008 15:15:36 +0900
Daisuke Nishimura [EMAIL PROTECTED] wrote:
Hi.
This is a new version of swapcgroup.
Major changes from previous version
- Rebased on 2.6.26-rc5-mm3.
The new -mm has been released, but these patches
can be applied on 2.6.26-rc8-mm1 too with only some
On Fri, 4 Jul 2008 15:22:44 +0900
Daisuke Nishimura [EMAIL PROTECTED] wrote:
/* Swap 50% full? Release swapcache more aggressively.. */
-#define vm_swap_full() (nr_swap_pages*2 < total_swap_pages)
+#define vm_swap_full(memcg) ((nr_swap_pages*2 < total_swap_pages) \
+
Andreas B Aaen [EMAIL PROTECTED] writes:
Hi,
I am looking into the network namespace implementation because I need an IP
stack that is capable of talking with a number of separate IP nets with
possibly overlapping IP addresses. My connection to each separate IP-net is
through a tunnel
Hi, Kamezawa-san.
On Fri, 4 Jul 2008 18:58:45 +0900, KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote:
On Fri, 4 Jul 2008 15:22:44 +0900
Daisuke Nishimura [EMAIL PROTECTED] wrote:
/* Swap 50% full? Release swapcache more aggressively.. */
-#define vm_swap_full() (nr_swap_pages*2
On Fri, 4 Jul 2008 18:40:33 +0900, KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote:
On Fri, 4 Jul 2008 15:15:36 +0900
Daisuke Nishimura [EMAIL PROTECTED] wrote:
Hi.
This is a new version of swapcgroup.
Major changes from previous version
- Rebased on 2.6.26-rc5-mm3.
The new -mm has
Pavel Machek wrote:
Hi!
This patchset is a part of an effort to change some syscalls behavior for
checkpoint restart.
When restarting an object that has previously been checkpointed, its state
should be unchanged compared to the checkpointed image.
For example, a restarted process should
On Fri, 4 Jul 2008 19:16:38 +0900, KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote:
On Fri, 4 Jul 2008 15:24:23 +0900
Daisuke Nishimura [EMAIL PROTECTED] wrote:
This patch implements force_empty of swapcgroup.
Currently, it simply uncharges all the charges from the group.
I think there
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
include/net/netns/ipv4.h |1 +
net/ipv4/route.c | 76 ++
2 files changed, 44 insertions(+), 33 deletions(-)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index
This is required to pass namespace context into rt_cache_flush called from
->flush_cache.
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
include/net/fib_rules.h |2 +-
net/core/fib_rules.c|2 +-
net/decnet/dn_rules.c |2 +-
net/ipv4/fib_rules.c|4 ++--
4 files
flush delay is used as an external storage for net.ipv4.route.flush sysctl
entry. It is write-only.
The ctl_table->data for this entry is used once. Fix this case to point
to the stack to remove global variable. Do this to avoid additional
variable on struct net in the next patch.
Possible race
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
include/net/route.h |2 +-
net/ipv4/arp.c |2 +-
net/ipv4/devinet.c |8 +---
net/ipv4/fib_frontend.c | 17 +
net/ipv4/fib_hash.c |6 +++---
net/ipv4/fib_rules.c|2 +-
Basically, it makes no difference whether atomic_read is called internally
or the value is passed as a parameter, since rt_hash is inline.
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
net/ipv4/route.c | 28 +---
1 files changed, 17 insertions(+), 11 deletions(-)
diff --git a/net/ipv4/route.c
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
net/ipv4/route.c |6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index f99d9db..9725223 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -778,7 +778,7 @@ static void
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
include/net/netns/ipv4.h |2 ++
net/ipv4/route.c | 40 ++--
2 files changed, 32 insertions(+), 10 deletions(-)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index
Hello, Dave!
This series of patches implements selective rt cache flushing to make
sure that one namespace will not be able to affect the performance
of another from user space.
Regards,
Den
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
include/net/netns/ipv4.h |1 +
net/ipv4/route.c | 79 --
2 files changed, 70 insertions(+), 10 deletions(-)
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index
The dst cache is marked as expired on a per-namespace basis by the previous
patch. Now we have to implement selective cache shrinking. This
procedure has been ported from an older OpenVz codebase.
Signed-off-by: Denis V. Lunev [EMAIL PROTECTED]
---
net/ipv4/route.c | 31
Thank you for your opinion.
Incremental patches to make things more beautiful are welcome.
Please remember we are not building lisp. The goal is code that works today.
Since we are not talking about correctness of the code. Since we are not
talking about interfaces with user space. Since we
Hello, Eric.
Eric W. Biederman wrote:
Thank you for your opinion.
Incremental patches to make things more beautiful are welcome.
Please remember we are not building lisp. The goal is code that works today.
Since we are not talking about correctness of the code. Since we are not
The objective of the i/o bandwidth controller is to improve i/o performance
predictability of different cgroups sharing the same block devices.
Compared with other priority/weight-based solutions, the approach used by this
controller is to explicitly choke applications' requests that directly (or
This is the core io-throttle kernel infrastructure. It creates the basic
interfaces to cgroups and implements the I/O measurement and throttling
functions.
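As a rough illustration of what a throttling function of this kind does,
here is a minimal token-bucket limiter in user-space C. It is a sketch
under assumptions, not the code from blk-io-throttle.c: tb_model,
tb_init and tb_account are hypothetical names, time is passed in by the
caller in milliseconds, and the integer refill drops sub-token
remainders.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical token bucket; one token corresponds to one byte of I/O. */
struct tb_model {
	uint64_t rate;		/* refill rate in bytes per second */
	uint64_t capacity;	/* maximum burst size in bytes */
	int64_t tokens;		/* current balance; negative means debt */
	uint64_t last_ms;	/* time of the last update */
};

void tb_init(struct tb_model *tb, uint64_t rate, uint64_t capacity,
	     uint64_t now_ms)
{
	assert(rate > 0);
	tb->rate = rate;
	tb->capacity = capacity;
	tb->tokens = (int64_t)capacity;	/* start with a full bucket */
	tb->last_ms = now_ms;
}

/*
 * Account an I/O of 'bytes' at time 'now_ms' (assumed monotonic).
 * Returns how many milliseconds the caller should sleep before issuing
 * the I/O, or 0 if it may proceed immediately.
 */
uint64_t tb_account(struct tb_model *tb, uint64_t bytes, uint64_t now_ms)
{
	uint64_t elapsed = now_ms - tb->last_ms;
	uint64_t deficit;

	/* Refill for the elapsed time, capped at the bucket capacity. */
	tb->tokens += (int64_t)(elapsed * tb->rate / 1000);
	if (tb->tokens > (int64_t)tb->capacity)
		tb->tokens = (int64_t)tb->capacity;
	tb->last_ms = now_ms;

	/* Consume; a negative balance is debt to be repaid by waiting. */
	tb->tokens -= (int64_t)bytes;
	if (tb->tokens >= 0)
		return 0;

	deficit = (uint64_t)(-tb->tokens);
	return (deficit * 1000 + tb->rate - 1) / tb->rate;	/* round up */
}
```

Letting the balance go negative keeps the accounting honest even if the
caller does not sleep for the full delay: the debt is still there on the
next call and is repaid by later refills.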
Signed-off-by: Andrea Righi [EMAIL PROTECTED]
---
block/Makefile |2 +
block/blk-io-throttle.c | 529
Documentation of the block device I/O bandwidth controller: description, usage,
advantages and design.
Signed-off-by: Andrea Righi [EMAIL PROTECTED]
---
Documentation/controllers/io-throttle.txt | 265 +
1 files changed, 265 insertions(+), 0 deletions(-)
create mode
Apply the io-throttle controller to the appropriate kernel functions. Both
accounting and throttling functionalities are performed by
cgroup_io_throttle().
Signed-off-by: Andrea Righi [EMAIL PROTECTED]
---
block/blk-core.c|2 ++
fs/buffer.c | 10 ++
fs/direct-io.c |
On Friday 04 July 2008 11:52, Eric W. Biederman wrote:
Andreas B Aaen [EMAIL PROTECTED] writes:
Answering part of your question. As currently designed you can use
multiple network namespaces in a single task, and you can place each vlan
interface in different network namespace. However the
On Fri, Jul 04, 2008 at 10:57:15PM +0900, Tejun Heo wrote:
Hello, Eric.
Eric W. Biederman wrote:
Thank you for your opinion.
Incremental patches to make things more beautiful are welcome.
Please remember we are not building lisp. The goal is code that works
today.
Since we
Greg KH [EMAIL PROTECTED] writes:
Sorry, Greg is walking out the door in 30 minutes for a much needed week
long vacation and can't look into this right now :(
I'll be able to review it next weekend, sorry for the delay.
Understood and no problem.
Eric
Tejun Heo [EMAIL PROTECTED] writes:
Yeah, it seems we should agree to disagree here. I think using callback
for static values is a really bad idea. It obfuscates the code and
opens up a big hole for awful misuses. Greg, what do you think?
The misuse argument is small because currently all
Andreas B Aaen [EMAIL PROTECTED] writes:
How do you actually use multiple name spaces in the current implementation in
the same task if you refer to them indirectly through pids?
So if I need 500 network namespaces then I need to fork 500 processes.
Currently sockets are explicitly tagged
+/**
+ * struct iothrottle_node - throttling rule of a single block device
+ * @node: list of per block device throttling rules
+ * @dev: block device number, used as key in the list
+ * @iorate: max i/o bandwidth (in bytes/s)
+ * @strategy: throttling strategy (0 = leaky bucket, 1 = token
On Fri, 4 Jul 2008 21:33:01 +0900
Daisuke Nishimura [EMAIL PROTECTED] wrote:
On Fri, 4 Jul 2008 19:16:38 +0900, KAMEZAWA Hiroyuki [EMAIL PROTECTED]
wrote:
On Fri, 4 Jul 2008 15:24:23 +0900
Daisuke Nishimura [EMAIL PROTECTED] wrote:
This patch implements force_empty of swapcgroup.
Hi,
I need to know the following for some experiments I'm doing as part of
a school project:
Does OpenVZ support host swapping of guest pages? That is: if a host
experiences memory pressure, can it swap out some of a guest's (guest
VM) pages and use them? I assume that if this is possible