Hi, this set is for memory controller background reclaim.
Merged YAMAMOTO-san's version onto 2.6.23-rc3-mm1 + my NUMA patch.
And split into several sets.
Major changes from his version are
- use kthread instead of work_queue
- adjust high/low watermark automatically when limit changes
and set
When implementing high/low watermark in res_counter, it will be better to
adjust high/low value when limit changes. (or don't allow user to specify
high/low value)
This patch adds an *internal* interface to modify resource values.
(If there are only limit/usage/failcnt, these routines are not necessary
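As a rough illustration of adjusting the watermarks whenever the limit changes, here is a minimal userspace sketch. The struct `res_counter_model`, the function `model_set_limit()`, and the 90%/80% ratios are all hypothetical and chosen only for the example; they are not taken from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace model of a resource counter; all names are illustrative. */
struct res_counter_model {
	unsigned long long usage;
	unsigned long long limit;
	unsigned long long high_watermark;
	unsigned long long low_watermark;
};

/* Assumed policy for this sketch only: high = 90%, low = 80% of limit. */
#define MODEL_HIGH_PERCENT 90ULL
#define MODEL_LOW_PERCENT  80ULL

/* Change the limit and re-derive both watermarks from it. */
static void model_set_limit(struct res_counter_model *c,
			    unsigned long long new_limit)
{
	assert(c != NULL);
	c->limit = new_limit;
	if (new_limit == (unsigned long long)LLONG_MAX) {
		/* "Unlimited": park both watermarks at the limit. */
		c->high_watermark = new_limit;
		c->low_watermark = new_limit;
		return;
	}
	/* Divide first so the multiplication cannot overflow. */
	c->high_watermark = new_limit / 100 * MODEL_HIGH_PERCENT;
	c->low_watermark = new_limit / 100 * MODEL_LOW_PERCENT;
}
```

With this policy, low <= high <= limit holds after every call, so the user never has to set the watermarks by hand.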
A spinlock is necessary when someone changes a res_counter value.
Split out from YAMAMOTO's background page reclaim for memory cgroup set.
Signed-off-by: KAMEZAWA Hiroyuki [EMAIL PROTECTED]
From: YAMAMOTO Takashi [EMAIL PROTECTED]
kernel/res_counter.c |5 +++--
1 file changed, 3
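The locking requirement above can be sketched in userspace, with a pthread mutex standing in for the kernel spinlock. The names `locked_counter`, `model_set()` and `model_get()` are hypothetical and are not the res_counter API; the point is only that every write to a counter member happens with the lock held, which matters because a 64-bit store is not guaranteed to be atomic on 32-bit machines.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum model_member { MODEL_USAGE, MODEL_LIMIT, MODEL_FAILCNT };

/* Userspace stand-in for a counter protected by a lock. */
struct locked_counter {
	pthread_mutex_t lock;
	unsigned long long usage;
	unsigned long long limit;
	unsigned long long failcnt;
};

/* Map a member id to its storage; NULL for an unknown id. */
static unsigned long long *model_member_ptr(struct locked_counter *c,
					    enum model_member m)
{
	switch (m) {
	case MODEL_USAGE:
		return &c->usage;
	case MODEL_LIMIT:
		return &c->limit;
	case MODEL_FAILCNT:
		return &c->failcnt;
	}
	return NULL;
}

/* Store a new value with the lock held; returns 0 on success, -1 on bad id. */
static int model_set(struct locked_counter *c, enum model_member m,
		     unsigned long long newval)
{
	unsigned long long *val;

	assert(c != NULL);
	val = model_member_ptr(c, m);
	if (!val)
		return -1;
	pthread_mutex_lock(&c->lock);
	*val = newval;
	pthread_mutex_unlock(&c->lock);
	return 0;
}

/* Read a value with the lock held; returns 0 for an unknown id. */
static unsigned long long model_get(struct locked_counter *c,
				    enum model_member m)
{
	unsigned long long *val, ret = 0;

	assert(c != NULL);
	val = model_member_ptr(c, m);
	if (!val)
		return 0;
	pthread_mutex_lock(&c->lock);
	ret = *val;
	pthread_mutex_unlock(&c->lock);
	return ret;
}
```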
This patch adds high/low watermark parameters to res_counter.
Split out from YAMAMOTO's background page reclaim for memory cgroup set.
Changes:
* added param watermark_state; this allows status check without lock.
Signed-off-by: KAMEZAWA Hiroyuki [EMAIL PROTECTED]
From: YAMAMOTO Takashi
High Low watermark for page reclaiming in memory cgroup(1)
The high/low values here are implemented to support background reclaim.
- If USAGE is bigger than high watermark, background reclaim starts.
- If USAGE is lower than low watermark, background reclaim stops.
Each value is represented in
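The start/stop rule in the two bullets above is a hysteresis: between the low and high watermark, the previous state is kept. A minimal sketch in plain C follows; `reclaim_model` and `reclaim_update()` are hypothetical names for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative state for one memory cgroup. */
struct reclaim_model {
	unsigned long long high_watermark;
	unsigned long long low_watermark;
	bool reclaiming;	/* is background reclaim currently active? */
};

/*
 * Feed the current usage in and get back whether background reclaim
 * should be running. Between the two watermarks the state is unchanged.
 */
static bool reclaim_update(struct reclaim_model *m, unsigned long long usage)
{
	assert(m != NULL && m->low_watermark <= m->high_watermark);
	if (usage > m->high_watermark)
		m->reclaiming = true;	/* above high: start */
	else if (usage < m->low_watermark)
		m->reclaiming = false;	/* below low: stop */
	return m->reclaiming;
}
```

The gap between the two watermarks is what keeps the reclaim thread from toggling on and off when usage hovers around a single threshold.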
Create a daemon which does background page reclaim.
This daemon
* starts when usage > high_watermark
* stops when usage < low_watermark.
Because kthread_run() cannot be used when init_mem_cgroup is initialized (sigh),
the thread for init_mem_cgroup is invoked later by an initcall.
Changes from
The
if (statement)
WARN_ON(1);
looks much better as
WARN_ON(statement);
Signed-off-by: Pavel Emelyanov [EMAIL PROTECTED]
---
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 61ead1d..e41f4b9 100644
--- a/net/core/net-sysfs.c
+++
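The WARN_ON(statement) form described in this patch works because WARN_ON() evaluates its condition and yields the result, so it can also sit directly inside an if. The following is a userspace stand-in only, written to show that shape; `model_warn()`, `warn_count` and `check_len()` are hypothetical, and the real kernel macro is implemented differently.

```c
#include <assert.h>
#include <stdio.h>

static int warn_count;	/* number of warnings emitted so far */

/* Print a warning when cond is true and hand the condition back. */
static int model_warn(int cond, const char *file, int line)
{
	assert(file != NULL);
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: at %s:%d\n", file, line);
	}
	return cond;
}

/* Userspace stand-in for the kernel macro, for illustration only. */
#define WARN_ON(cond) model_warn(!!(cond), __FILE__, __LINE__)

/* Example caller: warn and bail out on a negative length. */
static int check_len(int len)
{
	if (WARN_ON(len < 0))
		return -1;
	return len;
}
```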
KAMEZAWA Hiroyuki wrote:
When implementing high/low watermark in res_counter, it will be better to
adjust high/low value when limit changes. (or don't allow user to specify
high/low value)
This patch adds an *internal* interface to modify resource values.
(If there are only limit/usage/failcnt,
KAMEZAWA Hiroyuki wrote:
A spinlock is necessary when someone changes a res_counter value.
Split out from YAMAMOTO's background page reclaim for memory cgroup set.
Signed-off-by: KAMEZAWA Hiroyuki [EMAIL PROTECTED]
From: YAMAMOTO Takashi [EMAIL PROTECTED]
kernel/res_counter.c |5
KAMEZAWA Hiroyuki wrote:
This patch adds high/low watermark parameters to res_counter.
Split out from YAMAMOTO's background page reclaim for memory cgroup set.
Changes:
* added param watermark_state; this allows status check without lock.
Signed-off-by: KAMEZAWA Hiroyuki [EMAIL
KAMEZAWA Hiroyuki wrote:
High Low watermark for page reclaiming in memory cgroup(1)
The high/low values here are implemented to support background reclaim.
- If USAGE is bigger than high watermark, background reclaim starts.
- If USAGE is lower than low watermark, background reclaim stops.
On Mon, 2007-11-19 at 12:51 +, Christoph Hellwig wrote:
On Fri, Nov 16, 2007 at 06:06:51PM +0300, Alexey Dobriyan wrote:
proc_kill_inodes() can clear ->i_fop in the middle of vfs_readdir resulting
in NULL dereference during file->f_op->readdir(file, buf, filler).
The solution is to
On Tue, 2007-11-20 at 15:17 +, Christoph Hellwig wrote:
On Tue, Nov 20, 2007 at 10:05:05AM -0500, Stephen Smalley wrote:
Nice, getting rid of this is a very good step forward. Unfortunately
we have another copy of this junk in
security/selinux/selinuxfs.c:sel_remove_entries()
--- Mark Nelson [EMAIL PROTECTED] wrote:
Subject: [PATCH 2/2] hijack: update task_alloc_security
Update task_alloc_security() to take the hijacked task as a second
argument.
Could y'all bring me up to speed on what this is intended to
accomplish so that I can understand the Smack
--- Serge E. Hallyn [EMAIL PROTECTED] wrote:
Quoting Stephen Smalley ([EMAIL PROTECTED]):
On Tue, 2007-11-27 at 10:11 -0600, Serge E. Hallyn wrote:
Quoting Crispin Cowan ([EMAIL PROTECTED]):
Just the name sys_hijack makes me concerned.
This post describes a bunch of what, but
KAMEZAWA Hiroyuki wrote:
Create a daemon which does background page reclaim.
This daemon
* starts when usage > high_watermark
* stops when usage < low_watermark.
Because kthread_run() cannot be used when init_mem_cgroup is
initialized (sigh),
thread for init_mem_cgroup is invoked
It will give another easy way to locate selinux security structures inside
the kernel, won't it?
Again, if you have a kernel vulnerability and this feature, someone will
easily disable selinux for the process, or just change the security concerns
for it ;).
cya,
Rodrigo (BSDaemon).
--
On Tue, 2007-11-27 at 00:52 -0500, Joshua Brindle wrote:
Mark Nelson wrote:
Subject: [PATCH 2/2] hijack: update task_alloc_security
Update task_alloc_security() to take the hijacked task as a second
argument.
For the selinux version, refuse permission if hijack_src!=current,
since
Quoting Casey Schaufler ([EMAIL PROTECTED]):
--- Serge E. Hallyn [EMAIL PROTECTED] wrote:
Quoting Stephen Smalley ([EMAIL PROTECTED]):
On Tue, 2007-11-27 at 10:11 -0600, Serge E. Hallyn wrote:
Quoting Crispin Cowan ([EMAIL PROTECTED]):
Just the name sys_hijack makes me
Quoting Crispin Cowan ([EMAIL PROTECTED]):
Serge E. Hallyn wrote:
Quoting Stephen Smalley ([EMAIL PROTECTED]):
I agree with this part - we don't want people to have to choose between
using containers and using selinux, so if hijack is going to be a
requirement for effective use of
Quoting Crispin Cowan ([EMAIL PROTECTED]):
Serge E. Hallyn wrote:
Quoting Casey Schaufler ([EMAIL PROTECTED]):
Could y'all bring me up to speed on what this is intended to
accomplish so that I can understand the Smack implications?
It's basically like ptracing a process,
On Tue, 2007-11-27 at 16:38 -0600, Serge E. Hallyn wrote:
Quoting Stephen Smalley ([EMAIL PROTECTED]):
On Tue, 2007-11-27 at 10:11 -0600, Serge E. Hallyn wrote:
Quoting Crispin Cowan ([EMAIL PROTECTED]):
Just the name sys_hijack makes me concerned.
This post describes a bunch of
Quoting Stephen Smalley ([EMAIL PROTECTED]):
On Tue, 2007-11-27 at 16:38 -0600, Serge E. Hallyn wrote:
Quoting Stephen Smalley ([EMAIL PROTECTED]):
On Tue, 2007-11-27 at 10:11 -0600, Serge E. Hallyn wrote:
Quoting Crispin Cowan ([EMAIL PROTECTED]):
Just the name sys_hijack makes me
From: Cedric Le Goater [EMAIL PROTECTED]
This patch includes the mqueue namespace in the nsproxy object. It
also adds the support of unshare() and clone() with a new clone flag
CLONE_NEWMQ (1 bit left in the clone flags !)
CLONE_NEWMQ is required to be cloned or unshared along with
Hello !
Here's a small patchset introducing a new namespace for POSIX
message queues.
Nothing really complex apart from the mqueue filesystem, which
needed some special care
Thanks for reviewing !
C.
___
Containers mailing list
[EMAIL PROTECTED]
From: Cedric Le Goater [EMAIL PROTECTED]
This patch adds a struct mq_namespace holding the common attributes
of the mqueue namespace.
The current code is modified to use the default mqueue namespace
object 'init_mq_ns' and to prepare the ground for future dynamic
objects.
A new option
From: Cedric Le Goater [EMAIL PROTECTED]
Move forward and start using the mqueue namespace.
The single super block mount of the file system is modified to allow
one mount per namespace. This is achieved by storing the namespace
in the super_block s_fs_info attribute.
Signed-off-by: Cedric Le
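The "one mount per namespace" idea above, keyed on the namespace stored in s_fs_info, can be modelled in userspace as a lookup that reuses a super block when its s_fs_info matches the requesting namespace and creates one otherwise. Everything here (`sb_model`, `mq_ns_model`, `get_sb_for_ns()`, the fixed-size table) is a hypothetical sketch and not the patch's code.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_SB 8

/* Illustrative stand-ins for the namespace and the super block. */
struct mq_ns_model {
	int id;
};

struct sb_model {
	void *s_fs_info;	/* owning namespace, as in the patch description */
	int in_use;
};

static struct sb_model sb_table[MODEL_MAX_SB];

/*
 * Return the super block belonging to ns, creating it on first use.
 * Returns NULL when the table is full.
 */
static struct sb_model *get_sb_for_ns(struct mq_ns_model *ns)
{
	struct sb_model *free_slot = NULL;
	size_t i;

	assert(ns != NULL);
	for (i = 0; i < MODEL_MAX_SB; i++) {
		/* An existing super block matches if it stores this namespace. */
		if (sb_table[i].in_use && sb_table[i].s_fs_info == ns)
			return &sb_table[i];
		if (!sb_table[i].in_use && !free_slot)
			free_slot = &sb_table[i];
	}
	if (!free_slot)
		return NULL;
	free_slot->in_use = 1;
	free_slot->s_fs_info = ns;
	return free_slot;
}
```

Two mounts from the same namespace therefore share one super block, while a second namespace gets its own.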
Just a heads up: This patch is the apparent cause of a boot time
panic--null pointer deref--on my numa platform. See below.
On Tue, 2007-11-27 at 12:00 +0900, KAMEZAWA Hiroyuki wrote:
Counting active/inactive per-zone in memory controller.
This patch adds per-zone status in memory cgroup.
On Wed, 28 Nov 2007 15:20:42 +0300
Pavel Emelyanov [EMAIL PROTECTED] wrote:
+ mem = mem_cgroup_from_cont(cont);
+ spin_lock_irqsave(&mem->res.lock, flags);
+ val = res_counter_get(&mem->res, RES_LIMIT);
+ if (val == (unsigned long long) LLONG_MAX) {
+ low = (unsigned long
On Wed, 28 Nov 2007 16:19:59 -0500
Lee Schermerhorn [EMAIL PROTECTED] wrote:
As soon as this loop hits the first non-existent node on my platform, I
get a NULL pointer deref down in __alloc_pages. Stack trace below.
Perhaps N_POSSIBLE should be N_HIGH_MEMORY? That would require handling
On Wed, 28 Nov 2007 14:08:31 +0300
Pavel Emelyanov [EMAIL PROTECTED] wrote:
KAMEZAWA Hiroyuki wrote:
A spinlock is necessary when someone changes a res_counter value.
Split out from YAMAMOTO's background page reclaim for memory cgroup set.
Signed-off-by: KAMEZAWA Hiroyuki [EMAIL
On Wed, 28 Nov 2007 14:09:26 +0300
Pavel Emelyanov [EMAIL PROTECTED] wrote:
+void res_counter_set(struct res_counter *res, int member,
+ unsigned long long newval)
+{
+ unsigned long long *val;
+
+ val = res_counter_member(res, member);
+ *val = newval;
On Wed, 28 Nov 2007 14:12:53 +0300
Pavel Emelyanov [EMAIL PROTECTED] wrote:
/*
@@ -73,6 +88,8 @@
RES_USAGE,
RES_LIMIT,
RES_FAILCNT,
+ RES_HIGH_WATERMARK,
+ RES_LOW_WATERMARK,
I'd prefer some shorter names. Like RES_HWMARK and RES_LWMARK.
Hmm, ok.
On Wed, 28 Nov 2007 14:20:33 +0300
Pavel Emelyanov [EMAIL PROTECTED] wrote:
+static ssize_t mem_cgroup_write_limit(struct cgroup *cont, struct cftype
*cft,
+ struct file *file, const char __user *userbuf,
+ size_t nbytes, loff_t *ppos)
On Wed, 28 Nov 2007 14:06:22 +0300
Pavel Emelyanov [EMAIL PROTECTED] wrote:
+ struct {
+ wait_queue_head_t waitq;
+ struct task_struct *thread;
+ } daemon;
Does this HAVE to be a struct?
No, but it gives shorter member names.
(I'd like to add another waitq for
On Thu, 29 Nov 2007 10:37:02 +0900
KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote:
Maybe the zonelists of NODE_DATA() are not initialized. You are right.
I think N_HIGH_MEMORY will be suitable here... (I'll consider the node-hotplug
case later.)
Thank you for testing!
Could you try this?
Thanks,
-Kame
This patch adds high/low watermark parameters to res_counter.
Split out from YAMAMOTO's background page reclaim for memory cgroup set.
thanks.
+ * Watermarks
+ * Should be changed automatically when the limit is changed and
+ * keep low < high < limit.
+ */
+
On Thu, 29 Nov 2007 11:24:06 +0900
KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote:
On Thu, 29 Nov 2007 10:37:02 +0900
KAMEZAWA Hiroyuki [EMAIL PROTECTED] wrote:
Maybe the zonelists of NODE_DATA() are not initialized. You are right.
I think N_HIGH_MEMORY will be suitable here...(I'll consider
@@ -651,10 +758,11 @@
/* Avoid race with charge */
atomic_set(&pc->ref_cnt, 0);
if (clear_page_cgroup(page, pc) == pc) {
+ int active;
css_put(&mem->css);
+ active = pc->flags
On Thu, 29 Nov 2007 11:56:08 +0900 (JST)
[EMAIL PROTECTED] (YAMAMOTO Takashi) wrote:
This patch adds high/low watermark parameters to res_counter.
Split out from YAMAMOTO's background page reclaim for memory cgroup set.
thanks.
+* Watermarks
+* Should be changed
On Thu, 29 Nov 2007, KAMEZAWA Hiroyuki wrote:
OK, just use N_HIGH_MEMORY here and add a comment that hotplug support is
not there yet.
Christoph-san, Lee-san, could you confirm the following?
- when SLAB is used, kmalloc_node() against an offline node will succeed.
- when SLUB is used,
On Thu, 29 Nov 2007 12:19:37 +0900 (JST)
[EMAIL PROTECTED] (YAMAMOTO Takashi) wrote:
@@ -651,10 +758,11 @@
/* Avoid race with charge */
atomic_set(&pc->ref_cnt, 0);
if (clear_page_cgroup(page, pc) == pc) {
+ int active;
+static inline struct mem_cgroup_per_zone *
+mem_cgroup_zoneinfo(struct mem_cgroup *mem, int nid, int zid)
+{
+ if (!mem->info.nodeinfo[nid])
can this be true?
YAMAMOTO Takashi
+ return NULL;
+ return &mem->info.nodeinfo[nid]->zoneinfo[zid];
+}
+
to me, it seems weird to prevent users from making these values back to
the default.
will fix.
LLONG_MAX-1 for high
LLONG_MAX-2 for low ...?
imo it's better to simply allow low == high == limit.
BTW, it should be low + PAGE_SIZE high + PAGE_SIZE limit ...?
it shouldn't, unless you
On Thu, 29 Nov 2007 12:33:28 +0900 (JST)
[EMAIL PROTECTED] (YAMAMOTO Takashi) wrote:
+static inline struct mem_cgroup_per_zone *
+mem_cgroup_zoneinfo(struct mem_cgroup *mem, int nid, int zid)
+{
+ if (!mem->info.nodeinfo[nid])
can this be true?
YAMAMOTO Takashi
When I set
Serge E. Hallyn wrote:
Quoting Crispin Cowan ([EMAIL PROTECTED]):
Is there to be an LSM hook, so that modules can make an arbitrary
decision of whether to allow a hijack? So that this "do the right
SELinux thing" can be generalized for all LSMs to do the right thing.
Currently:
On Mon, Nov 26, 2007 at 08:14:12PM +0300, Pavel Emelyanov wrote:
This function seems too big for inlining. Indeed, it saves
half-a-kilo when uninlined:
add/remove: 1/0 grow/shrink: 0/7 up/down: 195/-719 (-524)
function old new delta
47 matches