On Fri, 19 Oct 2012, Glauber Costa wrote:
What about gfp & __GFP_FS?
Do you intend to prevent or allow OOM under that flag? I personally
think that anything that accepts being OOM-killed should have __GFP_WAIT
set, so that ought to be enough.
The oom killer in the page allocator
On Thu, 18 Oct 2012, Glauber Costa wrote:
Do we actually need to test PF_KTHREAD when current->mm == NULL?
Perhaps because of aio threads which temporarily adopt a userspace mm?
I believe so. I remember I discussed this in the past with David
Rientjes and he advised me to test for both
all definitions to slab.h ]
Signed-off-by: Glauber Costa glom...@parallels.com
Acked-by: Christoph Lameter c...@linux.com
CC: David Rientjes rient...@google.com
CC: Pekka Enberg penb...@cs.helsinki.fi
Acked-by: David Rientjes rient...@google.com
Signed-off-by: Glauber Costa glom...@parallels.com
Acked-by: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
Acked-by: Michal Hocko mho...@suse.cz
Acked-by: Johannes Weiner han...@cmpxchg.org
CC: Tejun Heo t...@kernel.org
Acked-by: David Rientjes rient...@google.com
...@parallels.com
Acked-by: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
Acked-by: Michal Hocko mho...@suse.cz
Acked-by: Johannes Weiner han...@cmpxchg.org
CC: Tejun Heo t...@kernel.org
Acked-by: David Rientjes rient...@google.com
___
Devel mailing list
Devel
On Tue, 16 Oct 2012, Glauber Costa wrote:
This patch adds the basic infrastructure for the accounting of kernel
memory. To control that, the following files are created:
* memory.kmem.usage_in_bytes
* memory.kmem.limit_in_bytes
* memory.kmem.failcnt
* memory.kmem.max_usage_in_bytes
CC: Christoph Lameter c...@linux.com
CC: Pekka Enberg penb...@cs.helsinki.fi
CC: Michal Hocko mho...@suse.cz
CC: Suleiman Souhlal sulei...@google.com
CC: Tejun Heo t...@kernel.org
Acked-by: David Rientjes rient...@google.com
On Tue, 16 Oct 2012, Glauber Costa wrote:
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 8d9489f..303a456 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -21,6 +21,7 @@
#define _LINUX_MEMCONTROL_H
#include <linux/cgroup.h>
#include
mgor...@suse.de
Acked-by: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
CC: Christoph Lameter c...@linux.com
CC: Pekka Enberg penb...@cs.helsinki.fi
CC: Johannes Weiner han...@cmpxchg.org
CC: Suleiman Souhlal sulei...@google.com
CC: Tejun Heo t...@kernel.org
Acked-by: David Rientjes rient...@google.com
this value.
Signed-off-by: Glauber Costa glom...@parallels.com
Reviewed-by: Michal Hocko mho...@suse.cz
Acked-by: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
CC: Johannes Weiner han...@cmpxchg.org
CC: Suleiman Souhlal sulei...@google.com
CC: Tejun Heo t...@kernel.org
Acked-by: David Rientjes rient...@google.com
On Tue, 16 Oct 2012, Glauber Costa wrote:
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1182188..e24b388 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -344,6 +344,7 @@ struct mem_cgroup {
/* internal only representation about the status of kmem accounting. */
enum {
On Wed, 27 Jun 2012, Glauber Costa wrote:
fork bombs are a way badly behaved processes interfere with the rest of
the system. Here, I propose fork bomb stopping as a natural
consequence of the fact that the amount of kernel memory can be limited,
and each process uses 1 or 2 pages for the
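The bound implied above is plain arithmetic, sketched here in userspace C (the 4 KiB page size and the one-to-two pages-per-task figure are illustrative assumptions from the mail, not exact kernel accounting):

```c
#include <assert.h>

/* Illustrative page size; real kernels vary by architecture. */
#define PAGE_SIZE 4096UL

/*
 * If every task pins at least pages_per_task pages of kernel memory
 * (stack, task_struct, ...), a kmem limit of limit_bytes caps a fork
 * bomb at roughly this many tasks before further charges fail.
 */
static unsigned long max_tasks(unsigned long limit_bytes,
                               unsigned long pages_per_task)
{
    return limit_bytes / (pages_per_task * PAGE_SIZE);
}
```

With a 64 MiB kmem limit and two pages per task, the ceiling works out to 8192 tasks.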
On Wed, 27 Jun 2012, Glauber Costa wrote:
Nothing, but I also don't see how to prevent that.
You can test for current->flags & PF_KTHREAD following the check for
in_interrupt() and return true; it's what you were trying to do with the
check for !current->mm.
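The bypass test being suggested can be sketched with stub types (a hedged userspace sketch: the struct, the `in_interrupt` parameter, and the PF_KTHREAD value are illustrative stand-ins, not the kernel's definitions):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's flag; the value is illustrative. */
#define PF_KTHREAD 0x00200000

struct task_stub {
    unsigned long flags;
    void *mm;   /* NULL for most kernel threads */
};

/*
 * Skip kmem charging in interrupt context and for kernel threads.
 * Testing PF_KTHREAD (rather than only !mm) also covers aio helpers
 * that temporarily adopt a userspace mm: their flags still carry
 * PF_KTHREAD even while mm != NULL.
 */
static bool charge_bypass(const struct task_stub *task, bool in_irq)
{
    if (in_irq)
        return true;
    if (task->flags & PF_KTHREAD)
        return true;
    return false;
}
```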
am I right to believe
On Wed, 27 Jun 2012, Glauber Costa wrote:
@@ -2206,7 +2214,7 @@ static int mem_cgroup_do_charge(struct mem_cgroup
*memcg, gfp_t gfp_mask,
* unlikely to succeed so close to the limit, and we fall back
* to regular pages anyway in case of failure.
*/
- if (nr_pages == 1
On Tue, 26 Jun 2012, Glauber Costa wrote:
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index ccc1899..914ec07 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -61,6 +61,12 @@ extern long do_no_restart_syscall(struct
On Tue, 26 Jun 2012, Glauber Costa wrote:
+ * retries
+ */
+#define NR_PAGES_TO_RETRY 2
+
Should be 1 << PAGE_ALLOC_COSTLY_ORDER? Where does this number come from?
The changelog doesn't specify.
Hocko complained about that, and I changed. Where the number comes from, is
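The reviewer's suggestion amounts to deriving the retry threshold from the costly order rather than a bare constant. A hypothetical sketch (PAGE_ALLOC_COSTLY_ORDER was 3 in mainline at the time; the helper name is invented):

```c
#include <assert.h>

/* Mainline value at the time; stated here as an assumption. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/*
 * Allocations up to 1 << PAGE_ALLOC_COSTLY_ORDER pages are considered
 * cheap enough that the charge path may retry; larger requests fall
 * back without retrying.
 */
static int worth_retrying(unsigned long nr_pages)
{
    return nr_pages <= (1UL << PAGE_ALLOC_COSTLY_ORDER);
}
```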
On Tue, 26 Jun 2012, Glauber Costa wrote:
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8e601e8..9352d40 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -387,9 +387,12 @@ enum charge_type {
};
/* for encoding cft->private value on file */
-#define _MEM
On Tue, 26 Jun 2012, Glauber Costa wrote:
This flag is used to indicate to the callees that this allocation will be
serviced to the kernel. It is not supposed to be passed by the callers
of kmem_cache_alloc, but rather by the cache core itself.
Not sure what serviced to the kernel
On Tue, 26 Jun 2012, Glauber Costa wrote:
Right, because I'm sure that __GFP_KMEMCG will be used in additional
places outside of this patchset and it will be a shame if we have to
always add #ifdef's. I see no reason why we would care if __GFP_KMEMCG
was used when
On Mon, 25 Jun 2012, Glauber Costa wrote:
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 83e7ba9..22479eb 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -21,6 +21,7 @@
#define _LINUX_MEMCONTROL_H
#include <linux/cgroup.h>
#include
On Tue, 26 Jun 2012, Glauber Costa wrote:
Nope, have you checked the output of /sys/kernel/slab/.../order when
running slub? On my workstation 127 out of 316 caches have order-2 or
higher by default.
Well, this is still on the side of my argument, since this is still a majority
of
On Tue, 26 Jun 2012, Andrew Morton wrote:
mm, maybe. Kernel developers tend to look at code from the point of
view does it work as designed, is it clean, is it efficient, do
I understand it, etc. We often forget to step back and really
consider whether or not it should be merged at all.
On Tue, 26 Jun 2012, Glauber Costa wrote:
@@ -416,6 +423,43 @@ static inline void sock_update_memcg(struct sock *sk)
static inline void sock_release_memcg(struct sock *sk)
{
}
+
+#define mem_cgroup_kmem_on 0
+#define __mem_cgroup_new_kmem_page(a, b, c) false
+#define
On Mon, 25 Jun 2012, Glauber Costa wrote:
From: Suleiman Souhlal ssouh...@freebsd.org
Signed-off-by: Suleiman Souhlal sulei...@google.com
Signed-off-by: Glauber Costa glom...@parallels.com
Acked-by: Kamezawa Hiroyuki kamezawa.hir...@jp.fujitsu.com
Acked-by: David Rientjes rient...@google.com
On Mon, 25 Jun 2012, Glauber Costa wrote:
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9304db2..8e601e8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2158,8 +2158,16 @@ enum {
CHARGE_OOM_DIE, /* the current is killed because of OOM */
};
+/*
+ * We need
On Mon, 25 Jun 2012, Glauber Costa wrote:
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8e601e8..9352d40 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -387,9 +387,12 @@ enum charge_type {
};
/* for encoding cft->private value on file */
-#define _MEM (0)
On Mon, 25 Jun 2012, Glauber Costa wrote:
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9352d40..6f34b77 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -265,6 +265,10 @@ struct mem_cgroup {
};
/*
+ * the counter to account for kernel memory usage.
+
On Mon, 25 Jun 2012, Glauber Costa wrote:
This flag is used to indicate to the callees that this allocation will be
serviced to the kernel. It is not supposed to be passed by the callers
of kmem_cache_alloc, but rather by the cache core itself.
Not sure what "serviced to the kernel" means,
On Mon, 25 Jun 2012, Glauber Costa wrote:
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index ccc1899..914ec07 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -61,6 +61,12 @@ extern long do_no_restart_syscall(struct restart_block
On Mon, 25 Jun 2012, Andrew Morton wrote:
*/
bool use_hierarchy;
-bool kmem_accounted;
+/*
+ * bit0: accounted by this cgroup
+ * bit1: accounted by a parent.
+ */
+volatile unsigned long kmem_accounted;
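The two-bit encoding proposed above can be sketched as follows (the bit names are illustrative, not the kernel's; the point is that either bit makes the cgroup accounted):

```c
#include <assert.h>

/* bit 0: this cgroup itself enabled kmem accounting
 * bit 1: some parent in the hierarchy enabled it      */
#define KMEM_ACCOUNTED_THIS   (1UL << 0)
#define KMEM_ACCOUNTED_PARENT (1UL << 1)

/* A cgroup is accounted if it or any ancestor turned accounting on. */
static int kmem_is_accounted(unsigned long state)
{
    return (state & (KMEM_ACCOUNTED_THIS | KMEM_ACCOUNTED_PARENT)) != 0;
}
```

Keeping the two causes in separate bits lets the code distinguish "set here" from "inherited" when a limit is later removed, which a single bool cannot express.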
On Mon, 25 Jun 2012, Andrew Morton wrote:
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -287,7 +287,11 @@ struct mem_cgroup {
* Should the accounting and control be hierarchical, per subtree?
*/
bool use_hierarchy;
- bool kmem_accounted;
+ /*
+* bit0:
the test matches
= SYSFS, as all other state does.
Signed-off-by: Glauber Costa glom...@parallels.com
Acked-by: David Rientjes rient...@google.com
Can be merged now, there's no dependency on the rest of this patchset.
On Fri, 11 May 2012, Glauber Costa wrote:
A consistent name with slub saves us an accessor function.
In both caches, this field represents the same thing. We would
like to use it from the mem_cgroup code.
Signed-off-by: Glauber Costa glom...@parallels.com
Acked-by: David Rientjes rient...@google.com
On Fri, 11 May 2012, Glauber Costa wrote:
diff --git a/mm/slab.c b/mm/slab.c
index e901a36..91b9c13 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2118,6 +2118,7 @@ static void __kmem_cache_destroy(struct kmem_cache
*cachep)
kfree(l3);
}
}
+
On Fri, 27 Apr 2012, Frederic Weisbecker wrote:
No, because memory is represented by mm_struct, not task_struct, so you
must charge to p->mm->owner to allow for moving threads amongst memcgs
later for memory.move_charge_at_immigrate. You shouldn't be able to
charge two different memcgs
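The ownership chain being argued for can be sketched with stub types (all names are illustrative stand-ins for task_struct/mm_struct, not the kernel's definitions): every thread sharing an mm resolves to the same owner, so user-memory charges land in a single memcg regardless of which thread triggers them.

```c
#include <assert.h>

struct mm_stub;

struct task_stub {
    struct mm_stub *mm;
    int memcg_id;   /* stand-in for the task's own memcg */
};

struct mm_stub {
    struct task_stub *owner;    /* the mm's designated owner task */
};

/* User-memory charges follow p->mm->owner, not p itself. */
static int charge_target_memcg(const struct task_stub *p)
{
    return p->mm->owner->memcg_id;
}
```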
On Tue, 24 Apr 2012, Frederic Weisbecker wrote:
This seems horribly inconsistent with memcg charging of user memory since
it charges to p->mm->owner and you're charging to p. So a thread attached
to a memcg can charge user memory to one memcg while charging slab to
another memcg?
On Tue, 24 Apr 2012, Glauber Costa wrote:
I think memcg is not necessarily wrong. That is because threads in a process
share an address space, and you will eventually need to map a page to deliver
it to userspace. The mm struct points you to the owner of that.
But that is not necessarily
On Tue, 24 Apr 2012, Glauber Costa wrote:
Yes, for user memory, I see charging to p->mm->owner as allowing that
process to eventually move and be charged to a different memcg and there's
no way to do proper accounting if the charge is split amongst different
memcgs because of thread
On Sun, 22 Apr 2012, Glauber Costa wrote:
+/*
+ * Return the kmem_cache we're supposed to use for a slab allocation.
+ * If we are in interrupt context or otherwise have an allocation that
+ * can't fail, we return the original cache.
+ * Otherwise, we will try to use the current memcg's
On Wed, 29 Dec 2010, Li Zefan wrote:
I think it would be appropriate to use a shared nodemask with file scope
whenever you have cgroup_lock() to avoid the unnecessary kmalloc() even
with GFP_KERNEL. Cpusets are traditionally used on very large machines in
the first place, so there is
On Thu, 30 Dec 2010, Li Zefan wrote:
That's what we did for cpu masks :). See commit
2341d1b6598c7146d64a5050b53a72a5a819617f.
I made a patchset to remove on stack cpu masks.
What I meant is we don't have to allocate nodemasks in
cpuset_sprintf_memlist().
This is sufficient:
diff
On Sun, 26 Dec 2010, Ben Blum wrote:
I was going to make a macro like NODEMASK_STATIC, but it turned out that
can_attach() needed the to/from nodemasks to be shared among three
functions for the attaching, so I defined them globally without making a
macro for it.
I'm not sure what the
On Mon, 27 Dec 2010, Ben Blum wrote:
I'm not sure what the benefit of defining it as a macro would be. You're
defining these statically allocated nodemasks so they have file scope, I
hope (so they can be shared amongst the users who synchronize on
cgroup_lock() already).
In the
On Mon, 27 Dec 2010, Ben Blum wrote:
I think it would be appropriate to use a shared nodemask with file scope
whenever you have cgroup_lock() to avoid the unnecessary kmalloc() even
with GFP_KERNEL. Cpusets are traditionally used on very large machines in
the first place, so there is
On Fri, 24 Dec 2010, Ben Blum wrote:
I'll add a patch to my current series to do this. Should I leave alone
the other cases where an out-of-memory causes a silent failure?
(cpuset_change_nodemask, scan_for_empty_cpusets)
Both are protected by cgroup_lock, so I think it should be a pretty
On Thu, 23 Dec 2010, Ben Blum wrote:
On Thu, Dec 16, 2010 at 12:26:03AM -0800, Andrew Morton wrote:
Patches have gone a bit stale, sorry. Refactoring in
kernel/cgroup_freezer.c necessitates a refresh and retest please.
commit 53feb29767c29c877f9d47dcfe14211b5b0f7ebd changed a bunch of
On Fri, 24 Dec 2010, Ben Blum wrote:
Good point. How about pre-allocating the nodemasks in cpuset_can_attach,
and having a cpuset_cancel_attach function which can free them up?
They could be stored in the struct cpuset (protected by cgroup_mutex)
after being pre-allocated - but also only if
On Fri, 24 Dec 2010, Ben Blum wrote:
Oh, also, most (not all) times that NODEMASK_ALLOC is used in cpusets,
cgroup_mutex is also held. So how about just using static storage for
them? (There could be a new macro NODEMASK_ALLOC_STATIC, for use when
the caller can never race against itself.) As
system memory).
However, in dirty_bytes_handler()/dirty_ratio_handler() we actually set
the counterpart value to 0.
I think we should clarify the documentation.
Signed-off-by: Andrea Righi ari...@develer.com
Acked-by: David Rientjes rient...@google.com
Thanks for cc'ing me
On Tue, 23 Feb 2010, Vivek Goyal wrote:
Because you have modified dirtyable_memory() and made it per cgroup, I
think it automatically takes care of the cases of per cgroup dirty ratio,
I mentioned in my previous mail. So we will use system wide dirty ratio
to calculate the allowed
On Mon, 22 Feb 2010, Vivek Goyal wrote:
dirty_ratio is easy to configure. One system wide default value works for
all the newly created cgroups. For dirty_bytes, you shall have to
configure each individual cgroup with a specific value depending on
what is the upper limit of memory for
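The configuration asymmetry described here is just arithmetic: a single system-wide dirty_ratio scales with each cgroup's own limit, whereas dirty_bytes is an absolute number that must be re-derived per cgroup. A minimal sketch (helper name assumed):

```c
#include <assert.h>

/*
 * A ratio expresses the dirty threshold as a percentage of the
 * cgroup's memory limit, so one default value adapts to every cgroup;
 * an absolute dirty_bytes value would have to be recomputed whenever
 * the limit changes.
 */
static unsigned long dirty_threshold(unsigned long limit_bytes,
                                     unsigned int ratio_pct)
{
    return limit_bytes / 100 * ratio_pct;
}
```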
On Mon, 22 Feb 2010, Andrea Righi wrote:
Hmm... do we need a spinlock? You use unsigned long; then read-write
is always atomic if not read-modify-write.
I think I simply copy&pasted the memcg->swappiness case. But I agree,
read-write should be atomic.
We don't need memcg->reclaim_param_lock
On Sun, 21 Feb 2010, Andrea Righi wrote:
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 1f9b119..ba3fe0d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -25,6 +25,16 @@ struct page_cgroup;
struct page;
struct mm_struct;
+/*
On Sun, 21 Feb 2010, Andrea Righi wrote:
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0b19943..c9ff1cd 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -137,10 +137,11 @@ static struct prop_descriptor vm_dirties;
*/
static int calc_period_shift(void)
{
-
On Thu, 20 Aug 2009, Jonathan Corbet wrote:
On Thu, 20 Aug 2009 14:14:00 -0700
Andrew Morton a...@linux-foundation.org wrote:
Hang on. Isn't this why Dave just wrote and I just rush-merged
lib/flex_array.c?
Was that code evaluated for this application and judged unsuitable? If so,
On Tue, 27 Jan 2009, Evgeniy Polyakov wrote:
/dev/mem_notify is a great idea, but please do not limit existing
oom-killer in its ability to do the job and do not rely on application's
ability to send a SIGKILL which will not kill tasks in unkillable state
contrary to oom-killer.
You're
On Tue, 27 Jan 2009, Nikanth Karthikesan wrote:
As previously stated, I think the heuristic to penalize tasks for not
having an intersection with the set of allowable nodes of the oom
triggering task could be made slightly more severe. That's irrelevant to
your patch, though.
But
On Tue, 27 Jan 2009, Nikanth Karthikesan wrote:
I don't understand what you're arguing for here. Are you suggesting that
we should not prefer tasks that intersect the set of allowable nodes?
That makes no sense if the goal is to allow for future memory freeing.
No. Actually I am just
On Tue, 27 Jan 2009, Nikanth Karthikesan wrote:
That's certainly idealistic, but cannot be done in an inexpensive way that
would scale with the large systems that clients of cpusets typically use.
If we kill only the tasks for which cpuset_mems_allowed_intersects() is true
on the first
On Tue, 27 Jan 2009, Evgeniy Polyakov wrote:
There is no additional oom killer limitation imposed here, nor can the oom
killer kill a task hung in D state any better than userspace.
Well, oom-killer can, since it drops unkillable state from the process
mask, that may be not enough
On Tue, 27 Jan 2009, KOSAKI Motohiro wrote:
Confused.
As far as I know, people want the method of flexible cache treating.
but oom seems less flexible than userland notification.
Why do you think notification is bad?
There're a couple of proposals that have been discussed recently that
On Tue, 27 Jan 2009, KOSAKI Motohiro wrote:
Yup, indeed. :)
honestly, I talked about the same thing recently in the "lowmemory android driver
not needed?" thread.
Yeah, I proposed /dev/mem_notify being made as a client of cgroups there
in http://marc.info/?l=linux-kernel&m=123200623628685
How do
On Fri, 23 Jan 2009, Nikanth Karthikesan wrote:
Of course, because the oom killer must be aware that tasks in disjoint
cpusets are more likely than not to result in no memory freeing for
current's subsequent allocations.
Yes, the problem is cpuset does not track the tasks which has
On Fri, 23 Jan 2009, Nikanth Karthikesan wrote:
In other instances, It can actually also kill some innocent tasks unless the
administrator tunes oom_adj, say something like kvm which would have a huge
memory accounted, but might be from a different node altogether. Killing a
single vm is
On Thu, 22 Jan 2009, Nikanth Karthikesan wrote:
No, this is not specific to memcg or cpuset cases alone. The same needless
kills will take place even without memcg or cpuset when an administrator
specifies a light memory consumer to be killed before a heavy memory user.
But
it is up to
On Thu, 22 Jan 2009, Nikanth Karthikesan wrote:
You can't specify different behavior for an oom cgroup depending on what
type of oom it is, which is the problem with this proposal.
No. This does not disable any such special selection criteria which is used
without this controller.
I
On Thu, 22 Jan 2009, Evgeniy Polyakov wrote:
For example, if your task triggers an oom as the result of its exclusive
cpuset placement, the oom killer should prefer to kill a task within that
cpuset to allow for future memory freeing.
This it not true for all cases. What if you do need
On Thu, 22 Jan 2009, Nikanth Karthikesan wrote:
I think cpusets preference could be improved, not to depend on badness, with
something similar to what memcg does. With or without adding overhead of
tracking processes that has memory from a node.
We actually used to do that: we excluded
On Thu, 22 Jan 2009, Evgeniy Polyakov wrote:
In an exclusive cpuset, a task's memory is restricted to a set of mems
that the administrator has designated. If it is oom, the kernel must free
memory on those nodes or the next allocation will again trigger an oom
(leading to a needlessly
On Thu, 22 Jan 2009, Evgeniy Polyakov wrote:
Of course, because the oom killer must be aware that tasks in disjoint
cpusets are more likely than not to result in no memory freeing for
current's subsequent allocations.
And if we replace cpuset with cgroup (or anything else), nothing
On Fri, 23 Jan 2009, Evgeniy Polyakov wrote:
Only the fact that cpusets have _very_ special meaning in the oom-killer
codepath, while it should be just another tunable (if it should be
special code at all at the first place, why there were no objection and
argument, that tasks could have
On Fri, 23 Jan 2009, Evgeniy Polyakov wrote:
I showed the case when it does not work at all. And then found (in this
mail), that task (part) has to be present in the memory, which means it
will be locked, which in turns will not work with the system which
already locked its range allowed by
On Wed, 21 Jan 2009, Nikanth Karthikesan wrote:
This is a container group based approach to override the oom killer selection
without losing all the benefits of the current oom killer heuristics and
oom_adj interface.
It adds a tunable oom.victim to the oom cgroup. The oom killer will
On Mon, 10 Nov 2008, Andrea Righi wrote:
IIUC, Andrea Righi posted 2 patches around dirty_ratio. (added him to CC:)
in early October.
(1) patch for adding dirty_ratio_pcm. (1/10)
(2) per-memcg dirty ratio. (maybe
this..http://lkml.org/lkml/2008/9/12/121)
(1) should be
On Thu, 6 Nov 2008, KAMEZAWA Hiroyuki wrote:
Agreed. This patchset is admittedly from a different time when cpusets
was the only relevant extension that needed to be done.
BTW, what is the problem this patch wants to fix ?
1. avoid slow-down of memory allocation by triggering
On Wed, 5 Nov 2008, Andrew Morton wrote:
See, here's my problem: we have a pile of new code which fixes some
problem. But the problem seems to be fairly small - it only affects a
small number of sophisticated users and they already have workarounds
in place.
The workarounds, while
On Mon, 15 Oct 2007, Paul Jackson wrote:
My solution may be worse than that. Because set_cpus_allowed() will
fail if asked to set a non-overlapping cpumask, my solution could never
terminate. If asked to set a cpusets cpus to something that went off
line right then, this I'd guess this code
On Tue, 16 Oct 2007, Paul Jackson wrote:
David wrote:
Why can't you just add a helper function to sched.c:
void set_hotcpus_allowed(struct task_struct *task,
cpumask_t cpumask)
{
mutex_lock(&sched_hotcpu_mutex);
On Wed, 17 Oct 2007, KAMEZAWA Hiroyuki wrote:
+static ssize_t mem_force_empty_read(struct cgroup *cont,
+ struct cftype *cft,
+ struct file *file, char __user *userbuf,
+ size_t nbytes, loff_t *ppos)
+{
+
On Mon, 15 Oct 2007, Paul Jackson wrote:
--- 2.6.23-mm1.orig/kernel/cpuset.c 2007-10-14 22:24:56.268309633 -0700
+++ 2.6.23-mm1/kernel/cpuset.c 2007-10-14 22:34:52.645364388 -0700
@@ -677,6 +677,64 @@ done:
}
/*
+ * update_cgroup_cpus_allowed(cont, cpus)
+ *
+ * Keep looping
On Thu, 11 Oct 2007, Paul Jackson wrote:
Hmmm ... I hadn't noticed that sched_hotcpu_mutex before.
I wonder what it is guarding? As best as I can guess, it seems, at
least in part, to be keeping the following two items consistent:
1) cpu_online_map
Yes, it protects against cpu hot-plug
On Wed, 10 Oct 2007, Paul Menage wrote:
On 10/6/07, David Rientjes [EMAIL PROTECTED] wrote:
It can race with sched_setaffinity(). It has to give up tasklist_lock as
well to call set_cpus_allowed() and can race
cpus_allowed = cpuset_cpus_allowed(p);
cpus_and(new_mask
On Sat, 6 Oct 2007, Paul Jackson wrote:
struct cgroup_iter it;
struct task_struct *p, **tasks;
int i = 0;
cgroup_iter_start(cs->css.cgroup, &it);
while ((p = cgroup_iter_next(cs->css.cgroup, &it))) {
get_task_struct(p);
tasks[i++] = p;
On Sat, 6 Oct 2007, Paul Menage wrote:
The getting and putting of the tasks will prevent them from exiting or
being deallocated prematurely. But this is also a critical section that
will need to be protected by some mutex so it doesn't race with other
set_cpus_allowed().
Is that
On Sat, 6 Oct 2007, Paul Jackson wrote:
This isn't working for me.
The key kernel routine for updating a tasks cpus_allowed
cannot be called while holding a spinlock.
But the above loop holds a spinlock, css_set_lock, between
the cgroup_iter_start and the cgroup_iter_end.
I end up
On Wed, 26 Sep 2007, Balbir Singh wrote:
Yes, I prefer 0 as well and had that in a series in the Lost World
of my earlier memory/RSS controller patches. I feel now that 0 is
a bit confusing, we don't use 0 to mean unlimited, unless we
treat the memory.limit_in_bytes value as boolean. 0 is
for a particular container. I think 0 would be suitable
since its use doesn't make any logical sense (you're not going to be
assigning a set of tasks to a resource void of pages).
Signed-off-by: David Rientjes [EMAIL PROTECTED]
---
Documentation/controllers/memory.txt |5 -
kernel
On Tue, 25 Sep 2007, Paul Menage wrote:
If I echo -n 8191 > memory.limit_in_bytes, I'm still only going to be able
to charge one page on my x86_64. And then my program's malloc(5000) is
going to fail, which leads to the inevitable head scratching.
This is a very unrealistic argument.
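The rounding behind the 8191-byte example above is straightforward (a hedged sketch assuming a 4 KiB page; the helper name is invented): the effective limit is the written byte value truncated to whole pages, so 8191 bytes yields a one-page limit and the malloc(5000) in the example needs a second page it cannot charge.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/* Effective limit in pages for a byte value written to the file. */
static unsigned long limit_in_pages(unsigned long bytes)
{
    return bytes / PAGE_SIZE;
}
```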
On Tue, 25 Sep 2007, Paul Menage wrote:
nit pick, should be memory.limit_in_bytes
Can we reconsider this? I do think that plain limit would enable you
to have a more consistent API across all resource counters users.
Why aren't limits expressed in kilobytes? All architectures have
On Tue, 25 Sep 2007, Paul Menage wrote:
If you're fine with rounding up to the nearest page, then what's the point
of exposing it as a number of bytes? You'll never get a granularity
finer than a kilobyte.
API != implementation.
Having the limit expressed and configurable in bytes
On Tue, 25 Sep 2007, Paul Menage wrote:
It doesn't matter. When I cat my cgroup's memory.limit (or
memory.limit_in_bytes), I should see the total number of bytes that my
applications are allowed. That's not an unrealistic expectation of a
system that is expressly designed to control my