Hi Andrea and Vivek,
From: Andrea Righi righi.and...@gmail.com
Subject: Re: [PATCH] io-controller: Add io group reference handling for request
Date: Mon, 18 May 2009 16:39:23 +0200
On Mon, May 18, 2009 at 10:01:14AM -0400, Vivek Goyal wrote:
On Sun, May 17, 2009 at 12:26:06PM +0200, Andrea
I don't see any mention in the changelog of the point brought up by Ingo:
http://lkml.org/lkml/2009/4/10/105
You also haven't responded to my comment about holding mmap
semaphore:
http://lkml.indiana.edu/hypermail/linux/kernel/0904.1/01417.html
Also, please consider combining with Ingo's point
Quoting Alexey Dobriyan (adobri...@gmail.com):
Introduction
Checkpoint/restart (C/R from now on) allows dumping a group of processes to disk
for various reasons, like saving process state in case of box failure or
restoring a group of processes on another or the same machine later.
Quoting suka...@linux.vnet.ibm.com (suka...@linux.vnet.ibm.com):
From: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
ns_exec knows the pid of the container-init in its own namespace (this
is usually the global pid since we normally run ns_exec in init-pid-ns).
If ns_exec writes out this pid
Quoting Alexey Dobriyan (adobri...@gmail.com):
Move supplementary groups implementation to kernel/groups.c .
kernel/sys.c already accumulated quite a few random stuff.
Do strictly copy/paste + add required headers to compile.
Compile-tested on many configs and archs.
Signed-off-by: Alexey
When restarting tasks, we want to be able to change xuid and
xgid in a struct cred, and do so with security checks. Break
the core functionality of set{fs,res}{u,g}id into cred_setX
which performs the access checks based on current_cred(),
but performs the requested change on a passed-in cred.
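A minimal sketch of the split described above, written as a userspace toy rather than kernel code. The names sketch_cred, sketch_uid_allowed and sketch_cred_setresuid are hypothetical, and the permission rule is a simplified setresuid-style check that I am assuming for illustration. The point is only that the check reads one cred (standing in for current_cred()) while the change lands on a different, passed-in cred.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sentinel meaning "leave this id unchanged" (hypothetical, mirrors -1). */
#define SKETCH_KEEP ((unsigned int)-1)

/* Hypothetical, heavily reduced stand-in for struct cred. */
struct sketch_cred {
	unsigned int uid, euid, suid;
	bool cap_setuid;	/* stands in for holding CAP_SETUID */
};

/* Simplified rule: an id is allowed if unchanged, if the checker is
 * privileged, or if it matches one of the checker's own ids. */
static bool sketch_uid_allowed(const struct sketch_cred *checker,
			       unsigned int id)
{
	return id == SKETCH_KEEP || checker->cap_setuid ||
	       id == checker->uid || id == checker->euid ||
	       id == checker->suid;
}

/* Check against 'checker', apply to 'target'. Returns 0 on success,
 * -1 if the checker may not make the change (target is left untouched). */
static int sketch_cred_setresuid(const struct sketch_cred *checker,
				 struct sketch_cred *target,
				 unsigned int ruid, unsigned int euid,
				 unsigned int suid)
{
	assert(checker != NULL && target != NULL);

	if (!sketch_uid_allowed(checker, ruid) ||
	    !sketch_uid_allowed(checker, euid) ||
	    !sketch_uid_allowed(checker, suid))
		return -1;

	if (ruid != SKETCH_KEEP)
		target->uid = ruid;
	if (euid != SKETCH_KEEP)
		target->euid = euid;
	if (suid != SKETCH_KEEP)
		target->suid = suid;
	return 0;
}
```

With this shape a restart path could build a fresh cred and fill it in through the same checks the syscalls use, which is the reuse the patch description is after.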
Move supplementary groups implementation to kernel/groups.c .
kernel/sys.c already accumulated quite a few random stuff.
Do strictly copy/paste + add required headers to compile.
Compile-tested on many configs and archs.
Signed-off-by: Alexey Dobriyan adobri...@gmail.com
Acked-by: Serge Hallyn
Break out the core function which checks privilege and (if
allowed) creates a new user namespace, with the passed-in
creating user_struct. Note that a user_namespace, unlike
other namespace pointers, is not stored in the nsproxy.
Rather it is purely a property of user_structs.
This will let us
Following is the next version of the credentials c/r patchset,
on top of the c/r patchset at
git://git.ncl.cs.columbia.edu/pub/git/linux-cr.git
It implements checkpoint and restart of user, user namespaces,
groups, supplementary groups, and struct cred.
There is a question as to what to do about
An application checkpoint image will store capability sets
(and the bounding set) as __u64s. Define checkpoint and
restart functions to translate between those and kernel_cap_t's.
Define a common function do_capset_tocred() which applies capability
set changes to a passed-in struct cred.
The
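A rough illustration of the __u64 translation mentioned above, as a userspace sketch. It assumes the kernel-side capability set is two 32-bit words, low word first; struct sketch_cap and both helper names are hypothetical stand-ins, not the functions from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the capability set is held as two 32-bit words. */
#define SKETCH_CAP_U32S 2

/* Hypothetical stand-in for kernel_cap_t. */
struct sketch_cap {
	uint32_t cap[SKETCH_CAP_U32S];
};

/* Checkpoint direction: pack the two words into one 64-bit value,
 * word 0 in the low half and word 1 in the high half. */
static uint64_t sketch_cap_to_u64(const struct sketch_cap *c)
{
	assert(c != NULL);
	return ((uint64_t)c->cap[1] << 32) | (uint64_t)c->cap[0];
}

/* Restart direction: split the 64-bit image value back into words. */
static void sketch_u64_to_cap(uint64_t v, struct sketch_cap *c)
{
	assert(c != NULL);
	c->cap[0] = (uint32_t)(v & 0xffffffffu);
	c->cap[1] = (uint32_t)(v >> 32);
}
```

Storing a fixed-width value in the image keeps the on-disk format independent of how many words the in-kernel type happens to use.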
This patch adds the checkpointing and restart of credentials
(uids, gids, and capabilities) to Oren's c/r patchset (on top
of v14). It goes to great pains to re-use (and define when
needed) common helpers, in order to make sure that as security
code is modified, the cr code will be updated. Some
Restore a file's f_cred. This is set to the cred of the task doing
the open, so often it will be the same as that of the restarted task.
Signed-off-by: Serge E. Hallyn se...@us.ibm.com
---
checkpoint/files.c | 16 ++--
include/linux/checkpoint_hdr.h |2 +-
2 files
Create /proc/userns, which prints out all user namespaces. It
prints the address of the user_ns itself, the uid and userns address
of the user who created it, and the reference count.
Signed-off-by: Serge E. Hallyn se...@us.ibm.com
---
checkpoint/process.c |2 -
Signed-off-by: Serge E. Hallyn se...@us.ibm.com
---
kernel/groups.c |1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/kernel/groups.c b/kernel/groups.c
index 1b95b2f..14ebc6a 100644
--- a/kernel/groups.c
+++ b/kernel/groups.c
@@ -1,6 +1,7 @@
/*
* Supplementary group IDs
On a ckpt-v15-dev kernel, if I do:
git clone git://git.sr71.net/~hallyn/cr_tests.git
cd cr_tests; make; make install
mkdir /cgroup
mount -t cgroup -o freezer cgroup /cgroup
mkdir /cgroup/1
cd userns
sh run_userns.sh
cd ../fileio
On Tue, May 26, 2009 at 09:48:19AM -0500, Serge E. Hallyn wrote:
Quoting Alexey Dobriyan (adobri...@gmail.com):
Move supplementary groups implementation to kernel/groups.c .
kernel/sys.c already accumulated quite a few random stuff.
Do strictly copy/paste + add required headers to
Serge E. Hallyn se...@us.ibm.com writes:
Unable to handle kernel pointer dereference at virtual kernel address
fffd132b4000
Oops: 0038 [#4] SMP
CPU: 0 Tainted: G D 2.6.30-rc3-00080-g2b1009c #261
Process ns_exec (pid: 3842, task: 1fdf9d50, ksp: 12d83d90)
Krnl PSW
On Tue, May 26, 2009 at 08:16:44AM -0500, Serge E. Hallyn wrote:
Quoting Alexey Dobriyan (adobri...@gmail.com):
Introduction
Checkpoint/restart (C/R from now on) allows dumping a group of processes to disk
for various reasons like saving process state in case of box failure or
[This is the fix for the bug I was trying to nail down earlier today]
If more than one restarted task are to share a checkpointed nsproxy,
then we must inc the count on the nsproxy for each new task, as
switch_task_namespaces() does not do that for us.
Signed-off-by: Serge E. Hallyn
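The reference-counting point above can be shown with a small userspace toy. All types and function names here are hypothetical; the only behaviour taken from the message is that the switch helper installs the pointer without taking a reference, so the restart path has to take one per task. The toy ignores what happens to the task's previous nsproxy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced stand-ins for struct nsproxy and task_struct. */
struct sketch_nsproxy {
	int count;	/* reference count */
};

struct sketch_task {
	struct sketch_nsproxy *nsproxy;
};

/* Models the behaviour described in the message: the pointer is
 * installed, but no reference is taken on the caller's behalf. */
static void sketch_switch_namespaces(struct sketch_task *t,
				     struct sketch_nsproxy *ns)
{
	t->nsproxy = ns;
}

/* Restart-side helper: take one reference for this task, then switch.
 * Calling it once per restarted task keeps the count in step with the
 * number of tasks sharing the nsproxy. */
static void sketch_restore_nsproxy(struct sketch_task *t,
				   struct sketch_nsproxy *ns)
{
	assert(t != NULL && ns != NULL);
	ns->count++;
	sketch_switch_namespaces(t, ns);
}
```

Without the increment, two tasks sharing one restored nsproxy would each drop a reference on exit while only one had been taken, which would free it too early.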
On Fri, May 22, 2009 at 1:25 AM, KAMEZAWA Hiroyuki
kamezawa.hir...@jp.fujitsu.com wrote:
Hm, shouldn't we allow noprefix to be effective only against cpuset?
I think it's just for backward-compatibility of cpuset.
(I don't like the option at all.)
Yes, exposing the noprefix option externally
Make cfq hierarchical.
Signed-off-by: Nauman Rafique nau...@google.com
Signed-off-by: Fabio Checconi fa...@gandalf.sssup.it
Signed-off-by: Paolo Valente paolo.vale...@unimore.it
Signed-off-by: Aristeu Rozanski a...@redhat.com
Signed-off-by: Vivek Goyal vgo...@redhat.com
---
block/Kconfig.iosched
o So far noop, deadline and AS had one common structure called *_data which
contained both the queue information where requests are queued and also
common data used for scheduling. This patch breaks down this common
structure in two parts, *_queue and *_data. This is along the lines of
cfq
o Currently a request queue has got fixed number of request descriptors for
sync and async requests. Once the request descriptors are consumed, new
processes are put to sleep and they effectively become serialized. Because
sync and async queues are separate, async requests don't impact sync
Quoting Alexey Dobriyan (adobri...@gmail.com):
On Tue, May 26, 2009 at 08:16:44AM -0500, Serge E. Hallyn wrote:
Honestly, I have great respect for your coding abilities. And if 'voices
from on high' tell us to base upon your code, I'd be fine with that, I
have no real problems with what I
Quoting Serge E. Hallyn (se...@us.ibm.com):
Signed-off-by: Serge E. Hallyn se...@us.ibm.com
---
kernel/groups.c |1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/kernel/groups.c b/kernel/groups.c
index 1b95b2f..14ebc6a 100644
--- a/kernel/groups.c
+++
Quoting Alexey Dobriyan (adobri...@gmail.com):
On Tue, May 26, 2009 at 09:48:19AM -0500, Serge E. Hallyn wrote:
Quoting Alexey Dobriyan (adobri...@gmail.com):
Move supplementary groups implementation to kernel/groups.c .
kernel/sys.c already accumulated quite a few random stuff.
Do
On Tue, 26 May 2009 04:36:18 -0700
Matt Helsley matth...@us.ibm.com wrote:
I don't see any mention in the changelog of the point brought up by Ingo:
http://lkml.org/lkml/2009/4/10/105
Nor of Eric's comments.
Alexey, pleeeze don't do this. We (read: I) heavily depend upon patch
submitters
This patch changes noop to use queue scheduling code from elevator layer.
One can go back to old noop by deselecting CONFIG_IOSCHED_NOOP_HIER.
Signed-off-by: Nauman Rafique nau...@google.com
Signed-off-by: Vivek Goyal vgo...@redhat.com
---
block/Kconfig.iosched | 11 +++
This patch changes deadline to use queue scheduling code from elevator layer.
One can go back to old deadline by deselecting CONFIG_IOSCHED_DEADLINE_HIER.
Signed-off-by: Nauman Rafique nau...@google.com
Signed-off-by: Vivek Goyal vgo...@redhat.com
---
block/Kconfig.iosched| 11 +++
o This patch exports some statistics through cgroup interface. Two of the
statistics currently exported are actual disk time assigned to the cgroup
and actual number of sectors dispatched to disk on behalf of this cgroup.
Signed-off-by: Vivek Goyal vgo...@redhat.com
---
block/elevator-fq.c |
On Fri, 22 May 2009 08:55:10 +0400
Alexey Dobriyan adobri...@gmail.com wrote:
This is a mess.
heh. I'm going to treat it as Ingo's mess :)
___
Containers mailing list
contain...@lists.linux-foundation.org
Hi All,
Here is the V3 of the IO controller patches generated on top of 2.6.30-rc7.
Previous versions of the patches were posted here.
http://lkml.org/lkml/2009/3/11/486
http://lkml.org/lkml/2009/5/5/275
This patchset is still work in progress but I want to keep on getting the
snapshot of my
This patch enables per-cgroup per-device weight and ioprio_class handling.
A new cgroup interface policy is introduced. You can make use of this
file to configure weight and ioprio_class for each device in a given cgroup.
The original weight and ioprio_class files are still available. If you
don't
o There are situations where a queue gets expired very soon and it looks
as if time slice used by that queue is zero. For example, if an async
queue dispatches a bunch of requests and queue is expired before first
request completes. Another example is where a queue is expired as soon
as
o A debug patch which does wait for next IO from async queue once it
becomes empty.
o For async writes, traffic seen by IO scheduler is not in proportion to
the weight of the cgroup the task/page belongs to. So if there are two processes
doing heavy writeouts in two cgroups with weights 1000
Elevator layer now has support for hierarchical fair queuing. cfq has
been migrated to make use of it and now it is time to do groundwork for
noop, deadline and AS.
noop, deadline and AS don't maintain separate queues for different processes.
There is only one single queue. Effectively one can
o Little debugging aid for hierarchical IO scheduling.
o Enabled under CONFIG_DEBUG_GROUP_IOSCHED
o Currently it outputs more debug messages in blktrace output which helps
a great deal in debugging in hierarchical setup. It also creates additional
cgroup interfaces io.disk_queue and
o When a sync queue expires, in many cases it might be empty and then
it will be deleted from the active tree. This will lead to a scenario
where out of two competing queues, only one is on the tree and when a
new queue is selected, vtime jump takes place and we don't see services
provided
o Documentation for io-controller.
Signed-off-by: Vivek Goyal vgo...@redhat.com
---
Documentation/block/00-INDEX |2 +
Documentation/block/io-controller.txt | 326 +
2 files changed, 328 insertions(+), 0 deletions(-)
create mode 100644
This patch changes anticipatory scheduler to use queue scheduling code from
elevator layer. One can go back to old as by deselecting
CONFIG_IOSCHED_AS_HIER.
TODO/Issues
===
- AS anticipation logic does not seem to be sufficient to provide BW difference
if two dd are going in two
o blkio_cgroup patches from Ryo to track async bios.
o Fernando is also working on another IO tracking mechanism. We are not
particular about any IO tracking mechanism. This patchset can make use
of any mechanism which makes it to upstream. For the time being making
use of Ryo's posting.
o So far we were assuming that a bio/rq belongs to the task who is submitting
it. It did not hold good in case of async writes. This patch makes use of
blkio_cgroup patches to attribute the async writes to the right group instead
of the task submitting the bio.
o For sync requests, we continue to
o In the original BFQ patch once a cgroup is being deleted, it will clean
up the associated io groups immediately and if there are any active io
queues with that group, these will be moved to root group. This movement
of queues is not good from fairness perspective as one can then create
a
This patch enables hierarchical fair queuing in common layer. It is
controlled by config option CONFIG_GROUP_IOSCHED.
Signed-off-by: Nauman Rafique nau...@google.com
Signed-off-by: Fabio Checconi fa...@gandalf.sssup.it
Signed-off-by: Paolo Valente paolo.vale...@unimore.it
Signed-off-by: Aristeu
This patch changes cfq to use fair queuing code from elevator layer.
Signed-off-by: Nauman Rafique nau...@google.com
Signed-off-by: Fabio Checconi fa...@gandalf.sssup.it
Signed-off-by: Paolo Valente paolo.vale...@unimore.it
Signed-off-by: Gui Jianfeng guijianf...@cn.fujitsu.com
Signed-off-by:
Quoting Alexey Dobriyan (adobri...@gmail.com):
And since you guys showed that just idea of in-kernel checkpointing is not
rejected outright, it doesn't mean that you can drag every single idea too.
Can you rephrase here? I have no idea what you mean by 'drag every single
idea'
Because
Paul Menage wrote:
On Fri, May 22, 2009 at 1:25 AM, KAMEZAWA Hiroyuki
kamezawa.hir...@jp.fujitsu.com wrote:
Hm, shouldn't we allow noprefix to be effective only against cpuset?
I think it's just for backward-compatibility of cpuset.
(I don't like the option at all.)
Yes, exposing the
On Wed, 27 May 2009 09:07:31 +0800
Li Zefan l...@cn.fujitsu.com wrote:
Paul Menage wrote:
On Fri, May 22, 2009 at 1:25 AM, KAMEZAWA Hiroyuki
kamezawa.hir...@jp.fujitsu.com wrote:
Hm, shouldn't we allow noprefix to be effective only against cpuset?
I think it's just for
Serge E. Hallyn wrote:
Following is the next version of the credentials c/r patchset,
on top of the c/r patchset at
git://git.ncl.cs.columbia.edu/pub/git/linux-cr.git
It implements checkpoint and restart of user, user namespaces,
groups, supplementary groups, and struct cred.
There is a
KAMEZAWA Hiroyuki wrote:
On Wed, 27 May 2009 09:07:31 +0800
Li Zefan l...@cn.fujitsu.com wrote:
Paul Menage wrote:
On Fri, May 22, 2009 at 1:25 AM, KAMEZAWA Hiroyuki
kamezawa.hir...@jp.fujitsu.com wrote:
Hm, shouldn't we allow noprefix to be effective only against cpuset?
I think it's