[Devel] Re: [PATCH] namespaces: fix exit race by splitting exit

2007-01-26 Thread Oleg Nesterov
On 01/26, Daniel Hokka Zakrisson wrote: Serge E. Hallyn wrote: Ok, could you verify that the following patch at least solves the oopsing? (I can't reproduce the oops with Daniel's test prog) thanks, -serge Indeed, this patch solves the oopsing, but so did the last one. I think I

[Devel] Re: [RFC] kernel/pid.c pid allocation wierdness

2007-03-14 Thread Oleg Nesterov
On 03/14, Eric W. Biederman wrote: Pavel Emelianov [EMAIL PROTECTED] writes: Hi. I'm looking at how alloc_pid() works and can't understand one (simple/stupid) thing. It first kmem_cache_alloc()-s a struct pid, then calls alloc_pidmap() and at the end it takes a global pidmap_lock()

[Devel] Re: Getting the new RxRPC patches upstream

2007-04-20 Thread Oleg Nesterov
On 04/20, Andrew Morton wrote: On Fri, 20 Apr 2007 11:41:46 +0100 David Howells [EMAIL PROTECTED] wrote: There are only two non-net patches that AF_RXRPC depends on: (1) The key facility changes. That's all my code anyway, and shouldn't be a problem to merge unless someone

[Devel] Re: Getting the new RxRPC patches upstream

2007-04-23 Thread Oleg Nesterov
On 04/23, David Howells wrote: We only care when del_timer() returns true. In that case, if the timer function still runs (possible for single-threaded wqs), it has already passed __queue_work(). Why do you assume that? If del_timer() returns true, the timer was pending. This means it

[Devel] Re: [PATCH] kthread: Spontaneous exit support

2007-04-23 Thread Oleg Nesterov
On 04/23, Eric W. Biederman wrote: So I propose we add a kthread_orphan as a basic primitive to decrement the count on the task_struct if we want a kthread to simply exit after it has done some work. And as a helper function we can have a kthread_run_orphan. Speaking about helpers, could

[Devel] Re: Getting the new RxRPC patches upstream

2007-04-24 Thread Oleg Nesterov
On 04/24, David Howells wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: The current code uses del_timer_sync(). It will also return 0. However, it will spin waiting for timer-function() to complete. So we are just wasting CPU. That's my objection to using cancel_delayed_work

[Devel] Re: Getting the new RxRPC patches upstream

2007-04-24 Thread Oleg Nesterov
On 04/24, David Howells wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: Great. I'll send the s/del_timer_sync/del_timer/ patch. I didn't say I necessarily agreed that this was a good idea. I just meant that I agree that it will waste CPU. You must still audit all uses

[Devel] Re: Getting the new RxRPC patches upstream

2007-04-24 Thread Oleg Nesterov
On 04/24, David Howells wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: Sure, I'll grep for cancel_delayed_work(). But unless I missed something, this change should be completely transparent for all users. Otherwise, it is buggy. I guess you will have to make sure

[Devel] Re: Getting the new RxRPC patches upstream

2007-04-25 Thread Oleg Nesterov
On 04/25, David Howells wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: Yes sure. Note that this is documented: /* * Kill off a pending schedule_delayed_work(). Note that the work callback * function may still be running on return from cancel_delayed_work(). Run

[Devel] Re: - merge-sys_clone-sys_unshare-nsproxy-and-namespace.patch removed from -mm tree

2007-06-17 Thread Oleg Nesterov
On 06/16, Herbert Poetzl wrote: On Tue, May 08, 2007 at 07:45:35PM -0700, [EMAIL PROTECTED] wrote: The patch titled Merge sys_clone()/sys_unshare() nsproxy and namespace handling has been removed from the -mm tree. Its filename was

[Devel] Re: - merge-sys_clone-sys_unshare-nsproxy-and-namespace.patch removed from -mm tree

2007-06-17 Thread Oleg Nesterov
On 06/17, Oleg Nesterov wrote: Let's look at copy_namespaces(), it does the same get_xxx() in advance, but -EPERM forgets to do put_nsproxy(), so we definitely have a leak in copy_process(). Ugh, I am sorry, EPERM does put_nsproxy(). Still I can't understand why copy_namespaces() does

[Devel] Re: - merge-sys_clone-sys_unshare-nsproxy-and-namespace.patch removed from -mm tree

2007-06-17 Thread Oleg Nesterov
On 06/17, Oleg Nesterov wrote: However, nsproxy's code is full of strange unneeded get/put calls, for example: struct uts_namespace *copy_utsname(int flags, struct uts_namespace *old_ns) { struct uts_namespace *new_ns; BUG_ON(!old_ns

[Devel] Re: - merge-sys_clone-sys_unshare-nsproxy-and-namespace.patch removed from -mm tree

2007-06-17 Thread Oleg Nesterov
On 06/17, Herbert Poetzl wrote: On Sun, Jun 17, 2007 at 06:38:30PM +0400, Oleg Nesterov wrote: At first glance, sys_unshare() drops the reference to the old nsproxy, okay, the 'current' task has an nsproxy, and keeps a reference to that (let's assume it is the only task using

[Devel] Re: - merge-sys_clone-sys_unshare-nsproxy-and-namespace.patch removed from -mm tree

2007-06-18 Thread Oleg Nesterov
On 06/18, Cedric Le Goater wrote: Oleg Nesterov wrote: On 06/17, Oleg Nesterov wrote: Let's look at copy_namespaces(), it does the same get_xxx() in advance, but -EPERM forgets to do put_nsproxy(), so we definitely have a leak in copy_process(). Ugh, I am sorry, EPERM does

[Devel] [PATCH] create_new_namespaces: fix improper return of NULL

2007-06-19 Thread Oleg Nesterov
Untested. dup_mnt_ns() and clone_uts_ns() return NULL on failure. This is wrong, create_new_namespaces() uses ERR_PTR() to catch an error. This means that the subsequent create_new_namespaces() will hit BUG_ON() in copy_mnt_ns() or copy_utsname(). Signed-off-by: Oleg Nesterov [EMAIL PROTECTED
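To illustrate why mixing the two error conventions is a problem, here is a minimal user-space sketch. It is not kernel code: ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented below as simplified stand-ins for the kernel helpers, and struct toy_ns, clone_ns_null(), clone_ns_errptr() and caller_check() are hypothetical names made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error codes are encoded in the top MAX_ERRNO addresses. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct toy_ns { int id; };  /* hypothetical namespace object */

/* Hypothetical allocator that reports failure as NULL. */
static struct toy_ns *clone_ns_null(int fail)
{
    static struct toy_ns ns = { 1 };
    return fail ? NULL : &ns;
}

/* Same allocator, but failure is reported as an encoded errno. */
static struct toy_ns *clone_ns_errptr(int fail)
{
    static struct toy_ns ns = { 1 };
    return fail ? ERR_PTR(-ENOMEM) : &ns;
}

/* A caller that only tests IS_ERR(): 0 means "looks fine". */
static long caller_check(struct toy_ns *ns)
{
    if (IS_ERR(ns))
        return PTR_ERR(ns);
    return 0;  /* a NULL failure is not caught here */
}
```

With these definitions, caller_check(clone_ns_null(1)) returns 0 even though the allocation failed, so the NULL travels on to later code, whereas caller_check(clone_ns_errptr(1)) returns -ENOMEM.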

[Devel] Re: [PATCH] create_new_namespaces: fix improper return of NULL

2007-06-19 Thread Oleg Nesterov
On 06/19, Cedric Le Goater wrote: Oleg Nesterov wrote: Untested. dup_mnt_ns() and clone_uts_ns() return NULL on failure. This is wrong, create_new_namespaces() uses ERR_PTR() to catch an error. This means that the subsequent create_new_namespaces() will hit BUG_ON() in copy_mnt_ns

Re: [Devel] breakfast at ols?

2007-06-21 Thread Oleg Nesterov
Hi All, On 06/20, Cedric Le Goater wrote: Kir Kolyshkin wrote: Serge E. Hallyn wrote: Last year we all met for breakfast at OLS. Now we've pretty much all already met so maybe it's less exciting, but do people (who will be at OLS) care to meet for breakfast on the Thursday or

Re: [Devel] breakfast at ols?

2007-06-21 Thread Oleg Nesterov
On 06/21, Oleg Nesterov wrote: On 06/20, Cedric Le Goater wrote: Oleg, shall you come ? With great pleasure, thanks! Guys, I am stupid, sorry! I thought you were talking about the kernel summit in September :) (I was surprised that you are looking so far in advance) No, I won't

[Devel] Re: [PATCH 5/5] Move alloc_pid call to copy_process

2007-07-15 Thread Oleg Nesterov
Sukadev Bhattiprolu wrote: --- lx26-22-rc6-mm1.orig/kernel/pid.c 2007-07-13 18:23:55.0 -0700 +++ lx26-22-rc6-mm1/kernel/pid.c 2007-07-13 18:23:55.0 -0700 @@ -206,6 +206,10 @@ fastcall void free_pid(struct pid *pid) /* We can be called with

[Devel] Re: [PATCH 5/5] Move alloc_pid call to copy_process

2007-07-17 Thread Oleg Nesterov
On 07/16, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | | Could you please give more details why we need this change? Well, with multiple pid namespaces, we may need to allocate a new 'struct pid_namespace' if the CLONE_NEWPID flag is specified. And as a part

[Devel] Re: [PATCH 3/5] Use task_pid() to find leader's pid

2007-07-17 Thread Oleg Nesterov
On 07/16, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | | Stupid question: why do we need to put the pid namespace into the struct | pid? Isn't it better if the user of the struct pid should know its ns? | For example, if /proc does put_pid(), that pid should be from

[Devel] Re: [PATCH 1/15] Move exit_task_namespaces()

2007-07-27 Thread Oleg Nesterov
On 07/27, Pavel Emelyanov wrote: Oleg Nesterov wrote: Perhaps, we can do something like the patch below. Roland, what do you think? We can check PF_EXITING instead of ->exit_state while choosing the new Heh :) I've come to the same conclusion and now I'm checking for it. But my patch

[Devel] Re: [PATCH 11/15] Signal semantics

2007-07-27 Thread Oleg Nesterov
Damn. I don't have time to read these patches today (will try tomorrow), but when I glanced at this patch yesterday I had some suspicions... On 07/26, Pavel Emelyanov wrote: +++ linux-2.6.23-rc1-mm1-7/kernel/signal.c 2007-07-26 16:36:37.0 +0400 @@ -323,6 +325,9 @@ static int

[Devel] Re: [PATCH 9/15] Move alloc_pid() after the namespace is cloned

2007-07-27 Thread Oleg Nesterov
On 07/26, Pavel Emelyanov wrote: This is a fix for Sukadev's patch that moved the alloc_pid() call from do_fork() into copy_process(). ... and this patch changes almost every line from Sukadev's patch. Sorry gents, but isn't it better to ask Andrew to drop that patch (which is quite useless by

[Devel] Re: [PATCH 5/15] Introduce struct upid

2007-07-29 Thread Oleg Nesterov
On 07/26, Pavel Emelyanov wrote: --- linux-2.6.23-rc1-mm1.orig/include/linux/pid.h 2007-07-26 16:34:45.0 +0400 +++ linux-2.6.23-rc1-mm1-7/include/linux/pid.h 2007-07-26 16:36:37.0 +0400 @@ -40,15 +40,21 @@ enum pid_type * processes. */ -struct pid -{

[Devel] Re: [PATCH 6/15] Make alloc_pid(), free_pid() and put_pid() work with struct upid

2007-07-29 Thread Oleg Nesterov
On 07/26, Pavel Emelyanov wrote: -struct pid *alloc_pid(void) +struct pid *alloc_pid(struct pid_namespace *ns) Why? We have the only caller, copy_process(), ns == task_active_pid_ns() always. { struct pid *pid; enum pid_type type; - int nr = -1; - struct

[Devel] Re: [PATCH 14/15] Destroy pid namespace on init's death

2007-07-29 Thread Oleg Nesterov
On 07/26, Pavel Emelyanov wrote: @@ -895,6 +915,7 @@ fastcall NORET_TYPE void do_exit(long co { struct task_struct *tsk = current; int group_dead; + struct pid_namespace *pid_ns = tsk->nsproxy->pid_ns; profile_task_exit(tsk); @@ -905,9 +926,10 @@ fastcall NORET_TYPE

[Devel] Re: [PATCH 7/15] Helpers to obtain pid numbers

2007-07-29 Thread Oleg Nesterov
On 07/26, Pavel Emelyanov wrote: --- linux-2.6.23-rc1-mm1.orig/include/linux/pid.h 2007-07-26 16:34:45.0 +0400 +++ linux-2.6.23-rc1-mm1-7/include/linux/pid.h 2007-07-26 16:36:37.0 +0400 @@ -83,12 +92,34 @@ extern void FASTCALL(detach_pid(struct t extern struct

[Devel] Re: [PATCH 14/15] Destroy pid namespace on init's death

2007-07-30 Thread Oleg Nesterov
On 07/30, Pavel Emelyanov wrote: Oleg Nesterov wrote: + + nfree = 0; + for (i = 0; i < PIDMAP_ENTRIES; i++) + nfree += atomic_read(&pid_ns->pidmap[i].nr_free); + + /* + * If pidmap has entries for processes other than 0 and 1, retry. + */ + if (nfree < (BITS_PER_PAGE

[Devel] Re: [PATCH 14/15] Destroy pid namespace on init's death

2007-07-31 Thread Oleg Nesterov
On 07/30, [EMAIL PROTECTED] wrote: --- lx26-23-rc1-mm1.orig/kernel/exit.c 2007-07-26 20:08:16.0 -0700 +++ lx26-23-rc1-mm1/kernel/exit.c 2007-07-30 23:10:30.0 -0700 @@ -915,6 +915,7 @@ fastcall NORET_TYPE void do_exit(long co { struct task_struct *tsk =

[Devel] Re: [PATCH 15/15] Hooks over the code to show correct values to user

2007-07-31 Thread Oleg Nesterov
On 07/30, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 07/26, Pavel Emelyanov wrote: int kill_proc(pid_t pid, int sig, int priv) { - return kill_proc_info(sig, __si_special(priv), pid); + int ret; + + rcu_read_lock(); + ret = kill_pid_info(sig, __si_special(priv), find_pid(pid

[Devel] Re: [PATCH 14/15] Destroy pid namespace on init's death

2007-08-01 Thread Oleg Nesterov
On 07/31, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | | @@ -925,9 +926,10 @@ fastcall NORET_TYPE void do_exit(long co | if (unlikely(!tsk->pid)) | panic("Attempted to kill the idle task!"); | if (unlikely(tsk == task_child_reaper(tsk

[Devel] Re: [PATCH 14/15] Destroy pid namespace on init's death

2007-08-02 Thread Oleg Nesterov
On 08/02, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | On 07/31, [EMAIL PROTECTED] wrote: | | Oleg Nesterov [EMAIL PROTECTED] wrote: | | | | @@ -925,9 +926,10 @@ fastcall NORET_TYPE void do_exit(long co | | if (unlikely(!tsk->pid

[Devel] Re: [PATCH 1/15] Move exit_task_namespaces()

2007-08-02 Thread Oleg Nesterov
On 07/26, Pavel Emelyanov wrote: The reason to release namespaces after reparenting is that when task exits it may send a signal to its parent (SIGCHLD), but if the parent has already exited its namespaces there will be no way to decide what pid to deliver to him - parent can be from different

[Devel] Re: [PATCH 14/15] Destroy pid namespace on init's death

2007-08-02 Thread Oleg Nesterov
On 08/02, Oleg Nesterov wrote: On 08/02, Kirill Korotaev wrote: Oleg Nesterov wrote: As it was already discussed, the current code is buggy, and should be fixed. I'm not that sure it MUST be fixed. There are no multi-threaded init's anywhere. Oleg, does it worth changing

[Devel] Re: [PATCH 14/15] Destroy pid namespace on init's death

2007-08-02 Thread Oleg Nesterov
On 08/02, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | | | + if (pid_ns != &init_pid_ns) { | | | + zap_pid_ns_processes(pid_ns); | | | + pid_ns->child_reaper = init_pid_ns.child_reaper; | | OOPS. I didn't notice

[Devel] Re: [PATCH 14/15] Destroy pid namespace on init's death

2007-08-02 Thread Oleg Nesterov
On 08/02, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | | This means that we should take care about multi-thread init exit, | otherwise the non-root user can crash the kernel. | | From reply to Kirill's message: | | Still. A non-root user does clone(CLONE_PIDNS

[Devel] [RFC, PATCH] handle the multi-threaded init's exit() properly

2007-08-02 Thread Oleg Nesterov
, and it has to be changed anyway when we really support pid namespaces, just remove it. Signed-off-by: Oleg Nesterov [EMAIL PROTECTED] --- t/kernel/exit.c~ 2007-08-03 00:10:28.0 +0400 +++ t/kernel/exit.c 2007-08-03 01:12:18.0 +0400 @@ -604,11 +604,6 @@ static void exit_mm(struct

[Devel] Re: [PATCH 14/15] Destroy pid namespace on init's death

2007-08-03 Thread Oleg Nesterov
On 08/02, [EMAIL PROTECTED] wrote: --- lx26-23-rc1-mm1.orig/kernel/exit.c 2007-08-02 11:06:36.0 -0700 +++ lx26-23-rc1-mm1/kernel/exit.c 2007-08-02 23:06:47.0 -0700 @@ -916,7 +916,32 @@ static inline void exit_child_reaper(str if (likely(tsk->group_leader !=

[Devel] Re: [PATCH] Fix capability.c to work with threaded init

2007-08-03 Thread Oleg Nesterov
On 08/03, Dave Hansen wrote: On Thu, 2007-08-02 at 23:26 -0700, [EMAIL PROTECTED] wrote: Callers of is_container_init() should pass in task->group_leader to ensure they work with threaded-init. Can you explain this in a little more detail? That's a pretty sparse changelog. Without

[Devel] Re: [RFC, PATCH] handle the multi-threaded init's exit() properly

2007-08-03 Thread Oleg Nesterov
are playing games with ->nsproxy->pid_ns. This code is bogus today, and it has to be changed anyway when we really support pid namespaces, just remove it. Signed-off-by: Oleg Nesterov [EMAIL PROTECTED] --- t/kernel/exit.c~MTINIT 2007-07-28 16:58:17.0 +0400 +++ t/kernel/exit.c

[Devel] Re: [PATCH 1/15] Move exit_task_namespaces()

2007-08-06 Thread Oleg Nesterov
On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 07/26, Pavel Emelyanov wrote: The reason to release namespaces after reparenting is that when task exits it may send a signal to its parent (SIGCHLD), but if the parent has already exited its namespaces there will be no way to decide

[Devel] Re: [PATCH 1/15] Move exit_task_namespaces()

2007-08-06 Thread Oleg Nesterov
On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 07/26, Pavel Emelyanov wrote: The reason to release namespaces after reparenting is that when task exits it may send a signal to its parent (SIGCHLD), but if the parent has

[Devel] Re: [PATCH 1/15] Move exit_task_namespaces()

2007-08-06 Thread Oleg Nesterov
On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 07/26, Pavel Emelyanov wrote: The reason to release namespaces after reparenting is that when task exits it may send

[Devel] Re: [PATCH 1/15] Move exit_task_namespaces()

2007-08-06 Thread Oleg Nesterov
On 08/06, Pavel Emelyanov wrote: If task X is exiting and has already exit_task_namespaces()-ed task Y will OOPs during its exit in determining parent's namespace. I agree that in that case this is not important what namespace X belongs to, but we need to handle the race with changing the

[Devel] Re: [PATCH 1/15] Move exit_task_namespaces()

2007-08-06 Thread Oleg Nesterov
On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 08/06, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 07/26, Pavel Emelyanov wrote: The reason to release namespaces after

[Devel] Re: [RFC, PATCH] handle the multi-threaded init's exit() properly

2007-08-06 Thread Oleg Nesterov
On 08/06, Andrew Morton wrote: On Fri, 3 Aug 2007 01:20:09 +0400 Oleg Nesterov [EMAIL PROTECTED] wrote: 2. We are playing games with ->nsproxy->pid_ns. This code is bogus today, and it has to be changed anyway when we really support pid namespaces, just remove it. This patch

[Devel] Re: [RFC][PATCH] Make access to taks's nsproxy liter

2007-08-08 Thread Oleg Nesterov
On 08/08, Pavel Emelyanov wrote: When someone wants to deal with some other task's namespaces it has to lock the task and then to get the desired namespace if the one exists. This is slow on read-only paths and may be impossible in some cases. E.g. Oleg recently noticed a race between

[Devel] Re: [RFC][PATCH] Make access to taks's nsproxy liter

2007-08-08 Thread Oleg Nesterov
On 08/08, Eric W. Biederman wrote: Oleg Nesterov [EMAIL PROTECTED] writes: On 08/08, Pavel Emelyanov wrote: +void switch_task_namespaces(struct task_struct *p, struct nsproxy *new) +{ + struct nsproxy *ns; + + might_sleep(); + + ns = p->nsproxy; + if (ns == new

[Devel] Re: [RFC][PATCH] Make access to taks's nsproxy liter

2007-08-08 Thread Oleg Nesterov
On 08/08, Paul E. McKenney wrote: On Wed, Aug 08, 2007 at 08:41:07PM +0400, Oleg Nesterov wrote: +void switch_task_namespaces(struct task_struct *p, struct nsproxy *new) +{ + struct nsproxy *ns; + + might_sleep(); + + ns = p->nsproxy; + if (ns == new) + return

[Devel] Re: [RFC][PATCH] Make access to taks's nsproxy liter

2007-08-09 Thread Oleg Nesterov
On 08/09, Pavel Emelyanov wrote: Paul E. McKenney wrote: On Wed, Aug 08, 2007 at 08:41:07PM +0400, Oleg Nesterov wrote: +void switch_task_namespaces(struct task_struct *p, struct nsproxy *new) +{ + struct nsproxy *ns; + + might_sleep(); + + ns = p->nsproxy; + if (ns == new

[Devel] Re: [RFC][PATCH] Make access to taks's nsproxy liter

2007-08-09 Thread Oleg Nesterov
On 08/09, Oleg Nesterov wrote: Note also that switch_task_namespaces() might_sleep(), but sys_unshare() calls it under task_lock(). Ah, sorry, didn't notice your patch moves task_lock() down in sys_unshare(). Oleg.

[Devel] Re: [RFC][PATCH] Make access to taks's nsproxy liter

2007-08-09 Thread Oleg Nesterov
On 08/09, Pavel Emelyanov wrote: Oleg Nesterov wrote: Yes. But this patch complicates the code and slows down group_exit. We don't Nope - it slows down the code only if the task exiting is the last one using the nsproxy. In other words - we slowdown the virtual server stop, not task

[Devel] Re: [PATCH] Make access to task's nsproxy liter

2007-08-10 Thread Oleg Nesterov
On 08/10, Pavel Emelyanov wrote: Oleg Nesterov wrote: On 08/10, Serge E. Hallyn wrote: Quoting Pavel Emelyanov ([EMAIL PROTECTED]): +/* + * the namespaces access rules are: + * + * 1. only current task is allowed to change tsk->nsproxy pointer or + * any pointer on the nsproxy itself

[Devel] Re: [PATCH] Make access to task's nsproxy liter

2007-08-10 Thread Oleg Nesterov
On 08/10, Serge E. Hallyn wrote: Quoting Pavel Emelyanov ([EMAIL PROTECTED]): +/* + * the namespaces access rules are: + * + * 1. only current task is allowed to change tsk->nsproxy pointer or + * any pointer on the nsproxy itself + * + * 2. when accessing (i.e. reading)

[Devel] Re: [PATCH] Make access to task's nsproxy liter

2007-08-10 Thread Oleg Nesterov
On 08/10, Oleg Nesterov wrote: On 08/10, Serge E. Hallyn wrote: Quoting Pavel Emelyanov ([EMAIL PROTECTED]): +/* + * the namespaces access rules are: + * + * 1. only current task is allowed to change tsk->nsproxy pointer or + * any pointer on the nsproxy itself

[Devel] Re: [PATCH] Allow signalling container-init

2007-08-10 Thread Oleg Nesterov
On 08/09, [EMAIL PROTECTED] wrote: Pavel Emelianov [EMAIL PROTECTED] wrote: | Oleg Nesterov wrote: | | | | I think it is better to not change the current behaviour which is not | | perfect (buggy), until we actually protect /sbin/init from unwanted | | signals. | | Can we preserve

[Devel] Re: [RFC][PATCH] Isolate some explicit usage of task->tgid

2007-08-17 Thread Oleg Nesterov
On 08/17, Pavel Emelyanov wrote: diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c index b53c8fc..06a1e7d 100644 --- a/kernel/posix-cpu-timers.c +++ b/kernel/posix-cpu-timers.c @@ -21,8 +21,8 @@ static int check_clock(const clockid_t w read_lock(&tasklist_lock);

[Devel] Re: [PATCH] Isolate some explicit usage of task->tgid

2007-08-17 Thread Oleg Nesterov
On 08/17, Pavel Emelyanov wrote: Actually the p->tgid == pid has to be changed to has_group_leader_pid(), but Oleg pointed out that this is the same and thread_group_leader() is preferable. No, no, sorry for confusion! I was not clear. I meant that thread_group_leader() is imho better

[Devel] [RFC,PATCH] fix /sbin/init signal handling

2007-08-19 Thread Oleg Nesterov
(Not for inclusion yet, against 2.6.23-rc2, untested) Currently, /sbin/init is protected from unhandled signals by the current == child_reaper(current) check in get_signal_to_deliver(). This is not enough, we have multiple problems: - this doesn't work for multi-threaded inits, and we

[Devel] Re: [RFC,PATCH] fix /sbin/init signal handling

2007-08-21 Thread Oleg Nesterov
On 08/21, [EMAIL PROTECTED] wrote: I am still reviewing this patch and will try to plug in the multiple pid ns code and play with it some more in the next couple of days. Thanks! But am curious why we need the in_interrupt() check and that too only for the container-init process. For

[Devel] Re: [RFC,PATCH] fix /sbin/init signal handling

2007-08-21 Thread Oleg Nesterov
On 08/21, Pavel Emelyanov wrote: +static int sig_init_ignore(struct task_struct *tsk) +{ +// Currently this check is a bit racy with exec(), +// we can _simplify_ de_thread and close the race. +if (likely(!is_init(tsk->group_leader))) +return 0; + +//

[Devel] Re: [RFC,PATCH] fix /sbin/init signal handling

2007-08-21 Thread Oleg Nesterov
On 08/21, Serge E. Hallyn wrote: Quoting Oleg Nesterov ([EMAIL PROTECTED]): @@ -1841,14 +1865,6 @@ relock: if (sig_kernel_ignore(signr)) /* Default is nothing. */ continue; - /* -* Init of a pid space gets no signals it doesn't

[Devel] Re: [RFC][PATCH] Cleanup the new thread's creation

2007-08-25 Thread Oleg Nesterov
On 08/24, Pavel Emelyanov wrote: The major differences of creating a new thread from creating a new process are that 1. newbie's tgid is set to leader's 2. newbie's leader is set to leader 3. newbie is added to leader's thread_list (Surely, there are many other major differences, but from the

[Devel] Re: [RFC][PATCH 2/3] Signal semantics for /sbin/init

2007-08-30 Thread Oleg Nesterov
On 08/29, [EMAIL PROTECTED] wrote: --- 2.6.23-rc3-mm1.orig/kernel/signal.c 2007-08-29 22:53:20.0 -0700 +++ 2.6.23-rc3-mm1/kernel/signal.c 2007-08-29 23:10:16.0 -0700 @@ -26,6 +26,7 @@ #include <linux/freezer.h> #include <linux/pid_namespace.h> #include

[Devel] Re: [PATCH 3/3] Masquerade sender and limit system-wide signals

2007-08-30 Thread Oleg Nesterov
On 08/29, [EMAIL PROTECTED] wrote: +static void masquerade_sender(struct task_struct *t, struct sigqueue *q) +{ + /* + * If the sender does not have a pid_t in the receiver's active + * pid namespace, set si_pid to 0 and pretend signal originated + * from the kernel. +

[Devel] Re: [RFC][PATCH 1/3] Pid ns helpers for signals

2007-08-30 Thread Oleg Nesterov
On 08/29, [EMAIL PROTECTED] wrote: +static int ancestor_pid_ns(struct pid_namespace *ns1, struct pid_namespace *ns2) +{ + int i; + struct pid_namespace *tmp; + + if (ns1 == NULL || ns2 == NULL) + return 0; + + if (ns1->level >= ns2->level) + return
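The quoted ancestor_pid_ns() helper is cut off in this listing, so the following is only a guess at the general idea, written as a self-contained user-space sketch. struct toy_pid_ns and toy_is_ancestor() are hypothetical; the only assumption taken from the thread is that each namespace carries a nesting level, and the parent pointer is added here for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pid namespace: a nesting level and a parent. */
struct toy_pid_ns {
    unsigned int level;          /* 0 for the initial namespace */
    struct toy_pid_ns *parent;   /* NULL for the initial namespace */
};

/* Returns true iff 'anc' is a strict ancestor of 'ns'. */
static bool toy_is_ancestor(const struct toy_pid_ns *anc,
                            const struct toy_pid_ns *ns)
{
    if (anc == NULL || ns == NULL)
        return false;

    /* An ancestor must sit at a strictly lower level. */
    if (anc->level >= ns->level)
        return false;

    /* Walk up from 'ns' until we reach the ancestor's level. */
    while (ns != NULL && ns->level > anc->level)
        ns = ns->parent;

    return ns == anc;
}
```

The level comparison is a cheap early exit: a namespace at the same or a deeper level can never be an ancestor, so the walk up the parent chain only happens when the answer might be yes.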

[Devel] Re: [PATCH 1/3] Signal semantics for /sbin/init

2007-09-01 Thread Oleg Nesterov
On 08/31, [EMAIL PROTECTED] wrote: -static int sig_ignored(struct task_struct *t, int sig) + // Currently this check is a bit racy with exec(), + // we can _simplify_ de_thread and close the race. + if (likely(!is_container_init(tsk->group_leader))) + return 0; + +

[Devel] Re: [PATCH 2/3] Pid ns helpers for signals

2007-09-01 Thread Oleg Nesterov
On 08/31, [EMAIL PROTECTED] wrote: Define some helper functions that will be used to implement signal semantics with multiple pid namespaces. is_current_in_ancestor_pid_ns(task) TRUE iff active pid namespace of 'current' is an ancestor of active pid

[Devel] Re: [PATCH 3/3] Signal semantics for pid namespaces

2007-09-01 Thread Oleg Nesterov
On 08/31, [EMAIL PROTECTED] wrote: @@ -48,7 +49,7 @@ static int sig_init_ignore(struct task_s if (likely(!is_container_init(tsk->group_leader))) return 0; - if (!in_interrupt()) + if (is_current_in_ancestor_pid_ns(tsk) && !in_interrupt()) return 0;

[Devel] Re: [PATCH 2/3] Pid ns helpers for signals

2007-09-01 Thread Oleg Nesterov
On 09/01, Oleg Nesterov wrote: On 08/31, [EMAIL PROTECTED] wrote: +static struct pid_namespace *get_task_pid_ns(struct task_struct *tsk) +{ + struct pid *pid; + struct pid_namespace *ns; + + pid = get_task_pid(tsk, PIDTYPE_PID); + ns = get_pid_ns(pid_active_ns(pid

[Devel] Re: [PATCH 2/3] Pid ns helpers for signals

2007-09-03 Thread Oleg Nesterov
On 09/03, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | On 09/01, Oleg Nesterov wrote: | | On 08/31, [EMAIL PROTECTED] wrote: | | +static struct pid_namespace *get_task_pid_ns(struct task_struct *tsk) | +{ | + struct pid *pid; | + struct

[Devel] Re: [PATCH 1/3] Signal semantics for /sbin/init

2007-09-03 Thread Oleg Nesterov
On 09/03, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | On 08/31, [EMAIL PROTECTED] wrote: | | -static int sig_ignored(struct task_struct *t, int sig) | + // Currently this check is a bit racy with exec(), | + // we can _simplify_ de_thread and close the race

[Devel] Re: [PATCH 1/3] Signal semantics for /sbin/init

2007-09-13 Thread Oleg Nesterov
On 09/13, Cedric Le Goater wrote: Oleg Nesterov wrote: On 09/10, [EMAIL PROTECTED] wrote: (This is Oleg's patch with my pid ns additions. Compiled and unit tested on 2.6.23-rc4-mm1 with other patches in this set. Oleg pls update this patch if necessary and sign-off) Sukadev, my

[Devel] Re: [PATCH 1/3] Signal semantics for /sbin/init

2007-09-17 Thread Oleg Nesterov
On 09/13, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | | Notes: | | - Blocked signals are never ignored, so init still can receive |a pending blocked signal after sigprocmask(SIG_UNBLOCK). |Easy to fix, but probably we can ignore

Re: [Devel] Re: [PATCH 1/3] Signal semantics for /sbin/init

2007-09-17 Thread Oleg Nesterov
On 09/14, Daniel Pittman wrote: Oleg Nesterov [EMAIL PROTECTED] writes: On 09/13, Cedric Le Goater wrote: Oleg Nesterov wrote: [...] To respect the current init semantic, The current init semantic is broken in many ways ;) Yup. They sure are, but they are pretty set in stone

[Devel] Re: [PATCH] pid: sys_wait... fixes

2007-12-06 Thread Oleg Nesterov
On 12/05, Eric W. Biederman wrote: This modifies do_wait and eligible_child to take a pair of enum pid_type and struct pid *pid to precisely specify what set of processes are eligible to be waited for, instead of the raw pid_t value from sys_wait4. Personally, I like this patch very much.

[Devel] Re: [PATCH] pid: Extend/Fix pid_vnr

2007-12-06 Thread Oleg Nesterov
On 12/05, Eric W. Biederman wrote: +pid_t pid_vnr(struct pid *pid) +{ + return pid_nr_ns(pid, current->nsproxy->pid_ns); +} Excellent!!! This allows us to do many cleanups. I am sending the trivial patch just as example. Oleg.

[Devel] [PATCH] sys_getsid: don't use ->nsproxy directly

2007-12-06 Thread Oleg Nesterov
With the new semantics of find_vpid() we don't need to play with ->nsproxy explicitly, _vxx() do the right things. Also s/tasklist/rcu/. Signed-off-by: Oleg Nesterov [EMAIL PROTECTED] --- PT/kernel/sys.c~ 2007-12-05 21:43:18.0 +0300 +++ PT/kernel/sys.c 2007-12-06 20:09

[Devel] Re: [PATCH] pid: sys_wait... fixes (v2)

2007-12-06 Thread Oleg Nesterov
On 12/06, Eric W. Biederman wrote: +static struct pid *task_pid_type(struct task_struct *task, enum pid_type type) +{ + struct pid *pid = NULL; + if (type == PIDTYPE_PID) + pid = task->pids[type].pid; + else if (type < PIDTYPE_MAX) + pid =

[Devel] Re: [PATCH 9/9] signal: Ignore signals sent to the pid namespace init

2007-12-13 Thread Oleg Nesterov
On 12/12, Eric W. Biederman wrote: -static int is_sig_init(struct task_struct *tsk) +static int is_sig_init(struct task_struct *init, struct pid *sender) { - if (likely(!is_global_init(tsk->group_leader))) + if (!is_container_init(init)) + return 0; + + if

[Devel] Re: [PATCH 4/9] pid: Generalize task_active_pid_ns

2007-12-13 Thread Oleg Nesterov
Sorry for the delay, and sorry, can't read this series carefully now. A couple of question though. On 12/12, Eric W. Biederman wrote: Currently task_active_pid_ns is not safe to call after a task becomes a zombie and exit_task_namespaces is called, as nsproxy becomes NULL. By reading the pid

[Devel] Re: [PATCH 8/9] signal: Drop signals before sending them to init.

2007-12-13 Thread Oleg Nesterov
On 12/12, Eric W. Biederman wrote: By making the rule (for init dropping signals): When sending a signal to init, the presence of a signal handler that is not SIG_DFL allows the signal to be sent to init. If the signal is not sent it is silently dropped without becoming pending. But isn't

[Devel] Re: [PATCH 4/9] pid: Generalize task_active_pid_ns

2007-12-13 Thread Oleg Nesterov
On 12/13, Eric W. Biederman wrote: Oleg Nesterov [EMAIL PROTECTED] writes: On 12/12, Eric W. Biederman wrote: Currently task_active_pid_ns is not safe to call after a task becomes a zombie and exit_task_namespaces is called, as nsproxy becomes NULL. By reading the pid namespace from

[Devel] Re: [PATCH 8/9] signal: Drop signals before sending them to init.

2007-12-13 Thread Oleg Nesterov
On 12/13, Eric W. Biederman wrote: Oleg Nesterov [EMAIL PROTECTED] writes: So, do you mean we can ignore the problems with the signals which are currently blocked by /sbin/init? Yes. Further I am saying those signals will never become pending if we do not have a signal handler

[Devel] Re: [PATCH 8/9] signal: Drop signals before sending them to init.

2007-12-16 Thread Oleg Nesterov
On 12/13, Eric W. Biederman wrote: Oleg Nesterov [EMAIL PROTECTED] writes: OK, if we change the semantics for /sbin/init signals we can avoid a lot of problems, Yes. Otherwise we must track the source of the signals. Yes. Well, I am not sure about explain though. Unless I missed

[Devel] Re: [PATCH 8/9] signal: Drop signals before sending them to init.

2007-12-18 Thread Oleg Nesterov
On 12/17, Eric W. Biederman wrote: So I would have no problem with a definition said signals will be dropped when sent to init if at the time they are sent the signal is SIG_DFL and unblocked. Great! But this can happen with your patch as well. sig_init_drop() returns false if we have a
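The definition Eric proposes above can be written down as a one-line predicate. This is only a sketch of the rule as stated in the quoted sentence, not of sig_init_drop() itself (whose body is not shown here); struct toy_sigstate, enum toy_handler and toy_init_drops() are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical disposition of one signal in init at send time. */
enum toy_handler { TOY_SIG_DFL, TOY_SIG_IGN, TOY_SIG_HANDLER };

struct toy_sigstate {
    enum toy_handler handler;  /* current disposition */
    bool blocked;              /* is the signal in init's blocked mask? */
};

/*
 * Rule as quoted: a signal sent to init is dropped if, at the time it is
 * sent, its disposition is SIG_DFL and it is not blocked.
 */
static bool toy_init_drops(const struct toy_sigstate *s)
{
    return s->handler == TOY_SIG_DFL && !s->blocked;
}
```

Under this rule a blocked SIG_DFL signal is not dropped at send time, which is the case the rest of the thread goes on to discuss.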

[Devel] Re: [PATCH 1/2] signals: kill(-1) should only signal processes in the same namespace

2008-07-17 Thread Oleg Nesterov
On 07/17, Pavel Emelyanov wrote: Daniel Hokka Zakrisson wrote: The way zap_pid_ns_processes does it is worse, since it signals every thread in the namespace rather than every thread group. So either we walk It's questionable whether there are more threads in a pid namespace than

[Devel] Re: [PATCH 1/2] signals: kill(-1) should only signal processes in the same namespace

2008-07-23 Thread Oleg Nesterov
On 07/17, Daniel Hokka Zakrisson wrote: +int task_in_pid_ns(struct task_struct *tsk, struct pid_namespace *ns) +{ + struct pid *pid = task_pid(tsk); + + if (!pid) + return 0; + + if (pid->level < ns->level) + return 0; + + if

[Devel] Re: [PATCH] 'kill sig -1' must only apply to caller's namespace

2008-10-24 Thread Oleg Nesterov
!same_thread_group(p, current)) { + if (task_pid_vnr(p) > 1 && + !same_thread_group(p, current)) { Thanks Sukadev! Signed-off-by: Oleg Nesterov [EMAIL PROTECTED] Oleg. ___ Containers mailing list

[Devel] Re: Signals to cinit

2008-11-10 Thread Oleg Nesterov
On 11/10, Oleg Nesterov wrote: This way at least SIGKILL always works. Forgot to mention... We must also change sig_ignored() to drop SIGKILL/SIGSTOP early when it comes from the same ns. Otherwise, it can mask the next SIGKILL from the parent ns. But this perhaps makes sense anyway, even

[Devel] Re: Signals to cinit

2008-11-10 Thread Oleg Nesterov
On 11/01, [EMAIL PROTECTED] wrote: Other approaches to try ? I think we should try to do something simple, even if not perfect. Because most users do not care about this problem since they do not use containers at all. It would be very sad to add intrusive changes to the code. I think we

[Devel] Re: Signals to cinit

2008-11-10 Thread Oleg Nesterov
(lkml cced because containers list's archive is not useable) On 11/10, Oleg Nesterov wrote: On 11/01, [EMAIL PROTECTED] wrote: Other approaches to try ? I think we should try to do something simple, even if not perfect. Because most users do not care about this problem since they do

[Devel] Re: Signals to cinit

2008-11-12 Thread Oleg Nesterov
On 11/12, Oleg Nesterov wrote: On 11/10, [EMAIL PROTECTED] wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | Or something. yes, sys_rt_sigqueueinfo() is problematic... Yes, if user-space sets si_pid to 0. Can we change sys_rt_sigqueueinfo() to: if (!info->si_pid

[Devel] Re: [RFC][PATCH 3/3] Set si_pid to 0 for signals from ancestor namespace

2008-11-12 Thread Oleg Nesterov
On 11/11, Sukadev Bhattiprolu wrote: Subject: [PATCH 3/3] sig: Handle pid namespace crossing when sending signals. I add a struct pid sender parameter to __group_send_sig_info, as that is the only function called with si_pid != task_tgid_vnr(current). So we can correctly handle the sending

[Devel] Re: [RFC][PATCH] Define/use siginfo_from_ancestor_ns

2008-11-12 Thread Oleg Nesterov
On 11/11, Sukadev Bhattiprolu wrote: --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1117,6 +1117,7 @@ static struct task_struct *copy_process(unsigned long clone_flags, if (clone_flags & CLONE_NEWPID) { retval = pid_ns_prepare_proc(p->nsproxy->pid_ns); +

[Devel] Re: Signals to cinit

2008-11-12 Thread Oleg Nesterov
On 11/10, [EMAIL PROTECTED] wrote: Also, what happens if a fatal signal is first received from a descendant and while that is still pending, the same signal is received from ancestor ns ? Won't the second one be ignored by legacy_queue() for the non-rt case ? Please see my another email:

[Devel] Re: [RFC][PATCH 3/3] Set si_pid to 0 for signals from ancestor namespace

2008-11-14 Thread Oleg Nesterov
On 11/12, Eric W. Biederman wrote: Oleg Nesterov [EMAIL PROTECTED] writes: On 11/11, Sukadev Bhattiprolu wrote: +static void set_sigqueue_pid(struct sigqueue *q, struct task_struct *t, + struct pid *sender) +{ + struct pid_namespace *ns; + + /* Set

[Devel] Re: Signals to cinit

2008-11-14 Thread Oleg Nesterov
On 11/12, Sukadev Bhattiprolu wrote: Oleg Nesterov [EMAIL PROTECTED] wrote: | On 11/10, [EMAIL PROTECTED] wrote: | | Also, what happens if a fatal signal is first received from a descendant | and while that is still pending, the same signal is received from ancestor | ns ? Won't

[Devel] Re: [RFC][PATCH][v2] Define/use siginfo_from_ancestor_ns()

2008-11-18 Thread Oleg Nesterov
On 11/15, Sukadev Bhattiprolu wrote: Subject: [PATCH] Define/use siginfo_from_ancestor_ns() Imho, the main problem with this patch is that it tries to do many different things at once, and each part is suboptimal/incomplete. This needs several patches. Not only because this is easier to
