Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-20 Thread Oleg Nesterov
On 12/19, Neil Horman wrote:
>
> On Wed, Dec 19, 2012 at 05:22:30PM +0100, Oleg Nesterov wrote:
> >
> I wouldn't worry about it.  After looking over the changes Eric just had 
> merged
> for 3.8 I'm becomming more convinced that this isn't really needed anymore. 
> With
> his changes, we can migrate a process between all available namespaces
> dynamically in user space.

Except pid_namespace. You still can't change it even with setns().

However setns(CLONE_NEWPID) + fork() creates the child in the target namespace,
probably this is enough for you.

> With that functionality we can just write a setns
> administrative utility to make this all work.

Agreed.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-20 Thread Oleg Nesterov
On 12/19, Neil Horman wrote:

 On Wed, Dec 19, 2012 at 05:22:30PM +0100, Oleg Nesterov wrote:
 
 I wouldn't worry about it.  After looking over the changes Eric just had 
 merged
 for 3.8 I'm becomming more convinced that this isn't really needed anymore. 
 With
 his changes, we can migrate a process between all available namespaces
 dynamically in user space.

Except pid_namespace. You still can't change it even with setns().

However setns(CLONE_NEWPID) + fork() creates the child in the target namespace,
probably this is enough for you.

 With that functionality we can just write a setns
 administrative utility to make this all work.

Agreed.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-19 Thread Neil Horman
On Wed, Dec 19, 2012 at 05:22:30PM +0100, Oleg Nesterov wrote:
> On 12/18, Neil Horman wrote:
> >
> > On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
> > > > Perhaps its best just to restrict this patch to adjusting the root fs 
> > > > location
> > > > for the chroot case.
> > >
> > > Probably... at least for the start.
> > >
> > > BTW. Of course this is subjective, but personally I think that "||"
> > > looks strange. Perhaps it would be better to add something like
> > > --croot argument?
> > >
> > The || is ambiguous with its simmilarity to a shell 'or' command,
> 
> Ah, I didn't mean this.
> 
> I meant, this is not flexible. We can implement --croot, then we can
> (may be) add --switch_ns. If you use "||" now for chroot, what will you
> do for switch_ns?
> 
> > but I don't
> > think the --croot argument is much better on that front, as that then 
> > becomes
> > ambiguous with arguments supplied to the pipe reader directly.
> 
> Not sure I understand... why?
> 
> > The token should
> > be leading the pipe_reader string in core_pattern to indicate a change in
> > environment independent of the executable path.
> 
> Do you mean that || at the front is more "visible" ? True, but I am
> not sure this is that important.
> 
All I really meant was that placing the token for croot at the front of the
pattern would be more readable as it disambiguates its meaning from the rest of
the command.

> But I won't insist.
> 
I wouldn't worry about it.  After looking over the changes Eric just had merged
for 3.8 I'm becomming more convinced that this isn't really needed anymore. With
his changes, we can migrate a process between all available namespaces
dynamically in user space.  With that functionality we can just write a setns
administrative utility to make this all work. I've started that work here:
http://marc.info/?l=util-linux-ng=135594402801895=2

all thats arguably left after that is to make core_pattern a per-container
(possibly per mount namespace?).  My patch doesn't do that. But its something I
can look into.

Neil
 
> Oleg.
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-19 Thread Oleg Nesterov
On 12/18, Eric W. Biederman wrote:
>
> Neil Horman  writes:
>
> > On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
> >>
> >> Yes, and we probably want to change pid_ns as well. But afaics currently
> >> this is not possible, even setns can't do this.
>
> The code for setns to change the pid namespace just merged.

Yes... I see pidns_install() after git pull... And it seems that
unshare(NEWPID) should work too...

So in general task_active_pid_ns() is no longer equal to nsproxy->pid_ns.
Oh, I can't understand how this can be right ;)

> Oleg I copied you on that code when I put it up for review.  Did I use
> the wrong email address?

Probably, I didn't see any email. I'll try to read the new code later.
I simply can't understand what will happen if, say, the task does
clone(CLONE_THREAD) after unshare(NEWPID)...

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-19 Thread Oleg Nesterov
On 12/18, Neil Horman wrote:
>
> On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
> > > Perhaps its best just to restrict this patch to adjusting the root fs 
> > > location
> > > for the chroot case.
> >
> > Probably... at least for the start.
> >
> > BTW. Of course this is subjective, but personally I think that "||"
> > looks strange. Perhaps it would be better to add something like
> > --croot argument?
> >
> The || is ambiguous with its simmilarity to a shell 'or' command,

Ah, I didn't mean this.

I meant, this is not flexible. We can implement --croot, then we can
(may be) add --switch_ns. If you use "||" now for chroot, what will you
do for switch_ns?

> but I don't
> think the --croot argument is much better on that front, as that then becomes
> ambiguous with arguments supplied to the pipe reader directly.

Not sure I understand... why?

> The token should
> be leading the pipe_reader string in core_pattern to indicate a change in
> environment independent of the executable path.

Do you mean that || at the front is more "visible" ? True, but I am
not sure this is that important.

But I won't insist.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-19 Thread Oleg Nesterov
On 12/18, Neil Horman wrote:

 On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
   Perhaps its best just to restrict this patch to adjusting the root fs 
   location
   for the chroot case.
 
  Probably... at least for the start.
 
  BTW. Of course this is subjective, but personally I think that ||
  looks strange. Perhaps it would be better to add something like
  --croot argument?
 
 The || is ambiguous with its simmilarity to a shell 'or' command,

Ah, I didn't mean this.

I meant, this is not flexible. We can implement --croot, then we can
(may be) add --switch_ns. If you use || now for chroot, what will you
do for switch_ns?

 but I don't
 think the --croot argument is much better on that front, as that then becomes
 ambiguous with arguments supplied to the pipe reader directly.

Not sure I understand... why?

 The token should
 be leading the pipe_reader string in core_pattern to indicate a change in
 environment independent of the executable path.

Do you mean that || at the front is more visible ? True, but I am
not sure this is that important.

But I won't insist.

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-19 Thread Oleg Nesterov
On 12/18, Eric W. Biederman wrote:

 Neil Horman nhor...@tuxdriver.com writes:

  On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
 
  Yes, and we probably want to change pid_ns as well. But afaics currently
  this is not possible, even setns can't do this.

 The code for setns to change the pid namespace just merged.

Yes... I see pidns_install() after git pull... And it seems that
unshare(NEWPID) should work too...

So in general task_active_pid_ns() is no longer equal to nsproxy-pid_ns.
Oh, I can't understand how this can be right ;)

 Oleg I copied you on that code when I put it up for review.  Did I use
 the wrong email address?

Probably, I didn't see any email. I'll try to read the new code later.
I simply can't understand what will happen if, say, the task does
clone(CLONE_THREAD) after unshare(NEWPID)...

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-19 Thread Neil Horman
On Wed, Dec 19, 2012 at 05:22:30PM +0100, Oleg Nesterov wrote:
 On 12/18, Neil Horman wrote:
 
  On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
Perhaps its best just to restrict this patch to adjusting the root fs 
location
for the chroot case.
  
   Probably... at least for the start.
  
   BTW. Of course this is subjective, but personally I think that ||
   looks strange. Perhaps it would be better to add something like
   --croot argument?
  
  The || is ambiguous with its simmilarity to a shell 'or' command,
 
 Ah, I didn't mean this.
 
 I meant, this is not flexible. We can implement --croot, then we can
 (may be) add --switch_ns. If you use || now for chroot, what will you
 do for switch_ns?
 
  but I don't
  think the --croot argument is much better on that front, as that then 
  becomes
  ambiguous with arguments supplied to the pipe reader directly.
 
 Not sure I understand... why?
 
  The token should
  be leading the pipe_reader string in core_pattern to indicate a change in
  environment independent of the executable path.
 
 Do you mean that || at the front is more visible ? True, but I am
 not sure this is that important.
 
All I really meant was that placing the token for croot at the front of the
pattern would be more readable as it disambiguates its meaning from the rest of
the command.

 But I won't insist.
 
I wouldn't worry about it.  After looking over the changes Eric just had merged
for 3.8 I'm becomming more convinced that this isn't really needed anymore. With
his changes, we can migrate a process between all available namespaces
dynamically in user space.  With that functionality we can just write a setns
administrative utility to make this all work. I've started that work here:
http://marc.info/?l=util-linux-ngm=135594402801895w=2

all thats arguably left after that is to make core_pattern a per-container
(possibly per mount namespace?).  My patch doesn't do that. But its something I
can look into.

Neil
 
 Oleg.
 
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Eric W. Biederman
Neil Horman  writes:

> On Tue, Dec 18, 2012 at 12:45:18PM -0800, Eric W. Biederman wrote:

>> The code for setns to change the pid namespace just merged.
>> 
> Can you post a link to the merge commit for reference so I can take a look at
> it?

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=6a2b60b17b3e48a418695a94bd2420f6ab32e519

>> If we just want one pattern we should be able to to robustly implement
>> this in userspace with the existing functionality.  With the caveat that
>> we need to get some pid namespace and user namespace bugs in the core
>> pattern generation fixed.  But we need to fix those bugs anyway.
>> 
> Then perhaps the right thing to do here is in fact just make core_pattern a
> per-namespace sysctl.  I only took a brief look, but I was unable to find an
> example of such a per-namespace systctl.  Do we already have the 
> infrastructure
> to do such a thing?  I didn't think we did.

We do have the infrastructure for a per namespace sysctls.  Right now we
only have per network namespace sysctls.  It is on my wish list to use
the infrastructure a little more extensively and convert /proc/sys into
a symlink to /proc//sys and reduce the amount of magic in /proc for
sysctls.

We also have per namespace sysctls that do magic based upon current.
Since that pattern is has more magic I don't recommend it over the long
term.

Of course there is the question which namespace the sysctl should be
tied to, and what the other namespaces should be set to.  Shrug.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Neil Horman
On Tue, Dec 18, 2012 at 12:45:18PM -0800, Eric W. Biederman wrote:
> Neil Horman  writes:
> 
> > On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
> >> On 12/17, Neil Horman wrote:
> >> >
> >> > On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
> >> > >
> >> > > > Is there a way to switch all namespaces, except for the pid
> >> > > > namespace?
> >> > >
> >> > > Which exactly namespaces you want to change?
> >> > >
> >> > Ideally, I want the pipe reader process to execute in the same 
> >> > namespaces that
> >> > the crashing process executed in (i.e. the pipe reader should execute as 
> >> > though
> >> > the crashing process forked it).
> >> 
> >> Yes, and we probably want to change pid_ns as well. But afaics currently
> >> this is not possible, even setns can't do this.
> 
> The code for setns to change the pid namespace just merged.
> 
Can you post a link to the merge commit for reference so I can take a look at
it?

> Oleg I copied you on that code when I put it up for review.  Did I use
> the wrong email address?
> 
> >> BTW. Of course this is subjective, but personally I think that "||"
> >> looks strange. Perhaps it would be better to add something like
> >> --croot argument?
> >> 
> > The || is ambiguous with its simmilarity to a shell 'or' command, but I 
> > don't
> > think the --croot argument is much better on that front, as that then 
> > becomes
> > ambiguous with arguments supplied to the pipe reader directly.  The token 
> > should
> > be leading the pipe_reader string in core_pattern to indicate a change in
> > environment independent of the executable path.  Perhaps |^ or something
> > simmilar?
> 
> I failed to send my earlier reply but there is another problem with the
> approach of only having one global core dump pattern.  You can't set it
> per container.  Which means a special character to switch namespeces
> while a reasonable solution (and arguably unnecessary solution) is not a
> complete solution.
> 
> > Either way, Andrew, could you please drop this patch?  Olegs comments I 
> > think
> > make it pretty clear I've got some more work to do on this.
> 
> If we just want one pattern we should be able to to robustly implement
> this in userspace with the existing functionality.  With the caveat that
> we need to get some pid namespace and user namespace bugs in the core
> pattern generation fixed.  But we need to fix those bugs anyway.
> 
Then perhaps the right thing to do here is in fact just make core_pattern a
per-namespace sysctl.  I only took a brief look, but I was unable to find an
example of such a per-namespace systctl.  Do we already have the infrastructure
to do such a thing?  I didn't think we did.

Thanks!
Neil

> Eric
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Eric W. Biederman
Neil Horman  writes:

> On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
>> On 12/17, Neil Horman wrote:
>> >
>> > On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
>> > >
>> > > > Is there a way to switch all namespaces, except for the pid
>> > > > namespace?
>> > >
>> > > Which exactly namespaces you want to change?
>> > >
>> > Ideally, I want the pipe reader process to execute in the same namespaces 
>> > that
>> > the crashing process executed in (i.e. the pipe reader should execute as 
>> > though
>> > the crashing process forked it).
>> 
>> Yes, and we probably want to change pid_ns as well. But afaics currently
>> this is not possible, even setns can't do this.

The code for setns to change the pid namespace just merged.

Oleg I copied you on that code when I put it up for review.  Did I use
the wrong email address?

>> BTW. Of course this is subjective, but personally I think that "||"
>> looks strange. Perhaps it would be better to add something like
>> --croot argument?
>> 
> The || is ambiguous with its simmilarity to a shell 'or' command, but I don't
> think the --croot argument is much better on that front, as that then becomes
> ambiguous with arguments supplied to the pipe reader directly.  The token 
> should
> be leading the pipe_reader string in core_pattern to indicate a change in
> environment independent of the executable path.  Perhaps |^ or something
> simmilar?

I failed to send my earlier reply but there is another problem with the
approach of only having one global core dump pattern.  You can't set it
per container.  Which means a special character to switch namespeces
while a reasonable solution (and arguably unnecessary solution) is not a
complete solution.

> Either way, Andrew, could you please drop this patch?  Olegs comments I think
> make it pretty clear I've got some more work to do on this.

If we just want one pattern we should be able to to robustly implement
this in userspace with the existing functionality.  With the caveat that
we need to get some pid namespace and user namespace bugs in the core
pattern generation fixed.  But we need to fix those bugs anyway.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Neil Horman
On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
> On 12/17, Neil Horman wrote:
> >
> > On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
> > >
> > > > Is there a way to switch all namespaces, except for the pid
> > > > namespace?
> > >
> > > Which exactly namespaces you want to change?
> > >
> > Ideally, I want the pipe reader process to execute in the same namespaces 
> > that
> > the crashing process executed in (i.e. the pipe reader should execute as 
> > though
> > the crashing process forked it).
> 
> Yes, and we probably want to change pid_ns as well. But afaics currently
> this is not possible, even setns can't do this.
> 
> I am starting to think that in this case, perhaps, do_coredump() should
> not use call_usermode_helper() at all. Perhaps we can do clone(CLONE_VM) +
> commit_creds/restore_root/etc + kernel_execve.
> 
Yeah, I was comming to this same conclusion last night as well.  I'd rather keep
using the call_usermode_helper solution if at all possible, but perhaps we can
integrate a clone path into it.

> > > To be honest, I do not understand this patch at all. It seems that
> > > you need to do something like sys_setns(). But if we do this, then
> > > why we can't make core_pattern per-namespace?
> > >
> > That actually would make sense, although we can't really use setns 
> > directly, as
> > I don't think we want to open file descriptors to do this manipulation in 
> > the
> > kernel.
> 
> Yes, yes, sure. But this is solveable. We do not really need to open
> the files in /proc, we could use proc_ns_operations->install() directly.
> Although this is not pretty.
> 
Its not pretty, no.  If we use the above clone solution however, we can avoid
this entirely.

> > Perhaps its best just to restrict this patch to adjusting the root fs 
> > location
> > for the chroot case.
> 
> Probably... at least for the start.
> 
> BTW. Of course this is subjective, but personally I think that "||"
> looks strange. Perhaps it would be better to add something like
> --croot argument?
> 
The || is ambiguous with its simmilarity to a shell 'or' command, but I don't
think the --croot argument is much better on that front, as that then becomes
ambiguous with arguments supplied to the pipe reader directly.  The token should
be leading the pipe_reader string in core_pattern to indicate a change in
environment independent of the executable path.  Perhaps |^ or something
simmilar?

Either way, Andrew, could you please drop this patch?  Olegs comments I think
make it pretty clear I've got some more work to do on this.

Thanks!
Neil

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Oleg Nesterov
On 12/17, Neil Horman wrote:
>
> On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
> >
> > > Is there a way to switch all namespaces, except for the pid
> > > namespace?
> >
> > Which exactly namespaces you want to change?
> >
> Ideally, I want the pipe reader process to execute in the same namespaces that
> the crashing process executed in (i.e. the pipe reader should execute as 
> though
> the crashing process forked it).

Yes, and we probably want to change pid_ns as well. But afaics currently
this is not possible, even setns can't do this.

I am starting to think that in this case, perhaps, do_coredump() should
not use call_usermode_helper() at all. Perhaps we can do clone(CLONE_VM) +
commit_creds/restore_root/etc + kernel_execve.

> > To be honest, I do not understand this patch at all. It seems that
> > you need to do something like sys_setns(). But if we do this, then
> > why we can't make core_pattern per-namespace?
> >
> That actually would make sense, although we can't really use setns directly, 
> as
> I don't think we want to open file descriptors to do this manipulation in the
> kernel.

Yes, yes, sure. But this is solveable. We do not really need to open
the files in /proc, we could use proc_ns_operations->install() directly.
Although this is not pretty.

> Perhaps its best just to restrict this patch to adjusting the root fs location
> for the chroot case.

Probably... at least for the start.

BTW. Of course this is subjective, but personally I think that "||"
looks strange. Perhaps it would be better to add something like
--croot argument?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Oleg Nesterov
On 12/17, Neil Horman wrote:

 On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
 
   Is there a way to switch all namespaces, except for the pid
   namespace?
 
  Which exactly namespaces you want to change?
 
 Ideally, I want the pipe reader process to execute in the same namespaces that
 the crashing process executed in (i.e. the pipe reader should execute as 
 though
 the crashing process forked it).

Yes, and we probably want to change pid_ns as well. But afaics currently
this is not possible, even setns can't do this.

I am starting to think that in this case, perhaps, do_coredump() should
not use call_usermode_helper() at all. Perhaps we can do clone(CLONE_VM) +
commit_creds/restore_root/etc + kernel_execve.

  To be honest, I do not understand this patch at all. It seems that
  you need to do something like sys_setns(). But if we do this, then
  why we can't make core_pattern per-namespace?
 
 That actually would make sense, although we can't really use setns directly, 
 as
 I don't think we want to open file descriptors to do this manipulation in the
 kernel.

Yes, yes, sure. But this is solveable. We do not really need to open
the files in /proc, we could use proc_ns_operations-install() directly.
Although this is not pretty.

 Perhaps its best just to restrict this patch to adjusting the root fs location
 for the chroot case.

Probably... at least for the start.

BTW. Of course this is subjective, but personally I think that ||
looks strange. Perhaps it would be better to add something like
--croot argument?

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Neil Horman
On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
 On 12/17, Neil Horman wrote:
 
  On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
  
Is there a way to switch all namespaces, except for the pid
namespace?
  
   Which exactly namespaces you want to change?
  
  Ideally, I want the pipe reader process to execute in the same namespaces 
  that
  the crashing process executed in (i.e. the pipe reader should execute as 
  though
  the crashing process forked it).
 
 Yes, and we probably want to change pid_ns as well. But afaics currently
 this is not possible, even setns can't do this.
 
 I am starting to think that in this case, perhaps, do_coredump() should
 not use call_usermode_helper() at all. Perhaps we can do clone(CLONE_VM) +
 commit_creds/restore_root/etc + kernel_execve.
 
Yeah, I was comming to this same conclusion last night as well.  I'd rather keep
using the call_usermode_helper solution if at all possible, but perhaps we can
integrate a clone path into it.

   To be honest, I do not understand this patch at all. It seems that
   you need to do something like sys_setns(). But if we do this, then
   why we can't make core_pattern per-namespace?
  
  That actually would make sense, although we can't really use setns 
  directly, as
  I don't think we want to open file descriptors to do this manipulation in 
  the
  kernel.
 
 Yes, yes, sure. But this is solveable. We do not really need to open
 the files in /proc, we could use proc_ns_operations-install() directly.
 Although this is not pretty.
 
Its not pretty, no.  If we use the above clone solution however, we can avoid
this entirely.

  Perhaps its best just to restrict this patch to adjusting the root fs 
  location
  for the chroot case.
 
 Probably... at least for the start.
 
 BTW. Of course this is subjective, but personally I think that ||
 looks strange. Perhaps it would be better to add something like
 --croot argument?
 
The || is ambiguous with its simmilarity to a shell 'or' command, but I don't
think the --croot argument is much better on that front, as that then becomes
ambiguous with arguments supplied to the pipe reader directly.  The token should
be leading the pipe_reader string in core_pattern to indicate a change in
environment independent of the executable path.  Perhaps |^ or something
simmilar?

Either way, Andrew, could you please drop this patch?  Olegs comments I think
make it pretty clear I've got some more work to do on this.

Thanks!
Neil

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Eric W. Biederman
Neil Horman nhor...@tuxdriver.com writes:

 On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
 On 12/17, Neil Horman wrote:
 
  On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
  
Is there a way to switch all namespaces, except for the pid
namespace?
  
   Which exactly namespaces you want to change?
  
  Ideally, I want the pipe reader process to execute in the same namespaces 
  that
  the crashing process executed in (i.e. the pipe reader should execute as 
  though
  the crashing process forked it).
 
 Yes, and we probably want to change pid_ns as well. But afaics currently
 this is not possible, even setns can't do this.

The code for setns to change the pid namespace just merged.

Oleg I copied you on that code when I put it up for review.  Did I use
the wrong email address?

 BTW. Of course this is subjective, but personally I think that ||
 looks strange. Perhaps it would be better to add something like
 --croot argument?
 
 The || is ambiguous with its simmilarity to a shell 'or' command, but I don't
 think the --croot argument is much better on that front, as that then becomes
 ambiguous with arguments supplied to the pipe reader directly.  The token 
 should
 be leading the pipe_reader string in core_pattern to indicate a change in
 environment independent of the executable path.  Perhaps |^ or something
 simmilar?

I failed to send my earlier reply but there is another problem with the
approach of only having one global core dump pattern.  You can't set it
per container.  Which means a special character to switch namespeces
while a reasonable solution (and arguably unnecessary solution) is not a
complete solution.

 Either way, Andrew, could you please drop this patch?  Olegs comments I think
 make it pretty clear I've got some more work to do on this.

If we just want one pattern we should be able to to robustly implement
this in userspace with the existing functionality.  With the caveat that
we need to get some pid namespace and user namespace bugs in the core
pattern generation fixed.  But we need to fix those bugs anyway.

Eric
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Neil Horman
On Tue, Dec 18, 2012 at 12:45:18PM -0800, Eric W. Biederman wrote:
 Neil Horman nhor...@tuxdriver.com writes:
 
  On Tue, Dec 18, 2012 at 09:06:04PM +0100, Oleg Nesterov wrote:
  On 12/17, Neil Horman wrote:
  
   On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
   
 Is there a way to switch all namespaces, except for the pid
 namespace?
   
Which exactly namespaces you want to change?
   
   Ideally, I want the pipe reader process to execute in the same 
   namespaces that
   the crashing process executed in (i.e. the pipe reader should execute as 
   though
   the crashing process forked it).
  
  Yes, and we probably want to change pid_ns as well. But afaics currently
  this is not possible, even setns can't do this.
 
 The code for setns to change the pid namespace just merged.
 
Can you post a link to the merge commit for reference so I can take a look at
it?

 Oleg I copied you on that code when I put it up for review.  Did I use
 the wrong email address?
 
  BTW. Of course this is subjective, but personally I think that ||
  looks strange. Perhaps it would be better to add something like
  --croot argument?
  
  The || is ambiguous with its simmilarity to a shell 'or' command, but I 
  don't
  think the --croot argument is much better on that front, as that then 
  becomes
  ambiguous with arguments supplied to the pipe reader directly.  The token 
  should
  be leading the pipe_reader string in core_pattern to indicate a change in
  environment independent of the executable path.  Perhaps |^ or something
  simmilar?
 
 I failed to send my earlier reply but there is another problem with the
 approach of only having one global core dump pattern.  You can't set it
 per container.  Which means a special character to switch namespeces
 while a reasonable solution (and arguably unnecessary solution) is not a
 complete solution.
 
  Either way, Andrew, could you please drop this patch?  Olegs comments I 
  think
  make it pretty clear I've got some more work to do on this.
 
 If we just want one pattern we should be able to to robustly implement
 this in userspace with the existing functionality.  With the caveat that
 we need to get some pid namespace and user namespace bugs in the core
 pattern generation fixed.  But we need to fix those bugs anyway.
 
Then perhaps the right thing to do here is in fact just make core_pattern a
per-namespace sysctl.  I only took a brief look, but I was unable to find an
example of such a per-namespace systctl.  Do we already have the infrastructure
to do such a thing?  I didn't think we did.

Thanks!
Neil

 Eric
 
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-18 Thread Eric W. Biederman
Neil Horman nhor...@tuxdriver.com writes:

 On Tue, Dec 18, 2012 at 12:45:18PM -0800, Eric W. Biederman wrote:

 The code for setns to change the pid namespace just merged.
 
 Can you post a link to the merge commit for reference so I can take a look at
 it?

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=6a2b60b17b3e48a418695a94bd2420f6ab32e519

 If we just want one pattern we should be able to to robustly implement
 this in userspace with the existing functionality.  With the caveat that
 we need to get some pid namespace and user namespace bugs in the core
 pattern generation fixed.  But we need to fix those bugs anyway.
 
 Then perhaps the right thing to do here is in fact just make core_pattern a
 per-namespace sysctl.  I only took a brief look, but I was unable to find an
 example of such a per-namespace systctl.  Do we already have the 
 infrastructure
 to do such a thing?  I didn't think we did.

We do have the infrastructure for a per namespace sysctls.  Right now we
only have per network namespace sysctls.  It is on my wish list to use
the infrastructure a little more extensively and convert /proc/sys into
a symlink to /proc/pid/sys and reduce the amount of magic in /proc for
sysctls.

We also have per namespace sysctls that do magic based upon current.
Since that pattern is has more magic I don't recommend it over the long
term.

Of course there is the question which namespace the sysctl should be
tied to, and what the other namespaces should be set to.  Shrug.

Eric
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-17 Thread Neil Horman
On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
> On 12/17, Neil Horman wrote:
> >
> > On Mon, Dec 17, 2012 at 01:34:28PM +0100, Oleg Nesterov wrote:
> > > @@ -455,6 +468,14 @@ static int umh_pipe_setup(struct subproc
> > >   /* and disallow core files too */
> > >   current->signal->rlim[RLIMIT_CORE] = (struct rlimit){1, 1};
> > >
> > > +
> > > + if (cp->switch_ns) {
> > > + get_fs_root(cp->cprocess->fs, );
> > > + set_fs_root(current->fs, );
> > > + switch_task_namespaces(current, cp->cprocess->nsproxy);
> > >
> > > How? You can't simply change ->nsproxy this way.
> > >
> > Why not?  This is exactly how fork, exit, and setns use this call.
> 
> No. exit() does switch_task_namespaces(NULL), this is different.
> fork() doesn't do this, and unshare/setns carefully creates the new ns.
> 
> > > If nothing else this breaks sys_getpid(), no?
> > >
> > hmm, I think you're inferring here that there is a chance that a pid 
> > allocated
> > in the init namespace might conflict with another process who holds the 
> > same pid
> > in another namespace?
> 
> No, I meant that sys_getpid() should always return 0 after this
> switch_task_namespaces() if the coredumping task is not from the root
> namespace.
> 
> > Is there a way to switch all namespaces, except for the pid
> > namespace?
> 
> Which exactly namespaces you want to change?
> 
Ideally, I want the pipe reader process to execute in the same namespaces that
the crashing process executed in (i.e. the pipe reader should execute as though
the crashing process forked it).

> To be honest, I do not understand this patch at all. It seems that
> you need to do something like sys_setns(). But if we do this, then
> why we can't make core_pattern per-namespace?
> 
That actually would make sense, although we can't really use setns directly, as
I don't think we want to open file descriptors to do this manipulation in the
kernel.


The origional goal of this patch was to allow for the core_pattern pipe reader
to execute using the same root fs point that the crashing process did (the idea
being that, if a process crashed, the pipe reader should execute in the same
environment.  E.g. if a container was running a process that crashed, and the
pipe reader was going to save the core to /tmp, it should be in the chrooted
/tmp, not the global /tmp.  But it occured to me that switching all the
namespaces is really whats needed here.  Although after this conversation, that
seems far more difficult than I origionally thought.

Perhaps its best just to restrict this patch to adjusting the root fs location
for the chroot case.  Or would it be better to iterate over the setns-able
namespaces and migrate the pipe helper to each of them from umh_pipe_setup.

Thaoughts?
Neil


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-17 Thread Oleg Nesterov
On 12/17, Neil Horman wrote:
>
> On Mon, Dec 17, 2012 at 01:34:28PM +0100, Oleg Nesterov wrote:
> > @@ -455,6 +468,14 @@ static int umh_pipe_setup(struct subproc
> > /* and disallow core files too */
> > current->signal->rlim[RLIMIT_CORE] = (struct rlimit){1, 1};
> >
> > +
> > +   if (cp->switch_ns) {
> > +   get_fs_root(cp->cprocess->fs, );
> > +   set_fs_root(current->fs, );
> > +   switch_task_namespaces(current, cp->cprocess->nsproxy);
> >
> > How? You can't simply change ->nsproxy this way.
> >
> Why not?  This is exactly how fork, exit, and setns use this call.

No. exit() does switch_task_namespaces(NULL), this is different.
fork() doesn't do this, and unshare/setns carefully creates the new ns.

> > If nothing else this breaks sys_getpid(), no?
> >
> hmm, I think you're inferring here that there is a chance that a pid allocated
> in the init namespace might conflict with another process who holds the same 
> pid
> in another namespace?

No, I meant that sys_getpid() should always return 0 after this
switch_task_namespaces() if the coredumping task is not from the root
namespace.

> Is there a way to switch all namespaces, except for the pid
> namespace?

Which exactly namespaces you want to change?

To be honest, I do not understand this patch at all. It seems that
you need to do something like sys_setns(). But if we do this, then
why we can't make core_pattern per-namespace?

Anyway, please ask Pavel and Eric, they should know better ;)

> > And a lot more problems, afaics. For example, this thread can continue
> > to run after, say, this cprocess->nsproxy->pid_ns was already destroyed.
> > zap_pid_ns_processes() obviously won't see this thread.
> >
> Hmm, I don't think so.  The crashing process won't exit until the pipe reader 
> is
> done, so the reference on the namespace should never decrement to zero.
>
> Actually I take that back.  switch_task_namespaces doesn't add a ref count to
> the name space being switched to.  So if the pipe reader doesn't exit
> immediately after closing the pipe, it may live on after the namespace is
> destroyed.

Yes,

> It would seem a get_nsproxy call is needed here to hold an
> additional reference.  Or do you think more is necessecary?

This can only pin ->nsproxy itself, this is not enough iirc.

Note that the exiting sub-init assumes that nobody else can use
ns->proc_mnt after zap_pid_ns_processes().

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-17 Thread Neil Horman
On Mon, Dec 17, 2012 at 01:34:28PM +0100, Oleg Nesterov wrote:
> @@ -455,6 +468,14 @@ static int umh_pipe_setup(struct subproc
>   /* and disallow core files too */
>   current->signal->rlim[RLIMIT_CORE] = (struct rlimit){1, 1};
>  
> +
> + if (cp->switch_ns) {
> + get_fs_root(cp->cprocess->fs, );
> + set_fs_root(current->fs, );
> + switch_task_namespaces(current, cp->cprocess->nsproxy);
> 
> How? You can't simply change ->nsproxy this way.
> 
Why not?  This is exactly how fork, exit, and setns use this call.

> If nothing else this breaks sys_getpid(), no?
> 
hmm, I think you're inferring here that there is a chance that a pid allocated
in the init namespace might conflict with another process who holds the same pid
in another namespace?  Yes, hand't thought about that.  What do you propose we
do about this?  Is there a way to switch all namespaces, except for the pid
namespace?  Or do we need to modify the kthread and umh apis to allow for
namespace inheritance through the fork call?

> And a lot more problems, afaics. For example, this thread can continue
> to run after, say, this cprocess->nsproxy->pid_ns was already destroyed.
> zap_pid_ns_processes() obviously won't see this thread.
> 
Hmm, I don't think so.  The crashing process won't exit until the pipe reader is
done, so the reference on the namespace should never decrement to zero.

Actually I take that back.  switch_task_namespaces doesn't add a ref count to
the name space being switched to.  So if the pipe reader doesn't exit
immediately after closing the pipe, it may live on after the namespace is
destroyed.  It would seem a get_nsproxy call is needed here to hold an
additional reference.  Or do you think more is necessecary?

> Even ->nsproxy itself can go away. Just suppose that the coredumping
> task is the only process in this namespace (sub-init).
> 
Again, that shouldn't be a problem, should it?  As that process won't exit until
the pipe reader is done, save the condition above.

Neil

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-17 Thread Oleg Nesterov
@@ -455,6 +468,14 @@ static int umh_pipe_setup(struct subproc
/* and disallow core files too */
current->signal->rlim[RLIMIT_CORE] = (struct rlimit){1, 1};
 
+
+   if (cp->switch_ns) {
+   get_fs_root(cp->cprocess->fs, );
+   set_fs_root(current->fs, );
+   switch_task_namespaces(current, cp->cprocess->nsproxy);

How? You can't simply change ->nsproxy this way.

If nothing else this breaks sys_getpid(), no?

And a lot more problems, afaics. For example, this thread can continue
to run after, say, this cprocess->nsproxy->pid_ns was already destroyed.
zap_pid_ns_processes() obviously won't see this thread.

Even ->nsproxy itself can go away. Just suppose that the coredumping
task is the only process in this namespace (sub-init).

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-17 Thread Neil Horman
On Mon, Dec 17, 2012 at 05:04:08PM +0100, Oleg Nesterov wrote:
 On 12/17, Neil Horman wrote:
 
  On Mon, Dec 17, 2012 at 01:34:28PM +0100, Oleg Nesterov wrote:
   @@ -455,6 +468,14 @@ static int umh_pipe_setup(struct subproc
 /* and disallow core files too */
 current-signal-rlim[RLIMIT_CORE] = (struct rlimit){1, 1};
  
   +
   + if (cp-switch_ns) {
   + get_fs_root(cp-cprocess-fs, root);
   + set_fs_root(current-fs, root);
   + switch_task_namespaces(current, cp-cprocess-nsproxy);
  
   How? You can't simply change -nsproxy this way.
  
  Why not?  This is exactly how fork, exit, and setns use this call.
 
 No. exit() does switch_task_namespaces(NULL), this is different.
 fork() doesn't do this, and unshare/setns carefully creates the new ns.
 
   If nothing else this breaks sys_getpid(), no?
  
  hmm, I think you're inferring here that there is a chance that a pid 
  allocated
  in the init namespace might conflict with another process who holds the 
  same pid
  in another namespace?
 
 No, I meant that sys_getpid() should always return 0 after this
 switch_task_namespaces() if the coredumping task is not from the root
 namespace.
 
  Is there a way to switch all namespaces, except for the pid
  namespace?
 
 Which exactly namespaces you want to change?
 
Ideally, I want the pipe reader process to execute in the same namespaces that
the crashing process executed in (i.e. the pipe reader should execute as though
the crashing process forked it).

 To be honest, I do not understand this patch at all. It seems that
 you need to do something like sys_setns(). But if we do this, then
 why we can't make core_pattern per-namespace?
 
That actually would make sense, although we can't really use setns directly, as
I don't think we want to open file descriptors to do this manipulation in the
kernel.


The origional goal of this patch was to allow for the core_pattern pipe reader
to execute using the same root fs point that the crashing process did (the idea
being that, if a process crashed, the pipe reader should execute in the same
environment.  E.g. if a container was running a process that crashed, and the
pipe reader was going to save the core to /tmp, it should be in the chrooted
/tmp, not the global /tmp.  But it occured to me that switching all the
namespaces is really whats needed here.  Although after this conversation, that
seems far more difficult than I origionally thought.

Perhaps its best just to restrict this patch to adjusting the root fs location
for the chroot case.  Or would it be better to iterate over the setns-able
namespaces and migrate the pipe helper to each of them from umh_pipe_setup.

Thaoughts?
Neil


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-17 Thread Oleg Nesterov
@@ -455,6 +468,14 @@ static int umh_pipe_setup(struct subproc
/* and disallow core files too */
current-signal-rlim[RLIMIT_CORE] = (struct rlimit){1, 1};
 
+
+   if (cp-switch_ns) {
+   get_fs_root(cp-cprocess-fs, root);
+   set_fs_root(current-fs, root);
+   switch_task_namespaces(current, cp-cprocess-nsproxy);

How? You can't simply change -nsproxy this way.

If nothing else this breaks sys_getpid(), no?

And a lot more problems, afaics. For example, this thread can continue
to run after, say, this cprocess-nsproxy-pid_ns was already destroyed.
zap_pid_ns_processes() obviously won't see this thread.

Even -nsproxy itself can go away. Just suppose that the coredumping
task is the only process in this namespace (sub-init).

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-17 Thread Neil Horman
On Mon, Dec 17, 2012 at 01:34:28PM +0100, Oleg Nesterov wrote:
 @@ -455,6 +468,14 @@ static int umh_pipe_setup(struct subproc
   /* and disallow core files too */
   current-signal-rlim[RLIMIT_CORE] = (struct rlimit){1, 1};
  
 +
 + if (cp-switch_ns) {
 + get_fs_root(cp-cprocess-fs, root);
 + set_fs_root(current-fs, root);
 + switch_task_namespaces(current, cp-cprocess-nsproxy);
 
 How? You can't simply change -nsproxy this way.
 
Why not?  This is exactly how fork, exit, and setns use this call.

 If nothing else this breaks sys_getpid(), no?
 
hmm, I think you're inferring here that there is a chance that a pid allocated
in the init namespace might conflict with another process who holds the same pid
in another namespace?  Yes, hand't thought about that.  What do you propose we
do about this?  Is there a way to switch all namespaces, except for the pid
namespace?  Or do we need to modify the kthread and umh apis to allow for
namespace inheritance through the fork call?

 And a lot more problems, afaics. For example, this thread can continue
 to run after, say, this cprocess-nsproxy-pid_ns was already destroyed.
 zap_pid_ns_processes() obviously won't see this thread.
 
Hmm, I don't think so.  The crashing process won't exit until the pipe reader is
done, so the reference on the namespace should never decrement to zero.

Actually I take that back.  switch_task_namespaces doesn't add a ref count to
the name space being switched to.  So if the pipe reader doesn't exit
immediately after closing the pipe, it may live on after the namespace is
destroyed.  It would seem a get_nsproxy call is needed here to hold an
additional reference.  Or do you think more is necessecary?

 Even -nsproxy itself can go away. Just suppose that the coredumping
 task is the only process in this namespace (sub-init).
 
Again, that shouldn't be a problem, should it?  As that process won't exit until
the pipe reader is done, save the condition above.

Neil

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: + core_pattern-set-core-helpers-root-and-namespace-to-crashing-process .patch added to -mm tree

2012-12-17 Thread Oleg Nesterov
On 12/17, Neil Horman wrote:

 On Mon, Dec 17, 2012 at 01:34:28PM +0100, Oleg Nesterov wrote:
  @@ -455,6 +468,14 @@ static int umh_pipe_setup(struct subproc
  /* and disallow core files too */
  current-signal-rlim[RLIMIT_CORE] = (struct rlimit){1, 1};
 
  +
  +   if (cp-switch_ns) {
  +   get_fs_root(cp-cprocess-fs, root);
  +   set_fs_root(current-fs, root);
  +   switch_task_namespaces(current, cp-cprocess-nsproxy);
 
  How? You can't simply change -nsproxy this way.
 
 Why not?  This is exactly how fork, exit, and setns use this call.

No. exit() does switch_task_namespaces(NULL), this is different.
fork() doesn't do this, and unshare/setns carefully creates the new ns.

  If nothing else this breaks sys_getpid(), no?
 
 hmm, I think you're inferring here that there is a chance that a pid allocated
 in the init namespace might conflict with another process who holds the same 
 pid
 in another namespace?

No, I meant that sys_getpid() should always return 0 after this
switch_task_namespaces() if the coredumping task is not from the root
namespace.

 Is there a way to switch all namespaces, except for the pid
 namespace?

Which exactly namespaces you want to change?

To be honest, I do not understand this patch at all. It seems that
you need to do something like sys_setns(). But if we do this, then
why we can't make core_pattern per-namespace?

Anyway, please ask Pavel and Eric, they should know better ;)

  And a lot more problems, afaics. For example, this thread can continue
  to run after, say, this cprocess-nsproxy-pid_ns was already destroyed.
  zap_pid_ns_processes() obviously won't see this thread.
 
 Hmm, I don't think so.  The crashing process won't exit until the pipe reader 
 is
 done, so the reference on the namespace should never decrement to zero.

 Actually I take that back.  switch_task_namespaces doesn't add a ref count to
 the name space being switched to.  So if the pipe reader doesn't exit
 immediately after closing the pipe, it may live on after the namespace is
 destroyed.

Yes,

 It would seem a get_nsproxy call is needed here to hold an
 additional reference.  Or do you think more is necessecary?

This can only pin -nsproxy itself, this is not enough iirc.

Note that the exiting sub-init assumes that nobody else can use
ns-proc_mnt after zap_pid_ns_processes().

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/