Re: [Vserver] Casual, naïve implementation of namespace cleanup

Herbert Poetzl Wed, 03 Nov 2004 22:16:46 -0800

On Thu, Nov 04, 2004 at 11:24:34AM +1300, Sam Vilain wrote:
> OK.  Some observations on this thread;
> 
> >>>first, I would like to split up (a) into
> >>>
> >>>  (a1) 'vserver ... enter' and
> >>>  (a2) operating from the outside in the vserver
> >>
> >>ACK; (a2) is the real problem and required by tools
> >>like vrpm or vapt-get.
> 
> I think this simply isn't always going to be possible in the general
> case.  For instance, in my case, my /usr, /bin, /sbin, /lib are bind
> mounts from the shared OS partition into the per-vserver partition.
> 
> So, because we broke the `sanity condition' that all you need to do to
> enter a vserver is chroot + ~chcontext() with namespaces, you can't
> expect to be able to use commands like `rpm --root /vservers/foo', which
> rely on chroot() being the correct way to become a system based at that
> root.
> 
> Herbert Poetzl wrote:
> >>>1. get a new namespace
> >>>2. create the vfsmount (for example via --bind)
> >>>3. pivot_root (or similar, maybe new cmd?) to the vfsmount
> >>>4. cleanup the namespace (remove host stuff)
> >>>5. do all required/listed mounts inside that namespace
> >>>6. create the context
> 
> Would it solve anything by considering namespaces as wholly a property
> of the security context?


I guess not; usually folks use the tools to operate
on whatever we consider a vserver (given that the
appropriate tools are available)

> Why don't we do this already?  Perhaps it is the same situation as the
> IP chroot - it is useful to be able to enter an IP chroot without the
> context, and it is useful to be able to enter a context without the IP
> chroot.

yep, exactly, those 'features' can be used independently
and _are_ used independently for several applications;
we neither want nor need to lose this functionality

> However, unlike the IP chroot, these namespaces are dangerous things.
> If you have one lying around that you can't see, then you might not be
> able to unmount filesystems, which might mean that production systems
> have to be rebooted unnecessarily (or at least, all the processes
> stopped, which may as well be the same thing).

yes, that is correct, and I guess the next version 
(kernel wise) will add some namespace information
via proc for several reasons (like debugging, etc ..)
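
until then, one way to look for such a hidden namespace: each
process exposes the mount table of its own namespace as
/proc/<pid>/mounts, so scanning those files shows which processes
still hold a filesystem. a minimal sketch in Python; the function
names are made up for illustration, and reading other processes'
entries normally needs root:

```python
import glob
import os


def holders(device, tables):
    """Return the sorted pids whose mount table still lists `device`.

    `tables` maps pid -> text in /proc/mounts format
    (device mountpoint fstype options dump pass).
    """
    found = []
    for pid, text in tables.items():
        for line in text.splitlines():
            fields = line.split()
            if fields and fields[0] == device:
                found.append(pid)
                break  # one hit per process is enough
    return sorted(found)


def read_tables(proc="/proc"):
    """Collect the per-process mount tables; unreadable ones are skipped."""
    tables = {}
    for path in glob.glob(os.path.join(proc, "[0-9]*", "mounts")):
        pid = int(os.path.basename(os.path.dirname(path)))
        try:
            with open(path) as fh:
                tables[pid] = fh.read()
        except OSError:
            continue  # process exited in the meantime, or access denied
    return tables
```

holders(dev, read_tables()) then lists the processes to stop before
the umount can succeed; it only matches on the device field, so it
is a debugging aid, not a complete answer.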

> So the order would be something like:
> 
>  1. create the context, with new VFS namespace option.  The context is
>     not restricted in any way yet (it should even be able to see
>     processes in context 0, but that might be a bitch to make the same
>     thing work for the case where the starting context is not 0)
> 
>  2. do all required/listed mounts that need outside VFS access, like
>     bind mounting in other parts of the system, to places under the new
>     location.  Call this `fstab.host'
> 
>  3. create the vfsmount target via mount --rbind
> 
>  4. do all required/listed mounts that *don't* need outside VFS access,
>     ie `fstab.local'
> 
>  5. call vserver function to change the context into the new vfsroot.
>     This performs the cleanup in kernel space.  Ideally, bind mounts
>     from locations and device nodes *outside* the chroot have their
>     /proc/mounts entry cosmetically obscured, so that the `devices' do
>     not refer to filenames that don't exist within the context.
> 
>  6. perform IP root binding
> 
>  7. do all required/listed mounts that *don't* need outside VFS access,
>     nor outside network access - `fstab'
> 
>  8. drop the context's privileges, thereby completely entering it, and
>     start the init process.

that is how we (I?) imagined that it would happen,
and actually all pieces for that procedure should
already be there ...
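
the cosmetic part of step 5 is the one piece that is easy to show
in isolation. the sketch below is a userspace model in Python, not
the kernel code: it rewrites /proc/mounts text as a context rooted
at `root` could see it. the function names, the "none" placeholder
and the assumption that a bind mount shows its source path in the
device field are all mine:

```python
def inside(path, root):
    """True if `path` is `root` itself or lies below it."""
    if root == "/":
        return path.startswith("/")
    return path == root or path.startswith(root + "/")


def relative(path, root):
    """Strip the `root` prefix so the path is valid inside the context."""
    if root == "/":
        return path
    return path[len(root):] or "/"


def context_view(mounts_text, root, placeholder="none"):
    """Return the /proc/mounts lines as seen from inside `root`."""
    root = root.rstrip("/") or "/"
    view = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        device, mountpoint = fields[0], fields[1]
        if not inside(mountpoint, root):
            continue  # mounted outside the context: hide it entirely
        fields[1] = relative(mountpoint, root)
        if device.startswith("/"):
            if inside(device, root):
                fields[0] = relative(device, root)
            else:
                # source does not exist within the context: obscure it
                fields[0] = placeholder
        view.append(" ".join(fields))
    return view
```

for a context rooted at /vservers/foo, a bind mount of a host
directory onto /vservers/foo/usr would then show up as a mount on
/usr with the device field replaced by the placeholder.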

> Entering from the outside would be like:
> 
>  1. call vserver function to enter context.  This also moves you into
>     the correct namespace, but until you chroot(), you still have
>     outside VFS access by means of your processes' `/' and/or cwd.

this is what vcontext ... vnamespace does IIRC
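
the remaining outside access via '/' and cwd is closed by the usual
chdir + chroot sequence. a minimal sketch in Python, assuming root
privileges and a `root` path that is valid inside the namespace;
`enter_context` is a hypothetical callable standing in for whatever
moves the process into the context and namespace (it is not a real
tool interface), and the chdir/chroot parameters exist only so the
order can be tried without root:

```python
import os


def enter_root(root, enter_context=None, chdir=os.chdir, chroot=os.chroot):
    """Enter the context, then drop the leftover references to the host VFS."""
    if enter_context is not None:
        enter_context()  # hypothetical: migrate into context + namespace
    chdir(root)   # cwd now points inside the new root
    chroot(".")   # '/' now points there as well
    chdir("/")    # make cwd consistent with the new '/'
```

doing the chdir before the chroot matters: a cwd left outside the
new root would still be a way back to the host filesystem.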

> >well, with the help of the 'great kernel' we can 
> >actually do a lot of things ... we just need to
> >design a concept, then test and implement it ...
> 
> yep.  especially since we're still in `alpha' tools status, and so 
> Enrico doesn't need to hurt his head worrying about each new 0.30.19x
> release supporting every 1.9.x release :-)

I have absolutely no problem with drastic changes
in how the tools work with older devel releases
(we already have various issues between different
 tools and kernel versions, so I would not really
 care that much, especially as we are going to
 change a lot of things in the near future, like
 ngn networking and hopefully CoW links)

thanks for the input,
Herbert

> -- 
> Sam Vilain, sam /\T vilain |><>T net, PGP key ID: 0x05B52F13
> (include my PGP key ID in personal replies to avoid spam filtering)
> _______________________________________________
> Vserver mailing list
> [EMAIL PROTECTED]
> http://list.linux-vserver.org/mailman/listinfo/vserver