[lxc-devel] read-only container root

2010-02-15 Thread Michael Tokarev
I'm experimenting with a read-only root fs in the container.
So far it does not work.

First of all, when trying to start a container on a read-only root,
lxc-start complains:
  lxc-start: Read-only file system - can't make temporary mountpoint

This is in the conf.c:setup_rootfs_pivot_root() function.  That function
uses the optional parameter lxc.pivotdir, or creates (and later removes)
a temporary directory for pivot_root.  Obviously there's no way to
create a directory in a read-only filesystem.

But lxc.pivotdir does not work either.  In the function mentioned above
it is used with a leading dot (e.g. if I specify lxc.pivotdir=pivot in
the config file, the pivot_root() syscall is made to .pivot, with a
leading dot, not to pivot), but later on it is used without that dot,
and fails:

  lxc-start: No such file or directory - failed to open /pivot/proc/mounts
  lxc-start: No such file or directory - failed to read or parse mount list '/pivot/proc/mounts'
  lxc-start: failed to pivot_root to '/stage/t'

(that's with lxc.pivotdir = pivot in the config file).  After symlinking
pivot to .pivot it still fails:

  lxc-start: Device or resource busy - could not unmount old rootfs
  lxc-start: failed to pivot_root to '/stage/t'
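
The mismatch described above can be illustrated with a small sketch. This is not the lxc source: the two helper functions are hypothetical and only mimic the path handling as reported (a leading dot when the pivot directory is set up, no dot when the old root's mount list is read afterwards).

```python
import os.path


def pivot_target(rootfs: str, pivotdir: str) -> str:
    """Hypothetical: where the old root is parked, with the leading dot."""
    return os.path.join(rootfs, "." + pivotdir)


def old_root_mounts(pivotdir: str) -> str:
    """Hypothetical: the mount list path read after the pivot, without the dot."""
    return os.path.join("/", pivotdir, "proc", "mounts")


rootfs = "/stage/t"      # container root from the error messages
pivotdir = "pivot"       # lxc.pivotdir = pivot

created = pivot_target(rootfs, pivotdir)
looked_up = old_root_mounts(pivotdir)

# After pivot_root(), rootfs becomes "/", so the old root sits under "/.pivot".
visible_after_pivot = "/" + os.path.relpath(created, rootfs)

if __name__ == "__main__":
    print("old root visible at:", visible_after_pivot)
    print("mount list read from:", looked_up)
    print("paths agree:", looked_up.startswith(visible_after_pivot + "/"))
```

The two paths never agree, which would be consistent with the "failed to open /pivot/proc/mounts" error.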


Ok, so far so good.

Next thing is the /dev directory.  I prefer to have it on a tmpfs, for
several reasons (one is that the root is mounted with -o nodev), but that
fails too unless the directory is pre-populated:

  lxc-start: No such file or directory - failed to mount a new instance of '/dev/pts'
  lxc-start: failed to setup the new pts instance

That's when specifying:

   lxc.mount.entry = /dev dev tmpfs noexec,nosuid,mode=0755

in the config file.  That creates an empty directory for the container's
/dev, which is populated later in the startup script.

A similar thing happens when I pre-create dev/pts: it then fails to
bind-mount tty1..tty4.

So far it works by using a wrapper around lxc-start which mounts tmpfs
over dev, fills it with a bunch of standard entries, and executes lxc-start.
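
A minimal sketch of such a wrapper, under stated assumptions: the list of device nodes, the rootfs path and the container name "t" are illustrative and not taken from the actual wrapper. The function only builds the command lines, so they can be inspected without root; running them would need root privileges.

```python
import os
import shlex

# (name, major, minor, mode) for a few standard Linux character devices.
# Which entries the real wrapper creates is an assumption.
STANDARD_NODES = [
    ("null", 1, 3, "666"),
    ("zero", 1, 5, "666"),
    ("full", 1, 7, "666"),
    ("random", 1, 8, "666"),
    ("urandom", 1, 9, "666"),
    ("tty", 5, 0, "666"),
    ("console", 5, 1, "600"),
] + [("tty%d" % n, 4, n, "620") for n in range(1, 5)]  # tty1..tty4


def wrapper_commands(rootfs, name):
    """Return the argv lists a wrapper might run, in order."""
    dev = os.path.join(rootfs, "dev")
    cmds = [["mount", "-t", "tmpfs", "-o", "noexec,nosuid,mode=0755",
             "tmpfs", dev]]
    for node, major, minor, mode in STANDARD_NODES:
        cmds.append(["mknod", "-m", mode, os.path.join(dev, node),
                     "c", str(major), str(minor)])
    cmds.append(["mkdir", "-p", os.path.join(dev, "pts")])
    cmds.append(["lxc-start", "-n", name])  # finally start the container
    return cmds


if __name__ == "__main__":
    for argv in wrapper_commands("/stage/t", "t"):
        print(shlex.join(argv))
```

Building the commands separately from executing them keeps the sketch safe to run on a host; a real wrapper would pass each list to subprocess.run() or simply be a shell script.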

But this is really getting quite ugly.  And the only solution to all this
mess is to let the setup be performed by a shell script/command which is
called after forking the (filesystem) namespace but before entering the
container for real, or _instead_ of entering the container.  As was
discussed previously.

The whole mess started when I realized that bind-mounting the host's /dev
works perfectly _except_ for syslogging: /dev/log does not work with
multiple containers.  Only the container where syslogd was (re)started
last works; all the rest get ECONNREFUSED when trying to send any message
to /dev/log.

Comments?

Thanks!

/mjt

--
SOLARIS 10 is the OS for Data Centers - provides features such as DTrace,
Predictive Self Healing and Award Winning ZFS. Get Solaris 10 NOW
http://p.sf.net/sfu/solaris-dev2dev
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] read-only container root

2010-02-15 Thread Daniel Lezcano
Michael Tokarev wrote:
 I'm experimenting with a read-only root fs in the container.
 So far it does not work.

 First of all, when trying to start a container on a read-only root,
 lxc-start complains:
   lxc-start: Read-only file system - can't make temporary mountpoint

 This is in the conf.c:setup_rootfs_pivot_root() function.  That function
 uses the optional parameter lxc.pivotdir, or creates (and later removes)
 a temporary directory for pivot_root.  Obviously there's no way to
 create a directory in a read-only filesystem.
   
Why do you need to use a read-only root fs?

 But lxc.pivotdir does not work either.  In the function mentioned above
 it is used with a leading dot (e.g. if I specify lxc.pivotdir=pivot in
 the config file, the pivot_root() syscall is made to .pivot, with a
 leading dot, not to pivot), but later on it is used without that dot,
 and fails:

   lxc-start: No such file or directory - failed to open /pivot/proc/mounts
   lxc-start: No such file or directory - failed to read or parse mount list '/pivot/proc/mounts'
   lxc-start: failed to pivot_root to '/stage/t'

 (that's with lxc.pivotdir = pivot in the config file).  After symlinking
 pivot to .pivot it still fails:

   lxc-start: Device or resource busy - could not unmount old rootfs
   lxc-start: failed to pivot_root to '/stage/t'
   
It's a bug introduced with the pivot_root feature. Investigation is under way.

 Ok, so far so good.

 Next thing is the /dev directory.  I prefer to have it on a tmpfs, for
 several reasons (one is that the root is mounted with -o nodev), but that
 fails too unless the directory is pre-populated:

   lxc-start: No such file or directory - failed to mount a new instance of '/dev/pts'
   lxc-start: failed to setup the new pts instance

 That's when specifying:

    lxc.mount.entry = /dev dev tmpfs noexec,nosuid,mode=0755

 in the config file.  That creates an empty directory for the container's
 /dev, which is populated later in the startup script.

 A similar thing happens when I pre-create dev/pts: it then fails to
 bind-mount tty1..tty4.
   
OK, so what you need is to call a script between:

lxc.mount.entry = /dev dev tmpfs noexec,nosuid,mode=0755

...
lxc.tty = 4

where the script will populate /dev, right?

Hmm, not obvious.

 So far it works by using a wrapper around lxc-start which mounts tmpfs
 over dev, fills it with a bunch of standard entries, and executes lxc-start.

 But this is really getting quite ugly.  And the only solution to all this
 mess is to let the setup be performed by a shell script/command which is
 called after forking the (filesystem) namespace but before entering the
 container for real, or _instead_ of entering the container.  As was
 discussed previously.
   

What about an lxc.script configuration line which calls a script at the
point where it appears in the configuration file?

 The whole mess started when I realized that bind-mounting the host's /dev
 works perfectly _except_ for syslogging: /dev/log does not work with
 multiple containers.  Only the container where syslogd was (re)started
 last works; all the rest get ECONNREFUSED when trying to send any message
 to /dev/log.
   
/dev/log is an AF_UNIX socket; the network is isolated, and AF_UNIX
sockets belong to the network namespace.
Most likely syslogd unlinks /dev/log, creates it again and binds it.
So, as /dev is shared between the containers, the last one gets the socket.
Any process outside of that container trying to access this socket won't
be able to.
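
The "last one gets the socket" part can be reproduced without containers. The sketch below is an illustration only: it uses a temporary directory as a stand-in for the shared /dev, and bind_log_socket() is a hypothetical helper that mimics a syslogd (re)start. Everything runs in a single network namespace, so it does not reproduce the ECONNREFUSED seen across containers; it only shows that the earlier listener stops receiving once the path is unlinked and bound again.

```python
import os
import socket
import tempfile


def bind_log_socket(path):
    """Mimic a syslogd (re)start: unlink the old socket file, bind a new one."""
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(path)
    sock.setblocking(False)
    return sock


def demo():
    """Return (data seen by first listener, data seen by second listener)."""
    with tempfile.TemporaryDirectory() as shared_dev:
        path = os.path.join(shared_dev, "log")  # stand-in for /dev/log
        first = bind_log_socket(path)           # syslogd of container A
        second = bind_log_socket(path)          # syslogd of container B, later

        client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        client.sendto(b"hello", path)           # what a logging client would do
        client.close()

        got_second = second.recv(1024)
        try:
            got_first = first.recv(1024)
        except BlockingIOError:
            got_first = None                    # orphaned: nothing arrives
        first.close()
        second.close()
        return got_first, got_second


if __name__ == "__main__":
    print(demo())
```

The first socket stays open but is no longer reachable through the path, which matches the observation that only the most recently (re)started syslogd receives messages.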


