On Mon, Oct 14, 2013 at 1:01 PM, Richard Yao <[email protected]> wrote:
>
> 1. What are mount namespaces? How do they integrate with the kernel?
> 2. What does systemd do with them? What does systemd's use of them
> provide to users?
>
> Saying to google "per-process namespaces" does not really answer that.
> Per-process namespaces provide a means to isolate processes into
> containers that they have their own pid numbers and can neither nor
> interact with processes outside of the container via traditional IPC
> mechanisms such as signals. It is similar to the concept of FreeBSD
> jails. That does not tell me what a "mount namespace is" or why systemd
> has anything to do with it.
>

You're describing a process (PID) namespace, which is only one type of
namespace. All namespaces are "per-process," but process namespaces are
just one type of per-process namespace. Confused yet?

All processes within the same mount namespace see the same set of
mounted filesystems. If I run mount /dev/cdrom /mnt/cdrom in one
process, then all processes in the same namespace will see it mounted.
However, processes in another namespace will NOT see the new mount.

To illustrate, if you are on Linux with util-linux installed, launch two
root shells, and in one execute:

    mkdir /tmp/foo
    touch /tmp/foo/a
    unshare -m /bin/bash
    mount -t tmpfs none /tmp/foo
    touch /tmp/foo/b
    ls /tmp/foo

Then run ls /tmp/foo in your other shell. The two shells will list
different contents for the same path, because the tmpfs mounted in the
separate namespace created by unshare is not visible to any process
outside that namespace.

To clean up, run umount /tmp/foo within the namespace and then exit (I
have no idea if it is possible to unmount the tmpfs if you exit first,
or if the kernel does it for you).

The possibilities are endless. You could mount an encrypted home for a
user and make it visible only to that user. Containers are an obvious
way to use them.

Systemd lets you configure daemons to have restricted access to the
filesystem as well - either read-only, or not at all - by directory. I
assume it just clones the mount namespace, and then sets up bind-mounts
to implement this before dropping root and launching the process.

Rich
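
One way to confirm that the two shells in the example really ended up in
different mount namespaces is to compare the namespace links under
/proc. This is only a rough sketch: it assumes a kernel new enough
(3.8 or later) to expose /proc/<pid>/ns/mnt as a symlink whose target
looks like "mnt:[4026531840]", and the function names ns_inode,
mount_ns_of and same_mount_ns are made up for illustration. Reading
another process's link needs the same user or root.

```shell
#!/bin/sh
# Sketch: compare the mount namespaces of two processes.
# Assumption: /proc/<pid>/ns/mnt is a symlink with a target such as
# "mnt:[4026531840]". All function names here are hypothetical.

# Extract the inode number from a link target like "mnt:[4026531840]".
# Prints nothing if the string does not have that shape.
ns_inode() {
    printf '%s\n' "$1" | sed -n 's/^[a-z_]*:\[\([0-9][0-9]*\)]$/\1/p'
}

# Print the mount-namespace inode of the given PID (empty on failure).
mount_ns_of() {
    ns_inode "$(readlink "/proc/$1/ns/mnt")"
}

# Succeed only if both PIDs resolve to the same mount namespace.
same_mount_ns() {
    a=$(mount_ns_of "$1") && b=$(mount_ns_of "$2") &&
        [ -n "$a" ] && [ "$a" = "$b" ]
}
```

With those functions loaded in the shell outside the namespace, and the
PID of the unshare'd bash substituted for OTHER_PID, something like

    same_mount_ns $$ OTHER_PID && echo shared || echo separate

should distinguish the two cases.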
