Hi,

I am not subscribed to this list, and it is not the forum where I usually discuss these issues. My input will be limited to specific technical issues that are relatively easy to straighten out.

This mail is about firmlinks, chroots and their security implications. It is not about firmlink and rm -fR.
Firmlink defeats chroot because, when the parent filesystem activates a passive translator, it gives the translator an initial execution environment that provides authority (namely, a root directory port) exceeding the authority of the creator of that passive translator setting. The creator, in the case of chroot+firmlink, is a process that was chrooted. We must be careful to distinguish users from processes here: a user might have access to files outside any given chroot, while a specific process owned by that user might not.

Focussing on this particular problem, any fix must implement the following invariant:

  The parent filesystem must not give the activated firmlink more authority than the creator of the passive translator setting possessed at creation time.

The current Hurd implementation violates this invariant by making the false assumption that all processes have access to the root of the bootstrap filesystem (or rather, whatever the root of the parent filesystem happens to be).

The problem statement raises the question of how the authority that a process possesses can be stored in the filesystem as serialized (textual), "persistent" data. In the Hurd, all authority is designated by Mach ports. Thus, to fulfill this invariant, we would have to somehow identify the Mach ports that the chrooted client process holds and store them as text. Even limiting this to filesystem authority, this is a hard problem.

Some people suggested that the chroot setting should somehow be stored in the passive translator setting. This would work *iff* there is a single global filesystem hierarchy in which all authority (all files and directories that can currently be open) can be named by a path name. Furthermore, these path names have to be stable under reboot and other operations, i.e., they must mean the same thing at creation time of the passive translator as at the time the translator is activated.
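As a small illustration of how demanding the naming condition is, here is a minimal Python sketch, assuming a POSIX system. It is not Hurd code; the function name open_then_unlink is made up for this example. It opens a temporary file, removes its only name, and then still reads the data through the open descriptor.

```python
import os
import tempfile


def open_then_unlink():
    """Return (bytes read via the descriptor, whether the path still exists)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still reachable")
        os.unlink(path)                    # remove the only name for the file
        name_exists = os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)       # rewind and read through the descriptor
        data = os.read(fd, 100)
    finally:
        os.close(fd)
    return data, name_exists


if __name__ == "__main__":
    print(open_then_unlink())
```

The descriptor plays the role of authority that is held but has no path name, so no path stored in a translator setting could describe it. The condition that all held authority can be named by a stable path is therefore a strong one.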
Even the most superficial investigation will quickly show that this condition does not hold for the Hurd, or even for Unix systems in general. I can open a file and then delete it, ending up with authority that cannot be named. I can use names in /tmp, which are not stable under reboot. In the Hurd, I can create filesystems that are detached from the "global" filesystem (which is then in fact not global at all), and run programs in their own filesystems that are not even loosely connected to each other. Storing the authority (== chroot setting) as path names plainly does not work in a Hurd system.

If you pursue this further (see [1]), you will find that one way to capture the maximum authority a process has is to make all kernel objects persistent (i.e., the kernel knows how to write them to disk and read them back after a reboot). With that feature, most authority can be re-established exactly (only device drivers, network handles and other external authority have to be dropped). But if you have that, passive translators are not needed anymore, because active translators just persist forever and do not need to be restarted. (You might point out that they need to be restarted after a crash: a crash monitor can do this transparently without filesystem support.)

In the absence of a way to reconstitute the authority of the creator, the only recourse is to err on the safe side and stick to the maximum authority we can prove the creator had. This set contains exactly the underlying node on which the passive translator was installed. So the safe fallback is to start the translator chroot'ed to its underlying node. Some translators can run happily that way (the pipe server and the auth server, for example). firmlink and other filesystems often rely on backing store that is not reachable through the underlying node, and they will not be functional under that restriction.

We found a secure solution, but we lost a lot of functionality at the same time. Not good.
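The trade-off can be sketched with a toy model. This is a Python illustration only, not Hurd code: authority is modelled as a plain set of node names, and every name below (PassiveSetting, activate_current, activate_safe and the example paths) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PassiveSetting:
    underlying_node: str          # node the passive setting is stored on
    creator_authority: frozenset  # what the creator could reach (NOT recorded on disk)
    needs: frozenset              # what the translator must reach to be useful


def activate_current(setting, parent_root_authority):
    """Model of the behaviour criticised above: hand over the parent's root."""
    return frozenset(parent_root_authority)


def activate_safe(setting):
    """Safe fallback: only the authority we can prove the creator had."""
    return frozenset({setting.underlying_node})


def respects_invariant(granted, setting):
    # The activated translator must not get more than its creator possessed.
    return granted <= setting.creator_authority


def functional(granted, setting):
    return setting.needs <= granted


# Hypothetical example: a process chrooted to /jail installs two settings.
JAIL = frozenset({"/jail", "/jail/link", "/jail/fifo"})
PARENT_ROOT = JAIL | {"/", "/etc/shadow"}
FIRMLINK = PassiveSetting("/jail/link", JAIL, frozenset({"/jail/link", "/etc/shadow"}))
PIPE = PassiveSetting("/jail/fifo", JAIL, frozenset({"/jail/fifo"}))

if __name__ == "__main__":
    for name, s in (("firmlink", FIRMLINK), ("pipe", PIPE)):
        cur, safe = activate_current(s, PARENT_ROOT), activate_safe(s)
        print(name, "current:", respects_invariant(cur, s), functional(cur, s))
        print(name, "safe:   ", respects_invariant(safe, s), functional(safe, s))
```

In this model the firmlink setting is either functional and in violation of the invariant, or safe and non-functional, while the pipe-like setting is fine under the safe fallback. The model deliberately ignores everything about ports and real name lookup.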
You might wish that there were a middle ground. But without making very restrictive, un-hurdish assumptions about how you structure your file systems, I don't see how to achieve that middle ground with passive translators. Transparent persistence has the nasty tendency to drag all objects of the system along with it, because of the many interdependencies between components.

Does this mean that passive translators are lost? Maybe, but we should remind ourselves that passive translators are *not* a goal, but a *mechanism* to achieve a goal. We should look at the function they provide and see if we can solve the problem they try to solve differently.

Passive translators are a poor man's persistence. They are a mechanism to recover state that was lost due to a system crash or reboot. There are many ways to recover such state. One is transparent orthogonal persistence, which is what is described above. Another is manual persistence. This means that the translator programs themselves know how to reconstitute their state. How can this work? Pretty much the same way that Unix daemons do this today: by startup scripts, configuration files, etc.

Here is a suggestion for a mechanism that fulfills most of the functions of passive translators and does not have the firmlink problem or similar problems. This solution does not use passive translators at all. Before reading on, you might train your design instinct by coming up with your own solution. So that you can compare it against mine, I will point out a specific design requirement below, and you should see if your suggestion addresses it. Found a solution that satisfies you? Read on.

The system, and each user, start a "translator server" at startup. This server attaches itself to a number of file nodes (the list is taken from a configuration file, for example).
Whenever a node it covers is accessed for the first time, it looks up the translator setting and other configuration (like chroot-ing, or whatever) for that node in its configuration file, and then starts an active translator with that configuration on that node. Thus, it replaces itself with the "real" active translator on that node, and all accesses (including the first) are then resolved on that new, actively translated node.

Hurd translators can (if not already, then at least in principle) be "stacked" on a node, so that the "translator server" is still running "beneath" the active translator it started. When that translator dies, for example to release all resources after a timeout, accesses go to the "translator server" again. This feature provides race-free transparent restart of translators, just as it exists with passive translators. (Does your solution also fulfill this requirement of atomic, transparent restart?)

With this proposal, passive translators are not used anymore.

Why does this fix the firmlink problem? When a chrooted program wants to install a firmlink translator, it can only start it as an active translator, which cannot extend its authority. The equivalent of installing a passive translator in the new system is to change the configuration file of the user's "translator server", something that the chrooted program presumably has no authority to do. It could start its own "translator server" (and would be encouraged to do so), but this "translator server" would not have excess authority either, which limits its possible actions.

The only feature of passive translators that this proposal does not provide is decentralized storage of the translator settings. You might see this as a disadvantage. On the other hand, the proposal decentralizes the *policy* of how to "activate" translators, and that is much more important in my opinion. The user can configure the root directory for each translator setting, for example.
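To make the mechanism concrete, here is a minimal single-process Python simulation of the "translator server" idea. It is a sketch under stated assumptions, not Hurd code: the classes Node, ActiveTranslator and TranslatorServer, the dictionary-shaped configuration and the example path are all invented for illustration. The translator stack on a node is modelled as a plain list whose last element answers accesses.

```python
class ActiveTranslator:
    """Stand-in for a running translator; remembers the root it was given."""

    def __init__(self, command, root):
        self.command = command
        self.root = root

    def handle(self, node, request):
        return f"{self.command} [root={self.root}] handled {request}"

    def exit(self, node):
        # E.g. after an idle timeout: the server beneath becomes visible again.
        node.stack.remove(self)


class Node:
    def __init__(self, name):
        self.name = name
        self.stack = []  # translators stacked on this node, topmost last

    def access(self, request):
        if not self.stack:
            raise LookupError(f"no translator on {self.name}")
        return self.stack[-1].handle(self, request)


class TranslatorServer:
    def __init__(self, config):
        self.config = config  # node name -> {"command": ..., "root": ...}
        self.starts = 0

    def attach(self, node):
        if node.name in self.config:
            node.stack.append(self)

    def handle(self, node, request):
        # First access (or first access after the translator exited):
        # start an active translator with the configured root, stack it on
        # top, and let it resolve this very request.
        setting = self.config[node.name]
        translator = ActiveTranslator(setting["command"], setting["root"])
        node.stack.append(translator)
        self.starts += 1
        return translator.handle(node, request)


if __name__ == "__main__":
    config = {"/home/u/link": {"command": "firmlink /home/u/data", "root": "/home/u"}}
    server, node = TranslatorServer(config), Node("/home/u/link")
    server.attach(node)
    print(node.access("lookup"))   # starts the translator
    node.stack[-1].exit(node)      # translator goes away
    print(node.access("lookup"))   # server restarts it transparently
```

Because the server stays at the bottom of the stack, an access never finds the node untranslated. The simulation is single-threaded, so it only hints at the race-free restart property and does not demonstrate atomicity.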
Note that it might still be difficult for the user to find good and secure settings. But at least a mechanism is available which allows correct behaviour and still admits useful functionality.

By the way: the above discussion settles the issue only within the framework of the current Hurd implementation. Notably, performance and stability considerations are not included. Such analysis is better done in the framework of the Hurd-NG project, which tries to tackle these issues systematically. But even in isolation, the above proposal as it stands should work well within the current Hurd-on-Mach framework, and from what I can tell at this point, the general approach seems "future proof", in the sense that it works within the context of Hurd-NG as well. Nevertheless, I have not thought about these issues for quite a while, so don't hold it against me if I decide that details have to change, or that a different strategy works better.

For a more extensive analysis of the firmlink problem and other related issues in the Hurd, see [1]. That paper does not include the above proposal, but it describes the problem extensively and relates it to other issues. It also explains how persistence can help to understand and fix such issues.

[1] A Critique of the GNU Hurd Multi-server Operating System with Marcus Brinkmann. ACM SIGOPS Operating Systems Review special issue on Secure Small-Kernel Systems, 41(3), July 2007.
    http://www.walfield.org/papers/200707-walfield-critique-of-the-GNU-Hurd.pdf

Thanks,
Marcus
