firmlinks and chroots

Marcus Brinkmann Tue, 11 Sep 2007 23:49:09 -0700

Hi,

I am not subscribed to this list.  It is not the forum where I usually
discuss these issues.  My input will be limited to specific technical
issues where it is relatively easy to straighten them out.  This mail
is about firmlinks and chroots and security implications.  It is not
about firmlink and rm -fR.


Firmlink defeats chroot, because when the parent filesystem activates
a passive translator it gives it an initial execution environment
which provides authority (namely: a root directory port) that exceeds
that of the creator of that passive translator setting.  The creator,
in the case of chroot+firmlink, is a process that was chrooted.  We must be
careful to distinguish users from processes in this case.  A user
might have access to files outside any given chroot, while a specific
process owned by that user might not.

Focussing on this particular problem, any fix must implement the
following invariant:

  The parent filesystem must not give the activated firmlink more
  authority than the creator of the passive translator setting
  possessed at creation time.

The current Hurd implementation violates this invariant by making the
false assumption that all processes have access to the root of the
bootstrap filesystem (or rather: whatever the root of the parent
filesystem happens to be).

The problem statement raises the question of how the authority that a
process possesses can be stored in the filesystem as serialized
(textual), "persistent", data.  In the Hurd, all authority is
designated by Mach ports.  Thus, to fulfill this invariant, we would
have to somehow identify and store as text data the Mach ports that
the chrooted client process has.  Even limiting this to filesystem
authority, this is a hard problem.

Some people suggested that somehow the chroot setting should be stored
in the passive translator setting.  This would work, *iff* there is a
single global filesystem hierarchy in which all authority (all files
and directories that can be currently open) can be named by a path
name.  Furthermore, these path names have to be stable under reboot
and other operations, i.e., they must mean the same thing at creation
time of the passive translator as at the time the translator is
activated.  Even the most superficial investigation will quickly show
that this is not at all true for the Hurd, nor even for Unix systems
in general.  I can open a file and then delete it, ending up with
authority that cannot be named.  I can use names in /tmp, which are
not stable under reboot.  In the Hurd, I can create filesystems which
are detached from the "global" filesystem (which is in fact not global
at all then), and run programs in their own filesystems which are not
even loosely connected to each other.  Storing the authority (==
chroot setting) as path names in a Hurd system plainly does not work.
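
The "open a file and then delete it" case can be reproduced on any
POSIX system.  The following is only a minimal sketch in Python, not
Hurd code; the helper name open_then_unlink is made up for this
illustration, and it assumes POSIX unlink semantics (removing the name
of an open file is allowed).  After the call, the process still holds
working authority over the file through its descriptor, but there is
no path name left that could be written into a translator setting.

```python
import os
import tempfile


def open_then_unlink(payload: bytes) -> tuple[int, str]:
    """Create a file, keep a descriptor to it, then remove its only name.

    Hypothetical helper for illustration; assumes POSIX unlink semantics.
    """
    fd, path = tempfile.mkstemp()
    os.write(fd, payload)
    os.unlink(path)  # the name is gone, the open descriptor is not
    return fd, path


fd, path = open_then_unlink(b"still reachable")

# The descriptor still designates the file, so reading works.
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 100)

# The old path no longer names anything.
name_exists = os.path.exists(path)

os.close(fd)
print(data, name_exists)
```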

If you pursue this further (see [1]), you will find out that one way
to capture the maximum authority a process has is by making all kernel
objects persistent (i.e., the kernel knows how to write them to disk and
read them back after a reboot).  With that feature, most authority can
be re-established exactly (only device drivers, network handles and
other external authority have to be dropped).  But if you have that,
passive translators are not needed anymore, because active translators
just persist forever, and do not need to be restarted (you might point
out that they need to be restarted on a crash: a crash monitor can do
this transparently without filesystem support).

In the absence of a way to reconstitute the authority of the creator, the
only recourse is to err on the safe side and stick to the maximum
authority we can prove the creator had.  This set contains exactly the
underlying node on which the passive translator was installed.  So,
the safe fallback is to start the translator chroot'ed to its
underlying node.  Some translators can run happily that way (pipe
server and auth server for example).  firmlink and other filesystems
often rely on backing store not reachable through the underlying node.
These won't be functional with that restriction.  We found a secure
solution, but we lost a lot of functionality at the same time.  Not
good.
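
The invariant and the safe fallback can be illustrated with a toy
model.  This is a sketch under loud assumptions, not a description of
the Hurd: authority is modelled as a plain set of node labels, the
labels happen to look like path names only for readability, and all
names (UNIVERSE, activate_current, activate_safe, and so on) are
invented for this example.  It shows that handing the translator the
parent filesystem's root violates the invariant, while handing it only
the underlying node satisfies the invariant but leaves a firmlink
unable to reach its target.

```python
def reachable(root: str, universe: set[str]) -> set[str]:
    """All node labels at or below `root` (toy stand-in for a root port)."""
    prefix = root.rstrip("/") + "/"
    return {p for p in universe if p == root or p.startswith(prefix)}


# Hypothetical set of nodes; labels are for readability only.
UNIVERSE = {
    "/", "/etc", "/etc/shadow",
    "/jail", "/jail/home", "/jail/home/link",
}


def creator_authority(chroot: str) -> set[str]:
    """Authority of the chrooted process that set the passive translator."""
    return reachable(chroot, UNIVERSE)


def activate_current(parent_fs_root: str) -> set[str]:
    """Model of the criticised behaviour: grant the parent fs root."""
    return reachable(parent_fs_root, UNIVERSE)


def activate_safe(underlying_node: str) -> set[str]:
    """Model of the fallback: grant only the underlying node."""
    return reachable(underlying_node, UNIVERSE)


def satisfies_invariant(granted: set[str], creator: set[str]) -> bool:
    """The translator must not get more authority than its creator had."""
    return granted <= creator


creator = creator_authority("/jail")
current = activate_current("/")
safe = activate_safe("/jail/home/link")

print(satisfies_invariant(current, creator))  # violated: "/etc/shadow" leaks
print(satisfies_invariant(safe, creator))     # holds
print("/jail/home" in safe)                   # firmlink target out of reach
```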

You might wish that there is a middle ground.  But without making very
restrictive, un-hurdish, assumptions about how you structure your file
systems, I don't see how to achieve that middle ground with passive
translators.  Transparent persistence has the nasty tendency to drag
with it all objects of the system, because of the many
interdependencies between components.

Does this mean that passive translators are lost?  Maybe, but we
should remind ourselves that passive translators are *not* a goal, but a
*mechanism* to achieve a goal.  We should look at the function they
provide and see if we can solve the problem they try to solve
differently.  Passive translators are a poor man's persistence.  They
are a mechanism to recover state that was lost due to a system crash
or reboot.  There are many ways to recover such state.  One is
transparent orthogonal persistence, which is what is described above.
Another is manual persistence.  This means that the translator
programs themselves know how to reconstitute their state.  How can
this work?  Pretty much the same way that Unix daemons do this today,
by startup scripts, configuration files, etc.  Here is a suggestion of
a mechanism which fulfills most of the functions of passive
translators, and does not have the firmlink or similar problem.  This
solution does not use passive translators at all.

Before reading on, you might train your design instinct by coming up
with your own solution.  To compare it against mine, I will point out
below a specific design requirement, and you should see if your
suggestion addresses that.  Found a solution that satisfies you?  Read on.

The system, and each user, start a "translator server" at startup.
This server attaches itself to a number of file nodes (the list is
taken from a configuration file, for example).  Whenever a node it
covers is accessed for the first time, it takes the translator setting
and other configuration (like chroot-ing, or whatever) for that node
in its configuration file, and then starts an active translator with
that configuration on that node.  Thus, it replaces itself with the
"real" active translator on that node, and all accesses (including the
first) are then resolved on that new, actively translated, node.  Hurd
translators can (if not already, then at least in principle) be
"stacked" on a node, so that the "translator server" is still running
"beneath" the active translator it started.  When that translator
dies, for example to release all resources after a timeout, accesses
go to the "translator server" again---this feature provides race-free
transparent restart of translators, just as it exists with passive
translators.  (Does your solution also fulfill this requirement of
atomic, transparent restart?).  With this proposal, passive
translators are not used anymore.
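
The control flow of such a "translator server" can be sketched as
follows.  This is a single-process toy model in Python, not Hurd code:
the classes TranslatorServer and Translator, the access() method and
the configuration dictionary are all assumptions made for this
illustration, and stacking is modelled simply by the server staying in
the lookup path.  Being single-threaded, the sketch shows only the
fall-through and restart logic; it says nothing about how race-freedom
would be achieved in a real implementation.

```python
from typing import Dict, List


class Translator:
    """Stand-in for an active translator started by the server."""

    def __init__(self, setting: str) -> None:
        self.setting = setting
        self.alive = True

    def handle(self, request: str) -> str:
        return f"{self.setting} handled {request}"


class TranslatorServer:
    """Toy model: stays 'beneath' every translator it starts."""

    def __init__(self, config: Dict[str, str]) -> None:
        self.config = config                      # node -> translator setting
        self.active: Dict[str, Translator] = {}   # node -> running translator
        self.starts: List[str] = []               # log of (re)starts

    def access(self, node: str, request: str) -> str:
        translator = self.active.get(node)
        if translator is None or not translator.alive:
            # The access fell through to the server: start the real
            # translator from the configured setting, then let it
            # resolve this very access.
            translator = Translator(self.config[node])
            self.active[node] = translator
            self.starts.append(node)
        return translator.handle(request)


node = "/home/me/link"  # hypothetical node and setting
server = TranslatorServer({node: "firmlink /home/me/data"})

first = server.access(node, "read")    # first access starts the translator
server.active[node].alive = False      # translator exits, e.g. on timeout
second = server.access(node, "read")   # next access restarts it

print(first)
print(server.starts)
```

The caller of access() never sees whether a translator was already
running, which is the transparent restart property discussed above.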

Why does this fix the firmlink problem?  When a chrooted program wants
to install a firmlink translator, it can only start it as an active
translator, which cannot extend its authority.  The equivalent action
to installing a passive translator in the new system is to change the
user's "translator server"'s configuration file, something that the
chrooted program presumably has no authority to do.  It could start
its own "translator server" (and would be encouraged to do so), but
this "translator server" also would not have excess authority, thereby
limiting its possible actions.

The only feature of passive translators that this proposal does not
provide is decentralised storage of the translator settings.  You
might see this as a disadvantage.  On the other hand, the proposal
decentralises the *policy* of how to "activate" translators, and that
is much more important in my opinion.  The user can configure the root
directory for each translator setting, for example.  Note that it
might still be difficult for the user to find good and secure
settings.  But at least a mechanism is available which allows correct
behaviour and still admits useful functionality.
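
To give an idea of what a per-user configuration with a root directory
per translator setting might look like, here is a small sketch.  The
file format (a JSON list with the keys "node", "translator" and
"root"), the example paths and the function load_settings are all
invented for this illustration; the proposal does not prescribe any
particular format.

```python
import json
from typing import Dict, Tuple

# Hypothetical configuration: one entry per covered node, each with its
# own root directory for the translator that will be started there.
EXAMPLE_CONFIG = """
[
  {"node": "/home/me/link",
   "translator": ["firmlink", "/home/me/data"],
   "root": "/home/me"},
  {"node": "/home/me/pub/link",
   "translator": ["firmlink", "/home/me/pub/files"],
   "root": "/home/me/pub"}
]
"""

REQUIRED_KEYS = {"node", "translator", "root"}


def load_settings(text: str) -> Dict[str, Tuple[Tuple[str, ...], str]]:
    """Parse the toy format into node -> (translator argv, root)."""
    settings = {}
    for entry in json.loads(text):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            # Refuse entries without an explicit root rather than
            # silently falling back to some wider default.
            raise ValueError(f"entry lacks {sorted(missing)}: {entry}")
        settings[entry["node"]] = (tuple(entry["translator"]), entry["root"])
    return settings


settings = load_settings(EXAMPLE_CONFIG)
for node, (argv, root) in settings.items():
    print(node, argv, root)
```

Requiring an explicit root for every entry is a design choice of this
sketch: it keeps the policy decision visible in the user's own
configuration instead of hiding it in a default.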

By the way: The above discussion settles the issue only within the
framework of the current Hurd implementation.  Notably,
performance and stability considerations are not included.  Such
analysis is better done in the framework of the Hurd-NG project, which
tries to tackle these issues systematically.  But even in isolation,
the above proposal as it stands should work well within the current
Hurd-on-Mach framework, and from what I can tell at this point, the
general approach above seems "future proof", in the sense that it
works within the context of Hurd-NG as well.  Nevertheless, I have not
thought about these issues for quite a while, so don't hold it against
me if I decide that details have to change, or that a different
strategy works better.

For more information about an analysis of the firmlink problem and
other related issues in the Hurd, see [1].  That paper does not
include the above proposal, but it describes the problem extensively
and relates it to other issues.  It also explains how persistence can
help to understand and fix such issues.

[1] Neal H. Walfield and Marcus Brinkmann.  A Critique of the GNU Hurd
Multi-server Operating System.  ACM SIGOPS Operating Systems Review,
special issue on Secure Small-Kernel Systems, 41(3), July 2007.
http://www.walfield.org/papers/200707-walfield-critique-of-the-GNU-Hurd.pdf

Thanks,
Marcus
