On Fri, Mar 2, 2018 at 6:34 PM, Grant Taylor
> On 03/02/2018 05:08 AM, Rich Freeman wrote:
>> On the other hand, if netfilter were implemented in userspace such as via
>> a microkernel, then if it contained a bug the remote attacker would be able
>> to MITM all network traffic on the machine, but that would be the extent of
>> the access they have.
> I don't know that it would be the extent of the access the attacker would
> have. It might also be a beachhead that could be used as a starting point
> for future attacks.
How? You'd need a local priv escalation vulnerability to do anything
further. If the same bug existed in kernel space you'd already have
kernel privs and own the machine.
It would be the exact same code whether it is running in userspace or
kernelspace. It isn't like code is magically immune to bugs if it is
in the kernel. It would probably be maintained by the exact same
people either way.
>> The process running the netfilter code doesn't need anything other than a
>> pipe back to the kernel to receive packets and send packets back, so it can
>> run with minimal privs otherwise.
> I think that more than a simple pipe (as in unix socket) is needed.
> Currently, any program that uses IP is expecting a socket to behave like it
> currently behaves. I don't think a simple pipe can provide that.
There would be no change to regular software. They would use the same
system calls to open sockets. They would send their packets to the
kernel. The kernel would send them to the userspace netfilter
process. The userspace netfilter process would send them back to the
kernel. The kernel would then send them to the physical layer for
transmission.
That is how microkernels work. The kernel is still the central point
of contact and the system calls basically work the same way as they do
today. However, the kernel offloads as much processing to userspace
as possible.
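To make the round trip described above concrete, here is a minimal
sketch in Python. It is only a simulation: the "kernel" and the
"netfilter process" are two ends of a socketpair inside one program,
and the names filter_process, kernel_send and DROP_PORTS are made up
for illustration. Nothing here is a real netfilter or kernel API.

```python
import json
import socket
import threading

DROP_PORTS = {23}  # hypothetical rule set: drop anything bound for telnet


def filter_process(chan):
    """Stand-in for the unprivileged filter: its only resource is this channel."""
    while True:
        data = chan.recv(65535)
        if data == b"STOP":  # shutdown sentinel used only by this sketch
            break
        pkt = json.loads(data)
        pkt["verdict"] = "DROP" if pkt["dport"] in DROP_PORTS else "ACCEPT"
        chan.send(json.dumps(pkt).encode())


def kernel_send(chan, pkt, wire):
    """Stand-in for the kernel: ask the filter, then transmit only if accepted."""
    chan.send(json.dumps(pkt).encode())
    reply = json.loads(chan.recv(65535))
    if reply["verdict"] == "ACCEPT":
        wire.append(reply["payload"])  # "physical layer" is just a list here
    return reply["verdict"]


def demo():
    # Datagram socketpair so each send() is delivered as one whole message.
    kernel_end, filter_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    worker = threading.Thread(target=filter_process, args=(filter_end,))
    worker.start()
    wire = []
    verdicts = [
        kernel_send(kernel_end, {"dport": 80, "payload": "GET /"}, wire),
        kernel_send(kernel_end, {"dport": 23, "payload": "login"}, wire),
    ]
    kernel_end.send(b"STOP")
    worker.join()
    kernel_end.close()
    filter_end.close()
    return verdicts, wire
```

The application that generated the packet never appears in the sketch,
which is the point: it talks to the kernel exactly as before, and the
filter only ever sees what arrives on its one channel.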
With filesystems it is no different with a microkernel. You use the
same system calls to write to a file. The data to be written goes to
the kernel. However, instead of the kernel calling the filesystem
layer in kernel space it instead sends the data via IPC of some sort
to a filesystem driver running in userspace. It then sends the raw
block device instructions back to the kernel, which then passes it to
the device driver for the disk.
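The filesystem case can be sketched the same way. This is again a toy
simulation in Python, assuming an invented on-disk layout (one file
starting at block 10) and an 8-byte block size; fs_driver and
kernel_write are hypothetical names, not any real microkernel's
interface. The driver only translates a file write into block writes,
and the "kernel" is the one that touches the disk.

```python
import threading
from multiprocessing import Pipe

BLOCK_SIZE = 8  # tiny block size so the result is easy to read


def fs_driver(conn):
    """Hypothetical userspace filesystem driver: file writes in, block ops out."""
    first_block = {"notes.txt": 10}  # assumed layout: file data starts at block 10
    while True:
        msg = conn.recv()
        if msg is None:  # shutdown sentinel used only by this sketch
            break
        name, offset, data = msg
        ops = []
        pos = 0
        while pos < len(data):
            abs_off = offset + pos
            block = first_block[name] + abs_off // BLOCK_SIZE
            within = abs_off % BLOCK_SIZE
            chunk = data[pos:pos + BLOCK_SIZE - within]  # stop at the block boundary
            ops.append((block, within, chunk))
            pos += len(chunk)
        conn.send(ops)


def kernel_write(conn, disk, name, offset, data):
    """Stand-in for the kernel: forward the write, then apply the block ops."""
    conn.send((name, offset, data))
    for block, within, chunk in conn.recv():
        buf = disk.setdefault(block, bytearray(BLOCK_SIZE))
        buf[within:within + len(chunk)] = chunk  # the "disk driver" step


def demo():
    kernel_end, driver_end = Pipe()
    worker = threading.Thread(target=fs_driver, args=(driver_end,))
    worker.start()
    disk = {}  # block number -> block contents
    kernel_write(kernel_end, disk, "notes.txt", 6, b"hello")
    kernel_end.send(None)
    worker.join()
    return disk
```

A 5-byte write at offset 6 straddles two blocks here, so the driver
hands back two block operations and the kernel side applies both.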
>> a lot of the boot-time mounting logic and devfs/etc logic has gone away in
>> favor of initramfs and udev.
> Please provide examples of this "…boot-time mounting logic and devfs/etc
> logic…" that used to be in kernel.
> I'll argue that devfs is now in kernel when it used to be files on a file
> system or dynamically created by a user space process. As far as I know,
> mounting (more than root as RO) has always been driven from user space via
> init scripts.
I'm talking about mounting root. Capabilities such as identifying
devices by UUID have not been added to the kernel, with this being
done in an initramfs instead. The trend has been in that direction
with assembling RAID arrays and such as well. They haven't removed
much code that is working, but they haven't been enhancing it either.
If you use an initramfs the kernel automatically disables most of the
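As a rough illustration of the kind of work that lands in the
initramfs, here is a sketch in Python of resolving a root=UUID=...
argument to a device node. It assumes something like udev has already
populated /dev/disk/by-uuid with symlinks; parse_root and resolve_root
are hypothetical helpers, and a real initramfs would normally do this
in shell or C rather than Python.

```python
import os
import tempfile


def parse_root(cmdline):
    """Return the value of the root= argument from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("root="):
            return token[len("root="):]
    return None


def resolve_root(spec, by_uuid_dir="/dev/disk/by-uuid"):
    """Map 'UUID=...' to a device path via the by-uuid symlinks; pass other specs through."""
    if spec.startswith("UUID="):
        link = os.path.join(by_uuid_dir, spec[len("UUID="):])
        # realpath follows the symlink; if the link is missing it returns the
        # link path unchanged, so a caller would still need to check it exists.
        return os.path.realpath(link)
    return spec


def demo():
    # Build a fake /dev tree so the sketch runs without touching real devices.
    dev_dir = tempfile.mkdtemp()
    device = os.path.join(dev_dir, "sda1")
    open(device, "w").close()
    by_uuid = os.path.join(dev_dir, "by-uuid")
    os.mkdir(by_uuid)
    os.symlink(device, os.path.join(by_uuid, "1234-ABCD"))
    spec = parse_root("ro quiet root=UUID=1234-ABCD")
    return spec, resolve_root(spec, by_uuid), os.path.realpath(device)
```

The UUID value and device name in demo() are made up; the only point
is that the lookup is ordinary userspace file handling.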
I believe there was a period of time after devfs came along but before
udev came along that the complexity of hotplug/etc seemed to be
growing on the kernel side. This was quickly recognized as a losing
battle which is why we have udev today (or its alternate
implementations - one of the benefits of moving this stuff out of the
kernel is that it makes it easier to use alternate implementations).
Obviously mounting filesystems other than root has never been in the kernel.
> Sure, there's a LOT of changes going on in that space, particularly around
Well, unless you're referring to udev (which got absorbed by systemd
though it is more-or-less still separate), I don't think there is
actually a great deal that systemd does that would otherwise be done
in kernel space. Maybe some of the maintenance of cgroups, but that
was all done in userspace from the start; the trend was already fairly
established by then, so it was never done in kernel space.
>> And of course if this is done it is done correctly, and not as some kind
>> of userspace hack on top of an OS to add features that it lacks.
I said that because I think your view might be a bit tainted by
previous experiences in Windows/etc. There is a difference between
designing a kernel subsystem to provide a capability but to offload
some of the work to userspace, and trying to layer some kind of
capability into an OS that otherwise lacks it. All this stuff is
designed into linux so that it is robust.
There are pros and cons to microkernels, and of course linux will
probably never turn into a proper microkernel, and I'm not really even
saying it should. However, the fact that stuff is done in userspace
doesn't mean that it needs to be done to a lower standard, or that it
is inherently less secure. In fact, it is generally MORE secure in
the sense that problems get contained.
And Windows has gotten a lot better on these fronts as well. I'm sure
anybody who plays games on windows will have noticed that video card
drivers can be updated without a reboot, or even logging out. This is
because there is actually more isolation for these drivers in Windows,
and the OS can completely restart the video driver without really
affecting everything else that is running, despite the fact that
windows is even more GUI-centric than X11/linux/etc. Windows isn't a
microkernel either, but it does have some isolation features that are
a bit more robust than on Linux. (It would not surprise me if Linux
contained better isolation in other areas - a lot of this comes down
to xorg-server being unable to detach itself from a display - if you
could have X11 detach from the display (while still serving clients),
have the kernel remove and reload the video module, and then have X11
re-attach, that would accomplish something similar to how Windows does
it.)
In any case, this is all academic, as there are no plans to move
netfilter to userspace.