On 2025-09-10 02:03, Konstantin Belousov wrote:
First, since you already mentioned a desire to capsicumize jfds, I
think it is already a huge wart in the interface. The function that
opens (or creates) an fd from a jail id must not take just a jail. It
should be namespace-aware already. In other words, it should take an
existing jfd and create a child jail, returning a jfd for it. The
existing jfd gives the namespace container to start with, which is
essentially how capsicum organizes rights limiting.
For the bootstrapping, the prison0 non-capentered process can pass a
special id for jfd to reference prison0, similar to how AT_FDCWD marks
'.' for *at(2) syscalls.
The current jaildesc code is namespace-aware, via the JAIL_AT_DESC
flag. So if you have a descriptor for jail "foo" and you create
"bar", you end up creating "foo.bar" just as you would if you were
already attached to jail "foo". Similarly, if you look up by jid,
it only works when that jail is a descendant of "foo".
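To make the naming and lookup rules above concrete, here is a rough
userland sketch. It is only a toy model, not the kernel's jaildesc
code and not the jail_set(2)/jail_get(2) interface: a "descriptor" is
just an index into a table, and every name in it (toy_jail,
toy_create_at, toy_lookup_at) is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

#define TOY_MAX_JAILS 16
#define TOY_NAME_LEN  64

/* Toy in-memory jail record; NOT the kernel's struct prison. */
struct toy_jail {
    int  jid;                 /* jail id; 0 is the root of this model */
    int  parent;              /* table index of the parent, -1 for the root */
    char name[TOY_NAME_LEN];  /* full hierarchical name, e.g. "foo.bar" */
};

/* Entry 0 stands in for the root jail and has an empty name. */
static struct toy_jail toy_jails[TOY_MAX_JAILS] = { { 0, -1, "" } };
static int toy_njails = 1;

/* Return 1 if table entry `child` is a strict descendant of entry `anc`. */
static int toy_is_descendant(int anc, int child)
{
    for (int p = toy_jails[child].parent; p != -1; p = toy_jails[p].parent)
        if (p == anc)
            return 1;
    return 0;
}

/*
 * Create `name` beneath the jail that "descriptor" `at` refers to.
 * The child's full name is the parent's name, a dot, then `name`.
 * Returns the new table index, or -1 on failure.
 */
static int toy_create_at(int at, const char *name)
{
    struct toy_jail *j;
    int n;

    if (at < 0 || at >= toy_njails || toy_njails == TOY_MAX_JAILS)
        return -1;
    j = &toy_jails[toy_njails];
    if (toy_jails[at].name[0] == '\0')
        n = snprintf(j->name, sizeof(j->name), "%s", name);
    else
        n = snprintf(j->name, sizeof(j->name), "%s.%s",
            toy_jails[at].name, name);
    if (n < 0 || (size_t)n >= sizeof(j->name))
        return -1;            /* name would not fit */
    j->jid = toy_njails;
    j->parent = at;
    return toy_njails++;
}

/*
 * Look up `jid` relative to "descriptor" `at`. The lookup succeeds
 * only when the target is a descendant of `at`; otherwise it returns -1.
 */
static int toy_lookup_at(int at, int jid)
{
    if (at < 0 || at >= toy_njails)
        return -1;
    for (int i = 0; i < toy_njails; i++)
        if (toy_jails[i].jid == jid && toy_is_descendant(at, i))
            return i;
    return -1;
}
```

With this model, creating "bar" through the entry for "foo" yields the
name "foo.bar", and a lookup through "foo" for the jid of a sibling
jail fails because that jail is not a descendant of "foo".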
Yes, getting jid 0 makes sense for bootstrapping - it already means
"the current jail" in other contexts. The resulting descriptor would
be flagged as only for JAIL_AT_DESC use, without the ability to
modify, remove, or attach to it, regardless of whether capsicum is
enabled.
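As a rough sketch of that restriction, again a toy model rather than
the real implementation: the flag name TOY_JD_AT_ONLY, the operation
list, and the choice of EPERM as the error are all assumptions made
for illustration, and none of them are taken from the jaildesc code.

```c
#include <errno.h>

/* Operations a caller might attempt through a jail descriptor. */
enum toy_op { TOY_OP_AT, TOY_OP_MODIFY, TOY_OP_REMOVE, TOY_OP_ATTACH };

/* Hypothetical flag: descriptor is usable only as a JAIL_AT_DESC base. */
#define TOY_JD_AT_ONLY 0x1u

struct toy_jaildesc {
    int      jid;    /* jid the descriptor was requested for */
    unsigned flags;  /* TOY_JD_* flags */
};

/*
 * Obtain a toy descriptor. Asking for jid 0 ("the current jail")
 * yields a descriptor that is restricted to at-style use.
 */
static struct toy_jaildesc toy_jd_get(int jid)
{
    struct toy_jaildesc jd = { jid, 0 };

    if (jid == 0)
        jd.flags |= TOY_JD_AT_ONLY;
    return jd;
}

/*
 * Check whether `op` is allowed on `jd`. The check does not consult
 * capability mode at all, so the restriction applies regardless of
 * whether capsicum is in use. Returns 0 if allowed, EPERM otherwise
 * (EPERM is an assumed error value).
 */
static int toy_jd_check(const struct toy_jaildesc *jd, enum toy_op op)
{
    if ((jd->flags & TOY_JD_AT_ONLY) != 0 && op != TOY_OP_AT)
        return EPERM;
    return 0;
}
```

In this model a descriptor obtained for jid 0 can serve as the base
for creating or looking up child jails, while modify, remove, and
attach requests on it are refused.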
- Jamie