On Sat, 12 Jul 2025 at 03:39, James Gritton <[email protected]> wrote:

> On 2025-07-11 02:15, Bjoern A. Zeeb wrote:
> > On Fri, 11 Jul 2025, Doug Rabson wrote:
> >
> >> I do have if_bridge loaded on the base system. With your examples, I can
> >> verify that creating bridges and epairs as well as adding some of those
> >> epairs to the bridge works in a simple vnet=new jail. For the scenario
> >> where network management for the host is delegated to a trusted jail, I
> >> haven't been able to create a bridge:
> >>
> >> jail -c host.hostname=foo vnet=inherit path=/ persist
> >> jexec <JID>
> >> root@foo:/ # ifconfig bridge create
> >> ifconfig: socket(family 2,SOCK_DGRAM): Protocol not supported
>
> ...
>
> >> Thanks for the feedback - it does seem that nesting Podman containers
> >> should work already - I was working on debugging the vnet=inherit use case
> >> and assumed vnet=new would be the same.
> >
> > I am a bit surprised too.  I was expecting PR_VNET to also be inherited
> > and with that the priv checks being the same.  After all the "parent"
> > says to a "child" 'you can have all I have'.
>
> vnet is an exception to the "all I have," that comes from the pre-vnet
> behavior of a jail inheriting the (only existing) network stack and not
> having these permissions by default.  With vnet=inherit, you inherit
> the vnet but only in the same way a regular jail inherits the regular
> network.  So if you want extra protocol support, you still need
> allow.socket_af.
>
> Unlike most subsystem flags, vnet doesn't have the "disable" setting.
> There's only "new" which is "your own vnet and all the freedom that
> comes with it", or "inherit" which is "not your own vnet and all the
> restrictions that come with it."  This is unlike host=inherit, where
> the sub-jail is allowed to change the hostname in both jails for example.
> It might have been better to call it "novnet" but I wanted to keep
> with the subsystem flags like the rest.
>
> > socreate(): does
> >
> >     943         if (prison_check_af(cred, prp->pr_domain->dom_family) != 0)
> >     944                 return (EPROTONOSUPPORT);
> >
> > and
> >
> >    3458 int
> >    3459 prison_check_af(struct ucred *cred, int af)
> > ...
> >    3467 #ifdef VIMAGE
> >    3468         /* Prisons with their own network stack are not limited. */
> >    3469         if (prison_owns_vnet(cred))
> >    3470                 return (0);
> >    3471 #endif
> >
> > You can probably work around this using:
> >
> >    3504         default:
> >    3505                 if (!(pr->pr_allow & PR_ALLOW_SOCKET_AF))
> >    3506                         error = EAFNOSUPPORT;
> >    3507         }
> >    3508         return (error);
> >
> > But that would likely have to come all the way up, which is no good.
>
> So from what I said, the first part is right, where prison_owns_vnet stops
> the rest from being checked.
>
> Yes, it looks like allow.socket_af would have to come all the way up,
> and that's where the problem lies.  Currently, we bypass that permission
> check for vnet jails, but it would make sense for them to actually have
> that and other similar inet-safety flags set in vnet jails by default.
> That doesn't change the behavior of the vnet jails themselves, since
> they bypass those bits anyway.  It also doesn't change the default
> behavior of those sub-jails, since those bits aren't inherited without
> explicit mention.  But it would make sense to allow the vnet jail to
> provide its children with those bits regardless of the vnet jail's
> parent.
>
> I think defining a set of permission bits that are always turned on for
> vnet jails wouldn't cause any unpleasant surprises.
>
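
To make the quoted checks easier to follow, here is a rough userland sketch
of them. It is not kernel code: the struct, the flag value and every model_*
name are hypothetical stand-ins. It mirrors only the fragments quoted above
(the prison_owns_vnet bypass, the default-case PR_ALLOW_SOCKET_AF test and
the socreate() mapping to EPROTONOSUPPORT); the elided switch cases are not
modeled. The rule that a child can only hold allow bits its parent holds is
modeled as a simple mask, which is an assumption about how "come all the way
up" plays out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag value; not the kernel's PR_ALLOW_SOCKET_AF. */
#define MODEL_ALLOW_SOCKET_AF 0x1u

/* Hypothetical, heavily reduced stand-in for a prison. */
struct model_prison {
	bool owns_vnet;		/* true for vnet=new, false for vnet=inherit */
	unsigned allow;		/* allow.* bits held by this jail */
};

/* Mirrors only the two quoted fragments of prison_check_af(). */
int
model_check_af_default(const struct model_prison *pr)
{
	/* Jails with their own network stack skip the remaining checks. */
	if (pr->owns_vnet)
		return (0);
	/* Default case of the switch: needs allow.socket_af. */
	if (!(pr->allow & MODEL_ALLOW_SOCKET_AF))
		return (EAFNOSUPPORT);
	return (0);
}

/* Mirrors the quoted socreate() test: any failure is EPROTONOSUPPORT. */
int
model_socreate_check(const struct model_prison *pr)
{
	if (model_check_af_default(pr) != 0)
		return (EPROTONOSUPPORT);
	return (0);
}

/*
 * Assumption: a child may only hold bits its parent holds. Modeled as a
 * mask; the real kernel may reject the request instead.
 */
unsigned
model_child_allow(const struct model_prison *parent, unsigned requested)
{
	return (requested & parent->allow);
}
```

In this model a vnet=new jail with no allow bits passes the check itself,
but a vnet=inherit child created under it ends up without the socket_af bit
and fails, which is the situation James describes.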

I tried setting allow.socket_af for a jail which inherits the host vnet and
this still has problems creating interfaces:

$ jid=$(sudo jail -i -c host.hostname=foo vnet=inherit allow.socket_af path=/ persist)
$ sudo jexec $jid
You have mail.
root@foo:/ # ifconfig bridge create
ifconfig: socket(family 2,SOCK_DGRAM): Protocol not supported
root@foo:/ # exit
exit
$ sudo jail -r $jid

I still think this use of jails is reasonable as long as you trust the code
which will run in the jail. I think I can work around this and arrange for
the interface creation and the addition to the bridge to happen on the host.

Doug.
