[clearview-discuss] Clearview IP Tunneling code-review [due August 12]

Sebastien Roy Mon, 31 Aug 2009 18:29:08 -0400

On Mon, 2009-08-31 at 14:37 -0700, Peter Memishian wrote:
> > >  > >      * 129: Why is this IPTUN_MIN_IPV6_MTU and not 
> > > IPTUN_MIN_IPV4_MTU?
>  > >  > 
>  > >  > Because only IPv6 interfaces are plumbed above 6to4 links.
>  > > 
>  > > My understanding was that these values corresponded to the MTU of the
>  > > underlying tunnel, hence the IPv4 values for an IPv4 tunnel and the IPv6
>  > > values for an IPV6 tunnel.
>  > 
>  > They are the minimum possible values for the tunnel link MTU, yes.  For
>  > a link that only supports IPv6 at the network layer (such as 6to4), it
>  > doesn't make much sense to support a link MTU that is less than the
>  > network layer minimum.  It's simply an optimization to eliminate a range
>  > of values that can't possibly work.
> 
> I'm still not sure I understand; why can't the packet be fragmented if
> necessary?


Hmm, I suppose packets would be fragmented in that case, but it wouldn't
be optimal (didn't we have a similar discussion regarding "optimal" MTUs
for Ethernet recently?)...  I'll set the minimum to the IPv4 minimum to
reflect the underlying limitations of the IPv4 media.

> In any case, I think this warrants a comment, because it looks
> like a typo.
> 
>  > >  > >     * 430: Would've expected the default to be the value that's
>  > >  > >       initialized in iptun_create() -- i.e. iti_maxmtu.
>  > >  > 
>  > >  > This is related to your earlier comment.  The current maximum is equal
>  > >  > to the current dynamically computed maximum.  If we decide that such a
>  > >  > maximum is artificial and asymmetric, an absolute maximum could be
>  > >  > provided (like iti_maxmtu).
>  > > 
>  > > My comment was about the default value, not the maximum value.
>  > 
>  > Ah yes, I had my wires crossed.  The default value is the current value
>  > in use by the link if the property isn't otherwise set.  It is therefore
>  > the currently computed MTU as returned by iptun_get_maxmtu().
> 
> That sounds more like "there is no default value".  That is, if this
> property sets a specific MTU, and by default there is no specific MTU,
> then this seems like it should be "--", not the current MTU.  This is
> similar to what we do for the `speeds' WiFi property.

Ah, yes, that's a good analogy.  I'll do the same.

> 
>  > >  > >     * 925: Is this call to dls_mgmt_get_linkinfo() safe given that
>  > >  > >       iptun_lock is held?
>  > >  > 
>  > >  > Yes.  This isn't an "upcall" since it's just a utility function in the
>  > >  > dls module.  It doesn't enter this link's perimeter.
>  > > 
>  > > But it will call up to userland with the lock held, which seems bad since
>  > > this imposes subtle and fragile restrictions on what userland can do
>  > > (i.e., it will deadlock if it tries to do anything that needs to acquire
>  > > iptun_lock).
>  > 
>  > In this case, the function is "give me this link's name".  It will not
>  > reenter into the iptun module (this door upcall simply fills in the data
>  > structure and returns).  I don't really see what userspace has to do
>  > with this, though.  If mapping a linkid to a link name required calling
>  > back into the driver, then why would it not need to call back into the
>  > driver if the function were implemented entirely in the kernel?
>  > (anyway, I digress, I'm confident that this particular function is safe)
> 
> In general, I think it's bad architecture to make an upcall with a lock
> held, as it makes the robustness of the system contingent on the behavior
> of userland code operating in a context with unstated constraints.  More
> generally, holding locks across callouts is generally problematic but
> across the userland/kernel boundary it's much, much worse.

Note that this particular function is already called in multitudes of
places with various locks held (mac_client_open(),
dls_devnet_stat_create(), dls_devnet_set(), dls_devnet_rename(), and
devnet_filldir()).  This tells me that we either have functionality that
naturally belongs in the kernel stranded in user space, or that we
really need to do away with the need to map between linkid's and link
names (perhaps by getting rid of linkid's).

Anyway, I suppose that in this case, if I must, I would have to do the
linkid to link name translation from iptun_modify() before entering the
tunnel, and pass the link name through to
iptun_setparams()->iptun_set_sec_simple().

>  > We can get here, but only under contrived scenarios.  You'd have to set
>  > IPsec policy on the tunnel without having created any IPsec keys nor
>  > started IKE nor loaded any other global IPsec policy.  When we create
>  > tunnels during boot, those other things are done first.  When IPsec
>  > tunnels are created by something like a VPN, those things are also done
>  > first.  It would only be in a case where someone is playing around and
>  > doing things out of order that this would come up.  They'd get EAGAIN,
>  > and indeed, trying again would work.
> 
> EAGAIN is just so pathetic; I really dislike this.

Me too.  Unfortunately, the IPsec loading mechanism is heavily
entrenched in conn_t STREAMS queues.  It's also extremely complicated
relative to what it does, and reimplementing this part of IPsec is
something that I eliminated as a possibility back when I contacted the
IPsec team and Erik Nordmark about this a while back.  Aside from that,
do you see a graceful way around this edge case?

-Seb

[clearview-discuss] Clearview IP Tunneling code-review [due August 12]

Reply via email to