Sebastien Roy wrote:
usr/src/uts/common/inet/iptun/iptun.c
* 746-750: This explains the implementation of conn_get_ixa(). This
comment should be the intro block-comment to conn_get_ixa() in
ip_attr.c.
Fixed.
* 759: This looks like it belongs in conn_get_ixa() as there is only
one possible value for the ixa_cred of the ixa returned by
conn_get_ixa(), and that is conn_cred.
That code isn't needed; too much copy and paste. (It only applies to UDP
where we want to track the cred for each send* so that SCM_UCRED can be
used on the receive side.)
However, re-reading the TX code I realize we need to always restart with
ixa_tsl = crgetlabel(ixa_cred) before we call tsol_check_dest(), but
that code belongs in ip_set_destination_*() where we call tsol_check_dest().
* 761-769: This also looks like something that should be in
conn_get_ixa().
Ditto. (it only applies to UDP ancillary data where the label change
for each packet due to SCM_UCRED).
* 768: Won't we need to set ixa_tsl on a per-packet basis?
Yes, we do that by calling tsol_check_label_* in the output path. But to
do that we need the initial label which is based on the zone.
* 786-800: It seems like there should be some utility function to set
conn addresses. For example, the fact that IPv4 addresses are
stored as IPv4-mapped IPv6 addresses in the conn_t seems like an
implementation detail of ip that iptun shouldn't really need to care
about.
I already responded to this one.
* 848: iptun_get_maxmtu() doesn't set the tunnel MTU, it just returns
the maximum possible link MTU for a tunnel link. The call to
iptun_update_mtu() at 886 does what you want anyway (and that calls
iptun_get_maxmtu() to do its job).
Fixed.
* 862: This can't happen because we call iptun_unbind() before
iptun_bind() if the tunnel is already bound. I suppose you could
optimize things to make the iptun_unbind() unnecessary if you want.
I'll remove 862.
* 870,872: We should just have an ipcl_conn_insert() that calls
ipcl_conn_insert_v[4|6]() depending on conn_ipversion.
tcp_conn_request() and conn_connect() both have the same little
if/else dance.
OK, but for uniformity it makes sense to do the same for ipcl_bind_insert.
* 891: The convention in this file is to use "done:" as the label, as
the code that goes through here is in most cases not related to bind
failure.
Will fix.
* 1311: The current process is some dladm incantation that will go
away momentarily. It's probably not meaningful to save that away as
conn_cpid.
I'll use NOPID instead.
* 1795: There are some places in the code where ip_get_pmtu() is
called with an argument of connp->conn_ixa directly without first
getting a safe reference using conn_get_ixa() (in icmp_icmp_input()
for example). Is that deliberate?
We only call ip_get_pmtu() in one place, but there are places where we
use conn_ixa directly. However, currently that is only in iptun_create().
Where did you see it be used?
* 1807: I know they're not used, but it would help legibility to call
the first arguments using names representing what was passed in. On
a related note, there seems to be no reason why the first argument
couldn't simply be a "conn_t *" instead of "void *".
Agreed. Will fix.
* 1824: You don't need a default case.
OK
* 1976: What is this call to ip_attr_connect() about?
If ixa_ire was NULL we got a safe copy.
But I think I'll change the ixa_notify upcall to pass up the
ip_xmit_attr_t; that way we will have the currently used ip_xmit_attr_t
and can modify it directly which avoids the conn_get_ixa in
iptun_get_maxmtu.
* 2029: typo: there is no iptun_get_pmtu().
Should say ip_get_pmtu()
* 2043: In the case of IPTUN_FIXED_MTU, don't we only need update
ixa_pmtu when we initially set iptun_mtu?
I already responded to this one.
I'll also add some comment to make this more clear.
* 2093: The ICMP error doesn't need to have the same cred as the
original packet?
I read RFC 791 and there aren't credentials for IP packets ;-)
What is the purpose of the cred on the packet?
We use them for getpeerucred() and SCM_UCRED. They have also been
overloaded to carry a label, but with refactoring we have the explicit
ixa_tsl for the label.
* 2150,2200: The cred associated with the conn_t won't necessarily be the
same as the cred associated with the data. The label we want in the
packet is the one associated with the data, right? Labels used to
be computed using dblk creds...
I was hoping to already have merged with labeled IPsec, to be able to
answer authoritatively.
Given that iptun in onnv-gate has the implicit side effect of preserving
db_credp hence the label from the original ICMP error, it makes sense to
do the same by explicitly passing the ts_label_t to the iptun_icmp
functions.
* 2677: The first argument is always a conn_t *, so there appears to
be no reason to pass in void * that is then set to a conn_t *.
conn_recv was introduced by FireEngine - conn_recvicmp just follows what
conn_recv started - it doesn't make sense to have them be different.
And I've exceeded the quota for how much of FireEngine I'm allowed to
clean up ;-)
* 2677: The third argument is always NULL, so why have a third argument?
It is non-NULL for tcp, just like conn_recv has a non-NULL third
argument for TCP.
* 2769: The first argument is always a conn_t *, so let's not use void *.
See above.
* 2769: The third argument is always NULL, let's remove it.
See above.
* 2781: Can we use ira_ip_hdr_len to make this faster and simple?
It will be a bit faster at the expense of duplicated code. The
duplication would be to have
outer_hlen = ira->ira_ip_hdr_length;
if (ira->ira_flags & IRAF_IS_IPV4) {
outer4 = (ipha_t *)data_mp->b_rptr;
outer6 = NULL;
} else {
outer4 = NULL;
outer6 = (ip6_t *)data_mp->b_rptr;
}
if (MBLKL(data_mp) == outer_hlen) {
inner_mp = data_mp->b_cont;
ipha = (ipha_t *)inner_mp->b_rptr;
} else {
inner_mp = data_mp;
ipha = (ipha_t *)(data_mp->b_rptr + outer_hlen);
}
switch (IPH_HDR_VERSION(ipha)) {
case IPV4_VERSION:
if (inner_mp->b_wptr - (uint8_t *)ipha < sizeof (ipha_t))
goto drop;
inner4 = ipha;
inner6 = NULL;
break;
case IPV6_VERSION:
if (inner_mp->b_wptr - (uint8_t *)ipha < sizeof (ip6_t))
goto drop;
*inner4 = NULL;
*inner6 = (ip6_t *)ipha;
break;
default:
goto drop;
}
thus a subset of what iptun_find_headers does.
Is that what you want?
* 2804: I believe the answer is yes, but danmcd should confirm. In
the old world, I believe an ipsec_in_t M_CTL would be pre-pended
even on a clear-text packet if there was global policy present.
This would cause ipsec_tun_inbound() to be called on such packets.
I'll check with Dan.
One would thus assume that IRAF_IPSEC_SECURE should be set if there
was global policy present in the ip input path(?).
That doesn't seem like the natural semantics of a flag that says SECURE
- it shouldn't be set if the packet was received in the clear.
Perhaps an IRAF_IPSEC_HAS_GLOBAL_POLICY would make sense.
* 2989: Don't we lose the db_credp here?
Yes, intentionally. db_credp in the new world is only needed for certain
messages going to sockfs for getpeerucred(), and in the case of TX for
certain messages going to rpcmod.
Using db_credp less means less overloaded semantics.
* 3183,3276: We call iptun_find_headers() twice in a row for 6to4
output processing.
The alternative would be to pass 5 more arguments (outer{4,6},
inner{4,6} and outer_hlen) to iptun_output_common(). Is that better?
* 3246-3259: Can this be done before calling iptun_output_common() so
that iptun_output_common() becomes a tail-call?
ixa_refrele() can't be done until we are done with the ixa.
If you think the extra stack frame makes a big difference I can apply a
pragma inline.
* 3297: I'm thinking that it might make sense to unconditionally
include the labeling overhead in the link MTU if
is_system_labeled(). This would simplify things a bit I think and
would reduce the round trips necessary to make pmtu converge on
labeled systems. This isn't new with your changes, but just a
fleeting thought.
The labeling overhead is not constant; different 6to4 destinations can
have CIPSO and not have CIPSO.
I don't know if it would make sense to always subtract the maximum size
of a CIPSO option on a labeled system. Was that what you had in mind?
Erik
_______________________________________________
networking-discuss mailing list
[email protected]