Daniel Borkmann <dan...@iogearbox.net> [Thu, 2018-05-24 18:00 -0700]: > On 05/23/2018 01:40 AM, Andrey Ignatov wrote: > [...] > > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c > > index ff4d4ba..a1f9ba2 100644 > > --- a/net/ipv4/udp.c > > +++ b/net/ipv4/udp.c > > @@ -900,6 +900,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, > > size_t len) > > { > > struct inet_sock *inet = inet_sk(sk); > > struct udp_sock *up = udp_sk(sk); > > + DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name); > > struct flowi4 fl4_stack; > > struct flowi4 *fl4; > > int ulen = len; > > @@ -954,8 +955,7 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, > > size_t len) > > /* > > * Get and verify the address. > > */ > > - if (msg->msg_name) { > > - DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name); > > + if (usin) { > > if (msg->msg_namelen < sizeof(*usin)) > > return -EINVAL; > > if (usin->sin_family != AF_INET) { > > @@ -1009,6 +1009,22 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, > > size_t len) > > rcu_read_unlock(); > > } > > > > + if (!connected) { > > + err = BPF_CGROUP_RUN_PROG_UDP4_SENDMSG_LOCK(sk, > > + (struct sockaddr *)usin, &ipc.addr); > > + if (err) > > + goto out_free; > > + if (usin) { > > + if (usin->sin_port == 0) { > > + /* BPF program set invalid port. Reject it. */ > > + err = -EINVAL; > > + goto out_free; > > + } > > + daddr = usin->sin_addr.s_addr; > > + dport = usin->sin_port; > > + } > > + } > > + > > saddr = ipc.addr; > > ipc.addr = faddr = daddr; > > > > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c > > index 2839c1b..67c44b5 100644 > > --- a/net/ipv6/udp.c > > +++ b/net/ipv6/udp.c > > @@ -1315,6 +1315,29 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr > > *msg, size_t len) > > fl6.saddr = np->saddr; > > fl6.fl6_sport = inet->inet_sport; > > > > + if (!connected) { > > + err = BPF_CGROUP_RUN_PROG_UDP6_SENDMSG_LOCK(sk, > > + (struct sockaddr *)sin6, &fl6.saddr); > > + if (err) > > + goto out_no_dst; > > + if (sin6) { > > + if (ipv6_addr_v4mapped(&sin6->sin6_addr)) { > > + /* BPF program rewrote IPv6-only by IPv4-mapped > > + * IPv6. It's currently unsupported. > > + */ > > + err = -ENOTSUPP; > > + goto out_no_dst; > > + } > > + if (sin6->sin6_port == 0) { > > + /* BPF program set invalid port. Reject it. */ > > + err = -EINVAL; > > + goto out_no_dst; > > + } > > + fl6.fl6_dport = sin6->sin6_port; > > + fl6.daddr = sin6->sin6_addr; > > + } > > Hmm, this extra work here and in v4 case should probably all be done under > the static key? Otherwise we'll do the extra work for checking sin6 and > setting up fl6 twice?
Hm .. true, we can put the whole this block under static key (the main one, since there are no others, but we can follow-up separately): if (cgroup_bpf_enabled && !connected) { I'll send v3 with this change for both ipv6 and ipv4. Thanks. As for the logic inside the `if`, I'll describe it just in case, since some things may not be obvious. There are two cases earlier in this function that can lead to `connected = false`, either user specifies destination address (the 1st `if (sin6)`) or/and user specifies ancillary data (`if (msg->msg_controllen)`). Ancillary data can contain option to set source IP. So to simplify: if user specifies source or destination we're in unconnected mode. Now imagine that we have connected socket and then user calls sendmsg without setting destination (sin6 = NULL), but sets the source IP in ancillary data at the same time. It will cause `connected = false` and BPF prog will be run (it can e.g. override that source IP set by user), but we have no sin6, that's why this `if (sin6)` is second time here. On the other hand if sin6 is passed by user, it'll cause unconnected mode as well and BPF prog has a chance to override IP and port in sin6 and in this case we have to update fl6 after BPF prog finishes. That's why `fl6.daddr = sin6->sin6_addr;` the second time. But I agree that work should be avoided when cgroup-bpf is disabled. > Also, when not enabled, couldn't we run into the case > of ipv6_addr_v4mapped() as well? If I'm spotting this right, then we would > bail out though we shouldn't normally? IPv4-mapped IPv6 case is handled earlier in this function and if user passed IPv4-mapped IPv6, we don't get this far and call IPv4 udp_sendmsg() much earlier. Same is true for port. That's why this code wouldn't affect the logic for IPv4-mapped IPv6, but again, you're right that we shouldn't do this extra work when cgroup-bpf is disabled and I'll fix it. > > > + } > > + > > final_p = fl6_update_dst(&fl6, opt, &final); > > if (final_p) > > connected = false; > > @@ -1394,6 +1417,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr > > *msg, size_t len) > > > > out: > > dst_release(dst); > > +out_no_dst: > > fl6_sock_release(flowlabel); > > txopt_put(opt_to_free); > > if (!err) > > > -- Andrey Ignatov