On Tue, Mar 27, 2018 at 04:27:30PM +0300, Alexey Kodanev wrote: > On 26.03.2018 20:02, Martin KaFai Lau wrote: > > On Mon, Mar 26, 2018 at 05:48:47PM +0300, Alexey Kodanev wrote: > >> After commit 33c162a980fe ("ipv6: datagram: Update dst cache of a > >> connected datagram sk during pmtu update"), when the error occurs on > >> sending datagram in udpv6_sendmsg() due to ICMPV6_PKT_TOOBIG type, > >> error handler can trigger the following path and call ip6_dst_store(): > >> > >> udpv6_err() > >> ip6_sk_update_pmtu() > >> ip6_datagram_dst_update() > >> ip6_dst_lookup_flow(), can create a RTF_CACHE clone > > Instead of ip6_dst_lookup_flow(), > > you meant the RTF_CACHE route created in ip6_update_pmtu() > > > > Right, or even earlier... I was using vti tunnel and it invokes > skb_dst_update_pmtu() on this error, then sends ICMPv6_PKT_TOOBIG. > > >> ... > >> ip6_dst_store() > >> > >> It can happen before a connected UDP socket invokes ip6_dst_store() > >> in the end of udpv6_sendmsg(), on destination release, as a result, > >> the last one changes dst to the old one, preventing getting updated > >> dst cache on the next udpv6_sendmsg() call. > >> > >> This patch moves ip6_dst_store() in udpv6_sendmsg(), so that it is > >> invoked after ip6_sk_dst_lookup_flow() and before udp_v6_send_skb(). > > After this patch, the above udpv6_err() path could not happen after > > ip6_sk_dst_lookup_flow() and before the ip6_dst_store() in udpv6_sendmsg()? > > > > May be we could minimize this if save it in ip6_sk_dst_lookup_flow() > for a connected UDP sockets only if we're not getting it from a cache > for some reason? I am just not clear how moving it earlier can completely stop the described issue. Beside, the next ICMPV6_PKT_TOOBIG will be received again and eventually rectify sk->sk_dst_cache?
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c > index a8a9195..0204f52 100644 > --- a/net/ipv6/ip6_output.c > +++ b/net/ipv6/ip6_output.c > @@ -1115,13 +1115,30 @@ struct dst_entry *ip6_dst_lookup_flow(const struct > sock *sk, struct flowi6 *fl6, > * error code. > */ > struct dst_entry *ip6_sk_dst_lookup_flow(struct sock *sk, struct flowi6 *fl6, > - const struct in6_addr *final_dst) > + const struct in6_addr *final_dst, > + bool connected) > { > struct dst_entry *dst = sk_dst_check(sk, inet6_sk(sk)->dst_cookie); > > dst = ip6_sk_dst_check(sk, dst, fl6); > - if (!dst) > - dst = ip6_dst_lookup_flow(sk, fl6, final_dst); > + if (dst) > + return dst; > + > + dst = ip6_dst_lookup_flow(sk, fl6, final_dst); > + > + if (connected && !IS_ERR(dst)) > + ip6_dst_store(sk, dst_clone(dst), ...); Right, I think doing dst_store() only if ip6_sk_dst_check()/ sk_dst_check() returns NULL is a more sensible thing to do here. > > Thanks, > Alexey > > >> > >> Also, increase refcnt for dst, when passing it to ip6_dst_store() > >> because after that the dst cache can be released by other calls > >> to ip6_dst_store() with the same socket. > >> > >> Fixes: 33c162a980fe ("ipv6: datagram: Update dst cache of a connected > >> datagram sk during pmtu update") > >> Signed-off-by: Alexey Kodanev <alexey.koda...@oracle.com> > >> --- > >> > >> v2: * remove 'release_dst:' label > >> > >> * move ip6_dst_store() below MSG_CONFIRM check as > >> suggested by Eric and add dst_clone() > >> > >> * add 'Fixes' commit. > >> > >> > >> net/ipv6/udp.c | 29 +++++++++++------------------ > >> 1 file changed, 11 insertions(+), 18 deletions(-) > >> > >> diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c > >> index 52e3ea0..4508e5a 100644 > >> --- a/net/ipv6/udp.c > >> +++ b/net/ipv6/udp.c > >> @@ -1303,6 +1303,16 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr > >> *msg, size_t len) > >> goto do_confirm; > >> back_from_confirm: > >> > >> + if (connected)>> + ip6_dst_store(sk, dst_clone(dst), > >> + ipv6_addr_equal(&fl6.daddr, &sk->sk_v6_daddr) ? > >> + &sk->sk_v6_daddr : NULL, > >> +#ifdef CONFIG_IPV6_SUBTREES > >> + ipv6_addr_equal(&fl6.saddr, &np->saddr) ? > >> + &np->saddr : > >> +#endif > >> + NULL); > >> + > >> /* Lockless fast path for the non-corking case */ > >> if (!corkreq) { > >> struct sk_buff *skb; > >> @@ -1314,7 +1324,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr > >> *msg, size_t len) > >> err = PTR_ERR(skb); > >> if (!IS_ERR_OR_NULL(skb)) > >> err = udp_v6_send_skb(skb, &fl6); > >> - goto release_dst; > >> + goto out; > >> } > >> > >> lock_sock(sk); > >> @@ -1348,23 +1358,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr > >> *msg, size_t len) > >> err = np->recverr ? net_xmit_errno(err) : 0; > >> release_sock(sk); > >> > >> -release_dst: > >> - if (dst) { > >> - if (connected) { > >> - ip6_dst_store(sk, dst, > >> - ipv6_addr_equal(&fl6.daddr, > >> &sk->sk_v6_daddr) ? > >> - &sk->sk_v6_daddr : NULL, > >> -#ifdef CONFIG_IPV6_SUBTREES > >> - ipv6_addr_equal(&fl6.saddr, &np->saddr) ? > >> - &np->saddr : > >> -#endif > >> - NULL); > >> - } else { > >> - dst_release(dst); > >> - } > >> - dst = NULL; > >> - } > >> - > >> out: > >> dst_release(dst); > >> fl6_sock_release(flowlabel); > >> -- > >> 1.8.3.1 > >> >