On Fri, Aug 18, 2017 at 07:50:03AM -0700, Tom Herbert wrote:
> > We had been using the auto_flowlabels=1 (i.e. essentially enable flowlabel)
> > mainly because we want to take the benefit of dst_negative_advice() when
> > tcp_write_timeout() happens.
> > During our test, our system handles quite well with changing flowlabel.
> > The only exception we have hit is the TCP_RST sent from an
> > inet_timewait_sock.
> That is interesting data. Have you determined why the middlebox has a
> problem with flow label change in TW state but not other states?
The problem of this middle box is specific to TCP_RST with a different
flowlabel from its previous packets. Sending TCP_RST from TW state hits
this pain point.
It seems like that middle box specifically drops TCP_RST if it
does not know anything about this flow. Since the flowlabel of the TCP_RST
(sent in TW state) is always different, it always lands to a different middle
box. All of these TCP_RST cannot be delivered.
We are resilience to a small number of TCP_RST drop. However, this guarantee
flowlabel change on TCP_RST and then dropped is too much.
This flowlabel change does not look like intentional either when transitioning
from full sk to tw sk (tw->tw_flowlabel is inheriting the np->flow_label
in tcp_time_wait()). Currently, the tw_flowlabel is used in
but not in tcp_v6_send_reset(). Hence, shaohua is looking for a solution to
> > If we keep the flowlabel consistent (or persistent sk_txhash), there
> > is no practical usage for us to turn on flowlabel and the problem also goes
> > away. We have it off for now.
> >> > There seems to have other bug in this side. From my understanding, commit
> >> > 265f94ff54d6(net: Recompute sk_txhash on negative routing advice) tries
> >> > to
> >> > select a different route. But the multipath selection code
> >> > (rt6_multipath_select) doesn't use sk_txhash or skb->hash, it does use
> >> > fl6.flowlabel, but that is the flowlabel user sets. So looks like the
> >> > commit
> >> > doesn't change anything.
> >> >
> >> The routing functions typically don't use sock of skbuff, but use flow
> >> structs instead. It may be reasonable to add a hash to those.
> > The localhost's mutlipath selection is another existing issue. AFAICT,
> > it does not take the sk_txhash (or skb->hash) into account and the following
> > dst_negative_advice() will also have no effect in the route selction.
> > It is another issue to be fixed and to be figured out how to pass the
> > sk_txhash down. (1)
> > Shaohua is proposing to record the 20 bits of the sk_txhash in the
> > tw_flowlabel of the 'struct inet_timewait_sock'. The tw_flowlabel could
> > potentially be used to do the multipath selection once we figured out
> > how to tackle (1).
> > Thanks,
> > Martin
> >> > What's the 'src port for UDP encap'? I can't find the code setting
> >> > skb->hash
> >> > to sk_txhash in UDP side.
> >> >
> >> udp_flow_src_port is function call by UDP encaps to set source port.
> >> This is call skb_get_hash. sk_set_txhash is function to set txhash
> >> right now to random value. skb_set_hash_from_sk set skb->hash when
> >> skbuff is owned by socket (skb_set_owner_w).
> >> Thanks,
> >> Tom
> >> > Thanks,
> >> > Shaohua