On 16-08-17 03:34 PM, Eric Dumazet wrote:
> On Wed, 2016-08-17 at 12:33 -0700, John Fastabend wrote:
>
>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 4ce07dc..5db395d 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -3076,6 +3076,26 @@ static inline int __dev_xmit_skb(struct sk_buff *skb,
>> struct Qdisc *q,
>>  	int rc;
>>
>>  	qdisc_calculate_pkt_len(skb, q);
>> +
>> +	if (q->flags & TCQ_F_NOLOCK) {
>> +		if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
>> +			__qdisc_drop(skb, &to_free);
>> +			rc = NET_XMIT_DROP;
>> +		} else if ((q->flags & TCQ_F_CAN_BYPASS) && !qdisc_qlen(q)) {
>
> For a lockless qdisc, do you believe TCQ_F_CAN_BYPASS is still a gain ?
>

For the benchmarks from pktgen, it appears to be a win or moot to just
drop TCQ_F_CAN_BYPASS (just taking a look at one sample below):

	nolock & nobypass	locked (current master)
----------------------------------------------
 1:	1435796			1471479
 2:	1880642			1746231
 4:	1922935			1119626
 8:	1585055			1001471
12:	1479273			 989269

The only thing left would be to test a bunch of netperf RR sessions to
be sure.

> Also !qdisc_qlen(q) looks racy anyway ?

Yep, it's racy unless you make it an atomic, but that hurts the
performance metrics. There is a patch further in the stack here that
adds the atomic variants, but I tend to think we can just drop the
bypass logic in the lockless case, assuming the netperf tests look good.

>
>> +			qdisc_bstats_cpu_update(q, skb);
>> +			if (sch_direct_xmit(skb, q, dev, txq, root_lock, true))
>> +				__qdisc_run(q);
>> +			rc = NET_XMIT_SUCCESS;
>> +		} else {
>> +			rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
>> +			__qdisc_run(q);
>> +		}
>> +
>> +		if (unlikely(to_free))
>> +			kfree_skb_list(to_free);
>> +		return rc;
>> +	}
>> +
>
>
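
To make the "atomic" option for the !qdisc_qlen(q) check concrete, here is
a minimal userspace sketch using C11 atomics. It is not the patch from
later in the stack and not kernel code: struct toy_qdisc and the toy_*()
helpers are hypothetical names made up for illustration. It only shows the
idea of keeping the queue length in an atomic counter so the bypass test
reads a well-defined value when other CPUs enqueue concurrently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct Qdisc; only the queue length is modelled. */
struct toy_qdisc {
	atomic_int qlen;
};

static void toy_init(struct toy_qdisc *q)
{
	atomic_init(&q->qlen, 0);
}

/* Called by a producer after it has queued one packet. */
static void toy_enqueue(struct toy_qdisc *q)
{
	atomic_fetch_add_explicit(&q->qlen, 1, memory_order_release);
}

/* Called by a consumer after it has removed one packet. */
static void toy_dequeue(struct toy_qdisc *q)
{
	int prev = atomic_fetch_sub_explicit(&q->qlen, 1, memory_order_release);

	assert(prev > 0);	/* dequeue from an empty queue is a bug */
	(void)prev;		/* keep NDEBUG builds warning-free */
}

/* Atomic counterpart of the !qdisc_qlen(q) test in the bypass branch. */
static bool toy_can_bypass(struct toy_qdisc *q)
{
	return atomic_load_explicit(&q->qlen, memory_order_acquire) == 0;
}
```

Note that even with an atomic counter another CPU can still enqueue between
the check and the direct transmit, and every enqueue/dequeue now pays for an
atomic read-modify-write on a shared cache line. That cost is in line with
the performance concern above, and is one reason dropping the bypass in the
lockless case may be the simpler choice.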