While stress-testing our "ucan" USB/CAN adapter SocketCAN driver on
Linux v4.16-rc4-383-ged58d66f60b3, we observed that a small fraction of
packets is delivered out of order.
We have tracked the problem down to the driver interface level: the
driver's net_device_ops.ndo_start_xmit() function appears to be handed
the packets in the wrong order.
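For illustration, the kind of check that exposes such reordering can be
sketched as below. This is a minimal, hypothetical userspace example and
not the test code actually used: it assumes every frame carries a
monotonically increasing 32-bit sequence number in its payload, and
count_out_of_order() is a made-up helper name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count frames that arrive with a sequence number
 * lower than the highest one seen so far, i.e. frames that were
 * overtaken by a later frame. Assumes the sender numbers frames
 * 0, 1, 2, ... without wrap-around during the test.
 */
static size_t count_out_of_order(const uint32_t *seq, size_t n)
{
	size_t late = 0;
	uint32_t highest = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i > 0 && seq[i] < highest)
			late++;			/* arrived after a newer frame */
		else
			highest = seq[i];	/* new high-water mark */
	}
	return late;
}
```

Fed the sequence numbers in the order they are seen at
ndo_start_xmit() (or on the receiving node), a non-zero result means at
least one frame was overtaken.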
The behavior does not occur on Linux v4.15, and I have bisected the
problem down to this commit:

Author: John Fastabend <john.fastab...@gmail.com>
Date:   Thu Dec 7 09:58:19 2017 -0800

    net: sched: pfifo_fast use skb_array

    This converts the pfifo_fast qdisc to use the skb_array data structure
    and set the lockless qdisc bit. pfifo_fast is the first qdisc to support
    the lockless bit that can be a child of a qdisc requiring locking. So
    we add logic to clear the lock bit on initialization in these cases when
    the qdisc graft operation occurs.

    This also removes the logic used to pick the next band to dequeue from
    and instead just checks a per priority array for packets from top priority
    to lowest. This might need to be a bit more clever but seems to work

    Signed-off-by: John Fastabend <john.fastab...@gmail.com>
    Signed-off-by: David S. Miller <da...@davemloft.net>

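For illustration, the dequeue strategy described in the commit message
(check a per-priority array from top priority to lowest) can be sketched
in userspace C as follows. This is not the kernel implementation: the
integer "packets", the fixed-size ring standing in for skb_array, the
RING_SIZE value and the prio_enqueue()/prio_dequeue() names are all
assumptions made for this example.

```c
#include <stddef.h>

#define NUM_BANDS 3	/* pfifo_fast has three priority bands */
#define RING_SIZE 8	/* assumed capacity, for illustration only */

/* Fixed-size FIFO ring standing in for one per-band skb_array. */
struct ring {
	int buf[RING_SIZE];
	size_t head;	/* index of the oldest entry */
	size_t len;	/* number of queued entries */
};

struct prio_fifo {
	struct ring band[NUM_BANDS];	/* band 0 is the highest priority */
};

/* Queue pkt on a band; returns 0, or -1 if the band is invalid or full. */
static int prio_enqueue(struct prio_fifo *q, size_t band, int pkt)
{
	struct ring *r;

	if (band >= NUM_BANDS)
		return -1;
	r = &q->band[band];
	if (r->len == RING_SIZE)
		return -1;
	r->buf[(r->head + r->len) % RING_SIZE] = pkt;
	r->len++;
	return 0;
}

/*
 * Scan the bands from highest to lowest priority and pop the oldest
 * packet of the first non-empty band. Returns 0 on success, or -1 if
 * every band is empty.
 */
static int prio_dequeue(struct prio_fifo *q, int *pkt)
{
	size_t b;

	for (b = 0; b < NUM_BANDS; b++) {
		struct ring *r = &q->band[b];

		if (r->len == 0)
			continue;
		*pkt = r->buf[r->head];
		r->head = (r->head + 1) % RING_SIZE;
		r->len--;
		return 0;
	}
	return -1;
}
```

Within a single band this scheme preserves FIFO order. The sketch is
single-threaded and does not model the lockless enqueue and dequeue
paths, so it cannot reproduce the reordering by itself.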
The commit does not revert cleanly, but checking out the commit
immediately before it makes the problem go away.
Selecting the "fq" qdisc instead of "pfifo_fast" makes the problem
go away as well.
Is this an unintended side-effect of the patch or is there something the
driver has to do to request in-order delivery?