On 04/18/2018 03:22 AM, Paolo Abeni wrote:
> This changeset extends the idea behind commit c8c8b127091b ("udp:
> under rx pressure, try to condense skbs"), trading more BH cpu
> time and memory bandwidth to decrease the load on the user space
> receiver.
>
> At boot time we allocate a limited amount of skbs with small
> data buffer, storing them in per cpu arrays. Such skbs are never
> freed.
>
> At run time, under rx pressure, the BH tries to copy the current
> skb contents into the cache - if the current cache skb is available,
> and the ingress skb is small enough and without any head states.
>
> When using the cache skb, the ingress skb is dropped by the BH
> - while still hot on cache - and the cache skb is inserted into
> the rx queue, after increasing its usage count. Also, the cache
> array index is moved to the next entry.
>
> The receive side is unmodified: in udp_recvmsg() the skb
> usage count is decreased and the skb is _not_ freed - since the
> cache keeps usage > 0. Since skb->usage is hot in the cache of the
> receiver at consume time - the receiver has just read skb->data,
> which lies in the same cacheline - the whole skb_consume_udp() becomes
> really cheap.
>
> UDP receive performance under flood improves as follows:
>
> NR RX queues    Kpps      Kpps      Delta (%)
>                 Before    After
>
> 1               2252      2305      2
> 2               2151      2569      19
> 4               2033      2396      17
> 8               1969      2329      18
>
> Overall performance of the knotd DNS server under real traffic flood
> improves as follows:
>
> Kpps      Kpps      Delta (%)
> Before    After
>
> 3777      3981      5
It might be time for the knotd DNS server to finally use SO_REUSEPORT instead
of adding this bloat to the kernel?

Sorry, a 5% improvement, when you can easily get a 300% improvement with no
kernel change, is not appealing to me :/
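
For anyone unfamiliar with the option, here is a rough userspace sketch of the SO_REUSEPORT pattern, in Python for brevity. It is not knotd code: the helper name open_reuseport_sockets and the loopback address are assumptions made for illustration. The idea is that each worker gets its own UDP socket bound to the same address and port, instead of all workers contending on one socket and one receive queue.

```python
import socket


def open_reuseport_sockets(host, port, count):
    """Open `count` UDP sockets bound to the same (host, port).

    Hypothetical helper for illustration only. Passing port 0 lets the
    kernel pick a free port for the first socket; the remaining sockets
    then bind to that same port.
    """
    socks = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            socks.append(s)
            # SO_REUSEPORT must be set on every socket before bind(),
            # including the first one, or later binds fail with EADDRINUSE.
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
            s.bind((host, port))
            if port == 0:
                # Reuse the kernel-chosen port for the remaining sockets.
                port = s.getsockname()[1]
    except OSError:
        for s in socks:
            s.close()
        raise
    return socks


if __name__ == "__main__":
    workers = open_reuseport_sockets("127.0.0.1", 0, 4)
    print("bound", len(workers), "sockets to port", workers[0].getsockname()[1])
    for s in workers:
        s.close()
```

Each worker thread or process would then call recvfrom() on its own socket. On Linux (3.9 and later) the kernel spreads incoming datagrams across the sockets in the group, so the receivers no longer serialize on a single socket.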