From: Paolo Abeni
Date: Wed, 4 Apr 2018 14:30:01 +0200
> After commit 694aba690de0 ("ipv4: factorize sk_wmem_alloc updates
> done by __ip_append_data()") and commit 1f4c6eb24029 ("ipv6:
> factorize sk_wmem_alloc updates done by __ip6_append_data()"),
> when transmitting a sub-MTU datagram, an additional, unneeded atomic
> operation is performed in ip*_append_data() to update wmem_alloc:
> in the above condition the delta is 0.
>
> The above causes a small but measurable performance regression in UDP
> xmit tput tests with packet sizes below the MTU.
>
> This change avoids such overhead by updating wmem_alloc only if
> wmem_alloc_delta is non-zero.
>
> The error path is left intentionally unmodified: it's a slow path
> and simplicity is preferred to performance.
>
> Fixes: 694aba690de0 ("ipv4: factorize sk_wmem_alloc updates done by
> __ip_append_data()")
> Fixes: 1f4c6eb24029 ("ipv6: factorize sk_wmem_alloc updates done by
> __ip6_append_data()")
> Signed-off-by: Paolo Abeni
...
> -	refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
> +	if (wmem_alloc_delta)
> +		refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
...
> -	refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
> +	if (wmem_alloc_delta)
> +		refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc);
This is simple enough, so applied.
But I wonder if atomic_{add,sub}() and refcount_{add,sub}() should just check
for zero inline, just like the {set,clear}_bit() implementations avoid the
atomic operation if the bit already has the desired value.
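
To make the idea concrete, here is a rough userspace sketch of such an inline
zero check, written with C11 atomics rather than the kernel's atomic_t or
refcount_t. The names fake_sock, add_nonzero() and charge() are hypothetical
and exist only for illustration; this is not the kernel implementation, and it
says nothing about whether skipping the operation is acceptable for callers
that rely on its ordering guarantees.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a socket carrying a wmem_alloc-style counter. */
struct fake_sock {
	atomic_int wmem_alloc;
};

/*
 * Hypothetical helper: perform the atomic read-modify-write only when
 * the delta is non-zero, so the common "nothing to add" case costs a
 * plain compare instead of an atomic operation.
 */
static inline void add_nonzero(int delta, atomic_int *v)
{
	if (delta)
		atomic_fetch_add_explicit(v, delta, memory_order_relaxed);
}

/* Charge delta to the counter and return the resulting value. */
static inline int charge(struct fake_sock *sk, int delta)
{
	add_nonzero(delta, &sk->wmem_alloc);
	return atomic_load_explicit(&sk->wmem_alloc, memory_order_relaxed);
}

/* Small self-check: a zero delta leaves the counter untouched. */
static inline void demo(void)
{
	struct fake_sock sk;

	atomic_init(&sk.wmem_alloc, 1);
	assert(charge(&sk, 0) == 1);
	assert(charge(&sk, 1500) == 1501);
}
```

With the check folded into the helper, callers like the two hunks above would
not need their own "if (wmem_alloc_delta)" test.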