Re: [PATCH net-next 09/10] net/mlx4_en: Replace TXBB_SIZE multiplications with shift operations

2017-06-20 Thread Tariq Toukan



On 20/06/2017 11:45 AM, David Laight wrote:
> From: Tariq Toukan
>> Sent: 15 June 2017 12:36
>> Define LOG_TXBB_SIZE, log of TXBB_SIZE, and use it with a shift
>> operation instead of a multiplication with TXBB_SIZE.
>> Operations are equivalent as TXBB_SIZE is a power of two.
>>
>> Performance tests:
>> Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
>>
>> Gain is too small to be measurable, no degradation sensed.
>> Results are similar for IPv4 and IPv6.
>
> I can't imagine there is any difference at all.
> The compiler will use a shift for a 'multiply by a constant power of 2'.

Yeah, I guess my compiler does that, because it's a constant known at
compile time.

> ...
> If you want to save a few cycles I think the loop:
>> -   for (i = 0; i < tx_info->nr_txbb * TXBB_SIZE;
>> +   for (i = 0; i < tx_info->nr_txbb << LOG_TXBB_SIZE;
> requires the compiler to generate code to read nr_txbb every
> iteration.
> Caching the values might help (unless it causes a different
> register spill).

That sounds good!
I'll prepare and send this after testing.
I'll also look for similar cases in the driver.

> David

Thank you David!
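
For readers following the thread, the equivalence discussed above can be sketched in a few lines of C. This is only an illustration: the value 6 for LOG_TXBB_SIZE and the helper names txbb_bytes_mul and txbb_bytes_shift are assumptions made for the example, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; the patch text does not state them. */
#define LOG_TXBB_SIZE 6
#define TXBB_SIZE     (1u << LOG_TXBB_SIZE)

/* The shift form is only equivalent when TXBB_SIZE is a power of two. */
static_assert((TXBB_SIZE & (TXBB_SIZE - 1)) == 0,
              "TXBB_SIZE must be a power of two");

/* Hypothetical helper: byte count computed with a multiplication. */
static inline uint32_t txbb_bytes_mul(uint32_t nr_txbb)
{
    return nr_txbb * TXBB_SIZE;
}

/* Hypothetical helper: the same byte count computed with a shift. */
static inline uint32_t txbb_bytes_shift(uint32_t nr_txbb)
{
    return nr_txbb << LOG_TXBB_SIZE;
}
```

Because TXBB_SIZE is a compile-time constant here, an optimizing compiler would typically emit the same instruction for both helpers, which matches the observation that no gain was measurable.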


RE: [PATCH net-next 09/10] net/mlx4_en: Replace TXBB_SIZE multiplications with shift operations

2017-06-20 Thread David Laight
From: Tariq Toukan
> Sent: 15 June 2017 12:36
> Define LOG_TXBB_SIZE, log of TXBB_SIZE, and use it with a shift
> operation instead of a multiplication with TXBB_SIZE.
> Operations are equivalent as TXBB_SIZE is a power of two.
> 
> Performance tests:
> Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
> 
> Gain is too small to be measurable, no degradation sensed.
> Results are similar for IPv4 and IPv6.

I can't imagine there is any difference at all.
The compiler will use a shift for a 'multiply by a constant power of 2'.

...
If you want to save a few cycles I think the loop:
> - for (i = 0; i < tx_info->nr_txbb * TXBB_SIZE;
> + for (i = 0; i < tx_info->nr_txbb << LOG_TXBB_SIZE;
requires the compiler to generate code to read nr_txbb every
iteration.
Caching the values might help (unless it causes a different
register spill).

David
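
The loop-bound caching suggested above might look roughly like the following. It is a sketch under stated assumptions, not the driver's code: struct tx_info_sketch, stamp_naive, stamp_cached, the byte-wise store, and the value of LOG_TXBB_SIZE are all hypothetical. One plausible reason for the reload is aliasing: if the compiler cannot prove that the store inside the loop leaves nr_txbb untouched, it has to re-read the field on every iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_TXBB_SIZE 6  /* assumed value, for illustration only */

/* Hypothetical stand-in for the driver's per-descriptor info. */
struct tx_info_sketch {
    uint32_t nr_txbb;
};

/* Bound re-evaluated on each iteration: the store through buf may alias
 * tx_info->nr_txbb, so the compiler may have to reload it every time. */
static inline uint32_t stamp_naive(const struct tx_info_sketch *tx_info,
                                   uint8_t *buf, uint8_t stamp)
{
    uint32_t i;

    assert(tx_info != NULL && buf != NULL);
    for (i = 0; i < tx_info->nr_txbb << LOG_TXBB_SIZE; i++)
        buf[i] = stamp;
    return i;
}

/* Bound read once into a local, so it can stay in a register. */
static inline uint32_t stamp_cached(const struct tx_info_sketch *tx_info,
                                    uint8_t *buf, uint8_t stamp)
{
    uint32_t i;

    assert(tx_info != NULL && buf != NULL);
    const uint32_t len = tx_info->nr_txbb << LOG_TXBB_SIZE;
    for (i = 0; i < len; i++)
        buf[i] = stamp;
    return i;
}
```

Both functions write the same bytes when buf does not overlap tx_info. Whether the cached form is actually faster depends on the compiler and on register pressure, as the caveat about register spills notes.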