On 5/28/2024 6:48 AM, Alexander Lobakin wrote:
> Now that the queue and queue vector structures are separated and laid
> out optimally, group the fields as read-mostly, read-write, and cold
> cachelines and add size assertions to make sure new features won't push
> something out of its place and provoke perf regression.
> Despite looking innocent, this gives up to 2% of perf bump on Rx.
>
Could you explain this a bit more for my education? This patch clearly
changes the layout from what it was before, yet the commit message says
the structures were already laid out optimally. I guess that wasn't 100%
true? Or do these group-field macros also provide further hints to the
compiler, such as read_mostly or cold placement?
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
> ---
Having the compiler assert some of this so that we can more easily spot
regressions in the layout is a big benefit.
Thanks!
Although I'm not an expert on cache-line layout, I didn't see anything
wrong here:
Reviewed-by: Jacob Keller <[email protected]>
> drivers/net/ethernet/intel/idpf/idpf_txrx.h | 443 +++++++++++---------
> 1 file changed, 250 insertions(+), 193 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
> b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
> index 9e4ba0aaf3ab..731e2a73def5 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
> +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
> @@ -6,6 +6,7 @@
>
> #include <linux/dim.h>
>
> +#include <net/libeth/cache.h>
> #include <net/page_pool/helpers.h>
> #include <net/tcp.h>
> #include <net/netdev_queues.h>
> @@ -504,59 +505,70 @@ struct idpf_intr_reg {
>
> /**
> * struct idpf_q_vector
> + * @read_mostly: CL group with rarely written hot fields
I wonder if there is a better way to format the doc here, since we
almost want @read_mostly to act as a header making it clear which fields
belong to the group. I don't know how we'd achieve that with the current
kdoc tooling, though.