> -----Original Message----- > From: David Marchand <david.march...@redhat.com> > Sent: Wednesday, July 29, 2020 2:28 PM > To: Dumitrescu, Cristian <cristian.dumitre...@intel.com> > Cc: Xu, Ting <ting...@intel.com>; dev <dev@dpdk.org>; dpdk stable > <sta...@dpdk.org>; Kevin Traynor <ktray...@redhat.com>; Luca Boccassi > <bl...@debian.org> > Subject: Re: [dpdk-stable] [dpdk-dev] [PATCH v4] lib/table: fix cache > alignment issue > > On Wed, Jul 29, 2020 at 3:14 PM Dumitrescu, Cristian > <cristian.dumitre...@intel.com> wrote: > > > Please correct me if I am wrong, but it simply means this part of the > > > table library never worked for 32-bit. > > > It seems more adding 32-bit support rather than a fix and then I > > > wonder if it has its place in rc3. > > > > > > > Functionally. the code works, but performance is affected. > > > > The only thing that prevents the code from working is the check in the > table create function that checks the size of the above structure is 64 bytes, > which caught this issue. > > Yes, and that's my point. > It was not working. > It was not tested. > >
Not sure when this code was last tested on 32-bit systems, I'll let the validation folks comment on this, but I cannot rule out a change in compiler behavior either. This is a low complexity and low impact change, hence low risk IMO. > This patch asks for backport in stable branches, I will let Kevin and > Luca comment. > > > > > > > > > > > > > Now, looking at the details: > > > > > > For 64-bit on my x86, we have: > > > > > > struct rte_bucket_4_8 { > > > uint64_t signature; /* 0 8 */ > > > uint64_t lru_list; /* 8 8 */ > > > struct rte_bucket_4_8 * next; /* 16 8 */ > > > uint64_t next_valid; /* 24 8 */ > > > uint64_t key[4]; /* 32 32 */ > > > /* --- cacheline 1 boundary (64 bytes) --- */ > > > uint8_t data[]; /* 64 0 */ > > > > > > /* size: 64, cachelines: 1, members: 6 */ > > > }; > > > > > > > > > For 32-bit, we have: > > > > > > struct rte_bucket_4_8 { > > > uint64_t signature; /* 0 8 */ > > > uint64_t lru_list; /* 8 8 */ > > > struct rte_bucket_4_8 * next; /* 16 4 */ > > > uint64_t next_valid; /* 20 8 */ > > > uint64_t key[4]; /* 28 32 */ > > > uint8_t data[]; /* 60 0 */ > > > > > > /* size: 60, cachelines: 1, members: 6 */ > > > /* last cacheline: 60 bytes */ > > > } __attribute__((__packed__)); > > > > > > ^^ it is interesting that a packed attribute ends up here. > > > I saw no such attribute in the library code. > > > Compiler black magic at work I guess... > > > > > > > Where do you see the packet attribute? I don't see it in the code. > > That's pahole reporting this. > Maybe the tool extrapolates this attribute based on the next_valid > field placement... I don't know. > > > A packet attribute would explain this issue, i.e. why did the compiler > > decide > not to insert an expected padfing of 4 bytes right after the "next" field, > that > would allow the field "next_valid" to be aligned to its natural boundary of 8 > bytes. > > Or a 64-bit field on 32-bit has a special alignment that I am not aware of. > > > > > > > > > > > > > > > Fixes: 8aa327214c ("table: hash") > > > > Cc: sta...@dpdk.org > > > > > > > > Signed-off-by: Ting Xu <ting...@intel.com> > > > > > > > > --- > > > > v3->v4: Change design based on comment > > > > v2->v3: Rebase > > > > v1->v2: Correct patch time > > > > --- > > > > lib/librte_table/rte_table_hash_key16.c | 17 +++++++++++++++++ > > > > lib/librte_table/rte_table_hash_key32.c | 17 +++++++++++++++++ > > > > lib/librte_table/rte_table_hash_key8.c | 16 ++++++++++++++++ > > > > 3 files changed, 50 insertions(+) > > > > > > > > diff --git a/lib/librte_table/rte_table_hash_key16.c > > > b/lib/librte_table/rte_table_hash_key16.c > > > > index 2cca1c924..c4384b114 100644 > > > > --- a/lib/librte_table/rte_table_hash_key16.c > > > > +++ b/lib/librte_table/rte_table_hash_key16.c > > > > @@ -33,6 +33,7 @@ > > > > > > > > #endif > > > > > > > > +#ifdef RTE_ARCH_64 > > > > struct rte_bucket_4_16 { > > > > /* Cache line 0 */ > > > > uint64_t signature[4 + 1]; > > > > @@ -46,6 +47,22 @@ struct rte_bucket_4_16 { > > > > /* Cache line 2 */ > > > > uint8_t data[0]; > > > > }; > > > > +#else > > > > +struct rte_bucket_4_16 { > > > > + /* Cache line 0 */ > > > > + uint64_t signature[4 + 1]; > > > > + uint64_t lru_list; > > > > + struct rte_bucket_4_16 *next; > > > > + uint32_t pad; > > > > + uint64_t next_valid; > > > > + > > > > + /* Cache line 1 */ > > > > + uint64_t key[4][2]; > > > > + > > > > + /* Cache line 2 */ > > > > + uint8_t data[0]; > > > > +}; > > > > +#endif > > > > > > The change could simply be: > > > > > > @@ -38,6 +38,9 @@ struct rte_bucket_4_16 { > > > uint64_t signature[4 + 1]; > > > uint64_t lru_list; > > > struct rte_bucket_4_16 *next; > > > +#ifndef RTE_ARCH_64 > > > + uint32_t pad; > > > +#endif > > > uint64_t next_valid; > > > > > > /* Cache line 1 */ > > > > > > It avoids duplicating the whole structure definition (we could miss > > > updating one side of the #ifdef later). > > > Idem for the other "8" and "32" structures. > > > What about this comment? > > > -- > David Marchand