> -----Original Message----- > From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > Sent: Thursday, June 4, 2020 12:34 PM > To: Wang, Yipeng1 <yipeng1.w...@intel.com>; Stephen Hemminger > <step...@networkplumber.org> > Cc: Gobriel, Sameh <sameh.gobr...@intel.com>; Richardson, Bruce > <bruce.richard...@intel.com>; dev@dpdk.org; De Lara Guarch, Pablo > <pablo.de.lara.gua...@intel.com>; nd <n...@arm.com>; Honnappa > Nagarahalli <honnappa.nagaraha...@arm.com>; nd <n...@arm.com> > Subject: RE: [PATCH] hash: document breakage with multi-writer thread > > <snip> > > > > > > > > > > > > Subject: [PATCH] hash: document breakage with multi-writer > > > > > > thread > > > > > > > > > > > > The code in rte_cuckoo_hash multi-writer support is broken if > > > > > > write operations are called from a non-EAL thread. > > > > > > > > > > > > rte_lcore_id() wil return LCORE_ID_ANY (UINT32_MAX) for non > > > > > > EAL thread and that leads to using wrong local cache. > > > > > > > > > > > > Add error checks and document the restriction. > > > > > Having multiple non-EAL writer threads is a valid use case. > > > > > Should we fix the > > > > issue instead? > > > > > > > > Discovered this the hard way... > > > > > > > > Fixing is non-trivial. Basically, the local cache has to be take > > > > out and that leads to having to do real locking or atomic operations. > > > Looking at rte_hash_create function: > > > > > > if (params->extra_flag & > > > RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD) { > > > use_local_cache = 1; > > > writer_takes_lock = 1; > > > } > > > > > > The writer locks are in place already. The code to handle the case > > > when local cache is taken out is also there. > > > What we need is another input flag that says 'multi writer + non-eal > > threads' > > > which would set 'use_local_cache = 0' and 'writer_takes_lock = 1'. > > > Not sure, it would be valuable addition. But looks like this is what > > > you were expecting when you had enabled > > > 'RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD'. Many other APIs in > DPDK > > do > > > not provide this kind of MT safety. > > > > [Wang, Yipeng] > > If possible, we can try to not add new flags, because there are > > already a lot of flag options. > > How about in the code, we check if the writer is a non-eal or not by > > checking the rte_lcore_id, and operate on the global queue? > > Could this work? > > If(h->use_local_cache) { > > lcore_id = rte_lcore_id(); > > if(lcore_id == LCORE_ID_ANY) { // this is non-eal threads > > <call rte_ring_mp/mc_* to directly operate on global queue> > > } > > Else { > > <original path> > > } > > } > The other thing I wanted to do was saving on the memory allocated for the > local cache when the writers are non-eal threads. Without knowing the kind > of threads upfront, we might have to create the local cache when a writer > adds the entry first time.
I got what you mean. If people only use non-eal threads, we could save the space of local cache completely. Creating local cache during the first write is one solution. But the current rte_hash always allocate things during table creation time. This provides guarantee that the program won't fail in the middle due to memory allocation issue. Meanwhile I would rather be wasting some space than adding another option flag related to multi-threading. In my opinion, all those flags are already confusing enough. It would also be harder to maintain in future.