Re: [dpdk-dev] [PATCH] hash: document breakage with multi-writer thread

Wang, Yipeng1 Thu, 04 Jun 2020 13:22:53 -0700

> -----Original Message-----
> From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> Sent: Thursday, June 4, 2020 12:34 PM
> To: Wang, Yipeng1 <yipeng1.w...@intel.com>; Stephen Hemminger
> <step...@networkplumber.org>
> Cc: Gobriel, Sameh <sameh.gobr...@intel.com>; Richardson, Bruce
> <bruce.richard...@intel.com>; dev@dpdk.org; De Lara Guarch, Pablo
> <pablo.de.lara.gua...@intel.com>; nd <n...@arm.com>; Honnappa
> Nagarahalli <honnappa.nagaraha...@arm.com>; nd <n...@arm.com>
> Subject: RE: [PATCH] hash: document breakage with multi-writer thread
> 
> <snip>
> 
> > > > >
> > > > > > Subject: [PATCH] hash: document breakage with multi-writer
> > > > > > thread
> > > > > >
> > > > > > The code in rte_cuckoo_hash multi-writer support is broken if
> > > > > > write operations are called from a non-EAL thread.
> > > > > >
> > > > > > rte_lcore_id() wil return LCORE_ID_ANY (UINT32_MAX) for non
> > > > > > EAL thread and that leads to using wrong local cache.
> > > > > >
> > > > > > Add error checks and document the restriction.
> > > > > Having multiple non-EAL writer threads is a valid use case.
> > > > > Should we fix the
> > > > issue instead?
> > > >
> > > > Discovered this the hard way...
> > > >
> > > > Fixing is non-trivial. Basically, the local cache has to be take
> > > > out and that leads to having to do real locking or atomic operations.
> > > Looking at rte_hash_create function:
> > >
> > >         if (params->extra_flag &
> > > RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD) {
> > >                 use_local_cache = 1;
> > >                 writer_takes_lock = 1;
> > >         }
> > >
> > > The writer locks are in place already. The code to handle the case
> > > when local cache is taken out is also there.
> > > What we need is another input flag that says 'multi writer + non-eal
> > threads'
> > > which would set 'use_local_cache = 0' and 'writer_takes_lock = 1'.
> > > Not sure, it would be valuable addition. But looks like this is what
> > > you were expecting when you had enabled
> > > 'RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD'. Many other APIs in
> DPDK
> > do
> > > not provide this kind of MT safety.
> >
> > [Wang, Yipeng]
> > If possible, we can try to not add new flags, because there are
> > already a lot of flag options.
> > How about in the code, we check if the writer is a non-eal or not by
> > checking the rte_lcore_id, and operate on the global queue?
> > Could this work?
> > If(h->use_local_cache) {
> >     lcore_id = rte_lcore_id();
> >     if(lcore_id == LCORE_ID_ANY) {   // this is non-eal threads
> >             <call rte_ring_mp/mc_* to directly operate on global queue>
> >     }
> >     Else {
> >             <original path>
> >     }
> > }
> The other thing I wanted to do was saving on the memory allocated for the
> local cache when the writers are non-eal threads. Without knowing the kind
> of threads upfront, we might have to create the local cache when a writer
> adds the entry first time.


I got what you mean.  If people only use non-eal threads, we could save the 
space of local cache completely. 
Creating local cache during the first write is one solution. But the current 
rte_hash always allocate things during
table creation time. This provides guarantee that the program won't fail in the 
middle due to memory allocation issue.

Meanwhile I would rather be wasting some space than adding another option flag 
related to multi-threading.
In my opinion, all those flags are already confusing enough. It would also be 
harder to maintain in future.

Re: [dpdk-dev] [PATCH] hash: document breakage with multi-writer thread

Reply via email to