> -----Original Message----- > From: Ilya Maximets [mailto:[email protected]] > Sent: Wednesday, July 17, 2019 5:26 PM > To: Van Haaren, Harry <[email protected]>; [email protected] > Cc: [email protected]; [email protected]; Stokes, Ian > <[email protected]> > Subject: Re: [PATCH v12 0/5] dpcls func ptrs & optimizations
<snip> > >> I performed a few tests with v11 of this patch-set on my usual setup to > >> check > >> performance of the new generic (non-optimized) implementation. The result > is > >> that > >> new generic implementation is ~5% faster for me than current master (it > was > >> 12% > >> for optimized lookup functions) which is good. > >> The code looks OK for me in general. I still don't really like the fact > that > >> dpcls depends on the internal structure of a flowmap, but we, probably, > >> could > >> deal with that while we have a build time assertion. Hope, we'll have > some > >> better > >> implementation with the same level of performance in the future. > > > > > > Thanks for the feedback on performance Ilya - good to see that we're going > > in the right direction performance wise anyway. > > > > I have one item I'd still like to improve in this patchset, and its > regarding > > where the blocks scratch array is being stored. I'll rework and find a > better > > solution than the current, and post that as the lucky patchset v13 :) > > As an idea, I could suggest DEFINE_STATIC_PER_THREAD_DATA inside the dpif- > netdev-lookup-generic.c. Thanks for the suggestion - that would make the blocks scratch pointer available on a per-thread basis, and be initialized to NULL correct? As a result, we would must a (predictable) branch to check if the pointer is NULL, to allocate the required space on first usage of the subtable_lookup? Or is there a better way to initialize it for all threads that call dpcls_lookup()? My concern here is what if one thread creates a subtable (and allocs blocks_scratch for itself), but then the PMD is polled by another thread - which uses its own blocks_scratch which is NULL. Hence, I think the runtime check + alloc is probably the best/safest way, but I'd appreciate your input on the above threading logic Ilya :) _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
