> On 18.07.2019 16:03, Harry van Haaren wrote:
> > This commit adds a number of specialized functions, that handle
> > common miniflow fingerprints. This enables compiler optimization,
> > resulting in higher performance. Below a quick description of
> > how this optimization actually works;
> >
> > "Specialized functions" are "instances" of the generic implementation,
> > but the compiler is given extra context when compiling. In the case of
> > iterating miniflow datastructures, the most interesting value to enable
> > compile time optimizations is the loop trip count per unit.
> >
> > In order to create a specialized function, there is a generic
> implementation,
> > which uses a for() loop without the compiler knowing the loop trip count
> at
> > compile time. The loop trip count is passed in as an argument to the
> function:
> >
> > uint32_t miniflow_impl_generic(struct miniflow *mf, uint32_t loop_count)
> > {
> > for(uint32_t i = 0; i < loop_count; i++)
> > // do work
> > }
> >
> > In order to "specialize" the function, we call the generic
> implementation
> > with hard-coded numbers - these are compile time constants!
> >
> > uint32_t miniflow_impl_loop5(struct miniflow *mf, uint32_t loop_count)
> > {
> > // use hard coded constant for compile-time constant-propogation
> > return miniflow_impl_generic(mf, 5);
> > }
> >
> > Given the compiler is aware of the loop trip count at compile time,
> > it can perform an optimization known as "constant propogation". Combined
> > with inlining of the miniflow_impl_generic() function, the compiler is
> > now enabled to *compile time* unroll the loop 5x, and produce "flat"
> code.
> >
> > The last step to using the specialized functions is to utilize a
> > function-pointer to choose the specialized (or generic) implementation.
> > The selection of the function pointer is performed at subtable creation
> > time, when miniflow fingerprint of the subtable is known. This technique
> > is known as "multiple dispatch" in some literature, as it uses multiple
> > items of information (miniflow bit counts) to select the dispatch
> function.
> >
> > By pointing the function pointer at the optimized implementation, OvS
> > benefits from the compile time optimizations at runtime.
> >
> > Signed-off-by: Harry van Haaren <[email protected]>
> > Tested-by: Malvika Gupta <[email protected]>
> >
> > ---
> >
> > v13:
> > - Update macros to new lookup function prototype without blocks scratch
> >
> > v12:
> > - Fix typo (Ian)
> > - Fix missing . after comments (Ian)
> > - Improve return value comments for probe function (Ian)
> > - Spaces after number in optimized lookup declarations (Ilya)
> > - Make VLOG level debug instead of info (Ilya)
> >
> > v11:
> > - Use MACROs to declare and check optimized functions (Ilya)
> > - Use captial letter for commit message (Ilya)
> > - Rebase onto latest patchset changes
> > - Added NEWS entry for data-path subtable specialization (Ian/Harry)
> > - Checkpatch notes an "incorrect bracketing" in the MACROs, however I
> > didn't find a solution that it does like.
> >
> > v10:
> > - Rebase changes from previous patches.
> > - Remove "restrict" keyword as windows CI failed, see here for details:
> > https://ci.appveyor.com/project/istokes/ovs-q8fvv/builds/24398228
> >
> > v8:
> > - Rework to use blocks_cache from the dpcls instance, to avoid variable
> > lenght arrays in the data-path.
> > ---
> > NEWS | 4 +++
> > lib/dpif-netdev-lookup-generic.c | 49 ++++++++++++++++++++++++++++++++
> > lib/dpif-netdev-private.h | 8 ++++++
> > lib/dpif-netdev.c | 9 ++++--
> > 4 files changed, 68 insertions(+), 2 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index 81130e667..4cfffb1bc 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -34,6 +34,10 @@ Post-v2.11.0
> > * 'ovs-appctl exit' now implies cleanup of non-internal ports in
> userspace
> > datapath regardless of '--cleanup' option. Use '--cleanup' to
> remove
> > internal ports too.
> > + * Datapath classifer code refactored to enable function pointers
> to select
> > + the lookup implementation at runtime. This enables
> specialization of
> > + specific subtables based on the miniflow attributes, enhancing
> the
> > + performance of the subtable search.
> > - OVSDB:
> > * OVSDB clients can now resynchronize with clustered servers much
> more
> > quickly after a brief disconnection, saving bandwidth and CPU
> time.
> > diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-lookup-
> generic.c
> > index 4dcf4640e..10ac41d8e 100644
> > --- a/lib/dpif-netdev-lookup-generic.c
> > +++ b/lib/dpif-netdev-lookup-generic.c
> > @@ -260,7 +260,56 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable
> *subtable,
> > const struct netdev_flow_key *keys[],
> > struct dpcls_rule **rules)
> > {
> > + /* Here the runtime subtable->mf_bits counts are used, which forces
> the
> > + * compiler to iterate normal for() loops. Due to this limitation
> in the
> > + * compilers available optimizations, this function has lower
> performance
> > + * than the below specialized functions.
> > + */
> > return lookup_generic_impl(subtable, keys_map, keys, rules,
> > subtable->mf_bits_set_unit0,
> > subtable->mf_bits_set_unit1);
> > }
> > +
> > +/* Expand out specialized functions with U0 and U1 bit attributes. */
> > +#define DECLARE_OPTIMIZED_LOOKUP_FUNCTION(U0, U1)
> \
> > + static uint32_t
> \
> > + dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1(
> \
> > + struct dpcls_subtable
> *subtable, \
> > + uint32_t keys_map,
> \
> > + const struct netdev_flow_key
> *keys[],\
> > + struct dpcls_rule **rules)
> \
> > + {
> \
> > + return lookup_generic_impl(subtable, keys_map, keys, rules, U0,
> U1); \
> > + }
> \
> > +
> > +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5, 1)
> > +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4, 1)
> > +DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4, 0)
> > +
> > +/* Check if a specialized function is valid for the required subtable.
> */
> > +#define CHECK_LOOKUP_FUNCTION(U0,U1)
> \
>
> Ian, could you, please, add a space after a comma here before applying the
> patch?
Sure, no problem.
Ian
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev