On 2/2/22 15:51, Cian Ferriter wrote:
> The subtable search function can be used at any time by a PMD thread.
> Setting the subtable search function should be done atomically to
> prevent garbage data from being read.
>
> A dpcls_subtable_lookup_reprobe() call can happen at the same time that
> DPCLS subtables are being sorted. This could lead to modifications done
> by the reprobe() to be lost. Prevent this from happening by locking on
> pmd->flow_mutex. After this change both the reprobe function and a
> subtable sort will share the flow_mutex preventing modifications by
> either one from being lost.
>
> Fixes: 3d018c3ea79d ("dpif-netdev: add subtable lookup prio set command.")
> Reported-by: Ilya Maximets <[email protected]>
> Reported-at:
> https://mail.openvswitch.org/pipermail/ovs-dev/2022-January/390757.html
> Signed-off-by: Cian Ferriter <[email protected]>
Hi, Cian. Thanks for working on this!
I didn't test that, but see some comments inline.
Best regards, Ilya Maximets.
> ---
> lib/dpif-netdev-private-dpcls.h | 2 +-
> lib/dpif-netdev.c | 23 +++++++++++++++++++----
> 2 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/lib/dpif-netdev-private-dpcls.h b/lib/dpif-netdev-private-dpcls.h
> index 7c4a840cb..1b4d69605 100644
> --- a/lib/dpif-netdev-private-dpcls.h
> +++ b/lib/dpif-netdev-private-dpcls.h
> @@ -84,7 +84,7 @@ struct dpcls_subtable {
> * property of the subtable (eg: only 3 bits of miniflow metadata is
> * used for the lookup) then this can point at an optimized version of
> * the lookup function for this particular subtable. */
> - dpcls_subtable_lookup_func lookup_func;
> + ATOMIC(dpcls_subtable_lookup_func) lookup_func;
Maybe a comment update saying why it's atomic?
>
> /* Caches the masks to match a packet to, reducing runtime calculations.
> */
> uint64_t *mf_masks;
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index e28e0b554..9ca6d9842 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -1074,7 +1074,9 @@ dpif_netdev_subtable_lookup_set(struct unixctl_conn
> *conn, int argc OVS_UNUSED,
> if (!cls) {
> continue;
> }
> + ovs_mutex_lock(&pmd->flow_mutex);
> uint32_t subtbl_changes = dpcls_subtable_lookup_reprobe(cls);
> + ovs_mutex_unlock(&pmd->flow_mutex);
> if (subtbl_changes) {
> lookup_dpcls_changed++;
> lookup_subtable_changed += subtbl_changes;
> @@ -9736,9 +9738,14 @@ dpcls_create_subtable(struct dpcls *cls, const struct
> netdev_flow_key *mask)
>
> /* Get the preferred subtable search function for this (u0,u1) subtable.
> * The function is guaranteed to always return a valid implementation,
> and
> - * possibly an ISA optimized, and/or specialized implementation.
> + * possibly an ISA optimized, and/or specialized implementation.
> Initialize
> + * the subtable search function atomically.
Again, the comment 'why?' would be helpful. OTOH, the comment in the structure
itself might be enough.
> */
> - subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1);
> + dpcls_subtable_lookup_func best_func =
> dpcls_subtable_get_best_impl(unit0,
> +
> unit1);
> + atomic_uintptr_t *subtable_func = (void *) &subtable->lookup_func;
> + atomic_init(subtable_func, (uintptr_t) best_func);
This pattern looks a bit weird, but for some reason it's used across
avx512-related
parts of the code. Something like this should work just fine:
atomic_init(&subtable->lookup_func,
dpcls_subtable_get_best_impl(unit0, unit1));
> +
Extra empty line is not needed.
>
> cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
> /* Add the new subtable at the end of the pvector (with no hits yet) */
> @@ -9779,10 +9786,18 @@ dpcls_subtable_lookup_reprobe(struct dpcls *cls)
> uint32_t u0_bits = subtable->mf_bits_set_unit0;
> uint32_t u1_bits = subtable->mf_bits_set_unit1;
> void *old_func = subtable->lookup_func;
> - subtable->lookup_func = dpcls_subtable_get_best_impl(u0_bits,
> u1_bits);
> +
> + /* Set the best impl to the subtable lookup function atomically. */
This comment doesn't seem very useful.
> + dpcls_subtable_lookup_func best_func =
> + dpcls_subtable_get_best_impl(u0_bits, u1_bits);
> + atomic_uintptr_t *subtable_func = (void *) &subtable->lookup_func;>
> + atomic_store_relaxed(subtable_func, (uintptr_t) best_func);
Same here.
atomic_store_relaxed(&subtable->lookup_func,
dpcls_subtable_get_best_impl(u0_bits, u1_bits));
> +
Please, add an empty line between declarations of variables and the rest of
the code instead.
> subtables_changed += (old_func != subtable->lookup_func);
> }
> - pvector_publish(pvec);
> + if (subtables_changed) {
> + pvector_publish(pvec);
This still does nothing useful. It's harmless, because we're under the
flow_mutex now, but it does nothing, because we're still not modifying
the vector, we're modifying the data to which pointers in the vector are
pointing to.
In the ideal world the vector and data it points to should be immutable.
So, we'll need to clone the implementation of the vector, remove and destroy
all the elements we would like to change, create modified versions and
add them to the vector. After that we can publish it. In this scenario
everything will be protected by rcu and no atomics required. But that
sounds like a waste of time for this particular use case, so we're
creating a workaround by using atomics and modifying the data while other
threads are using it. But we're not changing the vector itself.
Would be nice to have a comment though, why this function requires holding
a flow_mutex.
> + }
>
> return subtables_changed;
> }
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev