On Wed, Apr 9, 2025 at 5:08 PM Ilya Maximets <i.maxim...@ovn.org> wrote:
>
> On 4/9/25 5:43 PM, Numan Siddique wrote:
> > Hi OVS maintainers and others,
> >
> > In our deployments where we run OVS 3.2.2, we see a crash in
> > ovs-vswitchd when the ovn-controller
> > gets upgraded from 23.09 to 24.09.  The crash is seen exactly when
> > ovn-controller restarts
> > and connects to ovs-vswitchd.  We have configured
> > ovn-ofctrl-wait-before-clear="300000".
> > So ovn-controller doesn't delete the existing openflow rules as soon
> > as it connects to ovs-vswitchd.
> >
> > In 24.09, ovn-controller now configures the flow_table prefixes for
> > each table [1] which was not
> > the case in 23.09.
> >
> > We couldn't reproduce the issue again and it was seen only once on a
> > compute node.
> > Other compute nodes in the deployment didn't see this crash when the
> > upgrade happened.
> >
> > I suspect if we are hitting the same issue which this commit tried to fix 
> > [2].
>
> Yeah, the whole thing how trie updates are managed doesn't seem right.
> The memory fence in the classifier_lookup__() doesn't actually do much,
> because cls->n_tries is read multiple times after the fence and
> subtable->trie_plen is accessed after the fence.  So, I'm not sure
> ordering of what this fence is trying to ensure.
>
> The following sequence of events seems very possible:
> 0. n_tries is 2 from the beginning.
> 1. Main thread calls classifier_set_prefix_fields() and counts 3 tries.
> 2. Main thread clears subtable->trie_plen for 2 current tries.
> 3. Main thread waits for RCU to synchronize and continues.
> 4. Handler thread starts classification for some packet and calls
>    classifier_lookup__().
> 5. Handler thread copies pointers for 2 current tries into trie_ctx[].
> 6. Main thread initializes 3 tries, sets subtable->trie_plen for all
>    three of them and sets n_tries to 3.
> 7. Handler thread calls find_match_wc(trie_ctx[], cls->n_tries == 3).
> 8. Handler checks subtable->trie_plen[2] is set and attempts to use
>    trie_ctx[2] that is not initialized, as only two pointers were copied.
>
> One obvious thing here is that n_tries must be read only once in the
> classifier_lookup__(), so we do not access elements of the trie_ctx[]
> that were not initialized.
>
> A larger change with the re-evaluation of the whole thread fence thing
> might look something like this:
>
> 1. Convert cls->n_tries into atomic.
> 2. In classifier_set_prefix_fields():
> 2.1. Before making any changes, set cls->n_tries to zero with the
>      memory_order_release (this one may be relaxed?).
> 2.2. ovsrcu_synchronize() to wait for all threads to stop using tries.
> 2.3. Update tries while nobody is using them.
> 2.4. Set cls->n_tries to a new value with memory_order_release.
> 3. In the classifier_lookup__(), read cls->n_tries with the
>    memory_order_acquire and use that once read value throughout.
>
> Current code is trying to be smart and not disable trie lookups for
> tries that are not changing.  But it may be better to just turn them
> all off, as the schema above will do, and run without tries for a
> short period of time until they are reconfigured.  With that schema
> we can probably revert [1] and avoid clearing trie_plen fields and
> so cleanup a bit of relevant code.  But I didn't look that deep yet.
>

Thanks Ilya for the explanation.
I'm not familiar with the classifier code.  I'll see if I can take a
look at the fix.
If you or others want to address it, feel free to look at it.

Thanks
Numan

> Best regards, Ilya Maximets.
>
> >
> > From the coredump analysis,
> > `check_tries` function at `../lib/classifier.c:1585` is where the
> > crash seems to occur.
> > It's trying to access `ctx->trie->field` and it is NULL.
> >
> > Any pointers on what could be going on here ?
> >
> > [1] - 8d64f2b7dc ("controller: Fix IPv6 dp flow explosion by setting
> > flow table prefixes.")
> >       
> > https://github.com/ovn-org/ovn/commit/8d64f2b7dcef24d175151ba5e0732281cdeb6d54
> >
> > [2] - a6117059904 "(classifier: Prevent tries vs n_tries race leading
> > to NULL dereference.")
> >       
> > https://github.com/openvswitch/ovs/commit/a6117059904bb692039c926221964dd6d49b3bfd
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to