On 4/12/25 5:52 PM, Numan Siddique wrote: > On Wed, Apr 9, 2025 at 5:08 PM Ilya Maximets <i.maxim...@ovn.org> wrote: >> >> On 4/9/25 5:43 PM, Numan Siddique wrote: >>> Hi OVS maintainers and others, >>> >>> In our deployments where we run OVS 3.2.2, we see a crash in >>> ovs-vswitchd when the ovn-controller >>> gets upgraded from 23.09 to 24.09. The crash is seen exactly when >>> ovn-controller restarts >>> and connects to ovs-vswitchd. We have configured >>> ovn-ofctrl-wait-before-clear="300000". >>> So ovn-controller doesn't delete the existing openflow rules as soon >>> as it connects to ovs-vswitchd. >>> >>> In 24.09, ovn-controller now configures the flow_table prefixes for >>> each table [1] which was not >>> the case in 23.09. >>> >>> We couldn't reproduce the issue again and it was seen only once on a >>> compute node. >>> Other compute nodes in the deployment didn't see this crash when the >>> upgrade happened. >>> >>> I suspect if we are hitting the same issue which this commit tried to fix >>> [2]. >> >> Yeah, the whole thing how trie updates are managed doesn't seem right. >> The memory fence in the classifier_lookup__() doesn't actually do much, >> because cls->n_tries is read multiple times after the fence and >> subtable->trie_plen is accessed after the fence. So, I'm not sure >> ordering of what this fence is trying to ensure. >> >> The following sequence of events seems very possible: >> 0. n_tries is 2 from the beginning. >> 1. Main thread calls classifier_set_prefix_fields() and counts 3 tries. >> 2. Main thread clears subtable->trie_plen for 2 current tries. >> 3. Main thread waits for RCU to synchronize and continues. >> 4. Handler thread starts classification for some packet and calls >> classifier_lookup__(). >> 5. Handler thread copies pointers for 2 current tries into trie_ctx[]. >> 6. Main thread initializes 3 tries, sets subtable->trie_plen for all >> three of them and sets n_tries to 3. >> 7. Handler thread calls find_match_wc(trie_ctx[], cls->n_tries == 3). >> 8. Handler checks subtable->trie_plen[2] is set and attempts to use >> trie_ctx[2] that is not initialized, as only two pointers were copied. >> >> One obvious thing here is that n_tries must be read only once in the >> classifier_lookup__(), so we do not access elements of the trie_ctx[] >> that were not initialized. >> >> A larger change with the re-evaluation of the whole thread fence thing >> might look something like this: >> >> 1. Convert cls->n_tries into atomic. >> 2. In classifier_set_prefix_fields(): >> 2.1. Before making any changes, set cls->n_tries to zero with the >> memory_order_release (this one may be relaxed?). >> 2.2. ovsrcu_synchronize() to wait for all threads to stop using tries. >> 2.3. Update tries while nobody is using them. >> 2.4. Set cls->n_tries to a new value with memory_order_release. >> 3. In the classifier_lookup__(), read cls->n_tries with the >> memory_order_acquire and use that once read value throughout. >> >> Current code is trying to be smart and not disable trie lookups for >> tries that are not changing. But it may be better to just turn them >> all off, as the schema above will do, and run without tries for a >> short period of time until they are reconfigured. With that schema >> we can probably revert [1] and avoid clearing trie_plen fields and >> so cleanup a bit of relevant code. But I didn't look that deep yet. >> > > Thanks Ilya for the explanation.
Np. > I'm not familiar with the classifier code. I'll see if I can take a > look at the fix. We don't really have many people who knows the classifier code well. So, having new eyes on it is good for a long run. :) > If you or others want to address it, feel free to look at it. I'll be off for a week, but if you'll find you don't have much time for this, let me know, I can take over once I'm back. Thanks! Best regards, Ilya Maximets. > > Thanks > Numan > >> Best regards, Ilya Maximets. >> >>> >>> From the coredump analysis, >>> `check_tries` function at `../lib/classifier.c:1585` is where the >>> crash seems to occur. >>> It's trying to access `ctx->trie->field` and it is NULL. >>> >>> Any pointers on what could be going on here ? >>> >>> [1] - 8d64f2b7dc ("controller: Fix IPv6 dp flow explosion by setting >>> flow table prefixes.") >>> >>> https://github.com/ovn-org/ovn/commit/8d64f2b7dcef24d175151ba5e0732281cdeb6d54 >>> >>> [2] - a6117059904 "(classifier: Prevent tries vs n_tries race leading >>> to NULL dereference.") >>> >>> https://github.com/openvswitch/ovs/commit/a6117059904bb692039c926221964dd6d49b3bfd _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev