On 5/26/25 3:45 PM, Rukomoinikova Aleksandra wrote:
> Ilya, hi! Thank you for your help and sorry for the late reply.
> 
> Generally, I figured out the classifier code and understood what you 
> meant, but I noticed that when testing with a kernel module, there's a 
> difference in the way the classifier works, or something similar. In the 
> test I was having trouble with, the expected conntrack state is: 
> +new-est-rpl+trk. After my changes, it turns out to be: +new-rpl+trk. I 
> corrected the test to expect +new-rpl+trk, but in tests with the OVS 
> kernel datapath (check-kernel), the old conntrack state is expected: 
> +new-est-rpl+trk. I couldn't figure out the reason for this behavior, 
> can you suggest where it would be better to look and what it could be 
> connected with? Thanks!

Sorry, lost track of this discussion for some time.  If you run the same
test with 'make check-kernel' and 'make check-system-userspace', do
you see different conntrack states?  There might be some slight difference
in what kind of OpenFlow rules ovn-controller generates depending on the
OVS datapath type (features supported).

> 
> On 19.05.2025 23:14, Ilya Maximets wrote:
>> On 5/16/25 11:16 AM, Rukomoinikova Aleksandra wrote:
>>> Hi!
>>>
>>> We've encountered a strange issue while backporting patches to the
>>> version 24.03 branch (ovs v3.3.4) and running tests. Let me describe the
>>> situation:
>>> I took the upstream branch 24.03, added a stage at the beginning of the
>>> switch pipeline, and added a 'match all' flow with 'next;' action.
>>> Commit example:
>>> https://github.com/Sashhkaa/ovn/commit/f20295315c327addfeb6fe455c3b3c655d6b3666.
>>> After this change, OVN 79-82 userspace tests (ECMP symmetric reply)
>>> started failing.
>>> According to the test logs, I see the following:
>>> The test expects to see the conntrack state ct_state(+new-est-rpl+trk)
>>> in the datapath flow, but gets ct_state(+new-rpl+trk) - that is, -est
>>> disappears. I will also attach more detailed dumps below.
>> In general, the extra -est match is harmless and doesn't affect correctness,
>> because +new traffic is always -est.  And +est traffic is always -new.
>> So, I think, you may just update the test in your internal backport and
>> call it a day.
>>
>> For the actual reason why this is happening, the answer is: OpenFlow table
>> sharing between the switch and the router pipelines.
>>
>> Both the router and the switch pipelines have their OpenFlow rules in the
>> exact same OpenFlow tables starting from table 8.  This means that on 24.03
>> the ls_in_acl_action and the lr_in_ecmp_stateful stages are using the same
>> OpenFlow table 17.  When you add one stage to the switch pipeline, you shift
>> all switch tables by one while keeping router pipelines in place.  So, now
>> lr_in_ecmp_stateful shares the table with ls_in_acl_eval instead.
>>
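The table arithmetic described above can be sketched roughly as follows.
The constant and helper names are hypothetical and not taken from OVN; only
the offset of 8 and the logical table numbers come from this thread, and
logical table 8 for ls_in_acl_eval before the change is inferred from the
shift by one.

```python
# Illustration only: names are made up for this sketch, not taken from OVN.
FIRST_INGRESS_OFTABLE = 8  # ingress pipelines start at OpenFlow table 8


def openflow_table(logical_table):
    """Map a logical ingress table number to its OpenFlow table."""
    return FIRST_INGRESS_OFTABLE + logical_table


def sharing(switch_stages, router_stages):
    """Return {OpenFlow table: (switch stage, router stage)} for shared tables."""
    router_by_table = {openflow_table(n): s for s, n in router_stages.items()}
    return {
        openflow_table(n): (s, router_by_table[openflow_table(n)])
        for s, n in switch_stages.items()
        if openflow_table(n) in router_by_table
    }


router = {"lr_in_ecmp_stateful": 9}
switch_before = {"ls_in_acl_eval": 8, "ls_in_acl_action": 9}
# One extra stage at the start of the switch pipeline shifts every stage by one.
switch_after = {stage: n + 1 for stage, n in switch_before.items()}

print(sharing(switch_before, router))
print(sharing(switch_after, router))
```

With these inputs, table 17 pairs lr_in_ecmp_stateful with ls_in_acl_action
before the shift and with ls_in_acl_eval after it.
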
>> All the rules have a match on metadata fields that distinguishes switches
>> from routers and so there are no issues with correctness caused by sharing.
>> However, the classifier may add extra matches due to internal implementation
>> details.  Classifier will traverse all the rules in the OpenFlow table
>> starting with the highest priority.  If there are no rules that match the
>> packet in the current priority, classifier adds a minimal match to the
>> datapath flow that will distinguish this packet from any OpenFlow rule in
>> this table at this priority.  So, if one of the rules with the higher
>> priority had +est in the match, classifier will add -est to the datapath
>> flow for the packet that didn't match that flow.
>>
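A toy model of the lookup described above, as a minimal sketch. It is not
the OVS classifier code: the function name, the field names and the rule at
priority 200 are hypothetical, and the real classifier decides which fields
to add based on implementation details that are not modeled here.

```python
def lookup(rules, packet):
    """Toy priority-ordered lookup.

    rules:  list of (priority, match), match is a dict field -> required value.
    packet: dict field -> actual value (must define every field used in rules).
    Returns (matched rule or None, match that ends up in the datapath flow).
    """
    datapath_match = {}
    for priority in sorted({p for p, _ in rules}, reverse=True):
        hit = None
        for prio, match in rules:
            if prio != priority:
                continue
            mismatched = [f for f, v in match.items() if packet[f] != v]
            if not mismatched:
                hit = (prio, match)
                # A matching rule pins every field it looks at.
                for field in match:
                    datapath_match[field] = packet[field]
            else:
                # One differing field is enough to tell the packet apart
                # from this rule, so only that field is added.
                field = mismatched[0]
                datapath_match[field] = packet[field]
        if hit:
            return hit, datapath_match
    return None, datapath_match


packet = {"metadata": 1, "ct_new": True, "ct_est": False}
ecmp_rule = (100, {"metadata": 1, "ct_new": True})
est_rule = (200, {"ct_est": True})  # hypothetical higher-priority +est rule

_, with_est_rule = lookup([est_rule, ecmp_rule], packet)
_, without_est_rule = lookup([ecmp_rule], packet)
print(with_est_rule)
print(without_est_rule)
```

In this model the packet hits the same priority-100 rule in both runs, but
the datapath match gains ct_est=False (that is, -est) only when the
higher-priority +est rule is present in the table.
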
>> So, by adding an extra stage to the router pipeline, you're just restoring
>> the mapping of switch and router pipelines to OpenFlow tables like it was
>> before the backport.  By playing with ACL priorities, you're making the
>> classifier go to the next table before evaluating a lower priority rule that
>> has an extra +est match.
>>
>> We worked on one similar issue recently:
>>    
>> https://patchwork.ozlabs.org/project/ovn/patch/20250414085122.348614-4-dce...@redhat.com/
>> Here we had a -dnat match leak from the router pipeline to the switch
>> pipeline for the packet that does not even go through the router.  And that
>> breaks hardware offload because neither kernel nor hardware NICs support
>> offloading of NAT flags.
>>
>> Leaking of match criteria between switch and router pipelines is an
>> interesting side effect of OVN design, but should not generally cause issues,
>> except for hardware offloading in some cases.
>>
>> Best regards, Ilya Maximets.
>>
>>> The expected state should be set by matching this OpenFlow rule in table
>>> 17 (in OVN it is router pipeline table 9 - ECMP stateful):
>>>
>>>    cookie=0xdda3b0a7, duration=2.635s, table=17, n_packets=6,
>>> n_bytes=636, idle_age=1,
>>> priority=100,ct_state=+new-rpl+trk,ipv6,reg14=0x2,metadata=0x1,ipv6_dst=fd01::/126
>>> actions=ct(commit,zone=NXM_NX_REG11[0..15],nat(src),exec(move:NXM_OF_ETH_SRC[]->NXM_NX_CT_LABEL[32..79],load:0x2->NXM_NX_CT_MARK[16..31])),resubmit(,18)
>>>
>>>    cookie=0xdda3b0a7, duration=2.635s, table=17, n_packets=14,
>>> n_bytes=1396, idle_age=0,
>>> priority=100,ct_state=+est-rpl+trk,ipv6,reg14=0x2,metadata=0x1,ipv6_dst=fd01::/126
>>> actions=ct(commit,zone=NXM_NX_REG11[0..15],nat(src),exec(move:NXM_OF_ETH_SRC[]->NXM_NX_CT_LABEL[32..79],load:0x2->NXM_NX_CT_MARK[16..31])),resubmit(,18)
>>>
>>>
>>> I found two logical flow changes that work, though it's not clear why:
>>> 1) Adding a router table before ECMP processing:
>>> By inserting just one table at the very beginning of the router
>>> pipeline, before the ECMP stateful handling (for example,
>>> https://github.com/odivlad/ovn/commit/eb6d0d7409ff78f1fc0908a28225d0a2a47daa29
>>> one table is enough), the test starts passing. The mechanism isn't clear
>>> - packets now match the default flow in table 17 and only hit the proper
>>> ECMP rule in table 18, yet this somehow resolves the issue.
>>> 2) Modifying ACL evaluation rules:
>>> The second solution is even stranger. Since this test case doesn't
>>> use ACLs or load balancers, northd adds a 'match all' flow with 'next;'
>>> action and priority 65535 to the acl_eval table (logical table 9 in
>>> switch, OpenFlow table 17). When we lower the priority of these rules
>>> below 100 (the priority of the ECMP rules), the test begins working.
>>> This suggests some hidden interaction between router and switch pipeline
>>> rules, despite their different metadata matching criteria.
>>>
>>> When examining the OVS traces for both cases - the initial failed test
>>> with just a stage addition versus the working version where we also
>>> modified the ACL eval table priority to 0 - the packet's path through
>>> the tables shows no differences except for two key aspects: first, the
>>> rule matching in ACL eval (OpenFlow table 17), and second, the resulting
>>> datapath action where the -est state unexpectedly disappears. The trace
>>> comparison reveals that only the rule priorities in table 17 actually
>>> changed, yet this somehow impacts the connection tracking state. You can
>>> see the complete trace comparison showing both scenarios - with just the
>>> stage addition and with the priority modification - along with the
>>> contents of table 17 and the diff between traces at this link:
>>> https://gist.github.com/Sashhkaa/58b2c616e7d46fc2dafb898ed832960f.
>>> I've verified this behavior persists in newer versions of Open vSwitch
>>> as well.
>>> Does anyone understand what could be causing this issue? I'd appreciate
>>> any insights or suggestions for a proper fix. Thank you!
>>>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev