On 5/24/22 23:35, Aaron Conole wrote:
> During flow processing, the flow wildcards are checked as a series of
> stages, and these stages are intended to carry dependencies in a single
> direction.  But when the neighbor discovery processing, for example, was
> executed there is an incorrect dependency chain - we need fields from
> stage 4 to determine whether we need fields from stage 3.
> 
> We can build a set of flow rules to demonstrate this:
>   table=0,priority=100,ipv6,ipv6_src=1000::/10 actions=resubmit(,1)
>   table=0,priority=0 actions=NORMAL
>   table=1,priority=110,ipv6,ipv6_dst=1000::3 actions=resubmit(,2)
>   table=1,priority=100,ipv6,ipv6_dst=1000::4 actions=resubmit(,2)
>   table=1,priority=0 actions=NORMAL
>   
> table=2,priority=120,icmp6,nw_ttl=255,icmp_type=135,icmp_code=0,nd_sll=10:de:ad:be:ef:10
>  actions=NORMAL
>   table=2,priority=100,tcp actions=NORMAL
>   table=2,priority=100,icmp6 actions=NORMAL
>   table=2,priority=0 actions=NORMAL
> 
> With this set of flows, any IPv6 packet that executes through this pipeline
> will have the corresponding nd_sll field flagged as required match for
> classification even if that field doesn't make sense in such a context
> (for example, TCP packets).  When the corresponding flow is installed into
> the kernel datapath, this field is not reflected when the revalidator
> executes the dump stage (see net/openvswitch/flow_netlink.c for more details).
> 
> During the sweep stage, revalidator will compare the dumped WC with a
> generated WC - these will mismatch because the generated WC will match on
> the Neighbor Discovery fields, while the datapath WC will not match on
> these fields.  We will then invalidate the flow and as a side effect
> force an upcall.
> 
> By redefining the boundary, we shift these fields to the l4 subtable, and
> cause masks to be generated matching just the requisite fields.  The list
> of fields being shifted:
> 
>     struct in6_addr nd_target;
>     struct eth_addr arp_sha;
>     struct eth_addr arp_tha;
>     ovs_be16 tcp_flags;
>     ovs_be16 pad2;
>     struct ovs_key_nsh nsh;
> 
> A standout field would be tcp_flags moving from l3 subtable matches to
> the l4 subtable matches.  This reverts a partial performance optimization
> in the case of stateless firewalling.  The tcp_flags field might have
> been a good candidate to retain in the l3 segment, but it got overloaded
> with ICMPv6 ND matching, and therefore we can't preserve this kind of
> optimization.
> 
> Two other approaches were considered - moving the nd_target field alone
> and collapsing the l3/l4 segments into a single subtable for matching.
> Moving any field individually introduces ABI mismatch, and doesn't
> completely address the problems with other neighbor discovery related
> fields (such as nd_sll/nd_tll).  Collapsing the two subtables creates
> an issue with datapath flow explosion, since the l3 and l4 fields will
> be unwildcarded together (this can be seen with some of the existing
> classifier tests).
> 
> A simple test is added to showcase the behavior.
> 
> Fixes: 476f36e83bc5 ("Classifier: Staged subtable matching.")
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2081773
> Reported-by: Numan Siddique <nusid...@redhat.com>
> Suggested-by: Ilya Maximets <i.maxim...@ovn.org>
> Signed-off-by: Aaron Conole <acon...@redhat.com>
> Acked-by: Eelco Chaudron <echau...@redhat.com>
> Acked-by: Cian Ferriter <cian.ferri...@intel.com>
> Tested-by: Numan Siddique <num...@ovn.org>
> ---
> v1->v2: Added missing OVS_VSWITCHD_STOP and adjust whitespace on test case

This fixes an important bug that is really hard to track down on real
deployments.  Testing so far didn't show any significant issues with
this approach, so I applied the patch and backported down to 2.13.

Thanks!

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to