On 16 May 2022, at 22:04, Aaron Conole wrote:

> During flow processing, the flow wildcards are checked as a series of
> stages, and these stages are intended to carry dependencies in a single
> direction.  But when the neighbor discovery processing, for example, was
> executed there is an incorrect dependency chain - we need fields from
> stage 4 to determine whether we need fields from stage 3.
>
> We can build a set of flow rules to demonstrate this:
>   table=0,priority=100,ipv6,ipv6_src=1000::/10 actions=resubmit(,1)
>   table=0,priority=0 actions=NORMAL
>   table=1,priority=110,ipv6,ipv6_dst=1000::3 actions=resubmit(,2)
>   table=1,priority=100,ipv6,ipv6_dst=1000::4 actions=resubmit(,2)
>   table=1,priority=0 actions=NORMAL
>   
> table=2,priority=120,icmp6,nw_ttl=255,icmp_type=135,icmp_code=0,nd_sll=10:de:ad:be:ef:10
>  actions=NORMAL
>   table=2,priority=100,tcp actions=NORMAL
>   table=2,priority=100,icmp6 actions=NORMAL
>   table=2,priority=0 actions=NORMAL
>
> With this set of flows, any IPv6 packet that executes through this pipeline
> will have the corresponding nd_sll field flagged as required match for
> classification even if that field doesn't make sense in such a context
> (for example, TCP packets).  When the corresponding flow is installed into
> the kernel datapath, this field is not reflected when the revalidator
> executes the dump stage (see net/openvswitch/flow_netlink.c for more details).
>
> During the sweep stage, revalidator will compare the dumped WC with a
> generated WC - these will mismatch because the generated WC will match on
> the Neighbor Discovery fields, while the datapath WC will not match on
> these fields.  We will then invalidate the flow and as a side effect
> force an upcall.
>
> By redefining the boundary, we shift these fields to the l4 subtable, and
> cause masks to be generated matching just the requisite fields.  The list
> of fields being shifted:
>
>     struct in6_addr nd_target;
>     struct eth_addr arp_sha;
>     struct eth_addr arp_tha;
>     ovs_be16 tcp_flags;
>     ovs_be16 pad2;
>     struct ovs_key_nsh nsh;
>
> A standout field would be tcp_flags moving from l3 subtable matches to
> the l4 subtable matches.  This reverts a partial performance optimization
> in the case of stateless firewalling.  The tcp_flags field might have
> been a good candidate to retain in the l3 segment, but it got overloaded
> with ICMPv6 ND matching, and therefore we can't preserve this kind of
> optimization.
>
> Two other approaches were considered - moving the nd_target field alone
> and collapsing the l3/l4 segments into a single subtable for matching.
> Moving any field individually introduces ABI mismatch, and doesn't
> completely address the problems with other neighbor discovery related
> fields (such as nd_sll/nd_tll).  Collapsing the two subtables creates
> an issue with datapath flow explosion, since the l3 and l4 fields will
> be unwildcarded together (this can be seen with some of the existing
> classifier tests).
>
> A simple test is added to showcase the behavior.
>
> Fixes: 476f36e83bc5 ("Classifier: Staged subtable matching.")
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2081773
> Reported-by: Numan Siddique <[email protected]>
> Suggested-by: Ilya Maximets <[email protected]>
> Signed-off-by: Aaron Conole <[email protected]>
> ---

Bit later response (was planning on replying to the RFC patch), but I needed a 
bit more time (and experimenting) with this area of the code :)

The change looks good, and thanks for adding so many details in the commit 
message!

Acked-by: Eelco Chaudron <[email protected]>


_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to