On Thu, Aug 15, 2024 at 6:38 AM Dumitru Ceara <[email protected]> wrote:
>
> On 8/15/24 05:02, [email protected] wrote:
> > From: Numan Siddique <[email protected]>
> >
> > IPv6 ND Solicitation (NS) responder logical flows match on ip6.dst
> > field.  These flows when translated to datapath flows also match on
> > ip6.dst, which means a separate datapath flow per destination IP
> > address.  This may cause significant performance issues in some
> > setups (particularly ovs-dpdk telco deployments).
> >
> > This patch addresses this issue by matching on eth.mcast6 so that
> > datapath flows for normal IPv6 traffic doesn't have to match on
> > ip6.dst.  IPv6 NS packets are generally multicast.  A new logical
> > match "nd_ns_mcast" is added for this purpose.
> >
> > After this patch, We no longer respond to IPv6 NS unicast packets.
> > Let the target reply to it, so that the sender has the ability to
> > monitor the targe liveness via the unicast ND solicitations.
> > This behavior now matches the IPv4 ARP responder flows.  Note that
> > after the commit [1] which was recently added we now only respond
> > to IPv4 ARP broadcast packets.
> >
> > A recent patch [2] from Ilya partially addressed the same datapath
> > flow explosion issue by matching on eth.mcast6 for MLD packets.
> > With this patch, we now completely address the datapath flow
> > explosion issue for IPv6 traffic provided all the logical ports
> > of a logical switch are not configured with port security.
> >
> > [1] - c48ed1736a58 ("Do not reply on unicast arps for IPv4 targets.")
> > [2] - 43c34f2e6676 ("logical-fields: Add missing multicast matches for MLD 
> > and IGMP.")
> >
> > Note: Documentation for 'eth.mcastv6' and 'ip6.mcast' predicates were
> > missing from ovn-sb.xml and this patch adds it.
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-728
> > Reported-by: Mike Pattrick <[email protected]>
> > Signed-off-by: Numan Siddique <[email protected]>
> > ---
> >
> > v1 -> v2
> > -------
> >   - Addressed review comments from Ilya.
> >
>
> Hi Numan,
>
> For proxy-arp we still match on unicast ND as far as I can tell.  I'm
> not sure if we can remove that part though.  Maybe it's something to
> consider as a follow up?

Hi Dumitru,
Thanks for the review.  If a logical switch has  a router port with
proxy_arp configured
with IPv6 addresses then the dp flow would match on the ipv6 src/dst fields.

I've updated the commit message with this limitation.  I'm not sure if
we can avoid matching
on unicast ND solicitation for proxy_arp.  I need to think about it.
I guess it can be a follow up
if it's possible to address that.

I've submitted v3.  Please take a look -
https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/.
I removed the  system test and added the test in ovn.at instead so
that the test is reliable.

Thanks
Numan

>
> build_lswitch_arp_nd_responder_known_ips()
> [...]
>         /* Add IPv6 NDP responses.
>          * For ND solicitations, we need to listen for both the
>          * unicast IPv6 address and its all-nodes multicast address,
>          * but always respond with the unicast IPv6 address. */
>         if (op->proxy_arp_addrs.n_ipv6_addrs) {
>             struct ds ip6_dst_match = DS_EMPTY_INITIALIZER;
>             struct ds nd_target_match = DS_EMPTY_INITIALIZER;
>             for (size_t j = 0; j < op->proxy_arp_addrs.n_ipv6_addrs; j++) {
>                 ds_put_format(&ip6_dst_match, "%s/%u, %s/%u, ",
>                         op->proxy_arp_addrs.ipv6_addrs[j].addr_s,
>                         op->proxy_arp_addrs.ipv6_addrs[j].plen,
>                         op->proxy_arp_addrs.ipv6_addrs[j].sn_addr_s,
>                         op->proxy_arp_addrs.ipv6_addrs[j].plen);
>                 ds_put_format(&nd_target_match, "%s/%u, ",
>                         op->proxy_arp_addrs.ipv6_addrs[j].addr_s,
>                         op->proxy_arp_addrs.ipv6_addrs[j].plen);
>             }
>
> Aside from that I have a few small comments on the test.  In general
> the change makes sense to me.
>
> Thanks,
> Dumitru
>
> >  lib/logical-fields.c |   3 ++
> >  northd/northd.c      |  21 +++++---
> >  ovn-sb.xml           |   3 ++
> >  tests/ovn-northd.at  |   4 +-
> >  tests/ovn.at         |   4 +-
> >  tests/system-ovn.at  | 123 +++++++++++++++++++++++++++++++++++++++++++
> >  6 files changed, 146 insertions(+), 12 deletions(-)
> >
> > diff --git a/lib/logical-fields.c b/lib/logical-fields.c
> > index 2c9d3c61bf..5a8b53f2b6 100644
> > --- a/lib/logical-fields.c
> > +++ b/lib/logical-fields.c
> > @@ -293,6 +293,9 @@ ovn_init_symtab(struct shash *symtab)
> >                "icmp6.type == {135, 136} && icmp6.code == 0 && ip.ttl == 
> > 255");
> >      expr_symtab_add_predicate(symtab, "nd_ns",
> >                "icmp6.type == 135 && icmp6.code == 0 && ip.ttl == 255");
> > +    expr_symtab_add_predicate(symtab, "nd_ns_mcast",
> > +              "ip6.mcast && icmp6.type == 135 && icmp6.code == 0 && "
> > +              "ip.ttl == 255");
> >      expr_symtab_add_predicate(symtab, "nd_na",
> >                "icmp6.type == 136 && icmp6.code == 0 && ip.ttl == 255");
> >      expr_symtab_add_predicate(symtab, "nd_rs",
> > diff --git a/northd/northd.c b/northd/northd.c
> > index c4824aae3b..c9ae1e958f 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -9633,16 +9633,21 @@ build_lswitch_arp_nd_responder_known_ips(struct 
> > ovn_port *op,
> >                                                    op->lflow_ref);
> >              }
> >
> > -            /* For ND solicitations, we need to listen for both the
> > -             * unicast IPv6 address and its all-nodes multicast address,
> > -             * but always respond with the unicast IPv6 address. */
> > +            /* For ND solicitations:
> > +             *   - Reply only for the all-nodes multicast address(es) of 
> > the
> > +             *     logical port IPv6 address(es).
> > +             *
> > +             *   - Do not reply for unicast ND solicitations.  Let the 
> > target
> > +             *     reply it, so that the sender has the ability to monitor
> > +             *     the targe liveness via the unicast ND solicitations.
> > +             */
> >              for (size_t j = 0; j < op->lsp_addrs[i].n_ipv6_addrs; j++) {
> >                  ds_clear(match);
> > -                ds_put_format(match,
> > -                        "nd_ns && ip6.dst == {%s, %s} && nd.target == %s",
> > -                        op->lsp_addrs[i].ipv6_addrs[j].addr_s,
> > -                        op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s,
> > -                        op->lsp_addrs[i].ipv6_addrs[j].addr_s);
> > +                ds_put_format(
> > +                    match,
> > +                    "nd_ns_mcast && ip6.dst == %s && nd.target == %s",
> > +                    op->lsp_addrs[i].ipv6_addrs[j].sn_addr_s,
> > +                    op->lsp_addrs[i].ipv6_addrs[j].addr_s);
> >
> >                  ds_clear(actions);
> >                  ds_put_format(actions,
> > diff --git a/ovn-sb.xml b/ovn-sb.xml
> > index de0bd636fd..c11296d7c4 100644
> > --- a/ovn-sb.xml
> > +++ b/ovn-sb.xml
> > @@ -1146,6 +1146,7 @@
> >        <ul>
> >          <li><code>eth.bcast</code> expands to <code>eth.dst == 
> > ff:ff:ff:ff:ff:ff</code></li>
> >          <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
> > +        <li><code>eth.mcastv6</code> expands to <code>eth.dst[32..47] == 
> > 0x3333</code></li>
> >          <li><code>vlan.present</code> expands to 
> > <code>vlan.tci[12]</code></li>
> >          <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
> >          <li><code>ip4.src_mcast</code> expands to
> > @@ -1161,8 +1162,10 @@
> >          <li><code>ip.first_frag</code> expands to <code>ip.is_frag 
> > &amp;&amp; !ip.later_frag</code></li>
> >          <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
> >          <li><code>rarp</code> expands to <code>eth.type == 
> > 0x8035</code></li>
> > +        <li><code>ip6.mcast</code> expands to <code>eth.mcastv6 &amp;&amp; 
> > ip6.dst[120..127] == 0xff</code></li>
> >          <li><code>nd</code> expands to <code>icmp6.type == {135, 136} 
> > &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> >          <li><code>nd_ns</code> expands to <code>icmp6.type == 135 
> > &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> > +        <li><code>nd_ns_mcast</code> expands to <code>ip6.mcast  
> > &amp;&amp; icmp6.type == 135 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl 
> > == 255</code></li>
> >          <li><code>nd_na</code> expands to <code>icmp6.type == 136 
> > &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> >          <li><code>nd_rs</code> expands to <code>icmp6.type == 133 
> > &amp;&amp;
> >          icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index 3c49c0d43b..93ccbce6b0 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -9503,9 +9503,9 @@ AT_CAPTURE_FILE([S1flows])
> >  AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | ovn_strip_lflows], [0], [dnl
> >    table=??(ls_in_arp_rsp      ), priority=0    , match=(1), action=(next;)
> >    table=??(ls_in_arp_rsp      ), priority=100  , match=(arp.tpa == 
> > 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff && inport == 
> > "S1-vm1"), action=(next;)
> > -  table=??(ls_in_arp_rsp      ), priority=100  , match=(nd_ns && ip6.dst 
> > == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10 && inport == 
> > "S1-vm1"), action=(next;)
> > +  table=??(ls_in_arp_rsp      ), priority=100  , match=(nd_ns_mcast && 
> > ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10 && inport == "S1-vm1"), 
> > action=(next;)
> >    table=??(ls_in_arp_rsp      ), priority=50   , match=(arp.tpa == 
> > 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff), 
> > action=(eth.dst = eth.src; eth.src = 50:54:00:00:00:10; arp.op = 2; /* ARP 
> > reply */ arp.tha = arp.sha; arp.sha = 50:54:00:00:00:10; arp.tpa = arp.spa; 
> > arp.spa = 192.168.0.10; outport = inport; flags.loopback = 1; output;)
> > -  table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns && ip6.dst 
> > == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { 
> > eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; 
> > nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; 
> > };)
> > +  table=??(ls_in_arp_rsp      ), priority=50   , match=(nd_ns_mcast && 
> > ip6.dst == ff02::1:ff00:10 && nd.target == fd00::10), action=(nd_na { 
> > eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; 
> > nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; 
> > };)
> >  ])
> >
> >  #Set the disable_arp_nd_rsp option and verify the flow
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index a1d689e849..7583da7e07 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -8617,7 +8617,7 @@ done
> >  # Complete Neighbor Solicitation packet and Neighbor Advertisement packet
> >  # vif1 -> NS -> vif2.  vif1 <- NA <- ovn-controller.
> >  # vif2 will not receive NS packet, since ovn-controller will reply for it.
> > -ns_packet=3333ffa1f9aefa163e94059886dd6000000000203afffd81ce49a9480000f8163efffe940598fd81ce49a9480000f8163efffea1f9ae8700e01160000000fd81ce49a9480000f8163efffea1f9ae0101fa163e940598
> > +ns_packet=3333ffa1f9aefa163e94059886dd6000000000203afffd81ce49a9480000f8163efffe940598ff0200000000000000000001ffa1f9ae8700e01160000000fd81ce49a9480000f8163efffea1f9ae0101fa163e940598
> >  
> > na_packet=fa163e940598fa163ea1f9ae86dd6000000000203afffd81ce49a9480000f8163efffea1f9aefd81ce49a9480000f8163efffe9405988800e9ed60000000fd81ce49a9480000f8163efffea1f9ae0201fa163ea1f9ae
> >
> >  as hv1 ovs-appctl netdev-dummy/receive vif1 $ns_packet
> > @@ -14417,7 +14417,7 @@ test_ns_na() {
> >      local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5
> >
> >      packet=$(fmt_pkt "
> > -        Ether(dst='ff:ff:ff:ff:ff:ff', src='${src_mac}') /
> > +        Ether(dst='33:33:ff:ff:ff:ff', src='${src_mac}') /
> >          IPv6(src='${src_ip}', dst='ff02::1:ff00:2') /
> >          ICMPv6ND_NS(tgt='${dst_ip}')
> >      ")
> > diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> > index 6e4ec42473..e43244ca60 100644
> > --- a/tests/system-ovn.at
> > +++ b/tests/system-ovn.at
> > @@ -13899,3 +13899,126 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> >  /.*terminating with signal 15.*/d"])
> >  AT_CLEANUP
> >  ])
> > +
> > +dnl This test sends IPv6 TCP packets between two lports of the same
> > +dnl logical switch and checks that the datapath flows don't match on
> > +dnl IPv6 source and destination addresses.
> > +OVN_FOR_EACH_NORTHD([
> > +AT_SETUP([ovn -- IPv6 switching - datapath flows])
> > +AT_KEYWORDS([IPv6])
> > +
> > +ovn_start
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +ADD_BR([br-int])
> > +
> > +dnl Set external-ids in br-int needed for ovn-controller
> > +check ovs-vsctl \
> > +        -- set Open_vSwitch . external-ids:system-id=hv1 \
> > +        -- set Open_vSwitch . 
> > external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
> > +        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
> > +        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
> > +        -- set bridge br-int fail-mode=secure 
> > other-config:disable-in-band=true
> > +
> > +dnl Start ovn-controller
> > +start_daemon ovn-controller
> > +
> > +check ovn-nbctl ls-add sw0
> > +
> > +check ovn-nbctl lsp-add sw0 vm0
> > +check ovn-nbctl lsp-set-addresses vm0 "f0:00:0f:01:02:03 10.0.0.3 1000::3"
> > +
> > +check ovn-nbctl lsp-add sw0 vm1
> > +check ovn-nbctl lsp-set-addresses vm1 "f0:00:0f:01:02:04 10.0.0.4 1000::4"
> > +
> > +check ovn-nbctl --wait=hv sync
> > +
> > +ADD_NAMESPACES(vm0)
> > +ADD_VETH(vm0, vm0, br-int, "1000::3/64", "f0:00:0f:01:02:03", "1000::1", 
> > "nodad")
> > +
> > +ADD_NAMESPACES(vm1)
> > +ADD_VETH(vm1, vm1, br-int, "1000::4/64", "f0:00:0f:01:02:04", "1000::1", 
> > "nodad")
> > +
>
> It's probably best to ensure OVN bound the ports here, e.g.:
>
> dnl Wait for ovn-controller to catch up.
> wait_for_ports_up
> check ovn-nbctl --wait=hv sync
>
> The sync before ADD_NAMESPACES can probably be removed then.
>
> > +dnl Checks the datapath flows and makes sure that there are no matches to
> > +dnl IPv6 src or dst fields.
> > +check_dp_flows() {
> > +  ipv6_src_dst_match=$1
> > +  dnl Dump the flows. Will be helpful while debugging.
> > +  ovs-appctl dpctl/dump-flows
>
> It might be better to change this to something like:
>
> ovs-appctl dpctl/dump-flows > dp-flows
> AT_CAPTURE_FILE([dp-flows])
>
> > +
> > +  eth_match1="eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)"
> > +  eth_match2="eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)"
> > +  ipv6_src_match="ipv6(src="
> > +  ipv6_dst_match="ipv6(dst="
> > +
> > +  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" -c], [0], [dnl
> > +2
> > +])
> > +
> > +  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" -c], [0], [dnl
> > +2
> > +])
> > +
> > +  if [[ "$ipv6_src_dst_match" == "yes" ]]; then
> > +    dnl Check that the dp flow for 
> > eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)
> > +    dnl matches on IPv6 src or dst.
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep -e 
> > "$ipv6_src_match" -e "$ipv6_dst_match" -c], [0], [dnl
> > +2
> > +])
> > +
> > +    dnl Check that the dp flow for 
> > eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)
> > +    dnl matches on IPv6 src or dst.
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep -e 
> > "$ipv6_src_match" -e "$ipv6_dst_match" -c], [0], [dnl
> > +2
> > +])
> > +  else
> > +    dnl Check that the dp flow for 
> > eth(src=f0:00:0f:01:02:03,dst=f0:00:0f:01:02:04)
> > +    dnl doesn't match on IPv6 src or dst.
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep 
> > "$ipv6_src_match"], [1], [])
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match1" | grep 
> > "$ipv6_dst_match"], [1], [])
> > +
> > +    dnl Check that the dp flow for 
> > eth(src=f0:00:0f:01:02:04,dst=f0:00:0f:01:02:03)
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep 
> > "$ipv6_src_match"], [1], [])
> > +    AT_CHECK([ovs-appctl dpctl/dump-flows | grep "$eth_match2" | grep 
> > "$ipv6_dst_match"], [1], [])
> > +  fi
> > +}
> > +
> > +NS_CHECK_EXEC([vm0], [ping6 -q -c 3 -i 0.3 -w 2 1000::4 | FORMAT_PING], \
> > +[0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +
> > +# Start IPv6 TCP server on vm0.
> > +NETNS_DAEMONIZE([vm0], [nc -6 -l -k 1000::3 4242], [nc-vm1-1.pid])
> > +NS_CHECK_EXEC([vm1], [nc -6 1000::3 4242 -z], [0], [ignore], [ignore])
> > +
> > +check_dp_flows no
> > +
> > +# Now configure port security on vm0.  dp flows should now match on IPv6 
> > fields.
> > +check ovn-nbctl lsp-set-port-security vm0 "f0:00:0f:01:02:03 10.0.0.3 
> > 1000::3"
> > +check ovn-nbctl --wait=hv sync
> > +
> > +NS_CHECK_EXEC([vm0], [ping6 -q -c 3 -i 0.3 -w 2 1000::4 | FORMAT_PING], \
> > +[0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +
> > +NS_CHECK_EXEC([vm1], [nc -6 1000::3 4242 -z], [0], [ignore], [ignore])
> > +check_dp_flows yes
> > +
>
> The test failed here with the userspace datapath in the ovsrobot run:
> https://github.com/ovsrobot/ovn/actions/runs/10398228623/job/28795300825
>
> Is it a matter of waiting for output instead of checking for datapath flows
> immediately?  Alternatively do we need to wait for the revalidator to run
> (ovs-appctl revalidator/wait)?
>
> > +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> > +
> > +as ovn-sb
> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> > +
> > +as ovn-nb
> > +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> > +
> > +as northd
> > +OVS_APP_EXIT_AND_WAIT([ovn-northd])
> > +
> > +as
> > +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
> > +/failed to query port patch-.*/d
> > +/.*terminating with signal 15.*/d"])
> > +AT_CLEANUP
> > +])
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to