Thanks for feedback,
I'm not an ipv6 expert, I did some research. There is no ARP in ipv6, it's
replaced with Neighbor Discovery
<https://www.ietf.org/proceedings/65/slides/16ng-3/sld5.htm>. The ARP
resolution is done via Neighbor Solicitation
<https://datatracker.ietf.org/doc/html/rfc4861#page-22> and Neighbour
Advertisement <https://datatracker.ietf.org/doc/html/rfc4861#section-4.4>
messages.
Neighbor Solicitation is an analog of ARP request and is sent to either
- solicited-node multicast address 33:33:ff:XX:XX:XX where XX:XX:XX is
the last 3 bytes of the target ipv6 address
- or target node address
So in our case we can update the flows to match only the solicited-node
multicast address 33:33:ff:XX:XX:XX as follows. From
table=??(ls_in_arp_rsp ), priority=100 , match=(nd_ns && ip6.dst ==
{fd00::10, ff02::1:ff00:10} && nd.target == fd00::10 && inport ==
"S1-vm1"), action=(next;)
table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns && ip6.dst ==
{fd00::10, ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na {
eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10;
nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output;
};)
to
table=??(ls_in_arp_rsp ), priority=100 , match=(nd_ns && *eth.dst ==
{33:33:ff:00:00:10, 33:33:ff:ff:00:10} *&& ip6.dst == {fd00::10,
ff02::1:ff00:10} && nd.target == fd00::10 && inport == "S1-vm1"),
action=(next;)
table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns && *eth.dst ==
{33:33:ff:00:00:10, 33:33:ff:ff:00:10}* && ip6.dst == {fd00::10,
ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { eth.src =
50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll =
50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };)
Will this be okay? Maybe we can do this in a separate patch?
On Wed, May 22, 2024 at 12:16 AM Han Zhou <[email protected]> wrote:
>
>
> On Tue, May 21, 2024 at 9:56 PM Numan Siddique <[email protected]> wrote:
> >
> > On Mon, May 20, 2024 at 1:47 AM Vasyl Saienko <[email protected]>
> wrote:
> > >
> > > Thanks for feedback Numan.
> > >
> > > On Wed, May 15, 2024 at 12:04 AM Numan Siddique <[email protected]>
> wrote:
> > >
> > > > On Mon, May 13, 2024 at 9:00 AM Dumitru Ceara <[email protected]>
> wrote:
> > > > >
> > > > > On 5/4/24 11:38, Vasyl Saienko wrote:
> > > > > > Hello Numan
> > > > > >
> > > > > > Thanks for review and feedback.
> > > > > >
> > > > > > On Fri, May 3, 2024 at 7:14 PM Numan Siddique <[email protected]>
> wrote:
> > > > > >
> > > > > >> On Mon, Apr 22, 2024 at 2:54 AM Vasyl Saienko <
> [email protected]>
> > > > > >> wrote:
> > > > > >>>
> > > > > >>> Reply only if target ethernet address is broadcast, if
> > > > > >>> address is specified explicitly do noting to let target
> > > > > >>> reply by itself. This technique allows to monitor target
> > > > > >>> aliveness with arping.
> > > > > >>
> > > > > >> Thanks for the patch.
> > > > > >>
> > > > > >> This patch does change the behavior of OVN. Having ARP
> responder
> > > > > >> flows avoids tunnelling the request packet if the destination
> is on a
> > > > > >> different compute node.
> > > > > >> But I don't think its a big deal.
> > > > > >>
> > > > > >> You are totally correct that the ARP responder allows limiting
> ARP
> > > > > > broadcast traffic between nodes. Normally ARP requests are sent
> to
> > > > > > broadcast ff:ff:ff:ff:ff:ff ethernet, as the protocol is
> designed to
> > > > > > identify ethernet addresses for known IP addresses. In this case
> since
> > > > > > traffic is broadcast it will be flooded among all nodes if ARP
> > > > responder is
> > > > > > not applied. At the same time the client may specify the target
> > > > > > ethernet address as unicast (in this case a broadcast storm will
> not
> > > > > > happen, as the destination ethernet address is unicast).
> > > > > >
> > > > > > In OpenStack we use unicast ARP requests as a way to monitor VM
> > > > aliveness
> > > > > > from remote locations to make sure there are no issues with
> > > > > > flows/underlying networking infra between nodes. We use ARPs due
> to
> > > > several
> > > > > > reasons:
> > > > > > 1. This protocol is mandatory and can't be disabled on the target
> > > > machine
> > > > > > (which guarantees that the target VM will always reply to ARPs
> if it is
> > > > > > alive)
> > > > > > 2. This protocol is not filtered by firewalls/security groups as
> it is
> > > > > > mandatory for network workability
> > > > > > 3. We can't use upper layers protocols such as ICMP as they will
> > > > require
> > > > > > specific firewall rules inside VMs, and we do not control VMs in
> cloud
> > > > > > cases, but still need to monitor infrastructure aliveness.
> > > > > >
> > > > > > If ARP responder replies to requests with broadcast and unicast
> target
> > > > > > ethernet address, we can't use this technique for monitoring, as
> even
> > > > if
> > > > > > target VM is down (not necessary that ovn port is down, the VM
> may
> > > > stuck or
> > > > > > be in kernel panic for example, or datapath between monitoring
> tool
> > > > and VM
> > > > > > is broken) the ARP responder will do reply to our request so we
> can't
> > > > > > identify if VM is really accessible or not as we will always got
> > > > replies
> > > > > > from local ARP responder. At the same time there is no need to
> set up
> > > > ARP
> > > > > > responder for requests with a unicast ethernet address, as they
> will
> > > > not
> > > > > > generate broadcast storms by design (they are unicasts).
> > > > > >
> > > > >
> > > > > Right, this sounds acceptable to me too.
> > > > >
> > > > > >
> > > > > >> 1. Accept your patch
> > > > > >> 2. Use the NB Global option - "ignore_lsp_down" and set it to
> false,
> > > > > >> so that ovn-northd will install the ARP responder flows only if
> the
> > > > > >> logical port is up.
> > > > > >>
> > > > > > This option will not allow catching a situation when there is an
> issue
> > > > in
> > > > > > datapath between monitoring software and VM located on another
> node,
> > > > for
> > > > > > example issues with flows or some underlying networking
> problems.
> > > > > >
> > > > > > Have you considered this approach for your use case ?
> > > > > >> 3. Add another global option - "disable_arp_responder" which
> when set
> > > > > >> to true, ovn-northd will not install ARP responder flows.
> > > > > >>
> > > > > >> Disabling ARP responder for ports completely will allow
> broadcast
> > > > storms
> > > > > > for ARP requests. Which we would like to avoid, but enable the
> > > > possibility
> > > > > > to monitor infrastructure aliveness.
> > > > > >
> > > > > >
> > > > > >> Given that OpenStack neutron ml2ovs uses ARP responder flows
> only for
> > > > > >> broad cast my preference would be (1).
> > > > > >>
> > > > > >> I'd like to know other's opinion on this.
> > > > > >>
> > > > >
> > > > > I'd accept the patch too so I'd go for (1).
> > > > >
> > > > > Regards,
> > > > > Dumitru
> > > > >
> > > > > >> Thanks
> > > > > >> Numan
> > > > > >>
> > > > > >>>
> > > > > >>> Closes #239
> > > > > >>>
> > > > > >>> Signed-off-by: Vasyl Saienko <[email protected]>
> > > > > >>> ---
> > > > > >>> northd/northd.c | 11 +++++++++--
> > > > > >>> tests/ovn-northd.at | 4 ++--
> > > > > >>> 2 files changed, 11 insertions(+), 4 deletions(-)
> > > > > >>>
> > > > > >>> diff --git a/northd/northd.c b/northd/northd.c
> > > > > >>> index 37f443e70..e80e1885d 100644
> > > > > >>> --- a/northd/northd.c
> > > > > >>> +++ b/northd/northd.c
> > > > > >>> @@ -8832,8 +8832,15 @@
> > > > build_lswitch_arp_nd_responder_known_ips(struct
> > > > > >> ovn_port *op,
> > > > > >>> for (size_t i = 0; i < op->n_lsp_addrs; i++) {
> > > > > >>> for (size_t j = 0; j <
> op->lsp_addrs[i].n_ipv4_addrs;
> > > > j++) {
> > > > > >>> ds_clear(match);
> > > > > >>> - ds_put_format(match, "arp.tpa == %s && arp.op
> == 1",
> > > > > >>> -
> op->lsp_addrs[i].ipv4_addrs[j].addr_s);
> > > > > >>> + /* NOTE(vsaienko): Do not reply on unicast
> ARPs,
> > > > forward
> > > > > >>> + * them to the target to have ability to
> monitor
> > > > target
> > > > > >>> + * aliveness via ARPs.
> > > > > >>> + */
> > > > > >>> + ds_put_format(match,
> > > > > >>> + "arp.tpa == %s && "
> > > > > >>> + "arp.op == 1 && "
> > > > > >>> + "eth.dst == ff:ff:ff:ff:ff:ff",
> > > > > >>> + op->lsp_addrs[i].ipv4_addrs[j].addr_s);
> > > > > >>> ds_clear(actions);
> > > > > >>> ds_put_format(actions,
> > > > > >>> "eth.dst = eth.src; "
> > > > > >>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > > > > >>> index be006fb32..4162196f4 100644
> > > > > >>> --- a/tests/ovn-northd.at
> > > > > >>> +++ b/tests/ovn-northd.at
> > > > > >>> @@ -9283,9 +9283,9 @@ AT_CAPTURE_FILE([S1flows])
> > > > > >>>
> > > > > >>> AT_CHECK([grep -e "ls_in_arp_rsp" S1flows |
> ovn_strip_lflows], [0],
> > > > [dnl
> > > > > >>> table=??(ls_in_arp_rsp ), priority=0 , match=(1),
> > > > > >> action=(next;)
> > > > > >>> - table=??(ls_in_arp_rsp ), priority=100 ,
> match=(arp.tpa ==
> > > > > >> 192.168.0.10 && arp.op == 1 && inport == "S1-vm1"),
> action=(next;)
> > > > > >>> + table=??(ls_in_arp_rsp ), priority=100 ,
> match=(arp.tpa ==
> > > > > >> 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff &&
> inport
> > > > ==
> > > > > >> "S1-vm1"), action=(next;)
> > > > > >>> table=??(ls_in_arp_rsp ), priority=100 , match=(nd_ns
> &&
> > > > > >> ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10
> &&
> > > > inport
> > > > > >> == "S1-vm1"), action=(next;)
> > > > > >>> - table=??(ls_in_arp_rsp ), priority=50 ,
> match=(arp.tpa ==
> > > > > >> 192.168.0.10 && arp.op == 1), action=(eth.dst = eth.src;
> eth.src =
> > > > > >> 50:54:00:00:00:10; arp.op = 2; /* ARP reply */ arp.tha =
> arp.sha;
> > > > arp.sha =
> > > > > >> 50:54:00:00:00:10; arp.tpa = arp.spa; arp.spa = 192.168.0.10;
> outport
> > > > =
> > > > > >> inport; flags.loopback = 1; output;)
> > > > > >>> + table=??(ls_in_arp_rsp ), priority=50 ,
> match=(arp.tpa ==
> > > > > >> 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff),
> > > > > >> action=(eth.dst = eth.src; eth.src = 50:54:00:00:00:10; arp.op
> = 2;
> > > > /* ARP
> > > > > >> reply */ arp.tha = arp.sha; arp.sha = 50:54:00:00:00:10;
> arp.tpa =
> > > > arp.spa;
> > > > > >> arp.spa = 192.168.0.10; outport = inport; flags.loopback = 1;
> output;)
> > > > > >>> table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns
> &&
> > > > > >> ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target ==
> fd00::10),
> > > > > >> action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = fd00::10;
> > > > nd.target
> > > > > >> = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport;
> > > > flags.loopback =
> > > > > >> 1; output; };)
> > > >
> > > > I took a look at this patch. There's one problem with this patch.
> If
> > > > you see the test case changes, this patch now
> > > > replaces the below logical flow
> > > >
> > > > table=??(ls_in_arp_rsp ), priority=100 , match=(arp.tpa ==
> > > > 192.168.0.10 && arp.op == 1 && inport == "S1-vm1"), action=(next;)
> > > >
> > > > with
> > > >
> > > > table=??(ls_in_arp_rsp ), priority=100 , match=(arp.tpa ==
> > > > 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff && inport
> > > > == "S1-vm1"), action=(next;)
> > > >
> > > > This is wrong. If there is an ARP request packet entering from a
> > > > router port to the switch, we were advancing this packet to the next
> > > > stage.
> > > > But now, we only advance if it's a broadcast pkt. Please correct
> this.
> > > >
> > > > This does not look wrong, the router is yet another client that uses
> ARP
> > > to map IP address to MAC address.
> > > If the MAC address is known, the router will not send ARP to the
> target at
> > > all, in other cases it will send it to broadcast where
> > > we will reply from the ARP responder.
> >
> > Yes. You're right.
>
> The patch looks correct to me, but shall we add the same for IPv6 (just a
> few lines below this in the same function)?
>
> Thanks,
> Han
>
> >
> > Thanks
> > Numan
> >
> >
> >
> > >
> > >
> > > > Also please update the logical flow documentation in ovn-northd.8.xml
> > > > for the "ls_in_arp_rsp" stage. I think its also worth mentioning
> this
> > > > change in the
> > > > NEWS file.
> > > >
> > > > Will update in next revision.
> > >
> > >
> > > > Thanks
> > > > Numan
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > >>> ])
> > > > > >>>
> > > > > >>> --
> > > > > >>> 2.40.1
> > > > > >>>
> > > > > >>> _______________________________________________
> > > > > >>> dev mailing list
> > > > > >>> [email protected]
> > > > > >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > > > >>>
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > > > _______________________________________________
> > > > > dev mailing list
> > > > > [email protected]
> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > > >
> > >
> > >
> > > --
> > > <https://www.mirantis.com/>
> > > Vasyl Saienko
> > >
> > > Principal DevOps Engineer
> > > [email protected] <[email protected]>
> > > +(380) 66 072 07 17 <++1+(650)+564+7038>
> > > _______________________________________________
> > > dev mailing list
> > > [email protected]
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
--
<https://www.mirantis.com/>
Vasyl Saienko
Principal DevOps Engineer
[email protected] <[email protected]>
+(380) 66 072 07 17 <++1+(650)+564+7038>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev