On 07/10/2023 14:02, Luci Stanescu via Dnsmasq-discuss wrote:
Hi,
I've discovered that DHCPv6 doesn't work on Linux interfaces enslaved to
a VRF. Now, I believe this to be a bug in the kernel and I've reported
it, but in case you'd like to implement a workaround in dnsmasq, this is
quite trivial, as I'll explain in a bit.
The issue is that when a datagram is received on an interface enslaved
to a VRF device, the sin6_scope_id of the msg_name field returned by
recvmsg() holds the interface index of the VRF device instead of that of
the enslaved device. When the source address is a link-local address,
this is completely useless: a subsequent sendmsg() which specifies that
scope will fail with ENETUNREACH, as expected, since the scope would
have to be the interface index of the enslaved device (there can, of
course, be multiple interfaces enslaved to a single VRF device).
With DHCPv6, a Solicit is received from a link-local address and the
Advertise is sent back to that source address, with the scope taken from
the msg_name field returned by recvmsg(). I've debugged this using
strace, as dnsmasq doesn't print any errors when the send fails. Here is
the recvmsg() call:
recvmsg(6, {msg_name={sa_family=AF_INET6, sin6_port=htons(546),
sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fe80::216:3eff:fed0:4e7d",
&sin6_addr), sin6_scope_id=if_nametoindex("myvrf")}, msg_namelen=28,
msg_iov=[{iov_base="\1\203\273\n\0\1\0\16\0\1\0\1,\262\320k\0\26>\320N}\0\6\0\10\0\27\0\30\0'"..., iov_len=548}], msg_iovlen=1, msg_control=[{cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=0x32}], msg_controllen=40, msg_flags=0}, MSG_PEEK|MSG_TRUNC) = 56
and the sending of the response later on:
sendto(6,
"\2\203\273\n\0\1\0\16\0\1\0\1,\262\320k\0\26>\320N}\0\2\0\16\0\1\0\1,\262"..., 114, 0, {sa_family=AF_INET6, sin6_port=htons(546), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fe80::216:3eff:fed0:4e7d", &sin6_addr), sin6_scope_id=if_nametoindex("myvrf")}, 28) = -1 ENETUNREACH (Network is unreachable)
Please notice that the scope is the index of the VRF master device, so
the sendto() call is certain to fail.
When reporting the issue as a kernel bug, I reproduced it using local
communication with unicast and a couple of simple Python scripts. Here's
a reproduction using local communication, but with multicast, to make it
closer to home:
First, set up a VRF device and a veth pair, with one end enslaved to the
VRF master (on which we'll be receiving datagrams) and the other end
used to send datagrams.
ip link add myvrf type vrf table 42
ip link set myvrf up
ip link add veth1 type veth peer name veth2
ip link set veth1 master myvrf up
ip link set veth2 up
# ip link sh dev myvrf
110: myvrf: <NOARP,MASTER,UP,LOWER_UP> mtu 65575 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether da:ca:c9:2b:6e:02 brd ff:ff:ff:ff:ff:ff
# ip addr sh dev veth1
112: veth1@veth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master myvrf state UP group default qlen 1000
    link/ether 32:63:cf:f5:08:35 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::3063:cfff:fef5:835/64 scope link
       valid_lft forever preferred_lft forever
# ip addr sh dev veth2
111: veth2@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 1a:8f:5a:85:3c:c0 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::188f:5aff:fe85:3cc0/64 scope link
       valid_lft forever preferred_lft forever
The receiver:
import socket
import struct

s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_RECVPKTINFO, 1)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, b'veth1')
s.bind(('', 2000, 0, 0))
mreq = struct.pack('@16sI', socket.inet_pton(socket.AF_INET6, 'ff02::1:2'), socket.if_nametoindex('veth1'))
s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP, mreq)
while True:
    data, cmsg_list, flags, source = s.recvmsg(4096, 4096)
    for level, type, cmsg_data in cmsg_list:
        if level == socket.IPPROTO_IPV6 and type == socket.IPV6_PKTINFO:
            dest_address, dest_scope = struct.unpack('@16sI', cmsg_data)
            dest_address = socket.inet_ntop(socket.AF_INET6, dest_address)
            dest_scope = socket.if_indextoname(dest_scope)
            print("PKTINFO destination {} {}".format(dest_address, dest_scope))
    source_address, source_port, source_flow, source_scope = source
    source_scope = socket.if_indextoname(source_scope)
    print("name source {} {}".format(source_address, source_scope))
And the sender:
import socket
s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
dest = ('ff02::1:2', 2000, 0, socket.if_nametoindex('veth2'))
s.sendto(b'foo', dest)
The receiver will print:
PKTINFO destination ff02::1:2 veth1
name source fe80::188f:5aff:fe85:3cc0 myvrf
Please notice that the receiver gets the right source address, the one
associated with veth2, but the scope identifies the VRF master. However,
the scope in PKTINFO correctly identifies the index of the interface on
which the datagram was actually received, the VRF slave veth1.
As I mentioned, I believe this is a bug in the kernel and I've opened a
bug report for it. But, considering that dnsmasq already seems to
request IPV6_PKTINFO (0x32) in recvmsg(), as shown in the strace above
(msg_control=[{cmsg_len=36, cmsg_level=SOL_IPV6, cmsg_type=0x32}]), I
believe a workaround using the scope from there would work just fine and
would be trivial to implement.
--
Luci Stanescu
Thanks for a very clear and thorough explanation.
Three things come to mind.
1) Even if this is a kernel bug, kernel bug fixes take a long time to
spread, so working around them in dnsmasq is a good thing to do, as long
as it doesn't leave us with long-term technical debt. This wouldn't be
the first time a kernel bug has been worked around.
2) https://docs.kernel.org/networking/vrf.html says:
Applications that are to work within a VRF need to bind their socket to
the VRF device:
setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev)+1);
or to specify the output device using cmsg and IP_PKTINFO.
Which kind of implies that this might not be a kernel bug, rather that
we're just not doing what's required to work with VRF.
Setting the device to send on using IPV6_PKTINFO, rather than relying on
the sin6_scope_id field of the destination address, would be quite
possible, and the above implies that it will work. This brings us on to
3) IPv4. Does DHCPv4 work with VRF devices? It would be nice to test,
and fix any similar problems in the same patch. Interestingly, the
DHCPv4 code already sets the outgoing device via IP_PKTINFO (there being
no scope field in an IPv4 sockaddr), so it stands a chance of just
working.
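For reference, a sketch of what setting the outgoing device via ancillary data looks like from Python (the helper names are mine; dnsmasq does this in C): pack a struct in6_pktinfo carrying the wanted interface index and attach it to sendmsg() as an IPV6_PKTINFO control message.

```python
import socket
import struct

# Linux value of IPV6_PKTINFO, in case this Python build doesn't expose it.
IPV6_PKTINFO = getattr(socket, 'IPV6_PKTINFO', 50)

def pktinfo_cmsg(ifindex):
    """Build an IPV6_PKTINFO control message selecting the outgoing
    interface: struct in6_pktinfo is a 16-byte source address (left
    unspecified here, so the kernel picks one) followed by the
    interface index as a native unsigned int."""
    return (socket.IPPROTO_IPV6, IPV6_PKTINFO,
            struct.pack('@16sI', bytes(16), ifindex))

def send_via_interface(sock, payload, dest, ifindex):
    # The ancillary data determines the egress interface, rather than
    # the sin6_scope_id in the destination sockaddr.
    return sock.sendmsg([payload], [pktinfo_cmsg(ifindex)], 0, dest)
```

Whether this is enough to route a link-local reply out of a VRF slave is exactly what would need testing against the reproduction above.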
Copying the interface index into the sin6_scope_id of the destination or
setting IPV6_PKTINFO are both easy patches to make and try. The difficult
bit is being sure that they won't break existing installations.
Cheers,
Simon.
_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss