(Re-sending, as the first one had a .png attached which exceeded what sourceforge is willing to forward)
Hi, as promised, here's test results and code review. Test results: - running openvpn over TCP gives me a kernel panic - this is not so nice... (see attached .png from the vmware console) - userland seems to assume "kernel can do TCP", kernel panics on "if !udp, panic()" (so intentional panic, not corruption panic). This is on freebsd git FreeBSD 14.0-CURRENT #1 main-n257130-c0665d5c824 - running openvpn over UDP has issues with fragmentation - almost all t_client tests that *do* use DCO fail the "big ping" test run IPv4 ping tests (want_ok), 64 byte packets... run IPv4 ping tests (want_ok), 1440 byte packets... run IPv4 ping tests (want_ok), 3000 byte packets... FAIL: IPv4 ping test (3000 bytes) failed, but should succeed. run IPv6 ping tests (want_ok), 64 byte packets... run IPv6 ping tests (want_ok), 1440 byte packets... run IPv6 ping tests (want_ok), 3000 byte packets... FAIL: IPv6 ping test (3000 bytes) failed, but should succeed. this is "with no special mtu related options to OpenVPN", so the tun interface is MTU 1500, and the pings are created by fping -b 3000 -C 20 -p 250 -q -C 5 -p 400 10.194.5.1 10.194.0.1 fping6 -b 3000 -C 20 -p 250 -q -C 5 -p 400 fd00:abcd:194:5::1 fd00:abcd:194:0::1 (testing the host "on the p2p link" and "something behind a --route pushed by the server") I have not investigated exactly what happens at which point yet, but with Linux DCO, we had a similar effect where a fragmented *inside* packet ("ICMP") carried over a flag in the sk_buff that resulted in the kernel refusing to fragment the resulting *outside* packet. (Inside MTU 1500 -> fping -s 3000 -> 3 "inside" fragments, 1500+1500+68, adding openvpn overhead -> packet > 1500 -> external fragmentation needed for the first two) Inside tcpdump (tun0): 17:28:37.342050 IP (tos 0x0, ttl 64, id 50758, offset 0, flags [+], proto ICMP (1), length 1500) 10.194.5.222 > 10.194.5.1: ICMP echo request, id 2433, seq 0, length 1480 17:28:37.342142 IP (tos 0x0, ttl 64, id 50758, offset 1480, flags [+], proto ICMP (1), length 1500) 10.194.5.222 > 10.194.5.1: ip-proto-1 17:28:37.342270 IP (tos 0x0, ttl 64, id 50758, offset 2960, flags [none], proto ICMP (1), length 68) 10.194.5.222 > 10.194.5.1: ip-proto-1 Outside tcpdump (em0): 17:28:37.342103 IP (tos 0x0, ttl 64, id 50759, offset 0, flags [+], proto UDP (17), length 1500) 194.97.140.5.15738 > 199.102.77.82.51197: UDP, bad length 1528 > 1472 17:28:37.342123 IP (tos 0x0, ttl 64, id 50759, offset 1480, flags [none], proto UDP (17), length 76) 194.97.140.5 > 199.102.77.82: ip-proto-17 17:28:37.342185 IP (tos 0x0, ttl 64, id 50760, offset 0, flags [+], proto UDP (17), length 1500) 194.97.140.5.15738 > 199.102.77.82.51197: UDP, bad length 1528 > 1472 17:28:37.342228 IP (tos 0x0, ttl 64, id 50760, offset 1480, flags [none], proto UDP (17), length 76) 194.97.140.5 > 199.102.77.82: ip-proto-17 17:28:37.342300 IP (tos 0x0, ttl 64, id 50761, offset 0, flags [none], proto UDP (17), length 132) 194.97.140.5.15738 > 199.102.77.82.51197: UDP, length 104 ... this *does* look correct, but there is never a reply from the other end, so something is not right. Either inside or outside. This happens with IPv4 or IPv6 UDP outside, and IPv4 or IPv6 inside. - tcpdump'ing on the DCO interface gave me complains from the kernel about locking on ctrl-c'ing Aug 10 17:28:41 fbsd14 kernel: lock order bpf global lock -> iflib ctx lock attempted at: Aug 10 17:28:41 fbsd14 kernel: #0 0xffffffff80c5c3dd at witness_checkorder+0xbfd Aug 10 17:28:41 fbsd14 kernel: #1 0xffffffff80bf5303 at _sx_xlock+0x63 Aug 10 17:28:41 fbsd14 kernel: #2 0xffffffff80d3874f at iflib_if_ioctl+0x2df Aug 10 17:28:41 fbsd14 kernel: #3 0xffffffff80d19b5e at if_setflag+0xde Aug 10 17:28:41 fbsd14 kernel: #4 0xffffffff80d19a2a at ifpromisc+0x2a Aug 10 17:28:41 fbsd14 kernel: #5 0xffffffff80d0e72b at bpf_detachd_locked+0x27b Aug 10 17:28:41 fbsd14 kernel: #6 0xffffffff80d111f7 at bpf_dtor+0x87 Aug 10 17:28:41 fbsd14 kernel: #7 0xffffffff80a7818b at devfs_destroy_cdevpriv+0xab Aug 10 17:28:41 fbsd14 kernel: #8 0xffffffff80a7bda4 at devfs_close_f+0x64 Aug 10 17:28:41 fbsd14 kernel: #9 0xffffffff80b876eb at _fdrop+0x1b Aug 10 17:28:41 fbsd14 kernel: #10 0xffffffff80b8af3b at closef+0x1db Aug 10 17:28:41 fbsd14 kernel: #11 0xffffffff80b8ec97 at closefp_impl+0x77 Aug 10 17:28:41 fbsd14 kernel: #12 0xffffffff810c733e at amd64_syscall+0x12e Aug 10 17:28:41 fbsd14 kernel: #13 0xffffffff8109ae0b at fast_syscall_common+0xf8 ... so while this is outside "openvpn source code patches", it's still something that smells like it needs to be addressed. Now, coding style ;-) - as promised, I went through the code for things that need to be done in a certain way in OpenVPN land, due to agreed convention... inline (things I do not comment could go in "as is"). On Mon, Aug 08, 2022 at 04:34:23PM +0200, Kristof Provost via Openvpn-devel wrote: > diff --git a/src/openvpn/dco_freebsd.c b/src/openvpn/dco_freebsd.c > new file mode 100644 > index 00000000..a8a53fe3 > --- /dev/null > +++ b/src/openvpn/dco_freebsd.c > @@ -0,0 +1,636 @@ [..] > +static nvlist_t * > +sockaddr_to_nvlist(const struct sockaddr *sa) > +{ [..] > + default: > + abort(); please use "ASSERT(0);" here. It will do the same thing, but if ever hit, it will print a __FILE__, __LINE__ message to the log, so it's easier for us to see *where* it triggered a "this must never happen" condition. > +int > +dco_new_peer(dco_context_t *dco, unsigned int peerid, int sd, > + struct sockaddr *localaddr, struct sockaddr *remoteaddr, > + struct in_addr *remote_in4, struct in6_addr *remote_in6) > +{ > + struct ifdrv drv; > + nvlist_t *nvl; > + int ret; > + > + nvl = nvlist_create(0); We use C99 these days, so this could be written as > + nvlist_t *nvl = nvlist_create(0); but this is a matter of personal preference. Both are fine, just wanted to point out that the option exists. > + > + nvlist_add_number(nvl, "fd", sd); > + nvlist_add_number(nvl, "peerid", peerid); > + > + bzero(&drv, sizeof(drv)); > + snprintf(drv.ifd_name, IFNAMSIZ, "%s", dco->ifname); > + drv.ifd_cmd = OVPN_NEW_PEER; > + drv.ifd_data = nvlist_pack(nvl, &drv.ifd_len); > + > + ret = ioctl(dco->fd, SIOCSDRVSPEC, &drv); > + free(drv.ifd_data); What happens should nvlist_pack() fail here? Manpage says it returns NULL, will the ioctl() handle this gracefully, or will something crash? (If the ioctl() returns something like EINVAL in this case, perfectly fine) > + nvlist_destroy(nvl); > + if (ret) > + { > + msg(D_DCO, "Failed to create new peer %d", errno); > + return ret; > + } > + > + return 0; > +} Most functions in OpenVPN that do use a "ret" variable would look more like this: > + if (ret) > + { > + msg(D_DCO, "Failed to create new peer %d", errno); > + } > + > + return ret; > +} we do have all sorts of variants (*sigh*), but since you also used both versions, let's go for this one. > + msg(D_DCO, "Failed to create new peer %d", errno); this can be written as > + msg(D_DCO|M_ERRNO, "Failed to create new peer"); M_ERRNO will automatically append the strerror(errno) stuff. D_DCO is "--verb 3", so that message won't be visible in normal operation - if we consider this an error, then it should be > + msg(M_WARN|M_ERRNO, "Failed to create new peer"); instead (or if it's "fatal error, give up" -> M_ERR). Now, I'm not sure if accessing errno after the free() call is guaranteed to be save - so maybe move the msg() call up? > +bool > +ovpn_dco_init(int mode, dco_context_t *dco) > +{ > + if (open_fd(dco) < 0) > + { > + msg(D_DCO, "Failed to open socket"); Same here - if you want this to be heard, D_DCO should be M_WARN|M_ERRNO. If this is just informational, because the caller will make sufficient noise, D_DCO|M_ERRNO is good (but you might want to point out that this is the "DCO socket" that couldn't be opened). > +static int > +create_interface(struct tuntap *tt, const char *dev) > +{ > + int ret; > + struct ifreq ifr; > + > + bzero(&ifr, sizeof(ifr)); There is a convenience macro here, CLEAR(ifr) (which does exactly this - #define CLEAR(x) memset(&(x), 0, sizeof(x)). > + /* Create ovpnx first, then rename it. */ > + snprintf(ifr.ifr_name, IFNAMSIZ, "ovpn"); > + ret = ioctl(tt->dco.fd, SIOCIFCREATE2, &ifr); > + if (ret) > + { > + msg(D_DCO, "Failed to create interface %s: %d", ifr.ifr_name, errno); > + return ret; > + } -> M_ERRNO > + /* Rename */ > + if (!strcmp(dev, "tun")) > + { > + ifr.ifr_data = "ovpn"; > + } > + else > + { > + ifr.ifr_data = (char *)dev; > + } I'm not sure I understand this code part. When would dev be "tun" here, triggering a rename to "ovpn"? The call chain leading to this is (now) - tun.c:open_tun_dco_generic() - open_tun_dco() - create_interface(tt,dev) and tun_dco_generic() should never pass in a bare "tun" name (because in that case, it would iterate "tun0, tun1, tun2"). Is this a safeguard against "bad things will happen in kernel land if we use the unqualified name of another driver"? > + ret = ioctl(tt->dco.fd, SIOCSIFNAME, &ifr); > + if (ret) > + { > + /* Delete the created interface again. */ > + (void)ioctl(tt->dco.fd, SIOCIFDESTROY, &ifr); > + msg(D_DCO, "Failed to create interface %s: %d", ifr.ifr_data, errno); -> M_ERRNO (there's more, but no need to point them out individually :-) ) > +static int > +remove_interface(struct tuntap *tt) > +{ > + int ret; > + struct ifreq ifr; > + > + bzero(&ifr, sizeof(ifr)); > + snprintf(ifr.ifr_name, IFNAMSIZ, "%s", tt->dco.ifname); > + > + ret = ioctl(tt->dco.fd, SIOCIFDESTROY, &ifr); > + if (ret) > + { > + msg(D_DCO, "Failed to remove interface %s: %d", ifr.ifr_name, errno); > + return ret; > + } > + > + tt->dco.ifname[0] = 0; > + > + return 0; > +} Maybe always clear the ifname, and do "return ret;" here too? > +static int > +start_tun(dco_context_t *dco) > +{ > + struct ifdrv drv; > + int ret; > + > + bzero(&drv, sizeof(drv)); CLEAR(drv); > +int > +dco_do_read(dco_context_t *dco) > +{ > + struct ifdrv drv; > + uint8_t buf[4096]; > + nvlist_t *nvl; > + const uint8_t *pkt; > + size_t pktlen; > + int ret; > + > + /* Flush any pending data from the pipe. */ > + (void)read(dco->pipefd[1], buf, sizeof(buf)); > + > + bzero(&drv, sizeof(drv)); > + snprintf(drv.ifd_name, IFNAMSIZ, "%s", dco->ifname); > + drv.ifd_cmd = OVPN_GET_PKT; > + drv.ifd_data = buf; > + drv.ifd_len = sizeof(buf); > + > + ret = ioctl(dco->fd, SIOCGDRVSPEC, &drv); > + if (ret) > + { > + msg(D_DCO, "Failed to read control packet %d", errno); > + return errno; Documentation (dco.h) says "0 on success or a negative error code otherwise", so this needs to be "return -errno;" to behave the same as dco_linux > + } > + > + nvl = nvlist_unpack(buf, drv.ifd_len, 0); > + if (!nvl) > + { > + msg(D_DCO, "Failed to unpack nvlist"); > + return EINVAL; -EINVAL > +int > +dco_do_write(dco_context_t *dco, int peer_id, struct buffer *buf) > +{ > + struct ifdrv drv; > + nvlist_t *nvl; > + int ret; > + > + nvl = nvlist_create(0); > + > + nvlist_add_binary(nvl, "packet", BSTR(buf), BLEN(buf)); > + nvlist_add_number(nvl, "peerid", peer_id); > + > + bzero(&drv, sizeof(drv)); > + snprintf(drv.ifd_name, IFNAMSIZ, "%s", dco->ifname); > + drv.ifd_cmd = OVPN_SEND_PKT; > + drv.ifd_data = nvlist_pack(nvl, &drv.ifd_len); > + > + ret = ioctl(dco->fd, SIOCSDRVSPEC, &drv); > + free(drv.ifd_data); > + nvlist_destroy(nvl); > + if (ret) > + { > + msg(D_DCO, "Failed to send control packet %d", errno); > + return ret; > + } > + > + return BLEN(buf); > +} The return code of dco_do_write() is not clearly defined in dco.h (Antonio, looking at you...!) but the linux version seems to do "negative is error, positive is buffer length written", so this should be fine. Might need a change to "return -errno". > +bool > +dco_available(int msglevel) > +{ > + struct if_clonereq ifcr; > + char *buf = NULL; > + int fd; > + int ret; > + bool available = false; > + > + /* Attempt to load the module. Ignore errors, because it might already be > + * loaded, or built into the kernel. */ > + (void)kldload("if_ovpn"); > + > + fd = socket(AF_INET, SOCK_DGRAM | SOCK_CLOEXEC, 0); > + if (fd < 0) > + { > + return false; > + } > + > + bzero(&ifcr, sizeof(ifcr)); CLEAR(ifcr) > + ret = ioctl(fd, SIOCIFGCLONERS, &ifcr); > + if (ret != 0) > + { > + goto out; > + } > + > + buf = malloc(ifcr.ifcr_total * IFNAMSIZ); > + > + ifcr.ifcr_count = ifcr.ifcr_total; > + ifcr.ifcr_buffer = buf; > + ret = ioctl(fd, SIOCIFGCLONERS, &ifcr); > + if (ret != 0) > + { > + goto out; > + } > + > + for (int i = 0; i < ifcr.ifcr_total; i++) > + { > + if (strcmp(buf + (i * IFNAMSIZ), "openvpn") == 0) This looks fairly magic to me - am I reading this right that the first call returns a number of "somethings", then we malloc(something * IFNAMSIZ), and the second call returns a list of strings that describe "something", and if there is one of them named "openvpn", we know that DCO is available? Why is it called "openvpn", not "ovpn" (if_ovpn, man ovpn)? (A few lines of comment on what SIOCIFGCLONERS does - or which manpage to look at - would be appreciated here :-) ) > +const char * > +dco_get_supported_ciphers() > +{ > + return "none:AES-256-GCM:AES-128-GCM:CHACHA20-POLY1305"; > +} Is "none" indeed supported? I find that useful to test, but Arne and Antonio refuse(d) to support it in linux DCO :-) > diff --git a/src/openvpn/forward.c b/src/openvpn/forward.c > index 55c939c4..14ad24fa 100644 > --- a/src/openvpn/forward.c > +++ b/src/openvpn/forward.c ... these are all very straightforward "LINUX" -> "LINUX || FREEBSD" :-) > diff --git a/src/openvpn/tun.c b/src/openvpn/tun.c > index 54f7d72c..e84d5a27 100644 > --- a/src/openvpn/tun.c > +++ b/src/openvpn/tun.c [..] > @@ -2917,20 +2917,24 @@ void > open_tun(const char *dev, const char *dev_type, const char *dev_node, struct > tuntap *tt, > openvpn_net_ctx_t *ctx) > { > - open_tun_generic(dev, dev_type, dev_node, tt); > - > - if (tt->fd >= 0 && tt->type == DEV_TYPE_TUN) > - { > - int i = IFF_POINTOPOINT | IFF_MULTICAST; > + if (tun_dco_enabled(tt)) { > + open_tun_dco_generic(dev, dev_type, tt, ctx); > + } else { > + open_tun_generic(dev, dev_type, dev_node, tt); This one will upset the whitespace dragon and it will eat all of the patch :-) OpenVPN bracket convention requires this to be: + if (tun_dco_enabled(tt)) + { + open_tun_dco_generic(dev, dev_type, tt, ctx); + } + else + { + open_tun_generic(dev, dev_type, dev_node, tt); (I run uncrustify on merge, so this would get fixed - but it would be good to have it properly formatted right away) > > - if (ioctl(tt->fd, TUNSIFMODE, &i) < 0) > + if (tt->fd >= 0 && tt->type == DEV_TYPE_TUN) > { > - msg(M_WARN | M_ERRNO, "ioctl(TUNSIFMODE)"); > - } > - i = 1; > - if (ioctl(tt->fd, TUNSIFHEAD, &i) < 0) > - { ... this looks like a massive diff, but actually it's just an indenting change that git diff does not present nicely (checked with "git show -w"). That's it for today :-) gert -- "If was one thing all people took for granted, was conviction that if you feed honest figures into a computer, honest figures come out. Never doubted it myself till I met a computer with a sense of humor." Robert A. Heinlein, The Moon is a Harsh Mistress Gert Doering - Munich, Germany g...@greenie.muc.de
signature.asc
Description: PGP signature
_______________________________________________ Openvpn-devel mailing list Openvpn-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/openvpn-devel