uvm_fault when setting ddb breakpoint on armv7 -current
Hello,

I am investigating a USB issue on my imx6-based Novena, and I tried to set a breakpoint to inspect the backtrace when the issue occurs. The problem is that, when resuming execution out of ddb, I get a uvm_fault, and then the only way forward is to reboot the system.

Am I missing a step, or is this a bug?

-> Kernel config

$ diff -Naur /usr/src/sys/arch/armv7/conf/{GENERIC,USBDEBUG}
--- /usr/src/sys/arch/armv7/conf/GENERIC	Mon Dec 14 09:19:10 2020
+++ /usr/src/sys/arch/armv7/conf/USBDEBUG	Sun Dec 13 17:26:09 2020
@@ -26,6 +26,11 @@
 option USBVERBOSE
 option USER_PCICONF	# user-space PCI configuration
 
+option USB_DEBUG
+option UHUB_DEBUG
+option UMASS_DEBUG
+option EHCI_DEBUG
+
 config bsd swap generic
 
 # The main bus device

-> steps to reproduce over serial console

$ doas sysctl ddb.trigger=1
Stopped at db_enter: ldrb r15, [r15, r15, ror r15]!
ddb> break umass_bbb_reset
ddb> c
uvm_fault(0xc08d1260, c0659000, 2, 0) -> e
Fatal kernel mode data abort: 'Permission fault (L1)'
trapframe: 0xd0ccfcf8
DFSR=080d, DFAR=c06595b8, spsr=2013
r0 =00ff, r1 =c06595b8, r2 =, r3 =0002
r4 =c08e5164, r5 =c06595b8, r6 =d0ccfd91, r7 =0003
r8 =c083ee30, r9 =0004, r10=c06595b8, r11=d0ccfd88
r12=000f, ssp=d0ccfd48, slr=1060, pc =c04d69c0
Stopped at db_write_bytes+0x3ac: strb r0, [r5], #0x001
ddb> trace
db_write_bytes+0x3ac rlv=0xc03973fc rfp=0xd0ccfda0
db_put_value+0x50 rlv=0xc0669cc0 rfp=0xd0ccfdb0
db_set_breakpoints+0x54 rlv=0xc072e670 rfp=0xd0ccfdd8
db_restart_at_pc+0x178 rlv=0xc06731c4 rfp=0xd0ccfe00
db_trap+0x14c rlv=0xc04d6b18 rfp=0xd0ccfe20
db_trapper+0x88 rlv=0xc06f4734 rfp=0xd0ccfe50
undefinedinstruction+0x114 rlv=0xc05b5a68 rfp=0xd0ccfed8
$a.13 rlv=0xc04b1a18 rfp=0xd0ccff40
sys_sysctl+0x17c rlv=0xc0427620 rfp=0xd0ccffa8
swi_handler+0x2e0 rlv=0xc05b5898 rfp=0xbffe1460

-> dmesg

Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
Copyright (c) 1995-2020 OpenBSD. All rights reserved.
https://www.OpenBSD.org

OpenBSD 6.8-current (USBDEBUG) #2: Mon Dec 14 10:18:02 CET 2020
dermi...@derrida.kilob.yt:/usr/src/sys/arch/armv7/compile/USBDEBUG
real mem = 3933511680 (3751MB)
avail mem = 3847270400 (3669MB)
random: good seed from bootblocks
mainbus0 at root: Kosagi Novena Dual/Quad
cpu0 at mainbus0 mpidr 0: ARM Cortex-A9 r2p10
cpu0: 32KB 32b/line 4-way L1 VIPT I-cache, 32KB 32b/line 4-way L1 D-cache
cortex0 at mainbus0
amptimer0 at cortex0: tick rate 396000 KHz
armliicc0 at cortex0: rtl 7 waymask: 0x000f
ampintc0 at mainbus0 nirq 160, ncpu 4: "interrupt-controller"
simplebus0 at mainbus0: "soc"
"dma-apbh" at simplebus0 not configured
"gpu" at simplebus0 not configured
"gpu" at simplebus0 not configured
"hdmi" at simplebus0 not configured
"timer" at simplebus0 not configured
"l2-cache" at simplebus0 not configured
"pcie" at simplebus0 not configured
"pmu" at simplebus0 not configured
simplebus1 at simplebus0: "aips-bus"
imxccm0 at simplebus1
imxanatop0 at simplebus1
syscon0 at simplebus1: "snvs"
imxrtc0 at syscon0
imxsrc0 at simplebus1
syscon1 at simplebus1: "iomuxc-gpr"
imxiomuxc0 at simplebus1
simplebus2 at simplebus1: "spba-bus"
"ssi" at simplebus2 not configured
"asrc" at simplebus2 not configured
"vpu" at simplebus1 not configured
"pwm" at simplebus1 not configured
"gpt" at simplebus1 not configured
imxgpio0 at simplebus1
imxgpio1 at simplebus1
imxgpio2 at simplebus1
imxgpio3 at simplebus1
imxgpio4 at simplebus1
imxgpio5 at simplebus1
imxgpio6 at simplebus1
"kpp" at simplebus1 not configured
imxdog0 at simplebus1
imxtemp0 at simplebus1
"usbphy" at simplebus1 not configured
"usbphy" at simplebus1 not configured
imxgpc0 at simplebus1
"ldb" at simplebus1 not configured
"sdma" at simplebus1 not configured
simplebus3 at simplebus0: "aips-bus"
syscon2 at simplebus3: "ocotp"
"caam" at simplebus3 not configured
imxehci0 at simplebus3
usb0 at imxehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "i.MX EHCI root hub" rev 2.00/1.00 addr 1
uhub0: 1 port with 1 removable, self powered
imxehci1 at simplebus3
usb1 at imxehci1: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "i.MX EHCI root hub" rev 2.00/1.00 addr 1
uhub1: 1 port with 1 removable, self powered
"usbmisc" at simplebus3 not configured
fec0 at simplebus3
fec0: address 00:1f:11:02:17:de
ukphy0 at fec0 phy 7: Generic IEEE 802.3u media interface, rev. 1: OUI 0x000885, model 0x0021
imxesdhc0 at simplebus3
imxesdhc0: 198 MHz base clock
sdmmc0 at imxesdhc0: 4-bit, sd high-speed, mmc high-speed, dma
imxesdhc1 at simplebus3
imxesdhc1: 198 MHz base clock
sdmmc1 at imxesdhc1: 4-bit, sd high-speed, mmc high-speed, dma
imxiic0 at simplebus3
iic0 at imxiic0
"sbs,sbs-battery" at iic0 addr 0xb not configured
"kosagi,senoko" at iic0 addr 0x20 not configured
"st,stmp
Re: IP_SENDSRCADDR cmsg_len and dnsmasq
On Thu, 12 Jul 2018 19:54:26 +0200
Alexander Bluhm wrote:

> If it is a temporary problem, that will go away when the content
> of the socket buffer is sent away, we should block or return
> EWOULDBLOCK. For a permanent problem return EMSGSIZE. Non atomic
> operations can be split in smaller chunks, so there are no permanent
> problems. Control messages are considered atomic. AF_UNIX needs
> special treatment as file descriptor passing is difficult. On top
> of that consider integer overflow.
>
> revision 1.100
> date: 2012/04/24 16:35:08; author: deraadt; state: Exp; lines: +13 -3;
> In sosend() for AF_UNIX control message sending, correctly
> calculate the size (internalized ones can be larger on some
> architectures) for fitting into the socket. Avoid getting confused
> by sb_hiwat as well. This fixes a variety of issues where sendmsg()
> would fail to deliver a fd set or fail to wait; even leading to file
> leakage. Worked on this with claudio for about a week...
>
> revision 1.145
> date: 2016/01/06 10:06:50; author: stefan; state: Exp; lines: +27 -30;
> Prevent integer overflows in sosend() and soreceive() by
> converting min()+uiomovei() to ulmin()+uiomove() and re-arranging
> space computations in sosend(). The soreceive() part was also
> reported by Martin Natano. ok bluhm@ and also discussed with tedu@
>
> So first of all we should split the AF_UNIX cases to keep it readable.
> And I don't want to change the AF_UNIX code as the commit message
> indicates that it was hard to develop the current solution.
>
> From the bug reports it seems that we should check that the UDP
> packets and the IP_SENDSRCADDR fit into the socket buffer. If not
> it is a permanent EMSGSIZE error. So make sure that resid + clen
> <= so->so_snd.sb_hiwat. Then write it the other way around to
> prevent signed integer overflow.
>
> The result of these considerations is the diff below. I have not
> tested it. Does the original bug go away with it? Some hackathons
> ago jeremy@ mentioned that the ruby test suite found a bug in this
> area. So maybe we should try it.
>
> Does this make sense?
>
> bluhm
>
> Index: kern/uipc_socket.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket.c,v
> retrieving revision 1.225
> diff -u -p -r1.225 uipc_socket.c
> --- kern/uipc_socket.c 5 Jul 2018 14:45:07 - 1.225
> +++ kern/uipc_socket.c 12 Jul 2018 17:24:28 -
> @@ -462,10 +462,14 @@ restart:
>  		space = sbspace(so, &so->so_snd);
>  		if (flags & MSG_OOB)
>  			space += 1024;
> -		if ((atomic && resid > so->so_snd.sb_hiwat) ||
> -		    (so->so_proto->pr_domain->dom_family != AF_UNIX &&
> -		    clen > so->so_snd.sb_hiwat))
> -			snderr(EMSGSIZE);
> +		if (so->so_proto->pr_domain->dom_family == AF_UNIX) {
> +			if (atomic && resid > so->so_snd.sb_hiwat)
> +				snderr(EMSGSIZE);
> +		} else {
> +			if (clen > so->so_snd.sb_hiwat ||
> +			    (atomic && resid > so->so_snd.sb_hiwat - clen))
> +				snderr(EMSGSIZE);
> +		}
>  		if (space < clen ||
>  		    (space - clen < resid &&
>  		    (atomic || space < so->so_snd.sb_lowat))) {

It is indeed much easier to parse. Kudos on spotting the potential overflow.

Ok vgross@

I have a regression test for this, based on Alexander Markert's code plus rework by mpi@. Do you want me to commit it right now?
IP_SENDSRCADDR cmsg_len and dnsmasq
So a while back Alexander Markert sent a bug report regarding sendmsg() behaviour with IP_SENDSRCADDR:

https://marc.info/?l=openbsd-tech&m=149276833923905&w=2

This impacts our dnsmasq port:

https://marc.info/?l=openbsd-tech&m=149234052220818&w=2

Alexander Markert shows in the first thread the problematic code and conditions. To save you the trip back in time: sendfrom() returns EWOULDBLOCK (or blocks if using blocking IO) when len(cmsg) + len(data) > len(socket.buffer). The better behaviour would be to never block and return EMSGSIZE.

The first diff fixes the kernel code; the second diff reverts https://marc.info/?l=openbsd-ports-cvs&m=149233921320572&w=2 and fixes a bad cmsg setup.

1) Can you confirm this fixes dnsmasq, or whatever you used to trigger the bug?

2) Ok ?

(apologies for the delay by the way :S )

-- 8< 8< 8< --
Index: sys/kern/uipc_socket.c
===
RCS file: /cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.224
diff -u -p -r1.224 uipc_socket.c
--- sys/kern/uipc_socket.c 14 Jun 2018 08:46:09 - 1.224
+++ sys/kern/uipc_socket.c 26 Jun 2018 18:51:49 -
@@ -459,9 +459,10 @@ restart:
 		space = sbspace(so, &so->so_snd);
 		if (flags & MSG_OOB)
 			space += 1024;
-		if ((atomic && resid > so->so_snd.sb_hiwat) ||
+		if ((so->so_proto->pr_domain->dom_family == AF_UNIX &&
+		    atomic && resid > so->so_snd.sb_hiwat) ||
 		    (so->so_proto->pr_domain->dom_family != AF_UNIX &&
-		    clen > so->so_snd.sb_hiwat))
+		    clen + (atomic ? resid : 0) > so->so_snd.sb_hiwat))
 			snderr(EMSGSIZE);
 		if (space < clen ||
 		    (space - clen < resid &&
-- 8< 8< 8< --
Index: net/dnsmasq/patches/patch-src_dnsmasq_c
===
RCS file: net/dnsmasq/patches/patch-src_dnsmasq_c
diff -N net/dnsmasq/patches/patch-src_dnsmasq_c
--- net/dnsmasq/patches/patch-src_dnsmasq_c 29 Mar 2018 19:42:51 - 1.5
+++ /dev/null 1 Jan 1970 00:00:00 -
@@ -1,16 +0,0 @@
-$OpenBSD: patch-src_dnsmasq_c,v 1.5 2018/03/29 19:42:51 ajacoutot Exp $
-
-Fails. Currently disabled pending investigation.
- -Index: src/dnsmasq.c src/dnsmasq.c.orig -+++ src/dnsmasq.c -@@ -149,7 +149,7 @@ int main (int argc, char **argv) - open("/dev/null", O_RDWR); - - #ifndef HAVE_LINUX_NETWORK --# if !(defined(IP_RECVDSTADDR) && defined(IP_RECVIF) && defined(IP_SENDSRCADDR)) -+# if defined(__OpenBSD__) || !(defined(IP_RECVDSTADDR) && defined(IP_RECVIF) && defined(IP_SENDSRCADDR)) - if (!option_bool(OPT_NOWILD)) - { - bind_fallback = 1; Index: net/dnsmasq/patches/patch-src_forward_c === RCS file: /cvs/ports/net/dnsmasq/patches/patch-src_forward_c,v retrieving revision 1.1 diff -u -p -r1.1 patch-src_forward_c --- net/dnsmasq/patches/patch-src_forward_c 16 Apr 2017 10:40:07 - 1.1 +++ net/dnsmasq/patches/patch-src_forward_c 26 Jun 2018 18:48:17 - @@ -1,24 +1,17 @@ -$OpenBSD: patch-src_forward_c,v 1.1 2017/04/16 10:40:07 sthen Exp $ +$OpenBSD$ -Fails. Currently disabled pending investigation. +CMSG_SPACE() != CMSG_LEN() src/forward.c.orig Sat Apr 15 22:36:04 2017 -+++ src/forward.c Sat Apr 15 22:46:09 2017 -@@ -35,7 +35,7 @@ int send_from(int fd, int nowild, char *packet, size_t - struct cmsghdr align; /* this ensures alignment */ - #if defined(HAVE_LINUX_NETWORK) - char control[CMSG_SPACE(sizeof(struct in_pktinfo))]; --#elif defined(IP_SENDSRCADDR) -+#elif !defined(__OpenBSD__) && defined(IP_SENDSRCADDR) - char control[CMSG_SPACE(sizeof(struct in_addr))]; - #endif - #ifdef HAVE_IPV6 -@@ -71,7 +71,7 @@ int send_from(int fd, int nowild, char *packet, size_t - msg.msg_controllen = cmptr->cmsg_len = CMSG_LEN(sizeof(struct in_pktinfo)); - cmptr->cmsg_level = IPPROTO_IP; +Index: src/forward.c +--- src/forward.c.orig src/forward.c +@@ -73,7 +73,8 @@ int send_from(int fd, int nowild, char *packet, size_t cmptr->cmsg_type = IP_PKTINFO; --#elif defined(IP_SENDSRCADDR) -+#elif !defined(__OpenBSD__) && defined(IP_SENDSRCADDR) + #elif defined(IP_SENDSRCADDR) memcpy(CMSG_DATA(cmptr), &(source->addr.addr4), sizeof(source->addr.addr4)); - msg.msg_controllen = cmptr->cmsg_len = CMSG_LEN(sizeof(struct 
in_addr)); +-msg.msg_controllen = cmptr->cmsg_len = CMSG_LEN(sizeof(struct in_addr)); ++msg.msg_controllen = sizeof(control_u.control); ++cmptr->cmsg_len = CMSG_LEN(sizeof(struct in_addr)); cmptr->cmsg_level = IPPROTO_IP; + cmptr->cmsg_type = IP_SENDSRCADDR; + #endif
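
The "CMSG_SPACE() != CMSG_LEN()" remark in the port patch can be illustrated with a small standalone sketch. The names union src_control and set_src_cmsg() are hypothetical and do not come from dnsmasq; the control message type is passed in as a parameter (it would be IP_SENDSRCADDR on OpenBSD) so the sketch also builds on systems that lack that option. The caller is assumed to have zeroed the msghdr.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Control buffer for one struct in_addr payload; the union forces alignment. */
union src_control {
	struct cmsghdr	align;
	char		buf[CMSG_SPACE(sizeof(struct in_addr))];
};

/*
 * Hypothetical helper: attach a single IPPROTO_IP control message that
 * carries a source address to an already zeroed msghdr.
 */
void
set_src_cmsg(struct msghdr *msg, union src_control *ctl, int cmsg_type,
    const struct in_addr *src)
{
	struct cmsghdr *cm;

	memset(ctl, 0, sizeof(*ctl));
	msg->msg_control = ctl->buf;
	/* Size of the whole buffer, trailing padding included: CMSG_SPACE(). */
	msg->msg_controllen = sizeof(ctl->buf);

	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = cmsg_type;
	/* Header plus payload, without trailing padding: CMSG_LEN(). */
	cm->cmsg_len = CMSG_LEN(sizeof(*src));
	memcpy(CMSG_DATA(cm), src, sizeof(*src));
}
```

The two macros can differ whenever the payload size is not a multiple of the platform's alignment, which is why assigning CMSG_LEN() to both msg_controllen and cmsg_len, as the old dnsmasq code did, is fragile.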
Re: 11n support for athn(4)
On Tue, 10 Jan 2017 00:27:47 +0100
Stefan Sperling wrote:

> On Mon, Jan 09, 2017 at 01:54:55PM +0100, Stefan Sperling wrote:
> > This diff adds 11n support to the athn(4) driver.
> > Requires -current net80211 code from today.
>
> A better diff which fixes several bugs.
>
> Most notably this should fix a crash in hostap mode triggered by
> clients joining and leaving in a loop. This is fixed by making sure
> timeout handlers managed by mira aren't overwritten when a client
> rejoins, and by cancelling these timeouts properly. I'd like to
> rename some mira API functions for better clarity but that's left for
> later.
>
> This also restores USB device firmware rate scaling in client mode
> which was disabled by commits I made in 2015. I found a missing
> 'usc->nnodes--;' in the code from before those commits, and I hope
> adding that is a proper fix for the problems we were hunting back
> then.
>
> Known issues (not blocking issues IMO):
>
> - The athn(4) driver selects low transmit rates relative to what
> iwn(4) and iwm(4) clients select.
>
> - USB client in 11n mode only sends with legacy rates (up to
> 54Mbit/s). Technically this is legal behaviour, and receiving MCS
> sent by the AP works. Rate selection is done in firmware so this
> isn't straightforward to debug.
>

I just rebuilt a bsd.mp with your diff on my home router. So far so good: it works with the iwn(4) card in my -current laptop and with the BCM4339 in my Android phone.

I REALLY wonder where I can find lots of heavy content on the internet to stress this diff ... or not.

[snip]
sys/net/if.c, leftovers from r1.442
up is never set in ifioctl().

Ok ?

Index: net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.463
diff -u -p -r1.463 if.c
--- net/if.c 28 Nov 2016 11:18:02 - 1.463
+++ net/if.c 1 Dec 2016 20:31:27 -
@@ -1688,7 +1688,6 @@ ifioctl(struct socket *so, u_long cmd, c
 	size_t bytesdone;
 	short oif_flags;
 	const char *label;
-	short up = 0;
 
 	switch (cmd) {
 
@@ -2046,12 +2045,6 @@ ifioctl(struct socket *so, u_long cmd, c
 	if (((oif_flags ^ ifp->if_flags) & IFF_UP) != 0)
 		microtime(&ifp->if_lastchange);
 
-	/* If we took down the IF, bring it back */
-	if (up) {
-		s = splnet();
-		if_up(ifp);
-		splx(s);
-	}
 	return (error);
 }
vxlan bug wrt IN6_ANY as source Was: Re: tweak in6_selectsrc()
On Tue, 29 Nov 2016 17:03:44 +0100
Martin Pieuchot wrote:

> Diff below removes the 'struct route_in6' argument from
> in6_selectsrc().
>
> It is only used by in6_pcbselsrc() so move the code there. This
> reduces differences with IPv4 and helps me to get rid of 'struct
> route*'.
>
> ok?

Reads ok, not tested yet.

Your diff is interesting in that it helped me to find a bug in vxlan(4).

Build a tunnel like this:

$ doas ifconfig pair11 rdomain 11
$ doas ifconfig pair12 rdomain 12 patch pair11
$ doas ifconfig pair11 inet6 fd03::11/64 up
$ doas ifconfig pair12 inet6 fd03::12/64 up
$ doas ifconfig vxlan11 rdomain 11 tunneldomain 11 vnetid 10
$ doas ifconfig vxlan12 rdomain 12 tunneldomain 12 vnetid 10
$ doas ifconfig vxlan11 inet6 fd06::11/64 tunnel :: fd03::12 up
$ doas ifconfig vxlan12 inet6 fd06::12/64 tunnel :: fd03::11 up

Watch ping6 fail:

$ ping6 -V 11 fd06::12

Tweak the vxlans and see pings flow:

$ doas ifconfig vxlan11 tunnel fd03::11 fd03::12
$ doas ifconfig vxlan12 tunnel fd03::12 fd03::11
$ ping6 -V 11 fd06::11

I think we should not allow empty source addresses at all, as they make troubleshooting confusing.

goda@ yasuoka@ reyk@ : what is your take on this ?

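
To make the suggestion concrete, rejecting an unspecified tunnel source could look roughly like the sketch below. tunnel_src_check() is a hypothetical userland helper, not code from if_vxlan.c, and the choice of error codes is an assumption; only IN6_IS_ADDR_UNSPECIFIED() is the standard macro already used in the quoted diff.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical validation helper: refuse a tunnel endpoint whose IPv6
 * source address is the unspecified address (::).
 * Returns 0 when the address is acceptable, an errno value otherwise.
 */
int
tunnel_src_check(const struct sockaddr_in6 *src)
{
	if (src == NULL || src->sin6_family != AF_INET6)
		return EAFNOSUPPORT;
	/* "tunnel :: fd03::12" would be rejected here. */
	if (IN6_IS_ADDR_UNSPECIFIED(&src->sin6_addr))
		return EINVAL;
	return 0;
}
```

With such a check at configuration time, the failing "tunnel :: fd03::12" setup would be refused up front instead of silently producing a tunnel that cannot pass traffic.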
> > Index: net/if_vxlan.c > === > RCS file: /cvs/src/sys/net/if_vxlan.c,v > retrieving revision 1.52 > diff -u -p -r1.52 if_vxlan.c > --- net/if_vxlan.c29 Nov 2016 10:09:57 - 1.52 > +++ net/if_vxlan.c29 Nov 2016 15:52:41 - > @@ -768,7 +768,7 @@ vxlan_encap6(struct ifnet *ifp, struct m > ip6->ip6_hlim = ip6_defhlim; > > if (IN6_IS_ADDR_UNSPECIFIED(&satosin6(src)->sin6_addr)) { > - error = in6_selectsrc(&in6a, satosin6(dst), NULL, > NULL, > + error = in6_selectsrc(&in6a, satosin6(dst), NULL, > sc->sc_rdomain); > if (error != 0) { > m_freem(m); > Index: netinet6/in6_src.c > === > RCS file: /cvs/src/sys/netinet6/in6_src.c,v > retrieving revision 1.80 > diff -u -p -r1.80 in6_src.c > --- netinet6/in6_src.c2 Sep 2016 13:53:44 - 1.80 > +++ netinet6/in6_src.c29 Nov 2016 15:56:56 - > @@ -99,7 +99,6 @@ in6_pcbselsrc(struct in6_addr **in6src, > struct route_in6 *ro = &inp->inp_route6; > struct in6_addr *laddr = &inp->inp_laddr6; > u_int rtableid = inp->inp_rtableid; > - > struct ifnet *ifp = NULL; > struct in6_addr *dst; > struct in6_ifaddr *ia6 = NULL; > @@ -172,7 +171,55 @@ in6_pcbselsrc(struct in6_addr **in6src, > return (0); > } > > - return in6_selectsrc(in6src, dstsock, mopts, ro, rtableid); > + error = in6_selectsrc(in6src, dstsock, mopts, rtableid); > + if (error != EADDRNOTAVAIL) > + return (error); > + > + /* > + * If route is known or can be allocated now, > + * our src addr is taken from the i/f, else punt. 
> + */ > + if (!rtisvalid(ro->ro_rt) || (ro->ro_tableid != rtableid) || > + !IN6_ARE_ADDR_EQUAL(&ro->ro_dst.sin6_addr, dst)) { > + rtfree(ro->ro_rt); > + ro->ro_rt = NULL; > + } > + if (ro->ro_rt == NULL) { > + struct sockaddr_in6 *sa6; > + > + /* No route yet, so try to acquire one */ > + bzero(&ro->ro_dst, sizeof(struct sockaddr_in6)); > + ro->ro_tableid = rtableid; > + sa6 = &ro->ro_dst; > + sa6->sin6_family = AF_INET6; > + sa6->sin6_len = sizeof(struct sockaddr_in6); > + sa6->sin6_addr = *dst; > + sa6->sin6_scope_id = dstsock->sin6_scope_id; > + ro->ro_rt = rtalloc(sin6tosa(&ro->ro_dst), > + RT_RESOLVE, ro->ro_tableid); > + } > + > + /* > + * in_pcbconnect() checks out IFF_LOOPBACK to skip using > + * the address. But we don't know why it does so. > + * It is necessary to ensure the scope even for lo0 > + * so doesn't check out IFF_LOOPBACK. > + */ > + > + if (ro->ro_rt) { > + ifp = if_get(ro->ro_rt->rt_ifidx); > + if (ifp != NULL) { > + ia6 = in6_ifawithscope(ifp, dst, rtableid); > + if_put(ifp); > + } > + if (ia6 == NULL) /* xxx scope error ?*/ > + ia6 = ifatoia6(ro->ro_rt->rt_ifa); > + } > + if (ia6 == NULL) > + return (EHOSTUNREACH); /* no route */ > + > + *in6src = &ia6->ia_addr.sin6_addr; > + return (0); > } > > /* > @@ -183,7 +230,7 @@ in6_pcbselsrc(struct in6_addr **in6src, > */ > int > in6_selectsrc(struct in6_addr **in6src, struct sockaddr_in6 *dstsock, > -struct ip6_moptions *mopts, struct route_in6 *ro, u_int rtableid) > +struct ip6_moptions *mopts, unsigned int rtableid) > { > struct ifnet *ifp = NULL; > struct in6_addr *dst; > @@ -239,54 +286,6 @@ i
Re: add in6 multicast support to vxlan(4), take 4
On Tue, 29 Nov 2016 15:13:16 +0100 Alexander Bluhm wrote: > On Sat, Nov 05, 2016 at 12:41:39PM +0100, Vincent Gross wrote: > > Updated diff, I reworked the logic to handle the if_get/if_put > > dance in vxlan_multicast_join(), and fixed an uninitialized > > variable. > > > > Ok ? > > Some nits inline. [snip] About sleeping on malloc : better to err on the safe side with M_NOWAIT. About resolving the route : you are right, the cloning route is enough to get the interface index. New diff with nits fixed : Index: sys/net/if_vxlan.c === RCS file: /cvs/src/sys/net/if_vxlan.c,v retrieving revision 1.52 diff -u -p -r1.52 if_vxlan.c --- sys/net/if_vxlan.c 29 Nov 2016 10:09:57 - 1.52 +++ sys/net/if_vxlan.c 30 Nov 2016 22:23:01 - @@ -47,6 +47,10 @@ #include #include +#ifdef INET6 +#include +#endif /* INET6 */ + #if NPF > 0 #include #endif @@ -61,7 +65,14 @@ struct vxlan_softc { struct arpcomsc_ac; struct ifmedia sc_media; - struct ip_moptions sc_imo; + union { + struct ip_moptions u_imo; +#ifdef INET6 + struct ip6_moptions u_im6o; +#endif /* INET6 */ + } sc_imu; +#define sc_imo sc_imu.u_imo +#define sc_im6osc_imu.u_im6o void*sc_ahcookie; void*sc_lhcookie; void*sc_dhcookie; @@ -129,10 +140,6 @@ vxlan_clone_create(struct if_clone *ifc, M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL) return (ENOMEM); - sc->sc_imo.imo_membership = malloc( - (sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS), M_IPMOPTS, - M_WAITOK|M_ZERO); - sc->sc_imo.imo_max_memberships = IP_MIN_MEMBERSHIPS; sc->sc_dstport = htons(VXLAN_PORT); sc->sc_vnetid = VXLAN_VNI_UNSET; @@ -190,7 +197,6 @@ vxlan_clone_destroy(struct ifnet *ifp) ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY); ether_ifdetach(ifp); if_detach(ifp); - free(sc->sc_imo.imo_membership, M_IPMOPTS, 0); free(sc, M_DEVBUF, sizeof(*sc)); return (0); @@ -199,11 +205,35 @@ vxlan_clone_destroy(struct ifnet *ifp) void vxlan_multicast_cleanup(struct ifnet *ifp) { - struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; - struct ip_moptions *imo = 
&sc->sc_imo; - struct ifnet*mifp; + struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; + struct ip_moptions *imo; + struct in_multi **imm; + struct ip6_moptions *im6o; + struct in6_multi_mship *im6m, *im6m_next; + struct ifnet *mifp = NULL; + + switch (sc->sc_dst.ss_family) { + case AF_INET: + imo = &sc->sc_imo; + mifp = if_get(imo->imo_ifidx); + imm = imo->imo_membership; + while (imo->imo_num_memberships > 0) + in_delmulti(imm[--imo->imo_num_memberships]); + free(imm, M_IPMOPTS, + sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS); + break; +#ifdef INET6 + case AF_INET6: + im6o = &sc->sc_im6o; + mifp = if_get(im6o->im6o_ifidx); + LIST_FOREACH_SAFE(im6m, &im6o->im6o_memberships, i6mm_chain, + im6m_next) + in6_leavegroup(im6m); + break; +#endif /* INET6 */ + } + bzero(&sc->sc_imu, sizeof(sc->sc_imu)); - mifp = if_get(imo->imo_ifidx); if (mifp != NULL) { if (sc->sc_ahcookie != NULL) { hook_disestablish(mifp->if_addrhooks, sc->sc_ahcookie); @@ -219,14 +249,9 @@ vxlan_multicast_cleanup(struct ifnet *if sc->sc_dhcookie); sc->sc_dhcookie = NULL; } - - if_put(mifp); } - if (imo->imo_num_memberships > 0) { - in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); - imo->imo_ifidx = 0; - } + if_put(mifp); } int @@ -234,55 +259,141 @@ vxlan_multicast_join(struct ifnet *ifp, struct sockaddr *dst) { struct vxlan_softc *sc = ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; + struct ip_moptions *imo; + struct ip6_moptions *im6o; + struct in6_multi_mship *im6m; struct sockaddr_in *src4, *dst4; #ifdef INET6 - struct sockaddr_in6 *dst6; + struct sockaddr_in6 *src6, *dst6; #endif /* INET6 */ struct ifaddr *ifa; - struct ifnet*mifp; + struct
Re: add in6 multicast support to vxlan(4), take 4
On Thu, 10 Nov 2016 22:16:55 +0100 Vincent Gross wrote: > On Sat, 5 Nov 2016 12:41:39 +0100 > Vincent Gross wrote: > > > Updated diff, I reworked the logic to handle the if_get/if_put dance > > in vxlan_multicast_join(), and fixed an uninitialized variable. > > > > Ok ? > > Anyone to comment or ok ? this blocks the submission of > other changes on the network stack. Come on ! Don't be shy ! http://quigon.bsws.de/papers/2015/asiabsdcon/mgp00042.html http://quigon.bsws.de/papers/2015/asiabsdcon/mgp00043.html > > > > > Index: net/if_vxlan.c > > === > > RCS file: /cvs/src/sys/net/if_vxlan.c,v > > retrieving revision 1.51 > > diff -u -p -r1.51 if_vxlan.c > > --- net/if_vxlan.c 25 Oct 2016 16:31:08 - 1.51 > > +++ net/if_vxlan.c 5 Nov 2016 11:36:02 - > > @@ -47,6 +47,8 @@ > > #include > > #include > > > > +#include > > + > > #if NPF > 0 > > #include > > #endif > > @@ -61,7 +63,12 @@ struct vxlan_softc { > > struct arpcomsc_ac; > > struct ifmedia sc_media; > > > > - struct ip_moptions sc_imo; > > + union { > > + struct ip_moptions u_imo; > > + struct ip6_moptions u_imo6; > > + } sc_imu; > > +#define sc_imo sc_imu.u_imo > > +#define sc_im6osc_imu.u_imo6 > > void*sc_ahcookie; > > void*sc_lhcookie; > > void*sc_dhcookie; > > @@ -129,10 +136,6 @@ vxlan_clone_create(struct if_clone *ifc, > > M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL) > > return (ENOMEM); > > > > - sc->sc_imo.imo_membership = malloc( > > - (sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS), > > M_IPMOPTS, > > - M_WAITOK|M_ZERO); > > - sc->sc_imo.imo_max_memberships = IP_MIN_MEMBERSHIPS; > > sc->sc_dstport = htons(VXLAN_PORT); > > sc->sc_vnetid = VXLAN_VNI_UNSET; > > > > @@ -190,7 +193,6 @@ vxlan_clone_destroy(struct ifnet *ifp) > > ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY); > > ether_ifdetach(ifp); > > if_detach(ifp); > > - free(sc->sc_imo.imo_membership, M_IPMOPTS, 0); > > free(sc, M_DEVBUF, sizeof(*sc)); > > > > return (0); > > @@ -199,11 +201,33 @@ vxlan_clone_destroy(struct ifnet *ifp) > > void > > 
vxlan_multicast_cleanup(struct ifnet *ifp) > > { > > - struct vxlan_softc *sc = (struct vxlan_softc > > *)ifp->if_softc; > > - struct ip_moptions *imo = &sc->sc_imo; > > - struct ifnet*mifp; > > + struct vxlan_softc *sc = (struct vxlan_softc > > *)ifp->if_softc; > > + struct ip_moptions *imo; > > + struct in_multi **imm; > > + struct ip6_moptions *im6o; > > + struct in6_multi_mship *im6m, *im6m_next; > > + struct ifnet *mifp = NULL; > > + > > + switch (sc->sc_dst.ss_family) { > > + case AF_INET: > > + imo = &sc->sc_imo; > > + mifp = if_get(imo->imo_ifidx); > > + imm = imo->imo_membership; > > + while (imo->imo_num_memberships > 0) > > + > > in_delmulti(imm[--imo->imo_num_memberships]); > > + free(imm, M_IPMOPTS, > > + sizeof(struct in_multi *) * > > imo->imo_num_memberships); > > + break; > > + case AF_INET6: > > + im6o = &sc->sc_im6o; > > + mifp = if_get(im6o->im6o_ifidx); > > + LIST_FOREACH_SAFE(im6m, &im6o->im6o_memberships, > > i6mm_chain, > > + im6m_next) > > + in6_leavegroup(im6m); > > + break; > > + } > > + bzero(&sc->sc_imu, sizeof(sc->sc_imu)); > > > > - mifp = if_get(imo->imo_ifidx); > > if (mifp != NULL) { > > if (sc->sc_ahcookie != NULL) { > > hook_disestablish(mifp->if_addrhooks, > > sc->sc_ahcookie); @@ -219,14 +243,9 @@ > > vxlan_multicast_cleanup(struct ifnet *if sc->sc_dhcookie); > > sc->sc_dhcookie = NULL; > > } > > - > > - if_put(mifp); > > } > > > > - if (imo->imo_num_memberships > 0) { > > - > > in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); > > - imo-&
Re: add in6 multicast support to vxlan(4), take 4
On Sat, 5 Nov 2016 12:41:39 +0100 Vincent Gross wrote: > Updated diff, I reworked the logic to handle the if_get/if_put dance > in vxlan_multicast_join(), and fixed an uninitialized variable. > > Ok ? Anyone to comment or ok ? this blocks the submission of other changes on the network stack. > > Index: net/if_vxlan.c > === > RCS file: /cvs/src/sys/net/if_vxlan.c,v > retrieving revision 1.51 > diff -u -p -r1.51 if_vxlan.c > --- net/if_vxlan.c25 Oct 2016 16:31:08 - 1.51 > +++ net/if_vxlan.c5 Nov 2016 11:36:02 - > @@ -47,6 +47,8 @@ > #include > #include > > +#include > + > #if NPF > 0 > #include > #endif > @@ -61,7 +63,12 @@ struct vxlan_softc { > struct arpcomsc_ac; > struct ifmedia sc_media; > > - struct ip_moptions sc_imo; > + union { > + struct ip_moptions u_imo; > + struct ip6_moptions u_imo6; > + } sc_imu; > +#define sc_imo sc_imu.u_imo > +#define sc_im6o sc_imu.u_imo6 > void*sc_ahcookie; > void*sc_lhcookie; > void*sc_dhcookie; > @@ -129,10 +136,6 @@ vxlan_clone_create(struct if_clone *ifc, > M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL) > return (ENOMEM); > > - sc->sc_imo.imo_membership = malloc( > - (sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS), > M_IPMOPTS, > - M_WAITOK|M_ZERO); > - sc->sc_imo.imo_max_memberships = IP_MIN_MEMBERSHIPS; > sc->sc_dstport = htons(VXLAN_PORT); > sc->sc_vnetid = VXLAN_VNI_UNSET; > > @@ -190,7 +193,6 @@ vxlan_clone_destroy(struct ifnet *ifp) > ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY); > ether_ifdetach(ifp); > if_detach(ifp); > - free(sc->sc_imo.imo_membership, M_IPMOPTS, 0); > free(sc, M_DEVBUF, sizeof(*sc)); > > return (0); > @@ -199,11 +201,33 @@ vxlan_clone_destroy(struct ifnet *ifp) > void > vxlan_multicast_cleanup(struct ifnet *ifp) > { > - struct vxlan_softc *sc = (struct vxlan_softc > *)ifp->if_softc; > - struct ip_moptions *imo = &sc->sc_imo; > - struct ifnet*mifp; > + struct vxlan_softc *sc = (struct vxlan_softc > *)ifp->if_softc; > + struct ip_moptions *imo; > + struct in_multi **imm; > + struct ip6_moptions *im6o; > 
+ struct in6_multi_mship *im6m, *im6m_next; > + struct ifnet *mifp = NULL; > + > + switch (sc->sc_dst.ss_family) { > + case AF_INET: > + imo = &sc->sc_imo; > + mifp = if_get(imo->imo_ifidx); > + imm = imo->imo_membership; > + while (imo->imo_num_memberships > 0) > + in_delmulti(imm[--imo->imo_num_memberships]); > + free(imm, M_IPMOPTS, > + sizeof(struct in_multi *) * > imo->imo_num_memberships); > + break; > + case AF_INET6: > + im6o = &sc->sc_im6o; > + mifp = if_get(im6o->im6o_ifidx); > + LIST_FOREACH_SAFE(im6m, &im6o->im6o_memberships, > i6mm_chain, > + im6m_next) > + in6_leavegroup(im6m); > + break; > + } > + bzero(&sc->sc_imu, sizeof(sc->sc_imu)); > > - mifp = if_get(imo->imo_ifidx); > if (mifp != NULL) { > if (sc->sc_ahcookie != NULL) { > hook_disestablish(mifp->if_addrhooks, > sc->sc_ahcookie); @@ -219,14 +243,9 @@ vxlan_multicast_cleanup(struct > ifnet *if sc->sc_dhcookie); > sc->sc_dhcookie = NULL; > } > - > - if_put(mifp); > } > > - if (imo->imo_num_memberships > 0) { > - > in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); > - imo->imo_ifidx = 0; > - } > + if_put(mifp); > } > > int > @@ -234,55 +253,141 @@ vxlan_multicast_join(struct ifnet *ifp, > struct sockaddr *dst) > { > struct vxlan_softc *sc = ifp->if_softc; > - struct ip_moptions *imo = &sc->sc_imo; > + struct ip_moptions *imo; > + struct ip6_moptions *im6o; > + struct in6_multi_mship *im6m; > struct sockaddr_in *src4, *dst4; > #ifdef INET6 > - struct sockaddr_in6 *dst6; > + struct sockaddr_in6 *src6, *dst6; > #endif /* INET6 */ > struct ifad
Re: [PATCH] iked: Bugfixes for IKE rekeying
On Wed, 9 Nov 2016 13:16:46 + Thomas Klute wrote: > Hi tech@, > > this patch contains fixes for two bugs that break IKE rekeying > initiated by iked. Please review, and apply or let me know what has to > be changed! Both bugs are fixed by initializing the respective > structures of the new IKE SA (struct iked_sa *nsa in the > ikev2_ike_sa_rekey function): Thanks, we are looking into it. > > For [1]: Copying the address information is required to send any > request messages over the new IKE SA after rekeying, otherwise errors > like the following happen because the IP addresses and ports remain > initialized to zero: > > ikev2_msg_send: INFORMATIONAL request from any to any msgid 1, 80 > bytes ikev2_msg_send: sendtofrom: Invalid argument > > For [2]: Setting the DH group based on the currently used one is > necessary because iked proposes only the currently used transforms > during IKE rekeying, so trying to use any other group for the DH > exchange will fail even if it is preferred by local policy (see > comment in the patch for details). > > This patch includes and supersedes the one for only the first bug I > sent yesterday. > > Best regards, > Thomas > > [1] https://marc.info/?l=openbsd-bugs&m=147739504516767&w=2 > [2] https://marc.info/?l=openbsd-bugs&m=147747405806461&w=2 > > Index: src/sbin/iked/ikev2.c > === > RCS file: /cvs/src/sbin/iked/ikev2.c,v > retrieving revision 1.131 > diff -u -p -u -r1.131 ikev2.c > --- src/sbin/iked/ikev2.c 2 Jun 2016 07:14:26 - > 1.131 +++ src/sbin/iked/ikev2.c 9 Nov 2016 13:12:32 - > @@ -2658,6 +2658,18 @@ ikev2_ike_sa_rekey(struct iked *env, voi > goto done; > } > > + /* Select the DH group ID based on the currently used > + * one. Otherwise the call to ikev2_sa_initiator below would > + * set it to the first DH transform in the policy, while the > + * SA payload contains only one proposal matching the > + * currently used transforms. 
If a different DH transform has > + * been negotiated this means KE payload and negotiated DH > + * transform cannot match, causing rekeying to fail. */ > + if ((nsa->sa_dhgroup = group_get(sa->sa_dhgroup->id)) == > NULL) { > + log_debug("%s: failed to initialize DH group", > __func__); > + goto done; > + } > + > if (ikev2_sa_initiator(env, nsa, sa, NULL)) { > log_debug("%s: failed to setup DH", __func__); > goto done; > @@ -2665,6 +2677,13 @@ ikev2_ike_sa_rekey(struct iked *env, voi > sa_state(env, nsa, IKEV2_STATE_AUTH_SUCCESS); > nonce = nsa->sa_inonce; > > + /* Copy local and peer address from the old SA */ > + if (sa_address(nsa, &nsa->sa_peer, &sa->sa_peer.addr) == -1 > || > + sa_address(nsa, &nsa->sa_local, &sa->sa_local.addr) == > -1) { > + log_debug("%s: failed copy address data", __func__); > + goto done; > + } > + > if ((e = ibuf_static()) == NULL) > goto done; > >
Re: Kill ifa_ifwithnet()
On Mon, 7 Nov 2016 08:59:53 +0100 Martin Pieuchot wrote:

> On 04/11/16(Fri) 21:33, Vincent Gross wrote:
> > [...]
> > Why are you killing Strict Source Route Record ? Just as you did
> > with rtredirect(), you can check whether RTF_GATEWAY is set and
> > send back an ICMP_UNREACH if so. Or did I miss something ?
>
> Like that?
>
> Index: netinet/ip_input.c
> ===
> RCS file: /cvs/src/sys/netinet/ip_input.c,v
> retrieving revision 1.282
> diff -u -p -r1.282 ip_input.c
> --- netinet/ip_input.c	22 Sep 2016 10:12:25 -	1.282
> +++ netinet/ip_input.c	7 Nov 2016 07:59:02 -
> @@ -1117,37 +1117,20 @@ ip_dooptions(struct mbuf *m, struct ifne
>  			ipaddr.sin_len = sizeof(ipaddr);
>  			memcpy(&ipaddr.sin_addr, cp + off,
>  			    sizeof(ipaddr.sin_addr));
> -			if (opt == IPOPT_SSRR) {
> -				if ((ia = ifatoia(ifa_ifwithdstaddr(
> -				    sintosa(&ipaddr),
> -				    m->m_pkthdr.ph_rtableid))) == NULL)
> -					ia = ifatoia(ifa_ifwithnet(
> -					    sintosa(&ipaddr),
> -					    m->m_pkthdr.ph_rtableid));
> -				if (ia == NULL) {
> -					type = ICMP_UNREACH;
> -					code = ICMP_UNREACH_SRCFAIL;
> -					goto bad;
> -				}
> -				memcpy(cp + off, &ia->ia_addr.sin_addr,
> -				    sizeof(struct in_addr));
> -				cp[IPOPT_OFFSET] += sizeof(struct in_addr);
> -			} else {
> -				/* keep packet in the virtual instance */
> -				rt = rtalloc(sintosa(&ipaddr), RT_RESOLVE,
> -				    rtableid);
> -				if (!rtisvalid(rt)) {
> -					type = ICMP_UNREACH;
> -					code = ICMP_UNREACH_SRCFAIL;
> -					rtfree(rt);
> -					goto bad;
> -				}
> -				ia = ifatoia(rt->rt_ifa);
> -				memcpy(cp + off, &ia->ia_addr.sin_addr,
> -				    sizeof(struct in_addr));
> +			/* keep packet in the virtual instance */
> +			rt = rtalloc(sintosa(&ipaddr), RT_RESOLVE, rtableid);
> +			if (!rtisvalid(rt) || ((opt == IPOPT_SSRR) &&
> +			    ISSET(rt->rt_flags, RTF_GATEWAY))) {
> +				type = ICMP_UNREACH;
> +				code = ICMP_UNREACH_SRCFAIL;
>  				rtfree(rt);
> -				cp[IPOPT_OFFSET] += sizeof(struct in_addr);
> +				goto bad;
>  			}
> +			ia = ifatoia(rt->rt_ifa);
> +			memcpy(cp + off, &ia->ia_addr.sin_addr,
> +			    sizeof(struct in_addr));
> +			rtfree(rt);
> +			cp[IPOPT_OFFSET] += sizeof(struct in_addr);
>  			ip->ip_dst = ipaddr.sin_addr;
>  			/*
>  			 * Let ip_intr's mcast routing check handle mcast pkts

Ok vgross@
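For readers skimming the archive, the check agreed on in this exchange can be sketched outside the kernel. This is only an illustration under stated assumptions: `sketch_route`, `SKETCH_RTF_GATEWAY`, `sr_opt` and `source_route_refused()` are hypothetical stand-ins, not the kernel's rtentry, RTF_GATEWAY or ip_dooptions() code. The idea is that a next hop is refused when no valid route exists, or when the option is a strict source route and the route goes through a gateway (that is, the hop is not directly reachable).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's option codes and route flags. */
enum sr_opt { SR_LOOSE, SR_STRICT };
#define SKETCH_RTF_GATEWAY 0x2u

struct sketch_route {
	bool valid;		/* models rtisvalid(rt) */
	unsigned int flags;	/* models rt->rt_flags */
};

/*
 * Return true when the packet should be refused (the patch answers
 * with ICMP_UNREACH / ICMP_UNREACH_SRCFAIL in that case).
 */
bool
source_route_refused(enum sr_opt opt, const struct sketch_route *rt)
{
	/* No usable route at all: refuse for both loose and strict. */
	if (rt == NULL || !rt->valid)
		return true;
	/* Strict routing requires a directly connected next hop. */
	if (opt == SR_STRICT && (rt->flags & SKETCH_RTF_GATEWAY) != 0)
		return true;
	return false;
}
```

With this shape, loose source routing keeps working through gateways, while strict source routing only accepts on-link next hops, which is what the removed ifa_ifwithnet() lookup was used for.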
add in6 multicast support to vxlan(4), take 4
Updated diff, I reworked the logic to handle the if_get/if_put dance in vxlan_multicast_join(), and fixed an uninitialized variable. Ok ? Index: net/if_vxlan.c === RCS file: /cvs/src/sys/net/if_vxlan.c,v retrieving revision 1.51 diff -u -p -r1.51 if_vxlan.c --- net/if_vxlan.c 25 Oct 2016 16:31:08 - 1.51 +++ net/if_vxlan.c 5 Nov 2016 11:36:02 - @@ -47,6 +47,8 @@ #include #include +#include + #if NPF > 0 #include #endif @@ -61,7 +63,12 @@ struct vxlan_softc { struct arpcomsc_ac; struct ifmedia sc_media; - struct ip_moptions sc_imo; + union { + struct ip_moptions u_imo; + struct ip6_moptions u_imo6; + } sc_imu; +#define sc_imo sc_imu.u_imo +#define sc_im6osc_imu.u_imo6 void*sc_ahcookie; void*sc_lhcookie; void*sc_dhcookie; @@ -129,10 +136,6 @@ vxlan_clone_create(struct if_clone *ifc, M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL) return (ENOMEM); - sc->sc_imo.imo_membership = malloc( - (sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS), M_IPMOPTS, - M_WAITOK|M_ZERO); - sc->sc_imo.imo_max_memberships = IP_MIN_MEMBERSHIPS; sc->sc_dstport = htons(VXLAN_PORT); sc->sc_vnetid = VXLAN_VNI_UNSET; @@ -190,7 +193,6 @@ vxlan_clone_destroy(struct ifnet *ifp) ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY); ether_ifdetach(ifp); if_detach(ifp); - free(sc->sc_imo.imo_membership, M_IPMOPTS, 0); free(sc, M_DEVBUF, sizeof(*sc)); return (0); @@ -199,11 +201,33 @@ vxlan_clone_destroy(struct ifnet *ifp) void vxlan_multicast_cleanup(struct ifnet *ifp) { - struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; - struct ifnet*mifp; + struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; + struct ip_moptions *imo; + struct in_multi **imm; + struct ip6_moptions *im6o; + struct in6_multi_mship *im6m, *im6m_next; + struct ifnet *mifp = NULL; + + switch (sc->sc_dst.ss_family) { + case AF_INET: + imo = &sc->sc_imo; + mifp = if_get(imo->imo_ifidx); + imm = imo->imo_membership; + while (imo->imo_num_memberships > 0) + 
in_delmulti(imm[--imo->imo_num_memberships]); + free(imm, M_IPMOPTS, + sizeof(struct in_multi *) * imo->imo_num_memberships); + break; + case AF_INET6: + im6o = &sc->sc_im6o; + mifp = if_get(im6o->im6o_ifidx); + LIST_FOREACH_SAFE(im6m, &im6o->im6o_memberships, i6mm_chain, + im6m_next) + in6_leavegroup(im6m); + break; + } + bzero(&sc->sc_imu, sizeof(sc->sc_imu)); - mifp = if_get(imo->imo_ifidx); if (mifp != NULL) { if (sc->sc_ahcookie != NULL) { hook_disestablish(mifp->if_addrhooks, sc->sc_ahcookie); @@ -219,14 +243,9 @@ vxlan_multicast_cleanup(struct ifnet *if sc->sc_dhcookie); sc->sc_dhcookie = NULL; } - - if_put(mifp); } - if (imo->imo_num_memberships > 0) { - in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); - imo->imo_ifidx = 0; - } + if_put(mifp); } int @@ -234,55 +253,141 @@ vxlan_multicast_join(struct ifnet *ifp, struct sockaddr *dst) { struct vxlan_softc *sc = ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; + struct ip_moptions *imo; + struct ip6_moptions *im6o; + struct in6_multi_mship *im6m; struct sockaddr_in *src4, *dst4; #ifdef INET6 - struct sockaddr_in6 *dst6; + struct sockaddr_in6 *src6, *dst6; #endif /* INET6 */ struct ifaddr *ifa; - struct ifnet*mifp; + struct ifnet*mifp = NULL, *m6ifp = NULL; + struct rtentry *rt; + int error; switch (dst->sa_family) { case AF_INET: dst4 = satosin(dst); + src4 = satosin(src); if (!IN_MULTICAST(dst4->sin_addr.s_addr)) return (0); + if (src4->sin_addr.s_addr == INADDR_ANY || + IN_MULTICAST(src4->sin_addr.s_addr)) + return (EINVAL); + if ((ifa = ifa_ifwithaddr(src, sc->sc_rdomain)) == NULL || + (mifp = ifa->ifa_ifp) == NULL || + (mifp->if_flags & IFF_
Re: Kill ifa_ifwithnet()
On Fri, 4 Nov 2016 12:01:58 +0100 Martin Pieuchot wrote: > Rather than trying to keep this old routing table like function alive > by reimplementing rn_refines(), let's get rid of it. > > ok? > > Index: net/route.c > === > RCS file: /cvs/src/sys/net/route.c,v > retrieving revision 1.333 > diff -u -p -r1.333 route.c > --- net/route.c 6 Oct 2016 19:09:08 - 1.333 > +++ net/route.c 4 Nov 2016 10:51:55 - > @@ -550,11 +550,16 @@ rtredirect(struct sockaddr *dst, struct > splsoftassert(IPL_SOFTNET); > > /* verify the gateway is directly reachable */ > - if ((ifa = ifa_ifwithnet(gateway, rdomain)) == NULL) { > + rt = rtalloc(gateway, 0, rdomain); > + if (!rtisvalid(rt) || ISSET(rt->rt_flags, RTF_GATEWAY)) { > + rtfree(rt); > error = ENETUNREACH; > goto out; > } > - ifidx = ifa->ifa_ifp->if_index; > + ifidx = rt->rt_ifidx; > + rtfree(rt); > + rt = NULL; > + > rt = rtable_lookup(rdomain, dst, NULL, NULL, RTP_ANY); > /* >* If the redirect isn't from our current router for this > dst, Index: net/if.c > === > RCS file: /cvs/src/sys/net/if.c,v > retrieving revision 1.456 > diff -u -p -r1.456 if.c > --- net/if.c 19 Oct 2016 02:05:49 - 1.456 > +++ net/if.c 4 Nov 2016 10:55:03 - > @@ -1282,47 +1282,6 @@ ifa_ifwithdstaddr(struct sockaddr *addr, > } > > /* > - * Find an interface on a specific network. If many, choice > - * is most specific found. 
> - */ > -struct ifaddr * > -ifa_ifwithnet(struct sockaddr *sa, u_int rtableid) > -{ > - struct ifnet *ifp; > - struct ifaddr *ifa, *ifa_maybe = NULL; > - char *cplim, *addr_data = sa->sa_data; > - u_int rdomain; > - > - KERNEL_ASSERT_LOCKED(); > - rdomain = rtable_l2(rtableid); > - TAILQ_FOREACH(ifp, &ifnet, if_list) { > - if (ifp->if_rdomain != rdomain) > - continue; > - TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { > - char *cp, *cp2, *cp3; > - > - if (ifa->ifa_addr->sa_family != > sa->sa_family || > - ifa->ifa_netmask == 0) > - next: continue; > - cp = addr_data; > - cp2 = ifa->ifa_addr->sa_data; > - cp3 = ifa->ifa_netmask->sa_data; > - cplim = (char *)ifa->ifa_netmask + > - ifa->ifa_netmask->sa_len; > - while (cp3 < cplim) > - if ((*cp++ ^ *cp2++) & *cp3++) > - /* want to continue for() loop */ > - goto next; > - if (ifa_maybe == 0 || > - rn_refines((caddr_t)ifa->ifa_netmask, > - (caddr_t)ifa_maybe->ifa_netmask)) > - ifa_maybe = ifa; > - } > - } > - return (ifa_maybe); > -} > - > -/* > * Find an interface address specific to an interface best matching > * a given address. 
> */ > Index: net/if_var.h > === > RCS file: /cvs/src/sys/net/if_var.h,v > retrieving revision 1.75 > diff -u -p -r1.75 if_var.h > --- net/if_var.h 4 Sep 2016 15:46:39 - 1.75 > +++ net/if_var.h 4 Nov 2016 10:54:55 - > @@ -304,7 +304,6 @@ void p2p_rtrequest(struct ifnet *, int, > > struct ifaddr *ifa_ifwithaddr(struct sockaddr *, u_int); > struct ifaddr *ifa_ifwithdstaddr(struct sockaddr *, u_int); > -struct ifaddr *ifa_ifwithnet(struct sockaddr *, u_int); > struct ifaddr *ifaof_ifpforaddr(struct sockaddr *, struct > ifnet *); voidifafree(struct ifaddr *); > Everything above is ok vgross@ > Index: netinet/ip_input.c > === > RCS file: /cvs/src/sys/netinet/ip_input.c,v > retrieving revision 1.282 > diff -u -p -r1.282 ip_input.c > --- netinet/ip_input.c22 Sep 2016 10:12:25 - 1.282 > +++ netinet/ip_input.c4 Nov 2016 10:54:49 - > @@ -1117,37 +1117,19 @@ ip_dooptions(struct mbuf *m, struct ifne > ipaddr.sin_len = sizeof(ipaddr); > memcpy(&ipaddr.sin_addr, cp + off, > sizeof(ipaddr.sin_addr)); > - if (opt == IPOPT_SSRR) { > - if ((ia = ifatoia(ifa_ifwithdstaddr( > - sintosa(&ipaddr), > - m->m_pkthdr.ph_rtableid))) == > NULL) > - ia = ifatoia(ifa_ifwithnet( > - sintosa(&ipaddr), > - > m->m_pkthdr.ph_rtableid)); > - if (ia == NULL
Re: add in6 multicast support to vxlan(4) ; question on mbufs
On Tue, 1 Nov 2016 18:51:08 +0100 Mike Belopuhov wrote: > On 1 November 2016 at 18:23, Vincent Gross > wrote: > > On Tue, 4 Oct 2016 01:07:51 +0200 > > Vincent Gross wrote: > > > >> On Sat, 24 Sep 2016 10:58:10 +0200 > >> Vincent Gross wrote: > >> > >> > Hi, > >> > > >> [snip] > >> > > >> > Aside from the mbuf issue, is this Ok ? > >> > >> I will go back on the mbuff stuff later. > >> > >> Diff rebased, ok anyone ? > >> > > > > New rebase, tested on amd64 and macppc, Ok ? > > > > Why have you kept the m_adj with ETHER_ALIGN? > Derp. Lost track while reading the mbuf thread. New diff w/o the mbuf dance, again tested on amd64 and macppc. Index: sys/net/if_vxlan.c === RCS file: /cvs/src/sys/net/if_vxlan.c,v retrieving revision 1.51 diff -u -p -r1.51 if_vxlan.c --- sys/net/if_vxlan.c 25 Oct 2016 16:31:08 - 1.51 +++ sys/net/if_vxlan.c 1 Nov 2016 21:58:24 - @@ -47,6 +47,8 @@ #include #include +#include + #if NPF > 0 #include #endif @@ -61,7 +63,12 @@ struct vxlan_softc { struct arpcomsc_ac; struct ifmedia sc_media; - struct ip_moptions sc_imo; + union { + struct ip_moptions u_imo; + struct ip6_moptions u_imo6; + } sc_imu; +#define sc_imo sc_imu.u_imo +#define sc_im6osc_imu.u_imo6 void*sc_ahcookie; void*sc_lhcookie; void*sc_dhcookie; @@ -129,10 +136,6 @@ vxlan_clone_create(struct if_clone *ifc, M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL) return (ENOMEM); - sc->sc_imo.imo_membership = malloc( - (sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS), M_IPMOPTS, - M_WAITOK|M_ZERO); - sc->sc_imo.imo_max_memberships = IP_MIN_MEMBERSHIPS; sc->sc_dstport = htons(VXLAN_PORT); sc->sc_vnetid = VXLAN_VNI_UNSET; @@ -190,7 +193,6 @@ vxlan_clone_destroy(struct ifnet *ifp) ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY); ether_ifdetach(ifp); if_detach(ifp); - free(sc->sc_imo.imo_membership, M_IPMOPTS, 0); free(sc, M_DEVBUF, sizeof(*sc)); return (0); @@ -199,11 +201,33 @@ vxlan_clone_destroy(struct ifnet *ifp) void vxlan_multicast_cleanup(struct ifnet *ifp) { - struct vxlan_softc *sc = (struct 
vxlan_softc *)ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; - struct ifnet*mifp; + struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; + struct ip_moptions *imo; + struct in_multi **imm; + struct ip6_moptions *im6o; + struct in6_multi_mship *im6m, *im6m_next; + struct ifnet *mifp = NULL; + + switch (sc->sc_dst.ss_family) { + case AF_INET: + imo = &sc->sc_imo; + mifp = if_get(imo->imo_ifidx); + imm = imo->imo_membership; + while (imo->imo_num_memberships > 0) + in_delmulti(imm[--imo->imo_num_memberships]); + free(imm, M_IPMOPTS, + sizeof(struct in_multi *) * imo->imo_num_memberships); + break; + case AF_INET6: + im6o = &sc->sc_im6o; + mifp = if_get(im6o->im6o_ifidx); + LIST_FOREACH_SAFE(im6m, &im6o->im6o_memberships, i6mm_chain, + im6m_next) + in6_leavegroup(im6m); + break; + } + bzero(&sc->sc_imu, sizeof(sc->sc_imu)); - mifp = if_get(imo->imo_ifidx); if (mifp != NULL) { if (sc->sc_ahcookie != NULL) { hook_disestablish(mifp->if_addrhooks, sc->sc_ahcookie); @@ -219,14 +243,9 @@ vxlan_multicast_cleanup(struct ifnet *if sc->sc_dhcookie); sc->sc_dhcookie = NULL; } - - if_put(mifp); } - if (imo->imo_num_memberships > 0) { - in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); - imo->imo_ifidx = 0; - } + if_put(mifp); } int @@ -234,55 +253,140 @@ vxlan_multicast_join(struct ifnet *ifp, struct sockaddr *dst) { struct vxlan_softc *sc = ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; + struct ip_moptions *imo; + struct ip6_moptions *im6o; + struct in6_multi_mship *imm; struct sockaddr_in *src4, *dst4; #ifdef INET6 - struct sockaddr_in6 *dst6; +
Re: add in6 multicast support to vxlan(4) ; question on mbufs
On Tue, 4 Oct 2016 01:07:51 +0200 Vincent Gross wrote: > On Sat, 24 Sep 2016 10:58:10 +0200 > Vincent Gross wrote: > > > Hi, > > > [snip] > > > > Aside from the mbuf issue, is this Ok ? > > I will go back on the mbuff stuff later. > > Diff rebased, ok anyone ? > New rebase, tested on amd64 and macppc, Ok ? Index: sys/net/if_vxlan.c === RCS file: /cvs/src/sys/net/if_vxlan.c,v retrieving revision 1.51 diff -u -p -r1.51 if_vxlan.c --- sys/net/if_vxlan.c 25 Oct 2016 16:31:08 - 1.51 +++ sys/net/if_vxlan.c 1 Nov 2016 17:18:15 - @@ -47,6 +47,8 @@ #include #include +#include + #if NPF > 0 #include #endif @@ -61,7 +63,12 @@ struct vxlan_softc { struct arpcomsc_ac; struct ifmedia sc_media; - struct ip_moptions sc_imo; + union { + struct ip_moptions u_imo; + struct ip6_moptions u_imo6; + } sc_imu; +#define sc_imo sc_imu.u_imo +#define sc_im6osc_imu.u_imo6 void*sc_ahcookie; void*sc_lhcookie; void*sc_dhcookie; @@ -129,10 +136,6 @@ vxlan_clone_create(struct if_clone *ifc, M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL) return (ENOMEM); - sc->sc_imo.imo_membership = malloc( - (sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS), M_IPMOPTS, - M_WAITOK|M_ZERO); - sc->sc_imo.imo_max_memberships = IP_MIN_MEMBERSHIPS; sc->sc_dstport = htons(VXLAN_PORT); sc->sc_vnetid = VXLAN_VNI_UNSET; @@ -190,7 +193,6 @@ vxlan_clone_destroy(struct ifnet *ifp) ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY); ether_ifdetach(ifp); if_detach(ifp); - free(sc->sc_imo.imo_membership, M_IPMOPTS, 0); free(sc, M_DEVBUF, sizeof(*sc)); return (0); @@ -199,11 +201,33 @@ vxlan_clone_destroy(struct ifnet *ifp) void vxlan_multicast_cleanup(struct ifnet *ifp) { - struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; - struct ifnet*mifp; + struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; + struct ip_moptions *imo; + struct in_multi **imm; + struct ip6_moptions *im6o; + struct in6_multi_mship *im6m, *im6m_next; + struct ifnet *mifp = NULL; + + switch 
(sc->sc_dst.ss_family) { + case AF_INET: + imo = &sc->sc_imo; + mifp = if_get(imo->imo_ifidx); + imm = imo->imo_membership; + while (imo->imo_num_memberships > 0) + in_delmulti(imm[--imo->imo_num_memberships]); + free(imm, M_IPMOPTS, + sizeof(struct in_multi *) * imo->imo_num_memberships); + break; + case AF_INET6: + im6o = &sc->sc_im6o; + mifp = if_get(im6o->im6o_ifidx); + LIST_FOREACH_SAFE(im6m, &im6o->im6o_memberships, i6mm_chain, + im6m_next) + in6_leavegroup(im6m); + break; + } + bzero(&sc->sc_imu, sizeof(sc->sc_imu)); - mifp = if_get(imo->imo_ifidx); if (mifp != NULL) { if (sc->sc_ahcookie != NULL) { hook_disestablish(mifp->if_addrhooks, sc->sc_ahcookie); @@ -219,14 +243,9 @@ vxlan_multicast_cleanup(struct ifnet *if sc->sc_dhcookie); sc->sc_dhcookie = NULL; } - - if_put(mifp); } - if (imo->imo_num_memberships > 0) { - in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); - imo->imo_ifidx = 0; - } + if_put(mifp); } int @@ -234,55 +253,140 @@ vxlan_multicast_join(struct ifnet *ifp, struct sockaddr *dst) { struct vxlan_softc *sc = ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; + struct ip_moptions *imo; + struct ip6_moptions *im6o; + struct in6_multi_mship *imm; struct sockaddr_in *src4, *dst4; #ifdef INET6 - struct sockaddr_in6 *dst6; + struct sockaddr_in6 *src6, *dst6; #endif /* INET6 */ struct ifaddr *ifa; - struct ifnet*mifp; + struct ifnet*mifp = NULL; + struct rtentry *rt; + int error; switch (dst->sa_family) { case AF_INET: dst4 = satosin(dst); + src4 = satosin(src); if (!IN_MULTICAST(dst4->sin_addr.s_addr
Re: add in6 multicast support to vxlan(4) ; question on mbufs
On Sat, 24 Sep 2016 10:58:10 +0200 Vincent Gross wrote: > Hi, > [snip] > > Aside from the mbuf issue, is this Ok ? I will go back on the mbuff stuff later. Diff rebased, ok anyone ? Index: net/if_vxlan.c === RCS file: /cvs/src/sys/net/if_vxlan.c,v retrieving revision 1.48 diff -u -p -r1.48 if_vxlan.c --- net/if_vxlan.c 30 Sep 2016 10:22:05 - 1.48 +++ net/if_vxlan.c 3 Oct 2016 23:12:42 - @@ -47,6 +47,8 @@ #include #include +#include + #if NPF > 0 #include #endif @@ -61,7 +63,12 @@ struct vxlan_softc { struct arpcomsc_ac; struct ifmedia sc_media; - struct ip_moptions sc_imo; + union { + struct ip_moptions u_imo; + struct ip6_moptions u_imo6; + } sc_imu; +#define sc_imo sc_imu.u_imo +#define sc_im6osc_imu.u_imo6 void*sc_ahcookie; void*sc_lhcookie; void*sc_dhcookie; @@ -129,10 +136,6 @@ vxlan_clone_create(struct if_clone *ifc, M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL) return (ENOMEM); - sc->sc_imo.imo_membership = malloc( - (sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS), M_IPMOPTS, - M_WAITOK|M_ZERO); - sc->sc_imo.imo_max_memberships = IP_MIN_MEMBERSHIPS; sc->sc_dstport = htons(VXLAN_PORT); sc->sc_vnetid = VXLAN_VNI_UNSET; @@ -190,7 +193,6 @@ vxlan_clone_destroy(struct ifnet *ifp) ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY); ether_ifdetach(ifp); if_detach(ifp); - free(sc->sc_imo.imo_membership, M_IPMOPTS, 0); free(sc, M_DEVBUF, sizeof(*sc)); return (0); @@ -199,11 +201,33 @@ vxlan_clone_destroy(struct ifnet *ifp) void vxlan_multicast_cleanup(struct ifnet *ifp) { - struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; - struct ifnet*mifp; + struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; + struct ip_moptions *imo; + struct in_multi **imm; + struct ip6_moptions *im6o; + struct in6_multi_mship *im6m, *im6m_next; + struct ifnet *mifp = NULL; + + switch (sc->sc_dst.ss_family) { + case AF_INET: + imo = &sc->sc_imo; + mifp = if_get(imo->imo_ifidx); + imm = imo->imo_membership; + while 
(imo->imo_num_memberships > 0) + in_delmulti(imm[--imo->imo_num_memberships]); + free(imm, M_IPMOPTS, + sizeof(struct in_multi *) * imo->imo_num_memberships); + break; + case AF_INET6: + im6o = &sc->sc_im6o; + mifp = if_get(im6o->im6o_ifidx); + LIST_FOREACH_SAFE(im6m, &im6o->im6o_memberships, i6mm_chain, + im6m_next) + in6_leavegroup(im6m); + break; + } + bzero(&sc->sc_imu, sizeof(sc->sc_imu)); - mifp = if_get(imo->imo_ifidx); if (mifp != NULL) { if (sc->sc_ahcookie != NULL) { hook_disestablish(mifp->if_addrhooks, sc->sc_ahcookie); @@ -219,14 +243,9 @@ vxlan_multicast_cleanup(struct ifnet *if sc->sc_dhcookie); sc->sc_dhcookie = NULL; } - - if_put(mifp); } - if (imo->imo_num_memberships > 0) { - in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); - imo->imo_ifidx = 0; - } + if_put(mifp); } int @@ -234,47 +253,136 @@ vxlan_multicast_join(struct ifnet *ifp, struct sockaddr *dst) { struct vxlan_softc *sc = ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; + struct ip_moptions *imo; + struct ip6_moptions *im6o; + struct in6_multi_mship *imm; struct sockaddr_in *src4, *dst4; - struct sockaddr_in6 *dst6; + struct sockaddr_in6 *src6, *dst6; struct ifaddr *ifa; - struct ifnet*mifp; + struct ifnet*mifp = NULL; + struct rtentry *rt; + int error; - if (dst->sa_family == AF_INET) { + switch (dst->sa_family) { + case AF_INET: dst4 = satosin(dst); + src4 = satosin(src); if (!IN_MULTICAST(dst4->sin_addr.s_addr)) return (0); - } else if (dst->sa_family == AF_INET6) { + if (src4->sin_addr.s_addr == INADDR_ANY || + IN_MULTICAST(src4->sin_addr
Re: iked recvfromto flags
On Mon, 26 Sep 2016 18:33:43 +0200 j...@wxcvbn.org (Jeremie Courreges-Anglas) wrote:

> Don't ignore the "flags" argument passed to recvfromto. Doesn't
> matter for now in iked (0 is passed), but this kind of code tends to
> be copied.
>
> ok?
>

ok vgross@

>
> Index: util.c
> ===
> RCS file: /cvs/src/sbin/iked/util.c,v
> retrieving revision 1.31
> diff -u -p -p -u -r1.31 util.c
> --- util.c	4 Sep 2016 10:26:02 -	1.31
> +++ util.c	26 Sep 2016 16:32:56 -
> @@ -366,7 +366,7 @@ recvfromto(int s, void *buf, size_t len,
>  	msg.msg_control = &cmsgbuf.buf;
>  	msg.msg_controllen = sizeof(cmsgbuf.buf);
>
> -	if ((ret = recvmsg(s, &msg, 0)) == -1)
> +	if ((ret = recvmsg(s, &msg, flags)) == -1)
>  		return (-1);
>
>  	*fromlen = from->sa_len;
>
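To illustrate why the one-word change matters once such a helper is copied elsewhere, here is a minimal userland sketch. It is not iked's recvfromto(): `recv_with_flags()` is a hypothetical, stripped-down wrapper that only shows the caller's flags being forwarded to recvmsg(2) instead of a hard-coded 0.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/*
 * Hypothetical helper: receive one message into buf and pass the
 * caller's flags (MSG_PEEK, MSG_DONTWAIT, ...) through to recvmsg(2).
 */
ssize_t
recv_with_flags(int s, void *buf, size_t len, int flags)
{
	struct iovec iov;
	struct msghdr msg;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = buf;
	iov.iov_len = len;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	/* Forward flags; hard-coding 0 here would silently drop them. */
	return recvmsg(s, &msg, flags);
}
```

A caller passing MSG_PEEK expects the datagram to still be queued afterwards; with a wrapper that ignores its flags argument, the first call would consume it.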
add in6 multicast support to vxlan(4) ; question on mbufs
Hi, As said in Subject:. I would like to get comments on the m_adj/m_pullup dance at the end of vxlan_lookup() ; I do this because ether_input() access the ethernet header with mtod(), and under some conditions the mbuf handled would have its first data chunk empty (mh_len == 0). What is the rule of thumb regarding m_pullup/mtod use versus m_copydata ? Aside from the mbuf issue, is this Ok ? Index: net/if_vxlan.c === RCS file: /cvs/src/sys/net/if_vxlan.c,v retrieving revision 1.44 diff -u -p -r1.44 if_vxlan.c --- net/if_vxlan.c 4 Sep 2016 11:14:44 - 1.44 +++ net/if_vxlan.c 24 Sep 2016 08:37:22 - @@ -47,6 +47,8 @@ #include #include +#include + #if NPF > 0 #include #endif @@ -61,7 +63,12 @@ struct vxlan_softc { struct arpcomsc_ac; struct ifmedia sc_media; - struct ip_moptions sc_imo; + union { + struct ip_moptions u_imo; + struct ip6_moptions u_imo6; + } sc_imu; +#define sc_imo sc_imu.u_imo +#define sc_im6osc_imu.u_imo6 void*sc_ahcookie; void*sc_lhcookie; void*sc_dhcookie; @@ -129,10 +136,6 @@ vxlan_clone_create(struct if_clone *ifc, M_DEVBUF, M_NOWAIT|M_ZERO)) == NULL) return (ENOMEM); - sc->sc_imo.imo_membership = malloc( - (sizeof(struct in_multi *) * IP_MIN_MEMBERSHIPS), M_IPMOPTS, - M_WAITOK|M_ZERO); - sc->sc_imo.imo_max_memberships = IP_MIN_MEMBERSHIPS; sc->sc_dstport = htons(VXLAN_PORT); sc->sc_vnetid = VXLAN_VNI_UNSET; @@ -190,7 +193,6 @@ vxlan_clone_destroy(struct ifnet *ifp) ifmedia_delete_instance(&sc->sc_media, IFM_INST_ANY); ether_ifdetach(ifp); if_detach(ifp); - free(sc->sc_imo.imo_membership, M_IPMOPTS, 0); free(sc, M_DEVBUF, sizeof(*sc)); return (0); @@ -199,11 +201,33 @@ vxlan_clone_destroy(struct ifnet *ifp) void vxlan_multicast_cleanup(struct ifnet *ifp) { - struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; - struct ifnet*mifp; + struct vxlan_softc *sc = (struct vxlan_softc *)ifp->if_softc; + struct ip_moptions *imo; + struct in_multi **imm; + struct ip6_moptions *im6o; + struct in6_multi_mship 
*im6m, *im6m_next; + struct ifnet *mifp = NULL; + + switch (sc->sc_dst.ss_family) { + case AF_INET: + imo = &sc->sc_imo; + mifp = if_get(imo->imo_ifidx); + imm = imo->imo_membership; + while (imo->imo_num_memberships > 0) + in_delmulti(imm[--imo->imo_num_memberships]); + free(imm, M_IPMOPTS, + sizeof(struct in_multi *) * imo->imo_num_memberships); + break; + case AF_INET6: + im6o = &sc->sc_im6o; + mifp = if_get(im6o->im6o_ifidx); + LIST_FOREACH_SAFE(im6m, &im6o->im6o_memberships, i6mm_chain, + im6m_next) + in6_leavegroup(im6m); + break; + } + bzero(&sc->sc_imu, sizeof(sc->sc_imu)); - mifp = if_get(imo->imo_ifidx); if (mifp != NULL) { if (sc->sc_ahcookie != NULL) { hook_disestablish(mifp->if_addrhooks, sc->sc_ahcookie); @@ -219,14 +243,9 @@ vxlan_multicast_cleanup(struct ifnet *if sc->sc_dhcookie); sc->sc_dhcookie = NULL; } - - if_put(mifp); } - if (imo->imo_num_memberships > 0) { - in_delmulti(imo->imo_membership[--imo->imo_num_memberships]); - imo->imo_ifidx = 0; - } + if_put(mifp); } int @@ -234,47 +253,136 @@ vxlan_multicast_join(struct ifnet *ifp, struct sockaddr *dst) { struct vxlan_softc *sc = ifp->if_softc; - struct ip_moptions *imo = &sc->sc_imo; + struct ip_moptions *imo; + struct ip6_moptions *im6o; + struct in6_multi_mship *imm; struct sockaddr_in *src4, *dst4; - struct sockaddr_in6 *dst6; + struct sockaddr_in6 *src6, *dst6; struct ifaddr *ifa; - struct ifnet*mifp; + struct ifnet*mifp = NULL; + struct rtentry *rt; + int error; - if (dst->sa_family == AF_INET) { + switch (dst->sa_family) { + case AF_INET: dst4 = satosin(dst); + src4 = satosin(src); if (!IN_MULTICAST(dst4->sin_addr.s_addr)) return (0); - } else if (dst->sa_family == AF_INET6) { + if (src4->s
Re: netinet/tcp_input.c syn_cache_get diff
On Sun, 18 Sep 2016 13:11:58 -0400 David Hill wrote:

> Hello -
>
> Make sure we keep TF_NOPUSH set if TCP_NOPUSH was set.
>
> FreeBSD has the same:
> https://github.com/freebsd/freebsd/blob/c9af4f2541fd437e0805365fbeec46d69e033310/sys/netinet/tcp_syncache.c#L860
>

Ok vgross@
Re: timeout_set_proc(9)
On Thu, 15 Sep 2016 16:29:45 +0200 Martin Pieuchot wrote:

> After discussing with a few people about a new "timed task" API I came
> to the conclusion that mixing timeouts and tasks will result in:
>
>   - always including a 'struct timeout' in a 'struct task', or the
>     other the way around
>   or
>
>   - introducing a new data structure, hence API.
>
> Since I'd like to keep the change as small as possible when converting
> existing timeout_set(9), neither option seem a good fit. So I decided
> to add a new kernel thread, curiously named "softclock", that will
> offer his stack to the poor timeout handlers that need one.
>
> With this approach, converting a timeout is just a matter of doing:
>
> 	s/timeout_set/timeout_set_proc/
>
>
> Diff below includes the conversions I need for the "netlock". I'm
> waiting for feedbacks and a better name to document the new function.
>
> Comments?

Reads OK; I like the simple renaming.

The "softclock" thread name will be confusing, the timeouts are indeed
driven by the softclock interrupt, but the tasks have nothing to do
with softclock. Maybe "timeothread" ?

Will this new thread stay, or is it only to ease the transition to MP
networking ?

> > Index: net/if_pflow.c > === > RCS file: /cvs/src/sys/net/if_pflow.c,v > retrieving revision 1.61 > diff -u -p -r1.61 if_pflow.c > --- net/if_pflow.c29 Apr 2016 08:55:03 - 1.61 > +++ net/if_pflow.c15 Sep 2016 14:19:10 - > @@ -548,15 +548,16 @@ pflow_init_timeouts(struct pflow_softc * > if (timeout_initialized(&sc->sc_tmo_tmpl)) > timeout_del(&sc->sc_tmo_tmpl); > if (!timeout_initialized(&sc->sc_tmo)) > - timeout_set(&sc->sc_tmo, pflow_timeout, sc); > + timeout_set_proc(&sc->sc_tmo, pflow_timeout, > sc); break; > case PFLOW_PROTO_10: > if (!timeout_initialized(&sc->sc_tmo_tmpl)) > - timeout_set(&sc->sc_tmo_tmpl, > pflow_timeout_tmpl, sc); > + timeout_set_proc(&sc->sc_tmo_tmpl, > pflow_timeout_tmpl, > + sc); > if (!timeout_initialized(&sc->sc_tmo)) > - timeout_set(&sc->sc_tmo, pflow_timeout, sc); > + timeout_set_proc(&sc->sc_tmo, pflow_timeout, > sc); if (!timeout_initialized(&sc->sc_tmo6)) > - timeout_set(&sc->sc_tmo6, pflow_timeout6, > sc); > + timeout_set_proc(&sc->sc_tmo6, > pflow_timeout6, sc); > timeout_add_sec(&sc->sc_tmo_tmpl, > PFLOW_TMPL_TIMEOUT); break; > Index: net/if_pfsync.c > === > RCS file: /cvs/src/sys/net/if_pfsync.c,v > retrieving revision 1.231 > diff -u -p -r1.231 if_pfsync.c > --- net/if_pfsync.c 15 Sep 2016 02:00:18 - 1.231 > +++ net/if_pfsync.c 15 Sep 2016 14:19:10 - > @@ -328,9 +328,9 @@ pfsync_clone_create(struct if_clone *ifc > IFQ_SET_MAXLEN(&ifp->if_snd, IFQ_MAXLEN); > ifp->if_hdrlen = sizeof(struct pfsync_header); > ifp->if_mtu = ETHERMTU; > - timeout_set(&sc->sc_tmo, pfsync_timeout, sc); > - timeout_set(&sc->sc_bulk_tmo, pfsync_bulk_update, sc); > - timeout_set(&sc->sc_bulkfail_tmo, pfsync_bulk_fail, sc); > + timeout_set_proc(&sc->sc_tmo, pfsync_timeout, sc); > + timeout_set_proc(&sc->sc_bulk_tmo, pfsync_bulk_update, sc); > + timeout_set_proc(&sc->sc_bulkfail_tmo, pfsync_bulk_fail, sc); > > if_attach(ifp); > if_alloc_sadl(ifp); > @@ -1723,7 +1723,7 @@ pfsync_defer(struct pf_state *st, struct > sc->sc_deferred++; > 
TAILQ_INSERT_TAIL(&sc->sc_deferrals, pd, pd_entry); > > - timeout_set(&pd->pd_tmo, pfsync_defer_tmo, pd); > + timeout_set_proc(&pd->pd_tmo, pfsync_defer_tmo, pd); > timeout_add_msec(&pd->pd_tmo, 20); > > schednetisr(NETISR_PFSYNC); > Index: netinet/ip_carp.c > === > RCS file: /cvs/src/sys/netinet/ip_carp.c,v > retrieving revision 1.293 > diff -u -p -r1.293 ip_carp.c > --- netinet/ip_carp.c 25 Jul 2016 16:44:04 - 1.293 > +++ netinet/ip_carp.c 15 Sep 2016 14:19:11 - > @@ -831,9 +831,9 @@ carp_new_vhost(struct carp_softc *sc, in > vhe->vhid = vhid; > vhe->advskew = advskew; > vhe->state = INIT; > - timeout_set(&vhe->ad_tmo, carp_send_ad, vhe); > - timeout_set(&vhe->md_tmo, carp_master_down, vhe); > - timeout_set(&vhe->md6_tmo, carp_master_down, vhe); > + timeout_set_proc(&vhe->ad_tmo, carp_send_ad, vhe); > + timeout_set_proc(&vhe->md_tmo, carp_master_down, vhe); > + timeout_set_proc(&vhe->md6_tmo, carp_master_down, vhe); > > KERNEL_ASSERT_LOCKED(); /* touching carp_vhosts */ > > Index: netinet/tcp_timer.h > === > RCS file: /cvs/src/sys/netinet/tcp_timer.
Re: ip6_setpktopt: dead code & param
On Tue, 13 Sep 2016 14:19:24 +0200 j...@wxcvbn.org (Jeremie Courreges-Anglas) wrote:

> Since it has been introduced, ip6_setpktopt has only been called with
> (sticky=1, cmsg=0) or (sticky=0, cmsg=1). Let's simplify this code.

Ok vgross@

>
>
> Index: ip6_output.c
> ===
> RCS file: /cvs/src/sys/netinet6/ip6_output.c,v
> retrieving revision 1.213
> diff -u -p -p -u -r1.213 ip6_output.c
> --- ip6_output.c	25 Aug 2016 12:30:16 -	1.213
> +++ ip6_output.c	13 Sep 2016 11:56:19 -
> @@ -119,8 +119,7 @@ struct ip6_exthdrs {
>  int ip6_pcbopt(int, u_char *, int, struct ip6_pktopts **, int, int);
>  int ip6_pcbopts(struct ip6_pktopts **, struct mbuf *, struct socket *);
>  int ip6_getpcbopt(struct ip6_pktopts *, int, struct mbuf **);
> -int ip6_setpktopt(int, u_char *, int, struct ip6_pktopts *, int, int,
> -	int, int);
> +int ip6_setpktopt(int, u_char *, int, struct ip6_pktopts *, int, int, int);
>  int ip6_setmoptions(int, struct ip6_moptions **, struct mbuf *);
>  int ip6_getmoptions(int, struct ip6_moptions *, struct mbuf **);
>  int ip6_copyexthdr(struct mbuf **, caddr_t, int);
> @@ -1770,7 +1769,7 @@ ip6_pcbopt(int optname, u_char *buf, int
>  	}
>  	opt = *pktopt;
>
> -	return (ip6_setpktopt(optname, buf, len, opt, priv, 1, 0, uproto));
> +	return (ip6_setpktopt(optname, buf, len, opt, priv, 1, uproto));
>  }
>
>  int
> @@ -2352,7 +2351,7 @@ ip6_setpktopts(struct mbuf *control, str
>  			return (EINVAL);
>  		if (cm->cmsg_level == IPPROTO_IPV6) {
>  			error = ip6_setpktopt(cm->cmsg_type, CMSG_DATA(cm),
> -			    cm->cmsg_len - CMSG_LEN(0), opt, priv, 0, 1, uproto);
> +			    cm->cmsg_len - CMSG_LEN(0), opt, priv, 0, uproto);
>  			if (error)
>  				return (error);
>  		}
> @@ -2367,39 +2366,12 @@ ip6_setpktopts(struct mbuf *control, str
>  /*
>   * Set a particular packet option, as a sticky option or an ancillary data
>   * item. "len" can be 0 only when it's a sticky option.
> - * We have 4 cases of combination of "sticky" and "cmsg":
> - * "sticky=0, cmsg=0": impossible
> - * "sticky=0, cmsg=1": RFC2292 or RFC3542 ancillary data
> - * "sticky=1, cmsg=0": RFC3542 socket option
> - * "sticky=1, cmsg=1": RFC2292 socket option
>   */
>  int
>  ip6_setpktopt(int optname, u_char *buf, int len, struct ip6_pktopts *opt,
> -    int priv, int sticky, int cmsg, int uproto)
> +    int priv, int sticky, int uproto)
>  {
>  	int minmtupolicy;
> -
> -	if (!sticky && !cmsg) {
> -#ifdef DIAGNOSTIC
> -		printf("ip6_setpktopt: impossible case\n");
> -#endif
> -		return (EINVAL);
> -	}
> -
> -	if (sticky && cmsg) {
> -		switch (optname) {
> -		case IPV6_PKTINFO:
> -		case IPV6_HOPLIMIT:
> -		case IPV6_HOPOPTS:
> -		case IPV6_DSTOPTS:
> -		case IPV6_RTHDRDSTOPTS:
> -		case IPV6_RTHDR:
> -		case IPV6_USE_MIN_MTU:
> -		case IPV6_DONTFRAG:
> -		case IPV6_TCLASS:
> -			return (ENOPROTOOPT);
> -		}
> -	}
>
>  	switch (optname) {
>  	case IPV6_PKTINFO:
>
Re: rwsleep(9)
On Tue, 13 Sep 2016 10:08:13 +0200 Martin Pieuchot wrote:
> On 12/09/16(Mon) 12:12, Vincent Gross wrote:
> > On Mon, 12 Sep 2016 10:49:03 +0200
> > Martin Pieuchot wrote:
> >
> > > I'd like to use a write lock to serialize accesses to ip_output().
> > > This will be used to guarantee that atomic code sections in the
> > > socket layer stay atomic when the input/forwarding path won't run
> > > under KERNEL_LOCK().
> > >
> > > For such purpose I'll have to convert some tsleep(9) to an
> > > msleep(9)-like function operating on a write lock. That's why I'd
> > > like to introduce rwsleep(9). I did not bother exporting a read
> > > variant of this function since I don't need it for the moment.
> > >
> > > ok?
> >
> > MP noob here :
> >
> > tsleep() and msleep() check if they are called during
> > autoconfiguration or after a panic to let interrupts run. There is
> > no such check here. I get that rwsleep() during autoconf makes
> > little sense, but to err on the safe side maybe add some kind of
> > assert (if it is not too much of a pain) ? and what about panic,
> > shouldn't this be handled ?
>
> This is not a MP problem but an old BSD heritage. I don't mind adding
> it. But that's not a real solution to panic being broken with sleep
> or locks.
>

"old BSD heritage" -> 'nuff said. No need to spread the rot then.

ok vgross@
Re: rwsleep(9)
On Mon, 12 Sep 2016 10:49:03 +0200 Martin Pieuchot wrote: > I'd like to use a write lock to serialize accesses to ip_output(). > This will be used to guarantee that atomic code sections in the > socket layer stay atomic when the input/forwarding path won't run > under KERNEL_LOCK(). > > For such purpose I'll have to convert some tsleep(9) to an > msleep(9)-like function operating on a write lock. That's why I'd > like to introduce rwsleep(9). I did not bother exporting a read > variant of this function since I don't need it for the moment. > > ok? MP noob here : tsleep() and msleep() check if they are called during autoconfiguration or after a panic to let interrupts run. There is no such check here. I get that rwsleep() during autoconf makes little sense, but to err on the safe side maybe add some kind of assert (if it is not too much of a pain) ? and what about panic, shouldn't this be handled ? > > Index: sys/kern/kern_synch.c > === > RCS file: /cvs/src/sys/kern/kern_synch.c,v > retrieving revision 1.134 > diff -u -p -r1.134 kern_synch.c > --- sys/kern/kern_synch.c 3 Sep 2016 15:06:06 - > 1.134 +++ sys/kern/kern_synch.c 12 Sep 2016 08:41:23 - > @@ -226,6 +226,40 @@ msleep(const volatile void *ident, struc > return (error); > } > > +/* > + * Same as tsleep, but if we have a rwlock provided, then once we've > + * entered the sleep queue we drop the it. After sleeping we re-lock. 
> + */ > +int > +rwsleep(const volatile void *ident, struct rwlock *wl, int priority, > +const char *wmesg, int timo) > +{ > + struct sleep_state sls; > + int error, error1; > + > + KASSERT((priority & ~(PRIMASK | PCATCH | PNORELOCK)) == 0); > + rw_assert_wrlock(wl); > + > + sleep_setup(&sls, ident, priority, wmesg); > + sleep_setup_timeout(&sls, timo); > + sleep_setup_signal(&sls, priority); > + > + rw_exit_write(wl); > + > + sleep_finish(&sls, 1); > + error1 = sleep_finish_timeout(&sls); > + error = sleep_finish_signal(&sls); > + > + if ((priority & PNORELOCK) == 0) > + rw_enter_write(wl); > + > + /* Signal errors are higher priority than timeouts. */ > + if (error == 0 && error1 != 0) > + error = error1; > + > + return (error); > +} > + > void > sleep_setup(struct sleep_state *sls, const volatile void *ident, int > prio, const char *wmesg) > Index: sys/sys/systm.h > === > RCS file: /cvs/src/sys/sys/systm.h,v > retrieving revision 1.116 > diff -u -p -r1.116 systm.h > --- sys/sys/systm.h 4 Sep 2016 09:22:29 - 1.116 > +++ sys/sys/systm.h 12 Sep 2016 08:35:26 - > @@ -246,11 +246,13 @@ int sleep_finish_signal(struct sleep_sta > void sleep_queue_init(void); > > struct mutex; > +struct rwlock; > voidwakeup_n(const volatile void *, int); > voidwakeup(const volatile void *); > #define wakeup_one(c) wakeup_n((c), 1) > int tsleep(const volatile void *, int, const char *, int); > int msleep(const volatile void *, struct mutex *, int, const > char*, int); +int rwsleep(const volatile void *, struct rwlock > *, int, const char *, int); void yield(void); > > void wdog_register(int (*)(void *, int), void *); > Index: share/man/man9/tsleep.9 > === > RCS file: /cvs/src/share/man/man9/tsleep.9,v > retrieving revision 1.10 > diff -u -p -r1.10 tsleep.9 > --- share/man/man9/tsleep.9 14 Sep 2015 15:14:55 - > 1.10 +++ share/man/man9/tsleep.9 12 Sep 2016 08:42:55 - > @@ -34,6 +34,7 @@ > .Sh NAME > .Nm tsleep , > .Nm msleep , > +.Nm rwsleep , > .Nm wakeup , > .Nm wakeup_n , > .Nm 
wakeup_one > @@ -45,6 +46,8 @@ > .Fn tsleep "void *ident" "int priority" "const char *wmesg" "int > timo" .Ft int > .Fn msleep "void *ident" "struct mutex *mtx" "int priority" "const > char *wmesg" "int timo" +.Ft int > +.Fn rwsleep "void *ident" "struct rwlock *wl" "int priority" "const > char *wmesg" "int timo" .Ft void > .Fn wakeup "void *ident" > .Ft void > @@ -53,9 +56,10 @@ > .Fn wakeup_one "void *ident" > .Sh DESCRIPTION > These functions implement voluntary context switching. > -.Fn tsleep > -and > +.Fn tsleep , > .Fn msleep > +and > +.Fn rwsleep > are used throughout the kernel whenever processing in the current > context cannot continue for any of the following reasons: > .Bl -bullet -offset indent > @@ -146,6 +150,22 @@ argument. > .El > .Pp > The > +.Fn rwsleep > +function behaves just like > +.Fn tsleep , > +but takes an additional argument: > +.Bl -tag -width priority > +.It Fa wl > +A write lock that will be unlocked when the process is safely > +on the sleep queue. > +The write lock will be relocked at the end of rwsleep unless the > +.Dv PNORELOCK > +flag is set in the > +.Fa priority > +argument. > +.El > +.Pp > +The > .Fn wakeup > function will mark all proce
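To make the ordering in the proposed rwsleep() easier to follow, here is a minimal userspace model of its control flow. It is only a sketch, not kernel code: struct model_rwlock, rwsleep_model() and the MODEL_* flag values are hypothetical stand-ins, and the actual sleep is replaced by two caller-supplied results (one for the signal, one for the timeout). Only the lock hand-off and the error priority from the diff are reproduced.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values for this model only; the kernel has its own. */
#define MODEL_PRIMASK	0x0ff
#define MODEL_PCATCH	0x100
#define MODEL_PNORELOCK	0x200

/* Stand-in for struct rwlock: only tracks whether the write lock is held. */
struct model_rwlock {
	int write_held;
};

int
rwsleep_model(struct model_rwlock *wl, int priority, int sig_error,
    int timo_error)
{
	int error, error1;

	/* Same entry checks as the diff: known flags only, write lock held. */
	assert((priority &
	    ~(MODEL_PRIMASK | MODEL_PCATCH | MODEL_PNORELOCK)) == 0);
	assert(wl->write_held);

	/* The real code is already on the sleep queue here; drop the lock. */
	wl->write_held = 0;

	/* "Sleep", then collect the timeout and signal results. */
	error1 = timo_error;
	error = sig_error;

	/* Re-take the write lock unless the caller opted out. */
	if ((priority & MODEL_PNORELOCK) == 0)
		wl->write_held = 1;

	/* Signal errors are higher priority than timeouts. */
	if (error == 0 && error1 != 0)
		error = error1;

	return (error);
}
```

With a lock that starts out held, rwsleep_model(&wl, 0, 0, EWOULDBLOCK) returns EWOULDBLOCK with the lock held again, while passing MODEL_PNORELOCK leaves wl.write_held at 0, which is the behaviour the man page hunk describes for PNORELOCK.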
in6_selectroute should never get AF_INET filled struct route *
in6_selectroute() checks whether the struct route it received contains a
valid route whose AF is not AF_INET6, "in case the cache is shared". Well,
is this cache shared or not ?

There are only two ways to get to in6_selectroute() :

1) in6_pcbselsrc() -> in6_selectif() -> in6_selectroute()

   It is trivial to check that only inet6 is handled here, and that any
   other AF is obviously an error.

2) ip6_output() -> in6_selectroute()

   a. If the struct route * arg of ip6_output() is NULL, then ip6_output()
      zeroes a struct route on the stack; it will never be valid, thus
      there is no need to check its AF.

   b. If the struct route * arg is not NULL, it is passed on to
      in6_selectroute(). ip6_output() is called with a non-NULL
      struct route * in 5 places only:

netinet/tcp_output.c:1124: error = ip6_output(m, tp->t_inpcb->inp_outputopts6,
netinet/tcp_output.c-1125- &tp->t_inpcb->inp_route6,
netinet/tcp_output.c-1126- 0, NULL, tp->t_inpcb);
netinet/tcp_subr.c:399: ip6_output(m, tp ? tp->t_inpcb->inp_outputopts6 : NULL,
netinet/tcp_subr.c-400- tp ? &tp->t_inpcb->inp_route6 : NULL,
netinet/tcp_subr.c-401- 0, NULL,
netinet/tcp_subr.c-402- tp ? tp->t_inpcb : NULL);
netinet/tcp_input.c:4386: error = ip6_output(m, NULL /*XXX*/, &sc->sc_route6, 0,
netinet/tcp_input.c-4387- NULL, NULL);
netinet6/ip6_divert.c:167: error = ip6_output(m, NULL, &inp->inp_route6,
netinet6/ip6_divert.c-168- IP_ALLOWBROADCAST | IP_RAWOUTPUT, NULL, NULL);
netinet6/raw_ip6.c:457: error = ip6_output(m, optp, &in6p->inp_route6, flags,
netinet6/raw_ip6.c-458- in6p->inp_moptions6, in6p);

Each time, the struct route is only used in an inet6 context.

I think it is safe to add this KASSERT() to in6_selectroute(). A few other
things can be tightened, they will be addressed later.

Ok ?
Index: netinet6/in6_src.c
===
RCS file: /cvs/src/sys/netinet6/in6_src.c,v
retrieving revision 1.79
diff -u -p -r1.79 in6_src.c
--- netinet6/in6_src.c	4 Aug 2016 20:46:24 -	1.79
+++ netinet6/in6_src.c	2 Sep 2016 09:17:10 -
@@ -302,13 +302,13 @@ in6_selectroute(struct sockaddr_in6 *dst
 
 	/*
 	 * Use a cached route if it exists and is valid, else try to allocate
-	 * a new one. Note that we should check the address family of the
-	 * cached destination, in case of sharing the cache with IPv4.
+	 * a new one.
 	 */
 	if (ro) {
+		if (rtisvalid(ro->ro_rt))
+			KASSERT(sin6tosa(&ro->ro_dst)->sa_family == AF_INET6);
 		if (!rtisvalid(ro->ro_rt) ||
-		    sin6tosa(&ro->ro_dst)->sa_family != AF_INET6 ||
-		    !IN6_ARE_ADDR_EQUAL(&ro->ro_dst.sin6_addr, dst)) {
+		    !IN6_ARE_ADDR_EQUAL(&ro->ro_dst.sin6_addr, dst)) {
 			rtfree(ro->ro_rt);
 			ro->ro_rt = NULL;
 		}
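The effect of the change on the cache test can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel code: struct model_route and model_cache_usable() are hypothetical names, the valid field stands in for rtisvalid(ro->ro_rt), and assert() stands in for KASSERT().

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical, simplified stand-in for the cached IPv6 route. */
struct model_route {
	int		valid;		/* models rtisvalid(ro->ro_rt) */
	int		dst_family;	/* models the sa_family of ro->ro_dst */
	unsigned char	dst_addr[16];	/* models ro->ro_dst.sin6_addr */
};

/*
 * Return 1 if the cached route can be reused for dst, 0 if it has to
 * be dropped and looked up again.
 */
int
model_cache_usable(const struct model_route *ro, const unsigned char dst[16])
{
	/* New in the diff: a valid cached route must already be inet6. */
	if (ro->valid)
		assert(ro->dst_family == AF_INET6);

	/* What remains of the old three-way test. */
	if (!ro->valid || memcmp(ro->dst_addr, dst, 16) != 0)
		return (0);
	return (1);
}
```

The address family is no longer a reason to silently drop the cache: an invalid route is discarded whatever its family, and a valid route with the wrong family becomes a hard failure instead of a quiet re-lookup.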
Re: Let iked specify its source address when sending
Objections anyone ?

On Wed, 31 Aug 2016 15:57:45 +0200 Vincent Gross wrote:
> On Wed, 31 Aug 2016 15:26:53 +0200
> Vincent Gross wrote:
>
> > On Thu, 11 Aug 2016 16:57:27 +0100
> > Stuart Henderson wrote:
> >
> > > On 2016/06/27 13:00, Jérémie Courrèges-Anglas wrote:
> > [...]
> > > >
> > > > I also gave my ok to vgross by IM.
> > > >
> > > > I know that some concerns have been exposed privately, I was not
> > > > Cc'd, thus I have no idea what is the current status of that
> > > > discussion. To the people concerned, please keep me / us
> > > > updated about that discussion and Cc us.
> > >
> > > How are things looking with IP_SENDSRCADDR now, are there any
> > > remaining concerns that need fixing before it could be committed?
> > > (Also if anyone has a share-able diff to use this with iked it
> > > would be quite handy..)
> > >
> >
> > Tested locally with two iked on two distinct rdomains plus a bit of
> > LD_PRELOAD goop. Unfortunately I couldn't ping from one rdom to the
> > other, but I also have this problem without my patch, so I am
> > confident this ping problem is unrelated.
> >
> > I would be very grateful if someone could test this.
> > > > Take two, unmangled version : > > Index: sbin/iked/iked.h > === > RCS file: /cvs/src/sbin/iked/iked.h,v > retrieving revision 1.96 > diff -u -p -r1.96 iked.h > --- sbin/iked/iked.h 1 Jun 2016 11:16:41 - 1.96 > +++ sbin/iked/iked.h 31 Aug 2016 13:19:10 - > @@ -898,6 +898,8 @@ intsocket_setport(struct sockaddr *, i > int socket_getaddr(int, struct sockaddr_storage *); > int socket_bypass(int, struct sockaddr *); > int udp_bind(struct sockaddr *, in_port_t); > +ssize_t sendtofrom(int, void *, size_t, int, struct sockaddr > *, > + socklen_t, struct sockaddr *, socklen_t); > ssize_t recvfromto(int, void *, size_t, int, struct sockaddr > *, socklen_t *, struct sockaddr *, socklen_t *); > const char * > Index: sbin/iked/ikev2_msg.c > === > RCS file: /cvs/src/sbin/iked/ikev2_msg.c,v > retrieving revision 1.45 > diff -u -p -r1.45 ikev2_msg.c > --- sbin/iked/ikev2_msg.c 19 Oct 2015 11:25:35 - > 1.45 +++ sbin/iked/ikev2_msg.c31 Aug 2016 13:19:10 - > @@ -319,9 +319,11 @@ ikev2_msg_send(struct iked *env, struct > msg->msg_offset += sizeof(natt); > } > > - if ((sendto(msg->msg_fd, ibuf_data(buf), ibuf_size(buf), 0, > - (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen)) == > -1) { > - log_warn("%s: sendto", __func__); > + if (sendtofrom(msg->msg_fd, ibuf_data(buf), ibuf_size(buf), > 0, > + (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen, > + (struct sockaddr *)&msg->msg_local, msg->msg_locallen) < > + ibuf_size(buf)) { > + log_warn("%s: sendtofrom", __func__); > return (-1); > } > > @@ -969,10 +971,12 @@ int > ikev2_msg_retransmit_response(struct iked *env, struct iked_sa *sa, > struct iked_message *msg) > { > - if ((sendto(msg->msg_fd, ibuf_data(msg->msg_data), > - ibuf_size(msg->msg_data), 0, (struct sockaddr > *)&msg->msg_peer, > - msg->msg_peerlen)) == -1) { > - log_warn("%s: sendto", __func__); > + if (sendtofrom(msg->msg_fd, ibuf_data(msg->msg_data), > + ibuf_size(msg->msg_data), 0, > + (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen, > + (struct sockaddr 
*)&msg->msg_local, msg->msg_locallen) < > + ibuf_size(msg->msg_data)) { > + log_warn("%s: sendtofrom", __func__); > return (-1); > } > > @@ -996,11 +1000,12 @@ ikev2_msg_retransmit_timeout(struct iked > struct iked_sa *sa = msg->msg_sa; > > if (msg->msg_tries < IKED_RETRANSMIT_TRIES) { > - if ((sendto(msg->msg_fd, ibuf_data(msg->msg_data), > + if (sendtofrom(msg->msg_fd, ibuf_data(msg->msg_data), > ibuf_size(msg->msg_data), 0, > - (struct sockaddr *)&msg->msg_peer, > - msg->msg_peerlen)) == -1) { > - log_warn("%s: sendto", __func__); > + (struct sockaddr *)&msg->msg_peer, > msg->
Re: Drop IPSec traffic that should be encapsulated but is not
On Thu, 1 Sep 2016 18:02:14 +0200 Claer wrote:
> Hello,
>
> In some production systems, I'm still using an old patch to isakmpd
> for NAT-T.
> When negotiating SAs with ASA peers and OpenBSD is NATed, you have
> issues during negotiation. The following discussions explain the issue
>
> http://openbsd.7691.n7.nabble.com/isakmpd-NAT-T-interoperability-td173004.html
> http://marc.info/?l=openbsd-tech&m=139140140105433&w=2
>
> I think the patch is related to the parts of the code you are working
> on.
>

Actually it is not. The issue you are referencing is in isakmpd, whereas
the diff below is in the OpenBSD kernel. Totally different stuff.

I do not plan to look at isakmpd at the moment, as it only supports
IKEv1, and its code is nearly twice the size of iked.

I do not have Cisco gear available to test, is this issue present when
opening NAT-T tunnels with iked ?

Cheers

> Would you mind looking at this issue also? :)
>
> Thanks!
>
> Claer
>
> On Thu, Sep 01 2016 at 31:10, Vincent Gross wrote:
>
> > Our IPSec stack rejects UDP-encapsulated traffic using a non
> > encapsulating SA, but not the other way around. This diff adds
> > the missing check and the corresponding stat counter.
> >
> > Ok ?
> > > > Index: sys/netinet/ip_esp.h > > === > > RCS file: /cvs/src/sys/netinet/ip_esp.h,v > > retrieving revision 1.42 > > diff -u -p -r1.42 ip_esp.h > > --- sys/netinet/ip_esp.h10 Jan 2010 12:43:07 - > > 1.42 +++ sys/netinet/ip_esp.h 1 Sep 2016 08:24:15 - > > @@ -62,6 +62,7 @@ struct espstat > > u_int32_t esps_udpencin; /* Input ESP-in-UDP packets */ > > u_int32_t esps_udpencout; /* Output ESP-in-UDP packets > > */ u_int32_tesps_udpinval; /* Invalid input ESP-in-UDP > > packets */ > > +u_int32_t esps_udpneeded; /* Trying to use a ESP-in-UDP > > TDB */ }; > > > > /* > > Index: sys/netinet/ipsec_input.c > > === > > RCS file: /cvs/src/sys/netinet/ipsec_input.c,v > > retrieving revision 1.135 > > diff -u -p -r1.135 ipsec_input.c > > --- sys/netinet/ipsec_input.c 10 Sep 2015 17:52:05 > > - 1.135 +++ sys/netinet/ipsec_input.c 1 Sep 2016 > > 08:24:16 - @@ -262,6 +262,16 @@ ipsec_common_input(struct mbuf > > *m, int s return EINVAL; > > } > > > > + if (!udpencap && (tdbp->tdb_flags & TDBF_UDPENCAP)) { > > + splx(s); > > + DPRINTF(("ipsec_common_input(): attempted to use > > udpencap " > > + "SA %s/%08x/%u\n", ipsp_address(&dst_address, > > buf, > > + sizeof(buf)), ntohl(spi), tdbp->tdb_sproto)); > > + m_freem(m); > > + espstat.esps_udpneeded++; > > + return EINVAL; > > + } > > + > > if (tdbp->tdb_xform == NULL) { > > splx(s); > > DPRINTF(("ipsec_common_input(): attempted to use > > uninitialized " Index: usr.bin/netstat/inet.c > > === > > RCS file: /cvs/src/usr.bin/netstat/inet.c,v > > retrieving revision 1.150 > > diff -u -p -r1.150 inet.c > > --- usr.bin/netstat/inet.c 27 Aug 2016 04:13:43 - > > 1.150 +++ usr.bin/netstat/inet.c1 Sep 2016 08:24:16 - > > @@ -1073,6 +1073,7 @@ esp_stats(char *name) > > p(esps_udpencin, "\t%u input UDP encapsulated ESP > > packet%s\n"); p(esps_udpencout, "\t%u output UDP encapsulated ESP > > packet%s\n"); p(esps_udpinval, "\t%u UDP packet%s for > > non-encapsulating TDB received\n"); > > + p(esps_udpneeded, "\t%u raw ESP packet%s for 
encapsulating > > TDB received\n"); p(esps_ibytes, "\t%llu input byte%s\n"); > > p(esps_obytes, "\t%llu output byte%s\n"); > > > >
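The symmetric check discussed in this thread can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the kernel code: model_check_udpencap(), struct model_espstat and MODEL_TDBF_UDPENCAP are hypothetical names, and the first branch only models the check that the thread says already exists; the second branch is the one the diff adds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value for this model; the kernel has TDBF_UDPENCAP. */
#define MODEL_TDBF_UDPENCAP	0x1

/* Stand-in for the two relevant espstat counters. */
struct model_espstat {
	unsigned int udpinval;	/* UDP packet for a non-encapsulating SA */
	unsigned int udpneeded;	/* raw ESP packet for an encapsulating SA */
};

/* Return 0 if packet and SA agree on encapsulation, -1 if it is dropped. */
int
model_check_udpencap(int udpencap, unsigned int tdb_flags,
    struct model_espstat *st)
{
	int sa_wants_udp = (tdb_flags & MODEL_TDBF_UDPENCAP) != 0;

	assert(st != NULL);

	/* Existing direction: ESP-in-UDP arrives for a plain ESP SA. */
	if (udpencap && !sa_wants_udp) {
		st->udpinval++;
		return (-1);
	}

	/* Direction added by the diff: plain ESP arrives for a UDP SA. */
	if (!udpencap && sa_wants_udp) {
		st->udpneeded++;
		return (-1);
	}

	return (0);
}
```

Keeping one counter per direction, as the diff does with esps_udpneeded next to esps_udpinval, lets netstat show which side of the mismatch is being hit.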
NAT-on-enc on iked(8)
This diff adds the missing bits to support NAT-on-enc in iked(8). See OUTGOING NETWORK ADDRESS TRANSLATION in iked.conf(5), and also http://undeadly.org/cgi?action=article&sid=20090127205841. Ok ? diff --git sbin/iked/iked.h sbin/iked/iked.h index aa40d70..dfa04ad 100644 --- sbin/iked/iked.h +++ sbin/iked/iked.h @@ -140,6 +140,7 @@ struct iked_flow { struct iked_addr flow_src; struct iked_addr flow_dst; unsigned int flow_dir; /* in/out */ + struct iked_addr flow_prenat; unsigned int flow_loaded; /* pfkey done */ diff --git sbin/iked/parse.y sbin/iked/parse.y index c93a978..e3e7c29 100644 --- sbin/iked/parse.y +++ sbin/iked/parse.y @@ -2418,7 +2418,7 @@ create_ike(char *name, int af, uint8_t ipproto, struct ipsec_hosts *hosts, { char idstr[IKED_ID_SIZE]; unsigned int idtype = IKEV2_ID_NONE; - struct ipsec_addr_wrap *ipa, *ipb; + struct ipsec_addr_wrap *ipa, *ipb, *ippn; struct iked_policy pol; struct iked_proposal prop[2]; unsigned int j; @@ -2640,6 +2640,17 @@ create_ike(char *name, int af, uint8_t ipproto, struct ipsec_hosts *hosts, flows[j].flow_dst.addr_net = ipb->netaddress; flows[j].flow_dst.addr_port = hosts->dport; + ippn = ipa->srcnat; + if (ippn) { + memcpy(&flows[j].flow_prenat.addr, &ippn->address, + sizeof(ippn->address)); + flows[j].flow_prenat.addr_af = ippn->af; + flows[j].flow_prenat.addr_mask = ippn->mask; + flows[j].flow_prenat.addr_net = ippn->netaddress; + } else { + flows[j].flow_prenat.addr_af = 0; + } + flows[j].flow_ipproto = ipproto; pol.pol_nflows++; diff --git sbin/iked/pfkey.c sbin/iked/pfkey.c index 72c2d31..20ca4aa 100644 --- sbin/iked/pfkey.c +++ sbin/iked/pfkey.c @@ -173,6 +173,7 @@ int pfkey_flow(int sd, uint8_t satype, uint8_t action, struct iked_flow *flow) { struct sadb_msg smsg; + struct iked_addr*flow_src, *flow_dst; struct sadb_address sa_src, sa_dst, sa_local, sa_peer, sa_smask, sa_dmask; struct sadb_protocol sa_flowtype, sa_protocol; @@ -183,56 +184,75 @@ pfkey_flow(int sd, uint8_t satype, uint8_t action, struct iked_flow 
*flow) sa_srcid = sa_dstid = NULL; + flow_src = &flow->flow_src; + flow_dst = &flow->flow_dst; + + if (flow->flow_prenat.addr_af == flow_src->addr_af) { + switch (flow->flow_type) { + case SADB_X_FLOW_TYPE_USE: + flow_dst = &flow->flow_prenat; + break; + case SADB_X_FLOW_TYPE_REQUIRE: + flow_src = &flow->flow_prenat; + break; + case 0: + if (flow->flow_dir == IPSP_DIRECTION_IN) + flow_dst = &flow->flow_prenat; + else + flow_src = &flow->flow_prenat; + } + } + bzero(&ssrc, sizeof(ssrc)); bzero(&smask, sizeof(smask)); - memcpy(&ssrc, &flow->flow_src.addr, sizeof(ssrc)); - memcpy(&smask, &flow->flow_src.addr, sizeof(smask)); - socket_af((struct sockaddr *)&ssrc, flow->flow_src.addr_port); - socket_af((struct sockaddr *)&smask, flow->flow_src.addr_port ? + memcpy(&ssrc, &flow_src->addr, sizeof(ssrc)); + memcpy(&smask, &flow_src->addr, sizeof(smask)); + socket_af((struct sockaddr *)&ssrc, flow_src->addr_port); + socket_af((struct sockaddr *)&smask, flow_src->addr_port ? 0x : 0); - switch (flow->flow_src.addr_af) { + switch (flow_src->addr_af) { case AF_INET: ((struct sockaddr_in *)&smask)->sin_addr.s_addr = - prefixlen2mask(flow->flow_src.addr_net ? - flow->flow_src.addr_mask : 32); + prefixlen2mask(flow_src->addr_net ? + flow_src->addr_mask : 32); break; case AF_INET6: - prefixlen2mask6(flow->flow_src.addr_net ? - flow->flow_src.addr_mask : 128, + prefixlen2mask6(flow_src->addr_net ? + flow_src->addr_mask : 128, (uint32_t *)((struct sockaddr_in6 *) &smask)->sin6_addr.s6_addr); break; default: log_warnx("%s: unsupported address family %d", - __func__, flow->flow_src.addr_af); + __func__, flow_src->addr_af); return (-1); } smask.ss_len = ssrc.ss_len;
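The address selection at the top of the pfkey_flow() hunk can be isolated as a small userspace sketch. The names here (struct model_flow, model_select_addrs() and the MODEL_* constants) are hypothetical stand-ins for the iked structures and the SADB_X_FLOW_TYPE_* / IPSP_DIRECTION_* values; only the decision logic from the diff is reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants for this model only. */
enum model_flow_type {
	MODEL_TYPE_UNSPEC = 0,
	MODEL_TYPE_USE,
	MODEL_TYPE_REQUIRE
};
enum model_flow_dir { MODEL_DIR_IN = 1, MODEL_DIR_OUT = 2 };

struct model_addr {
	int		 af;	/* 0 means "not set", as in the parse.y hunk */
	const char	*label;
};

struct model_flow {
	struct model_addr	src, dst, prenat;
	enum model_flow_type	type;
	enum model_flow_dir	dir;
};

/* Pick the addresses that would go into the flow sent to the kernel. */
void
model_select_addrs(const struct model_flow *flow,
    const struct model_addr **srcp, const struct model_addr **dstp)
{
	const struct model_addr *src = &flow->src, *dst = &flow->dst;

	assert(srcp != NULL && dstp != NULL);

	/* Only substitute when a pre-NAT address of the same family is set. */
	if (flow->prenat.af == src->af) {
		switch (flow->type) {
		case MODEL_TYPE_USE:
			dst = &flow->prenat;
			break;
		case MODEL_TYPE_REQUIRE:
			src = &flow->prenat;
			break;
		case MODEL_TYPE_UNSPEC:
			if (flow->dir == MODEL_DIR_IN)
				dst = &flow->prenat;
			else
				src = &flow->prenat;
			break;
		}
	}

	*srcp = src;
	*dstp = dst;
}
```

In this model the pre-NAT network replaces the local side of the flow: the source for outgoing (require) flows and the destination for incoming (use) flows. A flow without srcnat has prenat.af set to 0 and is left untouched.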
Drop IPSec traffic that should be encapsulated but is not
Our IPSec stack rejects UDP-encapsulated traffic using a non
encapsulating SA, but not the other way around. This diff adds
the missing check and the corresponding stat counter.

Ok ?

Index: sys/netinet/ip_esp.h
===
RCS file: /cvs/src/sys/netinet/ip_esp.h,v
retrieving revision 1.42
diff -u -p -r1.42 ip_esp.h
--- sys/netinet/ip_esp.h	10 Jan 2010 12:43:07 -	1.42
+++ sys/netinet/ip_esp.h	1 Sep 2016 08:24:15 -
@@ -62,6 +62,7 @@ struct espstat
 	u_int32_t	esps_udpencin;	/* Input ESP-in-UDP packets */
 	u_int32_t	esps_udpencout;	/* Output ESP-in-UDP packets */
 	u_int32_t	esps_udpinval;	/* Invalid input ESP-in-UDP packets */
+	u_int32_t	esps_udpneeded;	/* Trying to use a ESP-in-UDP TDB */
 };
 
 /*
Index: sys/netinet/ipsec_input.c
===
RCS file: /cvs/src/sys/netinet/ipsec_input.c,v
retrieving revision 1.135
diff -u -p -r1.135 ipsec_input.c
--- sys/netinet/ipsec_input.c	10 Sep 2015 17:52:05 -	1.135
+++ sys/netinet/ipsec_input.c	1 Sep 2016 08:24:16 -
@@ -262,6 +262,16 @@ ipsec_common_input(struct mbuf *m, int s
 		return EINVAL;
 	}
 
+	if (!udpencap && (tdbp->tdb_flags & TDBF_UDPENCAP)) {
+		splx(s);
+		DPRINTF(("ipsec_common_input(): attempted to use udpencap "
+		    "SA %s/%08x/%u\n", ipsp_address(&dst_address, buf,
+		    sizeof(buf)), ntohl(spi), tdbp->tdb_sproto));
+		m_freem(m);
+		espstat.esps_udpneeded++;
+		return EINVAL;
+	}
+
 	if (tdbp->tdb_xform == NULL) {
 		splx(s);
 		DPRINTF(("ipsec_common_input(): attempted to use uninitialized "
Index: usr.bin/netstat/inet.c
===
RCS file: /cvs/src/usr.bin/netstat/inet.c,v
retrieving revision 1.150
diff -u -p -r1.150 inet.c
--- usr.bin/netstat/inet.c	27 Aug 2016 04:13:43 -	1.150
+++ usr.bin/netstat/inet.c	1 Sep 2016 08:24:16 -
@@ -1073,6 +1073,7 @@ esp_stats(char *name)
 	p(esps_udpencin, "\t%u input UDP encapsulated ESP packet%s\n");
 	p(esps_udpencout, "\t%u output UDP encapsulated ESP packet%s\n");
 	p(esps_udpinval, "\t%u UDP packet%s for non-encapsulating TDB received\n");
+	p(esps_udpneeded, "\t%u raw ESP packet%s for encapsulating TDB received\n");
 	p(esps_ibytes, "\t%llu input byte%s\n");
 	p(esps_obytes, "\t%llu output byte%s\n");
 
Re: Let iked specify its source address when sending
On Wed, 31 Aug 2016 16:09:30 +0200 Reyk Floeter wrote:
> On Wed, Aug 31, 2016 at 03:26:53PM +0200, Vincent Gross wrote:
> > On Thu, 11 Aug 2016 16:57:27 +0100
> > Stuart Henderson wrote:
> >
> > > On 2016/06/27 13:00, Jérémie Courrèges-Anglas wrote:
> > [...]
> > > >
> > > > I also gave my ok to vgross by IM.
> > > >
> > > > I know that some concerns have been exposed privately, I was not
> > > > Cc'd, thus I have no idea what is the current status of that
> > > > discussion. To the people concerned, please keep me / us
> > > > updated about that discussion and Cc us.
> > >
> > > How are things looking with IP_SENDSRCADDR now, are there any
> > > remaining concerns that need fixing before it could be committed?
> > > (Also if anyone has a share-able diff to use this with iked it
> > > would be quite handy..)
> > >
> >
> > Tested locally with two iked on two distinct rdomains plus a bit of
> > LD_PRELOAD goop. Unfortunately I couldn't ping from one rdom to the
> > other, but I also have this problem without my patch, so I am
> > confident this ping problem is unrelated.
> >
> > I would be very grateful if someone could test this.
> >
>
> I don't know why you need LD_PRELOAD.
>

I wanted to use different key sets for each iked instance.

> When testing iked with two different rdomains, you have to create an
> enc(4) device per rdomain, or no ipsec traffic will flow. enc0 is for
> rdomain 0 only.
>
> # ifconfig enc1 rdomain up
> # ifconfig enc2 rdomain up
> # route -T 1 exec iked -ddvvf /etc/iked.conf.1
> # route -T 2 exec iked -ddvvf /etc/iked.conf.2

Well, look at what you did, I understood why my pings wouldn't go
through !

Hm, turns out I can send esp'ed data, when using the not-default
address, but I can't receive yet.
This is my test bench :

-- A side:

$ doas ifconfig pair10 rdomain 10 10.124.0.10/24 up
$ doas ifconfig enc10 rdomain 10 up
$ doas ifconfig vether10 rdomain 10 10.123.10.1/24 up
$ doas route -T 10 add 10.123.11.0/24 10.124.0.11
$ cat iked.a.conf
ikev2 active esp from 10.123.10.0/24 to 10.124.0.11 \
	local 10.123.10.1 peer 10.124.0.11 \
	srcid a.test dstid b.test

"route -T 10 exec ipsecctl -s all" output :

FLOWS:
flow esp in from 10.124.0.11 to 10.123.10.0/24 peer 10.124.0.11 srcid FQDN/a.test dstid FQDN/b.test type use
flow esp out from 10.123.10.0/24 to 10.124.0.11 peer 10.124.0.11 srcid FQDN/a.test dstid FQDN/b.test type require
flow esp out from ::/0 to ::/0 type deny

SAD:
esp tunnel from 10.124.0.11 to 10.124.0.10 spi 0x2ebe4b1b auth hmac-sha2-256 enc aes-256
esp tunnel from 10.124.0.10 to 10.124.0.11 spi 0x3c4b29c3 auth hmac-sha2-256 enc aes-256

-- B side:

$ doas ifconfig pair11 rdomain 11 10.124.0.11/24 up
$ doas ifconfig enc11 rdomain 11 up
$ doas ifconfig vether11 rdomain 11 10.123.11.1/24 up
$ doas route -T 11 add 10.123.10.0/24 10.124.0.10
$ cat iked.b.conf
ikev2 active esp from 10.124.0.11 to 10.123.10.0/24 \
	local 10.124.0.11 peer 10.123.10.1 \
	srcid b.test dstid a.test

"route -T 11 exec ipsecctl -s all" output :

FLOWS:
flow esp in from 10.123.10.0/24 to 10.124.0.11 peer 10.124.0.10 srcid FQDN/b.test dstid FQDN/a.test type use
flow esp out from 10.124.0.11 to 10.123.10.0/24 peer 10.124.0.10 srcid FQDN/b.test dstid FQDN/a.test type require
flow esp out from ::/0 to ::/0 type deny

SAD:
esp tunnel from 10.124.0.11 to 10.124.0.10 spi 0x2ebe4b1b auth hmac-sha2-256 enc aes-256
esp tunnel from 10.124.0.10 to 10.124.0.11 spi 0x3c4b29c3 auth hmac-sha2-256 enc aes-256

-- The fun part:

run "tcpdump -ni pair10", then "route -T 10 exec ping -I 10.123.10.1 10.124.0.11" :

...
17:26:24.185391 esp 10.124.0.10 > 10.124.0.11 spi 0x3c4b29c3 seq 36 len 136
17:26:24.185797 10.124.0.11.4500 > 10.124.0.10.4500:udpencap: esp 10.124.0.11 > 10.124.0.10 spi 0x2ebe4b1b seq 36 len 136
17:26:25.190350 esp 10.124.0.10 > 10.124.0.11 spi 0x3c4b29c3 seq 37 len 136
17:26:25.190680 10.124.0.11.4500 > 10.124.0.10.4500:udpencap: esp 10.124.0.11 > 10.124.0.10 spi 0x2ebe4b1b seq 37 len 136
17:26:26.190344 esp 10.124.0.10 > 10.124.0.11 spi 0x3c4b29c3 seq 38 len 136
17:26:26.190701 10.124.0.11.4500 > 10.124.0.10.4500:udpencap: esp 10.124.0.11 > 10.124.0.10 spi 0x2ebe4b1b seq 38 len 136
...

The udpencap'd return traffic is not picked up by enc10, so your ping
replies are lost ...
Re: Let iked specify its source address when sending
On Wed, 31 Aug 2016 15:26:53 +0200 Vincent Gross wrote:
> On Thu, 11 Aug 2016 16:57:27 +0100
> Stuart Henderson wrote:
>
> > On 2016/06/27 13:00, Jérémie Courrèges-Anglas wrote:
> [...]
> > >
> > > I also gave my ok to vgross by IM.
> > >
> > > I know that some concerns have been exposed privately, I was not
> > > Cc'd, thus I have no idea what is the current status of that
> > > discussion. To the people concerned, please keep me / us updated
> > > about that discussion and Cc us.
> >
> > How are things looking with IP_SENDSRCADDR now, are there any
> > remaining concerns that need fixing before it could be committed?
> > (Also if anyone has a share-able diff to use this with iked it
> > would be quite handy..)
> >
>
> Tested locally with two iked on two distinct rdomains plus a bit of
> LD_PRELOAD goop. Unfortunately I couldn't ping from one rdom to the
> other, but I also have this problem without my patch, so I am
> confident this ping problem is unrelated.
>
> I would be very grateful if someone could test this.
> Take two, unmangled version : Index: sbin/iked/iked.h === RCS file: /cvs/src/sbin/iked/iked.h,v retrieving revision 1.96 diff -u -p -r1.96 iked.h --- sbin/iked/iked.h1 Jun 2016 11:16:41 - 1.96 +++ sbin/iked/iked.h31 Aug 2016 13:19:10 - @@ -898,6 +898,8 @@ int socket_setport(struct sockaddr *, i int socket_getaddr(int, struct sockaddr_storage *); int socket_bypass(int, struct sockaddr *); int udp_bind(struct sockaddr *, in_port_t); +ssize_t sendtofrom(int, void *, size_t, int, struct sockaddr *, + socklen_t, struct sockaddr *, socklen_t); ssize_t recvfromto(int, void *, size_t, int, struct sockaddr *, socklen_t *, struct sockaddr *, socklen_t *); const char * Index: sbin/iked/ikev2_msg.c === RCS file: /cvs/src/sbin/iked/ikev2_msg.c,v retrieving revision 1.45 diff -u -p -r1.45 ikev2_msg.c --- sbin/iked/ikev2_msg.c 19 Oct 2015 11:25:35 - 1.45 +++ sbin/iked/ikev2_msg.c 31 Aug 2016 13:19:10 - @@ -319,9 +319,11 @@ ikev2_msg_send(struct iked *env, struct msg->msg_offset += sizeof(natt); } - if ((sendto(msg->msg_fd, ibuf_data(buf), ibuf_size(buf), 0, - (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen)) == -1) { - log_warn("%s: sendto", __func__); + if (sendtofrom(msg->msg_fd, ibuf_data(buf), ibuf_size(buf), 0, + (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen, + (struct sockaddr *)&msg->msg_local, msg->msg_locallen) < + ibuf_size(buf)) { + log_warn("%s: sendtofrom", __func__); return (-1); } @@ -969,10 +971,12 @@ int ikev2_msg_retransmit_response(struct iked *env, struct iked_sa *sa, struct iked_message *msg) { - if ((sendto(msg->msg_fd, ibuf_data(msg->msg_data), - ibuf_size(msg->msg_data), 0, (struct sockaddr *)&msg->msg_peer, - msg->msg_peerlen)) == -1) { - log_warn("%s: sendto", __func__); + if (sendtofrom(msg->msg_fd, ibuf_data(msg->msg_data), + ibuf_size(msg->msg_data), 0, + (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen, + (struct sockaddr *)&msg->msg_local, msg->msg_locallen) < + ibuf_size(msg->msg_data)) { + log_warn("%s: sendtofrom", __func__); 
return (-1); } @@ -996,11 +1000,12 @@ ikev2_msg_retransmit_timeout(struct iked struct iked_sa *sa = msg->msg_sa; if (msg->msg_tries < IKED_RETRANSMIT_TRIES) { - if ((sendto(msg->msg_fd, ibuf_data(msg->msg_data), + if (sendtofrom(msg->msg_fd, ibuf_data(msg->msg_data), ibuf_size(msg->msg_data), 0, - (struct sockaddr *)&msg->msg_peer, - msg->msg_peerlen)) == -1) { - log_warn("%s: sendto", __func__); + (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen, + (struct sockaddr *)&msg->msg_local, msg->msg_locallen) < + ibuf_size(msg->msg_data)) { + log_warn("%s: sendtofrom", __func__); sa_free(env, sa); return; } Index: sbin/iked/util.c === RCS file: /cvs/src/sbin/iked/util.c,v retrieving revision 1.30 diff -u -p -r1.30 util.c --- sbin/iked/
Let iked specify its source address when sending
On Thu, 11 Aug 2016 16:57:27 +0100 Stuart Henderson wrote:
> On 2016/06/27 13:00, Jérémie Courrèges-Anglas wrote:
[...]
> >
> > I also gave my ok to vgross by IM.
> >
> > I know that some concerns have been exposed privately, I was not
> > Cc'd, thus I have no idea what is the current status of that
> > discussion. To the people concerned, please keep me / us updated
> > about that discussion and Cc us.
>
> How are things looking with IP_SENDSRCADDR now, are there any
> remaining concerns that need fixing before it could be committed?
> (Also if anyone has a share-able diff to use this with iked it
> would be quite handy..)
>

Tested locally with two iked on two distinct rdomains plus a bit of
LD_PRELOAD goop. Unfortunately I couldn't ping from one rdom to the
other, but I also have this problem without my patch, so I am
confident this ping problem is unrelated.

I would be very grateful if someone could test this.

Index: sbin/iked/iked.h
===
RCS file: /cvs/src/sbin/iked/iked.h,v
retrieving revision 1.96
diff -u -p -r1.96 iked.h
--- sbin/iked/iked.h	1 Jun 2016 11:16:41 -	1.96
+++ sbin/iked/iked.h	31 Aug 2016 13:19:10 -
@@ -898,6 +898,8 @@ int	 socket_setport(struct sockaddr *, i
 int	 socket_getaddr(int, struct sockaddr_storage *);
 int	 socket_bypass(int, struct sockaddr *);
 int	 udp_bind(struct sockaddr *, in_port_t);
+ssize_t	 sendtofrom(int, void *, size_t, int, struct sockaddr *,
+	    socklen_t, struct sockaddr *, socklen_t);
 ssize_t	 recvfromto(int, void *, size_t, int, struct sockaddr *,
 	    socklen_t *, struct sockaddr *, socklen_t *);
 const char *
Index: sbin/iked/ikev2_msg.c
===
RCS file: /cvs/src/sbin/iked/ikev2_msg.c,v
retrieving revision 1.45
diff -u -p -r1.45 ikev2_msg.c
--- sbin/iked/ikev2_msg.c	19 Oct 2015 11:25:35 -	1.45
+++ sbin/iked/ikev2_msg.c	31 Aug 2016 13:19:10 -
@@ -319,9 +319,11 @@ ikev2_msg_send(struct iked *env, struct
 		msg->msg_offset += sizeof(natt);
 	}
 
-	if ((sendto(msg->msg_fd, ibuf_data(buf), ibuf_size(buf), 0,
-	    (struct sockaddr *)&msg->msg_peer,
msg->msg_peerlen)) == -1) { - log_warn("%s: sendto", __func__); + if (sendtofrom(msg->msg_fd, ibuf_data(buf), ibuf_size(buf), 0, + (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen, + (struct sockaddr *)&msg->msg_local, msg->msg_locallen) < + ibuf_size(buf)) { + log_warn("%s: sendtofrom", __func__); return (-1); } @@ -969,10 +971,12 @@ int ikev2_msg_retransmit_response(struct iked *env, struct iked_sa *sa, struct iked_message *msg) { - if ((sendto(msg->msg_fd, ibuf_data(msg->msg_data), - ibuf_size(msg->msg_data), 0, (struct sockaddr *)&msg->msg_peer, - msg->msg_peerlen)) == -1) { - log_warn("%s: sendto", __func__); + if (sendtofrom(msg->msg_fd, ibuf_data(msg->msg_data), + ibuf_size(msg->msg_data), 0, + (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen, + (struct sockaddr *)&msg->msg_local, msg->msg_locallen) < + ibuf_size(msg->msg_data)) { + log_warn("%s: sendtofrom", __func__); return (-1); } @@ -996,11 +1000,12 @@ ikev2_msg_retransmit_timeout(struct iked struct iked_sa *sa = msg->msg_sa; if (msg->msg_tries < IKED_RETRANSMIT_TRIES) { - if ((sendto(msg->msg_fd, ibuf_data(msg->msg_data), + if (sendtofrom(msg->msg_fd, ibuf_data(msg->msg_data), ibuf_size(msg->msg_data), 0, - (struct sockaddr *)&msg->msg_peer, - msg->msg_peerlen)) == -1) { - log_warn("%s: sendto", __func__); + (struct sockaddr *)&msg->msg_peer, msg->msg_peerlen, + (struct sockaddr *)&msg->msg_local, msg->msg_locallen) < + ibuf_size(msg->msg_data)) { + log_warn("%s: sendtofrom", __func__); sa_free(env, sa); return; } Index: sbin/iked/util.c === RCS file: /cvs/src/sbin/iked/util.c,v retrieving revision 1.30 diff -u -p -r1.30 util.c --- sbin/iked/util.c23 Nov 2015 19:28:34 - 1.30 +++ sbin/iked/util.c31 Aug 2016 13:19:10 - @@ -287,6 +287,57 @@ sockaddr_cmp(struct sockaddr *a, struct } ssize_t +sendtofrom(int s, void *buf, size_t len, int flags, struct sockaddr *to, +socklen_t tolen, struct sockaddr *from, socklen_t fromlen) +{ + struct iovec iov; + struct msghdrmsg; + struct cmsghdr *cmsg; + struct 
in6_pktinfo *pkt6; + struct sockaddr_in *in; + stru
Re: IP_SENDSRCADDR [2/2] : add cmsg support
On Thu, 11 Aug 2016 16:57:27 +0100 Stuart Henderson wrote:
> On 2016/06/27 13:00, Jérémie Courrèges-Anglas wrote:
> > Stuart Henderson writes:
> > [...]
> > >
> > > Basically yes but one observation.
> >
> > I also gave my ok to vgross by IM.
> >
> > I know that some concerns have been exposed privately, I was not
> > Cc'd, thus I have no idea what is the current status of that
> > discussion. To the people concerned, please keep me / us updated
> > about that discussion and Cc us.
>
> How are things looking with IP_SENDSRCADDR now, are there any
> remaining concerns that need fixing before it could be committed?
> (Also if anyone has a share-able diff to use this with iked it
> would be quite handy..)
>

I just committed the diff with fixes, enhancements and regression tests. All manner of testing and feedback is welcome!

--
Vincent Gross
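For anyone who wants to try the option from userland, here is a minimal sketch of how the control message described in the ip.4 text of this thread could be prepared (cmsg_len = CMSG_LEN(sizeof(struct in_addr)), cmsg_level = IPPROTO_IP, cmsg_type = IP_SENDSRCADDR). The helper name prepare_srcaddr_msg() and the union src_cmsgbuf are made up for illustration and are not part of any diff. The fallback value 38 is the one proposed in the diffs of this thread and is only there so the sketch compiles on systems without the option; the value in the committed tree may differ.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Assumption: fall back to the value proposed in this thread when the
 * system headers do not provide IP_SENDSRCADDR.
 */
#ifndef IP_SENDSRCADDR
#define IP_SENDSRCADDR 38
#endif

/* Hypothetical buffer type: the union forces cmsghdr alignment. */
union src_cmsgbuf {
	struct cmsghdr	hdr;
	unsigned char	buf[CMSG_SPACE(sizeof(struct in_addr))];
};

/*
 * Hypothetical helper: fill msg, iov and cbuf so that a later
 * sendmsg(2) call asks the kernel to use src as the source address.
 */
static void
prepare_srcaddr_msg(struct msghdr *msg, struct iovec *iov,
    union src_cmsgbuf *cbuf, void *data, size_t len,
    struct sockaddr_in *to, const struct in_addr *src)
{
	struct cmsghdr *cmsg;

	memset(msg, 0, sizeof(*msg));
	memset(cbuf, 0, sizeof(*cbuf));

	iov->iov_base = data;
	iov->iov_len = len;

	msg->msg_name = to;
	msg->msg_namelen = sizeof(*to);
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;
	msg->msg_control = cbuf->buf;
	msg->msg_controllen = sizeof(cbuf->buf);

	/* Field values as documented in the ip.4 text of this thread. */
	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(struct in_addr));
	cmsg->cmsg_level = IPPROTO_IP;
	cmsg->cmsg_type = IP_SENDSRCADDR;
	memcpy(CMSG_DATA(cmsg), src, sizeof(*src));
}
```

The resulting msghdr would then be handed to sendmsg(2) on a SOCK_DGRAM socket. Going by the checks discussed in this thread, an address that is not configured on the system should make the call fail rather than send.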
Re: split in6_selectsrc() for saner prototypes
On Wed, 20 Jul 2016 12:36:45 +0200 Vincent Gross wrote: > This is a completely mechanical diff to get rid of the 7-params > madness in in6_selectsrc(). > > I also apply the same treatment to in_selectsrc() for consistency. > > Ok? ... and of course I forgot to initialize a variable and broke all ipv6, thanks to Heikko for reporting this on bugs@. New diff below adds dst init in in6_selectsrc(), ok ? Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.212 diff -u -p -r1.212 in_pcb.c --- sys/netinet/in_pcb.c22 Jul 2016 11:14:41 - 1.212 +++ sys/netinet/in_pcb.c29 Jul 2016 19:53:21 - @@ -525,8 +525,7 @@ in_pcbconnect(struct inpcb *inp, struct if (sin->sin_port == 0) return (EADDRNOTAVAIL); - error = in_selectsrc(&ina, sin, inp->inp_moptions, &inp->inp_route, - &inp->inp_laddr, inp->inp_rtableid); + error = in_pcbselsrc(&ina, sin, inp); if (error) return (error); @@ -876,10 +875,14 @@ in_pcbrtentry(struct inpcb *inp) * an entry to the caller for later use. 
*/ int -in_selectsrc(struct in_addr **insrc, struct sockaddr_in *sin, -struct ip_moptions *mopts, struct route *ro, struct in_addr *laddr, -u_int rtableid) +in_pcbselsrc(struct in_addr **insrc, struct sockaddr_in *sin, +struct inpcb *inp) { + struct ip_moptions *mopts = inp->inp_moptions; + struct route *ro = &inp->inp_route; + struct in_addr *laddr = &inp->inp_laddr; + u_int rtableid = inp->inp_rtableid; + struct sockaddr_in *sin2; struct in_ifaddr *ia = NULL; Index: sys/netinet/in_pcb.h === RCS file: /cvs/src/sys/netinet/in_pcb.h,v retrieving revision 1.102 diff -u -p -r1.102 in_pcb.h --- sys/netinet/in_pcb.h22 Jul 2016 11:14:41 - 1.102 +++ sys/netinet/in_pcb.h29 Jul 2016 19:53:21 - @@ -289,8 +289,7 @@ void in_setpeeraddr(struct inpcb *, str voidin_setsockaddr(struct inpcb *, struct mbuf *); int in_baddynamic(u_int16_t, u_int16_t); int in_rootonly(u_int16_t, u_int16_t); -int in_selectsrc(struct in_addr **, struct sockaddr_in *, - struct ip_moptions *, struct route *, struct in_addr *, u_int); +int in_pcbselsrc(struct in_addr **, struct sockaddr_in *, struct inpcb *); struct rtentry * in_pcbrtentry(struct inpcb *); Index: sys/netinet/udp_usrreq.c === RCS file: /cvs/src/sys/netinet/udp_usrreq.c,v retrieving revision 1.216 diff -u -p -r1.216 udp_usrreq.c --- sys/netinet/udp_usrreq.c22 Jul 2016 11:14:41 - 1.216 +++ sys/netinet/udp_usrreq.c29 Jul 2016 19:53:22 - @@ -989,8 +989,7 @@ udp_output(struct inpcb *inp, struct mbu goto release; } - error = in_selectsrc(&laddr, sin, inp->inp_moptions, - &inp->inp_route, &inp->inp_laddr, inp->inp_rtableid); + error = in_pcbselsrc(&laddr, sin, inp); if (error) goto release; Index: sys/netinet6/icmp6.c === RCS file: /cvs/src/sys/netinet6/icmp6.c,v retrieving revision 1.188 diff -u -p -r1.188 icmp6.c --- sys/netinet6/icmp6.c22 Jul 2016 11:14:41 - 1.188 +++ sys/netinet6/icmp6.c29 Jul 2016 19:53:22 - @@ -1259,7 +1259,7 @@ icmp6_reflect(struct mbuf *m, size_t off * source address of the erroneous packet. 
*/ bzero(&ro, sizeof(ro)); - error = in6_selectsrc(&src, &sa6_src, NULL, NULL, &ro, NULL, + error = in6_selectsrc(&src, &sa6_src, NULL, &ro, m->m_pkthdr.ph_rtableid); if (ro.ro_rt) rtfree(ro.ro_rt); /* XXX: we could use this */ Index: sys/netinet6/in6_pcb.c === RCS file: /cvs/src/sys/netinet6/in6_pcb.c,v retrieving revision 1.95 diff -u -p -r1.95 in6_pcb.c --- sys/netinet6/in6_pcb.c 22 Jul 2016 11:14:41 - 1.95 +++ sys/netinet6/in6_pcb.c 29 Jul 2016 19:53:22 - @@ -281,9 +281,7 @@ in6_pcbconnect(struct inpcb *inp, struct * with the address specified by setsockopt(IPV6_PKTINFO). * Is it the intended behavior? */ - error = in6_selectsrc(&in6a, sin6, inp->inp_outputopts6, - inp->inp_moptions6, &inp->inp_route6, &inp->inp_laddr6, - inp->inp_rtableid); + error = in6_pcbselsrc(&in6a, sin6, inp, inp->inp_outputopts6); if (error) return (error); Index: sy
split in6_selectsrc() for saner prototypes
This is a completely mechanical diff to get rid of the 7-params madness in in6_selectsrc(). I also apply the same treatment to in_selectsrc() for consistency. Ok? Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.210 diff -u -p -r1.210 in_pcb.c --- sys/netinet/in_pcb.c19 Jul 2016 14:49:46 - 1.210 +++ sys/netinet/in_pcb.c20 Jul 2016 10:21:17 - @@ -525,8 +525,7 @@ in_pcbconnect(struct inpcb *inp, struct if (sin->sin_port == 0) return (EADDRNOTAVAIL); - error = in_selectsrc(&ina, sin, inp->inp_moptions, &inp->inp_route, - &inp->inp_laddr, inp->inp_rtableid); + error = in_selpcbsrc(&ina, sin, inp); if (error) return (error); @@ -876,10 +875,14 @@ in_pcbrtentry(struct inpcb *inp) * an entry to the caller for later use. */ int -in_selectsrc(struct in_addr **insrc, struct sockaddr_in *sin, -struct ip_moptions *mopts, struct route *ro, struct in_addr *laddr, -u_int rtableid) +in_selpcbsrc(struct in_addr **insrc, struct sockaddr_in *sin, +struct inpcb *inp) { + struct ip_moptions *mopts = inp->inp_moptions; + struct route *ro = &inp->inp_route; + struct in_addr *laddr = &inp->inp_laddr; + u_int rtableid = inp->inp_rtableid; + struct sockaddr_in *sin2; struct in_ifaddr *ia = NULL; Index: sys/netinet/in_pcb.h === RCS file: /cvs/src/sys/netinet/in_pcb.h,v retrieving revision 1.100 diff -u -p -r1.100 in_pcb.h --- sys/netinet/in_pcb.h27 Jun 2016 16:33:48 - 1.100 +++ sys/netinet/in_pcb.h20 Jul 2016 10:21:17 - @@ -289,8 +289,7 @@ void in_setpeeraddr(struct inpcb *, str voidin_setsockaddr(struct inpcb *, struct mbuf *); int in_baddynamic(u_int16_t, u_int16_t); int in_rootonly(u_int16_t, u_int16_t); -int in_selectsrc(struct in_addr **, struct sockaddr_in *, - struct ip_moptions *, struct route *, struct in_addr *, u_int); +int in_selpcbsrc(struct in_addr **, struct sockaddr_in *, struct inpcb *); struct rtentry * in_pcbrtentry(struct inpcb *); Index: sys/netinet/udp_usrreq.c === RCS file: /cvs/src/sys/netinet/udp_usrreq.c,v retrieving 
revision 1.214 diff -u -p -r1.214 udp_usrreq.c --- sys/netinet/udp_usrreq.c28 Jun 2016 11:22:53 - 1.214 +++ sys/netinet/udp_usrreq.c20 Jul 2016 10:21:19 - @@ -989,8 +989,7 @@ udp_output(struct inpcb *inp, struct mbu goto release; } - error = in_selectsrc(&laddr, sin, inp->inp_moptions, - &inp->inp_route, &inp->inp_laddr, inp->inp_rtableid); + error = in_selpcbsrc(&laddr, sin, inp); if (error) goto release; Index: sys/netinet6/icmp6.c === RCS file: /cvs/src/sys/netinet6/icmp6.c,v retrieving revision 1.186 diff -u -p -r1.186 icmp6.c --- sys/netinet6/icmp6.c5 Jul 2016 10:17:14 - 1.186 +++ sys/netinet6/icmp6.c20 Jul 2016 10:21:19 - @@ -1259,7 +1259,7 @@ icmp6_reflect(struct mbuf *m, size_t off * source address of the erroneous packet. */ bzero(&ro, sizeof(ro)); - error = in6_selectsrc(&src, &sa6_src, NULL, NULL, &ro, NULL, + error = in6_selectsrc(&src, &sa6_src, NULL, &ro, m->m_pkthdr.ph_rtableid); if (ro.ro_rt) rtfree(ro.ro_rt); /* XXX: we could use this */ Index: sys/netinet6/in6_pcb.c === RCS file: /cvs/src/sys/netinet6/in6_pcb.c,v retrieving revision 1.93 diff -u -p -r1.93 in6_pcb.c --- sys/netinet6/in6_pcb.c 5 Jul 2016 10:17:14 - 1.93 +++ sys/netinet6/in6_pcb.c 20 Jul 2016 10:21:19 - @@ -281,9 +281,7 @@ in6_pcbconnect(struct inpcb *inp, struct * with the address specified by setsockopt(IPV6_PKTINFO). * Is it the intended behavior? */ - error = in6_selectsrc(&in6a, sin6, inp->inp_outputopts6, - inp->inp_moptions6, &inp->inp_route6, &inp->inp_laddr6, - inp->inp_rtableid); + error = in6_selpcbsrc(&in6a, sin6, inp, inp->inp_outputopts6); if (error) return (error); Index: sys/netinet6/in6_src.c === RCS file: /cvs/src/sys/netinet6/in6_src.c,v retrieving revision 1.76 diff -u -p -r1.76 in6_src.c --- sys/netinet6/in6_src.c 5 Jul 2016 10:17:14 - 1.76 +++ sys/netinet6/in6_src.c 20 Jul 2016 10:21:19 - @@ -88,15 +88,18 @@ int in6_selectif(struct sockaddr_in6 *, /* * Return an IPv6 address,
Re: IP_SENDSRCADDR [2/2] : add cmsg support
On Mon, 13 Jun 2016 16:49:01 +0200 Vincent Gross wrote: > > While validating source address inside selection functions is the > right direction, I don't think it would be a good thing to extend > further in_selectsrc() prototype. However it is easy to add a check > while processing cmsg. > > rev2 below. Ok ? > rev3 below. I fixed the line length, the useless bzero(), and also the wording in ip.4 Ok ? Index: sys/netinet/in.h === RCS file: /cvs/src/sys/netinet/in.h,v retrieving revision 1.115 diff -u -p -r1.115 in.h --- sys/netinet/in.h20 Oct 2015 20:22:42 - 1.115 +++ sys/netinet/in.h15 Jun 2016 17:37:11 - @@ -307,6 +307,7 @@ struct ip_opts { #define IP_RECVRTABLE 35 /* bool; receive rdomain w/dgram */ #define IP_IPSECFLOWINFO 36 /* bool; IPsec flow info for dgram */ #define IP_IPDEFTTL37 /* int; IP TTL system default */ +#define IP_SENDSRCADDR 38 /* struct in_addr; source address to use */ #define IP_RTABLE 0x1021 /* int; routing table, see SO_RTABLE */ #define IP_DIVERTFL0x1022 /* int; divert direction flag opt */ Index: sys/netinet/udp_usrreq.c === RCS file: /cvs/src/sys/netinet/udp_usrreq.c,v retrieving revision 1.212 diff -u -p -r1.212 udp_usrreq.c --- sys/netinet/udp_usrreq.c15 Jun 2016 16:06:35 - 1.212 +++ sys/netinet/udp_usrreq.c15 Jun 2016 17:37:11 - @@ -888,6 +888,7 @@ udp_output(struct inpcb *inp, struct mbu struct sockaddr_in *sin = NULL; struct udpiphdr *ui; u_int32_t ipsecflowinfo = 0; + struct sockaddr_in src_sin; int len = m->m_pkthdr.len; struct in_addr *laddr; int error = 0; @@ -906,6 +907,8 @@ udp_output(struct inpcb *inp, struct mbu goto release; } + memset(&src_sin, 0, sizeof(src_sin)); + if (control) { u_int clen; struct cmsghdr *cm; @@ -939,9 +942,20 @@ udp_output(struct inpcb *inp, struct mbu cm->cmsg_level == IPPROTO_IP && cm->cmsg_type == IP_IPSECFLOWINFO) { ipsecflowinfo = *(u_int32_t *)CMSG_DATA(cm); - break; - } + } else #endif + if (cm->cmsg_len == CMSG_LEN(sizeof(struct in_addr)) && + cm->cmsg_level == IPPROTO_IP && + cm->cmsg_type == 
IP_SENDSRCADDR) { + memcpy(&src_sin.sin_addr, CMSG_DATA(cm), + sizeof(struct in_addr)); + src_sin.sin_family = AF_INET; + src_sin.sin_len = sizeof(src_sin); + /* no check on reuse when sin->sin_port == 0 */ + if ((error = in_pcbaddrisavail(inp, &src_sin, + 0, curproc))) + goto release; + } clen -= CMSG_ALIGN(cm->cmsg_len); cmsgs += CMSG_ALIGN(cm->cmsg_len); } while (clen); @@ -979,6 +993,17 @@ udp_output(struct inpcb *inp, struct mbu splx(s); if (error) goto release; + } + + if (src_sin.sin_len > 0 && + src_sin.sin_addr.s_addr != INADDR_ANY && + src_sin.sin_addr.s_addr != inp->inp_laddr.s_addr) { + src_sin.sin_port = inp->inp_lport; + if (inp->inp_laddr.s_addr != INADDR_ANY && + (error = + in_pcbaddrisavail(inp, &src_sin, 0, curproc))) + goto release; + laddr = &src_sin.sin_addr; } } else { if (inp->inp_faddr.s_addr == INADDR_ANY) { Index: share/man/man4/ip.4 === RCS file: /cvs/src/share/man/man4/ip.4,v retrieving revision 1.38 diff -u -p -r1.38 ip.4 --- share/man/man4/ip.4 20 Oct 2015 22:08:19 - 1.38 +++ share/man/man4/ip.4 15 Jun 2016 17:37:12 - @@ -290,6 +290,34 @@ cmsg_len = CMSG_LEN(sizeof(u_int)) cmsg_level = IPPROTO_IP cmsg_type = IP_RECVRTABLE .Ed +.Pp +When sending on a +.Dv SOCK_DGRAM +socket with +.Xr sendmsg 2 +, the source address to be used can be passed as ancillary data with a type code of +.Dv IP_SENDSRCADDR . +The +.Va msg_control +field in the +.Vt msghdr +structure should point to a buffer that contains a +.Vt cmsghdr +structure followed by
Re: IP_SENDSRCADDR [2/2] : add cmsg support
On Mon, 13 Jun 2016 19:57:15 +0200 Jeremie Courreges-Anglas wrote:
> Vincent Gross writes:
>
> > On Mon, 13 Jun 2016 07:35:16 +0200,
> > j...@wxcvbn.org (Jeremie Courreges-Anglas) wrote:
> >
> >> j...@wxcvbn.org (Jeremie Courreges-Anglas) writes:
> >>
> >> > cc'ing sthen since he also has interest in IP_SENDSRCADDR
> >> >
> >> > Jeremie Courreges-Anglas writes:
> >> >
> >> >> Vincent Gross writes:
> >> >>
> >> >>> This diff adds support for IP_SENDSRCADDR cmsg on UDP sockets.
> >> >>> As for udp6_output(), we check that the source address+port is
> >> >>> available only if inp_laddr != *
> >> >>
> >> >> Your last IP_SENDSRCADDR diff didn't have that check, I think
> >> >> it is harmful. If the socket is not bound then there is
> >> >> effectively no check performed by in_pcbaddrisavail(), thus I
> >> >> can use any random address. Other than this additional bypass
> >> >> check, your diff looks good to me.
> >> >>
> > [...]
> >> >>
> >> >> I haven't checked yet whether udp6_output is also affected. If
> >> >> you folks already know that it isn't, please let me know.
> >>
> >> The answer is "no", a few tests can't trigger the same problem.
> >> IIUC in6_selectsrc is responsible for rejection of non-local
> >> systems. Maybe we should take the same approach in netinet/, and
> >> extend in_selectsrc()?
> >>
> >> --
> >
> > While validating source address inside selection functions is the
> > right direction, I don't think it would be a good thing to extend
> > further in_selectsrc() prototype.
>
> I find it nice to have all the source address selection in one place.
> Or do you have another refactoring in mind?
>

Uh, turns out I was operating on obsolete data. It would actually be easy to shrink the in_selectsrc() prototype to (int)(struct in_addr **, struct sockaddr_in *, struct inpcb *). But this looks like a layering violation to me ... What do you think?

$ grep -r in_selectsrc sys/net* sys/netinet/in_pcb.c sys/netinet/in_pcb.h sys/netinet/udp_usrreq.c $ cd sys/netinet $ grep -A2 in_selectsrc error = in_selectsrc(&ina, sin, inp->inp_moptions, &inp->inp_route, &inp->inp_laddr, inp->inp_rtableid); if (error) in_selectsrc(struct in_addr **insrc, struct sockaddr_in *sin, struct ip_moptions *mopts, struct route *ro, struct in_addr *laddr, u_int rtableid) $ grep -A2 in_selectsrc udp_usrreq.c error = in_selectsrc(&laddr, sin, inp->inp_moptions, &inp->inp_route, &inp->inp_laddr, inp->inp_rtableid); if (error) > > However it is easy to add a check while > > processing cmsg. > > > > rev2 below. Ok ? > > Nits below, looks fine otherwise. The checks do detect addresses not > configured on the system and overlaps of bound sockets. > > > > > diff --git a/share/man/man4/ip.4 b/share/man/man4/ip.4 > > index 111432b..154b0d1 100644 > > --- a/share/man/man4/ip.4 > > +++ b/share/man/man4/ip.4 > > @@ -290,6 +290,27 @@ cmsg_len = CMSG_LEN(sizeof(u_int)) > > cmsg_level = IPPROTO_IP > > cmsg_type = IP_RECVRTABLE > > .Ed > > +.Pp > > +If the > > +.Dv IP_SENDSRCADDR > > +option is passed to a > > +.Xr sendmsg 2 > > +call on a > > +.Dv SOCK_DGRAM > > +socket, the address passed along the > > +.Vt cmsghdr > > +structure will be used as the source of the outgoing > > +.Tn UDP > > +datagram. The > > +.Vt cmsghdr > > +fields for > > +.Xr sendmsg 2 > > +have the following values: > > I would have worded it "should have" here, since these are the values > that the developer is supposed to pass. Yes, I have to find a better wording for this part. 
> > > +.Bd -literal -offset indent > > +cmsg_len = CMSG_LEN(sizeof(struct in_addr)) > > +cmsg_level = IPPROTO_IP > > +cmsg_type = IP_SENDSRCADDR > > +.Ed > > .Ss "Multicast Options" > > .Tn IP > > multicasting is supported only on > > diff --git a/sys/netinet/in.h b/sys/netinet/in.h > > index adb1b30..bf8c95d 100644 > > --- a/sys/netinet/in.h > > +++ b/sys/netinet/in.h > > @@ -307,6 +307,7 @@ struct ip_opts { > > #define IP_RECVRTABLE 35 /* bool; receive rdomain > > w/dgram */ #define IP_IPSECFLOWINFO 36 /* bool; IP
Re: IP_SENDSRCADDR [2/2] : add cmsg support
On Mon, 13 Jun 2016 07:35:16 +0200, j...@wxcvbn.org (Jérémie Courrèges-Anglas) wrote:
> j...@wxcvbn.org (Jeremie Courreges-Anglas) writes:
>
> > cc'ing sthen since he also has interest in IP_SENDSRCADDR
> >
> > Jeremie Courreges-Anglas writes:
> >
> >> Vincent Gross writes:
> >>
> >>> This diff adds support for IP_SENDSRCADDR cmsg on UDP sockets. As
> >>> for udp6_output(), we check that the source address+port is
> >>> available only if inp_laddr != *
> >>
> >> Your last IP_SENDSRCADDR diff didn't have that check, I think it is
> >> harmful. If the socket is not bound then there is effectively no
> >> check performed by in_pcbaddrisavail(), thus I can use any random
> >> address. Other than this additional bypass check, your diff looks
> >> good to me.
> >> [...]
> >>
> >> I haven't checked yet whether udp6_output is also affected. If you
> >> folks already know that it isn't, please let me know.
>
> The answer is "no", a few tests can't trigger the same problem. IIUC
> in6_selectsrc is responsible for rejection of non-local systems.
> Maybe we should take the same approach in netinet/, and extend
> in_selectsrc()?
>
> --

While validating the source address inside the selection functions is the right direction, I don't think it would be a good thing to extend the in_selectsrc() prototype further. However, it is easy to add a check while processing cmsg.

rev2 below. Ok?

diff --git a/share/man/man4/ip.4 b/share/man/man4/ip.4 index 111432b..154b0d1 100644 --- a/share/man/man4/ip.4 +++ b/share/man/man4/ip.4 @@ -290,6 +290,27 @@ cmsg_len = CMSG_LEN(sizeof(u_int)) cmsg_level = IPPROTO_IP cmsg_type = IP_RECVRTABLE .Ed +.Pp +If the +.Dv IP_SENDSRCADDR +option is passed to a +.Xr sendmsg 2 +call on a +.Dv SOCK_DGRAM +socket, the address passed along the +.Vt cmsghdr +structure will be used as the source of the outgoing +.Tn UDP +datagram.
The +.Vt cmsghdr +fields for +.Xr sendmsg 2 +have the following values: +.Bd -literal -offset indent +cmsg_len = CMSG_LEN(sizeof(struct in_addr)) +cmsg_level = IPPROTO_IP +cmsg_type = IP_SENDSRCADDR +.Ed .Ss "Multicast Options" .Tn IP multicasting is supported only on diff --git a/sys/netinet/in.h b/sys/netinet/in.h index adb1b30..bf8c95d 100644 --- a/sys/netinet/in.h +++ b/sys/netinet/in.h @@ -307,6 +307,7 @@ struct ip_opts { #define IP_RECVRTABLE 35 /* bool; receive rdomain w/dgram */ #define IP_IPSECFLOWINFO 36 /* bool; IPsec flow info for dgram */ #define IP_IPDEFTTL37 /* int; IP TTL system default */ +#define IP_SENDSRCADDR 38 /* struct in_addr; source address to use */ #define IP_RTABLE 0x1021 /* int; routing table, see SO_RTABLE */ #define IP_DIVERTFL0x1022 /* int; divert direction flag opt */ diff --git a/sys/netinet/udp_usrreq.c b/sys/netinet/udp_usrreq.c index 1feea11..401ed7a 100644 --- a/sys/netinet/udp_usrreq.c +++ b/sys/netinet/udp_usrreq.c @@ -888,6 +888,7 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr, struct sockaddr_in *sin = NULL; struct udpiphdr *ui; u_int32_t ipsecflowinfo = 0; + struct sockaddr_in src_sin; int len = m->m_pkthdr.len; struct in_addr *laddr; int error = 0; @@ -906,6 +907,8 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr, goto release; } + memset(&src_sin, 0, sizeof(src_sin)); + if (control) { u_int clen; struct cmsghdr *cm; @@ -939,9 +942,20 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr, cm->cmsg_level == IPPROTO_IP && cm->cmsg_type == IP_IPSECFLOWINFO) { ipsecflowinfo = *(u_int32_t *)CMSG_DATA(cm); - break; - } + } else #endif + if (cm->cmsg_len == CMSG_LEN(sizeof(struct in_addr)) && + cm->cmsg_level == IPPROTO_IP && + cm->cmsg_type == IP_SENDSRCADDR) { + bzero(&src_sin, sizeof(src_sin)); + memcpy(&src_sin.sin_addr, CMSG_DATA(cm), + sizeof(struct in_addr)); + src_sin.sin_family = AF_INET; + src_sin.sin_len = sizeof(src_sin); + /* no check on reuse done when 
sin->sin_port == 0 */ + if ((error = in_pcbaddrisavail(inp, &src_sin, 0, curproc))) + got
Re: IP_SENDSRCADDR [2/2] : add cmsg support
On Sun, 12 Jun 2016 15:29:32 +0200 (CEST) Mark Kettenis wrote:
> > Date: Sun, 12 Jun 2016 14:59:55 +0200
> > From: Vincent Gross
> >
> > This diff adds support for IP_SENDSRCADDR cmsg on UDP sockets. As
> > for udp6_output(), we check that the source address+port is
> > available only if inp_laddr != *
> >
> > Ok ?
>
> Why do we need this? cmsg stuff is fragile, so we want to keep it
> as simple as possible.

In iked.conf(5), you can specify the local and remote addresses to use for the IKEv2 handshake. Let's say I have 192.0.2.1/25 on em0 and 192.0.2.129/25 on em1, and a single UDP socket bound to 0.0.0.0. I receive an IKEv2 message on em0, with 192.0.2.129 as destination address and a source address reachable only via em0. If I reply with the receiving socket, in_selectsrc() will pick 192.0.2.1 as the reply source address, and the handshake will abort.

isakmpd(8) works around this by opening one socket per local address. This means that we must either watch for RTM_NEWADDR and RTM_DELADDR, or poll using getifaddrs(3), if we want to catch all changes.

This is one example; I remember other developers saying how they would benefit from this, but I can't find those conversations again :P

> > > diff --git a/share/man/man4/ip.4 b/share/man/man4/ip.4 > > index 111432b..154b0d1 100644 > > --- a/share/man/man4/ip.4 > > +++ b/share/man/man4/ip.4 > > @@ -290,6 +290,27 @@ cmsg_len = CMSG_LEN(sizeof(u_int)) > > cmsg_level = IPPROTO_IP > > cmsg_type = IP_RECVRTABLE > > .Ed > > +.Pp > > +If the > > +.Dv IP_SENDSRCADDR > > +option is passed to a > > +.Xr sendmsg 2 > > +call on a > > +.Dv SOCK_DGRAM > > +socket, the address passed along the > > +.Vt cmsghdr > > +structure will be used as the source of the outgoing > > +.Tn UDP > > +datagram.
The > > +.Vt cmsghdr > > +fields for > > +.Xr sendmsg 2 > > +have the following values: > > +.Bd -literal -offset indent > > +cmsg_len = CMSG_LEN(sizeof(struct in_addr)) > > +cmsg_level = IPPROTO_IP > > +cmsg_type = IP_SENDSRCADDR > > +.Ed > > .Ss "Multicast Options" > > .Tn IP > > multicasting is supported only on > > diff --git a/sys/netinet/in.h b/sys/netinet/in.h > > index adb1b30..bf8c95d 100644 > > --- a/sys/netinet/in.h > > +++ b/sys/netinet/in.h > > @@ -307,6 +307,7 @@ struct ip_opts { > > #define IP_RECVRTABLE 35 /* bool; receive rdomain > > w/dgram */ #define IP_IPSECFLOWINFO 36 /* bool; IPsec flow > > info for dgram */ #define IP_IPDEFTTL 37 /* int; > > IP TTL system default */ +#define IP_SENDSRCADDR > > 38 /* struct in_addr; source address to use */ > > #define IP_RTABLE 0x1021 /* int; routing > > table, see SO_RTABLE */ #define IP_DIVERTFL > > 0x1022 /* int; divert direction flag opt */ diff --git > > a/sys/netinet/udp_usrreq.c b/sys/netinet/udp_usrreq.c index > > 1feea11..35675b4 100644 --- a/sys/netinet/udp_usrreq.c > > +++ b/sys/netinet/udp_usrreq.c > > @@ -888,6 +888,7 @@ udp_output(struct inpcb *inp, struct mbuf *m, > > struct mbuf *addr, struct sockaddr_in *sin = NULL; > > struct udpiphdr *ui; > > u_int32_t ipsecflowinfo = 0; > > + struct sockaddr_in src_sin; > > int len = m->m_pkthdr.len; > > struct in_addr *laddr; > > int error = 0; > > @@ -906,6 +907,8 @@ udp_output(struct inpcb *inp, struct mbuf *m, > > struct mbuf *addr, goto release; > > } > > > > + memset(&src_sin, 0, sizeof(src_sin)); > > + > > if (control) { > > u_int clen; > > struct cmsghdr *cm; > > @@ -939,9 +942,16 @@ udp_output(struct inpcb *inp, struct mbuf *m, > > struct mbuf *addr, cm->cmsg_level == IPPROTO_IP && > > cm->cmsg_type == IP_IPSECFLOWINFO) { > > ipsecflowinfo = *(u_int32_t > > *)CMSG_DATA(cm); > > - break; > > - } > > + } else > > #endif > > + if (cm->cmsg_len == CMSG_LEN(sizeof(struct > > in_addr)) && > > + cm->cmsg_level == IPPROTO_IP && > > + cm->cmsg_type == 
IP_SENDSRCADDR) { > > + memcpy(&src_sin.sin_addr, > > CMSG_DATA(cm), > > + sizeof(struct in_addr)); > > + src_sin.sin_family = AF_INET; > > + src_sin.sin_len = sizeof(src_sin); > >
Re: IP_SENDSRCADDR [1/2] : move cmsg handling code
On Sun, 12 Jun 2016 15:00:14 +0200 Vincent Gross wrote: Damn you autowrap ! get off my diff ! (thanks jca@ for spotting) > This diff moves the cmsg handling code on top of udp_output(). I split > the whole IP_SENDSRCADDR thung in two chunks so that it's easier to > audit. > > ok ? > diff --git a/sys/netinet/udp_usrreq.c b/sys/netinet/udp_usrreq.c index 2db5998..1feea11 100644 --- a/sys/netinet/udp_usrreq.c +++ b/sys/netinet/udp_usrreq.c @@ -906,6 +906,47 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr, goto release; } + if (control) { + u_int clen; + struct cmsghdr *cm; + caddr_t cmsgs; + + /* +* XXX: Currently, we assume all the optional information is stored +* in a single mbuf. +*/ + if (control->m_next) { + error = EINVAL; + goto release; + } + + clen = control->m_len; + cmsgs = mtod(control, caddr_t); + do { + if (clen < CMSG_LEN(0)) { + error = EINVAL; + goto release; + } + cm = (struct cmsghdr *)cmsgs; + if (cm->cmsg_len < CMSG_LEN(0) || + CMSG_ALIGN(cm->cmsg_len) > clen) { + error = EINVAL; + goto release; + } +#ifdef IPSEC + if (ISSET(inp->inp_flags,INP_IPSECFLOWINFO) && + cm->cmsg_len == CMSG_LEN(sizeof(ipsecflowinfo)) && + cm->cmsg_level == IPPROTO_IP && + cm->cmsg_type == IP_IPSECFLOWINFO) { + ipsecflowinfo = *(u_int32_t *)CMSG_DATA(cm); + break; + } +#endif + clen -= CMSG_ALIGN(cm->cmsg_len); + cmsgs += CMSG_ALIGN(cm->cmsg_len); + } while (clen); + } + if (addr) { sin = mtod(addr, struct sockaddr_in *); @@ -947,45 +988,6 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr, laddr = &inp->inp_laddr; } -#ifdef IPSEC - if (control && (inp->inp_flags & INP_IPSECFLOWINFO) != 0) { - u_int clen; - struct cmsghdr *cm; - caddr_t cmsgs; - - /* -* XXX: Currently, we assume all the optional information is stored -* in a single mbuf. 
-*/ - if (control->m_next) { - error = EINVAL; - goto release; - } - - clen = control->m_len; - cmsgs = mtod(control, caddr_t); - do { - if (clen < CMSG_LEN(0)) { - error = EINVAL; - goto release; - } - cm = (struct cmsghdr *)cmsgs; - if (cm->cmsg_len < CMSG_LEN(0) || - CMSG_ALIGN(cm->cmsg_len) > clen) { - error = EINVAL; - goto release; - } - if (cm->cmsg_len == CMSG_LEN(sizeof(ipsecflowinfo)) && - cm->cmsg_level == IPPROTO_IP && - cm->cmsg_type == IP_IPSECFLOWINFO) { - ipsecflowinfo = *(u_int32_t *)CMSG_DATA(cm); - break; - } - clen -= CMSG_ALIGN(cm->cmsg_len); - cmsgs += CMSG_ALIGN(cm->cmsg_len); - } while (clen); - } -#endif /* * Calculate data length and get a mbuf * for UDP and IP headers.
IP_SENDSRCADDR [1/2] : move cmsg handling code
This diff moves the cmsg handling code to the top of udp_output(). I split the whole IP_SENDSRCADDR thing in two chunks so that it's easier to audit.

ok ?

diff --git a/sys/netinet/udp_usrreq.c b/sys/netinet/udp_usrreq.c
index 2db5998..1feea11 100644
--- a/sys/netinet/udp_usrreq.c
+++ b/sys/netinet/udp_usrreq.c
@@ -906,6 +906,47 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr,
 		goto release;
 	}
 
+	if (control) {
+		u_int clen;
+		struct cmsghdr *cm;
+		caddr_t cmsgs;
+
+		/*
+		 * XXX: Currently, we assume all the optional information is stored
+		 * in a single mbuf.
+		 */
+		if (control->m_next) {
+			error = EINVAL;
+			goto release;
+		}
+
+		clen = control->m_len;
+		cmsgs = mtod(control, caddr_t);
+		do {
+			if (clen < CMSG_LEN(0)) {
+				error = EINVAL;
+				goto release;
+			}
+			cm = (struct cmsghdr *)cmsgs;
+			if (cm->cmsg_len < CMSG_LEN(0) ||
+			    CMSG_ALIGN(cm->cmsg_len) > clen) {
+				error = EINVAL;
+				goto release;
+			}
+#ifdef IPSEC
+			if (ISSET(inp->inp_flags,INP_IPSECFLOWINFO) &&
+			    cm->cmsg_len == CMSG_LEN(sizeof(ipsecflowinfo)) &&
+			    cm->cmsg_level == IPPROTO_IP &&
+			    cm->cmsg_type == IP_IPSECFLOWINFO) {
+				ipsecflowinfo = *(u_int32_t *)CMSG_DATA(cm);
+				break;
+			}
+#endif
+			clen -= CMSG_ALIGN(cm->cmsg_len);
+			cmsgs += CMSG_ALIGN(cm->cmsg_len);
+		} while (clen);
+	}
+
 	if (addr) {
 		sin = mtod(addr, struct sockaddr_in *);
 
@@ -947,45 +988,6 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr,
 		laddr = &inp->inp_laddr;
 	}
 
-#ifdef IPSEC
-	if (control && (inp->inp_flags & INP_IPSECFLOWINFO) != 0) {
-		u_int clen;
-		struct cmsghdr *cm;
-		caddr_t cmsgs;
-
-		/*
-		 * XXX: Currently, we assume all the optional information is stored
-		 * in a single mbuf.
-		 */
-		if (control->m_next) {
-			error = EINVAL;
-			goto release;
-		}
-
-		clen = control->m_len;
-		cmsgs = mtod(control, caddr_t);
-		do {
-			if (clen < CMSG_LEN(0)) {
-				error = EINVAL;
-				goto release;
-			}
-			cm = (struct cmsghdr *)cmsgs;
-			if (cm->cmsg_len < CMSG_LEN(0) ||
-			    CMSG_ALIGN(cm->cmsg_len) > clen) {
-				error = EINVAL;
-				goto release;
-			}
-			if (cm->cmsg_len == CMSG_LEN(sizeof(ipsecflowinfo)) &&
-			    cm->cmsg_level == IPPROTO_IP &&
-			    cm->cmsg_type == IP_IPSECFLOWINFO) {
-				ipsecflowinfo = *(u_int32_t *)CMSG_DATA(cm);
-				break;
-			}
-			clen -= CMSG_ALIGN(cm->cmsg_len);
-			cmsgs += CMSG_ALIGN(cm->cmsg_len);
-		} while (clen);
-	}
-#endif
 	/*
 	 * Calculate data length and get a mbuf
 	 * for UDP and IP headers.
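To make the bounds checks in the moved loop easier to audit in isolation, here is a userland sketch of the same walk over a flat control buffer. It is not the kernel code: find_cmsg() is a hypothetical helper, it uses a plain while loop (so an empty buffer is accepted) instead of the do/while above, and the fallback CMSG_ALIGN definition is an assumption used only where the system headers do not export that macro.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <stddef.h>

/* Assumption: only used when the system does not export CMSG_ALIGN. */
#ifndef CMSG_ALIGN
#define CMSG_ALIGN(n)	(((n) + sizeof(long) - 1) & ~(sizeof(long) - 1))
#endif

/*
 * Hypothetical helper mirroring the checks of the loop in the diff:
 * walk clen bytes of control data starting at cmsgs and look for a
 * message with the given level and type.
 *
 * Returns 0 and sets *out to the match (or to NULL when there is none),
 * or -1 when a header is truncated or claims more bytes than remain.
 * The buffer must be suitably aligned for struct cmsghdr.
 */
static int
find_cmsg(unsigned char *cmsgs, size_t clen, int level, int type,
    struct cmsghdr **out)
{
	struct cmsghdr *cm;

	*out = NULL;
	while (clen > 0) {
		/* Not even room for a bare header. */
		if (clen < CMSG_LEN(0))
			return (-1);
		cm = (struct cmsghdr *)cmsgs;
		/* Header shorter than itself, or longer than what is left. */
		if (cm->cmsg_len < CMSG_LEN(0) ||
		    CMSG_ALIGN(cm->cmsg_len) > clen)
			return (-1);
		if (cm->cmsg_level == level && cm->cmsg_type == type) {
			*out = cm;
			return (0);
		}
		clen -= CMSG_ALIGN(cm->cmsg_len);
		cmsgs += CMSG_ALIGN(cm->cmsg_len);
	}
	return (0);
}
```

Like the [1/2] loop, the sketch stops at the first match. The [2/2] diff drops the break so that several option types can be picked up in one pass.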
IP_SENDSRCADDR [2/2] : add cmsg support
This diff adds support for IP_SENDSRCADDR cmsg on UDP sockets. As for udp6_output(), we check that the source address+port is available only if inp_laddr != * Ok ? diff --git a/share/man/man4/ip.4 b/share/man/man4/ip.4 index 111432b..154b0d1 100644 --- a/share/man/man4/ip.4 +++ b/share/man/man4/ip.4 @@ -290,6 +290,27 @@ cmsg_len = CMSG_LEN(sizeof(u_int)) cmsg_level = IPPROTO_IP cmsg_type = IP_RECVRTABLE .Ed +.Pp +If the +.Dv IP_SENDSRCADDR +option is passed to a +.Xr sendmsg 2 +call on a +.Dv SOCK_DGRAM +socket, the address passed along the +.Vt cmsghdr +structure will be used as the source of the outgoing +.Tn UDP +datagram. The +.Vt cmsghdr +fields for +.Xr sendmsg 2 +have the following values: +.Bd -literal -offset indent +cmsg_len = CMSG_LEN(sizeof(struct in_addr)) +cmsg_level = IPPROTO_IP +cmsg_type = IP_SENDSRCADDR +.Ed .Ss "Multicast Options" .Tn IP multicasting is supported only on diff --git a/sys/netinet/in.h b/sys/netinet/in.h index adb1b30..bf8c95d 100644 --- a/sys/netinet/in.h +++ b/sys/netinet/in.h @@ -307,6 +307,7 @@ struct ip_opts { #define IP_RECVRTABLE 35 /* bool; receive rdomain w/dgram */ #define IP_IPSECFLOWINFO 36 /* bool; IPsec flow info for dgram */ #define IP_IPDEFTTL37 /* int; IP TTL system default */ +#define IP_SENDSRCADDR 38 /* struct in_addr; source address to use */ #define IP_RTABLE 0x1021 /* int; routing table, see SO_RTABLE */ #define IP_DIVERTFL0x1022 /* int; divert direction flag opt */ diff --git a/sys/netinet/udp_usrreq.c b/sys/netinet/udp_usrreq.c index 1feea11..35675b4 100644 --- a/sys/netinet/udp_usrreq.c +++ b/sys/netinet/udp_usrreq.c @@ -888,6 +888,7 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr, struct sockaddr_in *sin = NULL; struct udpiphdr *ui; u_int32_t ipsecflowinfo = 0; + struct sockaddr_in src_sin; int len = m->m_pkthdr.len; struct in_addr *laddr; int error = 0; @@ -906,6 +907,8 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr, goto release; } + memset(&src_sin, 0, 
sizeof(src_sin)); + if (control) { u_int clen; struct cmsghdr *cm; @@ -939,9 +942,16 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr, cm->cmsg_level == IPPROTO_IP && cm->cmsg_type == IP_IPSECFLOWINFO) { ipsecflowinfo = *(u_int32_t *)CMSG_DATA(cm); - break; - } + } else #endif + if (cm->cmsg_len == CMSG_LEN(sizeof(struct in_addr)) && + cm->cmsg_level == IPPROTO_IP && + cm->cmsg_type == IP_SENDSRCADDR) { + memcpy(&src_sin.sin_addr, CMSG_DATA(cm), + sizeof(struct in_addr)); + src_sin.sin_family = AF_INET; + src_sin.sin_len = sizeof(src_sin); + } clen -= CMSG_ALIGN(cm->cmsg_len); cmsgs += CMSG_ALIGN(cm->cmsg_len); } while (clen); @@ -980,6 +990,17 @@ udp_output(struct inpcb *inp, struct mbuf *m, struct mbuf *addr, if (error) goto release; } + + if (src_sin.sin_len > 0 && + src_sin.sin_addr.s_addr != INADDR_ANY && + src_sin.sin_addr.s_addr != inp->inp_laddr.s_addr) { + src_sin.sin_port = inp->inp_lport; + if (inp->inp_laddr.s_addr != INADDR_ANY && + (error = + in_pcbaddrisavail(inp, &src_sin, 0, curproc))) + goto release; + laddr = &src_sin.sin_addr; + } } else { if (inp->inp_faddr.s_addr == INADDR_ANY) { error = ENOTCONN;
Re: Set prio when bypassing pf(4)
On Wed, 8 Jun 2016 15:12:23 +0200 Martin Pieuchot wrote:
> On 07/06/16(Tue) 22:02, Stuart Henderson wrote:
> > On 2016/06/07 21:49, Vincent Gross wrote:
> > >
> > > It's how henning@ set things up when integrating the new queuing
> > > mechanism.
> > > http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/kern/uipc_mbuf.c#rev1.160
> > >
> > > > Is there any use for this apart for vlan(4) interfaces?
> > >
> > > AFAICT, no.
>
> In this case I'd suggest to make this a vlan(4) specific
> configuration, is there a problem with that?

Actually, there is. Consider this setup:

# ifconfig vlan4 vlan 4 vlandev em0 up
# ifconfig vlan5 vlan 4 vlandev em1 up
# ifconfig trunk0 trunkproto failover trunkport vlan4 trunkport vlan5 up
# ifconfig trunk0 10.10.10.50/24

Setting llprio on vlan4 or vlan5 is useless because they are not the
ones initiating ARP requests, and adding lookups in trunk would be the
Wrong Way. This particular example might seem far-fetched, but I'm sure
there are plenty worse actually deployed.

[...]

> > > I don't think we should make a special case for vlan(4), this
> > > kind of detail do not belong to the arp(4) or bpf(4) layer.
>
> Which kind detail are you talking about? I still don't understand how
> the scope if this problem is beyond vlan(4). Are you also interested
> in queue priority?
>

What I meant is that arp(4), bpf(4) and maybe other protocols should
not need to concern themselves with the nature of the device they are
transmitting on.

This problem manifests itself only on vlan(4) so far, because of the
CoS field in the 802.1Q header. But the problem is actually broader
than that: some network protocols completely bypass pf(4), and that can
have consequences as we go down the stack of network devices.
Re: Set prio when bypassing pf(4)
On Tue, 7 Jun 2016 10:48:22 +0200, Martin Pieuchot wrote:
> On 06/06/16(Mon) 23:52, Vincent Gross wrote:
> > On Mon, 6 Jun 2016 17:33:36 +0100
> > Stuart Henderson wrote:
> >
> > > On 2016/06/06 16:15, Vincent Gross wrote:
> > > > When sending ARP requests, or when writing to a bpf handle (as
> > > > when sending DHCP Discover), we bypass pf(4) so we have no way
> > > > to define the priority (m->m_pkthdr.pf.prio) of the outgoing
> > > > packets.
> > [...]
> > > >
> > > > This diff adds
> > > > 1) an if_llprio field to struct ifnet
> > >
> > > struct if_data.. this is used by enough ports that changing the
> > > abi
> > [...]
> > >
> > > > diff --git a/sbin/ifconfig/ifconfig.8
> > > > b/sbin/ifconfig/ifconfig.8
> > >
> > > BTW. patch warns about offsets if you apply this to -current.
> > >
> > [...]
> > >
> > > Other than these points, it seems a useful thing to do, pppoe
> > > could use it too.
> > >
> > > I wonder what these broken ISP devices are that require the
> > > priority field in the vlan frame header to be 0 (aka "prio 1")...
> > >
> >
> > r2 below. I moved if_llprio from if_data to struct ifnet, and went
> > from u_char to u_int8_t. I also added a bound check in ifioctl().
> >
> > Comments ? ok ?
>
> Could you explain me why our default prio is 3?
>

It's how henning@ set things up when integrating the new queuing
mechanism.
http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/kern/uipc_mbuf.c#rev1.160

> Is there any use for this apart for vlan(4) interfaces?

AFAICT, no.

> Should it
> really be part of "struct ifnet" ?
>

sthen@ pointed out that struct if_data was heavily used by our ports,
and that such a change would require a version bump. Now, I may have
overlooked a better place for it.

I don't think we should make a special case for vlan(4); this kind of
detail does not belong to the arp(4) or bpf(4) layer.

> I also find weird to see a field inside ``m_pkthdr.pf'' being used
> without pf(4).
> > > Index: sbin/ifconfig/ifconfig.8 > > === > > RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v > > retrieving revision 1.267 > > diff -u -p -r1.267 ifconfig.8 > > --- sbin/ifconfig/ifconfig.86 Apr 2016 10:07:14 > > - 1.267 +++ sbin/ifconfig/ifconfig.8 6 Jun 2016 > > 21:43:46 - @@ -327,6 +327,10 @@ Disable special processing at > > the link l Change the link layer address (MAC address) of the > > interface. This should be specified as six colon-separated hex > > values, or can be chosen randomly. > > +.It Cm llprio Ar prio > > +Set the priority for link layer communications > > +.Pf ( Xr arp 4 , > > +.Xr bpf 4 ) . > > .It Cm media Op Ar type > > Set the media type of the interface to > > .Ar type . > > Index: sbin/ifconfig/ifconfig.c > > === > > RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v > > retrieving revision 1.322 > > diff -u -p -r1.322 ifconfig.c > > --- sbin/ifconfig/ifconfig.c3 May 2016 17:52:33 > > - 1.322 +++ sbin/ifconfig/ifconfig.c 6 Jun 2016 > > 21:43:46 - @@ -135,6 +135,7 @@ char name[IFNAMSIZ]; > > intflags, xflags, setaddr, setipdst, doalias; > > u_long metric, mtu; > > intrdomainid; > > +intllprio; > > intclearaddr, s; > > intnewaddr = 0; > > intaf = AF_INET; > > @@ -157,6 +158,7 @@ voidaddaf(const char *, int); > > void removeaf(const char *, int); > > void setifbroadaddr(const char *, int); > > void setifmtu(const char *, int); > > +void setifllprio(const char *, int); > > void setifnwid(const char *, int); > > void setifbssid(const char *, int); > > void setifnwkey(const char *, int); > > @@ -521,6 +523,7 @@ const structcmd { > > { "instance", NEXTARG,A_MEDIAINST, > > setmediainst }, { "inst", NEXTARG, > > A_MEDIAINST,setmediainst }, { "lladdr", > > NEXTARG,0, setiflladdr }, > > + { "llprio", NEXTARG,0, > > setifllprio }, { NULL, /*src*/ 0, > > 0, setifaddr }, { NULL, /*dst*/ > > 0, 0, setifdstaddr }, > > { NULL, /*illegal*/0,
Re: Set prio when bypassing pf(4)
On Mon, 6 Jun 2016 17:33:36 +0100 Stuart Henderson wrote:
> On 2016/06/06 16:15, Vincent Gross wrote:
> > When sending ARP requests, or when writing to a bpf handle (as when
> > sending DHCP Discover), we bypass pf(4) so we have no way to define
> > the priority (m->m_pkthdr.pf.prio) of the outgoing packets.
[...]
> >
> > This diff adds
> > 1) an if_llprio field to struct ifnet
>
> struct if_data.. this is used by enough ports that changing the abi
[...]
>
> > diff --git a/sbin/ifconfig/ifconfig.8 b/sbin/ifconfig/ifconfig.8
>
> BTW. patch warns about offsets if you apply this to -current.
>
[...]
>
> Other than these points, it seems a useful thing to do, pppoe could
> use it too.
>
> I wonder what these broken ISP devices are that require the
> priority field in the vlan frame header to be 0 (aka "prio 1")...
>

r2 below. I moved if_llprio from if_data to struct ifnet, and went from
u_char to u_int8_t. I also added a bound check in ifioctl().

Comments ? ok ?

Index: sbin/ifconfig/ifconfig.8
===================================================================
RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v
retrieving revision 1.267
diff -u -p -r1.267 ifconfig.8
--- sbin/ifconfig/ifconfig.8	6 Apr 2016 10:07:14 -0000	1.267
+++ sbin/ifconfig/ifconfig.8	6 Jun 2016 21:43:46 -0000
@@ -327,6 +327,10 @@ Disable special processing at the link l
 Change the link layer address (MAC address) of the interface.
 This should be specified as six colon-separated hex values, or can
 be chosen randomly.
+.It Cm llprio Ar prio
+Set the priority for link layer communications
+.Pf ( Xr arp 4 ,
+.Xr bpf 4 ) .
 .It Cm media Op Ar type
 Set the media type of the interface to
 .Ar type .
Index: sbin/ifconfig/ifconfig.c === RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v retrieving revision 1.322 diff -u -p -r1.322 ifconfig.c --- sbin/ifconfig/ifconfig.c3 May 2016 17:52:33 - 1.322 +++ sbin/ifconfig/ifconfig.c6 Jun 2016 21:43:46 - @@ -135,6 +135,7 @@ charname[IFNAMSIZ]; intflags, xflags, setaddr, setipdst, doalias; u_long metric, mtu; intrdomainid; +intllprio; intclearaddr, s; intnewaddr = 0; intaf = AF_INET; @@ -157,6 +158,7 @@ voidaddaf(const char *, int); void removeaf(const char *, int); void setifbroadaddr(const char *, int); void setifmtu(const char *, int); +void setifllprio(const char *, int); void setifnwid(const char *, int); void setifbssid(const char *, int); void setifnwkey(const char *, int); @@ -521,6 +523,7 @@ const structcmd { { "instance", NEXTARG,A_MEDIAINST,setmediainst }, { "inst", NEXTARG,A_MEDIAINST,setmediainst }, { "lladdr", NEXTARG,0, setiflladdr }, + { "llprio", NEXTARG,0, setifllprio }, { NULL, /*src*/ 0, 0, setifaddr }, { NULL, /*dst*/ 0, 0, setifdstaddr }, { NULL, /*illegal*/0, 0, NULL }, @@ -854,6 +857,11 @@ getinfo(struct ifreq *ifr, int create) else rdomainid = ifr->ifr_rdomainid; #endif + if (ioctl(s, SIOCGIFLLPRIO, (caddr_t)ifr) < 0) + llprio = 0; + else + llprio = ifr->ifr_llprio; + return (0); } @@ -1411,6 +1419,21 @@ setifmtu(const char *val, int d) /* ARGSUSED */ void +setifllprio(const char *val, int d) +{ + const char *errmsg = NULL; + + (void) strlcpy(ifr.ifr_name, name, sizeof(ifr.ifr_name)); + + ifr.ifr_mtu = strtonum(val, 0, UCHAR_MAX, &errmsg); + if (errmsg) + errx(1, "mtu %s: %s", val, errmsg); + if (ioctl(s, SIOCSIFLLPRIO, (caddr_t)&ifr) < 0) + warn("SIOCSIFLLPRIO"); +} + +/* ARGSUSED */ +void setifgroup(const char *group_name, int dummy) { struct ifgroupreq ifgr; @@ -2894,6 +2917,7 @@ status(int link, struct sockaddr_dl *sdl printf(" metric %lu", metric); if (mtu) printf(" mtu %lu", mtu); + printf(" llprio %lu", llprio); putchar('\n'); #ifndef SMALL if (showcapsflag) Index: sys/net/bpf.c === RCS file: 
/cvs/src/sys/net/bpf.c,v retrieving revision 1.141 diff -u -p -r1.141 bpf.c --- sys/net/bpf.c 18 May 2016 03:46:03 - 1.141 +++ sys/net/bpf.c 6 Jun 2016 21:43:48 - @@ -561,6 +561,7 @@ bpfwrite(dev_t dev, struct uio *uio, int } m->m_pkthdr.ph_rtableid = ifp->if_rdomain; + m->m_pkthdr.pf.prio = ifp->if_llprio; if (d->bd_hdrcmplt && dst.ss
Set prio when bypassing pf(4)
When sending ARP requests, or when writing to a bpf handle (as when
sending DHCP Discover), we bypass pf(4) so we have no way to define the
priority (m->m_pkthdr.pf.prio) of the outgoing packets.

My ISP runs two vlans to separate the delivery of general-purpose
internet and TV/phone over fiber; on the internet vlan, any frame with
a priority different from 0 is dropped; because we use m_pkthdr.pf.prio
to define this priority, and the default priority IFQ_DEFPRIO == 3, all
of my ARP and DHCP frames are dropped when I use a stock OpenBSD kernel.

This diff adds
1) an if_llprio field to struct ifnet
2) the "llprio" keyword to ifconfig(8) and its manpage
3) code to init m_pkthdr.pf.prio from ifp->if_llprio when doing arp(4)
   and bpf(4)

Don't forget to install the new headers before rebuilding ifconfig(8).

Comments ?

diff --git a/sbin/ifconfig/ifconfig.8 b/sbin/ifconfig/ifconfig.8
index c301a90..1f42e41 100644
--- a/sbin/ifconfig/ifconfig.8
+++ b/sbin/ifconfig/ifconfig.8
@@ -327,6 +327,10 @@ Disable special processing at the link level with the specified interface.
 Change the link layer address (MAC address) of the interface.
 This should be specified as six colon-separated hex values, or can
 be chosen randomly.
+.It Cm llprio Ar prio
+Set the priority for link layer communications
+.Pf ( Xr arp 4 ,
+.Xr bpf 4 ) .
 .It Cm media Op Ar type
 Set the media type of the interface to
 .Ar type .
diff --git a/sbin/ifconfig/ifconfig.c b/sbin/ifconfig/ifconfig.c index c30ced5..c1e3594 100644 --- a/sbin/ifconfig/ifconfig.c +++ b/sbin/ifconfig/ifconfig.c @@ -135,6 +135,7 @@ charname[IFNAMSIZ]; intflags, xflags, setaddr, setipdst, doalias; u_long metric, mtu; intrdomainid; +intllprio; intclearaddr, s; intnewaddr = 0; intaf = AF_INET; @@ -157,6 +158,7 @@ voidaddaf(const char *, int); void removeaf(const char *, int); void setifbroadaddr(const char *, int); void setifmtu(const char *, int); +void setifllprio(const char *, int); void setifnwid(const char *, int); void setifbssid(const char *, int); void setifnwkey(const char *, int); @@ -521,6 +523,7 @@ const structcmd { { "instance", NEXTARG,A_MEDIAINST,setmediainst }, { "inst", NEXTARG,A_MEDIAINST,setmediainst }, { "lladdr", NEXTARG,0, setiflladdr }, + { "llprio", NEXTARG,0, setifllprio }, { NULL, /*src*/ 0, 0, setifaddr }, { NULL, /*dst*/ 0, 0, setifdstaddr }, { NULL, /*illegal*/0, 0, NULL }, @@ -854,6 +857,11 @@ getinfo(struct ifreq *ifr, int create) else rdomainid = ifr->ifr_rdomainid; #endif + if (ioctl(s, SIOCGIFLLPRIO, (caddr_t)ifr) < 0) + llprio = 0; + else + llprio = ifr->ifr_llprio; + return (0); } @@ -1411,6 +1419,21 @@ setifmtu(const char *val, int d) /* ARGSUSED */ void +setifllprio(const char *val, int d) +{ + const char *errmsg = NULL; + + (void) strlcpy(ifr.ifr_name, name, sizeof(ifr.ifr_name)); + + ifr.ifr_mtu = strtonum(val, 0, UCHAR_MAX, &errmsg); + if (errmsg) + errx(1, "mtu %s: %s", val, errmsg); + if (ioctl(s, SIOCSIFLLPRIO, (caddr_t)&ifr) < 0) + warn("SIOCSIFLLPRIO"); +} + +/* ARGSUSED */ +void setifgroup(const char *group_name, int dummy) { struct ifgroupreq ifgr; @@ -2894,6 +2917,7 @@ status(int link, struct sockaddr_dl *sdl, int ls) printf(" metric %lu", metric); if (mtu) printf(" mtu %lu", mtu); + printf(" llprio %lu", llprio); putchar('\n'); #ifndef SMALL if (showcapsflag) diff --git a/sys/net/bpf.c b/sys/net/bpf.c index 31b6ed0..d2f1060 100644 --- a/sys/net/bpf.c +++ b/sys/net/bpf.c @@ 
-561,6 +561,7 @@ bpfwrite(dev_t dev, struct uio *uio, int ioflag) } m->m_pkthdr.ph_rtableid = ifp->if_rdomain; + m->m_pkthdr.pf.prio = ifp->if_llprio; if (d->bd_hdrcmplt && dst.ss_family == AF_UNSPEC) dst.ss_family = pseudo_AF_HDRCMPLT; diff --git a/sys/net/if.c b/sys/net/if.c index 9b53bf1..e155b77 100644 --- a/sys/net/if.c +++ b/sys/net/if.c @@ -536,6 +536,7 @@ if_attach_common(struct ifnet *ifp) M_TEMP, M_WAITOK|M_ZERO); ifp->if_linkstatetask = malloc(sizeof(*ifp->if_linkstatetask), M_TEMP, M_WAITOK|M_ZERO); + ifp->if_llprio = IFQ_DEFPRIO; SRPL_INIT(&ifp->if_inputs); } @@ -1988,6 +1989,16 @@ ifioctl(struct socket *so, u_long cmd, caddr_t data, struct proc *p) ifnewlladdr(ifp); break; + case SIOCGIFLLPRIO: + ifr->ifr_llprio = ifp->if_llprio; + break; + + case SIOCSIFLLPRIO: + if ((error = suser(p, 0))) + return (error); + ifp->if_llprio = ifr->ifr_ll
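The mechanism the diff introduces (a per-interface default, a privileged setter, and stamping the value on packets that bypass pf(4)) can be modelled in a few lines of standalone C. This is only a sketch under assumptions: every toy_* name is hypothetical, LLPRIO_MAX = 7 assumes eight priority levels matching the 3-bit 802.1Q priority field, and returning EPERM for a non-root caller is a guess at what the suser() check yields; none of this is the kernel code.

```c
#include <errno.h>
#include <stdint.h>

#define LLPRIO_DEFAULT	3	/* mirrors IFQ_DEFPRIO == 3 from the mail */
#define LLPRIO_MAX	7	/* assumption: priorities 0..7 */

/* Hypothetical stand-ins for struct ifnet and the mbuf packet header. */
struct toy_ifnet {
	uint8_t	if_llprio;
};

struct toy_pkthdr {
	uint8_t	prio;
};

/* Like the if_attach_common() hunk: start from the default priority. */
static void
toy_if_attach(struct toy_ifnet *ifp)
{
	ifp->if_llprio = LLPRIO_DEFAULT;
}

/* Like the SIOCSIFLLPRIO case: privileged, and range-checked. */
static int
toy_set_llprio(struct toy_ifnet *ifp, int is_root, long prio)
{
	if (!is_root)
		return (EPERM);
	if (prio < 0 || prio > LLPRIO_MAX)
		return (EINVAL);
	ifp->if_llprio = (uint8_t)prio;
	return (0);
}

/* Like the bpfwrite() hunk: packets that bypass pf(4) get if_llprio. */
static void
toy_stamp(const struct toy_ifnet *ifp, struct toy_pkthdr *ph)
{
	ph->prio = ifp->if_llprio;
}
```

With this model, an interface keeps priority 3 until root lowers it, after which every ARP or bpf frame stamped through toy_stamp() carries the new value.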
Re: ifa_ifwithroute() fix
On Tue, 31 May 2016 09:51:10 +0200 Martin Pieuchot wrote:
> On 19/04/16(Tue) 10:43, Martin Pieuchot wrote:
> > Mart Tõnso reported [0] a weird case related to the use of
> > ifa_ifwithnet().
> >
> > The problem is that ifa_ifwithroute() does not always use route
> > entries but the poor's man routing table: ifa_ifwithnet(). This is
> > misleading because one cannot understand why "# route add" is not
> > coherent with "# route get".
> >
> > So I'd like to commit the diff below which always use the route
> > table unless an interface index is specified in the gateway. Mart
> > Tõnso confirmed it fixes his issue.
> >
> > ok?
>
> Anyone?

ok vgross@

> >
> > [0] https://marc.info/?l=openbsd-misc&m=146046751201006&w=2
> >
> >
> > Index: net/route.c
> > ===================================================================
> > RCS file: /cvs/src/sys/net/route.c,v
> > retrieving revision 1.298
> > diff -u -p -r1.298 route.c
> > --- net/route.c	26 Mar 2016 21:56:04 -0000	1.298
> > +++ net/route.c	13 Apr 2016 07:38:11 -0000
> > @@ -740,20 +740,16 @@ ifa_ifwithroute(int flags, struct sockad
> >  			ifa = ifaof_ifpforaddr(dst, ifp);
> >  			if_put(ifp);
> >  		} else {
> > -			ifa = ifa_ifwithnet(gateway, rtableid);
> > -		}
> > -	}
> > -	if (ifa == NULL) {
> > -		struct rtentry *rt = rtalloc(gateway, 0, rtableid);
> > -		/* The gateway must be local if the same address family. */
> > -		if (!rtisvalid(rt) || ((rt->rt_flags & RTF_GATEWAY) &&
> > -		    rt_key(rt)->sa_family == dst->sa_family)) {
> > +			struct rtentry *rt;
> > +
> > +			rt = rtalloc(gateway, RT_RESOLVE, rtableid);
> > +			if (rt != NULL)
> > +				ifa = rt->rt_ifa;
> >  			rtfree(rt);
> > -			return (NULL);
> >  		}
> > -		ifa = rt->rt_ifa;
> > -		rtfree(rt);
> >  	}
> > +	if (ifa == NULL)
> > +		return (NULL);
> >  	if (ifa->ifa_addr->sa_family != dst->sa_family) {
> >  		struct ifaddr *oifa = ifa;
> >  		ifa = ifaof_ifpforaddr(dst, ifa->ifa_ifp);
> >
>
Preserve DiffServ when fragmenting ipv4
When fragmenting ipv4, we do not preserve the DiffServ/ToS field. Here
is how to observe this:

[obsd1](vlan10) (vlan10)[obsd2](vlan20) --mtu600-- (vlan20)[obsd3]

root@obsd2 # sysctl net.inet.ip.forwarding=1
root@obsd2 # tcpdump -ni $VLAN20DEV
user@obsd3 $ nc -4ul
root@obsd1 $ echo "pass set prio 1" | pfctl -f -
user@obsd1 $ perl -e 'print "a"x800' | nc -4u $OBSD3VLAN20

tcpdump: listening on vio0, link-type EN10MB
11:34.26.588937 802.1Q vid 10 pri 0 10.10.0.10.45095 > 10.20.0.20.: udp 800
11:34.26.589121 802.1Q vid 20 pri 0 10.10.0.10.45095 > 10.20.0.20.: udp 800 (frag 26935:576@0+)
11:34.26.589152 802.1Q vid 20 pri 3 10.10.0.10 > 10.20.0.20: (frag 26935:232@576)

The first fragment keeps the priority of the original packet, while the
second one falls back to the default. The diff below ensures that all
fragments have the same priority as the original packet.

Ok ?

diff --git a/sys/netinet/ip_output.c b/sys/netinet/ip_output.c
index d0b15f8..5921566 100644
--- a/sys/netinet/ip_output.c
+++ b/sys/netinet/ip_output.c
@@ -678,9 +678,10 @@ ip_fragment(struct mbuf *m, struct ifnet *ifp, u_long mtu)
 		m->m_data += max_linkhdr;
 		mhip = mtod(m, struct ip *);
 		*mhip = *ip;
-		/* we must inherit MCAST and BCAST flags and routing table */
+		/* we must inherit MCAST/BCAST flags, routing table and prio */
 		m->m_flags |= m0->m_flags & (M_MCAST|M_BCAST);
 		m->m_pkthdr.ph_rtableid = m0->m_pkthdr.ph_rtableid;
+		m->m_pkthdr.pf.prio = m0->m_pkthdr.pf.prio;
 		if (hlen > sizeof (struct ip)) {
 			mhlen = ip_optcopy(ip, mhip) + sizeof (struct ip);
 			mhip->ip_hl = mhlen >> 2;
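To make the intent of the one-line fix concrete, here is a toy model of fragmentation in which every fragment inherits the per-packet metadata of the original. It is a sketch only: toy_pkt and toy_fragment() are hypothetical names, the model splits a byte count rather than real mbufs, and it ignores IP headers, options and the multiple-of-8 rule for fragment offsets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet descriptor: length plus inherited metadata. */
struct toy_pkt {
	size_t		len;
	unsigned int	rtableid;
	uint8_t		prio;
};

/*
 * Split m0 into chunks of at most chunk bytes, writing them to out[].
 * Returns the number of fragments, or -1 if chunk is 0 or out[] is
 * too small. Every fragment copies rtableid AND prio from m0, which
 * is what the added line in ip_fragment() is meant to guarantee.
 */
static int
toy_fragment(const struct toy_pkt *m0, size_t chunk,
    struct toy_pkt *out, size_t nout)
{
	size_t off, n = 0;

	if (chunk == 0)
		return (-1);
	for (off = 0; off < m0->len; off += chunk) {
		if (n == nout)
			return (-1);
		out[n].len = (m0->len - off < chunk) ? m0->len - off : chunk;
		out[n].rtableid = m0->rtableid;
		out[n].prio = m0->prio;
		n++;
	}
	return ((int)n);
}
```

Feeding it an 808-byte payload (800 bytes of data plus the 8-byte UDP header) with 576-byte chunks gives fragments of 576 and 232 bytes, the same sizes as in the tcpdump output, both carrying the original priority.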
Re: arm: new FDT-enabled mainbus
On Sun, 1 May 2016 13:27:29 +0200 Patrick Wildt wrote: > Hi, > > I updated the diff with the feedback received. This basically adds > a tree-like topology by making mainbus FDT aware and implementing > a simplebus that can span the tree's roots into more branches. > > Next steps (and diffs) are implementing an FDT platform for armv7, > similar to imx/omap/... and having the generic interrupt controller > and timer attach to a simplebus/fdt bus. > > Comments? Legacy boot works fine on my novena laptop, fdt boot fails, I lack expertise but it seems that cortex0 and its children were skipped. Thank you for pushing a better arm support :) Both output below : ## Booting kernel from Legacy Image at 1030 ... Image Name: boot Image Type: ARM Linux Kernel Image (uncompressed) Data Size:4993156 Bytes = 4.8 MiB Load Address: 1030 Entry Point: 1030 Verifying Checksum ... OK Loading Kernel Image ... OK Using machid 0x10ad from environment Starting kernel ... OpenBSD/imx booting ... arg0 0x0 arg1 0x10ad arg2 0x1100 atag core flags 0 pagesize 0 rootdev 0 atag cmdline [sd0i:/bsd.imx.umg] atag revision 00063012 atag mem start 0x1000 size 0xf000 bootfile: sd0i:/bsd.imx.umg bootargs: memory size derived from u-boot bootconf.mem[0].address = 1000 pages 983040/0xf000 Allocating page tables freestart = 0x107c4000, free_pages = 981051 (0x000ef83b) IRQ stack: p0x107f2000 v0xc07f2000 ABT stack: p0x107f3000 v0xc07f3000 UND stack: p0x107f4000 v0xc07f4000 SVC stack: p0x107f5000 v0xc07f5000 Creating L1 page table at 0x107c4000 Mapping kernel Constructing L2 page tables undefined page pmap [ using 715008 bytes of bsd ELF symbol table ] board type: 4269 Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved. Copyright (c) 1995-2016 OpenBSD. All rights reserved. 
http://www.OpenBSD.org OpenBSD 5.9-current (GENERIC) #31: Fri Apr 29 22:42:20 CET 2016 dermi...@russell.kilb.yt:/home/dermiste/OpenBSD/srcsys/arch/armv7/compile/GENERIC real mm = 4026527744 (3839MB) avail mem = 3940864000 (3758MB) warning: no entropy supplied by boot lader mainbus0 at root: no device tree cortex0 at mainbus0 ampintc0 at cortex0 nirq 160 amptimer0 at cortex0: tick rate 396000 KHz armliicc0 at cortex0: rtl 7 waymask: 0x000f cpu0 at mainbus0: ARM Cortex A9 R2 rev 10 (ARMv7 core) cpu0: DC enabled IC enabled WB disabled EABT branch prediction enabled cpu0: 32KB(32b/l,4way) I-cache, 32KB(32b/l,4way) wr-back D-cache imx0 at mainbus0: Kosagi Novena imxccm0 at imx0: imx6 rev 1.2 CPU freq: 792 MHz imxiomuxc0 at imx0 imxdog0 at imx0 imxocotp0 at imx0 imxuart0 at imx0 console imxgpio0 at imx0 imxgpio1 at imx0 imxgpio2 at imx0 imxgpio3 at imx0 imxgpio4 at imx0 imxgpio5 at imx0 imxgpio6 at imx0 imxesdhc0 at imx0 sdmmc0 at imxesdhc0 imxesdhc1 at imx0 sdmmc1 at imxesdhc1 ehci0 at imx0 usb0 at ehci0: USB revision 2.0 uhub0 at usb0 "i.MX6 EHCI root hub" rev 2.00/1.00 addr 1 imxenet0 at imx0 imxenet0: address 00:1f:11:02:17:de ukphy0 at imxenet0 phy 7: Generic IEEE 802.3u media interface, rev. 1: OUI 0x000885, model 0x0021 ahci0 at imx0 AHCI 1.3 ahci0: port 0: 3.0Gb/s scsibus0 at ahci0: 32 targets sd0 at scsibus0 targ 0 lun 0: SCSI3 0/direct fixed naa.5002538da0003f36 sd0: 238475MB, 512 bytes/sector, 488397168 sectors, thin scsibus1 at sdmmc1: 2 targets, initiator 0 sd1 at scsibus1 targ 1 lun 0: SCSI2 0/direct fixed sd1: 7600MB, 512 bytes/sector, 15564800 sectors uhub1 at uhub0 port 1 "Genesys Logic USB2.0 Hub Charger" rev 2.00/1.97 addr 2 axe0 at uhub1 port 2 configuration 1 interface 0 "ASIX Electronics AX88772B" rev 2.00/0.01 addr 3 axe0: AX88772B, address 00:0e:c6:87:72:01 ukphy1 at axe0 phy 16: Generic IEEE 802.3u media interface, rev. 
1: OUI 0x000ec6, model 0x0008 ugen0 at uhub1 port 3 "AsureWave product 0x3393" rev 1.10/0.01 addr 4 uhub2 at uhub1 port 4 "Genesys Logic USB2.0 Hub Charger" rev 2.00/1.97 addr 5 vscsi0 at root scsibus2 at vscsi0: 256 targets softraid0 at root scsibus3 at softraid0: 256 targets boot device: sd0 root on sd0a (46b8d5734c644ff3.a) swap on sd0b dump on sd0b ## Booting kernel from Legacy Image at 1030 ... Image Name: boot Image Type: ARM Linux Kernel Image (uncompressed) Data Size:4993156 Bytes = 4.8 MiB Load Address: 1030 Entry Point: 1030 Verifying Checksum ... OK ## Flattened Device Tree blob at 1010 Booting using the fdt blob at 0x1010 Loading Kernel Image ... OK reserving fdt memory region: addr=1010 size=b000 Using Device Tree in place at 1010, end 1010dfff Using machid 0x10ad from environment Starting kernel ... OpenBSD/imx booting ... arg0 0x0 arg1 0x10ad arg2 0x1010 Allocating page tables freestart = 0x107c4000, free_pages = 981051 (0x000ef83b) IRQ stack: p0x107f2000 v0xc07f2000 ABT stack: p0x107f3000 v0xc07f3000 UND stack: p0x107f4000 v0xc07f4000 SVC stack: p0x107f5000 v0xc07f5000 Creating L1 page table at 0x107c4000
Simplify in_pcblookup()
in_pcblookup() is always called with *:0 for the remote side. Remove the useless bits, shuffle the tests around and it's much easier to audit. Ok ? Index: netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.201 diff -u -p -r1.201 in_pcb.c --- netinet/in_pcb.c8 Apr 2016 14:34:21 - 1.201 +++ netinet/in_pcb.c9 Apr 2016 09:42:07 - @@ -415,14 +415,13 @@ in_pcbaddrisavail(struct inpcb *inp, str struct inpcb *t; if (so->so_euid) { - t = in_pcblookup(table, &zeroin_addr, 0, - &sin->sin_addr, lport, INPLOOKUP_WILDCARD, - inp->inp_rtableid); + t = in_pcblookup_local(table, &sin->sin_addr, lport, + INPLOOKUP_WILDCARD, inp->inp_rtableid); if (t && (so->so_euid != t->inp_socket->so_euid)) return (EADDRINUSE); } - t = in_pcblookup(table, &zeroin_addr, 0, - &sin->sin_addr, lport, wild, inp->inp_rtableid); + t = in_pcblookup_local(table, &sin->sin_addr, lport, + wild, inp->inp_rtableid); if (t && (reuseport & t->inp_socket->so_options) == 0) return (EADDRINUSE); } @@ -475,8 +474,8 @@ in_pcbpickport(u_int16_t *lport, void *l candidate = lower; localport = htons(candidate); } while (in_baddynamic(localport, so->so_proto->pr_protocol) || - in_pcblookup(table, &zeroin46_addr, 0, - laddr, localport, wild, inp->inp_rtableid)); + in_pcblookup_local(table, laddr, localport, wild, + inp->inp_rtableid)); *lport = localport; return (0); @@ -734,14 +733,14 @@ in_rtchange(struct inpcb *inp, int errno } struct inpcb * -in_pcblookup(struct inpcbtable *table, void *faddrp, u_int fport_arg, -void *laddrp, u_int lport_arg, int flags, u_int rdomain) +in_pcblookup_local(struct inpcbtable *table, void *laddrp, u_int lport_arg, +int flags, u_int rdomain) { struct inpcb *inp, *match = NULL; int matchwild = 3, wildcard; - u_int16_t fport = fport_arg, lport = lport_arg; - struct in_addr faddr = *(struct in_addr *)faddrp; + u_int16_t lport = lport_arg; struct in_addr laddr = *(struct in_addr *)laddrp; + struct in6_addr *laddr6 = (struct in6_addr *)laddrp; struct inpcbhead 
*head; rdomain = rtable_l2(rdomain); /* convert passed rtableid to rdomain */ @@ -753,60 +752,40 @@ in_pcblookup(struct inpcbtable *table, v continue; wildcard = 0; #ifdef INET6 - if (flags & INPLOOKUP_IPV6) { - struct in6_addr *laddr6 = (struct in6_addr *)laddrp; - struct in6_addr *faddr6 = (struct in6_addr *)faddrp; - - if (!(inp->inp_flags & INP_IPV6)) + if (ISSET(flags, INPLOOKUP_IPV6)) { + if (!ISSET(inp->inp_flags, INP_IPV6)) continue; - if (!IN6_IS_ADDR_UNSPECIFIED(&inp->inp_laddr6)) { - if (IN6_IS_ADDR_UNSPECIFIED(laddr6)) - wildcard++; - else if (!IN6_ARE_ADDR_EQUAL(&inp->inp_laddr6, laddr6)) - continue; - } else { - if (!IN6_IS_ADDR_UNSPECIFIED(laddr6)) - wildcard++; - } + if (!IN6_IS_ADDR_UNSPECIFIED(&inp->inp_faddr6)) + wildcard++; - if (!IN6_IS_ADDR_UNSPECIFIED(&inp->inp_faddr6)) { - if (IN6_IS_ADDR_UNSPECIFIED(faddr6)) + if (!IN6_ARE_ADDR_EQUAL(&inp->inp_laddr6, laddr6)) { + if (IN6_IS_ADDR_UNSPECIFIED(&inp->inp_laddr6) || + IN6_IS_ADDR_UNSPECIFIED(laddr6)) wildcard++; - else if (!IN6_ARE_ADDR_EQUAL(&inp->inp_faddr6, - faddr6) || inp->inp_fport != fport) + else continue; - } else { - if (!IN6_IS_ADDR_UNSPECIFIED(faddr6)) - wildcard++; } + } else #endif /* INET6 */ { #ifdef INET6 - if (inp->inp_flags & INP_IPV6) + if (ISSET(inp->inp_flags, INP_IPV6)) continue; #endif /* INET6 */ -
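The reworked IPv6 branch boils down to a small scoring rule, which may be easier to audit in isolation. The sketch below restates just that rule as a standalone function; toy_local_match() is a hypothetical name, memcmp() stands in for IN6_ARE_ADDR_EQUAL, and the reading that a lower score means a more specific match is inferred from the matchwild variable rather than shown in the excerpt.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Score how well a PCB (its local and foreign address) matches a
 * lookup for local address laddr, following the new IPv6 branch:
 *   - a PCB with a foreign address set counts as one wildcard,
 *   - a local address mismatch is tolerated only if one side is
 *     unspecified, and then counts as one more wildcard,
 *   - two different specific local addresses cannot match.
 * Returns -1 for "no match", else the wildcard count (0, 1 or 2).
 */
static int
toy_local_match(const struct in6_addr *pcb_laddr,
    const struct in6_addr *pcb_faddr, const struct in6_addr *laddr)
{
	int wildcard = 0;

	if (!IN6_IS_ADDR_UNSPECIFIED(pcb_faddr))
		wildcard++;

	if (memcmp(pcb_laddr, laddr, sizeof(*laddr)) != 0) {
		if (IN6_IS_ADDR_UNSPECIFIED(pcb_laddr) ||
		    IN6_IS_ADDR_UNSPECIFIED(laddr))
			wildcard++;
		else
			return (-1);
	}
	return (wildcard);
}
```

An unconnected socket bound to exactly the requested address scores 0, a socket bound to the unspecified address scores 1, and a socket bound to a different specific address is rejected.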
Remove long-dead and confusing code on rip6_ctlinput()
When using a raw ip6 socket, one can connect(2) then send(2), or just
sendto(2). The code below would try to find the non-connected raw ip6
socket corresponding to an incoming icmp6 message, to deliver the
failure.

This code has been disabled ever since it was put in-tree, justifiably
so because we are doing a wildcard socket search based on
barely-checked external input. Better to remove it altogether and
prevent future useless head-scratching.

Ok?

Index: netinet6/raw_ip6.c
===================================================================
RCS file: /cvs/src/sys/netinet6/raw_ip6.c,v
retrieving revision 1.89
diff -u -p -r1.89 raw_ip6.c
--- netinet6/raw_ip6.c	29 Mar 2016 11:57:51 -0000	1.89
+++ netinet6/raw_ip6.c	8 Apr 2016 17:55:24 -0000
@@ -285,21 +285,6 @@ rip6_ctlinput(int cmd, struct sockaddr *
 		 */
 		in6p = in6_pcbhashlookup(&rawin6pcbtable, &sa6->sin6_addr, 0,
 		    &sa6_src->sin6_addr, 0, rdomain);
-#if 0
-		if (!in6p) {
-			/*
-			 * As the use of sendto(2) is fairly popular,
-			 * we may want to allow non-connected pcb too.
-			 * But it could be too weak against attacks...
-			 * We should at least check if the local
-			 * address (= s) is really ours.
-			 */
-			in6p = in_pcblookup(&rawin6pcbtable, &sa6->sin6_addr, 0,
-			    (struct in6_addr *)&sa6_src->sin6_addr, 0,
-			    INPLOOKUP_WILDCARD | INPLOOKUP_IPV6,
-			    rdomain);
-		}
-#endif
 
 		if (in6p && in6p->inp_ipv6.ip6_nxt &&
 		    in6p->inp_ipv6.ip6_nxt == nxt)
Fix regression in in_pcbbind(), bug observable with INET6
The regression test in regress/sys/netinet6/autoport is failing because my merge of in_pcbbind() and in6_pcbbind() introduced a bug. Long story short, if nam == NULL, then you skip the part where you check if the socket is already bound based on inp_laddr/inp_laddr6. Also INPLOOKUP_IPV6 is not set when dealing with INET6 sockets. The diff below fixes that by moving the 'if (nam){}' inside the 'switch (sotopf(so)){}'. Ok ? Index: in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.200 diff -u -p -r1.200 in_pcb.c --- in_pcb.c5 Apr 2016 21:21:41 - 1.200 +++ in_pcb.c6 Apr 2016 15:03:50 - @@ -297,35 +297,36 @@ in_pcbbind(struct inpcb *inp, struct mbu (so->so_options & SO_ACCEPTCONN) == 0)) wild = INPLOOKUP_WILDCARD; - if (nam) { - switch (sotopf(so)) { + switch (sotopf(so)) { #ifdef INET6 - case PF_INET6: { - struct sockaddr_in6 *sin6; - if (TAILQ_EMPTY(&in6_ifaddr)) - return (EADDRNOTAVAIL); - if (!IN6_IS_ADDR_UNSPECIFIED(&inp->inp_laddr6)) - return (EINVAL); + case PF_INET6: + if (TAILQ_EMPTY(&in6_ifaddr)) + return (EADDRNOTAVAIL); + if (!IN6_IS_ADDR_UNSPECIFIED(&inp->inp_laddr6)) + return (EINVAL); + wild |= INPLOOKUP_IPV6; + if (nam) { + struct sockaddr_in6 *sin6; sin6 = mtod(nam, struct sockaddr_in6 *); if (nam->m_len != sizeof(struct sockaddr_in6)) return (EINVAL); if (sin6->sin6_family != AF_INET6) return (EAFNOSUPPORT); - wild |= INPLOOKUP_IPV6; if ((error = in6_pcbaddrisavail(inp, sin6, wild, p))) return (error); laddr = &sin6->sin6_addr; lport = sin6->sin6_port; - break; } + break; #endif - case PF_INET: { - struct sockaddr_in *sin; - if (inp->inp_laddr.s_addr != INADDR_ANY) - return (EINVAL); + case PF_INET: + if (inp->inp_laddr.s_addr != INADDR_ANY) + return (EINVAL); + if (nam) { + struct sockaddr_in *sin; sin = mtod(nam, struct sockaddr_in *); if (nam->m_len != sizeof(*sin)) return (EINVAL); @@ -336,11 +337,10 @@ in_pcbbind(struct inpcb *inp, struct mbu return (error); laddr = &sin->sin_addr; lport = sin->sin_port; - break; - } - 
default: - return (EINVAL); } + break; + default: + return (EINVAL); } if (lport == 0) {
new diff for reserved ports checks [2/2] Was: Re: move "privileged port" check out of in(6)_pcbaddrisavail()
On 03/31/16 14:07, Alexander Bluhm wrote: > On Wed, Mar 30, 2016 at 10:44:14PM +0200, Vincent Gross wrote: >> This diff moves the "are we binding to a privileged port while not being >> root ?" >> check from in(6)_pcbaddrisavail() to in_pcbbind(). > >> --- sys/netinet/in_pcb.c 26 Mar 2016 21:56:04 - 1.198 >> +++ sys/netinet/in_pcb.c 30 Mar 2016 20:33:00 - >> @@ -341,9 +341,14 @@ in_pcbbind(struct inpcb *inp, struct mbu >> } >> } >> >> -if (lport == 0) >> +if (lport == 0) { >> if ((error = in_pcbpickport(&lport, wild, inp, p))) >> return (error); >> +} else { >> +if (ntohs(lport) < IPPORT_RESERVED && >> +(error = suser(p, 0))) >> +return (EACCES); >> +} >> inp->inp_lport = lport; > > At this point inp has already been modified. So when we bail out > with EACCES here, we have a partially successful system call. > > Move the assignments > inp->inp_laddr6 = sin6->sin6_addr; > inp->inp_laddr = sin->sin_addr; > down after the return (EACCES). > > Looks like that return (error) was wrong before. diff --git a/sys/netinet/in_pcb.c b/sys/netinet/in_pcb.c index 1ff0056..63b3357 100644 --- a/sys/netinet/in_pcb.c +++ b/sys/netinet/in_pcb.c @@ -343,9 +343,22 @@ in_pcbbind(struct inpcb *inp, struct mbuf *nam, struct proc *p) } } - if (lport == 0) + if (lport == 0) { if ((error = in_pcbpickport(&lport, laddr, wild, inp, p))) return (error); + } else { + /* +* Question: Do we wish to continue the Berkeley +* tradition of ports < IPPORT_RESERVED be only for +* root? +* Answer: For now yes, but IMHO, it should be REMOVED! +* OUCH: One other thing, is there no better way of +* finding a process for a socket instead of using +* curproc? (Marked with BSD's {in,}famous XXX ? 
+*/ + if (ntohs(lport) < IPPORT_RESERVED && (error = suser(p, 0))) + return (EACCES); + } if (nam) { switch (sotopf(so)) { #ifdef INET6 @@ -371,7 +384,6 @@ in_pcbaddrisavail(struct inpcb *inp, struct sockaddr_in *sin, int wild, struct inpcbtable *table = inp->inp_table; u_int16_t lport = sin->sin_port; int reuseport = (so->so_options & SO_REUSEPORT); - int error; if (IN_MULTICAST(sin->sin_addr.s_addr)) { /* @@ -411,10 +423,6 @@ in_pcbaddrisavail(struct inpcb *inp, struct sockaddr_in *sin, int wild, if (lport) { struct inpcb *t; - /* GROSS */ - if (ntohs(lport) < IPPORT_RESERVED && - (error = suser(p, 0))) - return (EACCES); if (so->so_euid) { t = in_pcblookup(table, &zeroin_addr, 0, &sin->sin_addr, lport, INPLOOKUP_WILDCARD, diff --git a/sys/netinet6/in6_pcb.c b/sys/netinet6/in6_pcb.c index 4fde210..c11b936 100644 --- a/sys/netinet6/in6_pcb.c +++ b/sys/netinet6/in6_pcb.c @@ -158,7 +158,6 @@ in6_pcbaddrisavail(struct inpcb *inp, struct sockaddr_in6 *sin6, int wild, struct inpcbtable *table = inp->inp_table; u_short lport = sin6->sin6_port; int reuseport = (so->so_options & SO_REUSEPORT); - int error; wild |= INPLOOKUP_IPV6; /* KAME hack: embed scopeid */ @@ -217,17 +216,6 @@ in6_pcbaddrisavail(struct inpcb *inp, struct sockaddr_in6 *sin6, int wild, if (lport) { struct inpcb *t; - /* -* Question: Do we wish to continue the Berkeley -* tradition of ports < IPPORT_RESERVED be only for -* root? -* Answer: For now yes, but IMHO, it should be REMOVED! -* OUCH: One other thing, is there no better way of -* finding a process for a socket instead of using -* curproc? (Marked with BSD's {in,}famous XXX ? -*/ - if (ntohs(lport) < IPPORT_RESERVED && (error = suser(p, 0))) - return error; if (so->so_euid) { t = in_pcblookup(table, (struct in_addr *)&zeroin6_addr, 0,
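The two points of this thread (where the privileged-port check lives, and the review remark that the PCB must not be modified before the call can still fail) can be illustrated with a standalone model. This is a sketch under assumptions: toy_pcb, toy_check_reserved() and toy_bind() are hypothetical names, TOY_PORT_RESERVED mirrors the traditional IPPORT_RESERVED value of 1024, and the is_root flag stands in for the suser() call.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <stdint.h>

#define TOY_PORT_RESERVED 1024	/* mirrors IPPORT_RESERVED */

/* Hypothetical, heavily reduced protocol control block. */
struct toy_pcb {
	uint32_t	laddr;
	uint16_t	lport;	/* network byte order, 0 = unbound */
};

/* The check itself: ports below 1024 are refused to non-root callers. */
static int
toy_check_reserved(uint16_t lport_net, int is_root)
{
	if (lport_net == 0)
		return (0);	/* a port will be picked automatically */
	if (ntohs(lport_net) < TOY_PORT_RESERVED && !is_root)
		return (EACCES);
	return (0);
}

/*
 * Validate first, assign last: when the check fails the PCB is left
 * untouched, so the caller never sees a half-completed bind.
 */
static int
toy_bind(struct toy_pcb *inp, uint32_t laddr, uint16_t lport_net,
    int is_root)
{
	int error;

	if ((error = toy_check_reserved(lport_net, is_root)) != 0)
		return (error);
	inp->laddr = laddr;
	inp->lport = lport_net;
	return (0);
}
```

Keeping every assignment after the last possible error return is the ordering the review asks for: a failed bind then cannot leave a local address behind in the PCB.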
new diff for reserved ports checks [1/2] Was: Re: move "privileged port" check out of in(6)_pcbaddrisavail()
On 03/31/16 14:07, Alexander Bluhm wrote: > On Wed, Mar 30, 2016 at 10:44:14PM +0200, Vincent Gross wrote: >> This diff moves the "are we binding to a privileged port while not being >> root ?" >> check from in(6)_pcbaddrisavail() to in_pcbbind(). > >> --- sys/netinet/in_pcb.c 26 Mar 2016 21:56:04 - 1.198 >> +++ sys/netinet/in_pcb.c 30 Mar 2016 20:33:00 - >> @@ -341,9 +341,14 @@ in_pcbbind(struct inpcb *inp, struct mbu >> } >> } >> >> -if (lport == 0) >> +if (lport == 0) { >> if ((error = in_pcbpickport(&lport, wild, inp, p))) >> return (error); >> +} else { >> +if (ntohs(lport) < IPPORT_RESERVED && >> +(error = suser(p, 0))) >> +return (EACCES); >> +} >> inp->inp_lport = lport; > > At this point inp has already been modified. So when we bail out > with EACCES here, we have a partially successful system call. > > Move the assignments > inp->inp_laddr6 = sin6->sin6_addr; > inp->inp_laddr = sin->sin_addr; > down after the return (EACCES). > > Looks like that return (error) was wrong before. in_pcbpickport() need the local address, so I extend the prototype and keep a void * to the sin(6)_addr or zeroin46_addr. And while at it, I set the INPLOOKUP_IPV6 flag which will be needed in in_pcbpickport(). Ok ? 
Index: netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.198 diff -u -p -r1.198 in_pcb.c --- netinet/in_pcb.c26 Mar 2016 21:56:04 - 1.198 +++ netinet/in_pcb.c3 Apr 2016 19:16:37 - @@ -286,6 +286,7 @@ in_pcbbind(struct inpcb *inp, struct mbu struct socket *so = inp->inp_socket; u_int16_t lport = 0; int wild = 0; + void *laddr = &zeroin46_addr; int error; if (inp->inp_lport) @@ -312,9 +313,10 @@ in_pcbbind(struct inpcb *inp, struct mbu if (sin6->sin6_family != AF_INET6) return (EAFNOSUPPORT); + wild |= INPLOOKUP_IPV6; if ((error = in6_pcbaddrisavail(inp, sin6, wild, p))) return (error); - inp->inp_laddr6 = sin6->sin6_addr; + laddr = &sin6->sin6_addr; lport = sin6->sin6_port; break; } @@ -332,7 +334,7 @@ in_pcbbind(struct inpcb *inp, struct mbu if ((error = in_pcbaddrisavail(inp, sin, wild, p))) return (error); - inp->inp_laddr = sin->sin_addr; + laddr = &sin->sin_addr; lport = sin->sin_port; break; } @@ -342,8 +344,20 @@ in_pcbbind(struct inpcb *inp, struct mbu } if (lport == 0) - if ((error = in_pcbpickport(&lport, wild, inp, p))) + if ((error = in_pcbpickport(&lport, laddr, wild, inp, p))) return (error); + if (nam) { + switch (sotopf(so)) { +#ifdef INET6 + case PF_INET6: + inp->inp_laddr6 = *(struct in6_addr *)laddr; + break; +#endif + case PF_INET: + inp->inp_laddr = *(struct in_addr *)laddr; + break; + } + } inp->inp_lport = lport; in_pcbrehash(inp); return (0); @@ -418,12 +432,12 @@ in_pcbaddrisavail(struct inpcb *inp, str } int -in_pcbpickport(u_int16_t *lport, int wild, struct inpcb *inp, struct proc *p) +in_pcbpickport(u_int16_t *lport, void *laddr, int wild, struct inpcb *inp, +struct proc *p) { struct socket *so = inp->inp_socket; struct inpcbtable *table = inp->inp_table; u_int16_t first, last, lower, higher, candidate, localport; - void *laddr; int count; if (inp->inp_flags & INP_HIGHPORT) { @@ -453,10 +467,6 @@ in_pcbpickport(u_int16_t *lport, int wil count = higher - lower; candidate = lower + 
arc4random_uniform(count); - if (sotopf(so) == PF_INET6) - laddr = &inp->inp_laddr6; - else - laddr = &inp->inp_laddr; do { if (count-- < 0)/* completely used? */ Index: netinet/in_pcb.h === RCS file: /cvs/src/sys/netinet/in_pcb.h,v retrieving revision 1.96 diff -u -p -r1.96 in_pcb.h --- netinet/in_pcb.h
move "privileged port" check out of in(6)_pcbaddrisavail()
Hello, This diff moves the "are we binding to a privileged port while not being root ?" check from in(6)_pcbaddrisavail() to in_pcbbind(). This way we have a cleaner separation between "is the resource available ?" and "am I allowed to access the resource ?" (which may or may not get its own function later). Also, it unbreaks naddy@'s iked setup (ikev2:sendmsg([::]:500) => in6_selectsrc() != in6p->inp_laddr6 => in6_pcbaddrisavail() => EPERM). Ok ? Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.198 diff -u -p -r1.198 in_pcb.c --- sys/netinet/in_pcb.c26 Mar 2016 21:56:04 - 1.198 +++ sys/netinet/in_pcb.c30 Mar 2016 20:33:00 - @@ -341,9 +341,14 @@ in_pcbbind(struct inpcb *inp, struct mbu } } - if (lport == 0) + if (lport == 0) { if ((error = in_pcbpickport(&lport, wild, inp, p))) return (error); + } else { + if (ntohs(lport) < IPPORT_RESERVED && + (error = suser(p, 0))) + return (EACCES); + } inp->inp_lport = lport; in_pcbrehash(inp); return (0); @@ -357,7 +362,6 @@ in_pcbaddrisavail(struct inpcb *inp, str struct inpcbtable *table = inp->inp_table; u_int16_t lport = sin->sin_port; int reuseport = (so->so_options & SO_REUSEPORT); - int error; if (IN_MULTICAST(sin->sin_addr.s_addr)) { /* @@ -398,9 +402,6 @@ in_pcbaddrisavail(struct inpcb *inp, str struct inpcb *t; /* GROSS */ - if (ntohs(lport) < IPPORT_RESERVED && - (error = suser(p, 0))) - return (EACCES); if (so->so_euid) { t = in_pcblookup(table, &zeroin_addr, 0, &sin->sin_addr, lport, INPLOOKUP_WILDCARD, Index: sys/netinet6/in6_pcb.c === RCS file: /cvs/src/sys/netinet6/in6_pcb.c,v retrieving revision 1.90 diff -u -p -r1.90 in6_pcb.c --- sys/netinet6/in6_pcb.c 30 Mar 2016 13:02:22 - 1.90 +++ sys/netinet6/in6_pcb.c 30 Mar 2016 20:33:01 - @@ -158,7 +158,6 @@ in6_pcbaddrisavail(struct inpcb *inp, st struct inpcbtable *table = inp->inp_table; u_short lport = sin6->sin6_port; int reuseport = (so->so_options & SO_REUSEPORT); - int error; wild |= INPLOOKUP_IPV6; /* KAME hack: 
embed scopeid */ @@ -226,8 +225,6 @@ in6_pcbaddrisavail(struct inpcb *inp, st * finding a process for a socket instead of using * curproc? (Marked with BSD's {in,}famous XXX ? */ - if (ntohs(lport) < IPPORT_RESERVED && (error = suser(p, 0))) - return error; if (so->so_euid) { t = in_pcblookup(table, (struct in_addr *)&zeroin6_addr, 0,
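To illustrate the separation argued for above, here is a minimal userland sketch, not the kernel code. The names port_is_available(), port_is_permitted() and bind_port_sketch() are hypothetical, ports are taken in host byte order, the is_root flag stands in for the suser() call, and port 0 ("pick one for me") is not modeled.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PORT_RESERVED 1024	/* same value as IPPORT_RESERVED */

/* "Is the resource available?": nobody else holds the port. */
bool
port_is_available(uint16_t port, const uint16_t *in_use, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (in_use[i] == port)
			return (false);
	return (true);
}

/* "Am I allowed to access the resource?": low ports are root-only. */
bool
port_is_permitted(uint16_t port, bool is_root)
{
	return (port >= SKETCH_PORT_RESERVED || is_root);
}

/*
 * Hypothetical caller: availability is checked first, permission second,
 * mirroring the order the diff gives in_pcbbind().
 */
int
bind_port_sketch(uint16_t port, bool is_root, const uint16_t *in_use, size_t n)
{
	if (!port_is_available(port, in_use, n))
		return (EADDRINUSE);
	if (!port_is_permitted(port, is_root))
		return (EACCES);
	return (0);
}
```

With the two questions in separate functions, a caller that only wants to know whether an address/port pair is free can ask that without tripping over the privilege check.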
use fast lookup in in6_pcbconnect()
The current use of in_pcblookup() in in6_pcbconnect() is suboptimal: all of
the addresses and ports are defined, we are only interested in exact matches,
and its v4 cousin in_pcbconnect() already uses in_pcbhashlookup().

Ok ?

Index: sys/netinet6/in6_pcb.c
===
RCS file: /cvs/src/sys/netinet6/in6_pcb.c,v
retrieving revision 1.89
diff -u -p -r1.89 in6_pcb.c
--- sys/netinet6/in6_pcb.c	23 Mar 2016 15:50:36 -	1.89
+++ sys/netinet6/in6_pcb.c	23 Mar 2016 17:09:11 -
@@ -304,9 +304,9 @@ in6_pcbconnect(struct inpcb *inp, struct
 	inp->inp_ipv6.ip6_hlim = (u_int8_t)in6_selecthlim(inp);
 
-	if (in_pcblookup(inp->inp_table, &sin6->sin6_addr, sin6->sin6_port,
+	if (in6_pcbhashlookup(inp->inp_table, &sin6->sin6_addr, sin6->sin6_port,
 	    IN6_IS_ADDR_UNSPECIFIED(&inp->inp_laddr6) ? in6a : &inp->inp_laddr6,
-	    inp->inp_lport, INPLOOKUP_IPV6, inp->inp_rtableid)) {
+	    inp->inp_lport, inp->inp_rtableid)) {
 		return (EADDRINUSE);
 	}
merge in_ and in6_pcbbind(), introduce in(6)_pcbaddrisavail()
in_pcbbind and in6_pcbbind have a lot in common, the only meaningful differences are in the checks done to ensure a sockaddr is available. This diff splits theses checks in their own functions, and merge the remaining code in one single function. Aside from being easier to read, it also makes it very easy to check sockaddr availability without actually binding. Tested on my own laptop for the last ten days ; no regression observed with regress/sys/netinet/in_pcbbind. Ok ? Index: netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.195 diff -u -p -r1.195 in_pcb.c --- netinet/in_pcb.c18 Dec 2015 22:25:16 - 1.195 +++ netinet/in_pcb.c23 Dec 2015 08:07:14 - @@ -284,91 +284,129 @@ int in_pcbbind(struct inpcb *inp, struct mbuf *nam, struct proc *p) { struct socket *so = inp->inp_socket; - struct inpcbtable *table = inp->inp_table; - struct sockaddr_in *sin; u_int16_t lport = 0; - int wild = 0, reuseport = (so->so_options & SO_REUSEPORT); + int wild = 0; int error; -#ifdef INET6 - if (sotopf(so) == PF_INET6) - return in6_pcbbind(inp, nam, p); -#endif /* INET6 */ - - if (inp->inp_lport || inp->inp_laddr.s_addr != INADDR_ANY) + if (inp->inp_lport != 0) return (EINVAL); + if ((so->so_options & (SO_REUSEADDR|SO_REUSEPORT)) == 0 && ((so->so_proto->pr_flags & PR_CONNREQUIRED) == 0 || (so->so_options & SO_ACCEPTCONN) == 0)) wild = INPLOOKUP_WILDCARD; + if (nam) { - sin = mtod(nam, struct sockaddr_in *); - if (nam->m_len != sizeof(*sin)) + switch (sotopf(so)) { +#ifdef INET6 + case PF_INET6: { + struct sockaddr_in6 *sin6; + + if (TAILQ_EMPTY(&in6_ifaddr)) + return (EADDRNOTAVAIL); + if (!IN6_IS_ADDR_UNSPECIFIED(&inp->inp_laddr6)) + return (EINVAL); + + sin6 = mtod(nam, struct sockaddr_in6 *); + if (nam->m_len != sizeof(*sin6)) + return (EINVAL); + if (sin6->sin6_family != AF_INET6) + return (EAFNOSUPPORT); + if ((error = in6_pcbaddrisavail(inp, sin6, wild, p))) + return (error); + inp->inp_laddr6 = sin6->sin6_addr; + lport = sin6->sin6_port; + 
break; + } +#endif + case PF_INET: { + struct sockaddr_in *sin; + + if (inp->inp_laddr.s_addr != INADDR_ANY) + return (EINVAL); + + sin = mtod(nam, struct sockaddr_in *); + if (nam->m_len != sizeof(*sin)) + return (EINVAL); + if (sin->sin_family != AF_INET) + return (EAFNOSUPPORT); + if ((error = in_pcbaddrisavail(inp, sin, wild, p))) + return (error); + inp->inp_laddr = sin->sin_addr; + lport = sin->sin_port; + break; + } + default: return (EINVAL); + } + } + + if (lport == 0) + if ((error = in_pcbpickport(&lport, wild, inp, p))) + return (error); + inp->inp_lport = lport; + in_pcbrehash(inp); + return (0); +} + +int +in_pcbaddrisavail(struct inpcb *inp, struct sockaddr_in *sin, int wild, +struct proc *p) +{ + struct socket *so = inp->inp_socket; + struct inpcbtable *table = inp->inp_table; + u_int16_t lport = sin->sin_port; + int reuseport = (so->so_options & SO_REUSEPORT); + int error; - if (sin->sin_family != AF_INET) - return (EAFNOSUPPORT); + if (IN_MULTICAST(sin->sin_addr.s_addr)) { + /* +* Treat SO_REUSEADDR as SO_REUSEPORT for multicast; +* allow complete duplication of binding if +* SO_REUSEPORT is set, or if SO_REUSEADDR is set +* and a multicast address is bound on both +* new and duplicated sockets. +*/ + if (so->so_options & (SO_REUSEADDR|SO_REUSEPORT)) + reuseport = SO_REUSEADDR|SO_REUSEPORT; + } else if (sin->sin_addr.s_addr != INADDR_ANY) { + + if ((so->so_options & SO_BINDANY) == 0 || + (so->so_type != SOCK_DGRAM) || + (sin->sin_addr.s_addr != INADDR_BROADCAST && +!in_broadcast(sin->sin_addr, inp->inp_rtableid))
Re: "Adding" the same IPv6 address twice
On 12/21/15 11:36, Martin Pieuchot wrote:
> Currently if you try to configure the same IPv6 address twice via the
> SIOCAIFADDR_IN6 ioctl(2) the kernel will return EEXIST and the address
> will be unset:
>
> # ifconfig vether0 inet6 2001::1
> # ifconfig vether0 inet6 2001::1
> ifconfig: SIOCAIFADDR: File exists
>
> Diff below fixes that by not inserting the local route if we're "just"
> updating an existing address. sebastia@ confirmed it fixes his use
> case, so I'm looking for oks.
>
> Index: netinet6/in6.c
> ===
> RCS file: /cvs/src/sys/netinet6/in6.c,v
> retrieving revision 1.181
> diff -u -p -r1.181 in6.c
> --- netinet6/in6.c	3 Dec 2015 13:13:42 -	1.181
> +++ netinet6/in6.c	18 Dec 2015 09:27:18 -
[...]
> @@ -454,12 +454,15 @@ in6_control(struct socket *so, u_long cm
>  			return (EINVAL);
>  		}
>  
> +		if (ia6 == NULL)
> +			newifaddr = 1;
> +
>  		/*
>  		 * Make the address tentative before joining multicast
>  		 * addresses, so that corresponding MLD responses would
>  		 * not have a tentative source address.
>  		 */
> -		if ((ia6 == NULL) && in6if_do_dad(ifp))
> +		if (newifaddr && in6if_do_dad(ifp))
>  			ifra->ifra_flags |= IN6_IFF_TENTATIVE;
>  
>  		/*
[...]
> @@ -489,6 +493,9 @@ in6_control(struct socket *so, u_long cm
>  		/* Perform DAD, if needed. */
>  		if (ia6->ia6_flags & IN6_IFF_TENTATIVE)
>  			nd6_dad_start(&ia6->ia_ifa);
> +
> +		if (!newifaddr)
> +			break;
>  
>  		plen = in6_mask2len(&ia6->ia_prefixmask.sin6_addr, NULL);
>  		if ((ifp->if_flags & IFF_LOOPBACK) || plen == 128) {
>

The "if (!newifaddr)" should be moved above the "if (IN6_IFF_TENTATIVE)";
this way the skipping of DAD when !newifaddr is more explicit.
Re: Add SO_REUSEADDR when binding SO_REUSEPORT socket to multicast address
On 12/09/15 16:49, Vincent Gross wrote: > in_pcbbind and in6_pcbbind both extends SO_REUSEADDR for multicast > addresses so that it turns into a SO_REUSEPORT. But the check is done > in such a way that you cannot bind a SO_REUSEPORT-enabled socket to a > multicast address *after* you bound a SO_REUSEADDR-enabled socket to > the same address. > > *But:* due to how the struct in_pcb are handled, if you : > 1) bind a SO_REUSEADDR-enabled socket to a multicast address, > 2) then bind a SO_REUSEADDR|SO_REUSEPORT-enabled socket to the same address, > as a result you can now bind a SO_REUSEPORT-enabled socket to this address. > > The regress test in regress/sys/netinet/in_pcbbind reproduce this behaviour > (be sure to get v1.2 for Makefile and runtest.c) > > This diff allow SO_REUSEPORT-only socket to be bound after SO_REUSEADDR-only. > > ok ? Diff kindly tested by ajacoutot@ on avahi, "still working". Ok ? > > Index: netinet/in_pcb.c > === > RCS file: /cvs/src/sys/netinet/in_pcb.c,v > retrieving revision 1.194 > diff -u -p -r1.194 in_pcb.c > --- netinet/in_pcb.c 3 Dec 2015 21:57:59 - 1.194 > +++ netinet/in_pcb.c 9 Dec 2015 15:22:16 - > @@ -318,7 +318,7 @@ in_pcbbind(struct inpcb *inp, struct mbu >* and a multicast address is bound on both >* new and duplicated sockets. >*/ > - if (so->so_options & SO_REUSEADDR) > + if (so->so_options & (SO_REUSEADDR|SO_REUSEPORT)) > reuseport = SO_REUSEADDR|SO_REUSEPORT; > } else if (sin->sin_addr.s_addr != INADDR_ANY) { > sin->sin_port = 0; /* yech... */ > Index: netinet6/in6_pcb.c > === > RCS file: /cvs/src/sys/netinet6/in6_pcb.c,v > retrieving revision 1.83 > diff -u -p -r1.83 in6_pcb.c > --- netinet6/in6_pcb.c2 Dec 2015 22:13:44 - 1.83 > +++ netinet6/in6_pcb.c9 Dec 2015 15:22:16 - > @@ -214,7 +214,7 @@ in6_pcbbind(struct inpcb *inp, struct mb >* and a multicast address is bound on both >* new and duplicated sockets. 
>*/ > - if (so->so_options & SO_REUSEADDR) > + if (so->so_options & (SO_REUSEADDR|SO_REUSEPORT)) > reuseport = SO_REUSEADDR | SO_REUSEPORT; > } else if (!IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) { > struct ifaddr *ifa = NULL; >
Add SO_REUSEADDR when binding SO_REUSEPORT socket to multicast address
in_pcbbind() and in6_pcbbind() both extend SO_REUSEADDR for multicast
addresses so that it acts as SO_REUSEPORT. But the check is done in such a
way that you cannot bind a SO_REUSEPORT-enabled socket to a multicast
address *after* you have bound a SO_REUSEADDR-enabled socket to the same
address.

*But:* due to how the struct inpcb are handled, if you:
1) bind a SO_REUSEADDR-enabled socket to a multicast address,
2) then bind a SO_REUSEADDR|SO_REUSEPORT-enabled socket to the same address,
you can then bind a SO_REUSEPORT-only socket to this address.

The regress test in regress/sys/netinet/in_pcbbind reproduces this behaviour
(be sure to get v1.2 of Makefile and runtest.c).

This diff allows a SO_REUSEPORT-only socket to be bound after a
SO_REUSEADDR-only one.

ok ?

Index: netinet/in_pcb.c
===
RCS file: /cvs/src/sys/netinet/in_pcb.c,v
retrieving revision 1.194
diff -u -p -r1.194 in_pcb.c
--- netinet/in_pcb.c	3 Dec 2015 21:57:59 -	1.194
+++ netinet/in_pcb.c	9 Dec 2015 15:22:16 -
@@ -318,7 +318,7 @@ in_pcbbind(struct inpcb *inp, struct mbu
 			 * and a multicast address is bound on both
 			 * new and duplicated sockets.
 			 */
-			if (so->so_options & SO_REUSEADDR)
+			if (so->so_options & (SO_REUSEADDR|SO_REUSEPORT))
 				reuseport = SO_REUSEADDR|SO_REUSEPORT;
 		} else if (sin->sin_addr.s_addr != INADDR_ANY) {
 			sin->sin_port = 0;		/* yech... */
Index: netinet6/in6_pcb.c
===
RCS file: /cvs/src/sys/netinet6/in6_pcb.c,v
retrieving revision 1.83
diff -u -p -r1.83 in6_pcb.c
--- netinet6/in6_pcb.c	2 Dec 2015 22:13:44 -	1.83
+++ netinet6/in6_pcb.c	9 Dec 2015 15:22:16 -
@@ -214,7 +214,7 @@ in6_pcbbind(struct inpcb *inp, struct mb
 			 * and a multicast address is bound on both
 			 * new and duplicated sockets.
 			 */
-			if (so->so_options & SO_REUSEADDR)
+			if (so->so_options & (SO_REUSEADDR|SO_REUSEPORT))
 				reuseport = SO_REUSEADDR | SO_REUSEPORT;
 		} else if (!IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) {
 			struct ifaddr *ifa = NULL;
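The one-line change can be sketched in userland as follows. This only models how the reuseport mask is computed for a multicast bind, not the later lookup against already-bound sockets. The SK_* constants and both function names are hypothetical; the constants mirror the BSD bit-flag values on purpose, because on some other systems SO_REUSEADDR and SO_REUSEPORT are option numbers rather than bits.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's so_options bits (BSD values). */
#define SK_REUSEADDR	0x0004
#define SK_REUSEPORT	0x0200

/* Before the diff: only SO_REUSEADDR is widened for multicast. */
int
reuse_mask_before(int so_options, bool multicast)
{
	int reuseport = so_options & SK_REUSEPORT;

	if (multicast && (so_options & SK_REUSEADDR))
		reuseport = SK_REUSEADDR | SK_REUSEPORT;
	return (reuseport);
}

/* After the diff: either option yields the full mask for multicast. */
int
reuse_mask_after(int so_options, bool multicast)
{
	int reuseport = so_options & SK_REUSEPORT;

	if (multicast && (so_options & (SK_REUSEADDR | SK_REUSEPORT)))
		reuseport = SK_REUSEADDR | SK_REUSEPORT;
	return (reuseport);
}
```

For a SO_REUSEPORT-only socket on a multicast address, the first function returns just SK_REUSEPORT while the second returns both bits, which is the asymmetry the diff removes. Unicast binds are unaffected in both versions.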
Re: Do not pass NULL to rtdeletemsg()
On 12/07/15 14:57, Martin Pieuchot wrote: > If the interface is gone that means you're dealing with a cached route > so there's no need to try to remove it from the table. > > Better be explicit and do that before calling rtdeletemsg() rather than > inside. > > ok? ok vgross@ > > Index: netinet/ip_icmp.c > === > RCS file: /cvs/src/sys/netinet/ip_icmp.c,v > retrieving revision 1.150 > diff -u -p -r1.150 ip_icmp.c > --- netinet/ip_icmp.c 3 Dec 2015 21:11:53 - 1.150 > +++ netinet/ip_icmp.c 7 Dec 2015 12:40:06 - > @@ -1042,19 +1042,21 @@ icmp_mtudisc(struct icmp *icp, u_int rta > void > icmp_mtudisc_timeout(struct rtentry *rt, struct rttimer *r) > { > - if (rt == NULL) > - panic("icmp_mtudisc_timeout: bad route to timeout"); > + struct ifnet *ifp; > + int s; > > - if ((rt->rt_flags & (RTF_DYNAMIC | RTF_HOST)) == > - (RTF_DYNAMIC | RTF_HOST)) { > + ifp = if_get(rt->rt_ifidx); > + if (ifp == NULL) > + return; > + > + if ((rt->rt_flags & (RTF_DYNAMIC|RTF_HOST)) == (RTF_DYNAMIC|RTF_HOST)) { > void *(*ctlfunc)(int, struct sockaddr *, u_int, void *); > struct sockaddr_in sin; > - int s; > > sin = *satosin(rt_key(rt)); > > s = splsoftnet(); > - rtdeletemsg(rt, NULL, r->rtt_tableid); > + rtdeletemsg(rt, ifp, r->rtt_tableid); > > /* Notify TCP layer of increased Path MTU estimate */ > ctlfunc = inetsw[ip_protox[IPPROTO_TCP]].pr_ctlinput; > @@ -1062,9 +1064,12 @@ icmp_mtudisc_timeout(struct rtentry *rt, > (*ctlfunc)(PRC_MTUINC, sintosa(&sin), > r->rtt_tableid, NULL); > splx(s); > - } else > + } else { > if ((rt->rt_rmx.rmx_locks & RTV_MTU) == 0) > rt->rt_rmx.rmx_mtu = 0; > + } > + > + if_put(ifp); > } > > /* > @@ -1088,17 +1093,20 @@ icmp_ratelimit(const struct in_addr *dst > void > icmp_redirect_timeout(struct rtentry *rt, struct rttimer *r) > { > - if (rt == NULL) > - panic("icmp_redirect_timeout: bad route to timeout"); > + struct ifnet *ifp; > + int s; > > - if ((rt->rt_flags & (RTF_DYNAMIC | RTF_HOST)) == > - (RTF_DYNAMIC | RTF_HOST)) { > - int s; > + ifp = 
if_get(rt->rt_ifidx); > + if (ifp == NULL) > + return; > > + if ((rt->rt_flags & (RTF_DYNAMIC|RTF_HOST)) == (RTF_DYNAMIC|RTF_HOST)) { > s = splsoftnet(); > - rtdeletemsg(rt, NULL, r->rtt_tableid); > + rtdeletemsg(rt, ifp, r->rtt_tableid); > splx(s); > } > + > + if_put(ifp); > } > > int > Index: netinet6/icmp6.c > === > RCS file: /cvs/src/sys/netinet6/icmp6.c,v > retrieving revision 1.182 > diff -u -p -r1.182 icmp6.c > --- netinet6/icmp6.c 3 Dec 2015 21:11:53 - 1.182 > +++ netinet6/icmp6.c 7 Dec 2015 12:39:28 - > @@ -1952,34 +1952,42 @@ icmp6_mtudisc_clone(struct sockaddr *dst > void > icmp6_mtudisc_timeout(struct rtentry *rt, struct rttimer *r) > { > - if (rt == NULL) > - panic("icmp6_mtudisc_timeout: bad route to timeout"); > - if ((rt->rt_flags & (RTF_DYNAMIC | RTF_HOST)) == > - (RTF_DYNAMIC | RTF_HOST)) { > - int s; > + struct ifnet *ifp; > + int s; > > + ifp = if_get(rt->rt_ifidx); > + if (ifp == NULL) > + return; > + > + if ((rt->rt_flags & (RTF_DYNAMIC|RTF_HOST)) == (RTF_DYNAMIC|RTF_HOST)) { > s = splsoftnet(); > - rtdeletemsg(rt, NULL, r->rtt_tableid); > + rtdeletemsg(rt, ifp, r->rtt_tableid); > splx(s); > } else { > if (!(rt->rt_rmx.rmx_locks & RTV_MTU)) > rt->rt_rmx.rmx_mtu = 0; > } > + > + if_put(ifp); > } > > void > icmp6_redirect_timeout(struct rtentry *rt, struct rttimer *r) > { > - if (rt == NULL) > - panic("icmp6_redirect_timeout: bad route to timeout"); > - if ((rt->rt_flags & (RTF_GATEWAY | RTF_DYNAMIC | RTF_HOST)) == > - (RTF_GATEWAY | RTF_DYNAMIC | RTF_HOST)) { > - int s; > + struct ifnet *ifp; > + int s; > > + ifp = if_get(rt->rt_ifidx); > + if (ifp == NULL) > + return; > + > + if ((rt->rt_flags & (RTF_DYNAMIC|RTF_HOST)) == (RTF_DYNAMIC|RTF_HOST)) { > s = splsoftnet(); > - rtdeletemsg(rt, NULL, r->rtt_tableid); > + rtdeletemsg(rt, ifp, r->rtt_tableid); > splx(s); > } > + > + if_put(ifp); > } > > int *icmpv6ctl_vars[ICMPV6CTL_MAXID] = ICMPV6CTL_VARS; >
simplify in6_selectsrc() logic
in6_selectsrc() uses two different rtalloc calls depending on whether or not the destination address is multicast or not, but there is nothing to explain why. I dug a bit and found this commit from itojun@ : diff -u -r1.6 -r1.7 --- src/sys/netinet6/in6_src.c 2000/06/18 04:49:32 1.6 +++ src/sys/netinet6/in6_src.c 2000/06/18 17:02:59 1.7 @@ -244,7 +244,11 @@ ro->ro_dst.sin6_family = AF_INET6; ro->ro_dst.sin6_len = sizeof(struct sockaddr_in6); ro->ro_dst.sin6_addr = *dst; - if (!IN6_IS_ADDR_MULTICAST(dst)) { + ro->ro_dst.sin6_scope_id = dstsock->sin6_scope_id; + if (IN6_IS_ADDR_MULTICAST(dst)) { + ro->ro_rt = rtalloc1(&((struct route *)ro) +->ro_dst, 0); + } else { rtalloc((struct route *)ro); } } Below are rtalloc() and rtalloc1() from sys/net/route.c r1.19 commited on 05/21/2000 : > void > rtalloc(ro) > register struct route *ro; > { > if (ro->ro_rt && ro->ro_rt->rt_ifp && (ro->ro_rt->rt_flags & RTF_UP)) > return; /* XXX */ > ro->ro_rt = rtalloc1(&ro->ro_dst, 1); > } > > struct rtentry * > rtalloc1(dst, report) > register struct sockaddr *dst; > int report; > { [...] > /* >* IP encapsulation does lots of lookups where we don't need nor want >* the RTM_MISSes that would be generated. It causes RTM_MISS storms >* sent upward breaking user-level routing queries. >*/ > miss: if (report && dst->sa_family != PF_KEY) { > bzero((caddr_t)&info, sizeof(info)); > info.rti_info[RTAX_DST] = dst; > rt_missmsg(msgtype, &info, 0, err); > } > } > splx(s); > return (newrt); > } So this if(MULTICAST) has been introduced to prevent RTM_MISS storms when looking up routes to multicast addresses ; multicast and unicast route lookups are the same. Also, rtalloc(foo, RT_RESOLVE, bar) and rtalloc_mpath(foo, NULL, bar) are both equivalent to _rtalloc(foo, NULL, RT_RESOLVE, bar). Let's remove this if(MULTICAST), it's just confusing. ok ? 
Index: sys/netinet6/in6_src.c === RCS file: /cvs/src/sys/netinet6/in6_src.c,v retrieving revision 1.71 diff -u -p -r1.71 in6_src.c --- sys/netinet6/in6_src.c 2 Dec 2015 13:29:26 - 1.71 +++ sys/netinet6/in6_src.c 5 Dec 2015 12:03:48 - @@ -240,13 +240,8 @@ in6_selectsrc(struct in6_addr **in6src, sa6->sin6_len = sizeof(struct sockaddr_in6); sa6->sin6_addr = *dst; sa6->sin6_scope_id = dstsock->sin6_scope_id; - if (IN6_IS_ADDR_MULTICAST(dst)) { - ro->ro_rt = rtalloc(sin6tosa(&ro->ro_dst), - RT_RESOLVE, ro->ro_tableid); - } else { - ro->ro_rt = rtalloc_mpath(sin6tosa(&ro->ro_dst), - NULL, ro->ro_tableid); - } + ro->ro_rt = rtalloc(sin6tosa(&ro->ro_dst), + RT_RESOLVE, ro->ro_tableid); } /*
Re: explicitly check broadcast addresses on some ifa_ifwithaddr() uses
On 12/03/15 10:21, Vincent Gross wrote: > On 12/02/15 20:06, Martin Pieuchot wrote: >> On 02/12/15(Wed) 16:18, Vincent Gross wrote: >>> When fed a broadcast address, ifa_ifwitaddr() returns the unicast ifa >>> whose broadcast address match the input. This is used mainly to select >>> ifa, and there can be trouble when you have 2 ifas on the same range >>> (e.g. 10.0.0.1/24@em0 & 10.0.0.20/24@em1) : >>> >>> netinet/ip_mroute.c:814 >>> net/route.c:785 >>> netinet/ip_divert.c:143 >>> net/if_vxlan.c:241 >>> >>> There are also places where broadcast addresses should not be tolerated : >>> >>> netinet/ip_input.c:1061 broadcast address is not a module identifier >>> netinet/ip_input.c:1141 see above >>> netinet/ip_input.c:1197 see above >>> netinet6/*: no broadcast in ipv6 >>> net/route.c:562: gateway shall never be a broadcast addr >>> net/route.c:713: see above >>> >>> This diff removes broadcast matching from ifa_ifwithaddr, and >>> adds or rewrites checks where necessary. >>> >>> Comments ? Ok ? >> >> Looks good to me. Some nits below. > > Nits applied. > > Anyone else ? bluhm@ spotted one case where in_broadcast was needed. ok ? 
Index: sys/net/if.c === RCS file: /cvs/src/sys/net/if.c,v retrieving revision 1.418 diff -u -p -r1.418 if.c --- sys/net/if.c3 Dec 2015 12:22:51 - 1.418 +++ sys/net/if.c3 Dec 2015 13:48:58 - @@ -1220,13 +1220,6 @@ ifa_ifwithaddr(struct sockaddr *addr, u_ if (equal(addr, ifa->ifa_addr)) return (ifa); - - /* IPv6 doesn't have broadcast */ - if ((ifp->if_flags & IFF_BROADCAST) && - ifa->ifa_broadaddr && - ifa->ifa_broadaddr->sa_len != 0 && - equal(ifa->ifa_broadaddr, addr)) - return (ifa); } } return (NULL); Index: sys/net/route.c === RCS file: /cvs/src/sys/net/route.c,v retrieving revision 1.283 diff -u -p -r1.283 route.c --- sys/net/route.c 2 Dec 2015 16:49:58 - 1.283 +++ sys/net/route.c 3 Dec 2015 13:49:00 - @@ -539,7 +539,9 @@ rtredirect(struct sockaddr *dst, struct bcmp((caddr_t)(a1), (caddr_t)(a2), (a1)->sa_len) == 0) if (rt != NULL && (!equal(src, rt->rt_gateway) || rt->rt_ifa != ifa)) error = EINVAL; - else if (ifa_ifwithaddr(gateway, rdomain) != NULL) + else if (ifa_ifwithaddr(gateway, rdomain) != NULL || + (gateway->sa_family = AF_INET && + in_broadcast(satosin(gateway)->sin_addr, rdomain))) error = EHOSTUNREACH; if (error) goto done; Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.191 diff -u -p -r1.191 in_pcb.c --- sys/netinet/in_pcb.c3 Dec 2015 09:49:15 - 1.191 +++ sys/netinet/in_pcb.c3 Dec 2015 13:49:00 - @@ -332,14 +332,13 @@ in_pcbbind(struct inpcb *inp, struct mbu ia = ifatoia(ifa_ifwithaddr(sintosa(sin), inp->inp_rtableid)); - if (ia == NULL) - return (EADDRNOTAVAIL); /* SOCK_RAW does not use in_pcbbind() */ - if (so->so_type != SOCK_DGRAM && - sin->sin_addr.s_addr != - ia->ia_addr.sin_addr.s_addr) - return (EADDRNOTAVAIL); + if (ia == NULL && + (so->so_type != SOCK_DGRAM || + !in_broadcast(sin->sin_addr, + inp->inp_rtableid))) + return (EADDRNOTAVAIL); } } if (lport) { @@ -353,7 +352,8 @@ in_pcbbind(struct inpcb *inp, struct mbu t = in_pcblookup(table, &zeroin_addr, 0, &sin->sin_addr, lport, 
INPLOOKUP_WILDCARD, inp->inp_rtableid); - if (t && (so->so_euid != t->inp_socket->so_euid)) +
Re: explicitly check broadcast addresses on some ifa_ifwithaddr() uses
On 12/02/15 20:06, Martin Pieuchot wrote: > On 02/12/15(Wed) 16:18, Vincent Gross wrote: >> When fed a broadcast address, ifa_ifwitaddr() returns the unicast ifa >> whose broadcast address match the input. This is used mainly to select >> ifa, and there can be trouble when you have 2 ifas on the same range >> (e.g. 10.0.0.1/24@em0 & 10.0.0.20/24@em1) : >> >> netinet/ip_mroute.c:814 >> net/route.c:785 >> netinet/ip_divert.c:143 >> net/if_vxlan.c:241 >> >> There are also places where broadcast addresses should not be tolerated : >> >> netinet/ip_input.c:1061 broadcast address is not a module identifier >> netinet/ip_input.c:1141 see above >> netinet/ip_input.c:1197 see above >> netinet6/*: no broadcast in ipv6 >> net/route.c:562: gateway shall never be a broadcast addr >> net/route.c:713: see above >> >> This diff removes broadcast matching from ifa_ifwithaddr, and >> adds or rewrites checks where necessary. >> >> Comments ? Ok ? > > Looks good to me. Some nits below. Nits applied. Anyone else ? 
Index: sys/net/if.c === RCS file: /cvs/src/sys/net/if.c,v retrieving revision 1.417 diff -u -p -r1.417 if.c --- sys/net/if.c2 Dec 2015 16:35:52 - 1.417 +++ sys/net/if.c3 Dec 2015 07:59:53 - @@ -1179,13 +1179,6 @@ ifa_ifwithaddr(struct sockaddr *addr, u_ if (equal(addr, ifa->ifa_addr)) return (ifa); - - /* IPv6 doesn't have broadcast */ - if ((ifp->if_flags & IFF_BROADCAST) && - ifa->ifa_broadaddr && - ifa->ifa_broadaddr->sa_len != 0 && - equal(ifa->ifa_broadaddr, addr)) - return (ifa); } } return (NULL); Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.190 diff -u -p -r1.190 in_pcb.c --- sys/netinet/in_pcb.c2 Dec 2015 22:13:44 - 1.190 +++ sys/netinet/in_pcb.c3 Dec 2015 07:59:53 - @@ -332,14 +332,13 @@ in_pcbbind(struct inpcb *inp, struct mbu ia = ifatoia(ifa_ifwithaddr(sintosa(sin), inp->inp_rtableid)); - if (ia == NULL) - return (EADDRNOTAVAIL); /* SOCK_RAW does not use in_pcbbind() */ - if (so->so_type != SOCK_DGRAM && - sin->sin_addr.s_addr != - ia->ia_addr.sin_addr.s_addr) - return (EADDRNOTAVAIL); + if (ia == NULL && + (so->so_type != SOCK_DGRAM || + !in_broadcast(sin->sin_addr, + inp->inp_rtableid))) + return (EADDRNOTAVAIL); } } if (lport) { @@ -353,7 +352,8 @@ in_pcbbind(struct inpcb *inp, struct mbu t = in_pcblookup(table, &zeroin_addr, 0, &sin->sin_addr, lport, INPLOOKUP_WILDCARD, inp->inp_rtableid); - if (t && (so->so_euid != t->inp_socket->so_euid)) + if (t && + (so->so_euid != t->inp_socket->so_euid)) return (EADDRINUSE); } t = in_pcblookup(table, &zeroin_addr, 0, Index: sys/netinet/ip_output.c === RCS file: /cvs/src/sys/netinet/ip_output.c,v retrieving revision 1.311 diff -u -p -r1.311 ip_output.c --- sys/netinet/ip_output.c 2 Dec 2015 20:50:20 - 1.311 +++ sys/netinet/ip_output.c 3 Dec 2015 07:59:53 - @@ -1368,13 +1368,12 @@ ip_setmoptions(int optname, struct ip_mo sin.sin_family = AF_INET; sin.sin_addr = addr; ia = ifatoia(ifa_ifwithaddr(sintosa(&sin), rtableid)); - if (ia && in_hosteq(sin.sin_addr, 
ia->ia_addr.sin_addr)) - ifp = ia->ia_ifp; - if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0) { + if (ia == NULL || + (ia->ia_ifp->if_flags
explicitly check broadcast addresses on some ifa_ifwithaddr() uses
When fed a broadcast address, ifa_ifwithaddr() returns the unicast ifa
whose broadcast address matches the input. This is used mainly to select
an ifa, and there can be trouble when you have 2 ifas on the same range
(e.g. 10.0.0.1/24@em0 & 10.0.0.20/24@em1):

netinet/ip_mroute.c:814
net/route.c:785
netinet/ip_divert.c:143
net/if_vxlan.c:241

There are also places where broadcast addresses should not be tolerated:

netinet/ip_input.c:1061	broadcast address is not a module identifier
netinet/ip_input.c:1141	see above
netinet/ip_input.c:1197	see above
netinet6/*:		no broadcast in ipv6
net/route.c:562:	gateway shall never be a broadcast addr
net/route.c:713:	see above

This diff removes broadcast matching from ifa_ifwithaddr(), and
adds or rewrites checks where necessary.

Comments ? Ok ?

Index: sys/net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.416
diff -u -p -r1.416 if.c
--- sys/net/if.c	2 Dec 2015 08:47:00 -	1.416
+++ sys/net/if.c	2 Dec 2015 15:17:26 -
@@ -1178,13 +1178,6 @@ ifa_ifwithaddr(struct sockaddr *addr, u_
 			if (equal(addr, ifa->ifa_addr))
 				return (ifa);
-
-			/* IPv6 doesn't have broadcast */
-			if ((ifp->if_flags & IFF_BROADCAST) &&
-			    ifa->ifa_broadaddr &&
-			    ifa->ifa_broadaddr->sa_len != 0 &&
-			    equal(ifa->ifa_broadaddr, addr))
-				return (ifa);
 		}
 	}
 	return (NULL);
Index: sys/netinet/in_pcb.c
===
RCS file: /cvs/src/sys/netinet/in_pcb.c,v
retrieving revision 1.188
diff -u -p -r1.188 in_pcb.c
--- sys/netinet/in_pcb.c	30 Oct 2015 09:39:42 -	1.188
+++ sys/netinet/in_pcb.c	2 Dec 2015 15:17:26 -
@@ -328,14 +328,12 @@ in_pcbbind(struct inpcb *inp, struct mbu
 			ia = ifatoia(ifa_ifwithaddr(sintosa(sin),
 			    inp->inp_rtableid));
-			if (ia == NULL)
-				return (EADDRNOTAVAIL);
 
 			/* SOCK_RAW does not use in_pcbbind() */
-			if (so->so_type != SOCK_DGRAM &&
-			    sin->sin_addr.s_addr !=
-			    ia->ia_addr.sin_addr.s_addr)
-				return (EADDRNOTAVAIL);
+			if (ia == NULL &&
+			    (so->so_type != SOCK_DGRAM ||
+			    !in_broadcast(sin->sin_addr, inp->inp_rtableid)))
+				return (EADDRNOTAVAIL);
 		}
 	}
 	if (lport) {
Index: sys/netinet/ip_output.c === RCS file: /cvs/src/sys/netinet/ip_output.c,v retrieving revision 1.310 diff -u -p -r1.310 ip_output.c --- sys/netinet/ip_output.c 2 Dec 2015 13:29:26 - 1.310 +++ sys/netinet/ip_output.c 2 Dec 2015 15:17:27 - @@ -1387,9 +1387,8 @@ ip_setmoptions(int optname, struct ip_mo sin.sin_family = AF_INET; sin.sin_addr = addr; ia = ifatoia(ifa_ifwithaddr(sintosa(&sin), rtableid)); - if (ia && in_hosteq(sin.sin_addr, ia->ia_addr.sin_addr)) - ifp = ia->ia_ifp; - if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0) { + if (ia == NULL || (ifp = ia->ia_ifp) == NULL || + (ia->ia_ifp->if_flags & IFF_MULTICAST) == 0) { error = EADDRNOTAVAIL; break; } @@ -1561,12 +1560,11 @@ ip_setmoptions(int optname, struct ip_mo sin.sin_family = AF_INET; sin.sin_addr = mreq->imr_interface; ia = ifatoia(ifa_ifwithaddr(sintosa(&sin), rtableid)); - if (ia && in_hosteq(sin.sin_addr, ia->ia_addr.sin_addr)) - ifp = ia->ia_ifp; - else { + if (ia == NULL) { error = EADDRNOTAVAIL; break; } + ifp = ia->ia_ifp; } /* * Find the membership in the membership array. Index: sys/netinet/raw_ip.c === RCS file: /cvs/src/sys/netinet/raw_ip.c,v retrieving revision 1.84 diff -u -p -r1.84 raw_ip.c --- sys/netinet/raw_ip.c28 Jul 2015 12:22:07 - 1.84 +++ sys/netinet/raw_ip.c2 Dec 2015 15:17:27 - @@ -473,6 +473,7 @@ rip_usrre
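The ambiguity with two ifas on the same range can be sketched in userland like this. It is a toy model, not the kernel code: struct ifa_sketch, the sketch_ifas table and both lookup functions are hypothetical, and addresses are plain host-order integers.

```c
#include <stddef.h>
#include <stdint.h>

struct ifa_sketch {
	const char *ifname;
	uint32_t addr;		/* unicast address, host byte order */
	uint32_t broadaddr;	/* broadcast address, host byte order */
};

/* 10.0.0.1/24 on em0 and 10.0.0.20/24 on em1 share 10.0.0.255. */
const struct ifa_sketch sketch_ifas[] = {
	{ "em0", 0x0a000001, 0x0a0000ff },
	{ "em1", 0x0a000014, 0x0a0000ff },
};
#define SKETCH_NIFAS (sizeof(sketch_ifas) / sizeof(sketch_ifas[0]))

/* Old behaviour: a broadcast address matches the first ifa carrying it. */
const struct ifa_sketch *
lookup_with_broadcast(uint32_t addr)
{
	size_t i;

	for (i = 0; i < SKETCH_NIFAS; i++)
		if (sketch_ifas[i].addr == addr ||
		    sketch_ifas[i].broadaddr == addr)
			return (&sketch_ifas[i]);
	return (NULL);
}

/* New behaviour: only the unicast address of an ifa matches. */
const struct ifa_sketch *
lookup_unicast_only(uint32_t addr)
{
	size_t i;

	for (i = 0; i < SKETCH_NIFAS; i++)
		if (sketch_ifas[i].addr == addr)
			return (&sketch_ifas[i]);
	return (NULL);
}
```

Looking up 10.0.0.255 with the first function always yields the em0 entry simply because it comes first in the table, while the second function returns NULL and leaves the broadcast decision to an explicit check in the caller, such as in_broadcast() in the diff.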
rewrite if_ifwithaddr() to use rtalloc(9)
regress/sys/net/rdomains still passes with this diff. Ok ? Index: net/if.c === RCS file: /cvs/src/sys/net/if.c,v retrieving revision 1.398 diff -u -p -r1.398 if.c --- net/if.c25 Oct 2015 21:58:04 - 1.398 +++ net/if.c26 Oct 2015 09:44:10 - @@ -1143,31 +1143,19 @@ if_congested(void) struct ifaddr * ifa_ifwithaddr(struct sockaddr *addr, u_int rtableid) { - struct ifnet *ifp; struct ifaddr *ifa; + struct rtentry *rt; u_int rdomain; + /* +* Local routes corresponding to ifas are in rdomain's +* default rtable. +*/ rdomain = rtable_l2(rtableid); - TAILQ_FOREACH(ifp, &ifnet, if_list) { - if (ifp->if_rdomain != rdomain) - continue; - - TAILQ_FOREACH(ifa, &ifp->if_addrlist, ifa_list) { - if (ifa->ifa_addr->sa_family != addr->sa_family) - continue; - - if (equal(addr, ifa->ifa_addr)) - return (ifa); - - /* IPv6 doesn't have broadcast */ - if ((ifp->if_flags & IFF_BROADCAST) && - ifa->ifa_broadaddr && - ifa->ifa_broadaddr->sa_len != 0 && - equal(ifa->ifa_broadaddr, addr)) - return (ifa); - } - } - return (NULL); + rt = rtalloc(addr, 0, rdomain); + ifa = rt && (rt->rt_flags & RTF_LOCAL) ? rt->rt_ifa : NULL; + rtfree(rt); + return ifa; } /*
Re: dedup in_pcbbind() port scan loop
On 10/07/15 14:05, Martin Pieuchot wrote:
> On 01/10/15(Thu) 19:40, Vincent Gross wrote:
>> Although the sysctls controlling the port range are labelled
>> "port(hi)?first" and "port(hi)?last", no ordering is enforced and you
>> can have portfirst > portlast. in_pcbbind() (and in6_pcbsetport())
>> work around this by duplicating the loop looking for an available port.
>
> What about a small regress test?
>
>> This diff introduce temporary bounds and compare them to guarantee that
>> first <= last, thus allowing deduplication of the port scan loop.
>
> Makes sense to me, I'd keep the comment though.
>
>> Tested on my laptop with a narrow port range and heavy cheezburger
>> browsing, no fault detected. Deeper testing welcome.
>>
>> Should I include in6_pcbsetport() changes right now or should ipv4 be
>> validated first ?
>
> I prefer when both version are keep in sync, so yes a in6_pcbsetport()
> diff would be nice. Plus if it's possible to have a regress test it
> would be awesome.
>

Ok ?

Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.181 diff -u -p -r1.181 in_pcb.c --- sys/netinet/in_pcb.c9 Oct 2015 01:10:27 - 1.181 +++ sys/netinet/in_pcb.c15 Oct 2015 12:08:06 - @@ -360,67 +360,47 @@ in_pcbbind(struct inpcb *inp, struct mbu inp->inp_laddr = sin->sin_addr; } if (lport == 0) { - u_int16_t first, last; + u_int16_t bound_a, bound_b, first, last; int count; if (inp->inp_flags & INP_HIGHPORT) { - first = ipport_hifirstauto; /* sysctl */ - last = ipport_hilastauto; + bound_a = ipport_hifirstauto; /* sysctl */ + bound_b = ipport_hilastauto; } else if (inp->inp_flags & INP_LOWPORT) { if ((error = suser(p, 0))) return (EACCES); - first = IPPORT_RESERVED-1; /* 1023 */ - last = 600;/* not IPPORT_RESERVED/2 */ + bound_a = IPPORT_RESERVED-1; /* 1023 */ + bound_b = 600; /* not IPPORT_RESERVED/2 */ } else { - first = ipport_firstauto; /* sysctl */ - last = ipport_lastauto; + bound_a = ipport_firstauto; /* sysctl */ + bound_b = ipport_lastauto; + } + if (bound_a < bound_b) { + first = bound_a; + last = bound_b; + } else { + first = bound_b; + last = bound_a; } /* * Simple check to ensure all ports are not used up causing * a deadlock here. -* -* We split the two cases (up and down) so that the direction -* is not being tested on each round of the loop. */ - if (first > last) { - /* -* counting down -*/ - count = first - last; - if (count) - lastport = first - arc4random_uniform(count); + count = last - first; + lastport = first + arc4random_uniform(count); - do { - if (count-- < 0)/* completely used? 
*/ - return (EADDRNOTAVAIL); - --lastport; - if (lastport > first || lastport < last) - lastport = first; - lport = htons(lastport); - } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || - in_pcblookup(table, &zeroin_addr, 0, - &inp->inp_laddr, lport, wild, inp->inp_rtableid)); - } else { - /* -* counting up -*/ - count = last - first; - if (count) - lastport = first + arc4random_uniform(count); - - do { - if (count-- < 0)/* completely used? */ - return (EADDRNOTAVAIL); - ++lastport; -
dedup in_pcbbind() port scan loop
Although the sysctls controlling the port range are labelled
"port(hi)?first" and "port(hi)?last", no ordering is enforced and you can
have portfirst > portlast. in_pcbbind() (and in6_pcbsetport()) work around
this by duplicating the loop looking for an available port.

This diff introduces temporary bounds and compares them to guarantee that
first <= last, thus allowing deduplication of the port scan loop.

Tested on my laptop with a narrow port range and heavy cheezburger
browsing, no fault detected. Deeper testing welcome.

Should I include in6_pcbsetport() changes right now or should ipv4 be
validated first ?

--
Vincent Gross

Index: netinet/in_pcb.c
===
RCS file: /cvs/src/sys/netinet/in_pcb.c,v
retrieving revision 1.180
diff -u -p -r1.180 in_pcb.c
--- netinet/in_pcb.c 22 Sep 2015 09:34:38 - 1.180
+++ netinet/in_pcb.c 1 Oct 2015 09:47:16 -
@@ -360,67 +360,43 @@ in_pcbbind(struct inpcb *inp, struct mbu
 		inp->inp_laddr = sin->sin_addr;
 	}
 	if (lport == 0) {
-		u_int16_t first, last;
+		u_int16_t bound_a, bound_b, first, last;
 		int count;

 		if (inp->inp_flags & INP_HIGHPORT) {
-			first = ipport_hifirstauto;	/* sysctl */
-			last = ipport_hilastauto;
+			bound_a = ipport_hifirstauto;	/* sysctl */
+			bound_b = ipport_hilastauto;
 		} else if (inp->inp_flags & INP_LOWPORT) {
 			if ((error = suser(p, 0)))
 				return (EACCES);
-			first = IPPORT_RESERVED-1; /* 1023 */
-			last = 600; /* not IPPORT_RESERVED/2 */
+			bound_a = IPPORT_RESERVED-1; /* 1023 */
+			bound_b = 600; /* not IPPORT_RESERVED/2 */
 		} else {
-			first = ipport_firstauto;	/* sysctl */
-			last = ipport_lastauto;
+			bound_a = ipport_firstauto;	/* sysctl */
+			bound_b = ipport_lastauto;
 		}
-
-		/*
-		 * Simple check to ensure all ports are not used up causing
-		 * a deadlock here.
-		 *
-		 * We split the two cases (up and down) so that the direction
-		 * is not being tested on each round of the loop.
-*/ - - if (first > last) { - /* -* counting down -*/ - count = first - last; - if (count) - lastport = first - arc4random_uniform(count); - - do { - if (count-- < 0)/* completely used? */ - return (EADDRNOTAVAIL); - --lastport; - if (lastport > first || lastport < last) - lastport = first; - lport = htons(lastport); - } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || - in_pcblookup(table, &zeroin_addr, 0, - &inp->inp_laddr, lport, wild, inp->inp_rtableid)); + if (bound_a < bound_b) { + first = bound_a; + last = bound_b; } else { - /* -* counting up -*/ - count = last - first; - if (count) - lastport = first + arc4random_uniform(count); - - do { - if (count-- < 0)/* completely used? */ - return (EADDRNOTAVAIL); - ++lastport; - if (lastport < first || lastport > last) - lastport = first; - lport = htons(lastport); - } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || - in_pcblookup(table, &zeroin_addr, 0, - &inp->inp_laddr, lport, wild, inp->inp_rtableid)); + first = bound_b; + last = bound_a; } + /* first <= last */ + + count = last - first; + lastport = first + arc4random_uniform(count); + + do { + if (count-- < 0)/* completely used? */ +
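The idea of the diff above, ordering the two bounds once so that a single
upward scan is enough, can be sketched in userland as follows. This is a
simplified illustration and not the loop from the diff: pick_port(), the
port_in_use_fn callback and the start_off parameter (which stands in for
the arc4random_uniform() draw) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback: returns nonzero if the port is already taken. */
typedef int (*port_in_use_fn)(uint16_t port);

/*
 * Scan [min(a,b), max(a,b)] upward from a starting offset, wrapping
 * around once, and return the first free port, or -1 when every port in
 * the range is taken.
 */
static int
pick_port(uint16_t bound_a, uint16_t bound_b, uint32_t start_off,
    port_in_use_fn in_use)
{
	uint16_t first, last;
	uint32_t span, i;

	/* Order the bounds once, whatever order the sysctls were set in. */
	if (bound_a < bound_b) {
		first = bound_a;
		last = bound_b;
	} else {
		first = bound_b;
		last = bound_a;
	}
	assert(first <= last);

	span = (uint32_t)last - first + 1;	/* number of candidate ports */
	for (i = 0; i < span; i++) {
		uint16_t port = (uint16_t)(first + (start_off + i) % span);

		if (!in_use(port))
			return port;
	}
	return -1;
}

/* Toy predicates for trying the function out. */
static int
only_1002_free(uint16_t port)
{
	return port != 1002;
}

static int
all_taken(uint16_t port)
{
	(void)port;
	return 1;
}
```

Calling pick_port(1005, 1000, 4, only_1002_free) visits 1004, 1005, 1000,
1001 and then 1002, so reversed bounds give the same result as ordered
ones, and the loop is bounded by the size of the range.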
Re: kill struct inpcbtable's inpt_lastport
On 09/18/15 23:39, David Hill wrote: > On Fri, Sep 18, 2015 at 11:05:55PM +0200, Vincent Gross wrote: >> On 09/18/15 15:18, David Hill wrote: >>> Is this 'if (count)' statement needed? We know first > last, so count >>> will always be positive. lastport will always be set. >> >>> if last == first, then the if statement will be false and lastport will >>> be uninitialized, I believe. >>> >> >> Both remarks are true, but I think it is better to keep a more extensive >> refactoring in a separate diff, refactoring that shall get rid of this >> yucky code duplication. >> > > Well, this code changes the current behavior. I'd at least change > lastport to be initialized to 0 to keep the behavior the same. It was > previously set to 0 with M_ZERO. > Fixed. Ok ? Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.179 diff -u -p -r1.179 in_pcb.c --- sys/netinet/in_pcb.c11 Sep 2015 15:29:47 - 1.179 +++ sys/netinet/in_pcb.c19 Sep 2015 17:52:42 - @@ -199,7 +199,6 @@ in_pcbinit(struct inpcbtable *table, int &table->inpt_lhash); if (table->inpt_lhashtbl == NULL) panic("in_pcbinit: hashinit failed for lport"); - table->inpt_lastport = 0; table->inpt_count = 0; arc4random_buf(&table->inpt_key, sizeof(table->inpt_key)); } @@ -281,8 +280,8 @@ in_pcbbind(struct inpcb *inp, struct mbu { struct socket *so = inp->inp_socket; struct inpcbtable *table = inp->inp_table; - u_int16_t *lastport = &inp->inp_table->inpt_lastport; struct sockaddr_in *sin; + u_int16_t lastport = 0; u_int16_t lport = 0; int wild = 0, reuseport = (so->so_options & SO_REUSEPORT); int error; @@ -391,16 +390,16 @@ in_pcbbind(struct inpcb *inp, struct mbu */ count = first - last; if (count) - *lastport = first - arc4random_uniform(count); + lastport = first - arc4random_uniform(count); do { if (count-- < 0)/* completely used? 
*/ return (EADDRNOTAVAIL); - --*lastport; - if (*lastport > first || *lastport < last) - *lastport = first; - lport = htons(*lastport); - } while (in_baddynamic(*lastport, so->so_proto->pr_protocol) || + --lastport; + if (lastport > first || lastport < last) + lastport = first; + lport = htons(lastport); + } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || in_pcblookup(table, &zeroin_addr, 0, &inp->inp_laddr, lport, wild, inp->inp_rtableid)); } else { @@ -409,16 +408,16 @@ in_pcbbind(struct inpcb *inp, struct mbu */ count = last - first; if (count) - *lastport = first + arc4random_uniform(count); + lastport = first + arc4random_uniform(count); do { if (count-- < 0)/* completely used? */ return (EADDRNOTAVAIL); - ++*lastport; - if (*lastport < first || *lastport > last) - *lastport = first; - lport = htons(*lastport); - } while (in_baddynamic(*lastport, so->so_proto->pr_protocol) || + ++lastport; + if (lastport < first || lastport > last) + lastport = first; + lport = htons(lastport); + } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || in_pcblookup(table, &zeroin_addr, 0, &inp->inp_laddr, lport, wild, inp->inp_rtableid)); } Index: sys/netinet/in_pcb.h === RCS file: /cvs/src/sys/netinet/in_pcb.h,v retrieving revision 1.89 diff -u -p -r1.89 in_pcb.h --- sys/netinet/in_pcb.h16 Apr 2015 19:24:13 - 1.89 +++ sy
Re: kill struct inpcbtable's inpt_lastport
On 09/18/15 15:18, David Hill wrote:
> Is this 'if (count)' statement needed? We know first > last, so count
> will always be positive. lastport will always be set.

> if last == first, then the if statement will be false and lastport will
> be uninitialized, I believe.
>

Both remarks are true, but I think it is better to keep the more extensive
refactoring for a separate diff, one that will also get rid of this yucky
code duplication.

--
Vincent Gross
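The second remark quoted above can be tried out with a small userland
sketch of the first step of the "counting up" branch. The function name
first_candidate_up() and the offset parameter (standing in for the
arc4random_uniform() draw) are hypothetical, and lastport is explicitly
initialized to 0 so that the first == last case is well defined.

```c
#include <stdint.h>

/*
 * First candidate port produced by the counting-up branch.
 * Assumes first <= last and offset < (last - first) when last > first.
 */
static uint16_t
first_candidate_up(uint16_t first, uint16_t last, uint16_t offset)
{
	uint16_t lastport = 0;		/* explicit initial value */
	int count = last - first;

	if (count)			/* skipped when first == last */
		lastport = (uint16_t)(first + offset);

	++lastport;
	if (lastport < first || lastport > last)
		lastport = first;	/* out of range: restart at first */
	return lastport;
}
```

With first == last == 600 the assignment is skipped, lastport goes from 0
to 1, falls outside the range and is reset to 600. Without the
initialization, the same path would read an indeterminate value.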
Re: kill struct inpcbtable's inpt_lastport
On 09/13/15 11:49, Vincent Gross wrote: > On 09/13/15 10:37, Claudio Jeker wrote: >> On Sun, Sep 13, 2015 at 12:18:10AM +0200, Vincent Gross wrote: >>> On 09/12/15 22:10, Claudio Jeker wrote: >>>> On Sat, Sep 12, 2015 at 02:40:59PM +0200, Vincent Gross wrote: >>>>> inpt_lastport is never read without being written before, and only >>>>> in_pcbbind() >>>>> and in6_pcbsetport() are using it. This diff removes inpt_lastport from >>>>> struct inpcbtable and turns it into a local variable where it is used. >>>>> >>>>> Ok ? >>>> Reads OK but can not be applied because something wrapped some lines. >>> >> >> Lines are now fixed but now all the tabs got replaced by spaces. So the >> thing still fails to apply. >> > > How about now ? > > > Index: sys/netinet/in_pcb.c > === > RCS file: /cvs/src/sys/netinet/in_pcb.c,v > retrieving revision 1.179 > diff -u -p -r1.179 in_pcb.c > --- sys/netinet/in_pcb.c 11 Sep 2015 15:29:47 - 1.179 > +++ sys/netinet/in_pcb.c 12 Sep 2015 12:22:03 - > @@ -199,7 +199,6 @@ in_pcbinit(struct inpcbtable *table, int > &table->inpt_lhash); > if (table->inpt_lhashtbl == NULL) > panic("in_pcbinit: hashinit failed for lport"); > - table->inpt_lastport = 0; > table->inpt_count = 0; > arc4random_buf(&table->inpt_key, sizeof(table->inpt_key)); > } > @@ -281,9 +280,8 @@ in_pcbbind(struct inpcb *inp, struct mbu > { > struct socket *so = inp->inp_socket; > struct inpcbtable *table = inp->inp_table; > - u_int16_t *lastport = &inp->inp_table->inpt_lastport; > struct sockaddr_in *sin; > - u_int16_t lport = 0; > + u_int16_t lastport, lport = 0; > int wild = 0, reuseport = (so->so_options & SO_REUSEPORT); > int error; > > @@ -391,16 +389,16 @@ in_pcbbind(struct inpcb *inp, struct mbu >*/ > count = first - last; > if (count) > - *lastport = first - arc4random_uniform(count); > + lastport = first - arc4random_uniform(count); > > do { > if (count-- < 0)/* completely used? 
*/ > return (EADDRNOTAVAIL); > - --*lastport; > - if (*lastport > first || *lastport < last) > - *lastport = first; > - lport = htons(*lastport); > - } while (in_baddynamic(*lastport, > so->so_proto->pr_protocol) || > + --lastport; > + if (lastport > first || lastport < last) > + lastport = first; > + lport = htons(lastport); > + } while (in_baddynamic(lastport, > so->so_proto->pr_protocol) || > in_pcblookup(table, &zeroin_addr, 0, > &inp->inp_laddr, lport, wild, inp->inp_rtableid)); > } else { > @@ -409,16 +407,16 @@ in_pcbbind(struct inpcb *inp, struct mbu >*/ > count = last - first; > if (count) > - *lastport = first + arc4random_uniform(count); > + lastport = first + arc4random_uniform(count); > > do { > if (count-- < 0)/* completely used? */ > return (EADDRNOTAVAIL); > - ++*lastport; > - if (*lastport < first || *lastport > last) > - *lastport = first; > - lport = htons(*lastport); > - } while (in_baddynamic(*lastport, > so->so_proto->pr_protocol) || > + ++lastport; > + if (lastport < first || lastport > last) > + lastport = first; > + lport = htons(lastport); > + } while (in_baddynamic(lastport, > so->so_proto->pr_protocol) || > in_pcblookup(table, &zeroin_addr, 0, > &inp->inp_laddr, lport, wild, inp->inp_r
Re: kill struct inpcbtable's inpt_lastport
On 09/13/15 10:37, Claudio Jeker wrote: > On Sun, Sep 13, 2015 at 12:18:10AM +0200, Vincent Gross wrote: >> On 09/12/15 22:10, Claudio Jeker wrote: >>> On Sat, Sep 12, 2015 at 02:40:59PM +0200, Vincent Gross wrote: >>>> inpt_lastport is never read without being written before, and only >>>> in_pcbbind() >>>> and in6_pcbsetport() are using it. This diff removes inpt_lastport from >>>> struct inpcbtable and turns it into a local variable where it is used. >>>> >>>> Ok ? >>> Reads OK but can not be applied because something wrapped some lines. >> > > Lines are now fixed but now all the tabs got replaced by spaces. So the > thing still fails to apply. > How about now ? Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.179 diff -u -p -r1.179 in_pcb.c --- sys/netinet/in_pcb.c11 Sep 2015 15:29:47 - 1.179 +++ sys/netinet/in_pcb.c12 Sep 2015 12:22:03 - @@ -199,7 +199,6 @@ in_pcbinit(struct inpcbtable *table, int &table->inpt_lhash); if (table->inpt_lhashtbl == NULL) panic("in_pcbinit: hashinit failed for lport"); - table->inpt_lastport = 0; table->inpt_count = 0; arc4random_buf(&table->inpt_key, sizeof(table->inpt_key)); } @@ -281,9 +280,8 @@ in_pcbbind(struct inpcb *inp, struct mbu { struct socket *so = inp->inp_socket; struct inpcbtable *table = inp->inp_table; - u_int16_t *lastport = &inp->inp_table->inpt_lastport; struct sockaddr_in *sin; - u_int16_t lport = 0; + u_int16_t lastport, lport = 0; int wild = 0, reuseport = (so->so_options & SO_REUSEPORT); int error; @@ -391,16 +389,16 @@ in_pcbbind(struct inpcb *inp, struct mbu */ count = first - last; if (count) - *lastport = first - arc4random_uniform(count); + lastport = first - arc4random_uniform(count); do { if (count-- < 0)/* completely used? 
*/ return (EADDRNOTAVAIL); - --*lastport; - if (*lastport > first || *lastport < last) - *lastport = first; - lport = htons(*lastport); - } while (in_baddynamic(*lastport, so->so_proto->pr_protocol) || + --lastport; + if (lastport > first || lastport < last) + lastport = first; + lport = htons(lastport); + } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || in_pcblookup(table, &zeroin_addr, 0, &inp->inp_laddr, lport, wild, inp->inp_rtableid)); } else { @@ -409,16 +407,16 @@ in_pcbbind(struct inpcb *inp, struct mbu */ count = last - first; if (count) - *lastport = first + arc4random_uniform(count); + lastport = first + arc4random_uniform(count); do { if (count-- < 0)/* completely used? */ return (EADDRNOTAVAIL); - ++*lastport; - if (*lastport < first || *lastport > last) - *lastport = first; - lport = htons(*lastport); - } while (in_baddynamic(*lastport, so->so_proto->pr_protocol) || + ++lastport; + if (lastport < first || lastport > last) + lastport = first; + lport = htons(lastport); + } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || in_pcblookup(table, &zeroin_addr, 0, &inp->inp_laddr, lport, wild, inp->inp_rtableid)); } Index: sys/netinet/in_pcb.h === RCS file: /cvs/src/sys/netinet/in_pcb.h,v retrieving revision 1.89 diff -u -p -r1.89 in_pcb.h --- sys/netinet/in_pcb.h16 Apr 2015 19:24:13 - 1.89 +++ sys/netinet/in_pcb.h12 Sep 2015 12:22:03 - @@ -152
Re: kill struct inpcbtable's inpt_lastport
On 09/12/15 22:10, Claudio Jeker wrote: > On Sat, Sep 12, 2015 at 02:40:59PM +0200, Vincent Gross wrote: >> inpt_lastport is never read without being written before, and only >> in_pcbbind() >> and in6_pcbsetport() are using it. This diff removes inpt_lastport from >> struct inpcbtable and turns it into a local variable where it is used. >> >> Ok ? > Reads OK but can not be applied because something wrapped some lines. Ok, thunderbird and I reached an agreement where we will keep our legs and lines unbroken. -- Vincent Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.179 diff -u -p -r1.179 in_pcb.c --- sys/netinet/in_pcb.c11 Sep 2015 15:29:47 - 1.179 +++ sys/netinet/in_pcb.c12 Sep 2015 12:22:03 - @@ -199,7 +199,6 @@ in_pcbinit(struct inpcbtable *table, int &table->inpt_lhash); if (table->inpt_lhashtbl == NULL) panic("in_pcbinit: hashinit failed for lport"); - table->inpt_lastport = 0; table->inpt_count = 0; arc4random_buf(&table->inpt_key, sizeof(table->inpt_key)); } @@ -281,9 +280,8 @@ in_pcbbind(struct inpcb *inp, struct mbu { struct socket *so = inp->inp_socket; struct inpcbtable *table = inp->inp_table; - u_int16_t *lastport = &inp->inp_table->inpt_lastport; struct sockaddr_in *sin; - u_int16_t lport = 0; + u_int16_t lastport, lport = 0; int wild = 0, reuseport = (so->so_options & SO_REUSEPORT); int error; @@ -391,16 +389,16 @@ in_pcbbind(struct inpcb *inp, struct mbu */ count = first - last; if (count) - *lastport = first - arc4random_uniform(count); + lastport = first - arc4random_uniform(count); do { if (count-- < 0)/* completely used? 
*/ return (EADDRNOTAVAIL); - --*lastport; - if (*lastport > first || *lastport < last) - *lastport = first; - lport = htons(*lastport); - } while (in_baddynamic(*lastport, so->so_proto->pr_protocol) || + --lastport; + if (lastport > first || lastport < last) + lastport = first; + lport = htons(lastport); + } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || in_pcblookup(table, &zeroin_addr, 0, &inp->inp_laddr, lport, wild, inp->inp_rtableid)); } else { @@ -409,16 +407,16 @@ in_pcbbind(struct inpcb *inp, struct mbu */ count = last - first; if (count) - *lastport = first + arc4random_uniform(count); + lastport = first + arc4random_uniform(count); do { if (count-- < 0)/* completely used? */ return (EADDRNOTAVAIL); - ++*lastport; - if (*lastport < first || *lastport > last) - *lastport = first; - lport = htons(*lastport); - } while (in_baddynamic(*lastport, so->so_proto->pr_protocol) || + ++lastport; + if (lastport < first || lastport > last) + lastport = first; + lport = htons(lastport); + } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || in_pcblookup(table, &zeroin_addr, 0, &inp->inp_laddr, lport, wild, inp->inp_rtableid)); } Index: sys/netinet/in_pcb.h === RCS file: /cvs/src/sys/netinet/in_pcb.h,v retrieving revision 1.89 diff -u -p -r1.89 in_pcb.h --- sys/netinet/in_pcb.h16 Apr 2015 19:24:13 - 1.89 +++ sys/netinet/in_pcb.h12 Sep 2015 12:22:03 - @@ -152,7 +152,6 @@ struct inpcbtable { struct inpcbhead *inpt_hashtbl, *inpt_lhashtbl; SIPHASH_KEY inpt_key; u_longinpt_hash, inpt_lhash; - u_int16_t inpt_lastport; int inpt_count; }; Index: sys/netinet6/in6_pcb.
kill struct inpcbtable's inpt_lastport
inpt_lastport is never read without being written before, and only in_pcbbind() and in6_pcbsetport() are using it. This diff removes inpt_lastport from struct inpcbtable and turns it into a local variable where it is used. Ok ? -- Vincent Index: sys/netinet/in_pcb.c === RCS file: /cvs/src/sys/netinet/in_pcb.c,v retrieving revision 1.179 diff -u -p -r1.179 in_pcb.c --- sys/netinet/in_pcb.c11 Sep 2015 15:29:47 - 1.179 +++ sys/netinet/in_pcb.c12 Sep 2015 12:22:03 - @@ -199,7 +199,6 @@ in_pcbinit(struct inpcbtable *table, int &table->inpt_lhash); if (table->inpt_lhashtbl == NULL) panic("in_pcbinit: hashinit failed for lport"); - table->inpt_lastport = 0; table->inpt_count = 0; arc4random_buf(&table->inpt_key, sizeof(table->inpt_key)); } @@ -281,9 +280,8 @@ in_pcbbind(struct inpcb *inp, struct mbu { struct socket *so = inp->inp_socket; struct inpcbtable *table = inp->inp_table; - u_int16_t *lastport = &inp->inp_table->inpt_lastport; struct sockaddr_in *sin; - u_int16_t lport = 0; + u_int16_t lastport, lport = 0; int wild = 0, reuseport = (so->so_options & SO_REUSEPORT); int error; @@ -391,16 +389,16 @@ in_pcbbind(struct inpcb *inp, struct mbu */ count = first - last; if (count) - *lastport = first - arc4random_uniform(count); + lastport = first - arc4random_uniform(count); do { if (count-- < 0)/* completely used? 
*/ return (EADDRNOTAVAIL); - --*lastport; - if (*lastport > first || *lastport < last) - *lastport = first; - lport = htons(*lastport); - } while (in_baddynamic(*lastport, so->so_proto->pr_protocol) || + --lastport; + if (lastport > first || lastport < last) + lastport = first; + lport = htons(lastport); + } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || in_pcblookup(table, &zeroin_addr, 0, &inp->inp_laddr, lport, wild, inp->inp_rtableid)); } else { @@ -409,16 +407,16 @@ in_pcbbind(struct inpcb *inp, struct mbu */ count = last - first; if (count) - *lastport = first + arc4random_uniform(count); + lastport = first + arc4random_uniform(count); do { if (count-- < 0)/* completely used? */ return (EADDRNOTAVAIL); - ++*lastport; - if (*lastport < first || *lastport > last) - *lastport = first; - lport = htons(*lastport); - } while (in_baddynamic(*lastport, so->so_proto->pr_protocol) || + ++lastport; + if (lastport < first || lastport > last) + lastport = first; + lport = htons(lastport); + } while (in_baddynamic(lastport, so->so_proto->pr_protocol) || in_pcblookup(table, &zeroin_addr, 0, &inp->inp_laddr, lport, wild, inp->inp_rtableid)); } Index: sys/netinet/in_pcb.h === RCS file: /cvs/src/sys/netinet/in_pcb.h,v retrieving revision 1.89 diff -u -p -r1.89 in_pcb.h --- sys/netinet/in_pcb.h16 Apr 2015 19:24:13 - 1.89 +++ sys/netinet/in_pcb.h12 Sep 2015 12:22:03 - @@ -152,7 +152,6 @@ struct inpcbtable { struct inpcbhead *inpt_hashtbl, *inpt_lhashtbl; SIPHASH_KEY inpt_key; u_longinpt_hash, inpt_lhash; - u_int16_t inpt_lastport; int inpt_count; }; Index: sys/netinet6/in6_pcb.c === RCS file: /cvs/src/sys/netinet6/in6_pcb.c,v retrieving revision 1.74 diff -u -p -r1.74 in6_pcb.c --- sys/netinet6/in6_pcb.c 11 Sep 2015 15:29:47 - 1.74 +++ sys/netinet6/in6_pcb.c 12 Sep 2015 12:22:07 - @@ -294,8 +294,7 @@ in6_pcbsetport(struct in6_addr *laddr, s struct socket *so = inp->inp_socket; struct inpcbtable *table = inp->inp_table; u_int16_t first, last;
PATCH: bring crypto(9) up to speed with crypto/cryptodev.h
Hi folks,

crypto(9) describes functions and constants that are not part of
crypto/cryptodev.h anymore (see 1.58 -> 1.60); this patch fixes that.

Cheers,

--
Vincent / dermiste

Index: crypto.9
===
RCS file: /cvs/src/share/man/man9/crypto.9,v
retrieving revision 1.37
diff -u -p -r1.37 crypto.9
--- crypto.9 20 Aug 2014 11:23:42 - 1.37
+++ crypto.9 2 May 2015 20:02:31 -
@@ -28,21 +28,15 @@
 .Ft int
 .Fn crypto_register "u_int32_t" "int *" "int (*)(u_int32_t *, struct cryptoini *)" "int (*)(u_int64_t)" "int (*)(struct cryptop *)"
 .Ft int
-.Fn crypto_kregister "u_int32_t" "int *" "int (*)(struct cryptkop *)"
-.Ft int
 .Fn crypto_unregister "u_int32_t" "int"
 .Ft void
 .Fn crypto_done "struct cryptop *"
-.Ft void
-.Fn crypto_kdone "struct cryptkop *"
 .Ft int
 .Fn crypto_newsession "u_int64_t *" "struct cryptoini *" "int"
 .Ft int
 .Fn crypto_freesession "u_int64_t"
 .Ft int
 .Fn crypto_dispatch "struct cryptop *"
-.Ft int
-.Fn crypto_kdispatch "struct cryptkop *"
 .Ft struct cryptop *
 .Fn crypto_getreq "int"
 .Ft void
@@ -84,23 +78,6 @@ struct cryptop {
 	caddr_t crp_mac;
 };

-struct crparam {
-	caddr_t crp_p;
-	u_int crp_nbits;
-};
-
-#define CRK_MAXPARAM 8
-
-struct cryptkop {
-	u_int krp_op; /* ie. CRK_MOD_EXP or other */
-	u_int krp_status; /* return status */
-	u_short krp_iparams; /* # of input parameters */
-	u_short krp_oparams; /* # of output parameters */
-	u_int32_t krp_hid;
-	struct crparam krp_param[CRK_MAXPARAM]; /* kvm */
-	int (*krp_callback)(struct cryptkop *);
-	struct cryptkop *krp_next;
-};
 .Ed
 .Sh DESCRIPTION
 .Nm
@@ -119,11 +96,6 @@ descriptors that instruct the framework
 with it) of the operations that should be applied on the data (more
 than one cryptographic operation can be requested).
 .Pp
-Keying operations are supported as well.
-Unlike the symmetric operators described above,
-these sessionless commands perform mathematical operations using
-input and output parameters.
-.Pp
 Since the consumers may not be associated with a process, drivers may
 not use
 .Xr tsleep 9 .
@@ -168,8 +140,6 @@ CRYPTO_CAST_CBC CRYPTO_MD5_HMAC CRYPTO_SHA1_HMAC CRYPTO_RIPEMD160_HMAC -CRYPTO_MD5_KPDK -CRYPTO_SHA1_KPDK CRYPTO_AES_CBC CRYPTO_AES_CTR CRYPTO_AES_XTS @@ -391,37 +361,11 @@ callback routine to do the necessary cle opaque field in the .Fa cryptop structure. -.Pp -.Fn crypto_kdispatch -is called to perform a keying operation. -The various fields in the -.Fa cryptkop -structure are: -.Bl -tag -width crp_alloctype -.It Fa krp_op -Operation code, such as CRK_MOD_EXP. -.It Fa krp_status -Return code. -This errno-style variable indicates whether there were lower level reasons -for operation failure. -.It Fa krp_iparams -Number of input parameters to the specified operation. -Note that each operation has a (typically hardwired) number of such parameters. -.It Fa krp_oparams -Number of output parameters from the specified operation. -Note that each operation has a (typically hardwired) number of such parameters. -.It Fa krp_kvp -An array of kernel memory blocks containing the parameters. -.It Fa krp_hid -Identifier specifying which low-level driver is being used. -.It Fa krp_callback -Callback called on completion of a keying operation. .El .Sh DRIVER-SIDE API The .Fn crypto_get_driverid , .Fn crypto_register , -.Fn crypto_kregister , .Fn crypto_unregister , and .Fn crypto_done @@ -465,7 +409,6 @@ The calling convention for the three dri int (*newsession) (u_int32_t *, struct cryptoini *); int (*freesession) (u_int64_t); int (*process) (struct cryptop *); -int (*kprocess) (struct cryptkop *); .Ed .Pp On invocation, the first argument to @@ -501,24 +444,8 @@ routine should invoke .Fn crypto_done . Session migration may be performed, as mentioned previously. .Pp -The -.Fn kprocess -routine is invoked with a request to perform crypto key processing. -This routine must not block, but should queue the request and return -immediately. -Upon processing the request, the callback routine should be invoked. 
-In case of error, the error indication must be placed in the -.Fa krp_status -field of the -.Fa cryptkop -structure. -When the request is completed, or an error is detected, the -.Fn kprocess -routine should invoke -.Fn crypto_kdone . .Sh RETURN VALUES .Fn crypto_register , -.Fn crypto_kregister , .Fn crypto_unregister , .Fn crypto_newsession , and
PATCH: iked SA cleanup on shutdown
Hi folks,

this patch makes iked clean its SAs on shutdown: for each existing IKE SA,
all of its Child SAs are removed from the kernel, and an IKE DELETE
notification payload is sent to the peer.

Comments ?

Cheers,

--
Vincent / dermiste

Index: iked.h
===
RCS file: /cvs/src/sbin/iked/iked.h,v
retrieving revision 1.84
diff -u -p -r1.84 iked.h
--- iked.h 26 Mar 2015 19:52:35 - 1.84
+++ iked.h 2 May 2015 17:11:34 -
@@ -549,7 +549,7 @@ struct privsep_proc {
 	const char *p_chroot;
 	struct privsep *p_ps;
 	struct iked *p_env;
-	void (*p_shutdown)(void);
+	void (*p_shutdown)(struct privsep_proc *);
 	u_int p_instance;
 };

@@ -744,6 +744,7 @@ pid_t ikev1(struct privsep *, struct pr

 /* ikev2.c */
 pid_t ikev2(struct privsep *, struct privsep_proc *);
+void ikev2_shutdown(struct privsep_proc *);
 void ikev2_recv(struct iked *, struct iked_message *);
 void ikev2_init_ike_sa(struct iked *, void *);
 int ikev2_sa_negotiate(struct iked_proposals *, struct iked_proposals *,
Index: ikev2.c
===
RCS file: /cvs/src/sbin/iked/ikev2.c,v
retrieving revision 1.120
diff -u -p -r1.120 ikev2.c
--- ikev2.c 26 Mar 2015 19:52:35 - 1.120
+++ ikev2.c 2 May 2015 17:11:39 -
@@ -136,7 +136,20 @@ static struct privsep_proc procs[] = {
 pid_t
 ikev2(struct privsep *ps, struct privsep_proc *p)
 {
+	p->p_shutdown = ikev2_shutdown;
 	return (proc_run(ps, p, procs, nitems(procs), NULL, NULL));
+}
+
+void
+ikev2_shutdown(struct privsep_proc *p)
+{
+	struct iked *env = p->p_env;
+	struct iked_sa *sa, *tmpsa;
+
+	RB_FOREACH_SAFE(sa, iked_sas, &env->sc_sas,tmpsa) {
+		ikev2_ikesa_delete(env, sa, sa->sa_hdr.sh_initiator);
+		sa_free(env, sa);
+	}
 }

 int
Index: proc.c
===
RCS file: /cvs/src/sbin/iked/proc.c,v
retrieving revision 1.22
diff -u -p -r1.22 proc.c
--- proc.c 16 Jan 2015 06:39:58 - 1.22
+++ proc.c 2 May 2015 17:11:39 -
@@ -297,7 +297,7 @@ proc_shutdown(struct privsep_proc *p)
 		control_cleanup(&ps->ps_csock);

 	if (p->p_shutdown != NULL)
-		(*p->p_shutdown)();
+		(*p->p_shutdown)(p);

 	proc_close(ps);
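The two mechanisms this patch relies on, a shutdown hook that receives its
own process structure and a walk that frees entries while iterating, can
be sketched in isolation. Everything below is a reduced stand-in: the toy_*
types and functions are hypothetical, a singly linked list replaces the RB
tree, and a counter replaces the actual IKE DELETE message.

```c
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-ins for iked's structures. */
struct toy_sa {
	struct toy_sa *next;
};

struct toy_env {
	struct toy_sa *sas;	/* list of live SAs */
	int deletes_sent;	/* counts the DELETE notifications "sent" */
};

struct toy_proc {
	struct toy_env *p_env;
	void (*p_shutdown)(struct toy_proc *);	/* hook gets its process */
};

static void
toy_add_sa(struct toy_env *env)
{
	struct toy_sa *sa = calloc(1, sizeof(*sa));

	if (sa == NULL)
		abort();
	sa->next = env->sas;
	env->sas = sa;
}

/* Shutdown hook: notify the peer, then free each SA. */
static void
toy_shutdown(struct toy_proc *p)
{
	struct toy_env *env = p->p_env;
	struct toy_sa *sa, *tmp;

	/* Save the successor before freeing, as RB_FOREACH_SAFE does. */
	for (sa = env->sas; sa != NULL; sa = tmp) {
		tmp = sa->next;
		env->deletes_sent++;	/* placeholder for the DELETE payload */
		free(sa);
	}
	env->sas = NULL;
}

/* Generic caller: knows nothing about SAs, only passes the process on. */
static void
toy_proc_shutdown(struct toy_proc *p)
{
	if (p->p_shutdown != NULL)
		(*p->p_shutdown)(p);
}
```

Passing the process to the hook is what lets the handler reach p_env
without a global; with the old void signature it would have had no way to
find the SA table.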
Re: PATCH: clarifying iked.conf man
On Mon, Apr 20, 2015 at 07:35:58PM +0059, Jason McIntyre wrote:
> On Wed, Apr 15, 2015 at 05:13:13PM +0200, Vincent Gross wrote:
> > Hello,
> >
> > iked.conf's man page is a bit fuzzy on how local and peer ip defaults
> > are set. This patch below attempts to fix that.
> >
>
> if you can specify one and have the other default to "any", i agree we'd
> want to document it.
>
> for the rest, the diff essentially removes the information about when
> these options might be useful and needed. i'm less sure about that.
>
> i'd appreciate some feedback from a developer that the content is
> correct.
>
> i'm less inclined to rearrange the page this way without good reason.
>
> also note for future man diffs to start new sentences on new lines.
>

I took a second look at parse.y, and found it would choke on configs
like this one:

ikev2 active esp \
	from 10.0.1.0/24 to 172.16.0.1 local 10.0.1.1 \
	srcid 'client.lan' dstid 'gateway.lan'

To get this config to work you would need to add "peer 172.16.0.1".
It would be more logical to default local to src and peer to dst when
there is only one traffic selector, and both to any otherwise.

The diff below changes how defaults are set for peer and local, and
reflects the change in iked.conf(5).

Comments ? Suggestions ?

--- parse.y.origFri May 1 15:10:51 2015 +++ parse.y Fri May 1 17:08:51 2015 @@ -2482,25 +2482,21 @@ if (peers) { if (peers->src) ipa = peers->src; + else if (hosts->src && hosts->src->next == NULL) + ipa = hosts->src; if (peers->dst) ipb = peers->dst; - if (ipa == NULL && ipb == NULL) { - if (hosts->src && hosts->src->next == NULL) - ipa = hosts->src; - if (hosts->dst && hosts->dst->next == NULL) - ipb = hosts->dst; - } + else if (hosts->dst && hosts->dst->next == NULL) + ipb = hosts->dst; } if (ipa == NULL && ipb == NULL) { yyerror("could not get local/peer specification"); return (-1); } - if (pol.pol_flags & IKED_POLICY_ACTIVE) { - if (ipb == NULL || ipb->netaddress || - (ipa != NULL && ipa->netaddress)) { - yyerror("active mode requires local/peer address"); + if ((pol.pol_flags & IKED_POLICY_ACTIVE) && + (ipb == NULL || ipb->netaddress)) { + yyerror("active mode requires peer host address"); return (-1); - } } if (ipa) { memcpy(&pol.pol_local.addr, &ipa->address, --- iked.conf.5 28 Feb 2015 21:51:57 - 1.38 +++ iked.conf.5 1 May 2015 15:12:44 - @@ -341,16 +341,24 @@ this option is generally not needed. The .Ic peer parameter specifies the address or FQDN of the remote endpoint. -For host-to-host connections where +For single-traffic-selector host-to-host connections where .Ar dst is identical to .Ar remote , this option is generally not needed as it will be set to .Ar dst automatically. -If it is not specified or if the keyword -.Ar any -is given, the default peer is used. +.Pp +When the policy contains only one traffic selector, +.Ic local +and +.Ic peer +default values are +.Ar src +and +.Ar dst +respectively. Otherwise they both default to +.Ar any . .It Xo .Ic ikesa .Ic auth Ar algorithm
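The defaulting rule proposed in this message can be written down as a
small sketch. It follows the wording of the man page hunk rather than the
parse.y code: struct toy_policy, toy_defaults() and the use of plain
strings (including the literal "any") are simplifications assumed for the
example.

```c
#include <stddef.h>

/* Hypothetical, simplified policy input; NULL means "not given". */
struct toy_policy {
	const char *local;	/* explicit "local" address, if any */
	const char *peer;	/* explicit "peer" address, if any */
	const char *src;	/* traffic selector source */
	const char *dst;	/* traffic selector destination */
	int nselectors;		/* number of from/to pairs in the policy */
};

/*
 * An explicit value always wins. Otherwise, with a single traffic
 * selector, local falls back to src and peer to dst; with several
 * selectors both fall back to "any".
 */
static void
toy_defaults(const struct toy_policy *pol, const char **local,
    const char **peer)
{
	int single = (pol->nselectors == 1);

	*local = pol->local ? pol->local : (single ? pol->src : "any");
	*peer = pol->peer ? pol->peer : (single ? pol->dst : "any");
}
```

Applied to the config quoted earlier in this message (local 10.0.1.1, no
peer, one selector to 172.16.0.1), the peer resolves to 172.16.0.1 without
an explicit "peer" keyword.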
PATCH: clarifying iked.conf man
Hello,

iked.conf's man page is a bit fuzzy on how local and peer ip defaults
are set. This patch below attempts to fix that.

Also, can you take a look at my previous nat-on-ipsec-on-iked patchset ?
see http://marc.info/?l=openbsd-tech&m=142662971007779&w=2

Cheers,

Index: iked.conf.5
===
RCS file: /cvs/src/sbin/iked/iked.conf.5,v
retrieving revision 1.38
diff -u -p -r1.38 iked.conf.5
--- iked.conf.5	28 Feb 2015 21:51:57 -	1.38
+++ iked.conf.5	15 Apr 2015 15:02:21 -
@@ -334,23 +334,21 @@ see the file
 .It Ic local Ar localip Ic peer Ar remote
 The
 .Ic local
-parameter specifies the address or FQDN of the local endpoint.
-Unless the gateway is multi-homed or uses address aliases,
-this option is generally not needed.
-.Pp
-The
+and
 .Ic peer
-parameter specifies the address or FQDN of the remote endpoint.
-For host-to-host connections where
+parameters specify the address or FQDN of the local and remote
+endpoints respectively.
+If neither are specified, their default values are equal to
+.Ar src
+and
 .Ar dst
-is identical to
-.Ar remote ,
-this option is generally not needed as it will be set to
-.Ar dst
-automatically.
-If it is not specified or if the keyword
-.Ar any
-is given, the default peer is used.
+for
+.Ar localip
+and
+.Ar remote
+respectively. When only one is specified, the other
+defaults to
+.Ar any .
 .It Xo
 .Ic ikesa
 .Ic auth Ar algorithm

Re: autoinstall(8) tweaks
On Wed, Apr 15, 2015 at 08:20:15AM +0900, Ryan McBride wrote:
> On Thu, Apr 09, 2015 at 04:27:17AM -0600, Theo de Raadt wrote:
> > > But it seems people are expected to build a custom bsd.rd if they
> > > want something different so I'll bow out of this conversation.
> >
> > No, the situation is that less than 1% of the user community
> > apparently have a secret usage case, but never manage to explain it.
> >
>
> I manage a bunch of OpenBSD proxies that I would like to be able to
> build from scratch using automated tools; everything is in place
> (ansible) except for the base OpenBSD install as I need a separate
> /var/squid partition to prevent cache / log disasters from filling /var;
> similar concerns would apply to many other data / log-heavy daemons.
>
> On other systems where I don't know how the data will grow, I typically
> configure them with something close to the auto layout, but a smaller
> /home, and leave the remaining disk empty. When I get a feel for what
> the data usage is in /var/ or /home or /usr/local, I can expand
> /home or create a new partition and migrate the data.

The default allocation is actually easy to rework right after a fresh
install, as /usr/src, /usr/obj and /home are at the end. Ssh as root,
kill /usr/src, /usr/obj and /home, optionally extend /usr/local, and
then repartition as you wish.

As for swap and /tmp, you can move /tmp to the end; at worst you will
lose 4G worth of disk space that you can add to swap. And if you need
more than 2x RAM swap, you have bigger problems than partitioning.

> Other reasons to want non-auto partitioning like include:
> - simpler dump/restore

Yeah, "embrace failure" is what all the cool kids do these days. Except
that this kind of non-management just sweeps problems under the rug so
they can mature into proper monsters ready to gnaw at your skull at
the worst possible moment.

> - moving certain parts of hier(7) onto a different device
> (you can do this as a post-install task if they are empty, but
> it becomes a pain if it's something that's part of base)
>
> A place where the latter can be quite useful is on a virtualised guest,
> where you can easily make one storage device persistant, and another
> ephemeral across reboots.

Which part of baseXX, compXX, manXX, gameXX, xbaseXX, xfontXX, xservXX
or xshareXX would fall under such a case ? None.

> Yes, all of this can be done manually, but basically any place I would
> care to work at is moving towards complete automation of system installs
> (for *hack*Cloud*spit*, Continuous Delivery, DR, or just plain old
> laziness). It would be really nice if the OpenBSD installer would handle
> this in a sane fashion.

Do you want me to write an ansible playbook to run a handful of shell
commands over ssh ?

Cheers,

--
Vincent Gross

PATCH: nat on ipsec for iked
Hi,

The diff below brings into iked the same nat capabilities that isakmpd
already has.

Tested on a daily basis to tunnel from work to home over UMTS.

Comments ?

--
Vincent Gross

Index: iked.h
===
RCS file: /cvs/src/sbin/iked/iked.h,v
retrieving revision 1.83
diff -u -p -r1.83 iked.h
--- iked.h	16 Jan 2015 06:39:58 -	1.83
+++ iked.h	17 Mar 2015 21:51:33 -
@@ -140,6 +140,8 @@ struct iked_flow {
 	struct iked_addr		 flow_src;
 	struct iked_addr		 flow_dst;
 	u_int				 flow_dir;	/* in/out */
+	struct iked_addr		 flow_prenat;	/* pre-nat source */
+	u_int				 flow_usenat;
 
 	u_int				 flow_loaded;	/* pfkey done */
 
Index: parse.y
===
RCS file: /cvs/src/sbin/iked/parse.y,v
retrieving revision 1.46
diff -u -p -r1.46 parse.y
--- parse.y	8 Feb 2015 04:50:32 -	1.46
+++ parse.y	17 Mar 2015 21:51:38 -
@@ -2401,7 +2401,7 @@ create_ike(char *name, int af, u_int8_t
 {
 	char			 idstr[IKED_ID_SIZE];
 	u_int			 idtype = IKEV2_ID_NONE;
-	struct ipsec_addr_wrap	*ipa, *ipb;
+	struct ipsec_addr_wrap	*ipa, *ipb, *ipn;
 	struct iked_policy	 pol;
 	struct iked_proposal	 prop[2];
 	u_int			 j;
@@ -2622,6 +2622,16 @@ create_ike(char *name, int af, u_int8_t
 		flows[j].flow_dst.addr_mask = ipb->mask;
 		flows[j].flow_dst.addr_net = ipb->netaddress;
 		flows[j].flow_dst.addr_port = hosts->dport;
+
+		ipn = ipa->srcnat;
+		if (ipn) {
+			memcpy(&flows[j].flow_prenat.addr, &ipn->address,
+			    sizeof(ipn->address));
+			flows[j].flow_prenat.addr_af = ipn->af;
+			flows[j].flow_prenat.addr_mask = ipn->mask;
+			flows[j].flow_prenat.addr_net = ipn->netaddress;
+			flows[j].flow_usenat = 1;
+		}
 
 		flows[j].flow_ipproto = ipproto;
 
Index: pfkey.c
===
RCS file: /cvs/src/sbin/iked/pfkey.c,v
retrieving revision 1.41
diff -u -p -r1.41 pfkey.c
--- pfkey.c	16 Jan 2015 06:39:58 -	1.41
+++ pfkey.c	17 Mar 2015 21:51:39 -
@@ -179,6 +179,7 @@ int
 pfkey_flow(int sd, u_int8_t satype, u_int8_t action, struct iked_flow *flow)
 {
 	struct sadb_msg		 smsg;
+	struct iked_addr	*flow_src, *flow_dst;
 	struct sadb_address	 sa_src, sa_dst, sa_local, sa_peer, sa_smask,
 				 sa_dmask;
 	struct sadb_protocol	 sa_flowtype, sa_protocol;
@@ -191,58 +192,76 @@ pfkey_flow(int sd, u_int8_t satype, u_in
 	sport = dport = 0;
 	sa_srcid = sa_dstid = NULL;
 
+	flow_src = &flow->flow_src;
+	flow_dst = &flow->flow_dst;
+
+	if (flow->flow_usenat)
+		switch (flow->flow_type) {
+		case SADB_X_FLOW_TYPE_USE:
+			flow_dst = &flow->flow_prenat;
+			break;
+		case SADB_X_FLOW_TYPE_REQUIRE:
+			flow_src = &flow->flow_prenat;
+			break;
+		case 0:
+			if (flow->flow_dir == IPSP_DIRECTION_IN)
+				flow_dst = &flow->flow_prenat;
+			else
+				flow_src = &flow->flow_prenat;
+		}
+
 	bzero(&ssrc, sizeof(ssrc));
 	bzero(&smask, sizeof(smask));
-	memcpy(&ssrc, &flow->flow_src.addr, sizeof(ssrc));
-	memcpy(&smask, &flow->flow_src.addr, sizeof(smask));
-	if ((sport = flow->flow_src.addr_port) != 0)
+	memcpy(&ssrc, &flow_src->addr, sizeof(ssrc));
+	memcpy(&smask, &flow_src->addr, sizeof(smask));
+	if ((sport = flow_src->addr_port) != 0)
 		dport = 0x;
 	socket_af((struct sockaddr *)&ssrc, sport);
 	socket_af((struct sockaddr *)&smask, dport);
 
-	switch (flow->flow_src.addr_af) {
+	switch (flow_src->addr_af) {
 	case AF_INET:
 		((struct sockaddr_in *)&smask)->sin_addr.s_addr =
-		    prefixlen2mask(flow->flow_src.addr_net ?
-		    flow->flow_src.addr_mask : 32);
+		    prefixlen2mask(flow_src->addr_net ?
+		    flow_src->addr_mask : 32);
 		break;
 	case AF_INET6:
-		prefixlen2mask6(flow->flow_src.addr_net ?
-		    flow->flow_src.addr_mask : 128,
+

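The endpoint selection that the pfkey.c hunk performs (substitute the pre-NAT address for either the source or the destination, depending on flow type and direction) can be read on its own as a small pure function. The sketch below mirrors the switch in the diff but is not iked code: the enums, struct endpoint, struct flow and flow_select() are hypothetical stand-ins for the real PF_KEY constants and iked structures.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the PF_KEY flow type and direction constants. */
enum flow_type { FLOW_TYPE_UNSPEC, FLOW_TYPE_USE, FLOW_TYPE_REQUIRE };
enum flow_dir { FLOW_DIR_IN, FLOW_DIR_OUT };

struct endpoint {
	const char	*addr;
};

struct flow {
	struct endpoint	 src;
	struct endpoint	 dst;
	struct endpoint	 prenat;	/* address before NAT was applied */
	enum flow_type	 type;
	enum flow_dir	 dir;
	int		 usenat;
};

/*
 * Pick the endpoints to hand to the kernel.  Without NAT the flow is
 * used as-is; with NAT, exactly one side is replaced by the pre-NAT
 * address, following the same cases as the switch in the diff.
 */
static void
flow_select(const struct flow *f, const struct endpoint **srcp,
    const struct endpoint **dstp)
{
	*srcp = &f->src;
	*dstp = &f->dst;

	if (!f->usenat)
		return;

	switch (f->type) {
	case FLOW_TYPE_USE:
		*dstp = &f->prenat;
		break;
	case FLOW_TYPE_REQUIRE:
		*srcp = &f->prenat;
		break;
	case FLOW_TYPE_UNSPEC:
		if (f->dir == FLOW_DIR_IN)
			*dstp = &f->prenat;
		else
			*srcp = &f->prenat;
		break;
	}
}
```

Keeping the choice in one place, as the patch does with its flow_src and flow_dst pointers, means the rest of the function no longer needs to know whether NAT is in play.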
Re: PATCH: NAT on IPSec
On Thu, Jan 15, 2015 at 04:00:20PM +0100, Vincent Gross wrote:
> Hello folks,
>
> This patch brings nat capabilites into iked, the same way that mpf@ did
> with isakmpd about 6 years ago.
>
> Comments ?

bumpity bump bump. Any comments on this ?

>
> Tested with the following setup, with icmp, udp and tcp:
>
> >> Local pf.conf:
> table { 172.23.0.0/23 }
>
> set skip on lo
>
> match out on enc0 from ! to nat-to 172.23.50.1
>
> block return
> pass
> block return in on ! lo0 proto tcp to port 6000:6010
>
> >> Local iked.conf:
> ikev2 active esp \
> 	from 172.23.50.1 (0.0.0.0/0) to 172.23.0.0/23 peer 79.143.250.153 \
> 	srcid 'spinoza.kilob.yt' dstid 'brouwer.kilob.yt'
>
> >> Local ip address:
> ppp0: flags=8051 mtu 1500
> 	priority: 0
> 	groups: ppp egress
> 	inet 100.97.217.112 --> 10.64.64.64 netmask 0xff00
>
> >> Remote pf.conf:
> [...]
> pass on enc0
> [...]
>
> >> Remote iked.conf:
> ikev2 esp \
> 	from 172.23.0.0/23 to 172.23.50.1 peer any \
> 	srcid 'brouwer.kilob.yt' dstid 'spinoza.kilob.yt'
>
>
>
>
> Index: iked.h
> ===
> RCS file: /cvs/src/sbin/iked/iked.h,v
> retrieving revision 1.82
> diff -u -p -r1.82 iked.h
> --- iked.h	18 Aug 2014 09:43:02 -	1.82
> +++ iked.h	15 Jan 2015 13:54:46 -
> @@ -139,6 +139,8 @@ struct iked_flow {
>  	struct iked_addr		 flow_src;
>  	struct iked_addr		 flow_dst;
>  	u_int				 flow_dir;	/* in/out */
> +	struct iked_addr		 flow_prenat;	/* pre-nat source */
> +	u_int				 flow_usenat;
>
>  	u_int				 flow_loaded;	/* pfkey done */
>
> Index: parse.y
> ===
> RCS file: /cvs/src/sbin/iked/parse.y,v
> retrieving revision 1.43
> diff -u -p -r1.43 parse.y
> --- parse.y	12 Jan 2015 11:24:58 -	1.43
> +++ parse.y	15 Jan 2015 13:54:47 -
> @@ -2401,7 +2401,7 @@ create_ike(char *name, int af, u_int8_t
>  {
>  	char			 idstr[IKED_ID_SIZE];
>  	u_int			 idtype = IKEV2_ID_NONE;
> -	struct ipsec_addr_wrap	*ipa, *ipb;
> +	struct ipsec_addr_wrap	*ipa, *ipb, *ipn;
>  	struct iked_policy	 pol;
>  	struct iked_proposal	 prop[2];
>  	u_int			 j;
> @@ -2622,6 +2622,16 @@ create_ike(char *name, int af, u_int8_t
>  		flows[j].flow_dst.addr_mask = ipb->mask;
>  		flows[j].flow_dst.addr_net = ipb->netaddress;
>  		flows[j].flow_dst.addr_port = hosts->dport;
> +
> +		ipn = ipa->srcnat;
> +		if (ipn) {
> +			memcpy(&flows[j].flow_prenat.addr, &ipn->address,
> +			    sizeof(ipn->address));
> +			flows[j].flow_prenat.addr_af = ipn->af;
> +			flows[j].flow_prenat.addr_mask = ipn->mask;
> +			flows[j].flow_prenat.addr_net = ipn->netaddress;
> +			flows[j].flow_usenat = 1;
> +		}
>
>  		flows[j].flow_ipproto = ipproto;
>
> Index: pfkey.c
> ===
> RCS file: /cvs/src/sbin/iked/pfkey.c,v
> retrieving revision 1.40
> diff -u -p -r1.40 pfkey.c
> --- pfkey.c	29 Oct 2014 06:26:39 -	1.40
> +++ pfkey.c	15 Jan 2015 13:54:47 -
> @@ -180,6 +180,7 @@ int
>  pfkey_flow(int sd, u_int8_t satype, u_int8_t action, struct iked_flow *flow)
>  {
>  	struct sadb_msg		 smsg;
> +	struct iked_addr	*flow_src, *flow_dst;
>  	struct sadb_address	 sa_src, sa_dst, sa_local, sa_peer, sa_smask,
>  				 sa_dmask;
>  	struct sadb_protocol	 sa_flowtype, sa_protocol;
> @@ -192,58 +193,76 @@ pfkey_flow(int sd, u_int8_t satype, u_in
>  	sport = dport = 0;
>  	sa_srcid = sa_dstid = NULL;
>
> +	flow_src = &flow->flow_src;
> +	flow_dst = &flow->flow_dst;
> +
> +	if (flow->flow_usenat)
> +		switch (flow->flow_type) {
> +		case SADB_X_FLOW_TYPE_USE:
> +			flow_dst = &flow->flow_prenat;
> +			break;
> +		case SADB_X_FLOW_TYPE_REQUIRE:
> +			flow_src = &flow->flow_prenat;
> +			break;
> +		case 0:
> +

PATCH: NAT on IPSec
Hello folks,

This patch brings nat capabilities into iked, the same way that mpf@ did
with isakmpd about 6 years ago.

Comments ?

Tested with the following setup, with icmp, udp and tcp:

>> Local pf.conf:
table { 172.23.0.0/23 }

set skip on lo

match out on enc0 from ! to nat-to 172.23.50.1

block return
pass
block return in on ! lo0 proto tcp to port 6000:6010

>> Local iked.conf:
ikev2 active esp \
	from 172.23.50.1 (0.0.0.0/0) to 172.23.0.0/23 peer 79.143.250.153 \
	srcid 'spinoza.kilob.yt' dstid 'brouwer.kilob.yt'

>> Local ip address:
ppp0: flags=8051 mtu 1500
	priority: 0
	groups: ppp egress
	inet 100.97.217.112 --> 10.64.64.64 netmask 0xff00

>> Remote pf.conf:
[...]
pass on enc0
[...]

>> Remote iked.conf:
ikev2 esp \
	from 172.23.0.0/23 to 172.23.50.1 peer any \
	srcid 'brouwer.kilob.yt' dstid 'spinoza.kilob.yt'

Index: iked.h
===
RCS file: /cvs/src/sbin/iked/iked.h,v
retrieving revision 1.82
diff -u -p -r1.82 iked.h
--- iked.h	18 Aug 2014 09:43:02 -	1.82
+++ iked.h	15 Jan 2015 13:54:46 -
@@ -139,6 +139,8 @@ struct iked_flow {
 	struct iked_addr		 flow_src;
 	struct iked_addr		 flow_dst;
 	u_int				 flow_dir;	/* in/out */
+	struct iked_addr		 flow_prenat;	/* pre-nat source */
+	u_int				 flow_usenat;
 
 	u_int				 flow_loaded;	/* pfkey done */
 
Index: parse.y
===
RCS file: /cvs/src/sbin/iked/parse.y,v
retrieving revision 1.43
diff -u -p -r1.43 parse.y
--- parse.y	12 Jan 2015 11:24:58 -	1.43
+++ parse.y	15 Jan 2015 13:54:47 -
@@ -2401,7 +2401,7 @@ create_ike(char *name, int af, u_int8_t
 {
 	char			 idstr[IKED_ID_SIZE];
 	u_int			 idtype = IKEV2_ID_NONE;
-	struct ipsec_addr_wrap	*ipa, *ipb;
+	struct ipsec_addr_wrap	*ipa, *ipb, *ipn;
 	struct iked_policy	 pol;
 	struct iked_proposal	 prop[2];
 	u_int			 j;
@@ -2622,6 +2622,16 @@ create_ike(char *name, int af, u_int8_t
 		flows[j].flow_dst.addr_mask = ipb->mask;
 		flows[j].flow_dst.addr_net = ipb->netaddress;
 		flows[j].flow_dst.addr_port = hosts->dport;
+
+		ipn = ipa->srcnat;
+		if (ipn) {
+			memcpy(&flows[j].flow_prenat.addr, &ipn->address,
+			    sizeof(ipn->address));
+			flows[j].flow_prenat.addr_af = ipn->af;
+			flows[j].flow_prenat.addr_mask = ipn->mask;
+			flows[j].flow_prenat.addr_net = ipn->netaddress;
+			flows[j].flow_usenat = 1;
+		}
 
 		flows[j].flow_ipproto = ipproto;
 
Index: pfkey.c
===
RCS file: /cvs/src/sbin/iked/pfkey.c,v
retrieving revision 1.40
diff -u -p -r1.40 pfkey.c
--- pfkey.c	29 Oct 2014 06:26:39 -	1.40
+++ pfkey.c	15 Jan 2015 13:54:47 -
@@ -180,6 +180,7 @@ int
 pfkey_flow(int sd, u_int8_t satype, u_int8_t action, struct iked_flow *flow)
 {
 	struct sadb_msg		 smsg;
+	struct iked_addr	*flow_src, *flow_dst;
 	struct sadb_address	 sa_src, sa_dst, sa_local, sa_peer, sa_smask,
 				 sa_dmask;
 	struct sadb_protocol	 sa_flowtype, sa_protocol;
@@ -192,58 +193,76 @@ pfkey_flow(int sd, u_int8_t satype, u_in
 	sport = dport = 0;
 	sa_srcid = sa_dstid = NULL;
 
+	flow_src = &flow->flow_src;
+	flow_dst = &flow->flow_dst;
+
+	if (flow->flow_usenat)
+		switch (flow->flow_type) {
+		case SADB_X_FLOW_TYPE_USE:
+			flow_dst = &flow->flow_prenat;
+			break;
+		case SADB_X_FLOW_TYPE_REQUIRE:
+			flow_src = &flow->flow_prenat;
+			break;
+		case 0:
+			if (flow->flow_dir == IPSP_DIRECTION_IN)
+				flow_dst = &flow->flow_prenat;
+			else
+				flow_src = &flow->flow_prenat;
+		}
+
 	bzero(&ssrc, sizeof(ssrc));
 	bzero(&smask, sizeof(smask));
-	memcpy(&ssrc, &flow->flow_src.addr, sizeof(ssrc));
-	memcpy(&smask, &flow->flow_src.addr, sizeof(smask));
-	if ((sport = flow->flow_src.addr_port) != 0)
+	memcpy(&ssrc, &flow_src->addr, sizeof(ssrc));
+	memcpy(&smask, &flow_src->addr, sizeof(smask));
+	if ((sport = flow_src->addr_port) != 0)
 		dport = 0x;
 	socket_af((struct sockaddr *)&ssrc, sport);
 	socket_af((struct sockaddr *)&smask, dport);
 
-	swit

Re: iked control process crash at startup
On Tue, Nov 25, 2014 at 05:35:19PM +0100, Mike Belopuhov wrote:
[...]
>
> señior pedro@ was kind enough to send a potential fix for this.
> this will be commited soon.
>
>
> diff --git sbin/iked/ca.c sbin/iked/ca.c
> index e43b58b..a008f99 100644
> --- sbin/iked/ca.c
> +++ sbin/iked/ca.c
> @@ -534,7 +534,7 @@ ca_reload(struct iked *env)
>  	X509_OBJECT	*xo;
>  	X509		*x509;
>  	DIR		*dir;
> -	int		 i, len, iovcnt = 2;
> +	int		 i, len, iovcnt = 0;
>
>  	/*
>  	 * Load CAs
> @@ -620,8 +620,10 @@ ca_reload(struct iked *env)
>  		env->sc_certreqtype = IKEV2_CERT_X509_CERT;
>  		iov[0].iov_base = &env->sc_certreqtype;
>  		iov[0].iov_len = sizeof(env->sc_certreqtype);
> +		iovcnt++;
>  		iov[1].iov_base = ibuf_data(env->sc_certreq);
>  		iov[1].iov_len = ibuf_length(env->sc_certreq);
> +		iovcnt++;
>
>  		log_debug("%s: loaded %zu ca certificate%s", __func__,
>  		    ibuf_length(env->sc_certreq) / SHA_DIGEST_LENGTH,
> @@ -677,6 +679,8 @@ ca_reload(struct iked *env)
>
>  	iov[0].iov_base = &env->sc_certreqtype;
>  	iov[0].iov_len = sizeof(env->sc_certreqtype);
> +	if (iovcnt == 0)
> +		iovcnt++;
>  	(void)proc_composev_imsg(&env->sc_ps, PROC_IKEV2, -1,
>  	    IMSG_CERTREQ, -1, iov, iovcnt);
>

Welp, with this fix I can't get iked to crash no more, and I did try
several times. Looks like pedro@ got things right.

Thank you all for your help !

--
Vincent

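Judging only from the quoted diff, the fix appears to make iovcnt track how many iov entries were actually filled, instead of always claiming two. A minimal sketch of that idea follows; it is restructured as a helper rather than copied from ca.c, and build_certreq_iov() and its parameters are hypothetical names, not iked functions.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <stddef.h>

/*
 * Fill iov with a one-byte request type and, when present, a payload.
 * The returned count covers only the entries that were initialised, so
 * a caller never passes an uninitialised iovec on to the send routine.
 */
static int
build_certreq_iov(struct iovec iov[2], unsigned char *type, void *payload,
    size_t payload_len)
{
	int iovcnt = 0;

	/* The type byte is always sent. */
	iov[iovcnt].iov_base = type;
	iov[iovcnt].iov_len = sizeof(*type);
	iovcnt++;

	/* The payload is optional, e.g. when no CA certificate was loaded. */
	if (payload != NULL && payload_len > 0) {
		iov[iovcnt].iov_base = payload;
		iov[iovcnt].iov_len = payload_len;
		iovcnt++;
	}

	return (iovcnt);
}
```

Deriving the count from the fill operations themselves, rather than from a constant initialiser, is what keeps the count and the array contents from drifting apart when one of the branches is skipped.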
iked control process crash at startup
Hi tech@,

I've been using iked for some weeks to tunnel my laptop to home over 3G.
Sunday I upgraded my laptop to the latest snapshot; previous upgrade was
about 2 or 3 weeks ago.

When I started iked, it crashed randomly, as in one time it runs just
fine and completes the handshake, the other it crashes before even
sending the first packet.

I ran "ktrace -di /sbin/iked" and kdump'd the resulting file. Of the 5
processes, 4 finished by calling exit(0), one was terminated on a
SIGSEGV. As it is also the only one that does stuff on
/var/run/iked.sock, it is the control process.

I repeated the above ktrace 4 times and got consistent results:
SIGSEGV'd control process.

I'll keep the hunt going, but I am not sure how long this will take nor
how much time I'll have to spare, so here is the control process kdump.

Cheers,

--
Vincent

17866 iked RET fork 0
17866 iked CALL getpid()
17866 iked RET getpid 17866/0x45ca
17866 iked CALL setpgid(0,0x45ca)
17866 iked RET setpgid 0
17866 iked CALL socket(PF_LOCAL,0x1,0)
17866 iked RET socket 15/0xf
17866 iked CALL unlink(0x631ceb)
17866 iked NAMI "/var/run/iked.sock"
17866 iked RET unlink 0
17866 iked CALL umask(0117)
17866 iked RET umask 18/0x12
17866 iked CALL bind(0xf,0x7f7c8660,0x6a)
17866 iked STRU struct sockaddr { AF_LOCAL, "/var/run/iked.sock" }
17866 iked NAMI "/var/run/iked.sock"
17866 iked RET bind 0
17866 iked CALL umask(022)
17866 iked RET umask 79/0x4f
17866 iked CALL chmod(0x631ceb,0660)
17866 iked NAMI "/var/run/iked.sock"
17866 iked RET chmod 0
17866 iked CALL fcntl(0xf,F_GETFL)
17866 iked RET fcntl 2
17866 iked CALL fcntl(0xf,F_SETFL,0x6)
17866 iked RET fcntl 0
17866 iked CALL chroot(0x631d39)
17866 iked NAMI "/etc/iked/"
17866 iked RET chroot 0
17866 iked CALL chdir(0x6669f2)
17866 iked NAMI "/"
17866 iked RET chdir 0
17866 iked CALL __sysctl(2.3,0x7f7c8640,0x7f7c8630,0,0)
17866 iked RET __sysctl 0
17866 iked CALL setgroups(0x1,0x9bddb4)
17866 iked RET setgroups 0
17866 iked CALL setresgid(0x65,0x65,0x65)
17866 iked RET setresgid 0
17866 iked CALL setresuid(0x65,0x65,0x65)
17866 iked RET setresuid 0
17866 iked CALL clock_gettime(CLOCK_MONOTONIC,0x7f7c86c0)
17866 iked STRU struct timespec { 150798.566033906 }
17866 iked RET clock_gettime 0
17866 iked CALL clock_gettime(CLOCK_MONOTONIC,0x7f7c8690)
17866 iked STRU struct timespec { 150798.566077766 }
17866 iked RET clock_gettime 0
17866 iked CALL issetugid()
17866 iked RET issetugid 0
17866 iked CALL kqueue()
17866 iked RET kqueue 16/0x10
17866 iked CALL getpid()
17866 iked RET getpid 17866/0x45ca
17866 iked CALL getentropy(0x7f7c8550,0x28)
17866 iked RET getentropy 0
17866 iked CALL issetugid()
17866 iked RET issetugid 0
17866 iked CALL kevent(0x10,0x7f7c8640,0x1,0,0,0x7f7c8660)
17866 iked STRU struct timespec { 0 }
17866 iked RET kevent 0
17866 iked CALL sigaction(SIGINT,0x7f7c85f0,0x2817fb150)
17866 iked STRU struct sigaction { handler=0x42f6f0, mask=~0<>, flags=0x2 }
17866 iked STRU struct sigaction { handler=SIG_DFL, mask=0<>, flags=0<> }
17866 iked RET sigaction 0
17866 iked CALL kevent(0x10,0x7f7c8640,0x1,0,0,0x7f7c8660)
17866 iked STRU struct timespec { 0 }
17866 iked RET kevent 0
17866 iked CALL sigaction(SIGTERM,0x7f7c85f0,0x2817fcc30)
17866 iked STRU struct sigaction { handler=0x42f6f0, mask=~0<>, flags=0x2 }
17866 iked STRU struct sigaction { handler=SIG_DFL, mask=0<>, flags=0<> }
17866 iked RET sigaction 0
17866 iked CALL kevent(0x10,0x7f7c8640,0x1,0,0,0x7f7c8660)
17866 iked STRU struct timespec { 0 }
17866 iked RET kevent 0
17866 iked CALL sigaction(SIGCHLD,0x7f7c85f0,0x2817fa980)
17866 iked STRU struct sigaction { handler=0x42f6f0, mask=~0<>, flags=0x2 }
17866 iked STRU struct sigaction { handler=SIG_DFL, mask=0<>, flags=0<> }
17866 iked RET sigaction 0
17866 iked CALL kevent(0x10,0x7f7c8640,0x1,0,0,0x7f7c8660)
17866 iked STRU struct timespec { 0 }
17866 iked RET kevent 0
17866 iked CALL sigaction(SIGHUP,0x7f7c85f0,0x2817fc530)
17866 iked STRU struct sigaction { handler=0x42f6f0, mask=~0<>, flags=0x2 }
17866 iked STRU struct sigaction { handler=SIG_DFL, mask=0<>, flags=0<> }
17866 iked RET sigaction 0
17866 iked CALL kevent(0x10,0x7f7c8640,0x1,0,0,0x7f7c8660)
17866 iked STRU struct timespec { 0 }
17866 iked RET kevent 0
17866 iked CALL sigaction(SIGPIPE,0x7f7c85f0,0x2817fc080)
17866 iked STRU struct sigaction { handler=0x42f6f0, mask=~0<>, flags=0x2 }

add DSA and ECDSA to relayd ca engine
Hi,

Two diffs below.

The first moves ecdsa_method declaration from ecs_locl.h to ecdsa.h,
as ecs_locl.h is not installed in /usr/include/openssl/.

The second one adds DSA and ECDSA capabilities to relayd ca engine,
and also checks that when using a DSA certificate, we have enabled EDH
in the relevant proto section. Requirements have been documented in
relayd.conf(5).

It works, but it surely needs some refactoring and style tweaks.

Comments ?

[Diff #1]

Index: ecdsa.h
===
RCS file: /cvs/src/lib/libssl/src/crypto/ecdsa/ecdsa.h,v
retrieving revision 1.2
diff -u -p -u -r1.2 ecdsa.h
--- ecdsa.h	12 Jun 2014 15:49:29 -	1.2
+++ ecdsa.h	8 Nov 2014 18:30:58 -
@@ -75,11 +75,36 @@ extern "C" {
 #endif
 
-typedef struct ECDSA_SIG_st
-	{
+typedef struct ECDSA_SIG_st ECDSA_SIG;
+
+struct ecdsa_method {
+	const char *name;
+	ECDSA_SIG *(*ecdsa_do_sign)(const unsigned char *dgst, int dgst_len,
+	    const BIGNUM *inv, const BIGNUM *rp, EC_KEY *eckey);
+	int (*ecdsa_sign_setup)(EC_KEY *eckey, BN_CTX *ctx, BIGNUM **kinv,
+	    BIGNUM **r);
+	int (*ecdsa_do_verify)(const unsigned char *dgst, int dgst_len,
+	    const ECDSA_SIG *sig, EC_KEY *eckey);
+#if 0
+	int (*init)(EC_KEY *eckey);
+	int (*finish)(EC_KEY *eckey);
+#endif
+	int flags;
+	char *app_data;
+};
+
+/* If this flag is set the ECDSA method is FIPS compliant and can be used
+ * in FIPS mode. This is set in the validated module method. If an
+ * application sets this flag in its own methods it is its responsibility
+ * to ensure the result is compliant.
+ */
+
+#define ECDSA_FLAG_FIPS_METHOD 0x1
+
+struct ECDSA_SIG_st {
 	BIGNUM *r;
 	BIGNUM *s;
-	} ECDSA_SIG;
+};
 
 /** Allocates and initialize a ECDSA_SIG structure
  * \return pointer to a ECDSA_SIG structure or NULL if an error occurred
Index: ecs_locl.h
===
RCS file: /cvs/src/lib/libssl/src/crypto/ecdsa/ecs_locl.h,v
retrieving revision 1.2
diff -u -p -u -r1.2 ecs_locl.h
--- ecs_locl.h	12 Jun 2014 15:49:29 -	1.2
+++ ecs_locl.h	8 Nov 2014 18:30:58 -
@@ -65,31 +65,6 @@ extern "C" {
 #endif
 
-struct ecdsa_method
-	{
-	const char *name;
-	ECDSA_SIG *(*ecdsa_do_sign)(const unsigned char *dgst, int dgst_len,
-	    const BIGNUM *inv, const BIGNUM *rp, EC_KEY *eckey);
-	int (*ecdsa_sign_setup)(EC_KEY *eckey, BN_CTX *ctx, BIGNUM **kinv,
-	    BIGNUM **r);
-	int (*ecdsa_do_verify)(const unsigned char *dgst, int dgst_len,
-	    const ECDSA_SIG *sig, EC_KEY *eckey);
-#if 0
-	int (*init)(EC_KEY *eckey);
-	int (*finish)(EC_KEY *eckey);
-#endif
-	int flags;
-	char *app_data;
-	};
-
-/* If this flag is set the ECDSA method is FIPS compliant and can be used
- * in FIPS mode. This is set in the validated module method. If an
- * application sets this flag in its own methods it is its responsibility
- * to ensure the result is compliant.
- */
-
-#define ECDSA_FLAG_FIPS_METHOD 0x1
-
 typedef struct ecdsa_data_st {
 	/* EC_KEY_METH_DATA part */
 	int (*init)(EC_KEY *);

[Diff #2]

Index: ca.c
===
RCS file: /cvs/src/usr.sbin/relayd/ca.c,v
retrieving revision 1.9
diff -u -p -u -r1.9 ca.c
--- ca.c	2 Oct 2014 19:16:31 -	1.9
+++ ca.c	8 Nov 2014 18:26:15 -
@@ -32,9 +32,13 @@
 #include
 #include
+#include
+#include
 #include
 #include
 #include
+#include
+#include
 #include
 #include "relayd.h"
@@ -60,6 +64,30 @@ int rsae_verify(int dtype, const u_char
 	    u_int, const RSA *);
 int	 rsae_keygen(RSA *, int, BIGNUM *, BN_GENCB *);
+
+DSA_SIG	*dsae_do_sign(const unsigned char *, int, DSA *);
+int	 dsae_sign_setup(DSA *, BN_CTX *, BIGNUM **, BIGNUM **);
+int	 dsae_do_verify(const unsigned char *, int, DSA_SIG *,
+	    DSA *);
+int	 dsae_mod_exp(DSA *, BIGNUM *, BIGNUM *, BIGNUM *,
+	    BIGNUM *, BIGNUM *, BIGNUM *, BN_CTX *,
+	    BN_MONT_CTX *);
+int	 dsae_bn_mod_exp(DSA *, BIGNUM *, BIGNUM *,
+	    const BIGNUM *, const BIGNUM *, BN_CTX *,
+	    BN_MONT_CTX *);
+int	 dsae_init(DSA *);
+int	 dsae_finish(DSA *);
+int	 dsae_paramgen(DSA *, int, const unsigned char *, int,
+	    int *, unsigned long *, BN_GENCB *);
+int	 dsae_keygen(DSA *);
+
+ECDSA_SIG *ecdsae_do_sign(const unsigned char *, int ,
+	    const BIGNUM *, const BIGNUM *, EC_KEY *);
+int	 ecdsae_sign_setup(EC_KEY *, BN_CTX *, BIGNUM **,
+	    BIGNUM **);
+int	 ecdsae_do_verify(const unsigned char *, int,
+	    const ECDSA_S

Re: Request for Funding our Electricity
On Wed, Jan 15, 2014 at 06:25:53PM +0200, MJ wrote:
>
> I have long held the opinion that Theo is probably the best coder on this
> planet. That's not any sort of ass-kissing, either, it's my objective,
> unbiased opinion. And I know Henning personally, as in "live and worked
> together with him" - one hell of an expert.
>
> However, the dilemma that the project has found itself in now very clearly
> demonstrates that Theo is not a businessman and that there isn't any other
> businessman at the helm, either. Imagining that people will suddenly start to
> pay for something that they have constantly been getting for free is absurd -
> their belief is that somebody else will surely step up first or somebody will
> fork in the name of fame. No business on this planet is going to allocate
> budget to paying OpenBSD's electricity bills, let alone anything else,
> without 1) a detailed itemisation of the electrical bills, 2) a detailed
> justification of said line items, and 3) a satisfaction of their own business
> interest. It's just not sexy for a philanthropist to support a relatively
> unheard of operating system when cancer is still left uncured.

Define sexy. Some people will say it's having flash running full speed
on their web browser while streaming 3 youtube videos. For me it's
being able to trust my operating system to behave in a way that keeps
me in the loop and able to fix it.

As for the legalese, some people said "You'll never get anywhere
without a protocol number for CARP!", yet some ciscos support CARP
nowadays.

>
> It's not good to be removing coders from their tasks; the project needs a
> businessman or two. One who will handle the corporate feature requests and
> charge dearly for them. Things like routing technology and high-speed packet
> forwarding - things that can replace the exorbitant costs of maintaining
> cisco routers. This is the key. With the FBSD 10GB wire speed packet
> forwarding incorporated, OpenBSD would be ready to challenge Cisco in a very
> serious way. Completely free as always, but with paid support for this edge
> cases that make life what it is.
>

I don't know what your background with corporate IT is, but my
experience is that most of the time what the suits are looking for is
the assurance that they will have resources to fix arising issues, or
in layman's terms, a tech support to yell at.

I do not see OpenBSD providing such support. However, there are quite
a few companies that provide such a service for their OpenBSD-based
appliances. Does that mean the OpenBSD roadmap should be based on what
will sell with these companies? The answer (which is "no") has already
been given many times on misc@, and I will let Theo add another layer
of p[ao]int if he deems it necessary.

Lastly, you suggest having a businessman in the project. That is,
someone who gets a commit bit by doing something other than coding.
It's not even about what this says to the world or the example it sets.
It is just plain rude towards the developers. I am not downplaying the
skills of businessmen; but you simply can't say that contributing
code the OpenBSD way is the same as selling the product, however tough
that may be. This is not a race; this is about doing things right.

regards,

--
Vincent