On Fri, Oct 25, 2019 at 12:27:25PM -0700, Mike Larkin wrote:
> On Fri, Oct 25, 2019 at 06:15:59PM +0000, Reyk Floeter wrote:
> > Hi,
> >
> > the attached diff is rather large and implements two things for vmd:
> >
> > 1) Allow to configure static IP address/gateway pairs local interfaces.
> > 2) Skip statically configured interface names (eg. tap0) when
> > allocating dynamic interfaces.
> >
> > Example:
> > ---snip---
> > vm "foo" {
> > 	disable
> > 	local interface "tap0" {
> > 		address 192.168.0.10/24 192.168.0.1
> > 	}
> > 	local interface "tap1"
> > 	disk "/home/vm/foo.qcow2"
> > }
> >
> > vm "bar" {
> > 	local interface
> > 	disk "/home/vm/bar.qcow2"
> > }
> > ---snap---
> >
> >
> > 1) The VM "foo" has two interfaces: The first interface has a fixed
> > IPv4 address with 192.168.0.1/24 on the gateway and 192.168.0.10/24 on
> > the VM. 192.168.0.10/24 is assigned to the VM's first NIC via the
> > built-in DHCP server. The second VM gets a default 100.64.x.x/31 IP.
>
> I'm not sure the above description matches what I'm seeing in the vm.conf
> snippet above.
>
> What's "the gateway" here? Is this the host machine, or the actual
> gateway, perhaps on some other machine? Does this just allow me to specify
> the host-side tap(4) IP address for a corresponding given VM vio(4) interface?
>
Ah, OK. I used the terms without explaining them:

With local interfaces, vmd(8) uses two IPs per interface: one for the
tap(4) on the host, one for the vio(4) on the VM. It configures the
first one on the host and provides the second one via DHCP. The IP on
the host is the default "gateway" router for the VM.

The address syntax is currently reversed:
	address "address/prefix" "gateway"
Maybe I should change it to
	address "gateway" "address/prefix"
or
	address "address/prefix" gateway "gateway"

I also wonder if we could technically use a non-local IP address for
the gateway. I currently enforce that the prefix matches, but I don't
enforce that both addresses are in the same subnet.

When using the default auto-generated 100.64.0.0/31 method, it uses
the first IP in the subnet as the gateway and the second IP for the VM.

> And did you mean "The second interface" there instead of the "The second VM"?
> (Although I think the description fits for "The second VM" also...)
>

Yes, both, the second interface is correct as well.

> I think the idea is sound. As long as we don't end up adding extra command
> line args to vmctl to manually configure this, which it doesn't appear we are
> doing here. :)
>

I don't want to add it to vmctl either.

> I didn't read the diff in great detail, I'll wait until you say you have a
> final version.
>

OK, thanks.

Reyk

> -ml
>
> > This idea came up when I talked with Mischa at EuroBSDCon about
> > OpenBSDAms: instead of using L2 and external static dhcpd for all VMs,
> > it could be a solution to use L3 and to avoid bridge(4) and dhcpd(8).
> > But it would need a way to serve static IPs via the internal dhcp
> > server. Using L3 with vmd is better with performance, routing, PF,
> > etc., but has the drawback that it wastes a subnet and gateway IP per
> > VM (maybe rdomains or other tricks could help here, but this is a
> > problem for later).
> >
> > 2) The VM "foo" uses two static interface names, tap0 and tap1, and
> > the VM "bar" uses a dynamic interface name (tapX). Without this diff,
> > vmd would most certainly use tap0 for bar's interface because foo is
> > disabled and not started before bar. With the diff, the first
> > interface of bar will be tap2 or higher.
> > The problem was just reported by kn@. I mixed both things into
> > one diff because I was working on 1) when kn@ reported it. There are
> > other ways to implement 2) but solving both issues in a similar way
> > made more sense.
> >
> > This is not the final diff. I still have to clean it up, get
> > feedback, think a little bit about it, and split it into smaller parts
> > for review. I wanted to share the big picture.
> >
> > As a side note, I implemented the lookup with sorted tables because it
> > is the most efficient way to do it, but maybe a simple linear lookup
> > (iterating over all the VMs and all the interfaces all the time) would
> > be good enough. But the current approach has benefits - if I did it
> > right ;)
> >
> > Thoughts?
> > > > Reyk > > > > Index: usr.sbin/vmd/dhcp.c > > =================================================================== > > RCS file: /cvs/src/usr.sbin/vmd/dhcp.c,v > > retrieving revision 1.8 > > diff -u -p -u -p -r1.8 dhcp.c > > --- usr.sbin/vmd/dhcp.c 27 Dec 2018 19:51:30 -0000 1.8 > > +++ usr.sbin/vmd/dhcp.c 25 Oct 2019 18:11:05 -0000 > > @@ -119,8 +119,7 @@ dhcp_request(struct vionet_dev *dev, cha > > } > > > > if ((client_addr.s_addr = > > - vm_priv_addr(&env->vmd_cfg, > > - dev->vm_vmid, dev->idx, 1)) == 0) > > + vm_priv_addr(&env->vmd_cfg, dev->vm_vmid, dev->idx, 1, &mask)) == 0) > > return (-1); > > memcpy(&resp.yiaddr, &client_addr, > > sizeof(client_addr)); > > @@ -129,7 +128,7 @@ dhcp_request(struct vionet_dev *dev, cha > > ss2sin(&pc.pc_dst)->sin_port = htons(CLIENT_PORT); > > > > if ((server_addr.s_addr = vm_priv_addr(&env->vmd_cfg, dev->vm_vmid, > > - dev->idx, 0)) == 0) > > + dev->idx, 0, &mask)) == 0) > > return (-1); > > memcpy(&resp.siaddr, &server_addr, sizeof(server_addr)); > > memcpy(&ss2sin(&pc.pc_src)->sin_addr, &server_addr, > > Index: usr.sbin/vmd/parse.y > > =================================================================== > > RCS file: /cvs/src/usr.sbin/vmd/parse.y,v > > retrieving revision 1.52 > > diff -u -p -u -p -r1.52 parse.y > > --- usr.sbin/vmd/parse.y 14 May 2019 06:05:45 -0000 1.52 > > +++ usr.sbin/vmd/parse.y 25 Oct 2019 18:11:05 -0000 > > @@ -120,9 +120,9 @@ typedef struct { > > > > > > %token INCLUDE ERROR > > -%token ADD ALLOW BOOT CDROM DEVICE DISABLE DISK DOWN ENABLE FORMAT > > GROUP > > -%token INET6 INSTANCE INTERFACE LLADDR LOCAL LOCKED MEMORY NET NIFS > > OWNER > > -%token PATH PREFIX RDOMAIN SIZE SOCKET SWITCH UP VM VMID > > +%token ADD ADDRESS ALLOW BOOT CDROM DEVICE DISABLE DISK DOWN ENABLE > > FORMAT > > +%token GROUP INET6 INSTANCE INTERFACE LLADDR LOCAL LOCKED MEMORY NET > > NIFS > > +%token OWNER PATH PREFIX RDOMAIN SIZE SOCKET SWITCH UP VM VMID > > %token <v.number> NUMBER > > %token <v.string> STRING > > 
%type	<v.lladdr>	lladdr
> > @@ -413,6 +413,12 @@ vm_opts		: disable
> >  		{
> >
> >  			if ($1)
> >  				vmc.vmc_ifflags[i] |= VMIFF_LOCAL;
> > +			else if (vmc.vmc_ifflags[i] &
> > +			    (VMIFF_ADDR4|VMIFF_ADDR6)) {
> > +				yyerror("address on non-local interface");
> > +				free($3);
> > +				YYERROR;
> > +			}
> >  			if ($3 != NULL) {
> >  				if (strcmp($3, "tap") != 0 &&
> >  				    (priv_getiftype($3, type, NULL) == -1 ||
> > @@ -617,7 +623,53 @@ iface_opts_c	: iface_opts_c iface_opts o
> >  		| iface_opts
> >  		;
> >
> > -iface_opts	: SWITCH string			{
> > +iface_opts	: ADDRESS STRING STRING		{
> > +			struct vmop_address	*vma;
> > +			unsigned int		 i = vcp_nnics;
> > +			struct address		 addr, gw;
> > +			char			*gwp = NULL;
> > +			int			 maxprefixlen = 0;
> > +
> > +			/* Does the gateway have a /prefix syntax? */
> > +			gwp = strrchr($3, '/');
> > +
> > +			if (host($2, &addr) == -1 ||
> > +			    host($3, &gw) == -1 ||
> > +			    addr.ss.ss_family != gw.ss.ss_family) {
> > +				yyerror("invalid address: %s %s", $2, $3);
> > +				free($2);
> > +				free($3);
> > +				YYERROR;
> > +			}
> > +			free($2);
> > +
> > +			if (gwp == NULL)
> > +				gw.prefixlen = addr.prefixlen;
> > +			else if (gw.prefixlen != addr.prefixlen) {
> > +				yyerror("mismatched gateway prefix: %s", $3);
> > +				free($3);
> > +				YYERROR;
> > +			}
> > +			free($3);
> > +
> > +			if (addr.ss.ss_family == AF_INET) {
> > +				vmc.vmc_ifflags[i] |= VMIFF_ADDR4;
> > +				vma = &vmc.vmc_ifaddr4[i];
> > +				maxprefixlen = 31;
> > +			} else {
> > +				vmc.vmc_ifflags[i] |= VMIFF_ADDR6;
> > +				vma = &vmc.vmc_ifaddr6[i];
> > +				maxprefixlen = 127;
> > +			}
> > +			if (maxprefixlen && addr.prefixlen > maxprefixlen) {
> > +				yyerror("address prefix larger than /%d", maxprefixlen);
> > +				YYERROR;
> > +			}
> > +			memcpy(&vma->vma_addr, &addr.ss, sizeof(addr.ss));
> > +			memcpy(&vma->vma_gw, &gw.ss, sizeof(gw.ss));
> > +			vma->vma_prefixlen = addr.prefixlen;
> > +		}
> > +		| SWITCH string			{
> >  			unsigned int	i = vcp_nnics;
> >
> >  			/* No need to check if the switch exists */
> > @@ -763,6 +815,7 @@ lookup(char *s)
> >  	/* this has to be sorted always */
> > static const struct keywords keywords[] = { > > { "add", ADD }, > > + { "address", ADDRESS }, > > { "allow", ALLOW }, > > { "boot", BOOT }, > > { "cdrom", CDROM }, > > Index: usr.sbin/vmd/priv.c > > =================================================================== > > RCS file: /cvs/src/usr.sbin/vmd/priv.c,v > > retrieving revision 1.15 > > diff -u -p -u -p -r1.15 priv.c > > --- usr.sbin/vmd/priv.c 28 Jun 2019 13:32:51 -0000 1.15 > > +++ usr.sbin/vmd/priv.c 25 Oct 2019 18:11:05 -0000 > > @@ -46,6 +46,12 @@ > > #include "proc.h" > > #include "vmd.h" > > > > +static unsigned int *priv_ifunits; > > +static size_t priv_nifunits; > > + > > +static struct vmd_ifconfig *priv_ifs; > > +static size_t priv_nifs; > > + > > int priv_dispatch_parent(int, struct privsep_proc *, struct imsg > > *); > > void priv_run(struct privsep *, struct privsep_proc *, void *); > > > > @@ -91,7 +97,9 @@ priv_dispatch_parent(int fd, struct priv > > struct ifaliasreq ifra; > > struct in6_aliasreq in6_ifra; > > struct if_afreq ifar; > > + struct vmd_ifconfig vifc; > > char type[IF_NAMESIZE]; > > + int i; > > > > switch (imsg->hdr.type) { > > case IMSG_VMDOP_PRIV_IFDESCR: > > @@ -112,6 +120,8 @@ priv_dispatch_parent(int fd, struct priv > > fatalx("%s: rejected priv operation on interface: %s", > > __func__, vfr.vfr_name); > > break; > > + case IMSG_VMDOP_IF_REGISTER: > > + case IMSG_VMDOP_IF_UNREGISTER: > > case IMSG_VMDOP_CONFIG: > > case IMSG_CTL_RESET: > > break; > > @@ -244,6 +254,18 @@ priv_dispatch_parent(int fd, struct priv > > if (ioctl(env->vmd_fd6, SIOCAIFADDR_IN6, &in6_ifra) == -1) > > log_warn("SIOCAIFADDR_IN6"); > > break; > > + case IMSG_VMDOP_IF_REGISTER: > > + IMSG_SIZE_CHECK(imsg, &vifc); > > + memcpy(&vifc, imsg->data, sizeof(vifc)); > > + if (vm_priv_register(ps, &vifc) == -1) > > + fatalx("%s: failed to register interface", > > + __func__); > > + break; > > + case IMSG_VMDOP_IF_UNREGISTER: > > + IMSG_SIZE_CHECK(imsg, &i); > > + memcpy(&i, imsg->data, sizeof(i)); > > + 
vm_priv_unregister(ps, imsg->hdr.peerid, i); > > + break; > > case IMSG_VMDOP_CONFIG: > > config_getconfig(env, imsg); > > break; > > @@ -328,8 +350,9 @@ vm_priv_ifconfig(struct privsep *ps, str > > struct vmd_switch *vsw; > > unsigned int i; > > struct vmop_ifreq vfr, vfbr; > > - struct sockaddr_in *sin4; > > - struct sockaddr_in6 *sin6; > > + struct sockaddr_in *sin4, *mask4; > > + struct sockaddr_in6 *sin6, *mask6; > > + uint8_t prefixlen; > > > > for (i = 0; i < VMM_MAX_NICS_PER_VM; i++) { > > vif = &vm->vm_ifs[i]; > > @@ -435,24 +458,25 @@ vm_priv_ifconfig(struct privsep *ps, str > > memset(&vfr.vfr_mask, 0, sizeof(vfr.vfr_mask)); > > memset(&vfr.vfr_addr, 0, sizeof(vfr.vfr_addr)); > > > > - /* local IPv4 address with a /31 mask */ > > - sin4 = (struct sockaddr_in *)&vfr.vfr_mask; > > - sin4->sin_family = AF_INET; > > - sin4->sin_len = sizeof(*sin4); > > - sin4->sin_addr.s_addr = htonl(0xfffffffe); > > + /* local IPv4 address and netmask */ > > + mask4 = ss2sin(&vfr.vfr_mask); > > + mask4->sin_family = AF_INET; > > + mask4->sin_len = sizeof(*mask4); > > > > - sin4 = (struct sockaddr_in *)&vfr.vfr_addr; > > + sin4 = ss2sin(&vfr.vfr_addr); > > sin4->sin_family = AF_INET; > > sin4->sin_len = sizeof(*sin4); > > if ((sin4->sin_addr.s_addr = > > vm_priv_addr(&env->vmd_cfg, > > - vm->vm_vmid, i, 0)) == 0) > > + vm->vm_vmid, i, 0, &mask4->sin_addr)) == 0) > > return (-1); > > > > inet_ntop(AF_INET, &sin4->sin_addr, > > name, sizeof(name)); > > - log_debug("%s: interface %s address %s/31", > > - __func__, vfr.vfr_name, name); > > + prefixlen = mask2prefixlen((struct sockaddr *)mask4); > > + > > + log_debug("%s: interface %s address %s/%u", > > + __func__, vfr.vfr_name, name, prefixlen); > > > > proc_compose(ps, PROC_PRIV, IMSG_VMDOP_PRIV_IFADDR, > > &vfr, sizeof(vfr)); > > @@ -462,24 +486,24 @@ vm_priv_ifconfig(struct privsep *ps, str > > memset(&vfr.vfr_mask, 0, sizeof(vfr.vfr_mask)); > > memset(&vfr.vfr_addr, 0, sizeof(vfr.vfr_addr)); > > > > - /* local IPv6 address 
with a /96 mask */ > > - sin6 = ss2sin6(&vfr.vfr_mask); > > - sin6->sin6_family = AF_INET6; > > - sin6->sin6_len = sizeof(*sin6); > > - memset(&sin6->sin6_addr.s6_addr[0], 0xff, 12); > > - memset(&sin6->sin6_addr.s6_addr[12], 0, 4); > > + /* local IPv6 address and netmask */ > > + mask6 = ss2sin6(&vfr.vfr_mask); > > + mask6->sin6_family = AF_INET6; > > + mask6->sin6_len = sizeof(*sin6); > > > > sin6 = ss2sin6(&vfr.vfr_addr); > > sin6->sin6_family = AF_INET6; > > sin6->sin6_len = sizeof(*sin6); > > if (vm_priv_addr6(&env->vmd_cfg, > > - vm->vm_vmid, i, 0, &sin6->sin6_addr) == -1) > > + vm->vm_vmid, i, 0, &sin6->sin6_addr, > > + &mask6->sin6_addr) == -1) > > return (-1); > > > > inet_ntop(AF_INET6, &sin6->sin6_addr, > > name, sizeof(name)); > > - log_debug("%s: interface %s address %s/96", > > - __func__, vfr.vfr_name, name); > > + prefixlen = mask2prefixlen6((struct sockaddr *)mask6); > > + log_debug("%s: interface %s address %s/%u", > > + __func__, vfr.vfr_name, name, prefixlen); > > > > proc_compose(ps, PROC_PRIV, IMSG_VMDOP_PRIV_IFADDR6, > > &vfr, sizeof(vfr)); > > @@ -543,11 +567,196 @@ vm_priv_brconfig(struct privsep *ps, str > > return (0); > > } > > > > +static int > > +priv_if_cmp(const void *a, const void *b) > > +{ > > + const struct vmd_ifconfig *vifca = a; > > + const struct vmd_ifconfig *vifcb = b; > > + > > + if (vifca->vifc_vmid != vifcb->vifc_vmid) > > + return (vifca->vifc_vmid > vifcb->vifc_vmid ? 1 : -1); > > + if (vifca->vifc_idx != vifcb->vifc_idx) > > + return (vifca->vifc_idx > vifcb->vifc_idx ? 
1 : -1);
> > +
> > +	return (0);
> > +}
> > +
> > +static int
> > +priv_ifunit_cmp(const void *a, const void *b)
> > +{
> > +	unsigned int	 ia = *(const unsigned int *)a;
> > +	unsigned int	 ib = *(const unsigned int *)b;
> > +
> > +	return ((ia > ib) - (ia < ib));
> > +}
> > +
> > +unsigned int *
> > +vm_priv_byunit(unsigned int unit)
> > +{
> > +	return (bsearch(&unit, priv_ifunits, priv_nifunits, sizeof(unit),
> > +	    priv_ifunit_cmp));
> > +}
> > +
> > +struct vmd_ifconfig *
> > +vm_priv_byid(uint32_t vmid, int idx)
> > +{
> > +	struct vmd_ifconfig	 key;
> > +
> > +	key.vifc_vmid = vmid;
> > +	key.vifc_idx = idx;
> > +	return (bsearch(&key, priv_ifs, priv_nifs, sizeof(key), priv_if_cmp));
> > +}
> > +
> > +/*
> > + * Called to register global interface configuration
> > + * - the associated VM id
> > + * - the relative interface index of the VM
> > + * - the fixed tap(4) interface unit (optional)
> > + * - the fixed IP address (optional)
> > + */
> > +int
> > +vm_priv_register(struct privsep *ps, struct vmd_ifconfig *vifc)
> > +{
> > +	struct vmd_ifconfig	*ifc = NULL;
> > +	unsigned int		*ifu;
> > +
> > +	/* Ignore interfaces that don't have any relevant configuration */
> > +	if (vifc->vifc_flags == 0)
> > +		return (0);
> > +
> > +	if (vifc->vifc_vmid == UINT32_MAX) {
> > +		log_warnx("VM id %u too large", vifc->vifc_vmid);
> > +		goto fail;
> > +	}
> > +
> > +	if (vm_priv_byid(vifc->vifc_vmid, vifc->vifc_idx) != NULL) {
> > +		log_warnx("interface vm %u #%u registered twice",
> > +		    vifc->vifc_vmid, vifc->vifc_idx);
> > +		goto fail;
> > +	}
> > +
> > +	/* Append new interface */
> > +	if ((ifc = recallocarray(priv_ifs, priv_nifs,
> > +	    priv_nifs + 1, sizeof(*ifc))) == NULL) {
> > +		log_warn("failed to grow interface table");
> > +		goto fail;
> > +	}
> > +	priv_ifs = ifc;
> > +	memcpy(&priv_ifs[priv_nifs], vifc, sizeof(*vifc));
> > +	priv_nifs++;
> > +
> > +	/* Sort table */
> > +	qsort(priv_ifs, priv_nifs, sizeof(*ifc), priv_if_cmp);
> > +
> > +	if (vifc->vifc_flags & VMD_IFC_UNIT) {
> > +
if (vifc->vifc_unit == UINT_MAX) { > > + log_warnx("interface tap%u unit too large", > > + vifc->vifc_unit); > > + goto fail; > > + } > > + > > + if (vm_priv_byunit(vifc->vifc_unit) != NULL) { > > + log_warnx("interface tap%u defined twice", > > + vifc->vifc_unit); > > + goto fail; > > + } > > + > > + /* Append new interface unit */ > > + if ((ifu = recallocarray(priv_ifunits, priv_nifunits, > > + priv_nifunits + 1, sizeof(*ifu))) == NULL) { > > + log_warn("failed to grow interface unit table"); > > + goto fail; > > + } > > + priv_ifunits = ifu; > > + priv_ifunits[priv_nifunits++] = vifc->vifc_unit; > > + > > + /* Sort table */ > > + qsort(priv_ifunits, priv_nifunits, sizeof(*ifu), > > + priv_ifunit_cmp); > > + > > + log_debug("%s: %s registered interface tap%u", __func__, > > + ps->ps_title[privsep_process], > > + vifc->vifc_unit); > > + } > > + > > + return (0); > > + > > + fail: > > + if (ifc != NULL) > > + vm_priv_unregister(ps, vifc->vifc_vmid, vifc->vifc_idx); > > + return (-1); > > +} > > + > > +/* > > + * Called to unregister global interface configuration > > + */ > > +void > > +vm_priv_unregister(struct privsep *ps, uint32_t vmid, int idx) > > +{ > > + struct vmd_ifconfig *vifc, *ifc; > > + unsigned int *ifu; > > + > > + if ((vifc = vm_priv_byid(vmid, idx)) == NULL) > > + return; > > + > > + if (vifc->vifc_flags & VMD_IFC_UNIT && > > + (ifu = vm_priv_byunit(vifc->vifc_unit)) != NULL) { > > + /* Move entry to the end */ > > + *ifu = UINT_MAX; > > + qsort(priv_ifunits, priv_nifunits, sizeof(*ifu), > > + priv_ifunit_cmp); > > + > > + /* and remove last entry from the table */ > > + if ((ifu = recallocarray(priv_ifunits, priv_nifunits, > > + priv_nifunits - 1, sizeof(*ifu))) == NULL && > > + priv_nifunits > 1) { > > + log_warn("failed to shrink interface unit table"); > > + return; > > + } > > + priv_ifunits = ifu; > > + priv_nifunits--; > > + > > + log_debug("%s: %s unregistered interface tap%u", __func__, > > + ps->ps_title[privsep_process], > > + 
vifc->vifc_unit); > > + } > > + > > + /* Move entry to the end */ > > + vifc->vifc_vmid = UINT32_MAX; > > + qsort(priv_ifs, priv_nifs, sizeof(*ifc), priv_if_cmp); > > + > > + /* and remove last entry from the table */ > > + if ((ifc = recallocarray(priv_ifs, priv_nifs, > > + priv_nifs - 1, sizeof(*ifc))) == NULL && > > + priv_nifs > 1) { > > + log_warn("failed to shrink interface table"); > > + return; > > + } > > + priv_ifs = ifc; > > + priv_nifs--; > > + > > + log_debug("%s: %s unregistered interface vm %u #%u", __func__, > > + ps->ps_title[privsep_process], vmid, idx); > > +} > > + > > uint32_t > > -vm_priv_addr(struct vmd_config *cfg, uint32_t vmid, int idx, int isvm) > > +vm_priv_addr(struct vmd_config *cfg, uint32_t vmid, int idx, int isvm, > > + struct in_addr *mask) > > { > > struct address *h = &cfg->cfg_localprefix; > > - in_addr_t prefix, mask, addr; > > + in_addr_t prefix, addr; > > + struct vmd_ifconfig *vifc; > > + > > + /* Check if there is a preconfigured address for this interface */ > > + if ((vifc = vm_priv_byid(vmid, idx)) != NULL && > > + vifc->vifc_flags & VMD_IFC_ADDR4) { > > + if (isvm) > > + addr = vifc->vifc_addr4.sin_addr.s_addr; > > + else > > + addr = vifc->vifc_gw4.sin_addr.s_addr; > > + mask->s_addr = prefixlen2mask(vifc->vifc_prefixlen4); > > + return (addr); > > + } > > > > /* > > * 1. Set the address prefix and mask, 100.64.0.0/10 by default. > > @@ -556,7 +765,7 @@ vm_priv_addr(struct vmd_config *cfg, uin > > h->prefixlen < 0 || h->prefixlen > 32) > > fatal("local prefix"); > > prefix = ss2sin(&h->ss)->sin_addr.s_addr; > > - mask = prefixlen2mask(h->prefixlen); > > + mask->s_addr = prefixlen2mask(h->prefixlen); > > > > /* 2. Encode the VM ID as a per-VM subnet range N, 100.64.N.0/24. */ > > addr = vmid << 8; > > @@ -580,7 +789,7 @@ vm_priv_addr(struct vmd_config *cfg, uin > > * - the address should not exceed the prefix (eg. VM ID to high). > > * - up to 126 interfaces can be encoded per VM. 
> > */ > > - if (prefix != (addr & mask) || idx >= 0x7f) { > > + if (prefix != (addr & mask->s_addr) || idx >= 0x7f) { > > log_warnx("%s: dhcp address range exceeded," > > " vm id %u interface %d", __func__, vmid, idx); > > return (0); > > @@ -591,21 +800,35 @@ vm_priv_addr(struct vmd_config *cfg, uin > > > > int > > vm_priv_addr6(struct vmd_config *cfg, uint32_t vmid, > > - int idx, int isvm, struct in6_addr *in6_addr) > > + int idx, int isvm, struct in6_addr *in6_addr, struct in6_addr *mask) > > { > > struct address *h = &cfg->cfg_localprefix6; > > - struct in6_addr addr, mask; > > + struct in6_addr addr, *addrptr; > > + struct vmd_ifconfig *vifc; > > uint32_t addr4; > > + struct in_addr mask4; > > + > > + /* Check if there is a preconfigured address for this interface */ > > + if ((vifc = vm_priv_byid(vmid, idx)) != NULL && > > + vifc->vifc_flags & VMD_IFC_ADDR6) { > > + if (isvm) > > + addrptr = &vifc->vifc_addr6.sin6_addr; > > + else > > + addrptr = &vifc->vifc_gw6.sin6_addr; > > + memcpy(in6_addr, addrptr, sizeof(*in6_addr)); > > + prefixlen2mask6(vifc->vifc_prefixlen6, mask); > > + return (0); > > + } > > > > /* 1. Set the address prefix and mask, fd00::/8 by default. */ > > if (h->ss.ss_family != AF_INET6 || > > h->prefixlen < 0 || h->prefixlen > 128) > > fatal("local prefix6"); > > addr = ss2sin6(&h->ss)->sin6_addr; > > - prefixlen2mask6(h->prefixlen, &mask); > > + prefixlen2mask6(h->prefixlen, mask); > > > > /* 2. Encode the VM IPv4 address as subnet, fd00::NN:NN:0:0/96. 
*/ > > - if ((addr4 = vm_priv_addr(cfg, vmid, idx, 1)) == 0) > > + if ((addr4 = vm_priv_addr(cfg, vmid, idx, 1, &mask4)) == 0) > > return (0); > > memcpy(&addr.s6_addr[8], &addr4, sizeof(addr4)); > > > > Index: usr.sbin/vmd/vm.conf.5 > > =================================================================== > > RCS file: /cvs/src/usr.sbin/vmd/vm.conf.5,v > > retrieving revision 1.44 > > diff -u -p -u -p -r1.44 vm.conf.5 > > --- usr.sbin/vmd/vm.conf.5 14 May 2019 12:47:17 -0000 1.44 > > +++ usr.sbin/vmd/vm.conf.5 25 Oct 2019 18:11:05 -0000 > > @@ -209,6 +209,14 @@ to select a specific one. > > .Pp > > Valid options are: > > .Bl -tag -width Ds > > +.It Ic address Ar address Ns Li / Ns Ar prefix Ar gateway > > +If the interface is configured as a > > +.Cm local > > +interface, > > +use a static IP address and gateway. > > +This option can be specified for IPv4 and for IPv6. > > +If not specified, the default is to auto-generate the address pair using > > the > > +.Cm local Oo Cm inet6 Oc Cm prefix . > > .It Cm group Ar group-name > > Assign the interface to a specific interface > > .Dq group . > > @@ -258,6 +266,8 @@ A > > interface will auto-generate an IPv4 subnet for the interface, > > configure a gateway address on the VM host side, > > and run a simple DHCP/BOOTP server for the VM. > > +The address can optionally be configured as a static > > +.Cm address . > > This option can be used for layer 3 mode without configuring a switch. 
> > .Pp > > If the global > > Index: usr.sbin/vmd/vmd.c > > =================================================================== > > RCS file: /cvs/src/usr.sbin/vmd/vmd.c,v > > retrieving revision 1.116 > > diff -u -p -u -p -r1.116 vmd.c > > --- usr.sbin/vmd/vmd.c 4 Sep 2019 07:02:03 -0000 1.116 > > +++ usr.sbin/vmd/vmd.c 25 Oct 2019 18:11:05 -0000 > > @@ -1161,6 +1161,8 @@ void > > vm_remove(struct vmd_vm *vm, const char *caller) > > { > > struct privsep *ps = &env->vmd_ps; > > + size_t i; > > + int idx; > > > > if (vm == NULL) > > return; > > @@ -1171,6 +1173,16 @@ vm_remove(struct vmd_vm *vm, const char > > > > TAILQ_REMOVE(env->vmd_vms, vm, vm_entry); > > > > + for (i = 0; i < vm->vm_params.vmc_params.vcp_nnics; i++) { > > + idx = (int)i; > > + vm_priv_unregister(ps, vm->vm_vmid, idx); > > + if (privsep_process == PROC_PARENT) { > > + proc_compose_imsg(ps, PROC_PRIV, -1, > > + IMSG_VMDOP_IF_UNREGISTER, > > + vm->vm_vmid, -1, &idx, sizeof(idx)); > > + } > > + } > > + > > user_put(vm->vm_user); > > vm_stop(vm, 0, caller); > > free(vm); > > @@ -1211,14 +1223,17 @@ int > > vm_register(struct privsep *ps, struct vmop_create_params *vmc, > > struct vmd_vm **ret_vm, uint32_t id, uid_t uid) > > { > > - struct vmd_vm *vm = NULL, *vm_parent = NULL; > > + char ifname[IF_NAMESIZE], *s; > > + struct vmd_vm *vm = NULL, *vm_new = NULL, *vm_parent = NULL; > > struct vm_create_params *vcp = &vmc->vmc_params; > > struct vmop_owner *vmo = NULL; > > + struct vmop_address *vma; > > struct vmd_user *usr = NULL; > > + struct vmd_ifconfig vifc; > > + int maxprefixlen; > > uint32_t nid, rng; > > unsigned int i, j; > > struct vmd_switch *sw; > > - char *s; > > > > /* Check if this is an instance of another VM */ > > if (vm_instance(ps, &vm_parent, vmc, uid) == -1) > > @@ -1294,7 +1309,7 @@ vm_register(struct privsep *ps, struct v > > goto fail; > > } > > > > - if ((vm = calloc(1, sizeof(*vm))) == NULL) > > + if ((vm = vm_new = calloc(1, sizeof(*vm))) == NULL) > > goto fail; > > > > 
memcpy(&vm->vm_params, vmc, sizeof(vm->vm_params)); > > @@ -1305,6 +1320,20 @@ vm_register(struct privsep *ps, struct v > > vm->vm_receive_fd = -1; > > vm->vm_state &= ~VM_STATE_PAUSED; > > vm->vm_user = usr; > > + vm->vm_kernel = -1; > > + vm->vm_cdrom = -1; > > + vm->vm_iev.ibuf.fd = -1; > > + > > + /* > > + * Assign a new internal Id if not specified and we succeed in > > + * claiming a new Id. > > + */ > > + if (id != 0) > > + vm->vm_vmid = id; > > + else if (vm_claimid(vcp->vcp_name, uid, &nid) == -1) > > + goto fail; > > + else > > + vm->vm_vmid = nid; > > > > for (i = 0; i < VMM_MAX_DISKS_PER_VM; i++) > > for (j = 0; j < VM_MAX_BASE_PER_DISK; j++) > > @@ -1333,30 +1362,69 @@ vm_register(struct privsep *ps, struct v > > vcp->vcp_macs[i][4] = rng; > > vcp->vcp_macs[i][5] = rng >> 8; > > } > > - } > > - vm->vm_kernel = -1; > > - vm->vm_cdrom = -1; > > - vm->vm_iev.ibuf.fd = -1; > > > > - /* > > - * Assign a new internal Id if not specified and we succeed in > > - * claiming a new Id. 
> > -	 */
> > -	if (id != 0)
> > -		vm->vm_vmid = id;
> > -	else if (vm_claimid(vcp->vcp_name, uid, &nid) == -1)
> > -		goto fail;
> > -	else
> > -		vm->vm_vmid = nid;
> > +		/*
> > +		 * Store interface in global configuration table
> > +		 */
> > +		memset(&vifc, 0, sizeof(vifc));
> > +
> > +		/* Get and check pre-configured interface name */
> > +		s = vmc->vmc_ifnames[i];
> > +		if (*s != '\0' && strcmp("tap", s) != 0 &&
> > +		    priv_getiftype(s, ifname, &vifc.vifc_unit) != -1)
> > +			vifc.vifc_flags |= VMD_IFC_UNIT;
> > +
> > +		maxprefixlen = 0;
> > +		if (vmc->vmc_ifflags[i] & VMIFF_ADDR4) {
> > +			vma = &vmc->vmc_ifaddr4[i];
> > +			memcpy(&vifc.vifc_addr4, &vma->vma_addr,
> > +			    sizeof(vifc.vifc_addr4));
> > +			memcpy(&vifc.vifc_gw4, &vma->vma_gw,
> > +			    sizeof(vifc.vifc_gw4));
> > +			vifc.vifc_prefixlen4 = vma->vma_prefixlen;
> > +			vifc.vifc_flags |= VMD_IFC_ADDR4;
> > +			maxprefixlen = 31;
> > +		}
> > +		if (vmc->vmc_ifflags[i] & VMIFF_ADDR6) {
> > +			vma = &vmc->vmc_ifaddr6[i];
> > +			memcpy(&vifc.vifc_addr6, &vma->vma_addr,
> > +			    sizeof(vifc.vifc_addr6));
> > +			memcpy(&vifc.vifc_gw6, &vma->vma_gw,
> > +			    sizeof(vifc.vifc_gw6));
> > +			vifc.vifc_prefixlen6 = vma->vma_prefixlen;
> > +			vifc.vifc_flags |= VMD_IFC_ADDR6;
> > +			maxprefixlen = 127;
> > +		}
> > +		if (maxprefixlen && vma->vma_prefixlen > maxprefixlen) {
> > +			log_warnx("address prefix larger than /%d",
> > +			    maxprefixlen);
> > +			goto fail;
> > +		}
> > +
> > +		vifc.vifc_vmid = vm->vm_vmid;
> > +		vifc.vifc_idx = i;
> > +
> > +		if (vm_priv_register(ps, &vifc) == -1)
> > +			goto fail;
> > +
> > +		if (privsep_process == PROC_PARENT) {
> > +			proc_compose_imsg(ps, PROC_PRIV, -1,
> > +			    IMSG_VMDOP_IF_REGISTER, -1, -1,
> > +			    &vifc, sizeof(vifc));
> > +		}
> > +	}
> >
> >  	log_debug("%s: registering vm %d", __func__, vm->vm_vmid);
> >  	TAILQ_INSERT_TAIL(env->vmd_vms, vm, vm_entry);
> >
> >  	*ret_vm = vm;
> >  	return (0);
> > +
> >   fail:
> > +	free(vm_new);
> >  	if (errno == 0)
> >  		errno = EINVAL;
> > +
> >  	return (-1);
> >  }
> >
> > @@
-1956,6 +2024,71 @@ get_string(uint8_t *ptr, size_t len) > > break; > > > > return strndup(ptr, i); > > +} > > + > > +uint8_t > > +mask2prefixlen(struct sockaddr *sa) > > +{ > > + struct sockaddr_in *sa_in = (struct sockaddr_in *)sa; > > + in_addr_t ina = sa_in->sin_addr.s_addr; > > + > > + if (ina == 0) > > + return (0); > > + else > > + return (33 - ffs(ntohl(ina))); > > +} > > + > > +uint8_t > > +mask2prefixlen6(struct sockaddr *sa) > > +{ > > + struct sockaddr_in6 *sa_in6 = (struct sockaddr_in6 *)sa; > > + uint8_t *ap, *ep; > > + unsigned int l = 0; > > + > > + /* > > + * sin6_len is the size of the sockaddr so substract the offset of > > + * the possibly truncated sin6_addr struct. > > + */ > > + ap = (uint8_t *)&sa_in6->sin6_addr; > > + ep = (uint8_t *)sa_in6 + sa_in6->sin6_len; > > + for (; ap < ep; ap++) { > > + /* this "beauty" is adopted from sbin/route/show.c ... */ > > + switch (*ap) { > > + case 0xff: > > + l += 8; > > + break; > > + case 0xfe: > > + l += 7; > > + goto done; > > + case 0xfc: > > + l += 6; > > + goto done; > > + case 0xf8: > > + l += 5; > > + goto done; > > + case 0xf0: > > + l += 4; > > + goto done; > > + case 0xe0: > > + l += 3; > > + goto done; > > + case 0xc0: > > + l += 2; > > + goto done; > > + case 0x80: > > + l += 1; > > + goto done; > > + case 0x00: > > + goto done; > > + default: > > + fatalx("non contiguous inet6 netmask"); > > + } > > + } > > + > > +done: > > + if (l > sizeof(struct in6_addr) * 8) > > + fatalx("%s: prefixlen %d out of bound", __func__, l); > > + return (l); > > } > > > > uint32_t > > Index: usr.sbin/vmd/vmd.h > > =================================================================== > > RCS file: /cvs/src/usr.sbin/vmd/vmd.h,v > > retrieving revision 1.97 > > diff -u -p -u -p -r1.97 vmd.h > > --- usr.sbin/vmd/vmd.h 7 Sep 2019 09:11:14 -0000 1.97 > > +++ usr.sbin/vmd/vmd.h 25 Oct 2019 18:11:06 -0000 > > @@ -119,6 +119,8 @@ enum imsg_type { > > IMSG_VMDOP_PRIV_IFRDOMAIN, > > IMSG_VMDOP_VM_SHUTDOWN, > > 
IMSG_VMDOP_VM_REBOOT, > > + IMSG_VMDOP_IF_REGISTER, > > + IMSG_VMDOP_IF_UNREGISTER, > > IMSG_VMDOP_CONFIG, > > IMSG_VMDOP_DONE > > }; > > @@ -160,6 +162,12 @@ struct vmop_owner { > > int64_t gid; > > }; > > > > +struct vmop_address { > > + struct sockaddr_storage vma_addr; > > + struct sockaddr_storage vma_gw; > > + int vma_prefixlen; > > +}; > > + > > struct vmop_create_params { > > struct vm_create_params vmc_params; > > unsigned int vmc_flags; > > @@ -185,7 +193,10 @@ struct vmop_create_params { > > #define VMIFF_LOCKED 0x02 > > #define VMIFF_LOCAL 0x04 > > #define VMIFF_RDOMAIN 0x08 > > -#define VMIFF_OPTMASK (VMIFF_LOCKED|VMIFF_LOCAL|VMIFF_RDOMAIN) > > +#define VMIFF_ADDR4 0x10 > > +#define VMIFF_ADDR6 0x20 > > +#define VMIFF_OPTMASK \ > > + (VMIFF_LOCKED|VMIFF_LOCAL|VMIFF_RDOMAIN|VMIFF_ADDR4|VMIFF_ADDR6) > > > > unsigned int vmc_disktypes[VMM_MAX_DISKS_PER_VM]; > > unsigned int vmc_diskbases[VMM_MAX_DISKS_PER_VM]; > > @@ -196,6 +207,8 @@ struct vmop_create_params { > > char vmc_ifswitch[VMM_MAX_NICS_PER_VM][VM_NAME_MAX]; > > char vmc_ifgroup[VMM_MAX_NICS_PER_VM][IF_NAMESIZE]; > > unsigned int vmc_ifrdomain[VMM_MAX_NICS_PER_VM]; > > + struct vmop_address vmc_ifaddr4[VMM_MAX_NICS_PER_VM]; > > + struct vmop_address vmc_ifaddr6[VMM_MAX_NICS_PER_VM]; > > struct vmop_owner vmc_owner; > > > > /* instance template params */ > > @@ -315,6 +328,26 @@ struct address { > > }; > > TAILQ_HEAD(addresslist, address); > > > > +struct vmd_ifconfig { > > + uint32_t vifc_vmid; /* associated VM id */ > > + unsigned int vifc_idx; /* relative interface index */ > > + > > + unsigned int vifc_flags; > > +#define VMD_IFC_UNIT 0x01 /* has interface tap(4) > > unit */ > > +#define VMD_IFC_ADDR4 0x02 /* has IPv4 address */ > > +#define VMD_IFC_ADDR6 0x04 /* has IPv6 address */ > > + > > + unsigned int vifc_unit; > > + > > + struct sockaddr_in vifc_addr4; > > + struct sockaddr_in vifc_gw4; > > + int vifc_prefixlen4; > > + > > + struct sockaddr_in6 vifc_addr6; > > + struct sockaddr_in6 
vifc_gw6; > > + int vifc_prefixlen6; > > +}; > > + > > struct vmd_config { > > unsigned int cfg_flags; > > #define VMD_CFG_INET6 0x01 > > @@ -391,6 +424,7 @@ void vm_stop(struct vmd_vm *, int, cons > > void vm_remove(struct vmd_vm *, const char *); > > int vm_register(struct privsep *, struct vmop_create_params *, > > struct vmd_vm **, uint32_t, uid_t); > > +void vm_priv_unregister(struct privsep *, uint32_t, int); > > int vm_checkperm(struct vmd_vm *, struct vmop_owner *, uid_t); > > int vm_checkaccess(int, unsigned int, uid_t, int); > > int vm_opentty(struct vmd_vm *); > > @@ -402,6 +436,8 @@ void user_put(struct vmd_user *); > > void user_inc(struct vm_create_params *, struct vmd_user *, int); > > int user_checklimit(struct vmd_user *, struct vm_create_params *); > > char *get_string(uint8_t *, size_t); > > +uint8_t mask2prefixlen(struct sockaddr *); > > +uint8_t mask2prefixlen6(struct sockaddr *); > > uint32_t prefixlen2mask(uint8_t); > > void prefixlen2mask6(u_int8_t, struct in6_addr *); > > void getmonotime(struct timeval *); > > @@ -411,11 +447,15 @@ void priv(struct privsep *, struct priv > > int priv_getiftype(char *, char *, unsigned int *); > > int priv_findname(const char *, const char **); > > int priv_validgroup(const char *); > > +int vm_priv_register(struct privsep *, struct vmd_ifconfig *); > > int vm_priv_ifconfig(struct privsep *, struct vmd_vm *); > > int vm_priv_brconfig(struct privsep *, struct vmd_switch *); > > -uint32_t vm_priv_addr(struct vmd_config *, uint32_t, int, int); > > +uint32_t vm_priv_addr(struct vmd_config *, uint32_t, int, int, > > + struct in_addr *); > > int vm_priv_addr6(struct vmd_config *, uint32_t, int, int, > > - struct in6_addr *); > > + struct in6_addr *, struct in6_addr *); > > +unsigned int *vm_priv_byunit(unsigned int); > > +struct vmd_ifconfig *vm_priv_byid(uint32_t, int); > > > > /* vmm.c */ > > struct iovec; > > Index: usr.sbin/vmd/vmm.c > > =================================================================== > > 
RCS file: /cvs/src/usr.sbin/vmd/vmm.c,v > > retrieving revision 1.94 > > diff -u -p -u -p -r1.94 vmm.c > > --- usr.sbin/vmd/vmm.c 25 Oct 2019 09:57:33 -0000 1.94 > > +++ usr.sbin/vmd/vmm.c 25 Oct 2019 18:11:06 -0000 > > @@ -602,6 +602,9 @@ opentap(char *ifname) > > char path[PATH_MAX]; > > > > for (i = 0; i < MAX_TAP; i++) { > > + /* Skip statically configured interface names (eg. tap0) */ > > + if (vm_priv_byunit(i) != NULL) > > + continue; > > snprintf(path, PATH_MAX, "/dev/tap%d", i); > > fd = open(path, O_RDWR | O_NONBLOCK); > > if (fd != -1) { > >