Re: Infinite loop in uvm protection mapping
On Mon, Oct 19, 2020 at 3:13 PM Tom Rollet wrote: > Hi, > > I'm starting to help in the development of the dt device. > > I'm stuck on permission handling of memory. I'm trying to allocate a > page in kernel with read/write protections, fill the allocated page > with data then change the permissions to read/exec. > > Snippet of my code: > > addr = uvm_km_alloc(kernel_map, PAGE_SIZE); > > [...] (memcpy data in allocated page) > > uvm_map_protect(kernel_map, addr, addr + PAGE_SIZE, PROT_READ > | PROT_EXEC, FALSE))) > This is same usage as seen in the 'sti' driver...which is on hppa only, so while it's presumably the correct usage of uvm_km_alloc() and uvm_map_protect(), I don't think uvm_map_protect() has been used on kernel-space on amd64 (or possibly all non-hppa archs) before in OpenBSD. Whee? It triggers the following error at boot time when executing > the uvm_map_protect function. > > uvm_fault(0x81fb2c90, 0x7ffec0008000, 0, 2) -> e kernel: page fault > trap, code=0 Stopped atpmap_write_protect+0x1f5: lock andq > $-0x3,0(%rdi) > > Trace: > > pmap_write_protect(82187b28,80002255b000,80002255c000, > 5,50e8b70481f4f622,fd81b6567e70) at pmap_write_protect+0x212 > uvm_map_protect(82129ae0,80002255b000,80002255c000 > ,5,0,82129ae0) at uvm_map_protect+0x501 > dt_alloc_kprobe(815560e0,80173900,e7ef01a2855152cc, > 82395c98,0,815560e0) at dt_alloc_kprobe+0x1ff > dt_prov_kprobe_init(2333e28db00d3edd,0,82121150,0,0, > 824d9008) at dt_prov_kprobe_init+0x1d9 > dtattach(1,821fb384,f,1,c2ee1c3f472154e,2dda28) at dtattach+0x5d > main(0,0,0,0,0,1) at main+0x419 > > The problem comes from the loop in pmap_write_protect > (sys/arch/amd64/amd64/pmap.c:2108) that is executed > infinity in my case. > > Entry of function pmap_write_protect: > sva: 80002250A000 > eva: 80002250B000 > > After &= PG_FRAME (line 2098-2099) > sva= F80002250A000 > eva= F80002250B000 > > loop: (ligne 2108) > > first iteration: > va = F80002250A000 > eva = F80002250B000 > blockend = 080012240 > ... > Does anyone have an idea how to fix this issue? So, blockend is clearly wrong for va and eva. I suspect the use of L2_FRAME here: blockend = (va & L2_FRAME) + NBPD_L2; is wrong here and it should be blockend = (va & VA_SIGN_NEG(L2_FRAME)) + NBPD_L2; or some equivalent expression to keep all the bits above the frame. Philip Guenther
Re: [PATCH] umb(4) fix for X20 (DW5821e) in Dell Latitude 7300
On Fri, Oct 02, 2020 at 12:33:15PM -0700, Bryan Vyhmeister wrote: > On Wed, Sep 30, 2020 at 04:05:51PM -0700, Bryan Vyhmeister wrote: > > Gerhard Roth provided a patch to me to get the Qualcomm Snapdragon X20 > > umb(4) interface in my Dell Latitude 7300 working. It is also known as a > > Dell DW5821e interface. I have been using this patch for months now with > > AT&T Wireless in the US and it works great just as I have been used to > > on my ThinkPad X270 with the older umb(4) interface. This is how it > > shows up now in my dmesg: > > > > umb0 at uhub0 port 16 "Dell Inc. DW5821e Snapdragon X20 LTE" rev > > 3.10/3.18 addr 2 > > > > I would love to get this committed at some point so I no longer have to > > keep compiling a new kernel for every snapshot to have umb(4) working. > > Thanks to jmc@ for suggesting a man page diff as well. Both the diff for > if_umb.c and umb.4 are below. Ping. Bryan Index: share/man/man4/umb.4 === RCS file: /cvs/src/share/man/man4/umb.4,v retrieving revision 1.11 diff -u -p -r1.11 umb.4 --- share/man/man4/umb.412 May 2020 13:03:52 - 1.11 +++ share/man/man4/umb.42 Oct 2020 19:30:53 - @@ -44,6 +44,7 @@ PIN again even if the system was reboote The following devices should work: .Pp .Bl -tag -width Ds -offset indent -compact +.It Dell DW5821e .It Ericsson H5321gw and N5321gw .It Fibocom L831-EAU .It Medion Mobile S4222 (MediaTek OEM) Index: sys/dev/usb/if_umb.c === RCS file: /cvs/src/sys/dev/usb/if_umb.c,v retrieving revision 1.36 diff -u -p -r1.36 if_umb.c --- sys/dev/usb/if_umb.c22 Jul 2020 02:16:02 - 1.36 +++ sys/dev/usb/if_umb.c2 Oct 2020 19:30:53 - @@ -1,4 +1,4 @@ -/* $OpenBSD: if_umb.c,v 1.36 2020/07/22 02:16:02 dlg Exp $ */ +/* $OpenBSD: if_umb.c,v 1.34 2020/05/04 14:41:03 gerhard Exp $ */ /* * Copyright (c) 2016 genua mbH @@ -225,6 +225,26 @@ const struct cfattach umb_ca = { int umb_delay = 4000; /* + * Normally, MBIM devices are detected by their interface class and subclass. + * But for some models that have multiple configurations, it is better to + * match by vendor and product id so that we can select the desired + * configuration ourselves. + * + * OTOH, some devices identifiy themself als an MBIM device but fail to speak + * the MBIM protocol. + */ +struct umb_products { + struct usb_devno dev; + int confno; +}; +const struct umb_products umb_devs[] = { + { { USB_VENDOR_DELL, 0x81d7 }, 2 }, /* XXX FIXME */ +}; + +#define umb_lookup(vid, pid) \ + ((const struct umb_products *)usb_lookup(umb_devs, vid, pid)) + +/* * These devices require an "FCC Authentication" command. */ const struct usb_devno umb_fccauth_devs[] = { @@ -263,6 +283,8 @@ umb_match(struct device *parent, void *m struct usb_attach_arg *uaa = aux; usb_interface_descriptor_t *id; + if (umb_lookup(uaa->vendor, uaa->product) != NULL) + return UMATCH_VENDOR_PRODUCT; if (!uaa->iface) return UMATCH_NONE; if ((id = usbd_get_interface_descriptor(uaa->iface)) == NULL) @@ -315,6 +337,43 @@ umb_attach(struct device *parent, struct sc->sc_ctrl_ifaceno = uaa->ifaceno; ml_init(&sc->sc_tx_ml); + if (uaa->configno < 0) { + /* +* In case the device was matched by VID/PID instead of +* InterfaceClass/InterfaceSubClass, we have to pick the +* correct configuration ourself. +*/ + uaa->configno = umb_lookup(uaa->vendor, uaa->product)->confno; + DPRINTF("%s: switching to config #%d\n", DEVNAM(sc), + uaa->configno); + status = usbd_set_config_no(sc->sc_udev, uaa->configno, 1); + if (status) { + printf("%s: failed to switch to config #%d: %s\n", + DEVNAM(sc), uaa->configno, usbd_errstr(status)); + goto fail; + } + usbd_delay_ms(sc->sc_udev, 200); + + /* +* Need to do some manual setups that usbd_probe_and_attach() +* would do for us otherwise. +*/ + uaa->nifaces = uaa->device->cdesc->bNumInterface; + for (i = 0; i < uaa->nifaces; i++) { + if (usbd_iface_claimed(sc->sc_udev, i)) + continue; + id = usbd_get_interface_descriptor(&uaa->device->ifaces[i]); + if (id != NULL && id->bInterfaceClass == UICLASS_CDC && + id->bInterfaceSubClass == + UISUBCLASS_MOBILE_BROADBAND_INTERFACE_MODEL) { + uaa->iface = &uaa->device->ifaces[i]; +
Re: net.inet.ip.forwarding=0 vs lo(4)
On Mon, Oct 19, 2020 at 07:16:46PM +1000, David Gwynne wrote: > On Mon, Oct 19, 2020 at 10:03:29AM +0100, Stuart Henderson wrote: > > On 2020/10/19 11:47, David Gwynne wrote: > > > On Sun, Oct 18, 2020 at 08:57:34PM +0100, Stuart Henderson wrote: > > > > On 2020/10/18 14:04, David Gwynne wrote: > > > > > the problem i'm hitting is that i have a multihomed box where the > > > > > service it provides listens on an IP address that's assigned to lo1. > > > > > it's a host running a service, it's not a router, so the > > > > > net.inet.ip.forwarding sysctl is not set to 1. > > > > > > > > I ran into this, I just turned on the forwarding sysctl to avoid the > > > > problem. > > > > > > > > > i came up with this diff, which adds even more special casing for > > > > > loopback interfaces. it says addreesses on loopbacks are globally > > > > > reachable, even if ip forwarding is disabled. > > > > > > > > I don't see why loopbacks should be special. Another place this > > > > might show up is services running on carp addresses (I haven't updated > > > > those machines yet but there's a fair chance they'll be affected too). > > > > I would prefer an explicit sysctl to disable "strong host model". > > > > > > loopback is already special. if a packet comes from an loopback > > > interface, we allow it to talk to any IP on the local machine. i think > > > this is mostly to cope with the semantic we've had where local traffic > > > get's tied to a loopback interface instead of going anywhere near the > > > physical ones. > > > > > > carp is also special. > > > > > > let me paste the ip_laddr function instead of the diff to it, it's a bit > > > more obvious what's going on: > > > > Thanks, that will already work for the machines I was thinking of then. > > > > > back to loopback and receiving packets. loopback is special because it > > > is not connected to the outside world. it is impossible to send a packet > > > via a loopback interface from another host, so configuring a globally > > > (externally) routable IP on it is currently pointless unless you enable > > > forwarding. i think making loopback more special and allowing it > > > to be globally reachable makes sense. i can't think of any downsides to > > > this at the moment, except that the behaviour would be subtle/not > > > obvious > > > > ok, so it makes sense for this to be independent of any possible > > separate lever. > > > > > is there a need to configure a globally reachable IP on a non-loopback > > > interface on a host (not router)? if so, then i'd be more convinced that > > > we need a separate lever to pull. > > > > I'm not using it this way, but here's a scenario. > > > > Say there are a couple of webservers with addresses from a carp on > > ethernet/vlan, with a link to their upstream router on some separate > > interface. They announce the carp prefix into ospf. > > so carp is just being used to elect a webserver as a master, and then > the result of that election is fed upstream. > > > They aren't routing themselves so the only reason to have forwarding=1 > > is to have them use "weak host model". > > > > With forwarding=0 I think they'll have to use "stub router no" otherwise > > everything will be announced high metric (rather than being dependent on > > carp state), but ospfd explicitly handles this; it's marked in parse.y > > with "/* allow to force non stub mode */". > > so is a Big Global Lever what you want here? if you enable weak host > mode, all IPs on the host will be addressible from all legs of the > host. would it make more sense to configure specific interfaces as > holding globally addressible IPs? > > if my understanding of your scenario is right, you could configure > the carp interface with the weak or globally accessible flag. in > my situation i could configure that on lo1. such a diff looks like this. it adds a "global" flag that you can set on interfaces. Index: sbin/ifconfig/ifconfig.c === RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v retrieving revision 1.429 diff -u -p -r1.429 ifconfig.c --- sbin/ifconfig/ifconfig.c7 Oct 2020 14:38:54 - 1.429 +++ sbin/ifconfig/ifconfig.c20 Oct 2020 00:12:06 - @@ -468,6 +468,8 @@ const structcmd { { "-autoconfprivacy", IFXF_INET6_NOPRIVACY, 0, setifxflags }, { "soii", -IFXF_INET6_NOSOII, 0, setifxflags }, { "-soii", IFXF_INET6_NOSOII, 0, setifxflags }, + { "global", IFXF_GLOBAL,0, setifxflags }, + { "-global",-IFXF_GLOBAL, 0, setifxflags }, #ifndef SMALL { "hwfeatures", NEXTARG0, 0, printifhwfeatures }, { "metric", NEXTARG,0, setifmetric }, @@ -675,7 +677,7 @@ const structcmd { "\7RUNNING\10NOARP\11PROMISC\12ALLMULTI\13OACTIVE\14SIMPLEX"\ "\15LINK0\16LINK1\17LINK2\20MULTICAST"
Infinite loop in uvm protection mapping
Hi, I'm starting to help in the development of the dt device. I'm stuck on permission handling of memory. I'm trying to allocate a page in kernel with read/write protections, fill the allocated page with data then change the permissions to read/exec. Snippet of my code: addr = uvm_km_alloc(kernel_map, PAGE_SIZE); [...] (memcpy data in allocated page) uvm_map_protect(kernel_map, addr, addr + PAGE_SIZE, PROT_READ | PROT_EXEC, FALSE))) It triggers the following error at boot time when executing the uvm_map_protect function. uvm_fault(0x81fb2c90, 0x7ffec0008000, 0, 2) -> e kernel: page fault trap, code=0 Stopped at pmap_write_protect+0x1f5: lock andq $-0x3,0(%rdi) Trace: pmap_write_protect(82187b28,80002255b000,80002255c000, 5,50e8b70481f4f622,fd81b6567e70) at pmap_write_protect+0x212 uvm_map_protect(82129ae0,80002255b000,80002255c000 ,5,0,82129ae0) at uvm_map_protect+0x501 dt_alloc_kprobe(815560e0,80173900,e7ef01a2855152cc, 82395c98,0,815560e0) at dt_alloc_kprobe+0x1ff dt_prov_kprobe_init(2333e28db00d3edd,0,82121150,0,0, 824d9008) at dt_prov_kprobe_init+0x1d9 dtattach(1,821fb384,f,1,c2ee1c3f472154e,2dda28) at dtattach+0x5d main(0,0,0,0,0,1) at main+0x419 The problem comes from the loop in pmap_write_protect (sys/arch/amd64/amd64/pmap.c:2108) that is executed infinity in my case. Entry of function pmap_write_protect: sva: 80002250A000 eva: 80002250B000 After &= PG_FRAME (line 2098-2099) sva= F80002250A000 eva= F80002250B000 loop: (ligne 2108) first iteration: va = F80002250A000 eva = F80002250B000 blockend = 080012240 second iteration: va = 080012240 eva = F80002250B000 blockend = 080022240 We can see that the problem is that the "va" variable has lost her 4 upper set bits (48 to 51). So now the comparison between "va" and "eva" is always false and it loops indefinitely because "va" 4 upper bits are clean each iteration. The fix that I found is to clear the 4 same bits in "eva". It can be done with this mask: new_mask = F000 PG_FRAME = 000FF000 Diff with quick dirty fix: diff --git a/sys/arch/amd64/amd64/pmap.c b/sys/arch/amd64/amd64/pmap.c index 10ab3da2949..2783c9d26a5 100644 --- a/sys/arch/amd64/amd64/pmap.c +++ b/sys/arch/amd64/amd64/pmap.c @@ -2105,10 +2105,12 @@ pmap_write_protect(struct pmap *pmap, vaddr_t sva, vaddr_t eva, vm_prot_t prot) if ((eva - sva > 32 * PAGE_SIZE) && sva < VM_MIN_KERNEL_ADDRESS) shootall = 1; - for (va = sva; va < eva ; va = blockend) { - blockend = (va & L2_FRAME) + NBPD_L2; - if (blockend > eva) - blockend = eva; + vaddr_t tmp_eva = eva & 0xF000UL; + vaddr_t tmp_sva = sva & 0xF000UL; + for (va = tmp_sva; va < tmp_eva ; va = blockend) { + blockend = (va & L2_FRAME) + NBPD_L2; + if (blockend > tmp_eva) + blockend = tmp_eva; /* * XXXCDC: our PTE mappings should never be write-protected! The problem is solved at boot time (I am able to boot without crashing). But during the execution, we I jump on the page (with read and exec permissions), the OS completely freeze without error message, I can't launch the kernel debugger and I don't know how debug it further. Does anyone have an idea how to fix this issue? Thanks -- Tom Rollet
Re: push NET_LOCK() down in pf_ioctl.c
Hello Klemens, > > the change is fairly large, but mostly mechanical. > Relocating malloc(9) and pool(9) seems good but other pf_*() calls are > now locked inconsistently after your diff. > > Given that only reason about "allocations" this is either an oversight > and should be fixed or you need to provide more explanation. > > See inline for the spots I'm talking about. thanks for catching this. there was unfinished clean up in a branch I took diff from. Currently we need to keep pf_rm_rule() under both locks. The function might be calling pf_tag_unref(), pf_dynaddr_remove()... which alter lists, which are currently supposed to be protected by PF_LOCK()/NET_LOCK(). updated diff is below. pf_rm_rule() is being called with both locks held. thanks and regards sashan 8<---8<---8<--8< diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c index ef7d995e5a7..3a75cd05005 100644 --- a/sys/net/pf_ioctl.c +++ b/sys/net/pf_ioctl.c @@ -1006,10 +1006,10 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) return (EACCES); } - NET_LOCK(); switch (cmd) { case DIOCSTART: + NET_LOCK(); PF_LOCK(); if (pf_status.running) error = EEXIST; @@ -1025,9 +1025,11 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) DPFPRINTF(LOG_NOTICE, "pf: started"); } PF_UNLOCK(); + NET_UNLOCK(); break; case DIOCSTOP: + NET_LOCK(); PF_LOCK(); if (!pf_status.running) error = ENOENT; @@ -1038,6 +1040,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) DPFPRINTF(LOG_NOTICE, "pf: stopped"); } PF_UNLOCK(); + NET_UNLOCK(); break; case DIOCGETQUEUES: { @@ -1045,6 +1048,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) struct pf_queuespec *qs; u_int32_tnr = 0; + NET_LOCK(); PF_LOCK(); pq->ticket = pf_main_ruleset.rules.active.ticket; @@ -1056,6 +1060,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) } pq->nr = nr; PF_UNLOCK(); + NET_UNLOCK(); break; } @@ -1064,10 +1069,12 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) struct pf_queuespec *qs; u_int32_tnr = 0; + NET_LOCK(); PF_LOCK(); if (pq->ticket != pf_main_ruleset.rules.active.ticket) { error = EBUSY; PF_UNLOCK(); + NET_UNLOCK(); break; } @@ -1078,10 +1085,12 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) if (qs == NULL) { error = EBUSY; PF_UNLOCK(); + NET_UNLOCK(); break; } memcpy(&pq->queue, qs, sizeof(pq->queue)); PF_UNLOCK(); + NET_UNLOCK(); break; } @@ -1091,10 +1100,12 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) u_int32_tnr; int nbytes; + NET_LOCK(); PF_LOCK(); if (pq->ticket != pf_main_ruleset.rules.active.ticket) { error = EBUSY; PF_UNLOCK(); + NET_UNLOCK(); break; } nbytes = pq->nbytes; @@ -1107,6 +1118,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) if (qs == NULL) { error = EBUSY; PF_UNLOCK(); + NET_UNLOCK(); break; } memcpy(&pq->queue, qs, sizeof(pq->queue)); @@ -1121,6 +1133,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) if (error == 0) pq->nbytes = nbytes; PF_UNLOCK(); + NET_UNLOCK(); break; } @@ -1128,38 +1141,44 @@ pfioctl(dev_t dev, u_long cmd, caddr_t addr, int flags, struct proc *p) struct pfioc_queue *q = (struct pfioc_queue *)addr; struct pf_queuespec *qs; - PF_LOCK(); - if (q->ticket != pf_main_ru
Re: pf route-to issues
On Mon, Oct 19, 2020 at 12:33:25PM +0100, Stuart Henderson wrote: > On 2020/10/19 19:53, David Gwynne wrote: > > On Mon, Oct 19, 2020 at 09:34:31AM +0100, Stuart Henderson wrote: > > > On 2020/10/19 15:35, David Gwynne wrote: > > > > every few years i try and use route-to in pf, and every time it > > > > goes badly. i tried it again last week in a slightly different > > > > setting, and actually tried to understand the sharp edges i hit > > > > this time instead of giving up. it turns out there are 2 or 3 > > > > different things together that have cause me trouble, which is why > > > > the diff below is so big. > > > > > > I used to route-to/reply-to quite a lot at places with poor internet > > > connections to split traffic between lines (mostly those have better > > > connections now so I don't need it as often). It worked as I expected - > > > but I only ever used it with the interface specified. > > > > cool. did it work beyond the first packet in a connection? > > It must have done. The webcams would have utterly broken the rest of > traffic if it hadn't :) > > > > I mostly used it with pppoe interfaces so the peer address was unknown > > > at ruleset load time. (I was lucky and had static IPs my side, but the > > > ISP side was variable). I relied on the fact that once packets are > > > directed at a point-point interface there's only one place for them to > > > go. I didn't notice that ":peer" might be useful here (and the syntax > > > 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't > > > have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc > > > (though actually I think it works with $any_random_address@pppoeX). > > > > yes. i was trying to use it with peers over ethernet, and always > > struggled with the syntax. > > > > > > the first and i would argue most fundamental problem is a semantic > > > > problem. if you ask a random person who has some clue about networks > > > > and routing what they would expect the "argument" to route-to or > > > > reply-to to be, they would say "a nexthop address" or "a gateway > > > > address". eg, say i want to force packets to a specific backend > > > > server without using NAT, i would write a rule like this: > > > > > > > > n_servers="192.0.2.128/27" > > > > pass out on $if_internal to $n_servers route-to 192.168.0.1 > > > > > > > > pfctl will happily parse this, shove it into the kernel, let you read > > > > the rules back out again with pfctl -sr, and it all looks plausible, but > > > > it turns out that it's using the argument to route-to as an interface > > > > name. because rulesets can refer to interfaces that don't exist yet, pf > > > > just passes the IP address around as a string, hoping i'll plug in an > > > > interface with a driver name that looks like an ip address. i spent > > > > literally a day trying to figure out why a rule like this wasn't > > > > working. > > > > > > I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's > > > supposed to work. So either doc or implementation bug there. > > > > im leaning toward implementation bug. > > > > > route = ( "route-to" | "reply-to" | "dup-to" ) > > > ( routehost | "{" routehost-list "}" ) > > > [ pooltype ] > > > > > > routehost-list = routehost [ [ "," ] routehost-list ] > > > > > > routehost = host | host "@" interface-name | > > > "(" interface-name [ address [ "/" mask-bits ] ] ")" > > > > > > > the second problem is that the pf_route calls from pfsync don't > > > > have all the information it is supposed to have. more specifically, > > > > an ifp pointer isn't set which leads to a segfault. the ifp pointer > > > > isn't set because pfsync doesnt track which interface a packet is > > > > going out, it assumes the ip layer will get it right again later, or a > > > > rule provided something usable. > > > > > > > > the third problem is that pf_route relies on information from rules to > > > > work correctly. this is a problem in a pfsync environment because you > > > > cannot have the same ruleset on both firewalls 100% of the time, which > > > > means you cannot have route-to/reply-to behave consistently on a pair of > > > > firwalls 100% of the time. > > > > > > I didn't run into this because pppoe(4) and pfsync/carp don't really > > > go well together, but ouch! > > > > > > > all of this together makes things work pretty obviously and smoothly. > > > > in my opinion anyway. route-to now works more like rdr-to, it just > > > > feels like it changes the address used for the route lookup rather > > > > than changing the actual IP address in the packet. it also works > > > > predictably in a pfsync pair, which is great from the point of view of > > > > high availability. > > > > > > > > the main caveat is that it's not backward compatible. if you're already > > > > using route-to, you will need to tweak your rules to have them par
Re: basename(3) should have non-const arg, says POSIX
On Mon, 19 Oct 2020 22:06:52 +0200, Christian Weisgerber wrote: > The patch below aligns the function prototypes with POSIX. All > resulting warnings "passing 'const char *' to parameter of type > 'char *' discards qualifiers" in the base system have been cleaned > up. It successfully passes "make release". For good measure I'm > also running a package bulk build with it as we speak. OK millert@ - todd
Re: basename(3) should have non-const arg, says POSIX
On Mon, Oct 19, 2020 at 10:06:52PM +0200, Christian Weisgerber wrote: > [Picking this up again after a month:] > > Our basename(3) and dirname(3) take a const argument: > > char*basename(const char *); > char*dirname(const char *); > > POSIX says otherwise... > > char *basename(char *path); > char *dirname(char *path); > > ... and explicitly says the functions may modify the input string. > > > Our functions were const-ified in 1999 by espie@: > > proper const semantics for dirname & basename. > (this follows FreeBSD and Linux. Single Unix 2 is still illogical) > > Well, four years ago, FreeBSD finally switched to the POSIX prototypes > https://svnweb.freebsd.org/base/head/include/libgen.h?revision=303451&view=markup > and in fact now has an implementation that modifies the string. > > Linux (GNU libc) has a bizarro solution where you get a const basename() > and no dirname() if you just include , but POSIX basename() > and dirname() if you instead or also include . > > This is a portability trap. Code written on OpenBSD may not be > prepared for basename() or dirname() to splat a '\0' into the input > string, despite the warning in the man page. This is not hypothetical. > Both Got and OpenCVS have fallen victim to the unportable assumption. > > > The patch below aligns the function prototypes with POSIX. All > resulting warnings "passing 'const char *' to parameter of type > 'char *' discards qualifiers" in the base system have been cleaned > up. It successfully passes "make release". For good measure I'm > also running a package bulk build with it as we speak. > > OK? Ok by me. > Index: include/libgen.h > === > RCS file: /cvs/src/include/libgen.h,v > retrieving revision 1.9 > diff -u -p -r1.9 libgen.h > --- include/libgen.h 25 Jan 2019 00:19:25 - 1.9 > +++ include/libgen.h 11 Sep 2020 20:41:34 - > @@ -22,8 +22,8 @@ > #include > > __BEGIN_DECLS > -char *basename(const char *); > -char *dirname(const char *); > +char *basename(char *); > +char *dirname(char *); > __END_DECLS > > #endif /* _LIBGEN_H_ */ > Index: lib/libc/gen/basename.3 > === > RCS file: /cvs/src/lib/libc/gen/basename.3,v > retrieving revision 1.24 > diff -u -p -r1.24 basename.3 > --- lib/libc/gen/basename.3 25 Jan 2019 00:19:25 - 1.24 > +++ lib/libc/gen/basename.3 11 Sep 2020 20:46:30 - > @@ -23,7 +23,7 @@ > .Sh SYNOPSIS > .In libgen.h > .Ft char * > -.Fn basename "const char *path" > +.Fn basename "char *path" > .Sh DESCRIPTION > The > .Fn basename > Index: lib/libc/gen/basename.c > === > RCS file: /cvs/src/lib/libc/gen/basename.c,v > retrieving revision 1.16 > diff -u -p -r1.16 basename.c > --- lib/libc/gen/basename.c 25 Jan 2019 00:19:25 - 1.16 > +++ lib/libc/gen/basename.c 11 Sep 2020 20:43:13 - > @@ -22,7 +22,7 @@ > #include > > char * > -basename(const char *path) > +basename(char *path) > { > static char bname[PATH_MAX]; > size_t len; > Index: lib/libc/gen/dirname.3 > === > RCS file: /cvs/src/lib/libc/gen/dirname.3,v > retrieving revision 1.23 > diff -u -p -r1.23 dirname.3 > --- lib/libc/gen/dirname.38 Mar 2019 17:33:23 - 1.23 > +++ lib/libc/gen/dirname.311 Sep 2020 20:47:08 - > @@ -23,7 +23,7 @@ > .Sh SYNOPSIS > .In libgen.h > .Ft char * > -.Fn dirname "const char *path" > +.Fn dirname "char *path" > .Sh DESCRIPTION > The > .Fn dirname > Index: lib/libc/gen/dirname.c > === > RCS file: /cvs/src/lib/libc/gen/dirname.c,v > retrieving revision 1.16 > diff -u -p -r1.16 dirname.c > --- lib/libc/gen/dirname.c25 Jan 2019 00:19:25 - 1.16 > +++ lib/libc/gen/dirname.c11 Sep 2020 20:43:34 - > @@ -24,7 +24,7 @@ > /* A slightly modified copy of this file exists in libexec/ld.so */ > > char * > -dirname(const char *path) > +dirname(char *path) > { > static char dname[PATH_MAX]; > size_t len; > -- > Christian "naddy" Weisgerber na...@mips.inka.de > >
basename(3) should have non-const arg, says POSIX
[Picking this up again after a month:] Our basename(3) and dirname(3) take a const argument: char*basename(const char *); char*dirname(const char *); POSIX says otherwise... char *basename(char *path); char *dirname(char *path); ... and explicitly says the functions may modify the input string. Our functions were const-ified in 1999 by espie@: proper const semantics for dirname & basename. (this follows FreeBSD and Linux. Single Unix 2 is still illogical) Well, four years ago, FreeBSD finally switched to the POSIX prototypes https://svnweb.freebsd.org/base/head/include/libgen.h?revision=303451&view=markup and in fact now has an implementation that modifies the string. Linux (GNU libc) has a bizarro solution where you get a const basename() and no dirname() if you just include , but POSIX basename() and dirname() if you instead or also include . This is a portability trap. Code written on OpenBSD may not be prepared for basename() or dirname() to splat a '\0' into the input string, despite the warning in the man page. This is not hypothetical. Both Got and OpenCVS have fallen victim to the unportable assumption. The patch below aligns the function prototypes with POSIX. All resulting warnings "passing 'const char *' to parameter of type 'char *' discards qualifiers" in the base system have been cleaned up. It successfully passes "make release". For good measure I'm also running a package bulk build with it as we speak. OK? Index: include/libgen.h === RCS file: /cvs/src/include/libgen.h,v retrieving revision 1.9 diff -u -p -r1.9 libgen.h --- include/libgen.h25 Jan 2019 00:19:25 - 1.9 +++ include/libgen.h11 Sep 2020 20:41:34 - @@ -22,8 +22,8 @@ #include __BEGIN_DECLS -char *basename(const char *); -char *dirname(const char *); +char *basename(char *); +char *dirname(char *); __END_DECLS #endif /* _LIBGEN_H_ */ Index: lib/libc/gen/basename.3 === RCS file: /cvs/src/lib/libc/gen/basename.3,v retrieving revision 1.24 diff -u -p -r1.24 basename.3 --- lib/libc/gen/basename.3 25 Jan 2019 00:19:25 - 1.24 +++ lib/libc/gen/basename.3 11 Sep 2020 20:46:30 - @@ -23,7 +23,7 @@ .Sh SYNOPSIS .In libgen.h .Ft char * -.Fn basename "const char *path" +.Fn basename "char *path" .Sh DESCRIPTION The .Fn basename Index: lib/libc/gen/basename.c === RCS file: /cvs/src/lib/libc/gen/basename.c,v retrieving revision 1.16 diff -u -p -r1.16 basename.c --- lib/libc/gen/basename.c 25 Jan 2019 00:19:25 - 1.16 +++ lib/libc/gen/basename.c 11 Sep 2020 20:43:13 - @@ -22,7 +22,7 @@ #include char * -basename(const char *path) +basename(char *path) { static char bname[PATH_MAX]; size_t len; Index: lib/libc/gen/dirname.3 === RCS file: /cvs/src/lib/libc/gen/dirname.3,v retrieving revision 1.23 diff -u -p -r1.23 dirname.3 --- lib/libc/gen/dirname.3 8 Mar 2019 17:33:23 - 1.23 +++ lib/libc/gen/dirname.3 11 Sep 2020 20:47:08 - @@ -23,7 +23,7 @@ .Sh SYNOPSIS .In libgen.h .Ft char * -.Fn dirname "const char *path" +.Fn dirname "char *path" .Sh DESCRIPTION The .Fn dirname Index: lib/libc/gen/dirname.c === RCS file: /cvs/src/lib/libc/gen/dirname.c,v retrieving revision 1.16 diff -u -p -r1.16 dirname.c --- lib/libc/gen/dirname.c 25 Jan 2019 00:19:25 - 1.16 +++ lib/libc/gen/dirname.c 11 Sep 2020 20:43:34 - @@ -24,7 +24,7 @@ /* A slightly modified copy of this file exists in libexec/ld.so */ char * -dirname(const char *path) +dirname(char *path) { static char dname[PATH_MAX]; size_t len; -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: Non-const basename: usr.bin/cvs
On Sat, 17 Oct 2020 00:15:54 +0200, Christian Weisgerber wrote: > Accommodate POSIX basename(3) that takes a non-const parameter and > may modify the string buffer. > > There were only two compiler warnings about discarded const, but > there are numerous instances where the code assumes non-POSIX > semantics for basename() and dirname(). Given that there is at > least a FreeBSD port of OpenCVS, cleaning this up is more than > cosmetic. OK millert@ - todd
Re: arm64 ddb: decode "udf" instruction
> Date: Mon, 19 Oct 2020 20:03:01 +0200 > From: Christian Weisgerber > > This decodes the UDF ("permanently undefined") instruction in ddb's > arm64 disassembler. The particular immediate16 format appears to > be unique to this instruction. > > OK? Or don't bother? ok kettenis@ > Index: arch/arm64/arm64/disasm.c > === > RCS file: /cvs/src/sys/arch/arm64/arm64/disasm.c,v > retrieving revision 1.2 > diff -u -p -r1.2 disasm.c > --- arch/arm64/arm64/disasm.c 11 Sep 2020 09:27:10 - 1.2 > +++ arch/arm64/arm64/disasm.c 19 Oct 2020 16:17:55 - > @@ -3107,6 +3107,11 @@ OP4FUNC(op_tbz, b5, b40, imm14, Rt) > PRINTF("\n"); > } > > +OP1FUNC(op_udf, imm16) > +{ > + PRINTF("udf\t#0x%"PRIx64"\n", imm16); > +} > + > OP4FUNC(op_udiv, sf, Rm, Rn, Rd) > { > PRINTF("udiv\t%s, %s, %s\n", > @@ -3668,6 +3673,8 @@ struct insn_info { > {{ 5,16}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}} > #define FMT_IMM16_LL \ > {{ 5,16}, { 0, 2}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}} > +#define FMT_IMM16_UDF\ > + {{ 0,16}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}} > #define FMT_OP0_OP1_CRN_CRM_OP2_RT \ > {{19, 2}, {16, 3}, {12, 4}, { 8, 4}, { 5, 3}, { 0, 5}, { 0, 0}, { 0, 0}} > #define FMT_IMM7_RT2_RN_RT \ > @@ -3786,6 +3793,7 @@ static const struct insn_info insn_table > { 0xd800, 0xdac1, FMT_Z_M_RN_RD, op_pacia }, > { 0xcc00, 0x4e284800, FMT_M_D_RN_RD, op_simd_aes }, > { 0x8c00, 0x5e280800, FMT_OP3_RN_RD, op_simd_sha_reg2 }, > + { 0x, 0x, FMT_IMM16_UDF, op_udf }, > { 0xfff8f01f, 0xd500401f, FMT_OP1_CRM_OP2, op_msr_imm }, > { 0xfff8, 0xd508, FMT_OP1_CRN_CRM_OP2_RT, op_sys }, > { 0xfff8, 0xd528, FMT_OP1_CRN_CRM_OP2_RT, op_sysl }, > -- > Christian "naddy" Weisgerber na...@mips.inka.de > >
arm64 ddb: decode "udf" instruction
This decodes the UDF ("permanently undefined") instruction in ddb's arm64 disassembler. The particular immediate16 format appears to be unique to this instruction. OK? Or don't bother? Index: arch/arm64/arm64/disasm.c === RCS file: /cvs/src/sys/arch/arm64/arm64/disasm.c,v retrieving revision 1.2 diff -u -p -r1.2 disasm.c --- arch/arm64/arm64/disasm.c 11 Sep 2020 09:27:10 - 1.2 +++ arch/arm64/arm64/disasm.c 19 Oct 2020 16:17:55 - @@ -3107,6 +3107,11 @@ OP4FUNC(op_tbz, b5, b40, imm14, Rt) PRINTF("\n"); } +OP1FUNC(op_udf, imm16) +{ + PRINTF("udf\t#0x%"PRIx64"\n", imm16); +} + OP4FUNC(op_udiv, sf, Rm, Rn, Rd) { PRINTF("udiv\t%s, %s, %s\n", @@ -3668,6 +3673,8 @@ struct insn_info { {{ 5,16}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}} #define FMT_IMM16_LL \ {{ 5,16}, { 0, 2}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}} +#define FMT_IMM16_UDF \ + {{ 0,16}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}, { 0, 0}} #define FMT_OP0_OP1_CRN_CRM_OP2_RT \ {{19, 2}, {16, 3}, {12, 4}, { 8, 4}, { 5, 3}, { 0, 5}, { 0, 0}, { 0, 0}} #define FMT_IMM7_RT2_RN_RT \ @@ -3786,6 +3793,7 @@ static const struct insn_info insn_table { 0xd800, 0xdac1, FMT_Z_M_RN_RD, op_pacia }, { 0xcc00, 0x4e284800, FMT_M_D_RN_RD, op_simd_aes }, { 0x8c00, 0x5e280800, FMT_OP3_RN_RD, op_simd_sha_reg2 }, + { 0x, 0x, FMT_IMM16_UDF, op_udf }, { 0xfff8f01f, 0xd500401f, FMT_OP1_CRM_OP2, op_msr_imm }, { 0xfff8, 0xd508, FMT_OP1_CRN_CRM_OP2_RT, op_sys }, { 0xfff8, 0xd528, FMT_OP1_CRN_CRM_OP2_RT, op_sysl }, -- Christian "naddy" Weisgerber na...@mips.inka.de
utpms(4): less jumpy mouse movement
Hi, i noticed that the mouse movement on my powermac can be pretty jittery at times. One of the reasons I have identified is our use of a position change threshold. The driver ignores all finger position changes below a certain threshold. If the finger position change is > threshold it is used to calculate a new mouse pointer position from the value of the change. I think it would be better to use the difference of the finger position delta minus threshold for those calculations, otherwise the minimal mouse position change is also > threshold which seems wrong. feedback, ok? Index: utpms.c === RCS file: /cvs/src/sys/dev/usb/utpms.c,v retrieving revision 1.10 diff -u -p -r1.10 utpms.c --- utpms.c 11 Sep 2020 19:18:01 - 1.10 +++ utpms.c 19 Oct 2020 16:55:53 - @@ -638,8 +638,8 @@ detect_pos(int *sensors, int n_sensors, if (sensors[i] >= threshold) { if (i == 0 || sensors[i - 1] < threshold) *fingers_ret += 1; - s += sensors[i]; - w += sensors[i] * i; + s += sensors[i] - threshold; + w += (sensors[i] - threshold) * i; } }
Re: arm64, armv7: proper illegal instruction
> From: "Theo de Raadt" > Date: Mon, 19 Oct 2020 09:25:30 -0600 > > Mark Kettenis wrote: > > > > Date: Mon, 19 Oct 2020 16:36:14 +0200 > > > From: Christian Weisgerber > > > > > > Belatedly, ARM has taken a slice of the reserved opcode space and > > > assigned it as a properly defined illegal instruction, udf #imm16. > > > (Armv8 Architecture Reference Manual, edition F.c, section C6.2.335). > > > Clang already knows about it. > > > > > > We really should use this instead of picking something ad-hoc out > > > of the opcode space. > > > > > > I have verified that this builds on arm64, produces a SIGILL in > > > userland, and drops me into ddb in the kernel. > > > > > > armv7 has an equivalent instruction. kettenis@ confirms it builds > > > and SIGILLs there. > > > > > > OK? > > > > So on armv7 there is an additional consideration. The architecture > > defines tow instruction sets: A32 and T32 (Thumb). A32 instructions > > are 32-bit but T32 instructions can be 16-bit. If an attacker can > > switch the CPU into T32 mode, it will interpret this UDF instruction > > as two different instructions. We may have to consider how "bad" > > these two instructions are and maybe tune that #imm16 accordingly. > > If the attacker has turned on thumb in the kernel I think all causes > are lost ... aren't there thousands of thumb gadgets? Probably. So 0xa000f7f0 happens to be the 32-bit Thumb UDF instruction. If you jump into the middle of that you get an ADR instruction. SoI don't think that one was particularly "safe". So I think naddy's patch is fine.
Re: arm64, armv7: proper illegal instruction
Mark Kettenis wrote: > > Date: Mon, 19 Oct 2020 16:36:14 +0200 > > From: Christian Weisgerber > > > > Belatedly, ARM has taken a slice of the reserved opcode space and > > assigned it as a properly defined illegal instruction, udf #imm16. > > (Armv8 Architecture Reference Manual, edition F.c, section C6.2.335). > > Clang already knows about it. > > > > We really should use this instead of picking something ad-hoc out > > of the opcode space. > > > > I have verified that this builds on arm64, produces a SIGILL in > > userland, and drops me into ddb in the kernel. > > > > armv7 has an equivalent instruction. kettenis@ confirms it builds > > and SIGILLs there. > > > > OK? > > So on armv7 there is an additional consideration. The architecture > defines tow instruction sets: A32 and T32 (Thumb). A32 instructions > are 32-bit but T32 instructions can be 16-bit. If an attacker can > switch the CPU into T32 mode, it will interpret this UDF instruction > as two different instructions. We may have to consider how "bad" > these two instructions are and maybe tune that #imm16 accordingly. If the attacker has turned on thumb in the kernel I think all causes are lost ... aren't there thousands of thumb gadgets?
Re: arm64, armv7: proper illegal instruction
> Date: Mon, 19 Oct 2020 16:36:14 +0200 > From: Christian Weisgerber > > Belatedly, ARM has taken a slice of the reserved opcode space and > assigned it as a properly defined illegal instruction, udf #imm16. > (Armv8 Architecture Reference Manual, edition F.c, section C6.2.335). > Clang already knows about it. > > We really should use this instead of picking something ad-hoc out > of the opcode space. > > I have verified that this builds on arm64, produces a SIGILL in > userland, and drops me into ddb in the kernel. > > armv7 has an equivalent instruction. kettenis@ confirms it builds > and SIGILLs there. > > OK? So on armv7 there is an additional consideration. The architecture defines tow instruction sets: A32 and T32 (Thumb). A32 instructions are 32-bit but T32 instructions can be 16-bit. If an attacker can switch the CPU into T32 mode, it will interpret this UDF instruction as two different instructions. We may have to consider how "bad" these two instructions are and maybe tune that #imm16 accordingly. > Index: lib/csu/aarch64/md_init.h > === > RCS file: /cvs/src/lib/csu/aarch64/md_init.h,v > retrieving revision 1.9 > diff -u -p -r1.9 md_init.h > --- lib/csu/aarch64/md_init.h 15 Oct 2020 16:30:21 - 1.9 > +++ lib/csu/aarch64/md_init.h 19 Oct 2020 11:57:02 - > @@ -115,5 +115,5 @@ > " svc #0 \n" \ > " dsb nsh \n" \ > " isb \n" \ > - " .word 0xa000f7f0 /* illegal */ \n" \ > + " udf #0 \n" \ > ".previous"); > Index: lib/csu/arm/md_init.h > === > RCS file: /cvs/src/lib/csu/arm/md_init.h,v > retrieving revision 1.16 > diff -u -p -r1.16 md_init.h > --- lib/csu/arm/md_init.h 15 Oct 2020 16:30:23 - 1.16 > +++ lib/csu/arm/md_init.h 19 Oct 2020 13:23:00 - > @@ -159,5 +159,5 @@ > " swi #0 \n" \ > " dsb nsh \n" \ > " isb \n" \ > - " .word 0xa000f7f0 /* illegal */ \n" \ > + " udf #0 \n" \ > ".previous"); > Index: lib/libc/arch/aarch64/sys/tfork_thread.S > === > RCS file: /cvs/src/lib/libc/arch/aarch64/sys/tfork_thread.S,v > retrieving revision 1.5 > diff -u -p -r1.5 tfork_thread.S > --- lib/libc/arch/aarch64/sys/tfork_thread.S 18 Oct 2020 14:28:16 - > 1.5 > +++ lib/libc/arch/aarch64/sys/tfork_thread.S 19 Oct 2020 11:59:32 - > @@ -43,6 +43,6 @@ ENTRY(__tfork_thread) > mov x0, x3 > blr x2 > SYSTRAP(__threxit) > - .word 0xa000f7f0 /* illegal on all cpus? */ > + udf #0 > .cfi_endproc > END(__tfork_thread) > Index: lib/libc/arch/arm/sys/tfork_thread.S > === > RCS file: /cvs/src/lib/libc/arch/arm/sys/tfork_thread.S,v > retrieving revision 1.5 > diff -u -p -r1.5 tfork_thread.S > --- lib/libc/arch/arm/sys/tfork_thread.S 18 Oct 2020 14:28:17 - > 1.5 > +++ lib/libc/arch/arm/sys/tfork_thread.S 19 Oct 2020 13:23:35 - > @@ -37,5 +37,5 @@ ENTRY(__tfork_thread) > mov pc, r2 > nop > SYSTRAP(__threxit) > - .word 0xa000f7f0 /* illegal on all cpus? */ > + udf #0 > END(__tfork_thread) > Index: sys/arch/arm/arm/sigcode.S > === > RCS file: /cvs/src/sys/arch/arm/arm/sigcode.S,v > retrieving revision 1.9 > diff -u -p -r1.9 sigcode.S > --- sys/arch/arm/arm/sigcode.S13 Mar 2020 08:46:50 - 1.9 > +++ sys/arch/arm/arm/sigcode.S19 Oct 2020 13:23:55 - > @@ -72,7 +72,7 @@ _C_LABEL(esigcode): > > .globl sigfill > sigfill: > - .word 0xa000f7f0 /* illegal on all cpus? */ > + udf #0 > esigfill: > > .data > Index: sys/arch/arm64/arm64/locore.S > === > RCS file: /cvs/src/sys/arch/arm64/arm64/locore.S,v > retrieving revision 1.31 > diff -u -p -r1.31 locore.S > --- sys/arch/arm64/arm64/locore.S 13 Mar 2020 00:14:38 - 1.31 > +++ sys/arch/arm64/arm64/locore.S 19 Oct 2020 12:02:23 - > @@ -366,7 +366,7 @@ _C_LABEL(esigcode): > > .globl sigfill > sigfill: > - .word 0xa000f7f0 /* FIXME: illegal on all cpus? */ > + udf #0 > esigfill: > > .data > -- > Christian "naddy" Weisgerber na...@mips.inka.de > >
Re: arm64, armv7: proper illegal instruction
Thanks for finding a better instruction! ok deraadt Christian Weisgerber wrote: > Belatedly, ARM has taken a slice of the reserved opcode space and > assigned it as a properly defined illegal instruction, udf #imm16. > (Armv8 Architecture Reference Manual, edition F.c, section C6.2.335). > Clang already knows about it. > > We really should use this instead of picking something ad-hoc out > of the opcode space. > > I have verified that this builds on arm64, produces a SIGILL in > userland, and drops me into ddb in the kernel. > > armv7 has an equivalent instruction. kettenis@ confirms it builds > and SIGILLs there. > > OK? > > Index: lib/csu/aarch64/md_init.h > === > RCS file: /cvs/src/lib/csu/aarch64/md_init.h,v > retrieving revision 1.9 > diff -u -p -r1.9 md_init.h > --- lib/csu/aarch64/md_init.h 15 Oct 2020 16:30:21 - 1.9 > +++ lib/csu/aarch64/md_init.h 19 Oct 2020 11:57:02 - > @@ -115,5 +115,5 @@ > " svc #0 \n" \ > " dsb nsh \n" \ > " isb \n" \ > - " .word 0xa000f7f0 /* illegal */ \n" \ > + " udf #0 \n" \ > ".previous"); > Index: lib/csu/arm/md_init.h > === > RCS file: /cvs/src/lib/csu/arm/md_init.h,v > retrieving revision 1.16 > diff -u -p -r1.16 md_init.h > --- lib/csu/arm/md_init.h 15 Oct 2020 16:30:23 - 1.16 > +++ lib/csu/arm/md_init.h 19 Oct 2020 13:23:00 - > @@ -159,5 +159,5 @@ > " swi #0 \n" \ > " dsb nsh \n" \ > " isb \n" \ > - " .word 0xa000f7f0 /* illegal */ \n" \ > + " udf #0 \n" \ > ".previous"); > Index: lib/libc/arch/aarch64/sys/tfork_thread.S > === > RCS file: /cvs/src/lib/libc/arch/aarch64/sys/tfork_thread.S,v > retrieving revision 1.5 > diff -u -p -r1.5 tfork_thread.S > --- lib/libc/arch/aarch64/sys/tfork_thread.S 18 Oct 2020 14:28:16 - > 1.5 > +++ lib/libc/arch/aarch64/sys/tfork_thread.S 19 Oct 2020 11:59:32 - > @@ -43,6 +43,6 @@ ENTRY(__tfork_thread) > mov x0, x3 > blr x2 > SYSTRAP(__threxit) > - .word 0xa000f7f0 /* illegal on all cpus? */ > + udf #0 > .cfi_endproc > END(__tfork_thread) > Index: lib/libc/arch/arm/sys/tfork_thread.S > === > RCS file: /cvs/src/lib/libc/arch/arm/sys/tfork_thread.S,v > retrieving revision 1.5 > diff -u -p -r1.5 tfork_thread.S > --- lib/libc/arch/arm/sys/tfork_thread.S 18 Oct 2020 14:28:17 - > 1.5 > +++ lib/libc/arch/arm/sys/tfork_thread.S 19 Oct 2020 13:23:35 - > @@ -37,5 +37,5 @@ ENTRY(__tfork_thread) > mov pc, r2 > nop > SYSTRAP(__threxit) > - .word 0xa000f7f0 /* illegal on all cpus? */ > + udf #0 > END(__tfork_thread) > Index: sys/arch/arm/arm/sigcode.S > === > RCS file: /cvs/src/sys/arch/arm/arm/sigcode.S,v > retrieving revision 1.9 > diff -u -p -r1.9 sigcode.S > --- sys/arch/arm/arm/sigcode.S13 Mar 2020 08:46:50 - 1.9 > +++ sys/arch/arm/arm/sigcode.S19 Oct 2020 13:23:55 - > @@ -72,7 +72,7 @@ _C_LABEL(esigcode): > > .globl sigfill > sigfill: > - .word 0xa000f7f0 /* illegal on all cpus? */ > + udf #0 > esigfill: > > .data > Index: sys/arch/arm64/arm64/locore.S > === > RCS file: /cvs/src/sys/arch/arm64/arm64/locore.S,v > retrieving revision 1.31 > diff -u -p -r1.31 locore.S > --- sys/arch/arm64/arm64/locore.S 13 Mar 2020 00:14:38 - 1.31 > +++ sys/arch/arm64/arm64/locore.S 19 Oct 2020 12:02:23 - > @@ -366,7 +366,7 @@ _C_LABEL(esigcode): > > .globl sigfill > sigfill: > - .word 0xa000f7f0 /* FIXME: illegal on all cpus? */ > + udf #0 > esigfill: > > .data > -- > Christian "naddy" Weisgerber na...@mips.inka.de >
arm64, armv7: proper illegal instruction
Belatedly, ARM has taken a slice of the reserved opcode space and assigned it as a properly defined illegal instruction, udf #imm16. (Armv8 Architecture Reference Manual, edition F.c, section C6.2.335). Clang already knows about it. We really should use this instead of picking something ad-hoc out of the opcode space. I have verified that this builds on arm64, produces a SIGILL in userland, and drops me into ddb in the kernel. armv7 has an equivalent instruction. kettenis@ confirms it builds and SIGILLs there. OK? Index: lib/csu/aarch64/md_init.h === RCS file: /cvs/src/lib/csu/aarch64/md_init.h,v retrieving revision 1.9 diff -u -p -r1.9 md_init.h --- lib/csu/aarch64/md_init.h 15 Oct 2020 16:30:21 - 1.9 +++ lib/csu/aarch64/md_init.h 19 Oct 2020 11:57:02 - @@ -115,5 +115,5 @@ " svc #0 \n" \ " dsb nsh \n" \ " isb \n" \ - " .word 0xa000f7f0 /* illegal */ \n" \ + " udf #0 \n" \ ".previous"); Index: lib/csu/arm/md_init.h === RCS file: /cvs/src/lib/csu/arm/md_init.h,v retrieving revision 1.16 diff -u -p -r1.16 md_init.h --- lib/csu/arm/md_init.h 15 Oct 2020 16:30:23 - 1.16 +++ lib/csu/arm/md_init.h 19 Oct 2020 13:23:00 - @@ -159,5 +159,5 @@ " swi #0 \n" \ " dsb nsh \n" \ " isb \n" \ - " .word 0xa000f7f0 /* illegal */ \n" \ + " udf #0 \n" \ ".previous"); Index: lib/libc/arch/aarch64/sys/tfork_thread.S === RCS file: /cvs/src/lib/libc/arch/aarch64/sys/tfork_thread.S,v retrieving revision 1.5 diff -u -p -r1.5 tfork_thread.S --- lib/libc/arch/aarch64/sys/tfork_thread.S18 Oct 2020 14:28:16 - 1.5 +++ lib/libc/arch/aarch64/sys/tfork_thread.S19 Oct 2020 11:59:32 - @@ -43,6 +43,6 @@ ENTRY(__tfork_thread) mov x0, x3 blr x2 SYSTRAP(__threxit) - .word 0xa000f7f0 /* illegal on all cpus? */ + udf #0 .cfi_endproc END(__tfork_thread) Index: lib/libc/arch/arm/sys/tfork_thread.S === RCS file: /cvs/src/lib/libc/arch/arm/sys/tfork_thread.S,v retrieving revision 1.5 diff -u -p -r1.5 tfork_thread.S --- lib/libc/arch/arm/sys/tfork_thread.S18 Oct 2020 14:28:17 - 1.5 +++ lib/libc/arch/arm/sys/tfork_thread.S19 Oct 2020 13:23:35 - @@ -37,5 +37,5 @@ ENTRY(__tfork_thread) mov pc, r2 nop SYSTRAP(__threxit) - .word 0xa000f7f0 /* illegal on all cpus? */ + udf #0 END(__tfork_thread) Index: sys/arch/arm/arm/sigcode.S === RCS file: /cvs/src/sys/arch/arm/arm/sigcode.S,v retrieving revision 1.9 diff -u -p -r1.9 sigcode.S --- sys/arch/arm/arm/sigcode.S 13 Mar 2020 08:46:50 - 1.9 +++ sys/arch/arm/arm/sigcode.S 19 Oct 2020 13:23:55 - @@ -72,7 +72,7 @@ _C_LABEL(esigcode): .globl sigfill sigfill: - .word 0xa000f7f0 /* illegal on all cpus? */ + udf #0 esigfill: .data Index: sys/arch/arm64/arm64/locore.S === RCS file: /cvs/src/sys/arch/arm64/arm64/locore.S,v retrieving revision 1.31 diff -u -p -r1.31 locore.S --- sys/arch/arm64/arm64/locore.S 13 Mar 2020 00:14:38 - 1.31 +++ sys/arch/arm64/arm64/locore.S 19 Oct 2020 12:02:23 - @@ -366,7 +366,7 @@ _C_LABEL(esigcode): .globl sigfill sigfill: - .word 0xa000f7f0 /* FIXME: illegal on all cpus? */ + udf #0 esigfill: .data -- Christian "naddy" Weisgerber na...@mips.inka.de
Re: mixerctl names
On Sat, Oct 17, 2020 at 05:52:58PM +0200, Jan Stary wrote: > Currently, mixerctl.conf(5) says > > Most devices have a number of digital to analogue converters > (DACs), used for sound playback, and each DAC has a corresponding > output mixer. The mixers are labelled “mix” or “sel”. > > That doesn't seem to be the case, at least not universaly > as the wording seems to imply. For example, this is > mixerctl output on a Thinkpad T400: > > inputs.dac-0:1=222,222 > inputs.dac-2:3=222,222 > inputs.beep=0 > record.adc-2:3_source=mic2 > record.adc-2:3=219,219 > record.adc-0:1_source=mic > record.adc-0:1=219,219 > outputs.hp_source=dac-0:1 > outputs.hp_boost=on > inputs.mic=189,189 > outputs.mic_dir=input-vr80 > outputs.spkr_source=dac-2:3 > outputs.spkr_eapd=on > inputs.mic2=189,189 > outputs.hp_sense=unplugged > outputs.mic_sense=unplugged > outputs.master=240,240 > outputs.master.mute=off > outputs.master.slaves= > record.volume=240,240 > record.volume.mute=off > record.volume.slaves= > record.enable=sysctl > > Apparently, it has two DACS (for the speakers and the headphones). > The current wording might confuse the user into thinking he has > no output mixer, but the > > inputs.dac-0:1=222,222 > inputs.dac-2:3=222,222 > > do control the respective volumes, > while no "mix" or "sel" exists. > > Similarly for recording via the two ADCs. > It depends on the hardware. Basically, azalia exposes the schematic of the mixer and, in turn, the driver exposes all "widgets" to the upper layers, with a generated name. This is very similar to uaudio. Certain devices have no "mix_xxx" widgets others have. On one of my machines: $ doas mixerctl | egrep '(inputs|outputs).mix' | wc -l 20 desktop computers have many jacks and tend to have many mixers. The "sel_xxx" widgets are rare but still exist.
Re: pf route-to issues
On 2020/10/19 19:53, David Gwynne wrote: > On Mon, Oct 19, 2020 at 09:34:31AM +0100, Stuart Henderson wrote: > > On 2020/10/19 15:35, David Gwynne wrote: > > > every few years i try and use route-to in pf, and every time it > > > goes badly. i tried it again last week in a slightly different > > > setting, and actually tried to understand the sharp edges i hit > > > this time instead of giving up. it turns out there are 2 or 3 > > > different things together that have cause me trouble, which is why > > > the diff below is so big. > > > > I used to route-to/reply-to quite a lot at places with poor internet > > connections to split traffic between lines (mostly those have better > > connections now so I don't need it as often). It worked as I expected - > > but I only ever used it with the interface specified. > > cool. did it work beyond the first packet in a connection? It must have done. The webcams would have utterly broken the rest of traffic if it hadn't :) > > I mostly used it with pppoe interfaces so the peer address was unknown > > at ruleset load time. (I was lucky and had static IPs my side, but the > > ISP side was variable). I relied on the fact that once packets are > > directed at a point-point interface there's only one place for them to > > go. I didn't notice that ":peer" might be useful here (and the syntax > > 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't > > have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc > > (though actually I think it works with $any_random_address@pppoeX). > > yes. i was trying to use it with peers over ethernet, and always > struggled with the syntax. > > > > the first and i would argue most fundamental problem is a semantic > > > problem. if you ask a random person who has some clue about networks > > > and routing what they would expect the "argument" to route-to or > > > reply-to to be, they would say "a nexthop address" or "a gateway > > > address". eg, say i want to force packets to a specific backend > > > server without using NAT, i would write a rule like this: > > > > > > n_servers="192.0.2.128/27" > > > pass out on $if_internal to $n_servers route-to 192.168.0.1 > > > > > > pfctl will happily parse this, shove it into the kernel, let you read > > > the rules back out again with pfctl -sr, and it all looks plausible, but > > > it turns out that it's using the argument to route-to as an interface > > > name. because rulesets can refer to interfaces that don't exist yet, pf > > > just passes the IP address around as a string, hoping i'll plug in an > > > interface with a driver name that looks like an ip address. i spent > > > literally a day trying to figure out why a rule like this wasn't > > > working. > > > > I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's > > supposed to work. So either doc or implementation bug there. > > im leaning toward implementation bug. > > > route = ( "route-to" | "reply-to" | "dup-to" ) > > ( routehost | "{" routehost-list "}" ) > > [ pooltype ] > > > > routehost-list = routehost [ [ "," ] routehost-list ] > > > > routehost = host | host "@" interface-name | > > "(" interface-name [ address [ "/" mask-bits ] ] ")" > > > > > the second problem is that the pf_route calls from pfsync don't > > > have all the information it is supposed to have. more specifically, > > > an ifp pointer isn't set which leads to a segfault. the ifp pointer > > > isn't set because pfsync doesnt track which interface a packet is > > > going out, it assumes the ip layer will get it right again later, or a > > > rule provided something usable. > > > > > > the third problem is that pf_route relies on information from rules to > > > work correctly. this is a problem in a pfsync environment because you > > > cannot have the same ruleset on both firewalls 100% of the time, which > > > means you cannot have route-to/reply-to behave consistently on a pair of > > > firwalls 100% of the time. > > > > I didn't run into this because pppoe(4) and pfsync/carp don't really > > go well together, but ouch! > > > > > all of this together makes things work pretty obviously and smoothly. > > > in my opinion anyway. route-to now works more like rdr-to, it just > > > feels like it changes the address used for the route lookup rather > > > than changing the actual IP address in the packet. it also works > > > predictably in a pfsync pair, which is great from the point of view of > > > high availability. > > > > > > the main caveat is that it's not backward compatible. if you're already > > > using route-to, you will need to tweak your rules to have them parse. > > > however, i doubt anyone is using this stuff because it feels very broken > > > to me. > > > > Do you expect this to work with a bracketed "address" to defer lookup > > until rule evaluation time? i.e. > > > > pass out proto tcp to any p
Re: pf route-to issues
On Mon, Oct 19, 2020 at 12:28:19PM +0200, Alexandr Nedvedicky wrote: > Hello, > > > > > > > > it seems to me 'route-to vs. pfsync' still needs more thought. the > > > next-hop IP address in route-to may be different for each PF box > > > linked by pfsync(4). To be honest I have no answer to address this > > > issue at the moment. > > > > i have thought about that a little bit. we could play with what the > > argument to route-to means. rather than requiring it to be a directly > > connected host/gateway address, we could interpret it as a destination > > address, and use the gateway for that destination as the nexthop. > > > > eg, if i have the following routing table on frontend a: > > > > Internet: > > DestinationGatewayFlags Refs Use Mtu Prio > > Iface > > default192.168.96.33 UGS6 176171 - 8 > > vmx0 > > 224/4 127.0.0.1 URS00 32768 8 lo0 > > > > 10.0.0.0/30192.168.0.1UGS00 - 8 > > gre0 > > > > > and this routing table on frontend b: > > > > Internet: > > DestinationGatewayFlags Refs Use Mtu Prio > > Iface > > default192.168.96.33 UGS987548 - 8 > > aggr0 > > 224/4 127.0.0.1 URS00 32768 8 lo0 > > > > 10.0.0.0/30192.168.0.3UGS00 - 8 > > gre0 > > > > > if gre0 on both frontends pointed at different legs on the same backend > > server, i could write a pf rule like this: > > > > pass out to port 80 route-to 10.0.0.1 > > > > 10.0.0.1 would then end up as the rt_addr field in pf_state and > > pfsync_state. > > > > both frontend a and b would lookup the route to 10.0.0.1, and then > > use 192.168.0.1 and 192.168.0.3 as the gateway address respectively. > > both would end up pushing the packet over their gre link to the > > same backend. the same semantic would work if the link to the backend > > was over ethernet instead of a tunnel. > > > > > > thoughts? > > > > > > > > > > What you've said makes sense. However I still feel pfsync(4) > > > does not play well with route-to. > > > > maybe your opinion is different if the above makes sense? > > > > Thanks for detailed explanation. This is good enough to make me happy. > The remaining questions on this are sort of 'homework for me' to poke > to PF source code, for example: > are we doing route look up for every packet? or route look up > is performed when state is created/imported? (and we cache > outbound interface + next-hop along the state) the "destination" address is determined when the state is created and stored in rt_addr if pf_state. the route lookup using that address is done per packet in pf_route. > also what happens when route does not exist on pfsync peer, which > receives state? How admin will discover state failed to import? the route lookup or interface lookup fails and the packet is dropped. there are no counters to show this is happening though :( > Anyway, your plan above looks solid to me now. It's certainly more > flexible > (?reliable?) to select route to particular destination, than using pair of > interface,next-hop. cool, i'll keep working on it then. > > thanks and > regards > sashan
Re: pf route-to issues
Hello, > > > > it seems to me 'route-to vs. pfsync' still needs more thought. the > > next-hop IP address in route-to may be different for each PF box > > linked by pfsync(4). To be honest I have no answer to address this > > issue at the moment. > > i have thought about that a little bit. we could play with what the > argument to route-to means. rather than requiring it to be a directly > connected host/gateway address, we could interpret it as a destination > address, and use the gateway for that destination as the nexthop. > > eg, if i have the following routing table on frontend a: > > Internet: > DestinationGatewayFlags Refs Use Mtu Prio Iface > default192.168.96.33 UGS6 176171 - 8 vmx0 > 224/4 127.0.0.1 URS00 32768 8 lo0 > 10.0.0.0/30192.168.0.1UGS00 - 8 gre0 > > and this routing table on frontend b: > > Internet: > DestinationGatewayFlags Refs Use Mtu Prio Iface > default192.168.96.33 UGS987548 - 8 aggr0 > 224/4 127.0.0.1 URS00 32768 8 lo0 > 10.0.0.0/30192.168.0.3UGS00 - 8 gre0 > > if gre0 on both frontends pointed at different legs on the same backend > server, i could write a pf rule like this: > > pass out to port 80 route-to 10.0.0.1 > > 10.0.0.1 would then end up as the rt_addr field in pf_state and > pfsync_state. > > both frontend a and b would lookup the route to 10.0.0.1, and then > use 192.168.0.1 and 192.168.0.3 as the gateway address respectively. > both would end up pushing the packet over their gre link to the > same backend. the same semantic would work if the link to the backend > was over ethernet instead of a tunnel. > > > > thoughts? > > > > > > > What you've said makes sense. However I still feel pfsync(4) > > does not play well with route-to. > > maybe your opinion is different if the above makes sense? > Thanks for detailed explanation. This is good enough to make me happy. The remaining questions on this are sort of 'homework for me' to poke to PF source code, for example: are we doing route look up for every packet? or route look up is performed when state is created/imported? (and we cache outbound interface + next-hop along the state) also what happens when route does not exist on pfsync peer, which receives state? How admin will discover state failed to import? Anyway, your plan above looks solid to me now. It's certainly more flexible (?reliable?) to select route to particular destination, than using pair of interface,next-hop. thanks and regards sashan
Re: pf route-to issues
On Mon, Oct 19, 2020 at 09:34:31AM +0100, Stuart Henderson wrote: > On 2020/10/19 15:35, David Gwynne wrote: > > every few years i try and use route-to in pf, and every time it > > goes badly. i tried it again last week in a slightly different > > setting, and actually tried to understand the sharp edges i hit > > this time instead of giving up. it turns out there are 2 or 3 > > different things together that have cause me trouble, which is why > > the diff below is so big. > > I used to route-to/reply-to quite a lot at places with poor internet > connections to split traffic between lines (mostly those have better > connections now so I don't need it as often). It worked as I expected - > but I only ever used it with the interface specified. cool. did it work beyond the first packet in a connection? > I mostly used it with pppoe interfaces so the peer address was unknown > at ruleset load time. (I was lucky and had static IPs my side, but the > ISP side was variable). I relied on the fact that once packets are > directed at a point-point interface there's only one place for them to > go. I didn't notice that ":peer" might be useful here (and the syntax > 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't > have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc > (though actually I think it works with $any_random_address@pppoeX). yes. i was trying to use it with peers over ethernet, and always struggled with the syntax. > > the first and i would argue most fundamental problem is a semantic > > problem. if you ask a random person who has some clue about networks > > and routing what they would expect the "argument" to route-to or > > reply-to to be, they would say "a nexthop address" or "a gateway > > address". eg, say i want to force packets to a specific backend > > server without using NAT, i would write a rule like this: > > > > n_servers="192.0.2.128/27" > > pass out on $if_internal to $n_servers route-to 192.168.0.1 > > > > pfctl will happily parse this, shove it into the kernel, let you read > > the rules back out again with pfctl -sr, and it all looks plausible, but > > it turns out that it's using the argument to route-to as an interface > > name. because rulesets can refer to interfaces that don't exist yet, pf > > just passes the IP address around as a string, hoping i'll plug in an > > interface with a driver name that looks like an ip address. i spent > > literally a day trying to figure out why a rule like this wasn't > > working. > > I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's > supposed to work. So either doc or implementation bug there. im leaning toward implementation bug. > route = ( "route-to" | "reply-to" | "dup-to" ) > ( routehost | "{" routehost-list "}" ) > [ pooltype ] > > routehost-list = routehost [ [ "," ] routehost-list ] > > routehost = host | host "@" interface-name | > "(" interface-name [ address [ "/" mask-bits ] ] ")" > > > the second problem is that the pf_route calls from pfsync don't > > have all the information it is supposed to have. more specifically, > > an ifp pointer isn't set which leads to a segfault. the ifp pointer > > isn't set because pfsync doesnt track which interface a packet is > > going out, it assumes the ip layer will get it right again later, or a > > rule provided something usable. > > > > the third problem is that pf_route relies on information from rules to > > work correctly. this is a problem in a pfsync environment because you > > cannot have the same ruleset on both firewalls 100% of the time, which > > means you cannot have route-to/reply-to behave consistently on a pair of > > firwalls 100% of the time. > > I didn't run into this because pppoe(4) and pfsync/carp don't really > go well together, but ouch! > > > all of this together makes things work pretty obviously and smoothly. > > in my opinion anyway. route-to now works more like rdr-to, it just > > feels like it changes the address used for the route lookup rather > > than changing the actual IP address in the packet. it also works > > predictably in a pfsync pair, which is great from the point of view of > > high availability. > > > > the main caveat is that it's not backward compatible. if you're already > > using route-to, you will need to tweak your rules to have them parse. > > however, i doubt anyone is using this stuff because it feels very broken > > to me. > > Do you expect this to work with a bracketed "address" to defer lookup > until rule evaluation time? i.e. > > pass out proto tcp to any port 22 route-to (pppoe1:peer) in my opinion route-to should be able to take whatever rdr-to accepts. however, i just tried it and it doesnt currently work, but i'm sure i can figure it out. > I think that will be all that's needed to allow converting the pppoe > use case. I don't have a multiple pppoe setup handy bu
Re: pf route-to issues
On Mon, Oct 19, 2020 at 09:46:19AM +0200, Alexandr Nedvedicky wrote: > Hello, > > disclaimer: I have no chance to run pfsync on production, I'm very > inexperienced with pfsync(4). i wrote the defer code in pfsync, and i think i wrote the code in pfsync that calls pf_route badly, so noones perfect :) > > > > > the third problem is that pf_route relies on information from rules to > > work correctly. this is a problem in a pfsync environment because you > > cannot have the same ruleset on both firewalls 100% of the time, which > > means you cannot have route-to/reply-to behave consistently on a pair of > > firwalls 100% of the time. > > > > my solution to both these problems is reduce the amount of information > > pf_route needs to work with, to make sure that the info it does need > > is in the pf state structure, and that pfsync handles it properly. > > > > if we limit the information needed for pf_route to a nexthop address, > > and which direction the address is used, this is doable. both the > > pf_state and pfsync_state structs already contain an address to store a > > nexthop in, i just had to move the route-to direction from the rule into > > the state. this is easy with pf_state, but i used a spare pad field in > > pfsync_state for this. > > > > it seems to me 'route-to vs. pfsync' still needs more thought. the > next-hop IP address in route-to may be different for each PF box > linked by pfsync(4). To be honest I have no answer to address this > issue at the moment. i have thought about that a little bit. we could play with what the argument to route-to means. rather than requiring it to be a directly connected host/gateway address, we could interpret it as a destination address, and use the gateway for that destination as the nexthop. eg, if i have the following routing table on frontend a: Internet: DestinationGatewayFlags Refs Use Mtu Prio Iface default192.168.96.33 UGS6 176171 - 8 vmx0 224/4 127.0.0.1 URS00 32768 8 lo0 10.0.0.0/30192.168.0.1UGS00 - 8 gre0 127/8 127.0.0.1 UGRS 00 32768 8 lo0 127.0.0.1 127.0.0.1 UHhl 1 142 32768 1 lo0 192.168.0.0192.168.0.0UHl00 - 1 gre0 192.168.0.1192.168.0.0UHh11 - 8 gre0 192.168.96.32/27 192.168.96.34 UCn2 122849 - 4 vmx0 192.168.96.33 00:00:5e:00:01:47 UHLch 114611 - 3 vmx0 192.168.96.34 00:50:56:a1:73:91 UHLl 0 362231 - 1 vmx0 192.168.96.60 fe:e1:ba:d0:74:ef UHLc 059690 - 3 vmx0 192.168.96.63 192.168.96.34 UHb00 - 1 vmx0 and this routing table on frontend b: Internet: DestinationGatewayFlags Refs Use Mtu Prio Iface default192.168.96.33 UGS987548 - 8 aggr0 224/4 127.0.0.1 URS00 32768 8 lo0 10.0.0.0/30192.168.0.3UGS00 - 8 gre0 127/8 127.0.0.1 UGRS 00 32768 8 lo0 127.0.0.1 127.0.0.1 UHhl 1 62 32768 1 lo0 192.168.0.2192.168.0.2UHl00 - 1 gre0 192.168.0.3192.168.0.2UHh11 - 8 gre0 192.168.96.32/27 192.168.96.55 UCn362186 - 4 aggr0 192.168.96.33 00:00:5e:00:01:47 UHLch 1 7442 - 3 aggr0 192.168.96.35 00:23:42:d0:56:8e UHLc 0 1905 - 3 aggr0 192.168.96.55 fe:e1:ba:d0:e0:83 UHLl 0 178119 - 1 aggr0 192.168.96.60 fe:e1:ba:d0:74:ef UHLc 031021 - 3 aggr0 192.168.96.63 192.168.96.55 UHb00 - 1 aggr0 if gre0 on both frontends pointed at different legs on the same backend server, i could write a pf rule like this: pass out to port 80 route-to 10.0.0.1 10.0.0.1 would then end up as the rt_addr field in pf_state and pfsync_state. both frontend a and b would lookup the route to 10.0.0.1, and then use 192.168.0.1 and 192.168.0.3 as the gateway address respectively. both would end up pushing the packet over their gre link to the same backend. the same semantic would work if the link to the backend was over ethernet instead of a tunnel. > > thoughts? > > > > What you've said makes sense. However I still feel pfsync(4) > does not play well with route-to. maybe your opinion is different if the above makes sense? > thanks and > regards > sashan no, thank you for reading my long email. cheers, dlg
Re: net.inet.ip.forwarding=0 vs lo(4)
On Mon, Oct 19, 2020 at 10:03:29AM +0100, Stuart Henderson wrote: > On 2020/10/19 11:47, David Gwynne wrote: > > On Sun, Oct 18, 2020 at 08:57:34PM +0100, Stuart Henderson wrote: > > > On 2020/10/18 14:04, David Gwynne wrote: > > > > the problem i'm hitting is that i have a multihomed box where the > > > > service it provides listens on an IP address that's assigned to lo1. > > > > it's a host running a service, it's not a router, so the > > > > net.inet.ip.forwarding sysctl is not set to 1. > > > > > > I ran into this, I just turned on the forwarding sysctl to avoid the > > > problem. > > > > > > > i came up with this diff, which adds even more special casing for > > > > loopback interfaces. it says addreesses on loopbacks are globally > > > > reachable, even if ip forwarding is disabled. > > > > > > I don't see why loopbacks should be special. Another place this > > > might show up is services running on carp addresses (I haven't updated > > > those machines yet but there's a fair chance they'll be affected too). > > > I would prefer an explicit sysctl to disable "strong host model". > > > > loopback is already special. if a packet comes from an loopback > > interface, we allow it to talk to any IP on the local machine. i think > > this is mostly to cope with the semantic we've had where local traffic > > get's tied to a loopback interface instead of going anywhere near the > > physical ones. > > > > carp is also special. > > > > let me paste the ip_laddr function instead of the diff to it, it's a bit > > more obvious what's going on: > > Thanks, that will already work for the machines I was thinking of then. > > > back to loopback and receiving packets. loopback is special because it > > is not connected to the outside world. it is impossible to send a packet > > via a loopback interface from another host, so configuring a globally > > (externally) routable IP on it is currently pointless unless you enable > > forwarding. i think making loopback more special and allowing it > > to be globally reachable makes sense. i can't think of any downsides to > > this at the moment, except that the behaviour would be subtle/not > > obvious > > ok, so it makes sense for this to be independent of any possible > separate lever. > > > is there a need to configure a globally reachable IP on a non-loopback > > interface on a host (not router)? if so, then i'd be more convinced that > > we need a separate lever to pull. > > I'm not using it this way, but here's a scenario. > > Say there are a couple of webservers with addresses from a carp on > ethernet/vlan, with a link to their upstream router on some separate > interface. They announce the carp prefix into ospf. so carp is just being used to elect a webserver as a master, and then the result of that election is fed upstream. > They aren't routing themselves so the only reason to have forwarding=1 > is to have them use "weak host model". > > With forwarding=0 I think they'll have to use "stub router no" otherwise > everything will be announced high metric (rather than being dependent on > carp state), but ospfd explicitly handles this; it's marked in parse.y > with "/* allow to force non stub mode */". so is a Big Global Lever what you want here? if you enable weak host mode, all IPs on the host will be addressible from all legs of the host. would it make more sense to configure specific interfaces as holding globally addressible IPs? if my understanding of your scenario is right, you could configure the carp interface with the weak or globally accessible flag. in my situation i could configure that on lo1. dlg
Re: net.inet.ip.forwarding=0 vs lo(4)
On 2020/10/19 11:47, David Gwynne wrote: > On Sun, Oct 18, 2020 at 08:57:34PM +0100, Stuart Henderson wrote: > > On 2020/10/18 14:04, David Gwynne wrote: > > > the problem i'm hitting is that i have a multihomed box where the > > > service it provides listens on an IP address that's assigned to lo1. > > > it's a host running a service, it's not a router, so the > > > net.inet.ip.forwarding sysctl is not set to 1. > > > > I ran into this, I just turned on the forwarding sysctl to avoid the > > problem. > > > > > i came up with this diff, which adds even more special casing for > > > loopback interfaces. it says addreesses on loopbacks are globally > > > reachable, even if ip forwarding is disabled. > > > > I don't see why loopbacks should be special. Another place this > > might show up is services running on carp addresses (I haven't updated > > those machines yet but there's a fair chance they'll be affected too). > > I would prefer an explicit sysctl to disable "strong host model". > > loopback is already special. if a packet comes from an loopback > interface, we allow it to talk to any IP on the local machine. i think > this is mostly to cope with the semantic we've had where local traffic > get's tied to a loopback interface instead of going anywhere near the > physical ones. > > carp is also special. > > let me paste the ip_laddr function instead of the diff to it, it's a bit > more obvious what's going on: Thanks, that will already work for the machines I was thinking of then. > back to loopback and receiving packets. loopback is special because it > is not connected to the outside world. it is impossible to send a packet > via a loopback interface from another host, so configuring a globally > (externally) routable IP on it is currently pointless unless you enable > forwarding. i think making loopback more special and allowing it > to be globally reachable makes sense. i can't think of any downsides to > this at the moment, except that the behaviour would be subtle/not > obvious ok, so it makes sense for this to be independent of any possible separate lever. > is there a need to configure a globally reachable IP on a non-loopback > interface on a host (not router)? if so, then i'd be more convinced that > we need a separate lever to pull. I'm not using it this way, but here's a scenario. Say there are a couple of webservers with addresses from a carp on ethernet/vlan, with a link to their upstream router on some separate interface. They announce the carp prefix into ospf. They aren't routing themselves so the only reason to have forwarding=1 is to have them use "weak host model". With forwarding=0 I think they'll have to use "stub router no" otherwise everything will be announced high metric (rather than being dependent on carp state), but ospfd explicitly handles this; it's marked in parse.y with "/* allow to force non stub mode */".
kqueue_scan() refactoring
Diff below is the second part of the refactoring [0] that got committed then reverted 6 months ago. The idea of this refactoring is to move the accounting and markers used when collecting events to a context (struct kqueue_scan_state) such that the collecting routine (kqueue_scan()) can be called multiple times. Please test this especially with X and report back. [0] https://marc.info/?l=openbsd-tech&m=158739418817156&w=2 Index: kern/kern_event.c === RCS file: /cvs/src/sys/kern/kern_event.c,v retrieving revision 1.143 diff -u -p -r1.143 kern_event.c --- kern/kern_event.c 11 Oct 2020 07:11:59 - 1.143 +++ kern/kern_event.c 19 Oct 2020 08:47:00 - @@ -905,7 +905,6 @@ kqueue_scan(struct kqueue_scan_state *sc nkev = 0; kevp = kev; - count = maxevents; if (count == 0) goto done; @@ -921,7 +920,8 @@ retry: s = splhigh(); if (kq->kq_count == 0) { - if (tsp != NULL && !timespecisset(tsp)) { + if ((tsp != NULL && !timespecisset(tsp)) || + scan->kqs_nevent != 0) { splx(s); error = 0; goto done; @@ -937,18 +937,32 @@ retry: goto done; } - TAILQ_INSERT_TAIL(&kq->kq_head, &scan->kqs_end, kn_tqe); + /* +* Put the end marker in the queue to limit the scan to the events +* that are currently active. This prevents events from being +* recollected if they reactivate during scan. +* +* If a partial scan has been performed already but no events have +* been collected, reposition the end marker to make any new events +* reachable. +*/ + if (!scan->kqs_queued) { + TAILQ_INSERT_TAIL(&kq->kq_head, &scan->kqs_end, kn_tqe); + scan->kqs_queued = 1; + } else if (scan->kqs_nevent == 0) { + TAILQ_REMOVE(&kq->kq_head, &scan->kqs_end, kn_tqe); + TAILQ_INSERT_TAIL(&kq->kq_head, &scan->kqs_end, kn_tqe); + } + TAILQ_INSERT_HEAD(&kq->kq_head, &scan->kqs_start, kn_tqe); while (count) { kn = TAILQ_NEXT(&scan->kqs_start, kn_tqe); if (kn->kn_filter == EVFILT_MARKER) { if (kn == &scan->kqs_end) { - TAILQ_REMOVE(&kq->kq_head, &scan->kqs_end, - kn_tqe); TAILQ_REMOVE(&kq->kq_head, &scan->kqs_start, kn_tqe); splx(s); - if (count == maxevents) + if (scan->kqs_nevent == 0) goto retry; goto done; } @@ -984,6 +998,9 @@ retry: *kevp = kn->kn_kevent; kevp++; nkev++; + count--; + scan->kqs_nevent++; + if (kn->kn_flags & EV_ONESHOT) { splx(s); kn->kn_fop->f_detach(kn); @@ -1009,7 +1026,6 @@ retry: knote_release(kn); } kqueue_check(kq); - count--; if (nkev == KQ_NEVENTS) { splx(s); #ifdef KTRACE @@ -1026,7 +1042,6 @@ retry: break; } } - TAILQ_REMOVE(&kq->kq_head, &scan->kqs_end, kn_tqe); TAILQ_REMOVE(&kq->kq_head, &scan->kqs_start, kn_tqe); splx(s); done: @@ -1059,15 +1074,21 @@ void kqueue_scan_finish(struct kqueue_scan_state *scan) { struct kqueue *kq = scan->kqs_kq; + int s; KASSERT(scan->kqs_start.kn_filter == EVFILT_MARKER); KASSERT(scan->kqs_start.kn_status == KN_PROCESSING); KASSERT(scan->kqs_end.kn_filter == EVFILT_MARKER); KASSERT(scan->kqs_end.kn_status == KN_PROCESSING); + if (scan->kqs_queued) { + scan->kqs_queued = 0; + s = splhigh(); + TAILQ_REMOVE(&kq->kq_head, &scan->kqs_end, kn_tqe); + splx(s); + } KQRELE(kq); } - /* * XXX Index: sys/event.h === RCS file: /cvs/src/sys/sys/event.h,v retrieving revision 1.46 diff -u -p -r1.46 event.h --- sys/event.h 11 Oct 2020 07:11:59 - 1.46 +++ sys/event.h 19 Oct 2020 08:39:51 - @@ -204,6 +204,9 @@ struct kqueue_scan_state { struct kqueue *kqs_kq;/* kqueue of this scan */ struct knote kqs_start; /* start marker */ struct knote kqs_end; /* end marker */ + int kqs_nevent;/* number of events collected */ + int kqs_queued;/*
Re: pf route-to issues
On 2020/10/19 15:35, David Gwynne wrote: > every few years i try and use route-to in pf, and every time it > goes badly. i tried it again last week in a slightly different > setting, and actually tried to understand the sharp edges i hit > this time instead of giving up. it turns out there are 2 or 3 > different things together that have cause me trouble, which is why > the diff below is so big. I used to route-to/reply-to quite a lot at places with poor internet connections to split traffic between lines (mostly those have better connections now so I don't need it as often). It worked as I expected - but I only ever used it with the interface specified. I mostly used it with pppoe interfaces so the peer address was unknown at ruleset load time. (I was lucky and had static IPs my side, but the ISP side was variable). I relied on the fact that once packets are directed at a point-point interface there's only one place for them to go. I didn't notice that ":peer" might be useful here (and the syntax 'route-to pppoe1:peer@pppoe1' is pretty awkward so I probably wouldn't have come up with it), I had 0.0.0.1@pppoe1, 0.0.0.2@pppoe2 etc (though actually I think it works with $any_random_address@pppoeX). > the first and i would argue most fundamental problem is a semantic > problem. if you ask a random person who has some clue about networks > and routing what they would expect the "argument" to route-to or > reply-to to be, they would say "a nexthop address" or "a gateway > address". eg, say i want to force packets to a specific backend > server without using NAT, i would write a rule like this: > > n_servers="192.0.2.128/27" > pass out on $if_internal to $n_servers route-to 192.168.0.1 > > pfctl will happily parse this, shove it into the kernel, let you read > the rules back out again with pfctl -sr, and it all looks plausible, but > it turns out that it's using the argument to route-to as an interface > name. because rulesets can refer to interfaces that don't exist yet, pf > just passes the IP address around as a string, hoping i'll plug in an > interface with a driver name that looks like an ip address. i spent > literally a day trying to figure out why a rule like this wasn't > working. I don't think I tried this, but the pf.conf(5) BNF syntax suggests it's supposed to work. So either doc or implementation bug there. route = ( "route-to" | "reply-to" | "dup-to" ) ( routehost | "{" routehost-list "}" ) [ pooltype ] routehost-list = routehost [ [ "," ] routehost-list ] routehost = host | host "@" interface-name | "(" interface-name [ address [ "/" mask-bits ] ] ")" > the second problem is that the pf_route calls from pfsync don't > have all the information it is supposed to have. more specifically, > an ifp pointer isn't set which leads to a segfault. the ifp pointer > isn't set because pfsync doesnt track which interface a packet is > going out, it assumes the ip layer will get it right again later, or a > rule provided something usable. > > the third problem is that pf_route relies on information from rules to > work correctly. this is a problem in a pfsync environment because you > cannot have the same ruleset on both firewalls 100% of the time, which > means you cannot have route-to/reply-to behave consistently on a pair of > firwalls 100% of the time. I didn't run into this because pppoe(4) and pfsync/carp don't really go well together, but ouch! > all of this together makes things work pretty obviously and smoothly. > in my opinion anyway. route-to now works more like rdr-to, it just > feels like it changes the address used for the route lookup rather > than changing the actual IP address in the packet. it also works > predictably in a pfsync pair, which is great from the point of view of > high availability. > > the main caveat is that it's not backward compatible. if you're already > using route-to, you will need to tweak your rules to have them parse. > however, i doubt anyone is using this stuff because it feels very broken > to me. Do you expect this to work with a bracketed "address" to defer lookup until rule evaluation time? i.e. pass out proto tcp to any port 22 route-to (pppoe1:peer) I think that will be all that's needed to allow converting the pppoe use case. I don't have a multiple pppoe setup handy but I can probably hack together some sort of test. I've also used route-to with squid "transparent" proxying (shown in the pkg-readme), I don't do that any more but I can put a squid test together easily enough. > @@ -1842,37 +1833,18 @@ pfrule: action dir logquick interface > decide_address_family($7.src.host, &r.af); > decide_address_family($7.dst.host, &r.af); > > - if ($8.route.rt) { > + if ($8.rt) { > + if ($8.rt != PF_DUPTO && !r.direction) { >
Re: uao_init() cleanup
> Date: Mon, 19 Oct 2020 10:00:49 +0200 > From: Martin Pieuchot > > uao_init() is called from uvm_km_init() which itself is called by > uvm_init(). None of the *init() functions in UVM have a guard, so be > coherent and remove this one. > > ok? Yes, that seems redundant. ok kettenis@ > Index: uvm/uvm_aobj.c > === > RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v > retrieving revision 1.87 > diff -u -p -r1.87 uvm_aobj.c > --- uvm/uvm_aobj.c22 Sep 2020 14:31:08 - 1.87 > +++ uvm/uvm_aobj.c13 Oct 2020 09:25:20 - > @@ -788,12 +788,6 @@ uao_create(vsize_t size, int flags) > void > uao_init(void) > { > - static int uao_initialized; > - > - if (uao_initialized) > - return; > - uao_initialized = TRUE; > - > /* >* NOTE: Pages for this pool must not come from a pageable >* kernel map! > >
UVM fault check
uvm_fault() is one of the most contended "entry point" of the kernel. To reduce this contention I'm carefully refactoring this code to be able to push the KERNEL_LOCK() inside the fault handler. The first aim of this project would be to get the upper layer faults (cases 1A and 1B) out of ze big lock. As these faults do not involve `uobj' the scope of this project should be limited to serializing amap changes without the KERNEL_LOCK(). The diff below moves the first part of uvm_fault() into its own function: uvm_fault_check(). It is inspired by/imitates the current code structure of NetBSD's fault handler. This diff should not have any functional change. I hope it helps build better understanding of this area. Comments? Oks? Index: uvm/uvm_fault.c === RCS file: /cvs/src/sys/uvm/uvm_fault.c,v retrieving revision 1.102 diff -u -p -r1.102 uvm_fault.c --- uvm/uvm_fault.c 29 Sep 2020 11:47:41 - 1.102 +++ uvm/uvm_fault.c 12 Oct 2020 09:01:05 - @@ -472,114 +472,101 @@ uvmfault_update_stats(struct uvm_faultin } } -/* - * F A U L T - m a i n e n t r y p o i n t - */ +struct uvm_faultctx { + /* +* the following members are set up by uvm_fault_check() and +* read-only after that. +*/ + vm_prot_t enter_prot; + vaddr_t startva; + int npages; + int centeridx; + boolean_t narrow; + boolean_t wired; + paddr_t pa_flags; +}; /* - * uvm_fault: page fault handler + * uvm_fault_check: check prot, handle needs-copy, etc. * - * => called from MD code to resolve a page fault - * => VM data structures usually should be unlocked. however, it is - * possible to call here with the main map locked if the caller - * gets a write lock, sets it recursive, and then calls us (c.f. - * uvm_map_pageable). this should be avoided because it keeps - * the map locked off during I/O. + * 1. lookup entry. + * 2. check protection. + * 3. adjust fault condition (mainly for simulated fault). + * 4. handle needs-copy (lazy amap copy). + * 5. establish range of interest for neighbor fault (aka pre-fault). + * 6. look up anons (if amap exists). + * 7. flush pages (if MADV_SEQUENTIAL) + * + * => called with nothing locked. + * => if we fail (result != 0) we unlock everything. + * => initialize/adjust many members of flt. */ -#define MASK(entry) (UVM_ET_ISCOPYONWRITE(entry) ? \ -~PROT_WRITE : PROT_MASK) int -uvm_fault(vm_map_t orig_map, vaddr_t vaddr, vm_fault_t fault_type, -vm_prot_t access_type) +uvm_fault_check(struct uvm_faultinfo *ufi, struct uvm_faultctx *flt, +struct vm_anon ***ranons, vm_prot_t access_type) { - struct uvm_faultinfo ufi; - vm_prot_t enter_prot; - boolean_t wired, narrow, promote, locked, shadowed; - int npages, nback, nforw, centeridx, result, lcv, gotpages, ret; - vaddr_t startva, currva; - voff_t uoff; - paddr_t pa, pa_flags; struct vm_amap *amap; struct uvm_object *uobj; - struct vm_anon *anons_store[UVM_MAXRANGE], **anons, *anon, *oanon; - struct vm_page *pages[UVM_MAXRANGE], *pg, *uobjpage; + int nback, nforw; - anon = NULL; - pg = NULL; - - uvmexp.faults++;/* XXX: locking? */ - TRACEPOINT(uvm, fault, vaddr, fault_type, access_type, NULL); - - /* init the IN parameters in the ufi */ - ufi.orig_map = orig_map; - ufi.orig_rvaddr = trunc_page(vaddr); - ufi.orig_size = PAGE_SIZE; /* can't get any smaller than this */ - if (fault_type == VM_FAULT_WIRE) - narrow = TRUE; /* don't look for neighborhood -* pages on wire */ - else - narrow = FALSE; /* normal fault */ - - /* "goto ReFault" means restart the page fault from ground zero. */ -ReFault: /* lookup and lock the maps */ - if (uvmfault_lookup(&ufi, FALSE) == FALSE) { + if (uvmfault_lookup(ufi, FALSE) == FALSE) { return (EFAULT); } #ifdef DIAGNOSTIC - if ((ufi.map->flags & VM_MAP_PAGEABLE) == 0) + if ((ufi->map->flags & VM_MAP_PAGEABLE) == 0) panic("uvm_fault: fault on non-pageable map (%p, 0x%lx)", - ufi.map, vaddr); + ufi->map, ufi->orig_rvaddr); #endif /* check protection */ - if ((ufi.entry->protection & access_type) != access_type) { - uvmfault_unlockmaps(&ufi, FALSE); + if ((ufi->entry->protection & access_type) != access_type) { + uvmfault_unlockmaps(ufi, FALSE); return (EACCES); } /* * "enter_prot" is the protection we want to enter the page in at. * for certain pages (e.g. copy-on-write pages) this protection can -* be more strict than ufi.en
uao_init() cleanup
uao_init() is called from uvm_km_init() which itself is called by uvm_init(). None of the *init() functions in UVM have a guard, so be coherent and remove this one. ok? Index: uvm/uvm_aobj.c === RCS file: /cvs/src/sys/uvm/uvm_aobj.c,v retrieving revision 1.87 diff -u -p -r1.87 uvm_aobj.c --- uvm/uvm_aobj.c 22 Sep 2020 14:31:08 - 1.87 +++ uvm/uvm_aobj.c 13 Oct 2020 09:25:20 - @@ -788,12 +788,6 @@ uao_create(vsize_t size, int flags) void uao_init(void) { - static int uao_initialized; - - if (uao_initialized) - return; - uao_initialized = TRUE; - /* * NOTE: Pages for this pool must not come from a pageable * kernel map!
Re: pf route-to issues
Hello, disclaimer: I have no chance to run pfsync on production, I'm very inexperienced with pfsync(4). > > the third problem is that pf_route relies on information from rules to > work correctly. this is a problem in a pfsync environment because you > cannot have the same ruleset on both firewalls 100% of the time, which > means you cannot have route-to/reply-to behave consistently on a pair of > firwalls 100% of the time. > > my solution to both these problems is reduce the amount of information > pf_route needs to work with, to make sure that the info it does need > is in the pf state structure, and that pfsync handles it properly. > > if we limit the information needed for pf_route to a nexthop address, > and which direction the address is used, this is doable. both the > pf_state and pfsync_state structs already contain an address to store a > nexthop in, i just had to move the route-to direction from the rule into > the state. this is easy with pf_state, but i used a spare pad field in > pfsync_state for this. > it seems to me 'route-to vs. pfsync' still needs more thought. the next-hop IP address in route-to may be different for each PF box linked by pfsync(4). To be honest I have no answer to address this issue at the moment. > > thoughts? > What you've said makes sense. However I still feel pfsync(4) does not play well with route-to. thanks and regards sashan