Translate POLLNVAL in ppollcollect()

2021-11-20 Thread Visa Hankala
The original poll(2) implementation resumed and returned error POLLNVAL
if the file descriptor was closed by another thread.

This patch adds POLLNVAL translation to ppollcollect(). Currently,
the function returns POLLIN, POLLOUT or something else depending
on the event filter.

OK?

Index: kern/sys_generic.c
===
RCS file: src/sys/kern/sys_generic.c,v
retrieving revision 1.141
diff -u -p -r1.141 sys_generic.c
--- kern/sys_generic.c  16 Nov 2021 13:48:23 -  1.141
+++ kern/sys_generic.c  21 Nov 2021 06:24:50 -
@@ -1263,6 +1263,14 @@ ppollcollect(struct proc *p, struct keve
 */
already_seen = (pl[i].revents != 0);
 
+   /* POLLNVAL preempts other events. */
+   if ((kevp->flags & EV_ERROR) && kevp->data == EBADF) {
+   pl[i].revents = POLLNVAL;
+   goto done;
+   } else if (pl[i].revents & POLLNVAL) {
+   goto done;
+   }
+
switch (kevp->filter) {
case EVFILT_READ:
if (kevp->flags & __EV_HUP)
@@ -1294,6 +1302,7 @@ ppollcollect(struct proc *p, struct keve
KASSERT(0);
}
 
+done:
DPRINTFN(1, "poll get %lu/%d fd %d revents %02x serial %lu filt %d\n",
i+1, nfds, pl[i].fd, pl[i].revents, (unsigned long)kevp->udata,
kevp->filter);



vmm(4): copyout guest regs, irqready on VM_EXIT_NONE

2021-11-20 Thread Dave Voutila
The below diff fixes an issue reported by kn@ on bugs@ [1]. joshe@ also
observed the issue and confirmed the below diff resolves it.

The symptoms were quite odd: errors from fdc(4) during an OpenBSD guest
booting under vmm(4)/vmd(8). We don't emulate a floppy disk drive!!!

I introduced a bug in r1.287 [2] when simplifying parts of
vcpu_run_{svm,vmx} by letting the functions return 0 instead of
voluntarily yielding. The edge case I didn't account for is if after a
vmexit for an IN instruction, the io port address isn't one emulated by
vmd(8) in userland, vmm(4) will perform the emulation (not the bug) by
writing the appropriate number of 0xff bytes to AL/AX/EAX. IF the
scheduler would like us to yield, we return setting a vrp exit code of
VM_EXIT_NONE (since we aren't asking userland/vmd to help with any
emulation).

vmd(8) correctly handles this exit, but vmm(4) never copies out the
current vcpu registers and irqready state. When vmd(8) runs the vcpu
again, the vcpu's guest state still has a vmexit related to the IO
operation and presumes vmd(8) modified RAX and overwrites the vcpu's
RAX before re-entering the guest.

This behavior occurs on both Intel and AMD. To confirm, I added some
printfs to fdc(4) and specifically checked when the dma reads returned
something other than 0xff on instances of both types of host. (Since
it's probabilistic, it's not uncommon to see it happen only 3-4 times
out of the 100k bus reads out_fdc() attempts, but it seems more
reproducible on older hardware.)

ok?

-dv

[1] https://marc.info/?l=openbsd-bugs=163682062027764=2
[2] 
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/arch/amd64/amd64/vmm.c.diff?r1=1.286=1.287


Index: sys/arch/amd64/amd64/vmm.c
===
RCS file: /opt/cvs/src/sys/arch/amd64/amd64/vmm.c,v
retrieving revision 1.294
diff -u -p -r1.294 vmm.c
--- sys/arch/amd64/amd64/vmm.c  26 Oct 2021 16:29:49 -  1.294
+++ sys/arch/amd64/amd64/vmm.c  20 Nov 2021 21:46:07 -
@@ -4301,9 +4301,10 @@ vm_run(struct vm_run_params *vrp)
rw_exit_write(_softc->vm_lock);
}
ret = 0;
-   } else if (ret == EAGAIN) {
+   } else if (ret == 0 || ret == EAGAIN) {
/* If we are exiting, populate exit data so vmd can help. */
-   vrp->vrp_exit_reason = vcpu->vc_gueststate.vg_exit_reason;
+   vrp->vrp_exit_reason = (ret == 0) ? VM_EXIT_NONE
+   : vcpu->vc_gueststate.vg_exit_reason;
vrp->vrp_irqready = vcpu->vc_irqready;
vcpu->vc_state = VCPU_STATE_STOPPED;

@@ -4312,9 +4313,6 @@ vm_run(struct vm_run_params *vrp)
ret = EFAULT;
} else
ret = 0;
-   } else if (ret == 0) {
-   vrp->vrp_exit_reason = VM_EXIT_NONE;
-   vcpu->vc_state = VCPU_STATE_STOPPED;
} else {
vrp->vrp_exit_reason = VM_EXIT_TERMINATED;
vcpu->vc_state = VCPU_STATE_TERMINATED;



Re: Expose TDBs with excess hardlimit in ipsec(4) statistics

2021-11-20 Thread Alexander Bluhm
On Sun, Nov 21, 2021 at 05:10:54AM +0300, Vitaliy Makkoveev wrote:
> The order ipsecstat_inc(); pfkeyv2_expire(); tdb_delete(); looks
> more consistent to me. ok?

OK bluhm@



Re: tee(1): increase read buffer to MAXBSIZE bytes

2021-11-20 Thread Todd C . Miller
On Sat, 20 Nov 2021 20:19:47 -0600, Scott Cheloha wrote:

> It's not the idiomatic loop from the manpage, but near as I can tell
> it is equivalent.

You are right, I didn't look closely enough.

> ... should we make it idiomatic?  Saves a few lines, is less
> confusing, etc.

OK millert@

 - todd



Re: tee(1): increase read buffer to MAXBSIZE bytes

2021-11-20 Thread Scott Cheloha
On Sat, Nov 20, 2021 at 02:22:41PM -0700, Todd C. Miller wrote:
> On Sat, 20 Nov 2021 11:19:13 -0600, Scott Cheloha wrote:
> 
> > > One advantage to using stdio is that
> > > it will use the optimal I/O blocksize automatically so you could
> > > try using that instead of write(2) and see how it performs.
> >
> > tee(1) needs to write input to N drains, N >= 1.  If we use stdio that
> > means we're doing N additional userspace copies, no?
> 
> Yes, that is true.  Right now tee(1) doesn't check for partial
> writes which I suppose might happen with a larger buffer.  In
> practice, as long as it is less than MAXPHYS it should be OK.

This code in the inner loop doesn't handle partial writes?

   114  SLIST_FOREACH(p, , next) {
   115  n = rval;
   116  bp = buf;
   117  do {
   118  if ((wval = write(p->fd, bp, n)) == -1) 
{
   119  warn("%s", p->name);
   120  exitval = 1;
   121  break;
   122  }
   123  bp += wval;
   124  } while (n -= wval);
   125  }

It's not the idiomatic loop from the manpage, but near as I can tell
it is equivalent.

... should we make it idiomatic?  Saves a few lines, is less
confusing, etc.

Index: tee.c
===
RCS file: /cvs/src/usr.bin/tee/tee.c,v
retrieving revision 1.12
diff -u -p -r1.12 tee.c
--- tee.c   11 Jul 2017 13:14:59 -  1.12
+++ tee.c   21 Nov 2021 02:16:09 -
@@ -68,7 +68,6 @@ main(int argc, char *argv[])
struct list *p;
int fd;
ssize_t n, rval, wval;
-   char *bp;
int append, ch, exitval;
char buf[8192];
 
@@ -112,16 +111,14 @@ main(int argc, char *argv[])
 
while ((rval = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {
SLIST_FOREACH(p, , next) {
-   n = rval;
-   bp = buf;
-   do {
-   if ((wval = write(p->fd, bp, n)) == -1) {
+   for (n = 0; n < rval; n += wval) {
+   wval = write(p->fd, buf + n, rval - n);
+   if (wval == -1) {
warn("%s", p->name);
exitval = 1;
break;
}
-   bp += wval;
-   } while (n -= wval);
+   }
}
}
if (rval == -1) {



Re: Expose TDBs with excess hardlimit in ipsec(4) statistics

2021-11-20 Thread Vitaliy Makkoveev



> On 21 Nov 2021, at 04:14, Alexander Bluhm  wrote:
> 
> On Sun, Nov 21, 2021 at 02:56:58AM +0300, Vitaliy Makkoveev wrote:
>> Also I found the `ipsec_notdb' counter description in header doesn't
>> math to netstat(1) description. We never count `ipsec_notdb' and the
>> netstat(1) description looks more appropriate so I used it to avoid
>> confusion with the new counter added.
> 
> The counters and their description matches the structure of the
> code 20 years ago.  As we are in the middle of refactoring, global
> counter cleanup does not make sense now.
> 
> Adding new counters with proper names makes perfect sense.  When
> the code has settled we can clean up the old stuff easily.
> 
>> ok?
> 
> OK bluhm@
> 
> One nit:
> 
>>  pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
>>  tdb_delete(tdb);
>> +ipsecstat_inc(ipsec_exctdb);
> 
>> +if (!(tdb->tdb_flags & TDBF_INVALID)) {
>>  pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
>> +ipsecstat_inc(ipsec_exctdb);
>> +}
>>  tdb_delete(tdb);
> 
> The order is inconsistent.  Can we have pfkeyv2_expire();
> ipsecstat_inc(); tdb_delete(); everywhere?

The order ipsecstat_inc(); pfkeyv2_expire(); tdb_delete(); looks
more consistent to me. ok?



Re: IPsec tdb ref counting

2021-11-20 Thread Alexander Bluhm
On Sun, Nov 21, 2021 at 03:17:08AM +0300, Vitaliy Makkoveev wrote:
> Can I propose to commit the diff [1] to expose the TDB  hardlimit excess
> statistics and then test this diff? I want to see stable systems with
> non null "TDBs with hard limit excess" couter in the statistics like
> below.

Yes please.

If you are wondering how I debug the ref counts, I have ref counter
tracing on top of my diff.  I don't want to get that into the tree
now.  But it may help you debugging.  See #if TDB_TRACE_MAX.

ddb{2}> show tdb /f 0x880164b0
tdb at 0x880164b0
...
 trace_idx: 5
  tdb_trace[0]: 16: refs 1 +1 cpu0 import_lifetime:333
  tdb_trace[1]: 17: refs 2 +1 cpu0 import_lifetime:304
  tdb_trace[2]: 34: refs 3 -1 cpu0 tdb_soft_timeout:723
  tdb_trace[3]: 38: refs 2 +0 cpu0 tdb_timeout:685
  tdb_trace[4]: 39: refs 2 -1 cpu0 tdb_delete0:994

bluhm

Index: net/if_bridge.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_bridge.c,v
retrieving revision 1.358
diff -u -p -r1.358 if_bridge.c
--- net/if_bridge.c 11 Nov 2021 18:08:17 -  1.358
+++ net/if_bridge.c 20 Nov 2021 15:14:54 -
@@ -1567,20 +1567,28 @@ bridge_ipsec(struct ifnet *ifp, struct e
tdb->tdb_xform != NULL) {
if (tdb->tdb_first_use == 0) {
tdb->tdb_first_use = gettime();
-   if (tdb->tdb_flags & TDBF_FIRSTUSE)
-   timeout_add_sec(>tdb_first_tmo,
-   tdb->tdb_exp_first_use);
-   if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE)
-   timeout_add_sec(>tdb_sfirst_tmo,
-   tdb->tdb_soft_first_use);
+   if (tdb->tdb_flags & TDBF_FIRSTUSE) {
+   if (timeout_add_sec(
+   >tdb_first_tmo,
+   tdb->tdb_exp_first_use))
+   tdb_ref(tdb);
+   }
+   if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE) {
+   if (timeout_add_sec(
+   >tdb_sfirst_tmo,
+   tdb->tdb_soft_first_use))
+   tdb_ref(tdb);
+   }
}
 
prot = (*(tdb->tdb_xform->xf_input))(, tdb, hlen,
off);
+   tdb_unref(tdb);
if (prot != IPPROTO_DONE)
ip_deliver(, , prot, af);
return (1);
} else {
+   tdb_unref(tdb);
  skiplookup:
/* XXX do an input policy lookup */
return (0);
Index: net/if_pfsync.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_pfsync.c,v
retrieving revision 1.298
diff -u -p -r1.298 if_pfsync.c
--- net/if_pfsync.c 11 Nov 2021 12:35:01 -  1.298
+++ net/if_pfsync.c 19 Nov 2021 13:04:58 -
@@ -1325,11 +1325,13 @@ pfsync_update_net_tdb(struct pfsync_tdb 
/* Neither replay nor byte counter should ever decrease. */
if (pt->rpl < tdb->tdb_rpl ||
pt->cur_bytes < tdb->tdb_cur_bytes) {
+   tdb_unref(tdb);
goto bad;
}
 
tdb->tdb_rpl = pt->rpl;
tdb->tdb_cur_bytes = pt->cur_bytes;
+   tdb_unref(tdb);
}
return;
 
Index: net/pfkeyv2.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/pfkeyv2.c,v
retrieving revision 1.221
diff -u -p -r1.221 pfkeyv2.c
--- net/pfkeyv2.c   25 Oct 2021 18:25:01 -  1.221
+++ net/pfkeyv2.c   20 Nov 2021 15:14:54 -
@@ -1043,8 +1043,11 @@ pfkeyv2_sa_flush(struct tdb *tdb, void *
 {
if (!(*((u_int8_t *) satype_vp)) ||
tdb->tdb_satype == *((u_int8_t *) satype_vp)) {
+   /* keep in sync with tdb_delete() */
tdb_unlink_locked(tdb);
-   tdb_free(tdb);
+   tdb_unbundle(tdb);
+   tdb_deltimeouts(tdb);
+   tdb_unref(tdb);
}
return (0);
 }
@@ -1327,7 +1330,7 @@ pfkeyv2_send(struct socket *so, void *me
 
if ((rval = pfkeyv2_get_proto_alg(newsa->tdb_satype,
>tdb_sproto, ))) {
-   tdb_free(freeme);
+   tdb_unref(freeme);
freeme = NULL;
 

Re: Expose TDBs with excess hardlimit in ipsec(4) statistics

2021-11-20 Thread Alexander Bluhm
On Sun, Nov 21, 2021 at 02:56:58AM +0300, Vitaliy Makkoveev wrote:
> Also I found the `ipsec_notdb' counter description in header doesn't
> math to netstat(1) description. We never count `ipsec_notdb' and the
> netstat(1) description looks more appropriate so I used it to avoid
> confusion with the new counter added.

The counters and their description matches the structure of the
code 20 years ago.  As we are in the middle of refactoring, global
counter cleanup does not make sense now.

Adding new counters with proper names makes perfect sense.  When
the code has settled we can clean up the old stuff easily.

> ok?

OK bluhm@

One nit:

>   pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
>   tdb_delete(tdb);
> + ipsecstat_inc(ipsec_exctdb);

> + if (!(tdb->tdb_flags & TDBF_INVALID)) {
>   pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
> + ipsecstat_inc(ipsec_exctdb);
> + }
>   tdb_delete(tdb);

The order is inconsistent.  Can we have pfkeyv2_expire();
ipsecstat_inc(); tdb_delete(); everywhere?



Re: Rework UNIX sockets locking to be fine grained

2021-11-20 Thread Vitaliy Makkoveev
On Sat, Nov 20, 2021 at 07:16:40PM +0100, Matthias Schmidt wrote:
> Hi Vitaliy,
> 
> * Vitaliy Makkoveev wrote:
> > Updated diff. Re-lock dances were simplified in the unix(4) sockets
> > layer.
> 
> I am running this diff and the one before on a Thinkpad T450s which is
> my daily driver (office, coding, web, videos, musik, etc) and noticed no
> regressions so far.  Suspend/resume cycles included.
> 
> Not sure if that fits the tests/testers you need so let me know if I
> should look for/do something special.
> 

Thanks!

Ideally it would be to test something which provides heavy multithreaded
load on unix(4) layer on 8+ cores machine :)

> Cheers
> 
>   Matthias
> 
> 
> OpenBSD 7.0-current (GENERIC.MP) #3: Sat Nov 20 15:12:39 CET 2021
> x...@sigma.xosc.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 12765257728 (12173MB)
> avail mem = 12362416128 (11789MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.7 @ 0x9cbfd000 (65 entries)
> bios0: vendor LENOVO version "JBET73WW (1.37 )" date 08/14/2019
> bios0: LENOVO 20BX0049GE
> acpi0 at bios0: ACPI 5.0
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP SLIC ASF! HPET ECDT APIC MCFG SSDT SSDT SSDT SSDT 
> SSDT SSDT SSDT SSDT SSDT PCCT SSDT TCPA SSDT UEFI MSDM BATB FPDT UEFI DMAR
> acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP2(S4) XHCI(S3) EHC1(S3)
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpihpet0 at acpi0: 14318179 Hz
> acpiec0 at acpi0
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.42 MHz, 06-3d-04
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 99MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
> cpu1 at mainbus0: apid 1 (application processor)
> cpu1: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.16 MHz, 06-3d-04
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 1, core 0, package 0
> cpu2 at mainbus0: apid 2 (application processor)
> cpu2: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.17 MHz, 06-3d-04
> cpu2: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 1, package 0
> cpu3 at mainbus0: apid 3 (application processor)
> cpu3: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.16 MHz, 06-3d-04
> cpu3: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu3: 256KB 64b/line 8-way L2 cache
> cpu3: smt 1, core 1, package 0
> ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 40 pins
> acpimcfg0 at acpi0
> acpimcfg0: addr 0xf800, bus 0-63
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus -1 (PEG_)
> acpiprt2 at acpi0: bus 2 (EXP1)
> acpiprt3 at acpi0: bus 3 (EXP2)
> acpiprt4 at acpi0: bus -1 (EXP3)
> acpibtn0 at acpi0: LID_
> acpibtn1 at acpi0: SLPB
> acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
> acpicmos0 at acpi0
> acpibat0 at acpi0: BAT0 model "45N" serial 16646 type LiP oem "SONY"
> acpibat1 at acpi0: BAT1 model "45N1777" serial   410 type 

Re: IPsec tdb ref counting

2021-11-20 Thread Vitaliy Makkoveev
On Sat, Nov 20, 2021 at 05:31:33PM +0100, Alexander Bluhm wrote:
> New tdb refcounting diff.
> 
> I delete and unref the timeouts earlier and fixed some leaks.  At
> least on my machine it does not crash and tcp InUse is 0 after
> ipsecctl -F .
> 
> Please test.
> 
> mvs:  Are some of your concerns resolved by deleting the tdb_reaper() ?
> 

As I privately said long time ago, I don't like the timeout(9) handler
reassignment we do here, so I like this direction.

When I had playing with your diffs I found my system always manages to
complete rekeying and I have no TDB's with hardlimit excess. That's why
I had no panics like Hrvoje Popovski had. And it seems he is the only
person who reached tdb_delete() code paths which provides double free.

Can I propose to commit the diff [1] to expose the TDB  hardlimit excess
statistics and then test this diff? I want to see stable systems with
non null "TDBs with hard limit excess" couter in the statistics like
below.

[otto@obsd-test ~]$ netstat -s -p ipsec
ipsec:
44684 input IPsec packets
44603 output IPsec packets
7505232 input bytes
7490792 output bytes
5405234 input bytes, decompressed
5706290 output bytes, uncompressed
1 packet dropped on input
0 packets dropped on output
0 packets that failed crypto processing
0 packets for which no XFORM was set in TDB received
0 packets for which no TDB was found
0 TDBs with hard limit excess

1. https://marc.info/?l=openbsd-tech=163745270409693=2

> bluhm
> 
> Index: net/if_bridge.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_bridge.c,v
> retrieving revision 1.358
> diff -u -p -r1.358 if_bridge.c
> --- net/if_bridge.c   11 Nov 2021 18:08:17 -  1.358
> +++ net/if_bridge.c   20 Nov 2021 15:14:54 -
> @@ -1567,20 +1567,28 @@ bridge_ipsec(struct ifnet *ifp, struct e
>   tdb->tdb_xform != NULL) {
>   if (tdb->tdb_first_use == 0) {
>   tdb->tdb_first_use = gettime();
> - if (tdb->tdb_flags & TDBF_FIRSTUSE)
> - timeout_add_sec(>tdb_first_tmo,
> - tdb->tdb_exp_first_use);
> - if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE)
> - timeout_add_sec(>tdb_sfirst_tmo,
> - tdb->tdb_soft_first_use);
> + if (tdb->tdb_flags & TDBF_FIRSTUSE) {
> + if (timeout_add_sec(
> + >tdb_first_tmo,
> + tdb->tdb_exp_first_use))
> + tdb_ref(tdb);
> + }
> + if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE) {
> + if (timeout_add_sec(
> + >tdb_sfirst_tmo,
> + tdb->tdb_soft_first_use))
> + tdb_ref(tdb);
> + }
>   }
>  
>   prot = (*(tdb->tdb_xform->xf_input))(, tdb, hlen,
>   off);
> + tdb_unref(tdb);
>   if (prot != IPPROTO_DONE)
>   ip_deliver(, , prot, af);
>   return (1);
>   } else {
> + tdb_unref(tdb);
>   skiplookup:
>   /* XXX do an input policy lookup */
>   return (0);
> Index: net/if_pfsync.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_pfsync.c,v
> retrieving revision 1.298
> diff -u -p -r1.298 if_pfsync.c
> --- net/if_pfsync.c   11 Nov 2021 12:35:01 -  1.298
> +++ net/if_pfsync.c   19 Nov 2021 13:04:58 -
> @@ -1325,11 +1325,13 @@ pfsync_update_net_tdb(struct pfsync_tdb 
>   /* Neither replay nor byte counter should ever decrease. */
>   if (pt->rpl < tdb->tdb_rpl ||
>   pt->cur_bytes < tdb->tdb_cur_bytes) {
> + tdb_unref(tdb);
>   goto bad;
>   }
>  
>   tdb->tdb_rpl = pt->rpl;
>   tdb->tdb_cur_bytes = pt->cur_bytes;
> + tdb_unref(tdb);
>   }
>   return;
>  
> Index: net/pfkeyv2.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/net/pfkeyv2.c,v
> retrieving revision 1.221
> diff -u -p -r1.221 pfkeyv2.c
> --- net/pfkeyv2.c 25 Oct 2021 18:25:01 -  1.221
> +++ net/pfkeyv2.c 20 Nov 2021 15:14:54 -
> @@ -1043,8 +1043,11 @@ pfkeyv2_sa_flush(struct tdb *tdb, void *
>  {
> 

Expose TDBs with excess hardlimit in ipsec(4) statistics

2021-11-20 Thread Vitaliy Makkoveev
When I had playing with bluhm@'s ipsec(4) diffs I found my system
always manages to complete rekeying and I have no TDB's with hardlimit
excess. It seems I'm not the only such person and other testers also
don't have panics Hrvoje Popovski had.

I want to count the TDBs with hardlimit excess as the new `ipsec_exctdb'
ipsec(4) counter and export this statistics to userland. So if the
tester person will expose something like below, we could say he didn't
reach the fragile tdb_delete() paths.

[otto@obsd-test ~]$ netstat -s -p ipsec
ipsec:
44684 input IPsec packets
44603 output IPsec packets
7505232 input bytes
7490792 output bytes
5405234 input bytes, decompressed
5706290 output bytes, uncompressed
1 packet dropped on input
0 packets dropped on output
0 packets that failed crypto processing
0 packets for which no XFORM was set in TDB received
0 packets for which no TDB was found
0 TDBs with hard limit excess

Also I found the `ipsec_notdb' counter description in header doesn't
math to netstat(1) description. We never count `ipsec_notdb' and the
netstat(1) description looks more appropriate so I used it to avoid
confusion with the new counter added.

ok?

Index: sys/netinet/ip_ah.c
===
RCS file: /cvs/src/sys/netinet/ip_ah.c,v
retrieving revision 1.166
diff -u -p -r1.166 ip_ah.c
--- sys/netinet/ip_ah.c 11 Nov 2021 18:08:18 -  1.166
+++ sys/netinet/ip_ah.c 20 Nov 2021 23:32:57 -
@@ -616,6 +616,7 @@ ah_input(struct mbuf **mp, struct tdb *t
tdb->tdb_cur_bytes >= tdb->tdb_exp_bytes) {
pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
tdb_delete(tdb);
+   ipsecstat_inc(ipsec_exctdb);
goto drop;
}
 
@@ -955,6 +956,7 @@ ah_output(struct mbuf *m, struct tdb *td
tdb->tdb_cur_bytes >= tdb->tdb_exp_bytes) {
pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
tdb_delete(tdb);
+   ipsecstat_inc(ipsec_exctdb);
error = EINVAL;
goto drop;
}
Index: sys/netinet/ip_esp.c
===
RCS file: /cvs/src/sys/netinet/ip_esp.c,v
retrieving revision 1.187
diff -u -p -r1.187 ip_esp.c
--- sys/netinet/ip_esp.c11 Nov 2021 18:08:18 -  1.187
+++ sys/netinet/ip_esp.c20 Nov 2021 23:32:57 -
@@ -428,6 +428,7 @@ esp_input(struct mbuf **mp, struct tdb *
(tdb->tdb_cur_bytes >= tdb->tdb_exp_bytes)) {
pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
tdb_delete(tdb);
+   ipsecstat_inc(ipsec_exctdb);
goto drop;
}
 
@@ -784,6 +785,7 @@ esp_output(struct mbuf *m, struct tdb *t
tdb->tdb_cur_bytes >= tdb->tdb_exp_bytes) {
pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
tdb_delete(tdb);
+   ipsecstat_inc(ipsec_exctdb);
error = EINVAL;
goto drop;
}
Index: sys/netinet/ip_ipcomp.c
===
RCS file: /cvs/src/sys/netinet/ip_ipcomp.c,v
retrieving revision 1.87
diff -u -p -r1.87 ip_ipcomp.c
--- sys/netinet/ip_ipcomp.c 11 Nov 2021 18:08:18 -  1.87
+++ sys/netinet/ip_ipcomp.c 20 Nov 2021 23:32:57 -
@@ -201,6 +201,7 @@ ipcomp_input(struct mbuf **mp, struct td
(tdb->tdb_cur_bytes >= tdb->tdb_exp_bytes)) {
pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
tdb_delete(tdb);
+   ipsecstat_inc(ipsec_exctdb);
goto drop;
}
/* Notify on soft expiration */
@@ -388,6 +389,7 @@ ipcomp_output(struct mbuf *m, struct tdb
(tdb->tdb_cur_bytes >= tdb->tdb_exp_bytes)) {
pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
tdb_delete(tdb);
+   ipsecstat_inc(ipsec_exctdb);
error = EINVAL;
goto drop;
}
Index: sys/netinet/ip_ipsp.c
===
RCS file: /cvs/src/sys/netinet/ip_ipsp.c,v
retrieving revision 1.251
diff -u -p -r1.251 ip_ipsp.c
--- sys/netinet/ip_ipsp.c   18 Nov 2021 11:04:10 -  1.251
+++ sys/netinet/ip_ipsp.c   20 Nov 2021 23:32:57 -
@@ -652,8 +652,10 @@ tdb_timeout(void *v)
NET_LOCK();
if (tdb->tdb_flags & TDBF_TIMER) {
/* If it's an "invalid" TDB do a silent expiration. */
-   if (!(tdb->tdb_flags & TDBF_INVALID))
+   if (!(tdb->tdb_flags & TDBF_INVALID)) {
pfkeyv2_expire(tdb, SADB_EXT_LIFETIME_HARD);
+   ipsecstat_inc(ipsec_exctdb);
+   }
tdb_delete(tdb);
}
NET_UNLOCK();
@@ -667,8 +669,10 @@ tdb_firstuse(void *v)
NET_LOCK();

Explicitly tag commands in vi(1)

2021-11-20 Thread Simon Branch
Hello!  Here's a diff that adds explicit .Tg macros to vi(1), so that
you can jump to vi or ex commands using -Otag or :t.  This patch
*should* include every command, but I couldn't figure out how to tag the
'!' and ':!' commands; none of these worked:

   .Tg
   .Tg !
   .Tg !\&
   .Tg "!"

This patch doesn't tag special keys either (word erase, literal next,
etc.), because tag names can't contain whitespace and I'd prefer
consistency with the manpage.  One solution would be to say WERASE and
LNEXT instead, which is what ksh(1) does, and this also makes it easier
to lookup what 'word erase' and 'literal next' really are.

   $ man -k Dv=WERASE Dv=LNEXT
   stty(1) - set the options for a terminal device interface
   termios(4) - general terminal line discipline

   $ man 4 termios -Otag=WERASE
   WERASESpecial character on input and is recognized if the ICANON
 flag is set.  Erases the last word ...

If a line is explicitly tagged with the .Tg macro, mandoc removes any
other automatically-created tags with the same name.  So this patch also
explicitly tags the 'print', 'number', and 'list' options and the
[-ceFRrSstvw] flags.

Comments?  OK?

Index: src/usr.bin/vi/docs/USD.doc/vi.man/vi.1
===
RCS file: /cvs/src/usr.bin/vi/docs/USD.doc/vi.man/vi.1,v
retrieving revision 1.79
diff -u -p -r1.79 vi.1
--- src/usr.bin/vi/docs/USD.doc/vi.man/vi.1 8 Mar 2021 02:47:29 -   
1.79
+++ src/usr.bin/vi/docs/USD.doc/vi.man/vi.1 20 Nov 2021 23:05:26 -
@@ -87,6 +87,7 @@ It's probably enough to get you going.
 .Pp
 The following options are available:
 .Bl -tag -width "-w size "
+.Tg c
 .It Fl c Ar cmd
 Execute
 .Ar cmd
@@ -99,19 +100,23 @@ This is the POSIX 1003.2 interface for t
 syntax.
 .Nm nex Ns / Ns Nm nvi
 supports both the old and new syntax.
+.Tg e
 .It Fl e
 Start editing in ex mode, as if the command name were
 .Nm ex .
+.Tg F
 .It Fl F
 Don't copy the entire file when first starting to edit.
 (The default is to make a copy in case someone else modifies
 the file during your edit session.)
+.Tg R
 .It Fl R
 Start editing in read-only mode, as if the command name was
 .Nm view ,
 or the
 .Cm readonly
 option was set.
+.Tg r
 .It Fl r
 Recover the specified files or, if no files are specified,
 list the files that could be recovered.
@@ -119,10 +124,12 @@ If no recoverable files by the specified
 the file is edited as if the
 .Fl r
 option had not been specified.
+.Tg S
 .It Fl S
 Run with the
 .Cm secure
 edit option set, disallowing all access to external programs.
+.Tg s
 .It Fl s
 Enter batch mode; applicable only to
 .Nm ex
@@ -137,14 +144,17 @@ This is the POSIX 1003.2 interface for t
 argument.
 .Nm nex Ns / Ns Nm nvi
 supports both the old and new syntax.
+.Tg t
 .It Fl t Ar tag
 Start editing at the specified
 .Ar tag
 (see
 .Xr ctags 1 ) .
+.Tg v
 .It Fl v
 Start editing in vi mode, as if the command name was
 .Nm vi .
+.Tg w
 .It Fl w Ar size
 Set the initial window size to the specified number of lines.
 .El
@@ -553,6 +563,7 @@ and considered part of the motion.
 .Pp
 .Bl -tag -width Ds -compact
 .It Xo
+.Tg control-A
 .Aq Cm control-A
 .Xc
 Search forward
@@ -560,6 +571,7 @@ for the word starting at the cursor posi
 .Pp
 .It Xo
 .Op Ar count
+.Tg control-B
 .Aq Cm control-B
 .Xc
 Page backwards
@@ -569,6 +581,7 @@ Two lines of overlap are maintained, if 
 .Pp
 .It Xo
 .Op Ar count
+.Tg control-D
 .Aq Cm control-D
 .Xc
 Scroll forward
@@ -587,6 +600,7 @@ command, scroll half the number of lines
 .Pp
 .It Xo
 .Op Ar count
+.Tg control-E
 .Aq Cm control-E
 .Xc
 Scroll forward
@@ -595,6 +609,7 @@ lines, leaving the current line and colu
 .Pp
 .It Xo
 .Op Ar count
+.Tg control-F
 .Aq Cm control-F
 .Xc
 Page forward
@@ -602,6 +617,7 @@ Page forward
 screens.
 Two lines of overlap are maintained, if possible.
 .Pp
+.Tg control-G
 .It Aq Cm control-G
 Display the following file information:
 the file name (as given to
@@ -614,10 +630,12 @@ and the current line number as a percent
 .Pp
 .It Xo
 .Op Ar count
+.Tg control-H
 .Aq Cm control-H
 .Xc
 .It Xo
 .Op Ar count
+.Tg
 .Cm h
 .Xc
 Move the cursor back
@@ -626,30 +644,37 @@ characters in the current line.
 .Pp
 .It Xo
 .Op Ar count
+.Tg control-J
 .Aq Cm control-J
 .Xc
 .It Xo
 .Op Ar count
+.Tg control-N
 .Aq Cm control-N
 .Xc
 .It Xo
 .Op Ar count
+.Tg
 .Cm j
 .Xc
 Move the cursor down
 .Ar count
 lines without changing the current column.
 .Pp
+.Tg control-L
 .It Aq Cm control-L
+.Tg control-R
 .It Aq Cm control-R
 Repaint the screen.
 .Pp
 .It Xo
 .Op Ar count
+.Tg control-M
 .Aq Cm control-M
 .Xc
 .It Xo
 .Op Ar count
+.Tg
 .Cm +
 .Xc
 Move the cursor down
@@ -658,21 +683,25 @@ lines to the first non-blank character o
 .Pp
 .It Xo
 .Op Ar count
+.Tg control-P
 .Aq Cm control-P
 .Xc
 .It Xo
 .Op Ar count
+.Tg
 .Cm k
 .Xc
 Move the cursor up
 .Ar count
 lines, without changing the current column.
 .Pp
+.Tg control-T
 .It Aq Cm control-T
 Return to the most 

Re: tee(1): increase read buffer to MAXBSIZE bytes

2021-11-20 Thread Todd C . Miller
On Sat, 20 Nov 2021 11:19:13 -0600, Scott Cheloha wrote:

> > One advantage to using stdio is that
> > it will use the optimal I/O blocksize automatically so you could
> > try using that instead of write(2) and see how it performs.
>
> tee(1) needs to write input to N drains, N >= 1.  If we use stdio that
> means we're doing N additional userspace copies, no?

Yes, that is true.  Right now tee(1) doesn't check for partial
writes which I suppose might happen with a larger buffer.  In
practice, as long as it is less than MAXPHYS it should be OK.

> > > I thought MAXBSIZE was that constant, because we use it elsewhere,
> > > e.g.  in cat(1) and wc(1), but if not then what is the right thing?
> > > Is it BUFSIZ?
> > 
> > BUFSIZ is only 1K on OpenBSD, which is its own problem.
>
> That seems a bit small... hmmm...

Yeah, BUFSIZ on Linux is 8K.  I think most others are still 1K.

 - todd



Re: asr(3): strip AD flag in responses

2021-11-20 Thread Otto Moerbeek
On Sat, Nov 20, 2021 at 06:44:58PM +0100, Florian Obser wrote:

> On 2021-11-20 18:41 +01, Florian Obser  wrote:
> > On 2021-11-20 18:19 +01, Florian Obser  wrote:
> >
> >> +/*
> >> + * Clear AD flag in the answer.
> >> + */
> >> +static void
> >> +clear_ad(struct asr_result *ar)
> >> +{
> >> +  struct asr_dns_header   *h;
> >> +  uint16_t flags;
> >> +
> >> +  h = (struct asr_dns_header *)ar->ar_data;
> >> +  flags = ntohs(h->flags);
> >> +  flags &= ~(AD_MASK);
> >> +  h->flags = htons(flags);
> >> +}
> >> +
> >
> > btw. is it possible that this is not alligned correctly on sparc64?
> >
> > should be do something like (not even compile tested)
> >
> > static void
> > clear_ad(struct asr_result *ar)
> > {
> > struct asr_dns_headerh;
> >
> > memmove(, ar->ar_data, sizeof(h));
> > h.flags = ntohs(h.flags);
> > h.flags &= ~(AD_MASK);
> > h.flags = htons(h.flags);
> > memmove(ar->ar_data, , sizeof(h));
> > }
> >
> 
> memcpy obviously, I was distracted by the copious amount of memmove in
> asr code...

It is not needed to copy the "whole" header just to change the flags.
You could just copy out, modify and copy back the flags field only.

otoh, it's just 12 bytes, so no big deal.

-Otto



Re: Rework UNIX sockets locking to be fine grained

2021-11-20 Thread Matthias Schmidt
Hi Vitaliy,

* Vitaliy Makkoveev wrote:
> Updated diff. Re-lock dances were simplified in the unix(4) sockets
> layer.

I am running this diff and the one before on a Thinkpad T450s which is
my daily driver (office, coding, web, videos, musik, etc) and noticed no
regressions so far.  Suspend/resume cycles included.

Not sure if that fits the tests/testers you need so let me know if I
should look for/do something special.

Cheers

Matthias


OpenBSD 7.0-current (GENERIC.MP) #3: Sat Nov 20 15:12:39 CET 2021
x...@sigma.xosc.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 12765257728 (12173MB)
avail mem = 12362416128 (11789MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0x9cbfd000 (65 entries)
bios0: vendor LENOVO version "JBET73WW (1.37 )" date 08/14/2019
bios0: LENOVO 20BX0049GE
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SLIC ASF! HPET ECDT APIC MCFG SSDT SSDT SSDT SSDT SSDT 
SSDT SSDT SSDT SSDT PCCT SSDT TCPA SSDT UEFI MSDM BATB FPDT UEFI DMAR
acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP2(S4) XHCI(S3) EHC1(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpiec0 at acpi0
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.42 MHz, 06-3d-04
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.16 MHz, 06-3d-04
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 1, core 0, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.17 MHz, 06-3d-04
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 1, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz, 2095.16 MHz, 06-3d-04
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 40 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf800, bus 0-63
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG_)
acpiprt2 at acpi0: bus 2 (EXP1)
acpiprt3 at acpi0: bus 3 (EXP2)
acpiprt4 at acpi0: bus -1 (EXP3)
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
acpicmos0 at acpi0
acpibat0 at acpi0: BAT0 model "45N" serial 16646 type LiP oem "SONY"
acpibat1 at acpi0: BAT1 model "45N1777" serial   410 type LION oem "SANYO"
acpiac0 at acpi0: AC unit online
acpithinkpad0 at acpi0: version 1.0
tpm0 at acpi0 TPM_ 1.2 (TIS) addr 0xfed4/0x5000, device 0x104a rev 0x4e
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"INT340F" at acpi0 not configured
acpicpu0 at acpi0: C3(200@233 mwait.1@0x40), 

Re: asr(3): strip AD flag in responses

2021-11-20 Thread Florian Obser
On 2021-11-20 18:41 +01, Florian Obser  wrote:
> On 2021-11-20 18:19 +01, Florian Obser  wrote:
>
>> +/*
>> + * Clear AD flag in the answer.
>> + */
>> +static void
>> +clear_ad(struct asr_result *ar)
>> +{
>> +struct asr_dns_header   *h;
>> +uint16_t flags;
>> +
>> +h = (struct asr_dns_header *)ar->ar_data;
>> +flags = ntohs(h->flags);
>> +flags &= ~(AD_MASK);
>> +h->flags = htons(flags);
>> +}
>> +
>
> btw. is it possible that this is not alligned correctly on sparc64?
>
> should be do something like (not even compile tested)
>
> static void
> clear_ad(struct asr_result *ar)
> {
>   struct asr_dns_headerh;
>
> memmove(, ar->ar_data, sizeof(h));
> h.flags = ntohs(h.flags);
> h.flags &= ~(AD_MASK);
> h.flags = htons(h.flags);
> memmove(ar->ar_data, , sizeof(h));
> }
>

memcpy obviously, I was distracted by the copious amount of memmove in
asr code...

-- 
I'm not entirely sure you are real.



Re: asr(3): strip AD flag in responses

2021-11-20 Thread Florian Obser
On 2021-11-20 18:19 +01, Florian Obser  wrote:

> +/*
> + * Clear AD flag in the answer.
> + */
> +static void
> +clear_ad(struct asr_result *ar)
> +{
> + struct asr_dns_header   *h;
> + uint16_t flags;
> +
> + h = (struct asr_dns_header *)ar->ar_data;
> + flags = ntohs(h->flags);
> + flags &= ~(AD_MASK);
> + h->flags = htons(flags);
> +}
> +

btw. is it possible that this is not alligned correctly on sparc64?

should be do something like (not even compile tested)

static void
clear_ad(struct asr_result *ar)
{
struct asr_dns_headerh;

memmove(, ar->ar_data, sizeof(h));
h.flags = ntohs(h.flags);
h.flags &= ~(AD_MASK);
h.flags = htons(h.flags);
memmove(ar->ar_data, , sizeof(h));
}

-- 
I'm not entirely sure you are real.



Re: asr(3): strip AD flag in responses

2021-11-20 Thread Florian Obser
On 2021-11-20 17:05 +01, Jeremie Courreges-Anglas  wrote:
> First, I'm happy to this subject considered again, even if I don't use
> DNSSEC these days I think it makes sense to provide this support in libc.
>
> On Sat, Nov 20 2021, Florian Obser  wrote:
>> On 2021-11-20 14:40 +01, Otto Moerbeek  wrote:
>>> On Sat, Nov 20, 2021 at 12:20:32PM +0100, Florian Obser wrote:
 diff --git share/man/man5/resolv.conf.5 share/man/man5/resolv.conf.5
 index 8d3b91c0832..ac64d3e6fd6 100644
 --- share/man/man5/resolv.conf.5
 +++ share/man/man5/resolv.conf.5
 @@ -259,6 +259,12 @@ first as an absolute name before any search list 
 elements are appended to it.
  .It Cm tcp
  Forces the use of TCP for queries.
  Normal behaviour is to query via UDP but fall back to TCP on failure.
 +.It Cm trust-ad
 +Request DNSSEC validated data from the nameservers and preserve the
 +Authentic Data (AD) flag in responses.
>>>
 +Otherwise the Authentic Data (AD) flag is removed from responses.
>>>
>>> This is not what happens (though the DNS header itself is not exposed
>>> in the API). Maybe describe it as:
>>>
>>
>> That is a very good point. And hints at that the diff is not complete.
>> I think we should fiddle with the header. I think res_send_async_run()
>> should unset the AD flag in ASR_STATE_PACKET. The api exposes the raw
>> packet in the res_query(3) family of functions. And this is actually
>> what glibc does.
>
> Indeed, my previous proposal used to do this (I don't remember if
> I tested the invalidation though).
>
>>> Request DNSSEC validated data from the nameservers and evaluate the AD
>>> flag in responses.
>>>
 +The nameservers and the network path to them must be trusted.
>>>
>>> Maybe:
>>>
>>> Only set this flag if the nameservers and the network paths to them are
>>> trusted.
>>>
>>
>> I wanted to focus less on the technical details (AD flag) and more on
>> what this means. Of course I failed at that ;)
>> I think we should mention the AD flag so that people who know how DNSSEC
>> works can find it, but we need to better explain when random user should
>> set the flag (never!).
>>
>> I'll rework the diff.
>>
 +This is the default for nameservers on localhost.
  .El
  .El
  .Pp
>
>
> Here's a refreshed version of the diff initially sent in this thread:
>
>   https://marc.info/?l=openbsd-tech=159567881530990=2
>
> It's quite similar to Florian's diff so I'd like to re-submit it.
>
>
> I like a lot the "automatic localhost trust" from Florian so
> I incorporated his improvement here, but I can leave it out from at
> commit time.  In the proposal below resolv.conf(5) says:
>
>   +This option should be used only if the transport between the
>   +.Ic nameserver
>   +and the resolver is secure.
>   +It is enabled by default if
>   +.Nm resolv.conf
>   +only lists nameservers on localhost.
>
> which I think is more accurate regarding nameservers on localhost.
> The wording in res_init(3) is unchanged from my first proposal, it
> should also be updated if localhost is to be automatically trusted.
>
> The diff also adds trust-ad support in asr_debug.c.
>
> How do you folks like this?
>
> Index: include/resolv.h
> ===
> RCS file: /home/cvs/src/include/resolv.h,v
> retrieving revision 1.22
> diff -u -p -r1.22 resolv.h
> --- include/resolv.h  14 Jan 2019 06:23:06 -  1.22
> +++ include/resolv.h  20 Nov 2021 14:24:08 -
> @@ -186,6 +186,8 @@ struct __res_state_ext {
>  #define  RES_INSECURE2   0x0800  /* type 2 security disabled */
>  #define  RES_NOALIASES   0x1000  /* shuts off HOSTALIASES 
> feature */
>  #define  RES_USE_INET6   0x2000  /* use/map IPv6 in 
> gethostbyname() */
> +/* GNU extension: use higher bit to avoid conflict with ISC use */
> +#define  RES_TRUSTAD 0x8000  /* query with AD, trust AD in 
> reply */
>  /* KAME extensions: use higher bit to avoid conflict with ISC use */
>  #define  RES_USE_EDNS0   0x4000  /* use EDNS0 */
>  /* DNSSEC extensions: use higher bit to avoid conflict with ISC use */
> Index: share/man/man5/resolv.conf.5
> ===
> RCS file: /home/cvs/src/share/man/man5/resolv.conf.5,v
> retrieving revision 1.62
> diff -u -p -r1.62 resolv.conf.5
> --- share/man/man5/resolv.conf.5  24 Aug 2021 07:30:32 -  1.62
> +++ share/man/man5/resolv.conf.5  20 Nov 2021 15:58:39 -
> @@ -259,6 +259,18 @@ first as an absolute name before any sea
>  .It Cm tcp
>  Forces the use of TCP for queries.
>  Normal behaviour is to query via UDP but fall back to TCP on failure.
> +.It Cm trust-ad
> +Sets the AD flag in DNS queries, and preserve the value of the AD flag
> +in DNS replies (this flag is stripped by default).
> +Useful when applications want to ensure that the DNS resources they
> +look up have been signed with DNSSEC and 

Re: tee(1): increase read buffer to MAXBSIZE bytes

2021-11-20 Thread Scott Cheloha
On Fri, Nov 19, 2021 at 08:38:27PM -0700, Todd C. Miller wrote:
> On Fri, 19 Nov 2021 21:31:12 -0600, Scott Cheloha wrote:
> 
> > Is there a nicer way to pick a "reasonable" buffer size when we just
> > want to move as many bytes as possible on a given platform without
> > hogging the machine?
> 
> Not really.  But I don't think you need to worry about "hogging the
> machine" with a 64K buffer.

Right.

> One advantage to using stdio is that
> it will use the optimal I/O blocksize automatically so you could
> try using that instead of write(2) and see how it performs.

tee(1) needs to write input to N drains, N >= 1.  If we use stdio that
means we're doing N additional userspace copies, no?

> > I thought MAXBSIZE was that constant, because we use it elsewhere,
> > e.g.  in cat(1) and wc(1), but if not then what is the right thing?
> > Is it BUFSIZ?
> 
> BUFSIZ is only 1K on OpenBSD, which is its own problem.

That seems a bit small... hmmm...



Re: IPsec tdb ref counting

2021-11-20 Thread Alexander Bluhm
New tdb refcounting diff.

I delete and unref the timeouts earlier and fixed some leaks.  At
least on my machine it does not crash and tcp InUse is 0 after
ipsecctl -F .

Please test.

mvs:  Are some of your concerns resolved by deleting the tdb_reaper() ?

bluhm

Index: net/if_bridge.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_bridge.c,v
retrieving revision 1.358
diff -u -p -r1.358 if_bridge.c
--- net/if_bridge.c 11 Nov 2021 18:08:17 -  1.358
+++ net/if_bridge.c 20 Nov 2021 15:14:54 -
@@ -1567,20 +1567,28 @@ bridge_ipsec(struct ifnet *ifp, struct e
tdb->tdb_xform != NULL) {
if (tdb->tdb_first_use == 0) {
tdb->tdb_first_use = gettime();
-   if (tdb->tdb_flags & TDBF_FIRSTUSE)
-   timeout_add_sec(>tdb_first_tmo,
-   tdb->tdb_exp_first_use);
-   if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE)
-   timeout_add_sec(>tdb_sfirst_tmo,
-   tdb->tdb_soft_first_use);
+   if (tdb->tdb_flags & TDBF_FIRSTUSE) {
+   if (timeout_add_sec(
+   >tdb_first_tmo,
+   tdb->tdb_exp_first_use))
+   tdb_ref(tdb);
+   }
+   if (tdb->tdb_flags & TDBF_SOFT_FIRSTUSE) {
+   if (timeout_add_sec(
+   >tdb_sfirst_tmo,
+   tdb->tdb_soft_first_use))
+   tdb_ref(tdb);
+   }
}
 
prot = (*(tdb->tdb_xform->xf_input))(, tdb, hlen,
off);
+   tdb_unref(tdb);
if (prot != IPPROTO_DONE)
ip_deliver(, , prot, af);
return (1);
} else {
+   tdb_unref(tdb);
  skiplookup:
/* XXX do an input policy lookup */
return (0);
Index: net/if_pfsync.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_pfsync.c,v
retrieving revision 1.298
diff -u -p -r1.298 if_pfsync.c
--- net/if_pfsync.c 11 Nov 2021 12:35:01 -  1.298
+++ net/if_pfsync.c 19 Nov 2021 13:04:58 -
@@ -1325,11 +1325,13 @@ pfsync_update_net_tdb(struct pfsync_tdb 
/* Neither replay nor byte counter should ever decrease. */
if (pt->rpl < tdb->tdb_rpl ||
pt->cur_bytes < tdb->tdb_cur_bytes) {
+   tdb_unref(tdb);
goto bad;
}
 
tdb->tdb_rpl = pt->rpl;
tdb->tdb_cur_bytes = pt->cur_bytes;
+   tdb_unref(tdb);
}
return;
 
Index: net/pfkeyv2.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/pfkeyv2.c,v
retrieving revision 1.221
diff -u -p -r1.221 pfkeyv2.c
--- net/pfkeyv2.c   25 Oct 2021 18:25:01 -  1.221
+++ net/pfkeyv2.c   20 Nov 2021 15:14:54 -
@@ -1043,8 +1043,11 @@ pfkeyv2_sa_flush(struct tdb *tdb, void *
 {
if (!(*((u_int8_t *) satype_vp)) ||
tdb->tdb_satype == *((u_int8_t *) satype_vp)) {
+   /* keep in sync with tdb_delete() */
tdb_unlink_locked(tdb);
-   tdb_free(tdb);
+   tdb_unbundle(tdb);
+   tdb_deltimeouts(tdb);
+   tdb_unref(tdb);
}
return (0);
 }
@@ -1327,7 +1330,7 @@ pfkeyv2_send(struct socket *so, void *me
 
if ((rval = pfkeyv2_get_proto_alg(newsa->tdb_satype,
>tdb_sproto, ))) {
-   tdb_free(freeme);
+   tdb_unref(freeme);
freeme = NULL;
NET_UNLOCK();
goto ret;
@@ -1363,7 +1366,7 @@ pfkeyv2_send(struct socket *so, void *me
headers[SADB_X_EXT_DST_MASK],
headers[SADB_X_EXT_PROTOCOL],
headers[SADB_X_EXT_FLOW_TYPE]))) {
-   tdb_free(freeme);
+   tdb_unref(freeme);
freeme = NULL;
NET_UNLOCK();
goto ret;
@@ -1386,7 +1389,7 @@ pfkeyv2_send(struct socket *so, void *me
   

Re: urndis0: IOERROR

2021-11-20 Thread Mikhail
Comparing Windows and OpenBSD tcpdumps I noticed some differences:

1) In REMOTE_NDIS_INITIALIZE_MSG (I patched the kernel to send it before
getting MAC address and with proper minor version) bInterfaceClass on
OpenBSD is set to Unknown (0x), on Windows it's properly set to
Wireless Controller (0xe0).

2) The control transfer stage for this message on OpenBSD is set to
Data, on Windows it is Setup.

3) The answer for the message on Windows is with USBD_STATUS_SUCCESS
(0x), on OpenBSD it's Unknown (0x000d).

4) The answer for the message on Windows contains "Control transfer
stage: Complete" message, on OpenBSD it's set to Status.

Maybe someone can prompt me:

1) Does those differences matter at all?

2) Where to look regarding changing bInterfaceClass for outgoing
messages? I can see it's properly set in urndis driver (line 133 of
if_urndis.c), but not passed anywhere down to USB stack.

On Sun, Nov 14, 2021 at 02:39:33PM +0300, Mikhail wrote:
> On Sat, Nov 13, 2021 at 08:27:21PM +0300, Mikhail wrote:
> > Hello, I get aforesaid error when trying to plug in my 4G usb modem, it
> > works well on another laptop with windows 10.
> > 
> > I enabled debug info, but seem the failure somewhere deeper in usb stack
> > and I wasn't able to catch it, can someone advice me on further
> > debugging efforts?
> 
> I did some further investigation with wireshark - in attachment two
> captures - from windows (modem is working) and from openbsd (not
> working).
> 
> I also found some differences in behavior of the device attachment and
> processing. According to RNDIS specs[1]:
> 
> 1) MinorVersion of REMOTE_NDIS_INITIALIZE_MSG must be set to 0x
>(paragraph 2.2.2), but in our code it's set to 1:
> 
> sys/dev/usb/if_urndis.c:
>  459 msg->rm_ver_minor = htole32(1);
> 
> 2) The REMOTE_NDIS_INITIALIZE_MSG message must come first, but in
>OpenBSD first message is getting hardware address (same file):
> 
> 1462 if (urndis_ctrl_query(sc, OID_802_3_PERMANENT_ADDRESS, NULL, 0,
> 1463 , ) != RNDIS_STATUS_SUCCESS) {
> 1464 printf(": unable to get hardware address\n");
> 1465 splx(s);
> 1466 return;
> 1467 }
> 
> In yota-windows.pcapng the REMOTE_NDIS_INITIALIZE_MSG is in frame 27.
> 
> In yota-openbsd.pcapng OID_802_3_PERMANENT_ADDRESS is in fram 290.
> 
> Maybe someone with USB protocol understanding can take a look at the
> captures and note the problems in device responses?
> 
> I also tried latest snapshot, the problem still persist.
> 
> [1] - 
> https://winprotocoldoc.blob.core.windows.net/productionwindowsarchives/WinArchive/%5BMS-RNDIS%5D.pdf
> 
> 


yota-patched2.pcapng
Description: Binary data


Re: asr(3): strip AD flag in responses

2021-11-20 Thread Jeremie Courreges-Anglas


First, I'm happy to this subject considered again, even if I don't use
DNSSEC these days I think it makes sense to provide this support in libc.

On Sat, Nov 20 2021, Florian Obser  wrote:
> On 2021-11-20 14:40 +01, Otto Moerbeek  wrote:
>> On Sat, Nov 20, 2021 at 12:20:32PM +0100, Florian Obser wrote:
>>> diff --git share/man/man5/resolv.conf.5 share/man/man5/resolv.conf.5
>>> index 8d3b91c0832..ac64d3e6fd6 100644
>>> --- share/man/man5/resolv.conf.5
>>> +++ share/man/man5/resolv.conf.5
>>> @@ -259,6 +259,12 @@ first as an absolute name before any search list 
>>> elements are appended to it.
>>>  .It Cm tcp
>>>  Forces the use of TCP for queries.
>>>  Normal behaviour is to query via UDP but fall back to TCP on failure.
>>> +.It Cm trust-ad
>>> +Request DNSSEC validated data from the nameservers and preserve the
>>> +Authentic Data (AD) flag in responses.
>>
>>> +Otherwise the Authentic Data (AD) flag is removed from responses.
>>
>> This is not what happens (though the DNS header itself is not exposed
>> in the API). Maybe describe it as:
>>
>
> That is a very good point. And hints at that the diff is not complete.
> I think we should fiddle with the header. I think res_send_async_run()
> should unset the AD flag in ASR_STATE_PACKET. The api exposes the raw
> packet in the res_query(3) family of functions. And this is actually
> what glibc does.

Indeed, my previous proposal used to do this (I don't remember if
I tested the invalidation though).

>> Request DNSSEC validated data from the nameservers and evaluate the AD
>> flag in responses.
>>
>>> +The nameservers and the network path to them must be trusted.
>>
>> Maybe:
>>
>> Only set this flag if the nameservers and the network paths to them are
>> trusted.
>>
>
> I wanted to focus less on the technical details (AD flag) and more on
> what this means. Of course I failed at that ;)
> I think we should mention the AD flag so that people who know how DNSSEC
> works can find it, but we need to better explain when random user should
> set the flag (never!).
>
> I'll rework the diff.
>
>>> +This is the default for nameservers on localhost.
>>>  .El
>>>  .El
>>>  .Pp


Here's a refreshed version of the diff initially sent in this thread:

  https://marc.info/?l=openbsd-tech=159567881530990=2

It's quite similar to Florian's diff so I'd like to re-submit it.


I like a lot the "automatic localhost trust" from Florian so
I incorporated his improvement here, but I can leave it out from at
commit time.  In the proposal below resolv.conf(5) says:

  +This option should be used only if the transport between the
  +.Ic nameserver
  +and the resolver is secure.
  +It is enabled by default if
  +.Nm resolv.conf
  +only lists nameservers on localhost.

which I think is more accurate regarding nameservers on localhost.
The wording in res_init(3) is unchanged from my first proposal, it
should also be updated if localhost is to be automatically trusted.

The diff also adds trust-ad support in asr_debug.c.

How do you folks like this?


Index: include/resolv.h
===
RCS file: /home/cvs/src/include/resolv.h,v
retrieving revision 1.22
diff -u -p -r1.22 resolv.h
--- include/resolv.h14 Jan 2019 06:23:06 -  1.22
+++ include/resolv.h20 Nov 2021 14:24:08 -
@@ -186,6 +186,8 @@ struct __res_state_ext {
 #defineRES_INSECURE2   0x0800  /* type 2 security disabled */
 #defineRES_NOALIASES   0x1000  /* shuts off HOSTALIASES 
feature */
 #defineRES_USE_INET6   0x2000  /* use/map IPv6 in 
gethostbyname() */
+/* GNU extension: use higher bit to avoid conflict with ISC use */
+#defineRES_TRUSTAD 0x8000  /* query with AD, trust AD in 
reply */
 /* KAME extensions: use higher bit to avoid conflict with ISC use */
 #defineRES_USE_EDNS0   0x4000  /* use EDNS0 */
 /* DNSSEC extensions: use higher bit to avoid conflict with ISC use */
Index: share/man/man5/resolv.conf.5
===
RCS file: /home/cvs/src/share/man/man5/resolv.conf.5,v
retrieving revision 1.62
diff -u -p -r1.62 resolv.conf.5
--- share/man/man5/resolv.conf.524 Aug 2021 07:30:32 -  1.62
+++ share/man/man5/resolv.conf.520 Nov 2021 15:58:39 -
@@ -259,6 +259,18 @@ first as an absolute name before any sea
 .It Cm tcp
 Forces the use of TCP for queries.
 Normal behaviour is to query via UDP but fall back to TCP on failure.
+.It Cm trust-ad
+Sets the AD flag in DNS queries, and preserve the value of the AD flag
+in DNS replies (this flag is stripped by default).
+Useful when applications want to ensure that the DNS resources they
+look up have been signed with DNSSEC and validated by the
+.Ic nameserver .
+This option should be used only if the transport between the
+.Ic nameserver
+and the resolver is secure.
+It is enabled by default if
+.Nm resolv.conf
+only lists nameservers on 

Re: uhidev: claim multiple report ids [3/N]

2021-11-20 Thread Damien Couderc

Le 12/11/2021 à 09:11, Anton Lindqvist a écrit :

On Thu, Nov 11, 2021 at 05:09:35PM -0800, Greg Steuck wrote:

Anton Lindqvist  writes:


On Thu, Nov 11, 2021 at 03:29:15PM +0100, Anton Lindqvist wrote:

Hi,
The second attempt to solve the uhidev claim multiple report ids
conflict didn't work out either as it broke fido(4). Signalling claim
multiple report ids to the match routines using the report id does not
work as all 256 values already have semantic meaning. I instead want to
use `uha->claimed != NULL' to signal that multiple report ids can be
claimed. Before doing so, refactor in order to make an upcoming diff
with the actual fix significantly smaller.

No intended^W functional change.

Comments? OK?


... and here's the actual fix applied on top of the previous diff.


The pair of diffs seems to work for me (fido remains operational unlike
the previous iteration). There's a minor change in dmesg output which is
not otherwise consequential:


Thanks, some drivers (ums and ukbd for instance) are still missing a
UHIDEV_CLAIM_MULTIPLE_REPORTID conditional. That can be fixed later on.
For now, keep using 255 as the report id when claiming multiple report
ids.

Here's a new iteration of the second diff. Another round of testing
would be much appreciated.

diff --git sys/dev/usb/uhidev.h sys/dev/usb/uhidev.h
index 86217fb8880..b8daf014662 100644
--- sys/dev/usb/uhidev.h
+++ sys/dev/usb/uhidev.h
@@ -75,12 +75,11 @@ struct uhidev_attach_arg {
struct usb_attach_arg   *uaa;
struct uhidev_softc *parent;
uint8_t  reportid;
-   uint8_t  nreports;
+   u_intnreports;
uint8_t *claimed;
  };
  
-#define UHIDEV_CLAIM_MULTIPLE_REPORTID(u) \

-   ((u)->reportid == __UHIDEV_CLAIM_MULTIPLE_REPORTID)
+#define UHIDEV_CLAIM_MULTIPLE_REPORTID(u)  ((u)->claimed != NULL)
  #define   __UHIDEV_CLAIM_MULTIPLE_REPORTID255 /* XXX */
  
  int uhidev_report_type_conv(int);




Hi,
It works great for me, upd is not crashing with my UPS and unplugging is 
fine too in -current and -stable.


Thanks,
Damien



Re: snmpd: tweak listen on

2021-11-20 Thread Stuart Henderson
On 2021/11/20 10:20, Martijn van Duren wrote:
> On Sun, 2021-11-14 at 22:30 +0100, Sebastian Benoit wrote:
> > If there is no obvious reason (i.e. be different because you need it for a
> > specific feature) why not to use the same host*() function as other parse.y?
> > it would be better to stay in sync with otehrr daemons. That way if there is
> > an issue in one daemon, we can fix it in all of them.
> > 
> > Or, to turn the argument around: if you have found a way to improve those
> > functions in parse.y, you could push for your changes to be applied to all
> > parse.y that use the same function.
> > 
> > This applies to other parse.y functions too. The more they deviate, the
> > harder maintanance becomes.
> 
> This is the original message[0] (code committed had some tweaks not in this
> mail).
> In there I mention reyk's commit message saying that host_* could be merged.
> This commit tried to implement that (but apparently doesn't hold true any
> more, or maybe it never did). Moving away from struct address in host() also
> has the benefit that having different implementations of struct address to
> be more forgiving and it's less code overall.
> 
> Since it took over a year to find this particular edge case I think it could
> be a good idea to push this code to the other daemons as well, but I'm short
> on time at the moment.
> 
> Diff below should fix this particular issue and is easy enough to revert if
> we decide to go for the behaviour change of getaddrinfo proposed in my
> previous mail.

ok

> I'm not afraid of moving the other way in getaddrinfo (that AI_NUMERICHOST
> is also subject to the family statement in resolv.conf), because that would

I don't like that at all. And I don't think it matches resolv.conf's
definition of "family":

 family  Specify which type of Internet protocol family to prefer, if
 a host is reachable using different address families. 

the "if a host is reachable using different address families"
means that this doesn't apply for numeric addresses. (OK there could be
a side-case with CLAT on OS that don't force IPV6_V6ONLY but not on
OpenBSD).



> also break all current host_v6 implementations in parse.y, so that would
> be an overhaul in all parse.y files anyway.
> 
> martijn@
> > 
> > Martijn van Duren(openbsd+t...@list.imperialat.at) on 2021.11.14 00:23:59 
> > +0100:
> > > On Sat, 2021-11-13 at 13:23 +, Stuart Henderson wrote:
> > > > On 2021/08/09 20:55, Martijn van Duren wrote:
> > > > > On Mon, 2021-08-09 at 11:57 +0200, Martijn van Duren wrote:
> > > > > > 
> > > > > > This diff fixes all of the above:
> > > > > > - Allow any to be used resolving to 0.0.0.0 and ::
> > > > > > - Set SO_REUSEADDR on sockets, so we can listen on both any and
> > > > > > ?? localhost
> > > > > > - Document that we listen on any by default
> > > > 
> > > > I've discovered a problem with this, if you have "family inet4" or
> > > > "family inet6" in resolv.conf then startup fails, either with the
> > > > implicit listen:
> > > > 
> > > > snmpd: Unexpected resolving of ::
> > > > 
> > > > or with explicit e.g. "listen on any snmpv3":
> > > > 
> > > > /etc/snmpd.conf:3: invalid address: any
> > > > 
> > > > Config-based workaround is e.g. "listen on 0.0.0.0 snmpv3"
> > > > 
> > > > Should host() use a specific ai_family instead of PF_UNSPEC where we
> > > > already know that it's a v4 or v6 address? Or just do like httpd/parse.y
> > > > where host() tries v4, then v6 if that fails, then dns?
> > > > 
> 
> [0] https://marc.info/?l=openbsd-tech=159838549814986=2
> 
> Index: parse.y
> ===
> RCS file: /cvs/src/usr.sbin/snmpd/parse.y,v
> retrieving revision 1.72
> diff -u -p -r1.72 parse.y
> --- parse.y   25 Oct 2021 11:21:32 -  1.72
> +++ parse.y   20 Nov 2021 09:19:00 -
> @@ -1600,7 +1600,16 @@ host(const char *s, const char *port, in
>   bzero(, sizeof(hints));
>   hints.ai_family = PF_UNSPEC;
>   hints.ai_socktype = type;
> + /*
> +  * Without AI_NUMERICHOST getaddrinfo might not resolve ip addresses
> +  * for families not specified in the "family" statement in resolv.conf.
> +  */
> + hints.ai_flags = AI_NUMERICHOST;
>   error = getaddrinfo(s, port, , );
> + if (error == EAI_NONAME) {
> + hints.ai_flags = 0;
> + error = getaddrinfo(s, port, , );
> + }
>   if (error == EAI_AGAIN || error == EAI_NODATA || error == EAI_NONAME)
>   return 0;
>   if (error) {
> 



Re: asr(3): strip AD flag in responses

2021-11-20 Thread Florian Obser
On 2021-11-20 14:40 +01, Otto Moerbeek  wrote:
> On Sat, Nov 20, 2021 at 12:20:32PM +0100, Florian Obser wrote:
>> diff --git share/man/man5/resolv.conf.5 share/man/man5/resolv.conf.5
>> index 8d3b91c0832..ac64d3e6fd6 100644
>> --- share/man/man5/resolv.conf.5
>> +++ share/man/man5/resolv.conf.5
>> @@ -259,6 +259,12 @@ first as an absolute name before any search list 
>> elements are appended to it.
>>  .It Cm tcp
>>  Forces the use of TCP for queries.
>>  Normal behaviour is to query via UDP but fall back to TCP on failure.
>> +.It Cm trust-ad
>> +Request DNSSEC validated data from the nameservers and preserve the
>> +Authentic Data (AD) flag in responses.
>
>> +Otherwise the Authentic Data (AD) flag is removed from responses.
>
> This is not what happens (though the DNS header itself is not exposed
> in the API). Maybe describe it as:
>

That is a very good point. And hints at that the diff is not complete.
I think we should fiddle with the header. I think res_send_async_run()
should unset the AD flag in ASR_STATE_PACKET. The api exposes the raw
packet in the res_query(3) family of functions. And this is actually
what glibc does.

> Request DNSSEC validated data from the nameservers and evaluate the AD
> flag in responses.
>
>> +The nameservers and the network path to them must be trusted.
>
> Maybe:
>
> Only set this flag if the nameservers and the network paths to them are
> trusted.
>

I wanted to focus less on the technical details (AD flag) and more on
what this means. Of course I failed at that ;)
I think we should mention the AD flag so that people who know how DNSSEC
works can find it, but we need to better explain when random user should
set the flag (never!).

I'll rework the diff.

>> +This is the default for nameservers on localhost.
>>  .El
>>  .El
>>  .Pp
>> 
>> -- 
>> I'm not entirely sure you are real.
>> 
>

-- 
I'm not entirely sure you are real.



Re: diff: improve legibility of structs in several manpages

2021-11-20 Thread Ingo Schwarze
Hi Jan,

sorry that i failed to look at this earlier.

Even with your diff, the style is still not completely consistent.

I think that inside .Bd -literal, the best style is exactly the same
as in source code:

structname{
typename;
type*name;
};

where

 *  can be more than one tab
   if some of the types in the struct are long
 * and  can be two spaces if the struct contains a double pointer
 * and  can be reduced to four spaces if space is tight,
   or a different number of spaces in very unusual cases

Even with your diff, some of the structs still indent the type
using eight spaces instead of a tab and some structs still use
multiple spaces instead of .

That said, i think your diff is an improvement and you should commit it
(OK schwarze@), except that i disagree with your change to ktrace(2).
That one ought to use two tabs for  instead of one tab plus eight
spaces and it should use one tab after "struct timespec" instead of
one space.

On Tue, Oct 26, 2021 at 06:56:10PM +0200, Jan Klemkow wrote:

> This diff harmonises the indentation of struct members and comments in
> several manpages.  Also fixes line wraps of comments on 80 column
> terminals.  General uses tabs for general indentation and 4 spaces on
> tight spots.  Also uses extra space to align pointers and non-pointers
> as we do this on certain places in our source.

Yes, i agree with all of that.

Yours,
  Ingo



Re: asr(3): strip AD flag in responses

2021-11-20 Thread Otto Moerbeek
On Sat, Nov 20, 2021 at 02:40:59PM +0100, Otto Moerbeek wrote:

> On Sat, Nov 20, 2021 at 12:20:32PM +0100, Florian Obser wrote:
> 
> > The Authentic Data (AD) flag indicates that the nameserver validated
> > the response using DNSSEC. For clients to trust this the nameserver
> > and the path to the nameserver must be trusted. In the general case
> > this is not true.
> > 
> > We can trust localhost so we set the AD flag on queries to request
> > validation and preserve the AD flag in answers. (*)
> > 
> > If, and only if, trusted nameservers (that are not on localhost) have
> > been added to resolv.conf and the path to them is secure the trust-ad
> > flag may be used to request validation from them and trust answers with
> > the AD flag set.
> > 
> > The trust-ad option first appeared in glibc 2.31.
> > ( https://gnutoolchain-gerrit.osci.io/r/c/glibc/+/461 and
> > https://man7.org/linux/man-pages/man5/resolv.conf.5.html )
> > 
> > Thomas Habets (thomas at habets.se) pointed out on bugs@ that
> > VerifyHostKeyDNS in ssh only works with unwind (which is good) but
> > only by accident (which is bad).
> > https://marc.info/?t=16371749593=1=2
> > 
> > *) This is for people running unwind, unbound or some other validating
> > resolver on localhost. Yes, it is possible that someone set up some sort
> > of forwarder where they trust the DNS answers but not that they are
> > DNSSEC validated. This feels contrived and a case of DON'T DO THAT!
> > 
> > OK?
> 
> I like this much better than the sketch I posted on bugs@
> 
> Two comment wrt the docs inline.
> 
> Code looks and tests good.
> 
>   -Otto
> 
> > 
> > diff --git include/resolv.h include/resolv.h
> > index fb02483871e..2422deb5484 100644
> > --- include/resolv.h
> > +++ include/resolv.h
> > @@ -191,6 +191,7 @@ struct __res_state_ext {
> >  /* DNSSEC extensions: use higher bit to avoid conflict with ISC use */
> >  #defineRES_USE_DNSSEC  0x2000  /* use DNSSEC using OK bit in 
> > OPT */
> >  #defineRES_USE_CD  0x1000  /* set Checking Disabled flag */
> > +#defineRES_TRUSTAD 0x8000  /* Request AD, keep it in 
> > responses. */
> >  
> >  #define RES_DEFAULT(RES_RECURSE | RES_DEFNAMES | RES_DNSRCH)
> >  
> > diff --git lib/libc/asr/asr.c lib/libc/asr/asr.c
> > index 8bcb61b6000..77bc3854420 100644
> > --- lib/libc/asr/asr.c
> > +++ lib/libc/asr/asr.c
> > @@ -661,7 +661,8 @@ pass0(char **tok, int n, struct asr_ctx *ac)
> > d = strtonum(tok[i] + 6, 1, 16, );
> > if (e == NULL)
> > ac->ac_ndots = d;
> > -   }
> > +   } else if (!strcmp(tok[i], "trust-ad"))
> > +   ac->ac_options |= RES_TRUSTAD;
> > }
> > }
> >  }
> > @@ -672,7 +673,10 @@ pass0(char **tok, int n, struct asr_ctx *ac)
> >  static int
> >  asr_ctx_from_string(struct asr_ctx *ac, const char *str)
> >  {
> > -   char buf[512], *ch;
> > +   struct sockaddr_in6 *sin6;
> > +   struct sockaddr_in  *sin;
> > +   int  i, trustad;
> > +   char buf[512], *ch;
> >  
> > asr_ctx_parse(ac, str);
> >  
> > @@ -702,6 +706,27 @@ asr_ctx_from_string(struct asr_ctx *ac, const char 
> > *str)
> > break;
> > }
> >  
> > +   trustad = 1;
> > +   for (i = 0; i < ac->ac_nscount && trustad; i++) {
> > +   switch (ac->ac_ns[i]->sa_family) {
> > +   case AF_INET:
> > +   sin = (struct sockaddr_in *)ac->ac_ns[i];
> > +   if (sin->sin_addr.s_addr != htonl(INADDR_LOOPBACK))
> > +   trustad = 0;
> > +   break;
> > +   case AF_INET6:
> > +   sin6 = (struct sockaddr_in6 *)ac->ac_ns[i];
> > +   if (!IN6_IS_ADDR_LOOPBACK(>sin6_addr))
> > +   trustad = 0;
> > +   break;
> > +   default:
> > +   trustad = 0;
> > +   break;
> > +   }
> > +   }
> > +   if (trustad)
> > +   ac->ac_options |= RES_TRUSTAD;
> > +
> > return (0);
> >  }
> >  
> > diff --git lib/libc/asr/getrrsetbyname_async.c 
> > lib/libc/asr/getrrsetbyname_async.c
> > index e5e7c23c261..06a998b0381 100644
> > --- lib/libc/asr/getrrsetbyname_async.c
> > +++ lib/libc/asr/getrrsetbyname_async.c
> > @@ -32,7 +32,7 @@
> >  #include "asr_private.h"
> >  
> >  static int getrrsetbyname_async_run(struct asr_query *, struct asr_result 
> > *);
> > -static void get_response(struct asr_result *, const char *, int);
> > +static void get_response(struct asr_result *, const char *, int, int);
> >  
> >  struct asr_query *
> >  getrrsetbyname_async(const char *hostname, unsigned int rdclass,
> > @@ -150,7 +150,8 @@ getrrsetbyname_async_run(struct asr_query *as, struct 
> > asr_result *ar)
> > break;
> > }
> >  

Re: asr(3): strip AD flag in responses

2021-11-20 Thread Otto Moerbeek
On Sat, Nov 20, 2021 at 12:20:32PM +0100, Florian Obser wrote:

> The Authentic Data (AD) flag indicates that the nameserver validated
> the response using DNSSEC. For clients to trust this the nameserver
> and the path to the nameserver must be trusted. In the general case
> this is not true.
> 
> We can trust localhost so we set the AD flag on queries to request
> validation and preserve the AD flag in answers. (*)
> 
> If, and only if, trusted nameservers (that are not on localhost) have
> been added to resolv.conf and the path to them is secure the trust-ad
> flag may be used to request validation from them and trust answers with
> the AD flag set.
> 
> The trust-ad option first appeared in glibc 2.31.
> ( https://gnutoolchain-gerrit.osci.io/r/c/glibc/+/461 and
> https://man7.org/linux/man-pages/man5/resolv.conf.5.html )
> 
> Thomas Habets (thomas at habets.se) pointed out on bugs@ that
> VerifyHostKeyDNS in ssh only works with unwind (which is good) but
> only by accident (which is bad).
> https://marc.info/?t=16371749593=1=2
> 
> *) This is for people running unwind, unbound or some other validating
> resolver on localhost. Yes, it is possible that someone set up some sort
> of forwarder where they trust the DNS answers but not that they are
> DNSSEC validated. This feels contrived and a case of DON'T DO THAT!
> 
> OK?

I like this much better than the sketch I posted on bugs@

Two comment wrt the docs inline.

Code looks and tests good.

-Otto

> 
> diff --git include/resolv.h include/resolv.h
> index fb02483871e..2422deb5484 100644
> --- include/resolv.h
> +++ include/resolv.h
> @@ -191,6 +191,7 @@ struct __res_state_ext {
>  /* DNSSEC extensions: use higher bit to avoid conflict with ISC use */
>  #define  RES_USE_DNSSEC  0x2000  /* use DNSSEC using OK bit in 
> OPT */
>  #define  RES_USE_CD  0x1000  /* set Checking Disabled flag */
> +#define  RES_TRUSTAD 0x8000  /* Request AD, keep it in 
> responses. */
>  
>  #define RES_DEFAULT  (RES_RECURSE | RES_DEFNAMES | RES_DNSRCH)
>  
> diff --git lib/libc/asr/asr.c lib/libc/asr/asr.c
> index 8bcb61b6000..77bc3854420 100644
> --- lib/libc/asr/asr.c
> +++ lib/libc/asr/asr.c
> @@ -661,7 +661,8 @@ pass0(char **tok, int n, struct asr_ctx *ac)
>   d = strtonum(tok[i] + 6, 1, 16, );
>   if (e == NULL)
>   ac->ac_ndots = d;
> - }
> + } else if (!strcmp(tok[i], "trust-ad"))
> + ac->ac_options |= RES_TRUSTAD;
>   }
>   }
>  }
> @@ -672,7 +673,10 @@ pass0(char **tok, int n, struct asr_ctx *ac)
>  static int
>  asr_ctx_from_string(struct asr_ctx *ac, const char *str)
>  {
> - char buf[512], *ch;
> + struct sockaddr_in6 *sin6;
> + struct sockaddr_in  *sin;
> + int  i, trustad;
> + char buf[512], *ch;
>  
>   asr_ctx_parse(ac, str);
>  
> @@ -702,6 +706,27 @@ asr_ctx_from_string(struct asr_ctx *ac, const char *str)
>   break;
>   }
>  
> + trustad = 1;
> + for (i = 0; i < ac->ac_nscount && trustad; i++) {
> + switch (ac->ac_ns[i]->sa_family) {
> + case AF_INET:
> + sin = (struct sockaddr_in *)ac->ac_ns[i];
> + if (sin->sin_addr.s_addr != htonl(INADDR_LOOPBACK))
> + trustad = 0;
> + break;
> + case AF_INET6:
> + sin6 = (struct sockaddr_in6 *)ac->ac_ns[i];
> + if (!IN6_IS_ADDR_LOOPBACK(>sin6_addr))
> + trustad = 0;
> + break;
> + default:
> + trustad = 0;
> + break;
> + }
> + }
> + if (trustad)
> + ac->ac_options |= RES_TRUSTAD;
> +
>   return (0);
>  }
>  
> diff --git lib/libc/asr/getrrsetbyname_async.c 
> lib/libc/asr/getrrsetbyname_async.c
> index e5e7c23c261..06a998b0381 100644
> --- lib/libc/asr/getrrsetbyname_async.c
> +++ lib/libc/asr/getrrsetbyname_async.c
> @@ -32,7 +32,7 @@
>  #include "asr_private.h"
>  
>  static int getrrsetbyname_async_run(struct asr_query *, struct asr_result *);
> -static void get_response(struct asr_result *, const char *, int);
> +static void get_response(struct asr_result *, const char *, int, int);
>  
>  struct asr_query *
>  getrrsetbyname_async(const char *hostname, unsigned int rdclass,
> @@ -150,7 +150,8 @@ getrrsetbyname_async_run(struct asr_query *as, struct 
> asr_result *ar)
>   break;
>   }
>  
> - get_response(ar, ar->ar_data, ar->ar_datalen);
> + get_response(ar, ar->ar_data, ar->ar_datalen,
> + as->as_ctx->ac_options & RES_TRUSTAD);
>   free(ar->ar_data);
>   

Fix ipsp_spd_lookup() for transport mode (was Re: Fix IPsec NAT-T for L2TP/IPsec)

2021-11-20 Thread YASUOKA Masahiko
Hi,

On Wed, 12 May 2021 19:11:09 +0900 (JST)
YASUOKA Masahiko  wrote:
> Radek reported a problem to misc@ that multiple Windows clients behind
> a NAT cannot use a L2TP/IPsec server simultaneously.
> 
> https://marc.info/?t=16099681611=1=2
> 
> There is two problems.  First is pipex(4) doesn't pass the proper
> ipsecflowinfo to ip_output().  Second is the IPsec policy check which
> is done by ipsp_spd_lookup() returns -1 (EINVAL) if the given tdb is
> not cached.  This happens when its flow is shared by another tdb (for
> another client of the same NAT).

This problem is not fixed yet.  The diff for the second problem was
not committed in.  It was to fix the check in ipsp_spd_lookup() by
making a IPsec policy have a list of IDs.

Also my colleague Kawai pointed out there is another problem if there
is a Linux client among with Windows clients behind a NAT.  Windows
uses 1701/udp for its local ID, but the Linux uses ANY/udp for its
local ID.

In the situation, policies will be overlapped.

  (a) Windows:  REMOTE_IP:1701/udp <=> LOCAL_IP:1701/udp
  (b) Linux:REMOTE_IP:ANY/udp  <=> LOCAL_IP:1701/udp
  
Since we use a radix tree for the policies, when rn_match() is used to
find a policy, as it's best match, (b) is never selected.

Let me update the diff.

As for the incomming, we know the tdb when is used.  The diff uses the
tdb to find the proper policy.

As for the outgoing, other than using "ipsecflowinfo" there is no way
to select a proper policy.  So only when "ipsecflowinfo" is used, get
a tdb from the packet flow and the IDs (retributed by the
ipsecflowinfo), then we can find the proper policy by the tdb.

Also the diff skips the IDs check against the policy only if it is
transport mode and using NAT-T.  Since when NAT-T is used for a policy
for transport mode is shared by multiple clients which has a different
IDs, checking the IDs is difficult and I think the checks other than
is enough.

ok?  comments?

Fix some problems when accepting IPsec transport mode connections from
multiple clients behind a NAT.  In the situation, policies can be
overlapped, but previous could not choice a proper policy both for
incoming and outgoing.  To solve this problem, use
tdb->tdb_filter{,mask} to find a proper policy for incoming and find the
tdb by the given ipsecflowinfo and use it for outgoing.  Also skip
checking IDs of the policy since a policy is shared by multiple clients
in the situation.

Index: sys/netinet/ip_ipsp.c
===
RCS file: /cvs/src/sys/netinet/ip_ipsp.c,v
retrieving revision 1.251
diff -u -p -r1.251 ip_ipsp.c
--- sys/netinet/ip_ipsp.c   18 Nov 2021 11:04:10 -  1.251
+++ sys/netinet/ip_ipsp.c   20 Nov 2021 12:42:36 -
@@ -91,6 +91,8 @@ void  tdb_firstuse(void *);
 void   tdb_soft_timeout(void *);
 void   tdb_soft_firstuse(void *);
 inttdb_hash(u_int32_t, union sockaddr_union *, u_int8_t);
+intsockaddr_encap_match(struct sockaddr_encap *,
+   struct sockaddr_encap *, struct sockaddr_encap *);
 
 int ipsec_in_use = 0;
 u_int64_t ipsec_last_added = 0;
@@ -501,6 +503,76 @@ gettdbbysrc(u_int rdomain, union sockadd
 
mtx_leave(_sadb_mtx);
return tdbp;
+}
+
+/*
+ * Get an SA given the flow, the direction, the security protocol type, and
+ * the desired IDs.
+ */
+struct tdb *
+gettdbbyflow(u_int rdomain, int direction, struct sockaddr_encap *senflow,
+u_int8_t sproto, struct ipsec_ids *ids)
+{
+   u_int32_t hashval;
+   struct tdb *tdbp;
+   union sockaddr_union srcdst;
+
+   if (ids == NULL)/* ids is mandatory */
+   return NULL;
+
+   memset(, 0, sizeof(srcdst));
+   switch (senflow->sen_type) {
+   case SENT_IP4:
+   srcdst.sin.sin_len = sizeof(srcdst.sin);
+   srcdst.sin.sin_family = AF_INET;
+   if (direction == IPSP_DIRECTION_OUT)
+   srcdst.sin.sin_addr = senflow->Sen.Sip4.Dst;
+   else
+   srcdst.sin.sin_addr = senflow->Sen.Sip4.Src;
+   break;
+   case SENT_IP6:
+   srcdst.sin6.sin6_len = sizeof(srcdst.sin6);
+   srcdst.sin6.sin6_family = AF_INET6;
+   if (direction == IPSP_DIRECTION_OUT)
+   srcdst.sin6.sin6_addr = senflow->Sen.Sip6.Dst;
+   else
+   srcdst.sin6.sin6_addr = senflow->Sen.Sip6.Src;
+   break;
+   }
+
+   mtx_enter(_sadb_mtx);
+   hashval = tdb_hash(0, , sproto);
+
+   for (tdbp = tdbdst[hashval]; tdbp != NULL; tdbp = tdbp->tdb_dnext)
+   if (tdbp->tdb_sproto == sproto &&
+   tdbp->tdb_rdomain == rdomain &&
+   (tdbp->tdb_flags & TDBF_INVALID) == 0 &&
+   ((direction == IPSP_DIRECTION_OUT &&
+   !memcmp(>tdb_dst, , srcdst.sa.sa_len)) ||
+   (direction == 

asr(3): strip AD flag in responses

2021-11-20 Thread Florian Obser
The Authentic Data (AD) flag indicates that the nameserver validated
the response using DNSSEC. For clients to trust this the nameserver
and the path to the nameserver must be trusted. In the general case
this is not true.

We can trust localhost so we set the AD flag on queries to request
validation and preserve the AD flag in answers. (*)

If, and only if, trusted nameservers (that are not on localhost) have
been added to resolv.conf and the path to them is secure the trust-ad
flag may be used to request validation from them and trust answers with
the AD flag set.

The trust-ad option first appeared in glibc 2.31.
( https://gnutoolchain-gerrit.osci.io/r/c/glibc/+/461 and
https://man7.org/linux/man-pages/man5/resolv.conf.5.html )

Thomas Habets (thomas at habets.se) pointed out on bugs@ that
VerifyHostKeyDNS in ssh only works with unwind (which is good) but
only by accident (which is bad).
https://marc.info/?t=16371749593=1=2

*) This is for people running unwind, unbound or some other validating
resolver on localhost. Yes, it is possible that someone set up some sort
of forwarder where they trust the DNS answers but not that they are
DNSSEC validated. This feels contrived and a case of DON'T DO THAT!

OK?

diff --git include/resolv.h include/resolv.h
index fb02483871e..2422deb5484 100644
--- include/resolv.h
+++ include/resolv.h
@@ -191,6 +191,7 @@ struct __res_state_ext {
 /* DNSSEC extensions: use higher bit to avoid conflict with ISC use */
 #defineRES_USE_DNSSEC  0x2000  /* use DNSSEC using OK bit in 
OPT */
 #defineRES_USE_CD  0x1000  /* set Checking Disabled flag */
+#defineRES_TRUSTAD 0x8000  /* Request AD, keep it in 
responses. */
 
 #define RES_DEFAULT(RES_RECURSE | RES_DEFNAMES | RES_DNSRCH)
 
diff --git lib/libc/asr/asr.c lib/libc/asr/asr.c
index 8bcb61b6000..77bc3854420 100644
--- lib/libc/asr/asr.c
+++ lib/libc/asr/asr.c
@@ -661,7 +661,8 @@ pass0(char **tok, int n, struct asr_ctx *ac)
d = strtonum(tok[i] + 6, 1, 16, );
if (e == NULL)
ac->ac_ndots = d;
-   }
+   } else if (!strcmp(tok[i], "trust-ad"))
+   ac->ac_options |= RES_TRUSTAD;
}
}
 }
@@ -672,7 +673,10 @@ pass0(char **tok, int n, struct asr_ctx *ac)
 static int
 asr_ctx_from_string(struct asr_ctx *ac, const char *str)
 {
-   char buf[512], *ch;
+   struct sockaddr_in6 *sin6;
+   struct sockaddr_in  *sin;
+   int  i, trustad;
+   char buf[512], *ch;
 
asr_ctx_parse(ac, str);
 
@@ -702,6 +706,27 @@ asr_ctx_from_string(struct asr_ctx *ac, const char *str)
break;
}
 
+   trustad = 1;
+   for (i = 0; i < ac->ac_nscount && trustad; i++) {
+   switch (ac->ac_ns[i]->sa_family) {
+   case AF_INET:
+   sin = (struct sockaddr_in *)ac->ac_ns[i];
+   if (sin->sin_addr.s_addr != htonl(INADDR_LOOPBACK))
+   trustad = 0;
+   break;
+   case AF_INET6:
+   sin6 = (struct sockaddr_in6 *)ac->ac_ns[i];
+   if (!IN6_IS_ADDR_LOOPBACK(>sin6_addr))
+   trustad = 0;
+   break;
+   default:
+   trustad = 0;
+   break;
+   }
+   }
+   if (trustad)
+   ac->ac_options |= RES_TRUSTAD;
+
return (0);
 }
 
diff --git lib/libc/asr/getrrsetbyname_async.c 
lib/libc/asr/getrrsetbyname_async.c
index e5e7c23c261..06a998b0381 100644
--- lib/libc/asr/getrrsetbyname_async.c
+++ lib/libc/asr/getrrsetbyname_async.c
@@ -32,7 +32,7 @@
 #include "asr_private.h"
 
 static int getrrsetbyname_async_run(struct asr_query *, struct asr_result *);
-static void get_response(struct asr_result *, const char *, int);
+static void get_response(struct asr_result *, const char *, int, int);
 
 struct asr_query *
 getrrsetbyname_async(const char *hostname, unsigned int rdclass,
@@ -150,7 +150,8 @@ getrrsetbyname_async_run(struct asr_query *as, struct 
asr_result *ar)
break;
}
 
-   get_response(ar, ar->ar_data, ar->ar_datalen);
+   get_response(ar, ar->ar_data, ar->ar_datalen,
+   as->as_ctx->ac_options & RES_TRUSTAD);
free(ar->ar_data);
async_set_state(as, ASR_STATE_HALT);
break;
@@ -255,7 +256,7 @@ static void free_dns_response(struct dns_response *);
 static int count_dns_rr(struct dns_rr *, u_int16_t, u_int16_t);
 
 static void
-get_response(struct asr_result *ar, const char *pkt, int pktlen)
+get_response(struct asr_result *ar, const char *pkt, int pktlen, int trustad)
 

Re: snmpd: tweak listen on

2021-11-20 Thread Martijn van Duren
On Sun, 2021-11-14 at 22:30 +0100, Sebastian Benoit wrote:
> If there is no obvious reason (i.e. be different because you need it for a
> specific feature) why not to use the same host*() function as other parse.y?
> it would be better to stay in sync with otehrr daemons. That way if there is
> an issue in one daemon, we can fix it in all of them.
> 
> Or, to turn the argument around: if you have found a way to improve those
> functions in parse.y, you could push for your changes to be applied to all
> parse.y that use the same function.
> 
> This applies to other parse.y functions too. The more they deviate, the
> harder maintanance becomes.

This is the original message[0] (code committed had some tweaks not in this
mail).
In there I mention reyk's commit message saying that host_* could be merged.
This commit tried to implement that (but apparently doesn't hold true any
more, or maybe it never did). Moving away from struct address in host() also
has the benefit that having different implementations of struct address to
be more forgiving and it's less code overall.

Since it took over a year to find this particular edge case I think it could
be a good idea to push this code to the other daemons as well, but I'm short
on time at the moment.

Diff below should fix this particular issue and is easy enough to revert if
we decide to go for the behaviour change of getaddrinfo proposed in my
previous mail.

I'm not afraid of moving the other way in getaddrinfo (that AI_NUMERICHOST
is also subject to the family statement in resolv.conf), because that would
also break all current host_v6 implementations in parse.y, so that would
be an overhaul in all parse.y files anyway.

martijn@
> 
> Martijn van Duren(openbsd+t...@list.imperialat.at) on 2021.11.14 00:23:59 
> +0100:
> > On Sat, 2021-11-13 at 13:23 +, Stuart Henderson wrote:
> > > On 2021/08/09 20:55, Martijn van Duren wrote:
> > > > On Mon, 2021-08-09 at 11:57 +0200, Martijn van Duren wrote:
> > > > > 
> > > > > This diff fixes all of the above:
> > > > > - Allow any to be used resolving to 0.0.0.0 and ::
> > > > > - Set SO_REUSEADDR on sockets, so we can listen on both any and
> > > > > ?? localhost
> > > > > - Document that we listen on any by default
> > > 
> > > I've discovered a problem with this, if you have "family inet4" or
> > > "family inet6" in resolv.conf then startup fails, either with the
> > > implicit listen:
> > > 
> > > snmpd: Unexpected resolving of ::
> > > 
> > > or with explicit e.g. "listen on any snmpv3":
> > > 
> > > /etc/snmpd.conf:3: invalid address: any
> > > 
> > > Config-based workaround is e.g. "listen on 0.0.0.0 snmpv3"
> > > 
> > > Should host() use a specific ai_family instead of PF_UNSPEC where we
> > > already know that it's a v4 or v6 address? Or just do like httpd/parse.y
> > > where host() tries v4, then v6 if that fails, then dns?
> > > 

[0] https://marc.info/?l=openbsd-tech=159838549814986=2

Index: parse.y
===
RCS file: /cvs/src/usr.sbin/snmpd/parse.y,v
retrieving revision 1.72
diff -u -p -r1.72 parse.y
--- parse.y 25 Oct 2021 11:21:32 -  1.72
+++ parse.y 20 Nov 2021 09:19:00 -
@@ -1600,7 +1600,16 @@ host(const char *s, const char *port, in
bzero(, sizeof(hints));
hints.ai_family = PF_UNSPEC;
hints.ai_socktype = type;
+   /*
+* Without AI_NUMERICHOST getaddrinfo might not resolve ip addresses
+* for families not specified in the "family" statement in resolv.conf.
+*/
+   hints.ai_flags = AI_NUMERICHOST;
error = getaddrinfo(s, port, , );
+   if (error == EAI_NONAME) {
+   hints.ai_flags = 0;
+   error = getaddrinfo(s, port, , );
+   }
if (error == EAI_AGAIN || error == EAI_NODATA || error == EAI_NONAME)
return 0;
if (error) {