Re: key guessing for signify -C

2020-01-20 Thread Ted Unangst
Theo Buehler wrote:
> Two small things for signify -C:
> 
> Contrary to what usage suggests, the -p pubkey argument for signify -C
> is optional in that signify will use the key specified in the untrusted
> comment. In -V mode, the key can be tied down a little by specifying -t.
> 
> Right now, the -t keytype argument is silently ignored in -C mode:
> 
> $ ftp 
> https://cdn.openbsd.org/pub/OpenBSD/6.6/amd64/{INSTALL.amd64,SHA256.sig} 
> >/dev/null
> $ signify -C -t pkg -x SHA256.sig INSTALL.amd64
> Signature Verified
> INSTALL.amd64: OK
> 
> Diff below fixes usage and makes the above command error out with:

yes, that's an oversight. this looks good.



key guessing for signify -C

2020-01-20 Thread Theo Buehler
Two small things for signify -C:

Contrary to what usage suggests, the -p pubkey argument for signify -C
is optional in that signify will use the key specified in the untrusted
comment. In -V mode, the key can be tied down a little by specifying -t.

Right now, the -t keytype argument is silently ignored in -C mode:

$ ftp https://cdn.openbsd.org/pub/OpenBSD/6.6/amd64/{INSTALL.amd64,SHA256.sig} 
>/dev/null
$ signify -C -t pkg -x SHA256.sig INSTALL.amd64
Signature Verified
INSTALL.amd64: OK

Diff below fixes usage and makes the above command error out with:

signify: incorrect keytype: openbsd-66-base.pub is not pkg

while -t base will work

$ signify -C -t base -x SHA256.sig INSTALL.amd64
Signature Verified
INSTALL.amd64: OK

Index: signify.1
===
RCS file: /var/cvs/src/usr.bin/signify/signify.1,v
retrieving revision 1.48
diff -u -p -r1.48 signify.1
--- signify.1   10 Aug 2019 03:56:02 -  1.48
+++ signify.1   8 Nov 2019 17:04:01 -
@@ -24,7 +24,8 @@
 .Nm signify
 .Fl C
 .Op Fl q
-.Fl p Ar pubkey
+.Op Fl p Ar pubkey
+.Op Fl t Ar keytype
 .Fl x Ar sigfile
 .Op Ar
 .Nm signify
Index: signify.c
===
RCS file: /var/cvs/src/usr.bin/signify/signify.c,v
retrieving revision 1.134
diff -u -p -r1.134 signify.c
--- signify.c   22 Dec 2019 06:37:25 -  1.134
+++ signify.c   25 Dec 2019 12:37:08 -
@@ -78,7 +78,7 @@ usage(const char *error)
fprintf(stderr, "%s\n", error);
fprintf(stderr, "usage:"
 #ifndef VERIFYONLY
-   "\t%1$s -C [-q] -p pubkey -x sigfile [file ...]\n"
+   "\t%1$s -C [-q] [-p pubkey] [-t keytype] -x sigfile [file ...]\n"
"\t%1$s -G [-n] [-c comment] -p pubkey -s seckey\n"
"\t%1$s -S [-enz] [-x sigfile] -s seckey -m message\n"
 #endif
@@ -715,13 +715,13 @@ verifychecksums(char *msg, int argc, cha
 }
 
 static void
-check(const char *pubkeyfile, const char *sigfile, int quiet, int argc,
-char **argv)
+check(const char *pubkeyfile, const char *sigfile, const char *keytype,
+int quiet, int argc, char **argv)
 {
unsigned long long msglen;
uint8_t *msg;
 
-   msg = verifyembedded(pubkeyfile, sigfile, quiet, &msglen, NULL);
+   msg = verifyembedded(pubkeyfile, sigfile, quiet, &msglen, keytype);
verifychecksums((char *)msg, argc, argv, quiet);
 
free(msg);
@@ -846,7 +846,7 @@ main(int argc, char **argv)
err(1, "pledge");
if (!sigfile)
usage("must specify sigfile");
-   check(pubkeyfile, sigfile, quiet, argc, argv);
+   check(pubkeyfile, sigfile, keytype, quiet, argc, argv);
return 0;
}
 #endif



Re: [patch] signify's file name parsing broken

2020-01-20 Thread Theo de Raadt
Ted Unangst  wrote:

> MarcusMüller wrote:
> > I've just stumbled across a malfunction in signify: It cannot handle
> > file names that contain a `)` character, when checking a list of hashes
> > generated by `sha256` command line utilities (`sha256sum --tags` on
> > Linux). 
> 
> This fix is unfortunately rather complicated for the problem. Files with ) are
> not used within openbsd, for instance. It may possible to simplify it a bit? 

A similar construct is in bin/md5/md5.c, for the sha256 -C option.
This does not use sscanf, but hand-parses the string.  It uses strrchr
to search for ')'.

I think this should be fixed, but agree the diff seems complicated.



Re: [patch] signify's file name parsing broken

2020-01-20 Thread Ted Unangst
MarcusMüller wrote:
> I've just stumbled across a malfunction in signify: It cannot handle
> file names that contain a `)` character, when checking a list of hashes
> generated by `sha256` command line utilities (`sha256sum --tags` on
> Linux). 

This fix is unfortunately rather complicated for the problem. Files with ) are
not used within openbsd, for instance. It may possible to simplify it a bit? 



Re: bgpd, remove getifaddrs call in RDE

2020-01-20 Thread Claudio Jeker
On Thu, Jan 09, 2020 at 02:45:46PM +0100, Claudio Jeker wrote:
> The RDE needs to know the local v4 and v6 address of a session so that
> nexthop self works. Until now the lookup for the other AF address was done
> in the RDE when the session got established. This diff moves this code
> over to the SE where it fits better. Especially this allows to remove the
> route pledge from the RDE.
> 
> This diff works for me but I would like more tests especially on link
> local IPv6 sessions (the KAME embedded scope curse haunts me again).

ping...

-- 
:wq Claudio

? obj
Index: bgpd.h
===
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.h,v
retrieving revision 1.397
diff -u -p -r1.397 bgpd.h
--- bgpd.h  9 Jan 2020 11:55:25 -   1.397
+++ bgpd.h  9 Jan 2020 13:37:19 -
@@ -656,7 +656,8 @@ struct kif {
 };
 
 struct session_up {
-   struct bgpd_addrlocal_addr;
+   struct bgpd_addrlocal_v4_addr;
+   struct bgpd_addrlocal_v6_addr;
struct bgpd_addrremote_addr;
struct capabilities capa;
u_int32_t   remote_bgpid;
Index: rde.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.498
diff -u -p -r1.498 rde.c
--- rde.c   9 Jan 2020 13:31:52 -   1.498
+++ rde.c   9 Jan 2020 13:37:19 -
@@ -177,7 +177,7 @@ rde_main(int debug, int verbose)
setresuid(pw->pw_uid, pw->pw_uid, pw->pw_uid))
fatal("can't drop privileges");
 
-   if (pledge("stdio route recvfd", NULL) == -1)
+   if (pledge("stdio recvfd", NULL) == -1)
fatal("pledge");
 
signal(SIGTERM, rde_sighdlr);
Index: rde_peer.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde_peer.c,v
retrieving revision 1.2
diff -u -p -r1.2 rde_peer.c
--- rde_peer.c  9 Jan 2020 13:31:52 -   1.2
+++ rde_peer.c  9 Jan 2020 13:37:19 -
@@ -17,9 +17,7 @@
  */
 #include 
 #include 
-#include 
 
-#include 
 #include 
 #include 
 #include 
@@ -306,102 +304,6 @@ rde_up_dump_done(void *ptr, u_int8_t aid
fatal("%s: prefix_dump_new", __func__);
 }
 
-static int
-sa_cmp(struct bgpd_addr *a, struct sockaddr *b)
-{
-   struct sockaddr_in  *in_b;
-   struct sockaddr_in6 *in6_b;
-
-   if (aid2af(a->aid) != b->sa_family)
-   return (1);
-
-   switch (b->sa_family) {
-   case AF_INET:
-   in_b = (struct sockaddr_in *)b;
-   if (a->v4.s_addr != in_b->sin_addr.s_addr)
-   return (1);
-   break;
-   case AF_INET6:
-   in6_b = (struct sockaddr_in6 *)b;
-#ifdef __KAME__
-   /* directly stolen from sbin/ifconfig/ifconfig.c */
-   if (IN6_IS_ADDR_LINKLOCAL(&in6_b->sin6_addr)) {
-   in6_b->sin6_scope_id =
-   ntohs(*(u_int16_t *)&in6_b->sin6_addr.s6_addr[2]);
-   in6_b->sin6_addr.s6_addr[2] =
-   in6_b->sin6_addr.s6_addr[3] = 0;
-   }
-#endif
-   if (bcmp(&a->v6, &in6_b->sin6_addr,
-   sizeof(struct in6_addr)))
-   return (1);
-   break;
-   default:
-   fatal("king bula sez: unknown address family");
-   /* NOTREACHED */
-   }
-
-   return (0);
-}
-
-/*
- * Figure out the local IP addresses most suitable for this session.
- * This looks up the local address of other address family based on
- * the address of the TCP session.
- */
-static int
-peer_localaddrs(struct rde_peer *peer, struct bgpd_addr *laddr)
-{
-   struct ifaddrs  *ifap, *ifa, *match;
-
-   if (getifaddrs(&ifap) == -1)
-   fatal("getifaddrs");
-
-   for (match = ifap; match != NULL; match = match->ifa_next)
-   if (sa_cmp(laddr, match->ifa_addr) == 0)
-   break;
-
-   if (match == NULL) {
-   log_warnx("peer_localaddrs: local address not found");
-   return (-1);
-   }
-
-   for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
-   if (ifa->ifa_addr->sa_family == AF_INET &&
-   strcmp(ifa->ifa_name, match->ifa_name) == 0) {
-   if (ifa->ifa_addr->sa_family ==
-   match->ifa_addr->sa_family)
-   ifa = match;
-   sa2addr(ifa->ifa_addr, &peer->local_v4_addr, NULL);
-   break;
-   }
-   }
-   for (ifa = ifap; ifa != NULL; ifa = ifa->ifa_next) {
-   if (ifa->ifa_addr->sa_family == AF_INET6 &&
-   strcmp(ifa->ifa_name, match->ifa_name) == 0) {
-   /*
-* only accept global scope addresses except explicitly
-

Re: piixpm(4) on ATI SBx00

2020-01-20 Thread Claudio Jeker
On Tue, Jan 07, 2020 at 12:44:59PM +0100, Claudio Jeker wrote:
> On Tue, Jan 07, 2020 at 09:27:50AM +0100, Claudio Jeker wrote:
> > In -current I added support for the additional I2C busses on piixpm(4)
> > now I noticed that on my old AMD system the I2C bus seems to either
> > connect all those 4 busses together (or there is a bug in the driver).
> 
> Looks like this is indeed a problem with the driver.
> 
> With the following diff I now get:
> piixpm0 at pci0 dev 20 function 0 "ATI SBx00 SMBus" rev 0x41: polling
> iic0 at piixpm0
> sdtemp0 at iic0 addr 0x18: stts2002
> sdtemp1 at iic0 addr 0x19: stts2002
> sdtemp2 at iic0 addr 0x1a: stts2002
> sdtemp3 at iic0 addr 0x1b: stts2002
> spdmem0 at iic0 addr 0x50: 4GB DDR3 SDRAM ECC PC3-10600 with thermal sensor
> spdmem1 at iic0 addr 0x51: 4GB DDR3 SDRAM ECC PC3-10600 with thermal sensor
> spdmem2 at iic0 addr 0x52: 4GB DDR3 SDRAM ECC PC3-10600 with thermal sensor
> spdmem3 at iic0 addr 0x53: 4GB DDR3 SDRAM ECC PC3-10600 with thermal sensor
> iic1 at piixpm0   
>   piixpm0: exec: op 1, addr 0x18, cmdlen 1, len 0, flags 0x08: timeout, 
> status 0x1
> 
> iic2 at piixpm0
> iic3 at piixpm0
> 
> Not sure what is up with the timeout on iic @ addr 0x18. If that is seen
> on more systems it may be better to only use the primary bus on SB800
> devices that don't support SB800_PMREG_SMB0SEL.
> 
> Please test and report back.

Picking this up again. Since the second bus causes read timeouts and the
docu mentions that this bus is used for fan and temp control I think it is
better to just not attach the extra busses on these old AMD chipsets.
The additional SB800 busses will only be used if SB800_PMREG_SMB0SELEN
tells us that SMB0SELEN is enabled.

While looking at this I also investigated a bit more into the
PCI_PRODUCT_AMD_HUDSON2_SMB issues. So PCI ID 1022:780b are used by AMD
Bolton and Family 16h model 30h-3fh. AMD Bolton still needs the old layout
while the family 16h chips use the FCH register layout. So merge the if
statements a bit so that AMD Bolton uses the old code path and Family 16h
uses the new. 

Last but not least I was informed that we reversed the meaning of
SB800_SMB_HOSTC_SMI (if SB800_SMB_HOSTC bit 1 is set then use IRQ else
SMI). This is visible in the dmesg line of piixpm.

Please test this since I can't test this properly at the moment.
Works for me on
piixpm0 at pci0 dev 20 function 0 "AMD FCH SMBus" rev 0x61: SMI 

-- 
:wq Claudio

Index: piixpm.c
===
RCS file: /cvs/src/sys/dev/pci/piixpm.c,v
retrieving revision 1.41
diff -u -p -r1.41 piixpm.c
--- piixpm.c9 Jan 2020 14:35:19 -   1.41
+++ piixpm.c21 Jan 2020 01:23:21 -
@@ -158,8 +158,13 @@ piixpm_attach(struct device *parent, str
return;
}
 
+   /*
+* AMD Bolton matches PCI_PRODUCT_AMD_HUDSON2_SMB but
+* uses old register layout. Therefor check PCI_REVISION.
+*/
if (PCI_VENDOR(pa->pa_id) == PCI_VENDOR_AMD &&
-   (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_AMD_HUDSON2_SMB ||
+   ((PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_AMD_HUDSON2_SMB &&
+   PCI_REVISION(pa->pa_class) >= 0x1f) ||
PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_AMD_KERNCZ_SMB)) {
bus_space_write_1(sc->sc_iot, ioh, 0,
AMDFCH41_PM_DECODE_EN);
@@ -170,6 +175,9 @@ piixpm_attach(struct device *parent, str
AMDFCH41_PM_DECODE_EN + 1);
val = bus_space_read_1(sc->sc_iot, ioh, 1) << 8;
base = val;
+
+   sc->sc_is_fch = 1;
+   numbusses = 2;
} else {
/* Read "SmBus0En" */
bus_space_write_1(sc->sc_iot, ioh, 0,
@@ -181,6 +189,14 @@ piixpm_attach(struct device *parent, str
val |= (bus_space_read_1(sc->sc_iot, ioh, 1) << 8);
smb0en = val & SB800_SMB0EN_EN;
base = val & SB800_SMB0EN_BASE_MASK;
+
+   bus_space_write_1(sc->sc_iot, ioh, 0,
+   SB800_PMREG_SMB0SELEN);
+   val = bus_space_read_1(sc->sc_iot, ioh, 1);
+   if (val & SB800_SMB0SELEN_EN) {
+   sc->sc_is_sb800 = 1;
+   numbusses = 4;
+   }
}
 
if (smb0en == 0) {
@@ -190,17 +206,6 @@ piixpm_attach(struct device *parent, str
}
 
sc->sc_sb800_ioh = ioh;
-   if ((PCI_VENDOR(pa->pa_id) == PCI_VENDOR_AMD &&
-   PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_AMD_KERNCZ_SMB) ||
-   (PCI_VENDOR(pa->pa_id) == PCI_VENDOR_AMD &&
-

Re: securelevel.7: Clarify mem(4) semantics

2020-01-20 Thread Theo de Raadt
Klemens Nanni  wrote:

> On Sun, Jan 19, 2020 at 11:50:01AM -0700, Theo de Raadt wrote:
> > Your text seems somewhat backwards, because if you can't open it, it
> > doesn't matter if it is read-only, it is read-not.
> It starts with what holds true unconditionally: the fact that you cannot
> write;  then it describes an additional case that has another condition
> to it.
> 
> Is this version better?

  1 Secure mode
   -   default mode when system is multi-user
   -   securelevel may no longer be lowered except by init
   -   /dev/mem and /dev/kmem cannot be opened
   -   raw disk devices of mounted file systems are read-only
...


The list is "what is blocked".  Changing this to a complicated set of
conditions regarding unrelated conditions makes this difficult to read.

The test is:

switch (minor(dev)) {
case 0:
case 1:
if (securelevel <= 0 || allowkmem)
break;
return (EPERM);

Maybe
/dev/mem and /dev/kmem cannot be opened (unless permitted by sysctl(8)
kern.allowkmem)

Your text about read-only must be incorrect.  That is decided by the
permission of the nodes in the filesystem, not by securelevel or
kern.allowkmem the idea is you can't open it, so you can't read/write/map.

> Index: share/man/man7/securelevel.7
> ===
> RCS file: /cvs/src/share/man/man7/securelevel.7,v
> retrieving revision 1.31
> diff -u -p -r1.31 securelevel.7
> --- share/man/man7/securelevel.7  21 Aug 2019 20:44:09 -  1.31
> +++ share/man/man7/securelevel.7  19 Jan 2020 20:40:30 -
> @@ -66,7 +66,10 @@ securelevel may no longer be lowered exc
>  .Pa /dev/mem
>  and
>  .Pa /dev/kmem
> -cannot be opened
> +can only be opened if the
> +.Va kern.allowkmem
> +.Xr sysctl 8
> +variable is set, both are read-only
>  .It
>  raw disk devices of mounted file systems are read-only
>  .It



Re: bnxt(4), myx(4), vr(4): refill timeouts: timeout_add(..., 0) -> timeout_add(..., 1)

2020-01-20 Thread Hrvoje Popovski
On 20.1.2020. 17:40, Scott Cheloha wrote:
> Appreciate the testing.

np, i like testing network stuff :)

> Given what dlg@ has said in the past I think there should only be a
> performance change in a livelock situation. 

yeah, that could be problem with this testing ...
kern.netlivelocks=6 after few minutes of sending 14Mpps through myx :)

> What metric did you use to compare performance?

i'm generating 64byte udp packets with pktgen through openbsd box (plain
forwarding only)
https://www.kernel.org/doc/Documentation/networking/pktgen.txt

on linux box i'm counting pps on interface from which i'm sending
packets to openbsd box and on interface on which i'm receiving packets
from openbsd box ..

on box with 12 x Xenon CPU E5-2620 v2 @ 2.10GHz, 2400.01 MHz, 06-3e-04
with or without this diff i'm getting 890Kpps of plain forwarding



Naive patch to allow amd(8) to mount NFS with wxallowed

2020-01-20 Thread Andreas Kusalananda Kähäri
I noticed that the following amd(8) map did not actually mount the
remote NFS filesystem with the wxallowed mount option:

/defaults   sublink:=${key}
*   host==eeyore;type:=link;fs:=/vol/local/${host}; \
host!=eeyore;type:=nfs;rhost:=eeyore;rfs:=/vol/local/${host};\
opts:=rw,tcp,intr,soft,nosuid,nodevs,wxallowed;

On a machine "box", this mounts "eeyore:/vol/local/box" as

eeyore:/vol/local/box on /tmp_mnt/eeyore/vol/local/box type nfs (nodev, nosuid, 
v2, tcp, soft, intr, timeo=100)

Looking at src/usr.sbin/amd/amd/mount_fs.c, this appears to be because
only a select set of mount options are being allowed.  These are "ro",
"nodev", "noexec", "nosuid", and "sync".

I tested the attached very naive patch on my amd64-current system, and
it *seems* to work:

eeyore:/vol/local/box on /tmp_mnt/eeyore/vol/local/box type nfs (nodev, nosuid, 
wxallowed, v2, tcp, soft, intr, timeo=100) 

Regards,

-- 
Andreae (Kusalananda) Kähäri
SciLifeLab, NBIS, ICM
Uppsala University, Sweden

.
Index: mount_fs.c
===
RCS file: /extra/cvs/src/usr.sbin/amd/amd/mount_fs.c,v
retrieving revision 1.14
diff -u -p -r1.14 mount_fs.c
--- mount_fs.c  20 Oct 2014 06:55:59 -  1.14
+++ mount_fs.c  20 Jan 2020 22:34:32 -
@@ -50,6 +50,7 @@ struct opt_tab mnt_flags[] = {
{ "noexec", MNT_NOEXEC },
{ "nosuid", MNT_NOSUID },
{ "sync",   MNT_SYNCHRONOUS },
+   { "wxallowed",  MNT_WXALLOWED },
{ 0, 0 }
 };
 


Re: em(4) diff to test

2020-01-20 Thread Hrvoje Popovski
On 20.1.2020. 16:42, Martin Pieuchot wrote:
> Diff below is a refactoring of the actual em(4) code and defines that
> will allows me to present a shorter diff to interrupt multiple CPUs and
> make use of multiple queues.
> 
> It contains the following items:
> 
>   - Abstract the allocation/freeing of TX/RX ring into em_dma_malloc().
> This will ease the introduction of multiple rings.
> 
>   - Split the 82576 variant out of 82575.  The distinction is necessary
> when it comes to setting multiple queues.
> 
>   - Change multiple TX/RX related macro to take an index argument
> corresponding to a ring.  Currently only the index 0 and 1 are used.
> 
>   - Gather and print more stats counters
> 
>   - Switch to using a function, like FreeBSD, to translate 82542
> registers and get rid of a set of defines.
> 
> It has been tested one the models below, I'd like to be sure there isn't
> any fallout with this part before continuing the effort.
> 
>   em0 at pci0 dev 25 function 0 "Intel 82577LM" rev 0x06: msi
>   em0 at pci0 dev 25 function 0 "Intel 82579LM" rev 0x04: msi
>   em0 at pci0 dev 25 function 0 "Intel I218-V" rev 0x03: msi
>   em0 at pci0 dev 25 function 0 Intel I218-LM rev 0x04: msi
>   em0 at pci0 dev 31 function 6 "Intel I219-V" rev 0x21: msi

Hi,

i tested this diff on
em0 at pci7 dev 0 function 0 "Intel I350" rev 0x01: msi
em1 at pci7 dev 0 function 1 "Intel I350" rev 0x01: msi

with lots of ifconfig em up/down and sh netstart and box seems stable



Re: New pseudo-driver dt(4): dynamic tracer

2020-01-20 Thread Martin Pieuchot
On 20/01/20(Mon) 07:19, Mark Kettenis wrote:
> > [...] 
> > I'd appreciate particular review of the following items:
> > 
> >  * Event producer/consumer code which currently needs a mutex.  The
> >current implementation doesn't always use a PCB per-CPU.  Moving
> >to a lockless implementation would be beneficial for recording in
> >deeper places of the kernel.
> 
> I think that should be possible, but I'd have to think about that for
> a bit longer.  Shouldn't prevent this from going in.  And it might
> mean we can't support hppa MP support kernels (but that's not a big
> issue).

The current kernel atomic_* API should be good enough to make such code
usable on hppa.  Obviously, tracing a high number of events might result
in contention leading to the loss of some of them.

The current number of events read & dropped during a btrace(8) recording
is displayed when using '-v'.  For example capturing events every time a
thread gets executed, like in the following, usually result in events
being dropped.  This happens because the rings are consciously undersized
to force the developers to revisit this part of the code once we figure 
out how much we can allocate per-PCB for real use cases.

btrace -ve \
   'tracepoint:sched:on__cpu { printf("%u: run %s(%d)\n",nsecs, comm, pid); }'

...
579539503551390402: run softnet(80542)
^C4747 events read
13 events dropped

> >  * Barriers for enabling/disabling probes.  dt(4) can be opened multiple
> >times allowing multiple processes to gather event in parallel.  I'd
> >like to be sure a close(2) doesn't stop another thread from capturing
> >events.
> 
> Is it absolutely neccessary that we support multiple processes gathering?

It isn't.  However after so many years of MP development I'm still
unable to write such code.  That's why I'm asking for help as I believe
this knowledge might be useful to me and many of us. 

> >  * Read wakeup.  Currently a mutex is used to not loose a wakeup event
> >in dtread(), however this mutex creates lock ordering problems with
> >the SCHED_LOCK() when probes are executed to profile the scheduler.
> 
> Hmm, isn't it possible to avoid this by using sleep_setup() and
> sleep_finish_all() manually?  I might give that a go once this is in.

Thanks.

> > I included changes to MAKEDEV on all archs where the dev number 30 is
> > still free.  I'll pick the closest number for others archs if that's ok. 
> > 
> > The man page doesn't include a list of ioctl(2) as I believe they will
> > change soon.  I'll send another mail for the userland interface.
> > 
> > Ok to continue in-tree?
> 
> Yes!

Thanks :)
 
> Is there a particular reason your changing db_expr_t into long in some
> of the places?  I'm not necessarily against removing that useless
> abstraction, but this seems to be only a partial change.

With the current changes from visa@ to move the stacktrace saving API
outside of DDB it is no longer necessary.  Why might also want to
introduce a variant of the API to be able to skip a fix number of frames
to not always include the ones required by the dt(4) code.  For example
on amd64 for the profile provider used for FlameGraphs:

  dt_prov_profile_enter+0x6e
  hardclock+0x12c
  clockintr+0x59
  intr_handler+0x6e
  Xresume_legacy0+0x1d3
  cpu_idle_cycle+0x1b < start here.
  proc_trampoline+0x1c



Re: bnxt(4), myx(4), vr(4): refill timeouts: timeout_add(..., 0) -> timeout_add(..., 1)

2020-01-20 Thread Scott Cheloha
> On Jan 20, 2020, at 4:47 AM, Hrvoje Popovski  wrote:
> 
> On 16.1.2020. 18:45, Scott Cheloha wrote:
>> Here's a first batch of conversions: rx refill timeouts for bnxt(4),
>> myx(4), and vr(4).  All of these can run during a softclock().  Will
>> changing these to one tick break these drivers?
> 
> Hi all,
> 
> i tried this diff with myx and performance are the same as with plain
> kernel.
> 
> myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x01: msi, model
> 10G-PCIE2-8BL2-2S, address 00:60:dd:45:ba:f8
> myx1 at pci5 dev 0 function 0 "Myricom Z8E" rev 0x01: msi, model
> 10G-PCIE2-8BL2-2S, address 00:60:dd:45:ba:f9

Appreciate the testing.

Given what dlg@ has said in the past I think there should only be a
performance change in a livelock situation.

What metric did you use to compare performance?

I think you could induce a livelock with tcpbench(8) or iperf3 from ports.
CC'd bluhm@, who would know more.

To be clear, I'm of the opinion that we should not be spinning the softclock()
unless the alternative is palpably reduced performance for a significant group
of users.  However, I don't want to tread on network performance without 
analysis
and buy-in from the people who work on that, hence all the CCs.

Probably missing a few CCs, but I figured I'd start with the API and driver
authors.



Re: ospf(6)d: allow "type p2p" globally or per area

2020-01-20 Thread Denis Fondras
On Sun, Jan 19, 2020 at 11:04:16PM +0100, Remi Locherer wrote:
> This makes the interface setting "type p2p" configurable globally or
> per area. ospf(6)d allows this for almost all interface related settings.
> 
> As a side-effect of this diff ospf(6)d -nv prints "type p2p" also for
> point-to-point interfaces like gif or gre. I think this is an advantage
> but I can also change that by re-introducing the iface->p2p variable.
> 
> OK?
> 

diff looks good. Is it really useful to set p2p globally ?

> Remi
> 
> 
> 
> Index: ospf6d/ospf6d.h
> ===
> RCS file: /cvs/src/usr.sbin/ospf6d/ospf6d.h,v
> retrieving revision 1.44
> diff -u -p -r1.44 ospf6d.h
> --- ospf6d/ospf6d.h   3 Jan 2020 17:45:02 -   1.44
> +++ ospf6d/ospf6d.h   12 Jan 2020 21:44:41 -
> @@ -329,7 +329,6 @@ struct iface {
>   u_int8_t if_type;
>   u_int8_t linkstate;
>   u_int8_t priority;
> - u_int8_t p2p;
>   u_int8_t cflags;
>  #define F_IFACE_PASSIVE  0x01
>  #define F_IFACE_CONFIGURED   0x02
> Index: ospf6d/parse.y
> ===
> RCS file: /cvs/src/usr.sbin/ospf6d/parse.y,v
> retrieving revision 1.48
> diff -u -p -r1.48 parse.y
> --- ospf6d/parse.y26 Dec 2019 10:24:18 -  1.48
> +++ ospf6d/parse.y19 Jan 2020 21:51:56 -
> @@ -102,6 +102,7 @@ struct config_defaults {
>   u_int16_t   rxmt_interval;
>   u_int16_t   metric;
>   u_int8_tpriority;
> + u_int8_tp2p;
>  };
>  
>  struct config_defaultsglobaldefs;
> @@ -449,6 +450,9 @@ defaults  : METRIC NUMBER {
>   }
>   defs->rxmt_interval = $2;
>   }
> + | TYPE P2P  {
> + defs->p2p = 1;
> + }
>   ;
>  
>  optnl: '\n' optnl
> @@ -550,6 +554,8 @@ interface : INTERFACE STRING  {
>   iface->metric = defs->metric;
>   iface->priority = defs->priority;
>   iface->cflags |= F_IFACE_CONFIGURED;
> + if (defs->p2p == 1)
> + iface->type = IF_TYPE_POINTOPOINT;
>   iface = NULL;
>   /* interface is always part of an area */
>   defs = &areadefs;
> @@ -566,10 +572,6 @@ interfaceopts_l  : interfaceopts_l interf
>   ;
>  
>  interfaceoptsl   : PASSIVE   { iface->cflags |= 
> F_IFACE_PASSIVE; }
> - | TYPE P2P  {
> - iface->p2p = 1;
> - iface->type = IF_TYPE_POINTOPOINT;
> - }
>   | DEMOTE STRING {
>   if (strlcpy(iface->demote_group, $2,
>   sizeof(iface->demote_group)) >=
> @@ -1034,6 +1036,7 @@ parse_config(char *filename, int opts)
>   defs->rxmt_interval = DEFAULT_RXMT_INTERVAL;
>   defs->metric = DEFAULT_METRIC;
>   defs->priority = DEFAULT_PRIORITY;
> + defs->p2p = 0;
>  
>   conf->spf_delay = DEFAULT_SPF_DELAY;
>   conf->spf_hold_time = DEFAULT_SPF_HOLDTIME;
> Index: ospf6d/printconf.c
> ===
> RCS file: /cvs/src/usr.sbin/ospf6d/printconf.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 printconf.c
> --- ospf6d/printconf.c26 Dec 2019 10:24:18 -  1.9
> +++ ospf6d/printconf.c12 Jan 2020 21:43:06 -
> @@ -1,4 +1,5 @@
> -/*   $OpenBSD: printconf.c,v 1.9 2019/12/26 10:24:18 remi Exp $ */
> +/*  $OpenBSD: printconf.c,v 1.9 2019/12/26 10:24:18 remi Exp $
> +*/
>  
>  /*
>   * Copyright (c) 2004, 2005 Esben Norby 
> @@ -135,7 +136,7 @@ print_iface(struct iface *iface)
>   printf("\t\trouter-priority %d\n", iface->priority);
>   printf("\t\ttransmit-delay %d\n", iface->transmit_delay);
>  
> - if (iface->p2p)
> + if (iface->type == IF_TYPE_POINTOPOINT)
>   printf("\t\ttype p2p\n");
>  
>   printf("\t}\n");
> Index: ospfd/ospfd.c
> ===
> RCS file: /cvs/src/usr.sbin/ospfd/ospfd.c,v
> retrieving revision 1.110
> diff -u -p -r1.110 ospfd.c
> --- ospfd/ospfd.c 23 Nov 2019 15:05:21 -  1.110
> +++ ospfd/ospfd.c 18 Jan 2020 14:02:04 -
> @@ -893,7 +893,6 @@ merge_interfaces(struct area *a, struct 
>   if (i->self)
>   i->self->priority = i->priority;
>   i->flags = xi->flags; /* needed? */
> - i->type = xi->type; /* needed? */
>   i->if_type = xi->if_type; /* needed? */
>   i->linkstate = xi->linkstate; /* needed? */
>  
> @@ -915,11 +914,11 @@ merge_interfaces(struct area *a, struct 
>   if_fsm(i, IF_EVT_UP);
>   

ospf6d: simplify lsa_snap()

2020-01-20 Thread Denis Fondras
No need to pass peerid to lsa_snap()

While at it, remove unused variable.

Index: rde.c
===
RCS file: /cvs/src/usr.sbin/ospf6d/rde.c,v
retrieving revision 1.82
diff -u -p -r1.82 rde.c
--- rde.c   2 Jan 2020 10:16:46 -   1.82
+++ rde.c   20 Jan 2020 09:23:01 -
@@ -345,7 +345,7 @@ rde_dispatch_imsg(int fd, short event, v
if (nbr == NULL)
break;
 
-   lsa_snap(nbr, imsg.hdr.peerid);
+   lsa_snap(nbr);
 
imsg_compose_event(iev_ospfe, IMSG_DB_END, 
imsg.hdr.peerid,
0, -1, NULL, 0);
Index: rde.h
===
RCS file: /cvs/src/usr.sbin/ospf6d/rde.h,v
retrieving revision 1.23
diff -u -p -r1.23 rde.h
--- rde.h   22 Dec 2019 11:19:06 -  1.23
+++ rde.h   20 Jan 2020 09:23:01 -
@@ -159,7 +159,7 @@ struct vertex   *lsa_find_tree(struct lsa_
 u_int32_t   lsa_find_lsid(struct lsa_tree *, u_int16_t, u_int32_t,
int (*)(struct lsa *, struct lsa *), struct lsa *);
 u_int16_t   lsa_num_links(struct vertex *);
-voidlsa_snap(struct rde_nbr *, u_int32_t);
+voidlsa_snap(struct rde_nbr *);
 voidlsa_dump(struct lsa_tree *, int, pid_t);
 voidlsa_merge(struct rde_nbr *, struct lsa *, struct vertex *);
 voidlsa_remove_invalid_sums(struct area *);
Index: rde_lsdb.c
===
RCS file: /cvs/src/usr.sbin/ospf6d/rde_lsdb.c,v
retrieving revision 1.41
diff -u -p -r1.41 rde_lsdb.c
--- rde_lsdb.c  2 Jan 2020 10:16:46 -   1.41
+++ rde_lsdb.c  20 Jan 2020 09:23:01 -
@@ -39,8 +39,6 @@ intlsa_get_prefix(void *, u_int16_t, 
 
 RB_GENERATE(lsa_tree, vertex, entry, lsa_compare)
 
-extern struct ospfd_conf   *rdeconf;
-
 void
 lsa_init(struct lsa_tree *t)
 {
@@ -235,6 +233,7 @@ lsa_check(struct rde_nbr *nbr, struct ls
case LSA_TYPE_NETWORK:
if ((len % sizeof(u_int32_t)) ||
len < sizeof(lsa->hdr) + sizeof(u_int32_t)) {
+   log_warnx("lsa_check: bad LSA network packet");
return (0);
}
break;
@@ -716,7 +715,7 @@ lsa_num_links(struct vertex *v)
 }
 
 void
-lsa_snap(struct rde_nbr *nbr, u_int32_t peerid)
+lsa_snap(struct rde_nbr *nbr)
 {
struct lsa_tree *tree = &nbr->area->lsa_tree;
struct vertex   *v;
@@ -727,11 +726,13 @@ lsa_snap(struct rde_nbr *nbr, u_int32_t 
continue;
lsa_age(v);
if (ntohs(v->lsa->hdr.age) >= MAX_AGE) {
-   rde_imsg_compose_ospfe(IMSG_LS_SNAP, peerid,
-   0, &v->lsa->hdr, ntohs(v->lsa->hdr.len));
+   rde_imsg_compose_ospfe(IMSG_LS_SNAP,
+   nbr->peerid, 0, &v->lsa->hdr,
+   ntohs(v->lsa->hdr.len));
} else {
-   rde_imsg_compose_ospfe(IMSG_DB_SNAPSHOT, peerid,
-   0, &v->lsa->hdr, sizeof(struct lsa_hdr));
+   rde_imsg_compose_ospfe(IMSG_DB_SNAPSHOT,
+   nbr->peerid, 0, &v->lsa->hdr,
+   sizeof(struct lsa_hdr));
}
}
if (tree == &asext_tree)



em(4) diff to test

2020-01-20 Thread Martin Pieuchot
Diff below is a refactoring of the actual em(4) code and defines that
will allows me to present a shorter diff to interrupt multiple CPUs and
make use of multiple queues.

It contains the following items:

  - Abstract the allocation/freeing of TX/RX ring into em_dma_malloc().
This will ease the introduction of multiple rings.

  - Split the 82576 variant out of 82575.  The distinction is necessary
when it comes to setting multiple queues.

  - Change multiple TX/RX related macro to take an index argument
corresponding to a ring.  Currently only the index 0 and 1 are used.

  - Gather and print more stats counters

  - Switch to using a function, like FreeBSD, to translate 82542
registers and get rid of a set of defines.

It has been tested one the models below, I'd like to be sure there isn't
any fallout with this part before continuing the effort.

em0 at pci0 dev 25 function 0 "Intel 82577LM" rev 0x06: msi
em0 at pci0 dev 25 function 0 "Intel 82579LM" rev 0x04: msi
em0 at pci0 dev 25 function 0 "Intel I218-V" rev 0x03: msi
em0 at pci0 dev 25 function 0 Intel I218-LM rev 0x04: msi
em0 at pci0 dev 31 function 6 "Intel I219-V" rev 0x21: msi

em(4) is a widely used driver as such it will make it easier for many
developers to experiment, test and design with parallelism in the Network
Stack.  However we should try our best to not break any system, that's why
I'd like to receive test reports as much variant as possible with this diff
:o)

Index: dev/pci/if_em.c
===
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.342
diff -u -p -r1.342 if_em.c
--- dev/pci/if_em.c 1 Mar 2019 10:02:44 -   1.342
+++ dev/pci/if_em.c 17 Jan 2020 13:33:59 -
@@ -249,6 +249,7 @@ void em_disable_aspm(struct em_softc *);
 void em_txeof(struct em_softc *);
 int  em_allocate_receive_structures(struct em_softc *);
 int  em_allocate_transmit_structures(struct em_softc *);
+int  em_allocate_desc_rings(struct em_softc *);
 int  em_rxfill(struct em_softc *);
 void em_rxrefill(void *);
 int  em_rxeof(struct em_softc *);
@@ -333,11 +334,6 @@ em_defer_attach(struct device *self)
 
em_free_pci_resources(sc);
 
-   sc->sc_rx_desc_ring = NULL;
-   em_dma_free(sc, &sc->sc_rx_dma);
-   sc->sc_tx_desc_ring = NULL;
-   em_dma_free(sc, &sc->sc_tx_dma);
-
return;
}

@@ -453,6 +449,7 @@ em_attach(struct device *parent, struct 
case em_82572:
case em_82574:
case em_82575:
+   case em_82576:
case em_82580:
case em_i210:
case em_i350:
@@ -483,23 +480,11 @@ em_attach(struct device *parent, struct 
sc->hw.min_frame_size = 
ETHER_MIN_LEN + ETHER_CRC_LEN;
 
-   /* Allocate Transmit Descriptor ring */
-   if (em_dma_malloc(sc, sc->sc_tx_slots * sizeof(struct em_tx_desc),
-   &sc->sc_tx_dma) != 0) {
-   printf("%s: Unable to allocate tx_desc memory\n", 
-  DEVNAME(sc));
-   goto err_tx_desc;
-   }
-   sc->sc_tx_desc_ring = (struct em_tx_desc *)sc->sc_tx_dma.dma_vaddr;
-
-   /* Allocate Receive Descriptor ring */
-   if (em_dma_malloc(sc, sc->sc_rx_slots * sizeof(struct em_rx_desc),
-   &sc->sc_rx_dma) != 0) {
-   printf("%s: Unable to allocate rx_desc memory\n",
-  DEVNAME(sc));
-   goto err_rx_desc;
+   if (em_allocate_desc_rings(sc) != 0) {
+   printf("%s: Unable to allocate descriptor ring memory\n",
+   DEVNAME(sc));
+   goto err_pci;
}
-   sc->sc_rx_desc_ring = (struct em_rx_desc *)sc->sc_rx_dma.dma_vaddr;
 
/* Initialize the hardware */
if ((defer = em_hardware_init(sc))) {
@@ -508,11 +493,12 @@ em_attach(struct device *parent, struct 
else {
printf("%s: Unable to initialize the hardware\n",
DEVNAME(sc));
-   goto err_hw_init;
+   goto err_pci;
}
}
 
if (sc->hw.mac_type == em_80003es2lan || sc->hw.mac_type == em_82575 ||
+   sc->hw.mac_type == em_82576 ||
sc->hw.mac_type == em_82580 || sc->hw.mac_type == em_i210 ||
sc->hw.mac_type == em_i350) {
uint32_t reg = EM_READ_REG(&sc->hw, E1000_STATUS);
@@ -541,7 +527,7 @@ em_attach(struct device *parent, struct 
if (em_read_mac_addr(&sc->hw) < 0) {
printf("%s: EEPROM read error while reading mac address\n",
   DEVNAME(sc));
-   goto err_mac_addr;
+   goto err_pci;
}
 
bcopy(sc->hw.mac_addr, sc->sc_ac.ac_enaddr, ETHER_ADDR_LEN);
@@ -582,14 +568,6 @@ em_attach(struct device *parent, s

upd(4): force boolean indicator to be 0 or 1

2020-01-20 Thread Boudewijn Dijkstra
Some upd(4) devices use -1 for "On" and some use 1.  sysctl(8) and  
sensorsd(8) hide this detail from the user, which makes it difficult to  
define low and high values in sensorsd.conf(5).


Also see https://marc.info/?l=openbsd-misc&m=144529176814155&w=2

I'd like to suggest a diff like the following:

RCS file: /cvs/src/sys/dev/usb/upd.c,v
retrieving revision 1.26
diff -u -p -u -p -d -r1.26 upd.c
--- upd.c   8 Apr 2017 02:57:25 -   1.26
+++ upd.c   20 Jan 2020 10:18:41 -
@@ -406,26 +406,30 @@ upd_sensor_update(struct upd_softc *sc,
struct upd_sensor   *child;
int64_t  hdata, adjust;

-   switch (HID_GET_USAGE(sensor->hitem.usage)) {
-   case HUB_REL_STATEOF_CHARGE:
-   case HUB_ABS_STATEOF_CHARGE:
-   case HUB_REM_CAPACITY:
-   case HUB_FULLCHARGE_CAPACITY:
-   adjust = 1000; /* scale adjust */
-   break;
-   case HUB_ATRATE_TIMETOFULL:
-   case HUB_ATRATE_TIMETOEMPTY:
-   case HUB_RUNTIMETO_EMPTY:
-   /* spec says minutes, not seconds */
-   adjust = 10LL;
-   break;
-   default:
-   adjust = 1; /* no scale adjust */
-   break;
+   hdata = hid_get_data(buf, len, &sensor->hitem.loc);
+   if (sensor->ksensor.type == SENSOR_INDICATOR)
+   sensor->ksensor.value = hdata ? 1 : 0;
+   else {
+   switch (HID_GET_USAGE(sensor->hitem.usage)) {
+   case HUB_REL_STATEOF_CHARGE:
+   case HUB_ABS_STATEOF_CHARGE:
+   case HUB_REM_CAPACITY:
+   case HUB_FULLCHARGE_CAPACITY:
+   adjust = 1000; /* scale adjust */
+   break;
+   case HUB_ATRATE_TIMETOFULL:
+   case HUB_ATRATE_TIMETOEMPTY:
+   case HUB_RUNTIMETO_EMPTY:
+   /* spec says minutes, not seconds */
+   adjust = 10LL;
+   break;
+   default:
+   adjust = 1; /* no scale adjust */
+   break;
+   }
+   sensor->ksensor.value = hdata * adjust;
}

-   hdata = hid_get_data(buf, len, &sensor->hitem.loc);
-   sensor->ksensor.value = hdata * adjust;
sensor->ksensor.status = SENSOR_S_OK;
sensor->ksensor.flags &= ~SENSOR_FINVALID;


Note that this will break some configurations.


--
Boudewijn Dijkstra
Indes-IDS B.V.
+31 345 545 535



Re: bnxt(4), myx(4), vr(4): refill timeouts: timeout_add(..., 0) -> timeout_add(..., 1)

2020-01-20 Thread Hrvoje Popovski
On 16.1.2020. 18:45, Scott Cheloha wrote:
> Here's a first batch of conversions: rx refill timeouts for bnxt(4),
> myx(4), and vr(4).  All of these can run during a softclock().  Will
> changing these to one tick break these drivers?

Hi all,

i tried this diff with myx and performance are the same as with plain
kernel.

myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x01: msi, model
10G-PCIE2-8BL2-2S, address 00:60:dd:45:ba:f8
myx1 at pci5 dev 0 function 0 "Myricom Z8E" rev 0x01: msi, model
10G-PCIE2-8BL2-2S, address 00:60:dd:45:ba:f9