Microsoft Surface: replace umstc(4) with ucc(4)

2022-11-17 Thread Anton Lindqvist
Hi,
umstc is essentially a less capable version of ucc. Now that matthieu@
recently taught wskbd to adjust the brightness in process context, we
should be able to get rid of umstc in favor of ucc.

In order for ucc to be feature compatible with umstc, it must honor the
always open quirk as pointed out by jcs@ earlier.

I don't have access to any umstc hardware. Could someone with such
hardware try the following diff? Ideally, umstc should no longer attach
whereas ucc should. A dmesg before and after would be helpful.

diff --git sys/dev/usb/ucc.c sys/dev/usb/ucc.c
index acc14ce86cf..25b391fd9ee 100644
--- sys/dev/usb/ucc.c
+++ sys/dev/usb/ucc.c
@@ -20,6 +20,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -29,6 +30,7 @@
 struct ucc_softc {
struct uhidevsc_hdev;
struct hidcc*sc_cc;
+   u_int32_tsc_quirks;
 };
 
 intucc_match(struct device *, void *, void *);
@@ -76,6 +78,8 @@ ucc_attach(struct device *parent, struct device *self, void 
*aux)
void *desc;
int repid, size;
 
+   sc->sc_quirks = usbd_get_quirks(sc->sc_hdev.sc_udev)->uq_flags;
+
sc->sc_hdev.sc_intr = ucc_intr;
sc->sc_hdev.sc_parent = uha->parent;
sc->sc_hdev.sc_udev = uha->uaa->device;
@@ -98,6 +102,9 @@ ucc_attach(struct device *parent, struct device *self, void 
*aux)
.arg= self,
};
sc->sc_cc = hidcc_attach();
+
+   if (sc->sc_quirks & UQ_ALWAYS_OPEN)
+   uhidev_open(>sc_hdev);
 }
 
 int
@@ -127,6 +134,9 @@ ucc_enable(void *v, int on)
struct ucc_softc *sc = (struct ucc_softc *)v;
int error = 0;
 
+   if (sc->sc_quirks & UQ_ALWAYS_OPEN)
+   return 0;
+
if (on)
error = uhidev_open(>sc_hdev);
else
diff --git sys/dev/usb/umstc.c sys/dev/usb/umstc.c
index 4bebeb41811..c3be6cce41f 100644
--- sys/dev/usb/umstc.c
+++ sys/dev/usb/umstc.c
@@ -80,6 +80,8 @@ umstc_match(struct device *parent, void *match, void *aux)
int size;
void *desc;
 
+   return (UMATCH_NONE);
+
if (UHIDEV_CLAIM_MULTIPLE_REPORTID(uha))
return (UMATCH_NONE);
 



ixv(4): porting Virtual Function driver for Intel 82599 series.

2022-11-17 Thread Yuichiro NAITO

Hi.

I'm porting Intel 82599 series VF driver ixv(4) from Intel (see below link).

https://www.intel.com/content/www/us/en/download/645984/intel-network-adapter-virtual-function-driver-for-pcie-10-gigabit-network-connections-under-freebsd.html

It's written for FreeBSD but doesn't have iflib code.
I think it is easier than porting from FreeBSD src tree.

In my patch, many codes are shared between OpenBSD ix(4) driver.
Especially ring buffer codes are almost re-used and just linked.
Only TDT/RDT register offset is different between ix(4) and ixv(4),
I changed to store the offset into the ring buffers and refer them.

Intel driver has AIM (Adaptive Interrupt Moderation) code.
I also ported it but it's disabled for now.
When I find some evidence that improves performance in the further testing,
I will enable it.

Most of ixv(4) driver codes (if_ixv.c) and hardware control code (ixgbe_vf.c)
are taken from Intel driver. I adjusted kstat codes to see VF registers.

Internal communication (called 'mailbox') between VF and PF (Primary Function)
codes are updated to the Intel driver.

I introduced a new interface that vlan(4) driver passes the vnet id to network
drivers. VF must tell PF which vnet ids are used or VF never receive VLAN 
packets.

I tested my patch X540 VF and X550 VF on ESXi and Linux qemu.
It works fine for me.
I also confirmed ix(4) works on X550 as a regression test.

Is it OK?

diff --git a/sys/arch/amd64/conf/GENERIC b/sys/arch/amd64/conf/GENERIC
index 3563126b2a1..e421c38e95f 100644
--- a/sys/arch/amd64/conf/GENERIC
+++ b/sys/arch/amd64/conf/GENERIC
@@ -522,6 +522,7 @@ msk*at mskc?#  each port of 
above
 em*at pci? # Intel Pro/1000 ethernet
 ixgb*  at pci? # Intel Pro/10Gb ethernet
 ix*at pci? # Intel 82598EB 10Gb ethernet
+ixv*   at pci? # Virtual Function of Intel 82598EB
 myx*   at pci? # Myricom Myri-10G 10Gb ethernet
 oce*   at pci? # Emulex OneConnect 10Gb ethernet
 txp*   at pci? # 3com 3CR990
diff --git a/sys/dev/pci/files.pci b/sys/dev/pci/files.pci
index a803dc9b659..7d6402e524e 100644
--- a/sys/dev/pci/files.pci
+++ b/sys/dev/pci/files.pci
@@ -350,13 +350,19 @@ file  dev/pci/ixgb_hw.c   ixgb
 # Intel 82598 10GbE
 device ix: ether, ifnet, ifmedia, intrmap, stoeplitz
 attach ix at pci
-file   dev/pci/if_ix.c ix
-file   dev/pci/ixgbe.c ix
-file   dev/pci/ixgbe_82598.c   ix
-file   dev/pci/ixgbe_82599.c   ix
-file   dev/pci/ixgbe_x540.cix
-file   dev/pci/ixgbe_x550.cix
-file   dev/pci/ixgbe_phy.c ix
+file   dev/pci/if_ix.c ix | ixv
+file   dev/pci/ixgbe.c ix | ixv
+file   dev/pci/ixgbe_82598.c   ix | ixv
+file   dev/pci/ixgbe_82599.c   ix | ixv
+file   dev/pci/ixgbe_x540.cix | ixv
+file   dev/pci/ixgbe_x550.cix | ixv
+file   dev/pci/ixgbe_phy.c ix | ixv
+
+# Virtual Function of i82599.
+device ixv: ether, ifnet, ifmedia, intrmap, stoeplitz
+attach ixv at pci
+file   dev/pci/if_ixv.cixv
+file   dev/pci/ixgbe_vf.c  ixv

 # Intel Ethernet 700 Series
 device ixl: ether, ifnet, ifmedia, intrmap, stoeplitz
diff --git a/sys/dev/pci/if_ix.c b/sys/dev/pci/if_ix.c
index b59ec28d9f1..9d88d00222d 100644
--- a/sys/dev/pci/if_ix.c
+++ b/sys/dev/pci/if_ix.c
@@ -507,7 +507,7 @@ ixgbe_start(struct ifqueue *ifq)
 * hardware that this frame is available to transmit.
 */
if (post)
-   IXGBE_WRITE_REG(>hw, IXGBE_TDT(txr->me),
+   IXGBE_WRITE_REG(>hw, txr->tail,
txr->next_avail_desc);
 }

@@ -706,7 +706,7 @@ ixgbe_watchdog(struct ifnet * ifp)
for (i = 0; i < sc->num_queues; i++, txr++) {
printf("%s: Queue(%d) tdh = %d, hw tdt = %d\n", ifp->if_xname, 
i,
IXGBE_READ_REG(hw, IXGBE_TDH(i)),
-   IXGBE_READ_REG(hw, IXGBE_TDT(i)));
+   IXGBE_READ_REG(hw, sc->tx_rings[i].tail));
printf("%s: TX(%d) Next TX to Clean = %d\n", ifp->if_xname,
i, txr->next_to_clean);
}
@@ -826,7 +826,7 @@ ixgbe_init(void *arg)
msec_delay(1);
}
IXGBE_WRITE_FLUSH(>hw);
-   IXGBE_WRITE_REG(>hw, IXGBE_RDT(i), rxr->last_desc_filled);
+   IXGBE_WRITE_REG(>hw, rxr[i].tail, rxr->last_desc_filled);
}

/* Set up VLAN support and filter */
@@ -2359,9 +2359,12 @@ ixgbe_initialize_transmit_units(struct ix_softc *sc)
IXGBE_WRITE_REG(hw, IXGBE_TDLEN(i),
sc->num_tx_desc * sizeof(struct ixgbe_legacy_tx_desc));

+   /* Set Tx Tail register */
+   txr->tail = IXGBE_TDT(i);
+
/* 

Re: More on mimmutable

2022-11-17 Thread Otto Moerbeek
On Thu, Nov 17, 2022 at 08:10:05PM -0700, Theo de Raadt wrote:

> So this static executable is completely immutable, except for the
> OPENBSD_MUTABLE region.  This annotation is used in one place now, deep
> inside libc's malloc(3) code, where a piece of code flips a data structure
> between readonly and read-write as a security measure.  That does not become
> immutable.  It happens to be an page.

This will change.

I have code ready to change the init of that malloc data structure so
that the one page wil be modified to contain the right data, then made
readonly and then made immutable.

-Otto



More on mimmutable

2022-11-17 Thread Theo de Raadt
[LONG]

I am getting close to having the big final step of mimmutable in the tree.
Here's a refresher on the how it works, what's already done, and the next
bit to land.

DESCRIPTION
 The mimmutable() system call changes currently mapped pages in the region
 to be marked immutable, which means their protection or mapping may not
 be changed in the future.  mmap(2), mprotect(2), and munmap(2) to pages
 marked immutable will return with error EPERM.

That's the system call.  In reality, almost no programs call it.

Let me start by explaining a process's address space, starting with the simplest
programs and then heading into more complicated cases.

A process runtime has a
  - stack (rwS permissions, S being the annotation used at system call entry
to ensure the "sp" register points to stack, and thus prevent a class
of ROP pivot methods),
  - a stack-guard (for growing the stack in case rlimits are changed, this
is is permission NONE)
  - a signal trampoline page, randomly placed, which will perform sigreturn(2),
(permission rwe, e being the annotation used at system call entry to ensure
the "pc" register points at a region allowed to system calls, thus 
preventing
attackers from uploading direct system call instruction code)

Those objects are automatically marked immutable by the kernel.

On to static executables.  The kernel loads a static ELF binary into
memory as a text segment (rx permission), followed by a data segment (rw
permission), a bss (zero'd data, rw permission), and a rodata segment
(ro permission).  The order of these varies per architecture.  There is
an overlay of this called the "GNU_RELRO", which is pretty uhm special.
I've created a new overlay called "OPENBSD_MUTABLE", which is
page-aligned and must not be made immutable.  As it happens, these two
special regions are the only part of the image load that cannot be marked
immutable, so the kernel proceeds to mark everything else immutable.

When that static program starts running, it will run the crt0 ("c run time")
startup code, which can make some small changes in the GNU_RELRO region,
and them mark it immutable.

So this static executable is completely immutable, except for the
OPENBSD_MUTABLE region.  This annotation is used in one place now, deep
inside libc's malloc(3) code, where a piece of code flips a data structure
between readonly and read-write as a security measure.  That does not become
immutable.  It happens to be an page.

There is another ugly old wart called "text relocations", and I won't
get into it except to say the kernel recognizes such binaries, and skips
some immutabilities, but of course crt0 finishes the job. 

I want to speak a bit about the mechanism.  Inside the kernel,
immutability is applied to all the regions.  And then the exceptions are
marked mutable.  The kernel is allowed to reverse setting immutability,
but userland cannot.  This will come up later.

Now let's talk about dynamic executables.  The same applies as above for
the main program, but then the kernel also loads another object into memory:
/usr/libexec/ld.so -- the shared library linker.  And instead of starting
to run the main program, execution starts in the shared library linker.

The shared library linker ELF image contains similar objects.  There is a
GNU_RELRO, which the kernel cannot mark immutable.  There is no OPENBSD_MUTABLE
because we don't request creation of one.  There is a special "boot.text"
section that the shared library linker unmaps upon startup, as a security
measure, and the kernel ignores that region.  All the other regions of
ld.so are marked immutable by the kernel automatically.

Now ld.so starts to execute, and the first job it does is to fix it's
own relro section, handle text relocations in the dynamic binary, repair
some permissions, and then mark itself and the main program immutable.
Completely immutable, except for the OPENBSD_MUTABLE page in malloc(2).

I was really surprised we got to this point without blowing up the ports
tree in a major way.  Some of these warts were found along the way and
changed the direction a little.  And I don't want to talk about sigaltstack
right now.

ld.so is now responsible for loading the shared libraries required by the
program.  Shared libraries have the same pieces as regular programs, so
they are loaded into memory, but here we hit a problem.  In the kernel we
could simplistically mark all the regions as immutable, and then reverse
it for the special cases.  Userland code cannot do that.  So we have to
keep track of the sections we want to be immutable, as well as the regions
that are immutable, and then subtract the differences, and apply the
immutables very late in the shared library loading process.  I call this
code the clipping engine, and I had a lot of bugs in it.

There is another way shared libraries are loaded:  via the dlopen()
call later at runtime.  With the flag RTLD_NODELETE, libraries can
be marked immutable.  

rpki-client: fold -T (bird table name) into -B (bird output)

2022-11-17 Thread Job Snijders
I don't think I've ever heard anyone use the '-T' feature. Perhaps
better to fold it into '-B' (BIRD output) using getopt '::' trick?

I don't feel super strong about this, but it helps reclaim a getopt
letter.

Kind regards,

Job

Index: main.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/main.c,v
retrieving revision 1.222
diff -u -p -r1.222 main.c
--- main.c  17 Nov 2022 20:51:39 -  1.222
+++ main.c  17 Nov 2022 21:22:13 -
@@ -907,13 +907,15 @@ main(int argc, char *argv[])
"proc exec unveil", NULL) == -1)
err(1, "pledge");
 
-   while ((c = getopt(argc, argv, "b:Bcd:e:fH:jnorRs:S:t:T:vV")) != -1)
+   while ((c = getopt(argc, argv, "b:B::cd:e:fH:jnorRs:S:t:vV")) != -1)
switch (c) {
case 'b':
bind_addr = optarg;
break;
case 'B':
outformats |= FORMAT_BIRD;
+   if (optarg)
+   bird_tablename = optarg;
break;
case 'c':
outformats |= FORMAT_CSV;
@@ -964,9 +966,6 @@ main(int argc, char *argv[])
err(1, "too many tal files specified");
tals[talsz++] = optarg;
break;
-   case 'T':
-   bird_tablename = optarg;
-   break;
case 'v':
verbose++;
break;
@@ -1376,9 +1375,8 @@ usage:
fprintf(stderr,
"usage: rpki-client [-BcjnoRrVv] [-b sourceaddr] [-d cachedir]"
" [-e rsync_prog]\n"
-   "   [-H fqdn ] [-S skiplist] [-s timeout] [-T 
table]"
-   " [-t tal]\n"
-   "   [outputdir]\n"
+   "   [-H fqdn ] [-S skiplist] [-s timeout] [-t tal]"
+   " [outputdir]\n"
"   rpki-client [-Vv] [-d cachedir] [-j] [-t tal] -f file ..."
"\n");
return 1;
Index: rpki-client.8
===
RCS file: /cvs/src/usr.sbin/rpki-client/rpki-client.8,v
retrieving revision 1.80
diff -u -p -r1.80 rpki-client.8
--- rpki-client.8   17 Nov 2022 20:49:38 -  1.80
+++ rpki-client.8   17 Nov 2022 21:22:13 -
@@ -22,14 +22,14 @@
 .Nd RPKI validator to support BGP routing security
 .Sh SYNOPSIS
 .Nm
-.Op Fl BcjnoRrVv
+.Op Fl cjnoRrVv
+.Op Fl B Ar table
 .Op Fl b Ar sourceaddr
 .Op Fl d Ar cachedir
 .Op Fl e Ar rsync_prog
 .Op Fl H Ar fqdn
 .Op Fl S Ar skiplist
 .Op Fl s Ar timeout
-.Op Fl T Ar table
 .Op Fl t Ar tal
 .Op Ar outputdir
 .Nm
@@ -62,7 +62,7 @@ in various formats.
 .Pp
 The options are as follows:
 .Bl -tag -width Ds
-.It Fl B
+.It Fl B Ar table
 Create output in the files
 .Pa bird1v4 ,
 .Pa bird1v6 ,
@@ -70,6 +70,11 @@ and
 .Pa bird
 (for bird2)
 in the output directory which is suitable for the BIRD internet routing daemon.
+If
+.Ar table
+is specified, use
+.Ar table
+as roa table name instead of the default 'ROAS'.
 .It Fl b Ar sourceaddr
 Tell the HTTP and rsync clients to use
 .Ar sourceaddr
@@ -193,12 +198,6 @@ Individual RSYNC/RRDP repositories are t
 .Em timeout .
 All network synchronisation tasks are aborted after seven eights of
 .Em timeout .
-.It Fl T Ar table
-For BIRD output generated with the
-.Fl B
-option use
-.Ar table
-as roa table name instead of the default 'ROAS'.
 .It Fl t Ar tal
 Specify a
 .Em Trust Anchor Location Pq TAL



Re: rpki-client: add 'shortlist' functionality

2022-11-17 Thread Theo de Raadt
Job Snijders  wrote:

> rpki-client currently is using 'only' 18 out of the 66 ([a-zA-Z0-9).
> I am not very concerned in that regard. :-)


I have to disagree strongly -- Software bloat is dangerous.






Re: rpki-client: add 'shortlist' functionality

2022-11-17 Thread Job Snijders
Heya!

On Thu, Nov 17, 2022 at 08:39:36PM +0100, Theo Buehler wrote:
> > This functionality is handy if you want to inspect only specific
> > repositories and ignore the rest of the world. Useful for monitoring
> > too.
> > 
> > OK? Feedback?
> 
> I have no objection code-wise and I understand the motivation. However,
> I'm not a fan of using 'q' for this - it suggests quiet mode.

I was thinking 'quick', but I can see your point too.

> A more general concern could be summarized by saying that rpki-client
> will soon need to copy the BUGS section from indent(1). I'm only half
> joking. I'm not sure whether we have already reached the point where we
> must stop adding things, but we're getting close. We will soon need to
> start asking ourselves if adding this one feature might block a future,
> more important, thing, simply because there are only finitely many
> letters.

rpki-client currently is using 'only' 18 out of the 66 ([a-zA-Z0-9).
I am not very concerned in that regard. :-)

Garbage collection options:
   '-r' can be removed, its been the default for a while now anyway
   '-T' can maybe be merged into -B using getopt('B:'). I doubt many
(if any) people use '-T'.

If you don't like '-q', which of the following do you like better?
a A C D E F g G h H i I J k K l L m M N O p P Q u U w W x X y Y z Z

Kind regards,

Job



Re: rpki-client: add 'shortlist' functionality

2022-11-17 Thread Claudio Jeker
On Thu, Nov 17, 2022 at 05:53:40PM +, Job Snijders wrote:
> Dear all,
> 
> I introduced a 'shortlist' feature in rpki-client(8). If the operator
> specifies one or more '-q' options followed by FQDNs, the utility will
> *only* connect to those hosts and skip all others.
> 
> $ doas rpki-client -q rpki.ripe.net -q chloe.sobornost.net
> Processing time 84 seconds (75 seconds user, 10 seconds system)
> Skiplist entries: 0
> Route Origin Authorizations: 32459 (0 failed parse, 0 invalid)
> AS Provider Attestations: 0 (0 failed parse, 0 invalid)
> BGPsec Router Certificates: 2
> Certificates: 18750 (0 invalid)
> Trust Anchor Locators: 5 (0 invalid)
> Manifests: 18586 (0 failed parse, 0 stale)
> Certificate revocation lists: 18586
> Ghostbuster records: 1
> Trust Anchor Keys: 0
> Repositories: 8
> Cleanup: removed 1 files, 1270 directories, 67 superfluous
> VRP Entries: 179160 (179160 unique)
> VAP Entries: 0 (0 unique)
> 
> $ ls -lahtr /var/cache/rpki-client/
> total 28
> drwxr-xr-x  4 root  wheel   512B Nov 10 21:07 ..
> drwxr-xr-x  2 _rpki-client  wheel   512B Nov 17 17:35 .rsync
> drwxr-xr-x  7 _rpki-client  wheel   512B Nov 17 17:45 ta
> drwxr-xr-x  3 _rpki-client  wheel   512B Nov 17 17:47 rpki.ripe.net
> drwxr-xr-x  3 _rpki-client  wheel   512B Nov 17 17:47 chloe.sobornost.net
> drwxr-xr-x  7 _rpki-client  wheel   1.0K Nov 17 17:47 .
> drwxr-xr-x  5 _rpki-client  wheel   512B Nov 17 17:48 .rrdp
> 
> This functionality is handy if you want to inspect only specific
> repositories and ignore the rest of the world. Useful for monitoring
> too.
> 
> OK? Feedback?
> 
> Kind regards,
> 
> Job
> 
> Index: extern.h
> ===
> RCS file: /cvs/src/usr.sbin/rpki-client/extern.h,v
> retrieving revision 1.159
> diff -u -p -r1.159 extern.h
> --- extern.h  4 Nov 2022 12:05:36 -   1.159
> +++ extern.h  17 Nov 2022 17:47:34 -
> @@ -34,6 +34,15 @@ struct skiplistentry {
>  LIST_HEAD(skiplist, skiplistentry);
>  
>  /*
> + * Shortlist of hosts to connect to (loaded via -q arguments).
> + */
> +struct shortlistentry {
> + LIST_ENTRY(shortlistentry)   entry;
> + char*value; /* FQDN */
> +};
> +LIST_HEAD(shortlist, shortlistentry);
> +

Please define these in main.c. The shortlist is not used outside of main.c

Appart from that I agree with tb@ (both about the argument -q and the fact
that rpki-client ends up with a heck of a lot of options and modes).

-- 
:wq Claudio



Re: ed: Make use of stderr compliant with POSIX

2022-11-17 Thread Sören Tempel
PING.

I think this is a fairly obvious POSIX compliance issue with an easy fix.

Sören Tempel  wrote:
> Hello,
> 
> Currently, the OpenBSD ed implementation incorrectly writes information
> to standard error that should be written to standard out (as per POSIX).
> 
> For the read and write command the POSIX standard states the following:
> 
>   If the command is successful, the number of bytes written shall
>   be written to standard output [...]
> 
> However, OpenBSD ed presently writes the number of bytes (for both the
> read and the write command) to standard error. Furthermore, the POSIX
> standard mandates that the infamous "?\n" error string should also be
> written to standard output while OpenBSD ed presently writes the string
> to standard error.
> 
> Both cases are fixed by the patch below.
> 
> OK?
> 
> diff --git bin/ed/io.c bin/ed/io.c
> index 97306be16..e75bedd8b 100644
> --- bin/ed/io.c
> +++ bin/ed/io.c
> @@ -64,7 +64,7 @@ read_file(char *fn, int n)
>   return ERR;
>   }
>   if (!scripted)
> - fprintf(stderr, "%d\n", size);
> + printf("%d\n", size);
>   return current_addr - n;
>  }
>  
> @@ -166,7 +166,7 @@ write_file(char *fn, char *mode, int n, int m)
>   return ERR;
>   }
>   if (!scripted)
> - fprintf(stderr, "%d\n", size);
> + printf("%d\n", size);
>   return n ? m - n + 1 : 0;
>  }
>  
> diff --git bin/ed/main.c bin/ed/main.c
> index 231d021ef..674741c7c 100644
> --- bin/ed/main.c
> +++ bin/ed/main.c
> @@ -184,7 +184,7 @@ top:
>   signal(SIGINT, signal_int);
>   if (sigsetjmp(env, 1)) {
>   status = -1;
> - fputs("\n?\n", stderr);
> + fputs("\n?\n", stdout);
>   seterrmsg("interrupt");
>   } else {
>   init_buffers();
> @@ -196,7 +196,7 @@ top:
>   strlcpy(old_filename, *argv,
>   sizeof old_filename);
>   } else if (argc) {
> - fputs("?\n", stderr);
> + fputs("?\n", stdout);
>   if (**argv == '\0')
>   seterrmsg("invalid filename");
>   if (!interactive)
> @@ -215,7 +215,7 @@ top:
>   continue;
>   } else if (n == 0) {
>   if (modified && !scripted) {
> - fputs("?\n", stderr);
> + fputs("?\n", stdout);
>   seterrmsg("warning: file modified");
>   if (!interactive) {
>   if (garrulous)
> @@ -250,7 +250,7 @@ top:
>   break;
>   case EMOD:
>   modified = 0;
> - fputs("?\n", stderr);   /* give warning */
> + fputs("?\n", stdout);   /* give warning */
>   seterrmsg("warning: file modified");
>   if (!interactive) {
>   if (garrulous)
> @@ -271,7 +271,7 @@ top:
>   quit(3);
>   break;
>   default:
> - fputs("?\n", stderr);
> + fputs("?\n", stdout);
>   if (!interactive) {
>   if (garrulous)
>   fprintf(stderr,



Re: rpki-client: add 'shortlist' functionality

2022-11-17 Thread Theo Buehler
> This functionality is handy if you want to inspect only specific
> repositories and ignore the rest of the world. Useful for monitoring
> too.
> 
> OK? Feedback?

I have no objection code-wise and I understand the motivation. However,
I'm not a fan of using 'q' for this - it suggests quiet mode.

A more general concern could be summarized by saying that rpki-client
will soon need to copy the BUGS section from indent(1). I'm only half
joking. I'm not sure whether we have already reached the point where we
must stop adding things, but we're getting close. We will soon need to
start asking ourselves if adding this one feature might block a future,
more important, thing, simply because there are only finitely many
letters.



Re: Get rid of UVM_VNODE_CANPERSIST

2022-11-17 Thread Mark Kettenis
> From: Jeremie Courreges-Anglas 
> Date: Thu, 17 Nov 2022 18:00:21 +0100
> 
> On Tue, Nov 15 2022, Martin Pieuchot  wrote:
> > UVM vnode objects include a reference count to keep track of the number
> > of processes that have the corresponding pages mapped in their VM space.
> >
> > When the last process referencing a given library or executable dies,
> > the reaper will munmap this object on its behalf.  When this happens it
> > doesn't free the associated pages to speed-up possible re-use of the
> > file.  Instead the pages are placed on the inactive list but stay ready
> > to be pmap_enter()'d without requiring I/O as soon as a newly process
> > needs to access them.
> >
> > The mechanism to keep pages populated, known as UVM_VNODE_CANPERSIST,
> > doesn't work well with swapping [0].  For some reason when the page daemon
> > wants to free pages on the inactive list it tries to flush the pages to
> > disk and panic(9) because it needs a valid reference to the vnode to do
> > so.
> >
> > This indicates that the mechanism described above, which seems to work
> > fine for RO mappings, is currently buggy in more complex situations.
> > Flushing the pages when the last reference of the UVM object is dropped
> > also doesn't seem to be enough as bluhm@ reported [1].
> >
> > The diff below, which has already be committed and reverted, gets rid of
> > the UVM_VNODE_CANPERSIST logic.  I'd like to commit it again now that
> > the arm64 caching bug has been found and fixed.
> >
> > Getting rid of this logic means more I/O will be generated and pages
> > might have a faster reuse cycle.  I'm aware this might introduce a small
> > slowdown,
> 
> Numbers for my usual make -j4 in libc,
> on an Unmatched riscv64 box, now:
>16m32.65s real21m36.79s user30m53.45s system
>16m32.37s real21m33.40s user31m17.98s system
>16m32.63s real21m35.74s user31m12.01s system
>16m32.13s real21m36.12s user31m06.92s system
> After:
>19m14.15s real21m09.39s user36m51.33s system
>19m19.11s real21m02.61s user36m58.46s system
>19m21.77s real21m09.23s user37m03.85s system
>19m09.39s real21m08.96s user36m36.00s system
> 
> 4 cores amd64 VM, before (-current plus an other diff):
>1m54.31s real 2m47.36s user 4m24.70s system
>1m52.64s real 2m45.68s user 4m23.46s system
>1m53.47s real 2m43.59s user 4m27.60s system
> After:
>2m34.12s real 2m51.15s user 6m20.91s system
>2m34.30s real 2m48.48s user 6m23.34s system
>2m37.07s real 2m49.60s user 6m31.53s system
> 
> > however I believe we should work towards loading files from the
> > buffer cache to save I/O cycles instead of having another layer of cache.
> > Such work isn't trivial and making sure the vnode <-> UVM relation is
> > simple and well understood is the first step in this direction.
> >
> > I'd appreciate if the diff below could be tested on many architectures,
> > include the offending rpi4.
> 
> Mike has already tested a make build on a riscv64 Unmatched.  I have
> also run regress in sys, lib/libc and lib/libpthread on that arch.  As
> far as I can see this looks stable on my machine, but what I really care
> about is the riscv64 bulk build cluster (I'm going to start another
> bulk build soon).
> 
> > Comments?  Oks?
> 
> The performance drop in my microbenchmark kinda worries me but it's only
> a microbenchmark...

I wouldn't call this a microbenchmark.  I fear this is typical for
builds of anything on clang architectures.  And I expect it to be
worse on single-processor machine where *every* time we execute clang
or lld all the pages are thrown away upon exit and will need to be
read back from disk again for the next bit of C code we compile.

Having mmap'ed reads go through the buffer cache should indeed help
here, but that smells like UBC and even if it isn't, it is something
we don't have and will be tricky to get right.

The discussions we had during h2k22 made me understand this thing a
bit better.  Is there any reason why we can't keep a reference to the
vnode and have the pagedaemon drop it when it drops the pages?  Is
there a chance that we run out of vnodes this way?  I vaguely remember
beck@ saying that we can have tons of vnodes now.

> > [0] https://marc.info/?l=openbsd-bugs=164846737707559=2 
> > [1] https://marc.info/?l=openbsd-bugs=166843373415030=2
> >
> > Index: uvm/uvm_vnode.c
> > ===
> > RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
> > retrieving revision 1.130
> > diff -u -p -r1.130 uvm_vnode.c
> > --- uvm/uvm_vnode.c 20 Oct 2022 13:31:52 -  1.130
> > +++ uvm/uvm_vnode.c 15 Nov 2022 13:28:28 -
> > @@ -161,11 +161,8 @@ uvn_attach(struct vnode *vp, vm_prot_t a
> >  * add it to the writeable list, and then return.
> >  */
> > if (uvn->u_flags & UVM_VNODE_VALID) {   /* already active? */
> > +   

rpki-client: add 'shortlist' functionality

2022-11-17 Thread Job Snijders
Dear all,

I introduced a 'shortlist' feature in rpki-client(8). If the operator
specifies one or more '-q' options followed by FQDNs, the utility will
*only* connect to those hosts and skip all others.

$ doas rpki-client -q rpki.ripe.net -q chloe.sobornost.net
Processing time 84 seconds (75 seconds user, 10 seconds system)
Skiplist entries: 0
Route Origin Authorizations: 32459 (0 failed parse, 0 invalid)
AS Provider Attestations: 0 (0 failed parse, 0 invalid)
BGPsec Router Certificates: 2
Certificates: 18750 (0 invalid)
Trust Anchor Locators: 5 (0 invalid)
Manifests: 18586 (0 failed parse, 0 stale)
Certificate revocation lists: 18586
Ghostbuster records: 1
Trust Anchor Keys: 0
Repositories: 8
Cleanup: removed 1 files, 1270 directories, 67 superfluous
VRP Entries: 179160 (179160 unique)
VAP Entries: 0 (0 unique)

$ ls -lahtr /var/cache/rpki-client/
total 28
drwxr-xr-x  4 root  wheel   512B Nov 10 21:07 ..
drwxr-xr-x  2 _rpki-client  wheel   512B Nov 17 17:35 .rsync
drwxr-xr-x  7 _rpki-client  wheel   512B Nov 17 17:45 ta
drwxr-xr-x  3 _rpki-client  wheel   512B Nov 17 17:47 rpki.ripe.net
drwxr-xr-x  3 _rpki-client  wheel   512B Nov 17 17:47 chloe.sobornost.net
drwxr-xr-x  7 _rpki-client  wheel   1.0K Nov 17 17:47 .
drwxr-xr-x  5 _rpki-client  wheel   512B Nov 17 17:48 .rrdp

This functionality is handy if you want to inspect only specific
repositories and ignore the rest of the world. Useful for monitoring
too.

OK? Feedback?

Kind regards,

Job

Index: extern.h
===
RCS file: /cvs/src/usr.sbin/rpki-client/extern.h,v
retrieving revision 1.159
diff -u -p -r1.159 extern.h
--- extern.h4 Nov 2022 12:05:36 -   1.159
+++ extern.h17 Nov 2022 17:47:34 -
@@ -34,6 +34,15 @@ struct skiplistentry {
 LIST_HEAD(skiplist, skiplistentry);
 
 /*
+ * Shortlist of hosts to connect to (loaded via -q arguments).
+ */
+struct shortlistentry {
+   LIST_ENTRY(shortlistentry)   entry;
+   char*value; /* FQDN */
+};
+LIST_HEAD(shortlist, shortlistentry);
+
+/*
  * Enumeration for ASN.1 explicit tags in RSC eContent
  */
 enum rsc_resourceblock_tag {
Index: main.c
===
RCS file: /cvs/src/usr.sbin/rpki-client/main.c,v
retrieving revision 1.220
diff -u -p -r1.220 main.c
--- main.c  2 Nov 2022 12:43:02 -   1.220
+++ main.c  17 Nov 2022 17:47:34 -
@@ -64,11 +64,13 @@ const char  *bird_tablename = "ROAS";
 intverbose;
 intnoop;
 intfilemode;
+intshortlistmode;
 intrrdpon = 1;
 intrepo_timeout;
 time_t deadline;
 
 struct skiplist skiplist = LIST_HEAD_INITIALIZER(skiplist);
+struct shortlist shortlist = LIST_HEAD_INITIALIZER(shortlist);
 
 struct statsstats;
 
@@ -448,21 +450,35 @@ queue_add_from_cert(const struct cert *c
 {
struct repo *repo;
struct skiplistentry*sle;
+   struct shortlistentry   *she;
char*nfile, *npath, *host;
const char  *uri, *repouri, *file;
size_t   repourisz;
+   int  shortlisted = 0;
 
-   LIST_FOREACH(sle, , entry) {
-   if (strncmp(cert->repo, "rsync://", 8) != 0)
-   errx(1, "unexpected protocol");
-   host = cert->repo + 8;
+   if (strncmp(cert->repo, "rsync://", 8) != 0)
+   errx(1, "unexpected protocol");
+   host = cert->repo + 8;
 
+   LIST_FOREACH(sle, , entry) {
if (strncasecmp(host, sle->value, strcspn(host, "/")) == 0) {
warnx("skipping %s (listed in skiplist)", cert->repo);
return;
}
}
 
+   LIST_FOREACH(she, , entry) {
+   if (strncasecmp(host, she->value, strcspn(host, "/")) == 0) {
+   shortlisted = 1;
+   break;
+   }
+   }
+   if (shortlistmode && shortlisted == 0) {
+   if (verbose)
+   warnx("skipping %s (not shortlisted)", cert->repo);
+   return;
+   }
+
repo = repo_lookup(cert->talid, cert->repo,
rrdpon ? cert->notify : NULL);
if (repo == NULL)
@@ -760,6 +776,26 @@ load_skiplist(const char *slf)
free(line);
 }
 
+/*
+ * Load shortlist entries.
+ */
+static void
+load_shortlist(const char *fqdn)
+{
+   struct shortlistentry   *she;
+
+   if (!valid_uri(fqdn, strlen(fqdn), NULL))
+   errx(1, "invalid fqdn passed to -q: %s", fqdn);
+
+   if ((she = malloc(sizeof(struct shortlistentry))) == NULL)
+   err(1, NULL);
+
+   if ((she->value = strdup(fqdn)) == NULL)
+   err(1, NULL);
+
+   LIST_INSERT_HEAD(, she, entry);
+}
+
 static void
 check_fs_size(int fd, 

Re: Get rid of UVM_VNODE_CANPERSIST

2022-11-17 Thread Jeremie Courreges-Anglas
On Tue, Nov 15 2022, Martin Pieuchot  wrote:
> UVM vnode objects include a reference count to keep track of the number
> of processes that have the corresponding pages mapped in their VM space.
>
> When the last process referencing a given library or executable dies,
> the reaper will munmap this object on its behalf.  When this happens it
> doesn't free the associated pages to speed-up possible re-use of the
> file.  Instead the pages are placed on the inactive list but stay ready
> to be pmap_enter()'d without requiring I/O as soon as a newly process
> needs to access them.
>
> The mechanism to keep pages populated, known as UVM_VNODE_CANPERSIST,
> doesn't work well with swapping [0].  For some reason when the page daemon
> wants to free pages on the inactive list it tries to flush the pages to
> disk and panic(9) because it needs a valid reference to the vnode to do
> so.
>
> This indicates that the mechanism described above, which seems to work
> fine for RO mappings, is currently buggy in more complex situations.
> Flushing the pages when the last reference of the UVM object is dropped
> also doesn't seem to be enough as bluhm@ reported [1].
>
> The diff below, which has already be committed and reverted, gets rid of
> the UVM_VNODE_CANPERSIST logic.  I'd like to commit it again now that
> the arm64 caching bug has been found and fixed.
>
> Getting rid of this logic means more I/O will be generated and pages
> might have a faster reuse cycle.  I'm aware this might introduce a small
> slowdown,

Numbers for my usual make -j4 in libc,
on an Unmatched riscv64 box, now:
   16m32.65s real21m36.79s user30m53.45s system
   16m32.37s real21m33.40s user31m17.98s system
   16m32.63s real21m35.74s user31m12.01s system
   16m32.13s real21m36.12s user31m06.92s system
After:
   19m14.15s real21m09.39s user36m51.33s system
   19m19.11s real21m02.61s user36m58.46s system
   19m21.77s real21m09.23s user37m03.85s system
   19m09.39s real21m08.96s user36m36.00s system

4 cores amd64 VM, before (-current plus an other diff):
   1m54.31s real 2m47.36s user 4m24.70s system
   1m52.64s real 2m45.68s user 4m23.46s system
   1m53.47s real 2m43.59s user 4m27.60s system
After:
   2m34.12s real 2m51.15s user 6m20.91s system
   2m34.30s real 2m48.48s user 6m23.34s system
   2m37.07s real 2m49.60s user 6m31.53s system

> however I believe we should work towards loading files from the
> buffer cache to save I/O cycles instead of having another layer of cache.
> Such work isn't trivial and making sure the vnode <-> UVM relation is
> simple and well understood is the first step in this direction.
>
> I'd appreciate if the diff below could be tested on many architectures,
> include the offending rpi4.

Mike has already tested a make build on a riscv64 Unmatched.  I have
also run regress in sys, lib/libc and lib/libpthread on that arch.  As
far as I can see this looks stable on my machine, but what I really care
about is the riscv64 bulk build cluster (I'm going to start another
bulk build soon).

> Comments?  Oks?

The performance drop in my microbenchmark kinda worries me but it's only
a microbenchmark...

> [0] https://marc.info/?l=openbsd-bugs=164846737707559=2 
> [1] https://marc.info/?l=openbsd-bugs=166843373415030=2
>
> Index: uvm/uvm_vnode.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
> retrieving revision 1.130
> diff -u -p -r1.130 uvm_vnode.c
> --- uvm/uvm_vnode.c   20 Oct 2022 13:31:52 -  1.130
> +++ uvm/uvm_vnode.c   15 Nov 2022 13:28:28 -
> @@ -161,11 +161,8 @@ uvn_attach(struct vnode *vp, vm_prot_t a
>* add it to the writeable list, and then return.
>*/
>   if (uvn->u_flags & UVM_VNODE_VALID) {   /* already active? */
> + KASSERT(uvn->u_obj.uo_refs > 0);
>  
> - /* regain vref if we were persisting */
> - if (uvn->u_obj.uo_refs == 0) {
> - vref(vp);
> - }
>   uvn->u_obj.uo_refs++;   /* bump uvn ref! */
>  
>   /* check for new writeable uvn */
> @@ -235,14 +232,14 @@ uvn_attach(struct vnode *vp, vm_prot_t a
>   KASSERT(uvn->u_obj.uo_refs == 0);
>   uvn->u_obj.uo_refs++;
>   oldflags = uvn->u_flags;
> - uvn->u_flags = UVM_VNODE_VALID|UVM_VNODE_CANPERSIST;
> + uvn->u_flags = UVM_VNODE_VALID;
>   uvn->u_nio = 0;
>   uvn->u_size = used_vnode_size;
>  
>   /*
>* add a reference to the vnode.   this reference will stay as long
>* as there is a valid mapping of the vnode.   dropped when the
> -  * reference count goes to zero [and we either free or persist].
> +  * reference count goes to zero.
>*/
>   vref(vp);
>  
> @@ -323,16 +320,6 @@ uvn_detach(struct uvm_object *uobj)
>*/
>   vp->v_flag &= ~VTEXT;
>  
> - /*
> -  * we just