all architectures: put clockframe definition in frame.h?

2022-08-18 Thread Scott Cheloha
Hi,

clockframe is sometimes defined in cpu.h, sometimes in frame.h, and
sometimes defined once each in both header files.

Can we put the clockframe definitions in frame.h?  Always?  It is, at
least ostensibly, a "frame".

I do not want to consolidate the clockframe definitions in cpu.h
because this is creating a circular dependency problem for my clock
interrupt patch.

In particular, cpu.h needs a data structure defined in a new header
file to add it to struct cpu_info on all architectures, like this:

/* cpu.h */

#include 

struct cpu_info {
/* ... */
struct clockintr_state;
};

... but the header clockintr.h needs the clockframe definition so it
can prototype functions accepting a clockframe pointer, like this:

/* clockintr.h */

#include   /* this works fine */

#ifdef this_does_not_work
#include 
#endif

int clockintr_foo(struct clockframe *, int, short);
int clockintr_bar(struct clockframe *, char *, long);

struct clockintr_state {
char *cs_foo;
int cs_bar;
};

--

Hopefully I have illustrated the problem.

The only architecture where this might be a problem is sparc64.
There, clockframe is defined in terms of trapframe64, which is defined
in reg.h, not frame.h.

kettenis: can we put clockframe in frame.h on sparc64 or am I buying
trouble?

I can't compile-test this everywhere, but because every architecture's
cpu.h includes frame.h I don't think this can break anything (except
on sparc64).

The CLKF macros can remain in cpu.h.  They are not data structures so
putting them in frame.h looks odd on most architectures.

Index: alpha/include/cpu.h
===
RCS file: /cvs/src/sys/arch/alpha/include/cpu.h,v
retrieving revision 1.66
diff -u -p -r1.66 cpu.h
--- alpha/include/cpu.h 10 Aug 2022 10:41:35 -  1.66
+++ alpha/include/cpu.h 19 Aug 2022 03:27:06 -
@@ -296,14 +296,6 @@ cpu_rnd_messybits(void)
return alpha_rpcc();
 }
 
-/*
- * Arguments to hardclock and gatherstats encapsulate the previous
- * machine state in an opaque clockframe.  On the Alpha, we use
- * what we push on an interrupt (a trapframe).
- */
-struct clockframe {
-   struct trapframecf_tf;
-};
 #defineCLKF_USERMODE(framep)   
\
(((framep)->cf_tf.tf_regs[FRAME_PS] & ALPHA_PSL_USERMODE) != 0)
 #defineCLKF_PC(framep) ((framep)->cf_tf.tf_regs[FRAME_PC])
Index: alpha/include/frame.h
===
RCS file: /cvs/src/sys/arch/alpha/include/frame.h,v
retrieving revision 1.4
diff -u -p -r1.4 frame.h
--- alpha/include/frame.h   23 Mar 2011 16:54:34 -  1.4
+++ alpha/include/frame.h   19 Aug 2022 03:27:08 -
@@ -92,4 +92,13 @@ struct trapframe {
unsigned long   tf_regs[FRAME_SIZE];/* See above */
 };
 
+/*
+ * Arguments to hardclock and gatherstats encapsulate the previous
+ * machine state in an opaque clockframe.  On the Alpha, we use
+ * what we push on an interrupt (a trapframe).
+ */
+struct clockframe {
+   struct trapframecf_tf;
+};
+
 #endif /* _MACHINE_FRAME_H_ */
Index: amd64/include/cpu.h
===
RCS file: /cvs/src/sys/arch/amd64/include/cpu.h,v
retrieving revision 1.147
diff -u -p -r1.147 cpu.h
--- amd64/include/cpu.h 12 Aug 2022 02:20:36 -  1.147
+++ amd64/include/cpu.h 19 Aug 2022 03:27:08 -
@@ -335,13 +335,6 @@ cpu_rnd_messybits(void)
 
 #define curpcb curcpu()->ci_curpcb
 
-/*
- * Arguments to hardclock, softclock and statclock
- * encapsulate the previous machine state in an opaque
- * clockframe; for now, use generic intrframe.
- */
-#define clockframe intrframe
-
 #defineCLKF_USERMODE(frame)USERMODE((frame)->if_cs, 
(frame)->if_rflags)
 #define CLKF_PC(frame) ((frame)->if_rip)
 #define CLKF_INTR(frame)   (curcpu()->ci_idepth > 1)
Index: amd64/include/frame.h
===
RCS file: /cvs/src/sys/arch/amd64/include/frame.h,v
retrieving revision 1.10
diff -u -p -r1.10 frame.h
--- amd64/include/frame.h   10 Jul 2018 08:57:44 -  1.10
+++ amd64/include/frame.h   19 Aug 2022 03:27:08 -
@@ -138,6 +138,12 @@ struct intrframe {
int64_t if_ss;
 };
 
+/*
+ * Arguments to hardclock, softclock and statclock
+ * encapsulate the previous machine state in an opaque
+ * clockframe; for now, use generic intrframe.
+ */
+#define clockframe intrframe
 
 /*
  * The trampoline frame used on the kernel stack page which is present
Index: arm64/include/cpu.h
===
RCS file: /cvs/src/sys/arch/arm64/include/cpu.h,v
retrieving revision 1.27
diff -u -p -r1.27 cpu.h
--- arm64/include/cpu.h 13 Jul 2022 09:28:19 -  1.27
+++ arm64/include/cpu.h 19 Aug 2022 03:27:08 -
@@ -49,7 +49,6 @@
 
 /* All the CLKF_* macros 

Re: Fix a race in uvm_pseg_release()

2022-08-18 Thread Jonathan Gray
On Thu, Aug 18, 2022 at 12:39:58PM +0200, Martin Pieuchot wrote:
> The lock must be grabbed before iterating on the global array, ok?

ok jsg@

> 
> Index: uvm/uvm_pager.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_pager.c,v
> retrieving revision 1.88
> diff -u -p -r1.88 uvm_pager.c
> --- uvm/uvm_pager.c   15 Aug 2022 03:21:04 -  1.88
> +++ uvm/uvm_pager.c   18 Aug 2022 10:31:16 -
> @@ -209,6 +209,7 @@ uvm_pseg_release(vaddr_t segaddr)
>   struct uvm_pseg *pseg;
>   vaddr_t va = 0;
>  
> + mtx_enter(_pseg_lck);
>   for (pseg = [0]; pseg != [PSEG_NUMSEGS]; pseg++) {
>   if (pseg->start <= segaddr &&
>   segaddr < pseg->start + MAX_PAGER_SEGS * MAXBSIZE)
> @@ -222,7 +223,6 @@ uvm_pseg_release(vaddr_t segaddr)
>   /* test for no remainder */
>   KDASSERT(segaddr == pseg->start + id * MAXBSIZE);
>  
> - mtx_enter(_pseg_lck);
>  
>   KASSERT(UVM_PSEG_INUSE(pseg, id));
>  
> 
> 



Re: dhcpleased.8: add lease files to FILES

2022-08-18 Thread Jason McIntyre
On Thu, Aug 18, 2022 at 08:34:56PM +, Klemens Nanni wrote:
> On Thu, Aug 18, 2022 at 08:53:51PM +0100, Jason McIntyre wrote:
> > On Thu, Aug 18, 2022 at 07:29:42PM +, Klemens Nanni wrote:
> > > There is dhcpleasectl(8) -l but that only works for currently
> > > configured leases/interfaces and does not print all information
> > > contained in the lease file (mostly tailored to fit the installer's
> > > needs).
> > > 
> > > Feedback? OK?
> > > 
> > 
> > hi.
> > 
> > - in general i like this. i didn;t know about it, and am currently
> >   having to run dhcpleasectl to get the info (and generating a temp
> >   file so a script can read it!)
> > 
> > - when you say dhcpleasectl only works "for currently configured
> >   leases/interfaces" how does this differ? you mean it shows you the
> >   last lease, even if one is not currently in use?
> 
> Lease file stay behind until manually removed or overwritten with fresh
> leases, so you need to know what's what when parsing them:
>   $ ls /var/db/dhcpleased/
>   athn0cdce0em0  trunk0   urndis0
> 
> For example, there is no cdce0 or urndis0 right now on my box, those are
> from past tethering situations and those lease files are useless now.
> 
> You can always parse those files but it does not always make sense and
> nothing prevents you from dealing with old useless data.
> 
> dhcpleasectl on the other hand will check for autoconf set:
>   $ dhcpleasectl -l athn0
>   dhcpleasectl: non-autoconf interface athn0
>   $ dhcpleasectl -l trunk0
>   trunk0 [Bound]
>   inet 192.168.0.109 netmask 255.255.255.0
>   default gateway 192.168.0.1
>   nameservers 192.168.0.1
>   lease 1 hours
>   dhcp server 192.168.0.1
> 

right. i just wanted to check i understood what you were saying (and how
it worked).

> > 
> > - i prefer the wording "Interface specific lease files." but that's just
> >   a suggestion.
> 
> Sure, why not.
> 

well, as i said, a suggestion. ok either way.
jmc

ps would it make sense to say "Most recent interface specific lease
file." or is the distinction unhelpful? i suppose ypu might want to know
it's not neccessarily in use.

> 
> Index: dhcpleased.8
> ===
> RCS file: /cvs/src/sbin/dhcpleased/dhcpleased.8,v
> retrieving revision 1.4
> diff -u -p -r1.4 dhcpleased.8
> --- dhcpleased.8  23 Aug 2021 18:09:05 -  1.4
> +++ dhcpleased.8  18 Aug 2022 20:34:38 -
> @@ -72,7 +72,7 @@ Multiple
>  options increase the verbosity.
>  .El
>  .Sh FILES
> -.Bl -tag -width "/dev/dhcpleased.sockXX" -compact
> +.Bl -tag -width "/var/db/dhcpleased/" -compact
>  .It Pa /dev/dhcpleased.sock
>  .Ux Ns -domain
>  socket used for communication with
> @@ -81,6 +81,8 @@ socket used for communication with
>  Default
>  .Nm
>  configuration file.
> +.It Pa /var/db/dhcpleased/ Ns Aq Ar if
> +Interface specific lease files.
>  .El
>  .Sh SEE ALSO
>  .Xr dhcpleased.conf 5 ,
> 



Re: dhcpleased.8: add lease files to FILES

2022-08-18 Thread Klemens Nanni
On Thu, Aug 18, 2022 at 08:53:51PM +0100, Jason McIntyre wrote:
> On Thu, Aug 18, 2022 at 07:29:42PM +, Klemens Nanni wrote:
> > There is dhcpleasectl(8) -l but that only works for currently
> > configured leases/interfaces and does not print all information
> > contained in the lease file (mostly tailored to fit the installer's
> > needs).
> > 
> > Feedback? OK?
> > 
> 
> hi.
> 
> - in general i like this. i didn;t know about it, and am currently
>   having to run dhcpleasectl to get the info (and generating a temp
>   file so a script can read it!)
> 
> - when you say dhcpleasectl only works "for currently configured
>   leases/interfaces" how does this differ? you mean it shows you the
>   last lease, even if one is not currently in use?

Lease file stay behind until manually removed or overwritten with fresh
leases, so you need to know what's what when parsing them:
$ ls /var/db/dhcpleased/
athn0cdce0em0  trunk0   urndis0

For example, there is no cdce0 or urndis0 right now on my box, those are
from past tethering situations and those lease files are useless now.

You can always parse those files but it does not always make sense and
nothing prevents you from dealing with old useless data.

dhcpleasectl on the other hand will check for autoconf set:
$ dhcpleasectl -l athn0
dhcpleasectl: non-autoconf interface athn0
$ dhcpleasectl -l trunk0
trunk0 [Bound]
inet 192.168.0.109 netmask 255.255.255.0
default gateway 192.168.0.1
nameservers 192.168.0.1
lease 1 hours
dhcp server 192.168.0.1

> 
> - i prefer the wording "Interface specific lease files." but that's just
>   a suggestion.

Sure, why not.


Index: dhcpleased.8
===
RCS file: /cvs/src/sbin/dhcpleased/dhcpleased.8,v
retrieving revision 1.4
diff -u -p -r1.4 dhcpleased.8
--- dhcpleased.823 Aug 2021 18:09:05 -  1.4
+++ dhcpleased.818 Aug 2022 20:34:38 -
@@ -72,7 +72,7 @@ Multiple
 options increase the verbosity.
 .El
 .Sh FILES
-.Bl -tag -width "/dev/dhcpleased.sockXX" -compact
+.Bl -tag -width "/var/db/dhcpleased/" -compact
 .It Pa /dev/dhcpleased.sock
 .Ux Ns -domain
 socket used for communication with
@@ -81,6 +81,8 @@ socket used for communication with
 Default
 .Nm
 configuration file.
+.It Pa /var/db/dhcpleased/ Ns Aq Ar if
+Interface specific lease files.
 .El
 .Sh SEE ALSO
 .Xr dhcpleased.conf 5 ,



Re: dhcpleased.8: add lease files to FILES

2022-08-18 Thread Jason McIntyre
On Thu, Aug 18, 2022 at 07:29:42PM +, Klemens Nanni wrote:
> There is dhcpleasectl(8) -l but that only works for currently
> configured leases/interfaces and does not print all information
> contained in the lease file (mostly tailored to fit the installer's
> needs).
> 
> Feedback? OK?
> 

hi.

- in general i like this. i didn;t know about it, and am currently
  having to run dhcpleasectl to get the info (and generating a temp
  file so a script can read it!)

- when you say dhcpleasectl only works "for currently configured
  leases/interfaces" how does this differ? you mean it shows you the
  last lease, even if one is not currently in use?

- i prefer the wording "Interface specific lease files." but that's just
  a suggestion.

jmc

> Index: dhcpleased.8
> ===
> RCS file: /cvs/src/sbin/dhcpleased/dhcpleased.8,v
> retrieving revision 1.4
> diff -u -p -r1.4 dhcpleased.8
> --- dhcpleased.8  23 Aug 2021 18:09:05 -  1.4
> +++ dhcpleased.8  18 Aug 2022 19:24:20 -
> @@ -72,7 +72,7 @@ Multiple
>  options increase the verbosity.
>  .El
>  .Sh FILES
> -.Bl -tag -width "/dev/dhcpleased.sockXX" -compact
> +.Bl -tag -width "/var/db/dhcpleased/" -compact
>  .It Pa /dev/dhcpleased.sock
>  .Ux Ns -domain
>  socket used for communication with
> @@ -81,6 +81,8 @@ socket used for communication with
>  Default
>  .Nm
>  configuration file.
> +.It Pa /var/db/dhcpleased/ Ns Aq Ar if
> +Lease files per interface.
>  .El
>  .Sh SEE ALSO
>  .Xr dhcpleased.conf 5 ,
> 



Re: ifconfig, wireguard output less verbose, unless -A or

2022-08-18 Thread Hrvoje Popovski
On 14.7.2022. 11:37, Mikolaj Kucharski wrote:
> Hi,
> 
> Per other thread, Theo expressed dissatisfaction with long ifconfig(8)
> for wg(4) interface. Stuart Henderson pointed me at direction, which
> below diff makes it work.
> 
> I guess to questions are:
> 
> - Does the behaviour of ifconfig(8) make sense?
> - Does the code which makes above, make sense?
> 
> This is minimal diff, I would appreciate feedback, I did least
> resistance approach. Looking at diff -wu shows even less changes
> as wg_status() is mainly identation with if-statement.
> 
> 
> Short output by default, only 6 lines, no wgpeers section:
> 
> pce-0067# ifconfig.ifaliases -a | tail -n6
> wg0: flags=80c3 mtu 1420
> index 8 priority 0 llprio 3
> wgport 51820
> wgpubkey qcb...
> groups: wg
> inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64
> 
> 
> Long output with  as an argument, wgpeers section present:
> 
> pce-0067# ifconfig.ifaliases wg0
> wg0: flags=80c3 mtu 1420
> index 8 priority 0 llprio 3
> wgport 51820
> wgpubkey qcb...
> wgpeer klM...
> wgpsk (present)
> wgpka 25 (sec)
> wgendpoint xxx.xxx.xxx.xxx 51820
> tx: 178764, rx: 65100
> last handshake: 7 seconds ago
> wgaip fde4:f456:48c2:13c0::/64
> groups: wg
> inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64
> 
> 
> Above long output works with group as an argument (ifconfig wg) and with
> option -A (ifconfig -A), so I think from user experience perspective,
> works as expected.
> 
> Manual page changes not provided, as I'm not sure are they needed with
> this diff.
> 
> Comments?



Hi,

from my user perspective this is wonderful.
I have 16 wgpeers, 5 vlans, 5 carps, 4 physical interfaces and mostly if
doing ifconfig I want to know if physical interfaces are up and what is
status of carp. I don't need 100 lines of wg stuff in plain ifconifg.



dhcpleased.8: add lease files to FILES

2022-08-18 Thread Klemens Nanni
There is dhcpleasectl(8) -l but that only works for currently
configured leases/interfaces and does not print all information
contained in the lease file (mostly tailored to fit the installer's
needs).

Feedback? OK?

Index: dhcpleased.8
===
RCS file: /cvs/src/sbin/dhcpleased/dhcpleased.8,v
retrieving revision 1.4
diff -u -p -r1.4 dhcpleased.8
--- dhcpleased.823 Aug 2021 18:09:05 -  1.4
+++ dhcpleased.818 Aug 2022 19:24:20 -
@@ -72,7 +72,7 @@ Multiple
 options increase the verbosity.
 .El
 .Sh FILES
-.Bl -tag -width "/dev/dhcpleased.sockXX" -compact
+.Bl -tag -width "/var/db/dhcpleased/" -compact
 .It Pa /dev/dhcpleased.sock
 .Ux Ns -domain
 socket used for communication with
@@ -81,6 +81,8 @@ socket used for communication with
 Default
 .Nm
 configuration file.
+.It Pa /var/db/dhcpleased/ Ns Aq Ar if
+Lease files per interface.
 .El
 .Sh SEE ALSO
 .Xr dhcpleased.conf 5 ,



Re: installboot: clarify how -p only sets up the filesystem

2022-08-18 Thread Todd C . Miller
On Thu, 18 Aug 2022 18:53:41 -, Klemens Nanni wrote:

> root is now initalised before the verbose check so it is independent of
> -v usage and happens after prepare code, indicating that -p does not
> care about -r.

OK millert@

 - todd



Re: installboot: clarify how -p only sets up the filesystem

2022-08-18 Thread Klemens Nanni
On Thu, Aug 18, 2022 at 11:05:52AM -0600, Todd C. Miller wrote:
> On Thu, 18 Aug 2022 11:51:19 -, Klemens Nanni wrote:
> 
> > I've checked all usage combination of flags and arguments, the new code
> > does what the new synopsis says.
> 
> I agree with the general direction but have one concern.  See inline.
> 
>  - todd
> 
> > Index: installboot.c
> > ===
> > RCS file: /cvs/src/usr.sbin/installboot/installboot.c,v
> > retrieving revision 1.14
> > diff -u -p -r1.14 installboot.c
> > --- installboot.c   20 Jul 2021 14:51:56 -  1.14
> > +++ installboot.c   18 Aug 2022 11:42:43 -
> > @@ -31,17 +31,16 @@ int prepare;
> >  intstages;
> >  intverbose;
> >  
> > -char   *root = "/";
> > +char   *root;
> 
> If you don't initalize root here, a NULL pointer will be used later
> when the -v option is not also used.  You could also kill the useless
> strdup of optarg when the -r option is used.

Of course, thanks.

> 
> >  char   *stage1;
> >  char   *stage2;
> >  
> >  static __dead void
> >  usage(void)
> >  {
> > -   extern char *__progname;
> > -
> > -   fprintf(stderr, "usage: %s [-npv] [-r root] disk [stage1%s]\n",
> > -   __progname, (stages >= 2) ? " [stage2]" : "");
> > +   fprintf(stderr, "usage:\t%1$s [-nv] [-r root] disk [stage1%2$s]\n"
> > +   "\t%1$s [-nv] -p disk\n",
> > +   getprogname(), (stages >= 2) ? " [stage2]" : "");
> >  
> > exit(1);
> >  }
> > @@ -80,6 +79,8 @@ main(int argc, char **argv)
> >  
> > if (argc < 1 || argc > stages + 1)
> > usage();
> > +   if (prepare && (root != NULL || argc > 1))
> > +   usage();
> >  
> > dev = argv[0];
> > if (argc > 1)
> > @@ -87,9 +88,21 @@ main(int argc, char **argv)
> > if (argc > 2)
> > stage2 = argv[2];
> >  
> > +   if ((devfd = opendev(dev, (nowrite ? O_RDONLY : O_RDWR), OPENDEV_PART,
> > +   )) == -1)
> > +   err(1, "open: %s", realdev);
> > +
> > +   if (prepare) {
> > +   md_prepareboot(devfd, realdev);
> > +   return 0;
> > +   }
> > +
> > /* Prefix stages with root, unless they were user supplied. */
> > -   if (verbose)
> > +   if (verbose) {
> > +   if (root == NULL)
> > +   root = "/";
> > fprintf(stderr, "Using %s as root\n", root);
> > +   }
> > if (argc <= 1 && stage1 != NULL) {
> > stage1 = fileprefix(root, stage1);
> 
> With your diff root may be NULL when calling fileprefix().

root is now initalised before the verbose check so it is independent of
-v usage and happens after prepare code, indicating that -p does not
care about -r.

> 
> > if (stage1 == NULL)
> > @@ -101,10 +114,6 @@ main(int argc, char **argv)
> > exit(1);
> > }
> >  
> > -   if ((devfd = opendev(dev, (nowrite ? O_RDONLY : O_RDWR), OPENDEV_PART,
> > -   )) == -1)
> > -   err(1, "open: %s", realdev);
> > -
> >  if (verbose) {
> > fprintf(stderr, "%s bootstrap on %s\n",
> > (nowrite ? "would install" : "installing"), realdev);
> > @@ -118,11 +127,6 @@ main(int argc, char **argv)
> > }
> >  
> > md_loadboot();
> > -
> > -   if (prepare) {
> > -   md_prepareboot(devfd, realdev);
> > -   return 0;
> > -   }
> >  
> >  #ifdef SOFTRAID
> > sr_installboot(devfd, dev);
> >
> 


Index: installboot.8
===
RCS file: /cvs/src/usr.sbin/installboot/installboot.8,v
retrieving revision 1.5
diff -u -p -r1.5 installboot.8
--- installboot.8   20 Jul 2021 14:51:56 -  1.5
+++ installboot.8   18 Aug 2022 11:32:51 -
@@ -22,10 +22,14 @@
 .Nd install bootstrap on a disk
 .Sh SYNOPSIS
 .Nm installboot
-.Op Fl npv
+.Op Fl nv
 .Op Fl r Ar root
 .Ar disk
 .Op Ar stage1 Op Ar stage2
+.Nm
+.Op Fl nv
+.Fl p
+.Ar disk
 .Sh DESCRIPTION
 .Nm
 installs bootstrap on the specified disk.
@@ -38,7 +42,7 @@ the beginning of the disk.
 The options are as follows:
 .Bl -tag -width Ds
 .It Fl n
-Perform a dry run - do not actually write any bootstrap to the disk.
+Perform a dry run - do not actually write to the disk.
 .It Fl p
 Prepare filesystem.
 This will create a new filesystem on the partition reserved for the
Index: installboot.c
===
RCS file: /cvs/src/usr.sbin/installboot/installboot.c,v
retrieving revision 1.14
diff -u -p -r1.14 installboot.c
--- installboot.c   20 Jul 2021 14:51:56 -  1.14
+++ installboot.c   18 Aug 2022 18:51:45 -
@@ -31,17 +31,16 @@ int prepare;
 intstages;
 intverbose;
 
-char   *root = "/";
+char   *root;
 char   *stage1;
 char   *stage2;
 
 static __dead void
 usage(void)
 {
-   extern char *__progname;
-
-   fprintf(stderr, "usage: %s [-npv] [-r root] disk [stage1%s]\n",
-   __progname, (stages >= 2) ? " [stage2]" : "");
+

Re: Fix a race in uvm_pseg_release()

2022-08-18 Thread Mike Larkin
On Thu, Aug 18, 2022 at 12:39:58PM +0200, Martin Pieuchot wrote:
> The lock must be grabbed before iterating on the global array, ok?
>
> Index: uvm/uvm_pager.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_pager.c,v
> retrieving revision 1.88
> diff -u -p -r1.88 uvm_pager.c
> --- uvm/uvm_pager.c   15 Aug 2022 03:21:04 -  1.88
> +++ uvm/uvm_pager.c   18 Aug 2022 10:31:16 -
> @@ -209,6 +209,7 @@ uvm_pseg_release(vaddr_t segaddr)
>   struct uvm_pseg *pseg;
>   vaddr_t va = 0;
>
> + mtx_enter(_pseg_lck);
>   for (pseg = [0]; pseg != [PSEG_NUMSEGS]; pseg++) {
>   if (pseg->start <= segaddr &&
>   segaddr < pseg->start + MAX_PAGER_SEGS * MAXBSIZE)
> @@ -222,7 +223,6 @@ uvm_pseg_release(vaddr_t segaddr)
>   /* test for no remainder */
>   KDASSERT(segaddr == pseg->start + id * MAXBSIZE);
>
> - mtx_enter(_pseg_lck);
>
>   KASSERT(UVM_PSEG_INUSE(pseg, id));
>
>

ok mlarkin



Re: [RFC] acpi: add acpitimer_delay(), acpihpet_delay()

2022-08-18 Thread Mike Larkin
On Wed, Aug 17, 2022 at 09:00:12PM +1000, Jonathan Gray wrote:
> On Wed, Aug 17, 2022 at 04:53:20PM +1000, Jonathan Gray wrote:
> >
> > It seems to me it would be cleaner if the decision of what to use for
> > delay could be moved into an md file.
> >
> > Or abstract it by having a numeric weight like timecounters or driver
> > match return numbers.
>
> diff against your previous, does not change lapic_delay
>

I think the combination of diffs is a move in the right direction, so ok
mlarkin on these when ready.

-ml

> diff --git sys/arch/amd64/amd64/machdep.c sys/arch/amd64/amd64/machdep.c
> index 932b1dfeb47..c4645b6a6fd 100644
> --- sys/arch/amd64/amd64/machdep.c
> +++ sys/arch/amd64/amd64/machdep.c
> @@ -2069,3 +2069,13 @@ check_context(const struct reg *regs, struct trapframe 
> *tf)
>
>   return 0;
>  }
> +
> +void
> +delay_init(void(*f)(int), int v)
> +{
> + static int c = 0;
> + if (v > c) {
> + delay_func = f;
> + c = v;
> + }
> +}
> diff --git sys/arch/amd64/amd64/tsc.c sys/arch/amd64/amd64/tsc.c
> index fd38dc6359d..8c839357dd2 100644
> --- sys/arch/amd64/amd64/tsc.c
> +++ sys/arch/amd64/amd64/tsc.c
> @@ -109,7 +109,7 @@ tsc_identify(struct cpu_info *ci)
>
>   tsc_frequency = tsc_freq_cpuid(ci);
>   if (tsc_frequency > 0)
> - delay_func = tsc_delay;
> + delay_init(tsc_delay, 300);
>  }
>
>  static inline int
> diff --git sys/arch/amd64/include/cpu.h sys/arch/amd64/include/cpu.h
> index b8db48f2714..a82af172452 100644
> --- sys/arch/amd64/include/cpu.h
> +++ sys/arch/amd64/include/cpu.h
> @@ -359,6 +359,7 @@ void signotify(struct proc *);
>   * We need a machine-independent name for this.
>   */
>  extern void (*delay_func)(int);
> +void delay_init(void(*)(int), int);
>  struct timeval;
>
>  #define DELAY(x) (*delay_func)(x)
> diff --git sys/arch/i386/i386/machdep.c sys/arch/i386/i386/machdep.c
> index e4cb15b4dc1..7da5c26e240 100644
> --- sys/arch/i386/i386/machdep.c
> +++ sys/arch/i386/i386/machdep.c
> @@ -3996,3 +3996,13 @@ cpu_rnd_messybits(void)
>   nanotime();
>   return (ts.tv_nsec ^ (ts.tv_sec << 20));
>  }
> +
> +void
> +delay_init(void(*f)(int), int v)
> +{
> + static int c = 0;
> + if (v > c) {
> + delay_func = f;
> + c = v;
> + }
> +}
> diff --git sys/arch/i386/include/cpu.h sys/arch/i386/include/cpu.h
> index 5f300710562..211ee475678 100644
> --- sys/arch/i386/include/cpu.h
> +++ sys/arch/i386/include/cpu.h
> @@ -302,6 +302,7 @@ void signotify(struct proc *);
>   * We need a machine-independent name for this.
>   */
>  extern void (*delay_func)(int);
> +void delay_init(void(*)(int), int);
>  struct timeval;
>
>  #define  DELAY(x)(*delay_func)(x)
> diff --git sys/dev/acpi/acpihpet.c sys/dev/acpi/acpihpet.c
> index 6dc595e50ab..4332b4dbc0e 100644
> --- sys/dev/acpi/acpihpet.c
> +++ sys/dev/acpi/acpihpet.c
> @@ -27,8 +27,6 @@
>  #include 
>  #include 
>
> -#include "acpitimer.h"
> -
>  int acpihpet_attached;
>
>  int acpihpet_match(struct device *, void *, void *);
> @@ -270,15 +268,7 @@ acpihpet_attach(struct device *parent, struct device 
> *self, void *aux)
>   hpet_timecounter.tc_name = sc->sc_dev.dv_xname;
>   tc_init(_timecounter);
>
> -#if defined(__amd64__) || defined(__i386__)
> - if (delay_func == i8254_delay)
> - delay_func = acpihpet_delay;
> -#if NACPITIMER > 1
> - extern void acpitimer_delay(int);
> - if (delay_func == acpitimer_delay)
> - delay_func = acpihpet_delay;
> -#endif
> -#endif /* defined(__amd64__) || defined(__i386__) */
> + delay_init(acpihpet_delay, 200);
>
>  #if defined(__amd64__)
>   extern void cpu_recalibrate_tsc(struct timecounter *);
> diff --git sys/dev/acpi/acpitimer.c sys/dev/acpi/acpitimer.c
> index 0c4c7b71a01..e2757a40f3d 100644
> --- sys/dev/acpi/acpitimer.c
> +++ sys/dev/acpi/acpitimer.c
> @@ -103,10 +103,8 @@ acpitimerattach(struct device *parent, struct device 
> *self, void *aux)
>   acpi_timecounter.tc_name = sc->sc_dev.dv_xname;
>   tc_init(_timecounter);
>
> -#if defined(__amd64__) || defined(__i386__)
> - if (delay_func == i8254_delay)
> - delay_func = acpitimer_delay;
> -#endif
> + delay_init(acpitimer_delay, 100);
> +
>  #if defined(__amd64__)
>   extern void cpu_recalibrate_tsc(struct timecounter *);
>   cpu_recalibrate_tsc(_timecounter);
> diff --git sys/dev/acpi/files.acpi sys/dev/acpi/files.acpi
> index 8ec3ec2f8b3..f97eb6d4e3e 100644
> --- sys/dev/acpi/files.acpi
> +++ sys/dev/acpi/files.acpi
> @@ -13,7 +13,7 @@ filedev/acpi/acpidebug.cacpi & ddb
>  # ACPI timer
>  device   acpitimer
>  attach   acpitimer at acpi
> -file dev/acpi/acpitimer.cacpitimer needs-flag
> +file dev/acpi/acpitimer.cacpitimer
>
>  # AC device
>  device   acpiac
> diff --git sys/dev/pv/pvbus.c sys/dev/pv/pvbus.c
> index cbe543ac312..ee53afe2138 100644
> --- 

Re: installboot: clarify how -p only sets up the filesystem

2022-08-18 Thread Todd C . Miller
On Thu, 18 Aug 2022 11:51:19 -, Klemens Nanni wrote:

> I've checked all usage combination of flags and arguments, the new code
> does what the new synopsis says.

I agree with the general direction but have one concern.  See inline.

 - todd

> Index: installboot.c
> ===
> RCS file: /cvs/src/usr.sbin/installboot/installboot.c,v
> retrieving revision 1.14
> diff -u -p -r1.14 installboot.c
> --- installboot.c 20 Jul 2021 14:51:56 -  1.14
> +++ installboot.c 18 Aug 2022 11:42:43 -
> @@ -31,17 +31,16 @@ int   prepare;
>  int  stages;
>  int  verbose;
>  
> -char *root = "/";
> +char *root;

If you don't initalize root here, a NULL pointer will be used later
when the -v option is not also used.  You could also kill the useless
strdup of optarg when the -r option is used.

>  char *stage1;
>  char *stage2;
>  
>  static __dead void
>  usage(void)
>  {
> - extern char *__progname;
> -
> - fprintf(stderr, "usage: %s [-npv] [-r root] disk [stage1%s]\n",
> - __progname, (stages >= 2) ? " [stage2]" : "");
> + fprintf(stderr, "usage:\t%1$s [-nv] [-r root] disk [stage1%2$s]\n"
> + "\t%1$s [-nv] -p disk\n",
> + getprogname(), (stages >= 2) ? " [stage2]" : "");
>  
>   exit(1);
>  }
> @@ -80,6 +79,8 @@ main(int argc, char **argv)
>  
>   if (argc < 1 || argc > stages + 1)
>   usage();
> + if (prepare && (root != NULL || argc > 1))
> + usage();
>  
>   dev = argv[0];
>   if (argc > 1)
> @@ -87,9 +88,21 @@ main(int argc, char **argv)
>   if (argc > 2)
>   stage2 = argv[2];
>  
> + if ((devfd = opendev(dev, (nowrite ? O_RDONLY : O_RDWR), OPENDEV_PART,
> + )) == -1)
> + err(1, "open: %s", realdev);
> +
> + if (prepare) {
> + md_prepareboot(devfd, realdev);
> + return 0;
> + }
> +
>   /* Prefix stages with root, unless they were user supplied. */
> - if (verbose)
> + if (verbose) {
> + if (root == NULL)
> + root = "/";
>   fprintf(stderr, "Using %s as root\n", root);
> + }
>   if (argc <= 1 && stage1 != NULL) {
>   stage1 = fileprefix(root, stage1);

With your diff root may be NULL when calling fileprefix().

>   if (stage1 == NULL)
> @@ -101,10 +114,6 @@ main(int argc, char **argv)
>   exit(1);
>   }
>  
> - if ((devfd = opendev(dev, (nowrite ? O_RDONLY : O_RDWR), OPENDEV_PART,
> - )) == -1)
> - err(1, "open: %s", realdev);
> -
>  if (verbose) {
>   fprintf(stderr, "%s bootstrap on %s\n",
>   (nowrite ? "would install" : "installing"), realdev);
> @@ -118,11 +127,6 @@ main(int argc, char **argv)
>   }
>  
>   md_loadboot();
> -
> - if (prepare) {
> - md_prepareboot(devfd, realdev);
> - return 0;
> - }
>  
>  #ifdef SOFTRAID
>   sr_installboot(devfd, dev);
>



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-18 Thread Visa Hankala
On Thu, Aug 18, 2022 at 02:33:55PM +, Miod Vallat wrote:
> > After about 92 hours, one machine showed cp0_raise_calls=622486 and
> > another 695892. cp0_raise_miss was zero on both of them. On two other
> > machines I had forgotten to allow ddb access from console and could
> > not check the values.
> 
> Put kern.allowkmem=1 in /etc/sysctl.conf, and then you can do fetch the
> values with pstat -d.

Not without rebooting first. And I don't want to run the machines
with kern.allowkmem=1.



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-18 Thread Miod Vallat
> After about 92 hours, one machine showed cp0_raise_calls=622486 and
> another 695892. cp0_raise_miss was zero on both of them. On two other
> machines I had forgotten to allow ddb access from console and could
> not check the values.

Put kern.allowkmem=1 in /etc/sysctl.conf, and then you can do fetch the
values with pstat -d.



Re: bgpd, uninitalised check in kroute_insert()

2022-08-18 Thread Theo Buehler
On Thu, Aug 18, 2022 at 03:22:50PM +0200, Claudio Jeker wrote:
> Noticed while compling with gcc. In kroute_insert() the check for possible
> multipath routes is:
>   if (krm == NULL)
>   kr_redistribute(IMSG_NETWORK_ADD, kt, kf);
> 
> The problem is krm is only set in the IPv4 path but not in the IPv6 one.
> The diff below fixes this by using a new variable multipath which is set
> in the case where a multipath route is hit.

Ugh. Sorry for missing that. Yes, a new variable seems like the most
readable fix.

ok tb


> 
> -- 
> :wq Claudio
> 
> Index: kroute.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v
> retrieving revision 1.293
> diff -u -p -r1.293 kroute.c
> --- kroute.c  18 Aug 2022 12:14:00 -  1.293
> +++ kroute.c  18 Aug 2022 13:22:06 -
> @@ -1607,6 +1607,7 @@ kroute_insert(struct ktable *kt, struct 
>   struct kroute6  *kr6, *kr6m;
>   struct knexthop *n;
>   uint32_t mplslabel = 0;
> + int  multipath = 0;
>  
>   if (kf->prefix.aid == AID_VPN_IPv4 ||
>   kf->prefix.aid == AID_VPN_IPv6) {
> @@ -1648,6 +1649,7 @@ kroute_insert(struct ktable *kt, struct 
>   while (krm->next != NULL)
>   krm = krm->next;
>   krm->next = kr;
> + multipath = 1;
>   }
>  
>   if (kf->flags & F_BGPD)
> @@ -1684,6 +1686,7 @@ kroute_insert(struct ktable *kt, struct 
>   while (kr6m->next != NULL)
>   kr6m = kr6m->next;
>   kr6m->next = kr6;
> + multipath = 1;
>   }
>  
>   if (kf->flags & F_BGPD)
> @@ -1699,8 +1702,8 @@ kroute_insert(struct ktable *kt, struct 
>   kf->prefixlen) == 0)
>   knexthop_validate(kt, n);
>  
> - if (krm == NULL)
> - /* redistribute multipath routes only once */
> + /* redistribute multipath routes only once */
> + if (!multipath)
>   kr_redistribute(IMSG_NETWORK_ADD, kt, kf);
>   }
>  
> 



Re: mips64: trigger deferred timer interrupt from splx(9)

2022-08-18 Thread Visa Hankala
On Wed, Aug 17, 2022 at 11:42:50AM -0500, Scott Cheloha wrote:
> On Wed, Aug 17, 2022 at 01:30:29PM +, Visa Hankala wrote:
> > On Tue, Aug 09, 2022 at 09:54:02AM -0500, Scott Cheloha wrote:
> > > On Tue, Aug 09, 2022 at 02:03:31PM +, Visa Hankala wrote:
> > > > On Mon, Aug 08, 2022 at 02:52:37AM -0500, Scott Cheloha wrote:
> > > > > One thing I'm still uncertain about is how glxclk fits into the
> > > > > loongson picture.  It's an interrupt clock that runs hardclock() and
> > > > > statclock(), but the code doesn't do any logical masking, so I don't
> > > > > know whether or not I need to adjust anything in that code or account
> > > > > for it at all.  If there's no logical masking there's no deferral, so
> > > > > it would never call need to call md_triggerclock() from splx(9).
> > > > 
> > > > I think the masking of glxclk interrupts are handled by the ISA
> > > > interrupt code.
> > > 
> > > Do those machines not have Coprocessor 0?  If they do, why would you
> > > prefer glxclk over CP0?
> > > 
> > > > The patch misses md_triggerclock definition in mips64_machdep.c.
> > > 
> > > Whoops, forgot that file.  Fuller patch below.
> > > 
> > > > I have put this to the test on the mips64 ports builder machines.
> > 
> > The machines completed a build with this patch without problems.
> > I tested with the debug counters removed from cp0_trigger_int5().
> > 
> > OK visa@
> 
> Thank you for testing!
> 
> There was a loongson portion to this patch.  Is this OK on loongson or
> just octeon?

OK for both.

> Also, what did the debug counters look like when you yanked them?  If
> cp0_raise_miss was non-zero I will double the initial offset to 32
> cycles.

After about 92 hours, one machine showed cp0_raise_calls=622486 and
another 695892. cp0_raise_miss was zero on both of them. On two other
machines I had forgotten to allow ddb access from console and could
not check the values.

The current offset is good enough.



Re: pcidevs: PM991_NVME: add 980 name

2022-08-18 Thread Mark Kettenis
 My understanding is that for the Samsung SSDs there are internal part
numbers (PMxxx) and marketing names (SSD 980). So far we've used the
internal part numbers in pcidevs, partly because not all parts show up as
retails SSDs. Given that a 980 reference shows up in the firmware name
that we print. I'd say we should not extend the pcidevs string. Saves a
few bytes as well.

  Op 17-08-2022 12:34 schreef Klemens Nanni :

  SSD 980 is matched as PM991 but as far as I can tell from research,
  those are two different products with different product/model codes
  using the same PCI product id.
  nvme0 at pci4 dev 0 function 0 "Samsung PM991 NVMe" rev 0x00: msix,
  NVMe 1.4 nvme0: Samsung SSD 980 500GB, firmware 1B4QFXO7, serial
  S64DNX0RC12899Z scsibus3 at nvme0: 2 targets, initiator 0 sd2 at
  scsibus3 targ 1 lun 0: 
  Same for the PRO variant.
  Feedback? Objection? OK?

  Index: pcidevs
  ===
  RCS file: /cvs/src/sys/dev/pci/pcidevs,v retrieving revision 1.2001
  diff -u -p -r1.2001 pcidevs --- pcidevs 16 Aug 2022 09:28:45 -
  1.2001 +++ pcidevs 17 Aug 2022 10:25:36 - @@ -8334,8 +8334,8 @@
  product SAMSUNG2 SM951_AHCI 0xa801 SM951 product SAMSUNG2 SM951_NVME
  0xa802 SM951/PM951 NVMe product SAMSUNG2 SM961_NVME 0xa804
  SM961/PM961 NVMe product SAMSUNG2 SM981_NVME 0xa808 SM981/PM981 NVMe
  -product SAMSUNG2 PM991_NVME 0xa809 PM991 NVMe -product SAMSUNG2
  PM9A1_NVME 0xa80a PM9A1 NVMe +product SAMSUNG2 PM991_NVME 0xa809
  PM991/980 NVMe +product SAMSUNG2 PM9A1_NVME 0xa80a PM9A1/980PRO NVMe
  product SAMSUNG2 NVME_171X 0xa820 NVMe product SAMSUNG2 NVME_172X
  0xa821 NVMe product SAMSUNG2 NVME_172X_A_B 0xa822 NVMe


bgpd, uninitalised check in kroute_insert()

2022-08-18 Thread Claudio Jeker
Noticed while compling with gcc. In kroute_insert() the check for possible
multipath routes is:
if (krm == NULL)
kr_redistribute(IMSG_NETWORK_ADD, kt, kf);

The problem is krm is only set in the IPv4 path but not in the IPv6 one.
The diff below fixes this by using a new variable multipath which is set
in the case where a multipath route is hit.

-- 
:wq Claudio

Index: kroute.c
===
RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v
retrieving revision 1.293
diff -u -p -r1.293 kroute.c
--- kroute.c18 Aug 2022 12:14:00 -  1.293
+++ kroute.c18 Aug 2022 13:22:06 -
@@ -1607,6 +1607,7 @@ kroute_insert(struct ktable *kt, struct 
struct kroute6  *kr6, *kr6m;
struct knexthop *n;
uint32_t mplslabel = 0;
+   int  multipath = 0;
 
if (kf->prefix.aid == AID_VPN_IPv4 ||
kf->prefix.aid == AID_VPN_IPv6) {
@@ -1648,6 +1649,7 @@ kroute_insert(struct ktable *kt, struct 
while (krm->next != NULL)
krm = krm->next;
krm->next = kr;
+   multipath = 1;
}
 
if (kf->flags & F_BGPD)
@@ -1684,6 +1686,7 @@ kroute_insert(struct ktable *kt, struct 
while (kr6m->next != NULL)
kr6m = kr6m->next;
kr6m->next = kr6;
+   multipath = 1;
}
 
if (kf->flags & F_BGPD)
@@ -1699,8 +1702,8 @@ kroute_insert(struct ktable *kt, struct 
kf->prefixlen) == 0)
knexthop_validate(kt, n);
 
-   if (krm == NULL)
-   /* redistribute multipath routes only once */
+   /* redistribute multipath routes only once */
+   if (!multipath)
kr_redistribute(IMSG_NETWORK_ADD, kt, kf);
}
 



Re: rpki-client: disallow inherit in ROA EE IP Resources extension

2022-08-18 Thread Job Snijders
On Sat, Aug 13, 2022 at 04:51:05PM +0200, Theo Buehler wrote:
> job mentioned that it might be preferable to do the validation in
> parse_{roa,rsc,aspa}(). So here's a diff that does this. It reworks
> valid_{roa,rsc}() to compare only against the EE cert's resources
> since it doesn't really make sense to walk the auth chain for this
> anyway. That the EE cert's resources are covered by the auth chain is
> checked later as part of valid_x509().
> 
> Inheritance in the EE cert will now result in a warning and the roa/rsc
> won't be considered valid.

OK job@



installboot: clarify how -p only sets up the filesystem

2022-08-18 Thread Klemens Nanni
Contrary to the synopsis and verbose output, using
 -p  Prepare filesystem.  This will create a new filesystem on the
 partition reserved for the boot loader on architectures that
 require one.

does only that:

# installboot -nvp vnd0
Using / as root
would install bootstrap on /dev/rvnd0c
using first-stage /usr/mdec/biosboot, second-stage /usr/mdec/boot
would newfs 545c9bdf92aa18f9.i

No files under / are ever accessed, and no program would be installed on
the disk;  only the last line is actually true.

-p thus never cares about -r but the current code still always loads
unused files, the does the actual filesystem preparation before exiting
without doing anything with the loaded bootloader/stage files:

49 int
50 main(int argc, char **argv)
51 {
...
120 md_loadboot();
121 
122 if (prepare) {
123 md_prepareboot(devfd, realdev);
124 return 0;
125 }
126 
127 #ifdef SOFTRAID
128 sr_installboot(devfd, dev);
129 #else
130 md_installboot(devfd, realdev);
131 #endif
132 
133 return 0;
134 }


Hoist -p code to just do file preparation, thus stop printing false
information and fix the synopsis accordingly.

Initialise -r later only when needed such that the usage check for -p
can test whether -r was given without adding a new rflag or so.

All -p users live under /usr/src/distrib/ and exclusively call
installboot -p $_disk

followed later on by
installboot [anything but -p depending on platform] $_disk

I've checked all usage combination of flags and arguments, the new code
does what the new synopsis says.

Feedback? Objection? OK?


Index: installboot.8
===
RCS file: /cvs/src/usr.sbin/installboot/installboot.8,v
retrieving revision 1.5
diff -u -p -r1.5 installboot.8
--- installboot.8   20 Jul 2021 14:51:56 -  1.5
+++ installboot.8   18 Aug 2022 11:32:51 -
@@ -22,10 +22,14 @@
 .Nd install bootstrap on a disk
 .Sh SYNOPSIS
 .Nm installboot
-.Op Fl npv
+.Op Fl nv
 .Op Fl r Ar root
 .Ar disk
 .Op Ar stage1 Op Ar stage2
+.Nm
+.Op Fl nv
+.Fl p
+.Ar disk
 .Sh DESCRIPTION
 .Nm
 installs bootstrap on the specified disk.
@@ -38,7 +42,7 @@ the beginning of the disk.
 The options are as follows:
 .Bl -tag -width Ds
 .It Fl n
-Perform a dry run - do not actually write any bootstrap to the disk.
+Perform a dry run - do not actually write to the disk.
 .It Fl p
 Prepare filesystem.
 This will create a new filesystem on the partition reserved for the
Index: installboot.c
===
RCS file: /cvs/src/usr.sbin/installboot/installboot.c,v
retrieving revision 1.14
diff -u -p -r1.14 installboot.c
--- installboot.c   20 Jul 2021 14:51:56 -  1.14
+++ installboot.c   18 Aug 2022 11:42:43 -
@@ -31,17 +31,16 @@ int prepare;
 intstages;
 intverbose;
 
-char   *root = "/";
+char   *root;
 char   *stage1;
 char   *stage2;
 
 static __dead void
 usage(void)
 {
-   extern char *__progname;
-
-   fprintf(stderr, "usage: %s [-npv] [-r root] disk [stage1%s]\n",
-   __progname, (stages >= 2) ? " [stage2]" : "");
+   fprintf(stderr, "usage:\t%1$s [-nv] [-r root] disk [stage1%2$s]\n"
+   "\t%1$s [-nv] -p disk\n",
+   getprogname(), (stages >= 2) ? " [stage2]" : "");
 
exit(1);
 }
@@ -80,6 +79,8 @@ main(int argc, char **argv)
 
if (argc < 1 || argc > stages + 1)
usage();
+   if (prepare && (root != NULL || argc > 1))
+   usage();
 
dev = argv[0];
if (argc > 1)
@@ -87,9 +88,21 @@ main(int argc, char **argv)
if (argc > 2)
stage2 = argv[2];
 
+   if ((devfd = opendev(dev, (nowrite ? O_RDONLY : O_RDWR), OPENDEV_PART,
+   )) == -1)
+   err(1, "open: %s", realdev);
+
+   if (prepare) {
+   md_prepareboot(devfd, realdev);
+   return 0;
+   }
+
/* Prefix stages with root, unless they were user supplied. */
-   if (verbose)
+   if (verbose) {
+   if (root == NULL)
+   root = "/";
fprintf(stderr, "Using %s as root\n", root);
+   }
if (argc <= 1 && stage1 != NULL) {
stage1 = fileprefix(root, stage1);
if (stage1 == NULL)
@@ -101,10 +114,6 @@ main(int argc, char **argv)
exit(1);
}
 
-   if ((devfd = opendev(dev, (nowrite ? O_RDONLY : O_RDWR), OPENDEV_PART,
-   )) == -1)
-   err(1, "open: %s", realdev);
-
 if (verbose) {
fprintf(stderr, "%s bootstrap on %s\n",
(nowrite ? "would install" : "installing"), realdev);
@@ -118,11 +127,6 @@ main(int argc, char **argv)
}
 
md_loadboot();
-
-   if (prepare) {
-   

Re: bgpd more kroute cleanup

2022-08-18 Thread Theo Buehler
On Thu, Aug 18, 2022 at 12:44:07PM +0200, Claudio Jeker wrote:
> It makes no sense to pass the fd to send_rtmsg() as an argument.
> The code just passes the fd from the global kr_state. It also makes the
> code less portable because for linux an mnl handle needs to be passed.
> By dropping this the code becomes simpler.

Makes sense

ok



Simplify locking code in pdaemon

2022-08-18 Thread Martin Pieuchot
Use a "slock" variable as done in multiple places to simplify the code.
The locking stay the same.  This is just a first step to simplify this
mess.

Also get rid of the return value of the function, it is never checked.

ok?

Index: uvm/uvm_pdaemon.c
===
RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
retrieving revision 1.101
diff -u -p -r1.101 uvm_pdaemon.c
--- uvm/uvm_pdaemon.c   28 Jun 2022 19:31:30 -  1.101
+++ uvm/uvm_pdaemon.c   18 Aug 2022 10:44:52 -
@@ -102,7 +102,7 @@ extern void drmbackoff(long);
  */
 
 void   uvmpd_scan(struct uvm_pmalloc *);
-boolean_t  uvmpd_scan_inactive(struct uvm_pmalloc *, struct pglist *);
+void   uvmpd_scan_inactive(struct uvm_pmalloc *, struct pglist *);
 void   uvmpd_tune(void);
 void   uvmpd_drop(struct pglist *);
 
@@ -377,17 +377,16 @@ uvm_aiodone_daemon(void *arg)
  * => we handle the building of swap-backed clusters
  * => we return TRUE if we are exiting because we met our target
  */
-
-boolean_t
+void
 uvmpd_scan_inactive(struct uvm_pmalloc *pma, struct pglist *pglst)
 {
-   boolean_t retval = FALSE;   /* assume we haven't hit target */
int free, result;
struct vm_page *p, *nextpg;
struct uvm_object *uobj;
struct vm_page *pps[SWCLUSTPAGES], **ppsp;
int npages;
struct vm_page *swpps[SWCLUSTPAGES];/* XXX: see below */
+   struct rwlock *slock;
int swnpages, swcpages; /* XXX: see below */
int swslot;
struct vm_anon *anon;
@@ -402,7 +401,6 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
 */
swslot = 0;
swnpages = swcpages = 0;
-   free = 0;
dirtyreacts = 0;
p = NULL;
 
@@ -431,18 +429,14 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
 */
uobj = NULL;
anon = NULL;
-
if (p) {
/*
-* update our copy of "free" and see if we've met
-* our target
+* see if we've met our target
 */
free = uvmexp.free - BUFPAGES_DEFICIT;
if (((pma == NULL || (pma->pm_flags & UVM_PMA_FREED)) &&
(free + uvmexp.paging >= uvmexp.freetarg << 2)) ||
dirtyreacts == UVMPD_NUMDIRTYREACTS) {
-   retval = TRUE;
-
if (swslot == 0) {
/* exit now if no swap-i/o pending */
break;
@@ -450,9 +444,9 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
 
/* set p to null to signal final swap i/o */
p = NULL;
+   nextpg = NULL;
}
}
-
if (p) {/* if (we have a new page to consider) */
/*
 * we are below target and have a new page to consider.
@@ -460,11 +454,12 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
uvmexp.pdscans++;
nextpg = TAILQ_NEXT(p, pageq);
 
+   anon = p->uanon;
+   uobj = p->uobject;
if (p->pg_flags & PQ_ANON) {
-   anon = p->uanon;
KASSERT(anon != NULL);
-   if (rw_enter(anon->an_lock,
-   RW_WRITE|RW_NOSLEEP)) {
+   slock = anon->an_lock;
+   if (rw_enter(slock, RW_WRITE|RW_NOSLEEP)) {
/* lock failed, skip this page */
continue;
}
@@ -474,23 +469,20 @@ uvmpd_scan_inactive(struct uvm_pmalloc *
 */
if (pmap_is_referenced(p)) {
uvm_pageactivate(p);
-   rw_exit(anon->an_lock);
+   rw_exit(slock);
uvmexp.pdreact++;
continue;
}
if (p->pg_flags & PG_BUSY) {
-   rw_exit(anon->an_lock);
+   rw_exit(slock);
uvmexp.pdbusy++;
-   /* someone else owns page, skip it */
continue;
}
uvmexp.pdanscan++;
} else {
-   uobj = p->uobject;
  

bgpd more kroute cleanup

2022-08-18 Thread Claudio Jeker
It makes no sense to pass the fd to send_rtmsg() as an argument.
The code just passes the fd from the global kr_state. It also makes the
code less portable because for linux an mnl handle needs to be passed.
By dropping this the code becomes simpler.

-- 
:wq Claudio

Index: kroute.c
===
RCS file: /cvs/src/usr.sbin/bgpd/kroute.c,v
retrieving revision 1.292
diff -u -p -r1.292 kroute.c
--- kroute.c17 Aug 2022 15:15:26 -  1.292
+++ kroute.c18 Aug 2022 10:41:19 -
@@ -175,7 +175,7 @@ voidget_rtaddrs(int, struct sockaddr *
 void   if_change(u_short, int, struct if_data *);
 void   if_announce(void *);
 
-intsend_rtmsg(int, int, struct ktable *, struct kroute_full *);
+intsend_rtmsg(int, struct ktable *, struct kroute_full *);
 intdispatch_rtmsg(void);
 intfetchtable(struct ktable *);
 intfetchifs(int);
@@ -497,7 +497,7 @@ kr4_change(struct ktable *kt, struct kro
else
kr->flags &= ~F_REJECT;
 
-   if (send_rtmsg(kr_state.fd, RTM_CHANGE, kt, kf))
+   if (send_rtmsg(RTM_CHANGE, kt, kf))
kr->flags |= F_BGPD_INSERTED;
}
 
@@ -535,7 +535,7 @@ kr6_change(struct ktable *kt, struct kro
else
kr6->flags &= ~F_REJECT;
 
-   if (send_rtmsg(kr_state.fd, RTM_CHANGE, kt, kf))
+   if (send_rtmsg(RTM_CHANGE, kt, kf))
kr6->flags |= F_BGPD_INSERTED;
}
 
@@ -587,7 +587,7 @@ krVPN4_change(struct ktable *kt, struct 
else
kr->flags &= ~F_REJECT;
 
-   if (send_rtmsg(kr_state.fd, RTM_CHANGE, kt, kf))
+   if (send_rtmsg(RTM_CHANGE, kt, kf))
kr->flags |= F_BGPD_INSERTED;
}
 
@@ -640,7 +640,7 @@ krVPN6_change(struct ktable *kt, struct 
else
kr6->flags &= ~F_REJECT;
 
-   if (send_rtmsg(kr_state.fd, RTM_CHANGE, kt, kf))
+   if (send_rtmsg(RTM_CHANGE, kt, kf))
kr6->flags |= F_BGPD_INSERTED;
}
 
@@ -714,14 +714,12 @@ kr_fib_couple(u_int rtableid)
 
RB_FOREACH(kr, kroute_tree, >krt)
if (kr->flags & F_BGPD) {
-   if (send_rtmsg(kr_state.fd, RTM_ADD, kt,
-   kr_tofull(kr)))
+   if (send_rtmsg(RTM_ADD, kt, kr_tofull(kr)))
kr->flags |= F_BGPD_INSERTED;
}
RB_FOREACH(kr6, kroute6_tree, >krt6)
if (kr6->flags & F_BGPD) {
-   if (send_rtmsg(kr_state.fd, RTM_ADD, kt,
-   kr6_tofull(kr6)))
+   if (send_rtmsg(RTM_ADD, kt, kr6_tofull(kr6)))
kr6->flags |= F_BGPD_INSERTED;
}
log_info("kernel routing table %u (%s) coupled", kt->rtableid,
@@ -752,14 +750,12 @@ kr_fib_decouple(u_int rtableid)
 
RB_FOREACH(kr, kroute_tree, >krt)
if ((kr->flags & F_BGPD_INSERTED)) {
-   if (send_rtmsg(kr_state.fd, RTM_DELETE, kt,
-   kr_tofull(kr)))
+   if (send_rtmsg(RTM_DELETE, kt, kr_tofull(kr)))
kr->flags &= ~F_BGPD_INSERTED;
}
RB_FOREACH(kr6, kroute6_tree, >krt6)
if ((kr6->flags & F_BGPD_INSERTED)) {
-   if (send_rtmsg(kr_state.fd, RTM_DELETE, kt,
-   kr6_tofull(kr6)))
+   if (send_rtmsg(RTM_DELETE, kt, kr6_tofull(kr6)))
kr6->flags &= ~F_BGPD_INSERTED;
}
 
@@ -1655,7 +1651,7 @@ kroute_insert(struct ktable *kt, struct 
}
 
if (kf->flags & F_BGPD)
-   if (send_rtmsg(kr_state.fd, RTM_ADD, kt, kf))
+   if (send_rtmsg(RTM_ADD, kt, kf))
kr->flags |= F_BGPD_INSERTED;
break;
case AID_INET6:
@@ -1691,7 +1687,7 @@ kroute_insert(struct ktable *kt, struct 
}
 
if (kf->flags & F_BGPD)
-   if (send_rtmsg(kr_state.fd, RTM_ADD, kt, kf))
+   if (send_rtmsg(RTM_ADD, kt, kf))
kr6->flags |= F_BGPD_INSERTED;
break;
}
@@ -1873,7 +1869,7 @@ kroute_remove(struct ktable *kt, struct 
return (multipath + 1);
 
if (kf->flags & F_BGPD_INSERTED)
-   send_rtmsg(kr_state.fd, RTM_DELETE, kt, kf);
+   send_rtmsg(RTM_DELETE, kt, kf);
 
/* remove only once all multipath routes are gone */
if (!(kf->flags & F_BGPD) && !multipath)
@@ -2622,7 +2618,7 @@ get_mpe_config(const char *name, u_int *
 #define 

Fix a race in uvm_pseg_release()

2022-08-18 Thread Martin Pieuchot
The lock must be grabbed before iterating on the global array, ok?

Index: uvm/uvm_pager.c
===
RCS file: /cvs/src/sys/uvm/uvm_pager.c,v
retrieving revision 1.88
diff -u -p -r1.88 uvm_pager.c
--- uvm/uvm_pager.c 15 Aug 2022 03:21:04 -  1.88
+++ uvm/uvm_pager.c 18 Aug 2022 10:31:16 -
@@ -209,6 +209,7 @@ uvm_pseg_release(vaddr_t segaddr)
struct uvm_pseg *pseg;
vaddr_t va = 0;
 
+   mtx_enter(_pseg_lck);
for (pseg = [0]; pseg != [PSEG_NUMSEGS]; pseg++) {
if (pseg->start <= segaddr &&
segaddr < pseg->start + MAX_PAGER_SEGS * MAXBSIZE)
@@ -222,7 +223,6 @@ uvm_pseg_release(vaddr_t segaddr)
/* test for no remainder */
KDASSERT(segaddr == pseg->start + id * MAXBSIZE);
 
-   mtx_enter(_pseg_lck);
 
KASSERT(UVM_PSEG_INUSE(pseg, id));
 



Re: Race in disk_attach_callback?

2022-08-18 Thread Klemens Nanni
On Tue, Aug 16, 2022 at 09:03:55PM +, Miod Vallat wrote:
> > after a prompt from stsp@ and florian@, reporting that newfs_msdos
> > fails when given an $duid.i argument, I set down to see what could be
> > going on. My test using an USB stick failed to reprdouce the problem.
> > Then I started using a vnd, and that shows the issue (once in a
> > while). The feeling is that any disk devcied created on the fly might
> > show this issue.
> 
> Moments ago kn@ stumbled upon this as well.
> 
> It turns out that, at least in the vnd case, there is indeed a race
> between the scheduling of the task running disk_attach_callback and the
> first access to the vnd device, which will cause the label to get read.
> 
> While vnd(4) has logic to read a label on-demand, disk_attach_callback
> assumes it runs at a point where the label can safely be obtained, and
> will not retry (because it sets DKF_OPENED, which never gets unset).
> 
> The following shell script will demonstrate the issue:
> 
> cat << EOF > test.sh
> #! /bin/sh
> set -e
> 
> dd if=/dev/zero of=img.bin bs=1m count=4 >/dev/null
> iter=1
> while [ $iter -lt 1000 ]; do
>   vnconfig vnd0 img.bin
>   fdisk -iy vnd0 > /dev/null
>   disklabel -Aw vnd0 > /dev/null
>   duid=$(sysctl hw.disknames|sed 's,.*vnd0:,,')
>   disklabel $duid > /dev/null
>   vnconfig -u vnd0
>   iter=$((iter + 1))
> done
> EOF
> 
> (don't forget to vnconfig -u vnd0 if it fails).

Here's the improved version of a new regress test I initially wrote to
compare real disks with vnd and block devices with duids.

I ran into this issue when trying to test installboot(8) changes on
vnd(4) rather than real hardware first.

The new regress spins up a vnd disk, creates a label and accesses the
disk through both its disklabel and block device... former is broken.

This could easily be adapted or copied into initial installboot(8)
tests which I wanted to do anyway.

regress/sys/kern/disk seems specific and yet broad enough;  putting it
under lib/libutil/ due to opendev(3) makes little sense since that's
just the common entry point to what is actually a kernel bug and this
is not really vnd specific either as we're dealing with a
race condition/design flaw.

Feedback? Objections? OK?

Index: regress/sys/kern/disk/Makefile
===
RCS file: regress/sys/kern/disk/Makefile
diff -N regress/sys/kern/disk/Makefile
--- /dev/null   1 Jan 1970 00:00:00 -
+++ regress/sys/kern/disk/Makefile  18 Aug 2022 10:06:49 -
@@ -0,0 +1,56 @@
+#  $OpenBSD: $
+
+# anything calling opendev(3) with "sd0a" or "fc2aa4672193310c.a", e.g.
+# disklabel(8), newfs(8) or newfs_msdos(8)
+OPENDEV ?= /sbin/disklabel
+# write default MBR
+FORMAT ?=  /sbin/fdisk -iy
+# write standard disklabel to get a valid DUID
+NEWLABEL ?=/sbin/disklabel -Aw
+
+# whether to trace diskmap(4)'s DIOCMAP handling DUIDs/block devices
+DO_TRACE ?=No
+
+IMGFILE =  disk.img
+DEVFILE =  vnd.txt
+DUIDFILE = duid.txt
+PART = a
+
+
+REGRESS_SETUP_ONCE =   format-disk
+
+create-new-disk: create-new-disk
+   dd if=/dev/zero  of=${IMGFILE} bs=1m count=0 seek=32
+   ${SUDO} vnconfig -- ${IMGFILE} > ${DEVFILE}
+format-disk: create-new-disk
+   ${SUDO} ${FORMAT}   -- "$$(<${DEVFILE})"
+   ${SUDO} ${NEWLABEL} -- "$$(<${DEVFILE})"
+
+
+REGRESS_TARGETS =  opendev-use-block \
+   opendev-use-duid
+
+opendev-use-block:
+   ${SUDO} ${OPENDEV} -- "$$(<${DEVFILE})${PART}"
+
+opendev-use-duid:
+   ${SUDO} disklabel  -- "$$(<${DEVFILE})" | \
+   awk '$$1 == "duid:" { print $$2 }' > ${DUIDFILE}
+.if ${DO_TRACE:L} == "yes"
+   @# always run kdump regardless of newfs exit code
+   sh -uc '${SUDO} ktrace -- ${OPENDEV} -- "$$(<${DUIDFILE}).${PART}" ;\
+   ret=$$? ;\
+   ${SUDO} kdump -tcn | grep -FwA2 DIOCMAP ;\
+   exit $$ret'
+.else
+   ${SUDO} ${OPENDEV} -- "$$(<${DUIDFILE}).${PART}"
+.endif
+
+
+CLEANFILES =   ${IMGFILE} ${DEVFILE} ${DUIDFILE} ktrace.out
+REGRESS_CLEANUP =  cleanup
+
+cleanup:
+   ${SUDO} vnconfig -vu -- "$$(<${DEVFILE})"
+
+.include 



Re: Race in disk_attach_callback?

2022-08-18 Thread Klemens Nanni
On Wed, Aug 17, 2022 at 07:03:50AM +, Miod Vallat wrote:
> > What is the result if root runs disklabel, and forces it to all zeros?
> 
> If the root duid is all zeroes, then the only way to refer to the root
> disk is to use its /dev/{s,w}d* device name, as zero duids are ignored.

I like miod's second diff and it fixes the race for vnd(4).
I never ran into the issue with softraid(4), but that should not happen
anymore with it, either.

OK kn

> 
> If you set a zero duid in disklabel(8), setdisklabel() in the kernel
> will compute a new, non-zero value.

Correct;  same for real sd1 and fictitious vnd0.

# disklabel -E sd1
Label editor (enter '?' for help at any prompt)

sd1> i
The disklabel UID is currently: c766517084e5e5ce
duid: [] 

sd1*> l
# /dev/rsd1c:
type: SCSI
disk: SCSI disk
label: Block Device
duid: 
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 203
total sectors: 204800
boundstart: 32832
boundend: 204767
drivedata: 0

sd1*> w

sd1> l
# /dev/rsd1c:
type: SCSI
disk: SCSI disk
label: Block Device
duid: 9ff85059e4960901
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 203
total sectors: 204800
boundstart: 32832
boundend: 204767
drivedata: 0

sd1>