Re: sdmmc(4): sdmmc_io_function_enable(): don't sleep on lbolt

2020-12-16 Thread Scott Cheloha
> On Dec 16, 2020, at 18:40, Martin Pieuchot  wrote:
> 
> On 16/12/20(Wed) 23:23, Claudio Jeker wrote:
>>> On Wed, Dec 16, 2020 at 04:50:42PM -0300, Martin Pieuchot wrote:
>>> [...] 
>>> Why did we choose to use a variable over NULL?  Any technical reason?
>> 
>> The sleep subsytem requires a non-NULL value for ident. Changing this
>> seems not trivial.
> 
> I'd say this is an implementation detail, nothing prevent us to use a
> "private" ident value if NULL is passed to tsleep(9) :)
> 
>>> I'm wondering it the locality of the variable might not matter in a
>>> distant future.  Did you dig a bit deeper about the FreeBSD solution?
>>> Why did they choose a per-CPU value?
>> 
>> Currently all sleep channels are hashed into IIRC 128 buckets. If all
>> timeouts use the same sleep channel then this queue may get overcrowded.
>> I guess only instrumentation and measurements will tell us how bad the
>> sleep queue is hashed.
> 
> So using a global as sleep channel is not optimum?  Would it be better
> to use an address on the stack?  If so we could make sleep_setup() accept
> NULL and use 'sls' for example.

Yes, I think that scheme would dodge any issues with overuse of a
single global channel.  Plus, "tsleep_nsec(NULL, ...)" looks about
right.  Passing NULL as the wakeup channel to say "I don't want any
wakeup(9) broadcasts" is intuitive.  More intuitive than handrolling a
private channel on the stack or (as clever as I think it is) "deadchan".



WITNESS panic: acquiring blockable sleep lock with spinlock or critical section held (rwlock) kmmaplk

2020-12-16 Thread Greg Steuck
I just hit this while booting an i386-current in vmd. The source tree is
synced to "Remove the assertion in uvm_km_pgremove()."

I enabled WITNESS on top of GENERIC. Naturally, GENERIC-Dec15 snap works.

Anybody else see this so I know it's worth a bisect?

OpenBSD 6.8-current (WITNESS) #0: Tue Dec 15 22:48:11 PST 2020
g...@i386.nest.cx:/usr/src/sys/arch/i386/compile/WITNESS
real mem  = 4026015744 (3839MB)
avail mem = 3914719232 (3733MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: date 06/23/99, BIOS32 rev. 0 @ 0xfefc2, SMBIOS rev. 2.4 @ 
0xf3f10 (12 entries)
bios0: vendor SeaBIOS version "1.11.0p3-OpenBSD-vmm" date 01/01/2011
bios0: OpenBSD VMM
acpi at bios0 function 0x0 not configured
pcibios at bios0 function 0x1a not configured
bios0: ROM list: 0xef000/0x1000!
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz ("GenuineIntel" 686-class) 2.69 
GHz, 06-45-01
cpu0: 
FPU,V86,DE,PSE,TSC,MSR,PAE,CX8,SEP,PGE,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,MD_CLEAR,MELTDOWN
pvbus0 at mainbus0: OpenBSD
pvclock0 at pvbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
viornd0 at virtio0
virtio0: irq 3
virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Network" rev 0x00
vio0 at virtio1: address fe:e1:bb:d1:e2:94
virtio1: irq 5
virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Storage" rev 0x00
vioblk0 at virtio2
scsibus1 at vioblk0: 1 targets
sd0 at scsibus1 targ 0 lun 0: 
sd0: 30720MB, 512 bytes/sector, 62914560 sectors
virtio2: irq 6
virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00
vmmci0 at virtio3
virtio3: irq 7
isa0 at mainbus0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo
com0: console
pcdisplay0 at isa0 port 0x3d0/16 iomem 0xb8000/32768
wsdisplay0 at pcdisplay0 mux 1: console (80x25, vt100 emulation)
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
witness: lock_object uninitialized: 0xd0f3c828
Starting stack trace...
witness_checkorder(0,d6bb011c,d1155e6c,d02e10e4,90) at witness_checkorder+0x8a
witness_checkorder(d0f3c828,9,0) at witness_checkorder+0x8a
mtx_enter(d0f3c81c) at mtx_enter+0x27
pmap_extract_pae(d8bb0d80,f5605000,d8bb0da0) at pmap_extract_pae+0x53
pmap_pinit_pd_pae(d8bb0d80) at pmap_pinit_pd_pae+0x268
pmap_create(1,1000,f6fe5e86,d8bbfd54,d0f5ba18) at pmap_create+0xa8
uvmspace_fork(d0f5b5fc,d8bb3e34,d0f5b5fc,1,d1155f70) at uvmspace_fork+0x56
process_new(d8bb3e34,d0f5b5fc,1) at process_new+0xeb
fork1(d0eb7b14,1,d04eb560,0,0,d1155f90) at fork1+0x1ba
panic: acquiring blockable sleep lock with spinlock or critical section held 
(rwlock) kmmaplk
Stopped at  db_enter+0x4:   popl%ebp
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
* 0  0  0 0x1  0x2000  swapper
db_enter(d0e95e32,d1155b68,d0e95808,d0de7164,d0e25dc8) at db_enter+0x4
panic(d0bce365,d0bd4c59,d0bacff7,d0e21678,d0e2166c) at panic+0xd3
witness_checkorder(d0e95808,1,0) at witness_checkorder+0x86c
rw_enter_read(d0e95800,d1155bd8,d045c620,d0e957fc,d0b99f0c) at 
rw_enter_read+0x25
vm_map_lock_read_ln(d0e957fc,d0b99f0c,5fd) at vm_map_lock_read_ln+0x15
uvmfault_lookup(d1155c40,0) at uvmfault_lookup+0x90
uvm_fault_check(d1155c40,d1155c20,d1155c1c,1) at uvm_fault_check+0x14
uvm_fault(d0e957fc,d0201000,0,1,d0e957fc) at uvm_fault+0x9b
kpageflttrap(d1155d6c,d0201466,d0201466,,0) at kpageflttrap+0xe5
trap(d1155d6c) at trap+0x24b
calltrap(8,10046,4,4,d1155ddc) at calltrap+0xc
db_read_bytes(d0201466,4,d1155dc8) at db_read_bytes+0x20
db_get_value(d0201466,4,0) at db_get_value+0x32
db_stack_trace_print(d1155e3c,1,100,d0c43683,d07c7d10) at 
db_stack_trace_print+0x20d
ddb> show all locks
CPU 0:
exclusive mutex &pmap->pm_mtx r = 0 (0xd8bb0dcc)
Process 0 (swapper) thread 0xd0eb7b14 (0)
exclusive rwlock vmmaplk r = 0 (0xd0f5ba24)
exclusive mutex &pmap->pm_mtx r = 0 (0xd8bb0dcc)ddb> show proc
PROC (swapper) pid=0 stat=onproc
flags process=1 proc=200
pri=0, usrpri=50, nice=20
forw=0x0, list=0x0,0xd0f3b460
process=0xd0f5b5fc user=0xd1153000, vmspace=0xd0f5ba18
estcpu=0, cpticks=2, pctcpu=0.0
user=0, sys=2, intr=0
ddb> ps
   PID TID   PPIDUID  S   FLAGS  WAIT  COMMAND
*0   0 -1  0  7 0x10200swapper



Re: sdmmc(4): sdmmc_io_function_enable(): don't sleep on lbolt

2020-12-16 Thread Martin Pieuchot
On 16/12/20(Wed) 23:23, Claudio Jeker wrote:
> On Wed, Dec 16, 2020 at 04:50:42PM -0300, Martin Pieuchot wrote:
> > [...] 
> > Why did we choose to use a variable over NULL?  Any technical reason?
> 
> The sleep subsytem requires a non-NULL value for ident. Changing this
> seems not trivial.

I'd say this is an implementation detail, nothing prevent us to use a
"private" ident value if NULL is passed to tsleep(9) :)

> > I'm wondering it the locality of the variable might not matter in a
> > distant future.  Did you dig a bit deeper about the FreeBSD solution?
> > Why did they choose a per-CPU value?
> 
> Currently all sleep channels are hashed into IIRC 128 buckets. If all
> timeouts use the same sleep channel then this queue may get overcrowded.
> I guess only instrumentation and measurements will tell us how bad the
> sleep queue is hashed.

So using a global as sleep channel is not optimum?  Would it be better
to use an address on the stack?  If so we could make sleep_setup() accept
NULL and use 'sls' for example.



Re: tht(4): more tsleep(9) -> tsleep_nsec(9) conversions

2020-12-16 Thread David Gwynne
ok

> On 17 Dec 2020, at 04:22, Scott Cheloha  wrote:
> 
> On Thu, Dec 03, 2020 at 09:59:11PM -0600, Scott Cheloha wrote:
>> Hi,
>> 
>> tht(4) is another driver still using tsleep(9).
>> 
>> It uses it to spin while it waits for the card to load the firmware.
>> Then it uses it to spin for up to 2 seconds while waiting for
>> THT_REG_INIT_STATUS.
>> 
>> In the firmware case we can sleep for 10 milliseconds each iteration.
>> 
>> In the THT_REG_INIT_STATUS loop we can sleep for 10 milliseconds each
>> iteration again, but instead of using a timeout to set a flag after 2
>> seconds we can just count how many milliseconds we've slept.  This is
>> less precise than using the timeout but it is much simpler.  Obviously
>> we then need to remove all the timeout-related stuff from the function
>> and the file.
>> 
>> Thoughts?  ok?
> 
> Two week bump.
> 
> Index: if_tht.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_tht.c,v
> retrieving revision 1.142
> diff -u -p -r1.142 if_tht.c
> --- if_tht.c  10 Jul 2020 13:26:38 -  1.142
> +++ if_tht.c  4 Dec 2020 03:57:21 -
> @@ -582,7 +582,6 @@ void  tht_lladdr_read(struct 
> tht_softc 
> void  tht_lladdr_write(struct tht_softc *);
> int   tht_sw_reset(struct tht_softc *);
> int   tht_fw_load(struct tht_softc *);
> -void tht_fw_tick(void *arg);
> void  tht_link_state(struct tht_softc *);
> 
> /* interface operations */
> @@ -1667,11 +1666,9 @@ tht_sw_reset(struct tht_softc *sc)
> int
> tht_fw_load(struct tht_softc *sc)
> {
> - struct timeout  ticker;
> - volatile intok = 1;
>   u_int8_t*fw, *buf;
>   size_t  fwlen, wrlen;
> - int error = 1;
> + int error = 1, msecs, ret;
> 
>   if (loadfirmware("tht", &fw, &fwlen) != 0)
>   return (1);
> @@ -1682,7 +1679,9 @@ tht_fw_load(struct tht_softc *sc)
>   buf = fw;
>   while (fwlen > 0) {
>   while (tht_fifo_writable(sc, &sc->sc_txt) <= THT_FIFO_GAP) {
> - if (tsleep(sc, PCATCH, "thtfw", 1) == EINTR)
> + ret = tsleep_nsec(sc, PCATCH, "thtfw",
> + MSEC_TO_NSEC(10));
> + if (ret == EINTR)
>   goto err;
>   }
> 
> @@ -1695,32 +1694,21 @@ tht_fw_load(struct tht_softc *sc)
>   buf += wrlen;
>   }
> 
> - timeout_set(&ticker, tht_fw_tick, (void *)&ok);
> - timeout_add_sec(&ticker, 2);
> - while (ok) {
> + for (msecs = 0; msecs < 2000; msecs += 10) {
>   if (tht_read(sc, THT_REG_INIT_STATUS) != 0) {
>   error = 0;
>   break;
>   }
> -
> - if (tsleep(sc, PCATCH, "thtinit", 1) == EINTR)
> + ret = tsleep_nsec(sc, PCATCH, "thtinit", MSEC_TO_NSEC(10));
> + if (ret == EINTR)
>   goto err;
>   }
> - timeout_del(&ticker);
> 
>   tht_write(sc, THT_REG_INIT_SEMAPHORE, 0x1);
> 
> err:
>   free(fw, M_DEVBUF, fwlen);
>   return (error);
> -}
> -
> -void
> -tht_fw_tick(void *arg)
> -{
> - volatile int*ok = arg;
> -
> - *ok = 0;
> }
> 
> void



Re: nosuid more partitions that no longer have suid executables

2020-12-16 Thread Josh Rickmar
On Wed, Dec 16, 2020 at 03:40:41PM -0700, Theo de Raadt wrote:
> the term "suid" includes setgid executables... so your conclusions
> are way wrong.
> 
> Heck, you should give it a try.  Good luck logging in afterwards.

Indeed, please disregard.



Re: bgpd send side hold timer

2020-12-16 Thread Job Snijders
On Tue, Dec 15, 2020 at 05:02:19PM +0100, Claudio Jeker wrote:
> On Mon, Dec 14, 2020 at 06:22:09PM +, Job Snijders wrote:
> > This patch appears to be a very elegant solution to a thorny subtle
> > problem: what to do when a peer is not accepting new routing
> > information from you?
> 
> One thing I'm unsure about is the value of the SendHold timer. I reused
> the hold timer value with the assumption that for dead connections the
> regular hold timer expires before the SendHold timer (the send buffer
> needs to be full before send starts blocking).

Let's be conservative while being progressive! :-)

If the 'Send Hold Timer' value is moved from 'infinite' to 90 seconds
("The suggested default value for the HoldTime", RFC 4271), I think
we'll be able to see benefits.

> People should look out for cases where the SendHold timer triggered before
> either a NOTIFICATION form the peer arrived or where the SendHold timer
> triggered before the HoldTimer. Now that may be tricky since both SendHold
> and Hold timer trigger the same EVNT_TIMER_HOLDTIME event so they can not
> be distinguished easily.

Separation of the cases might be helpful.

> I think that the SendHold timer will almost never trigger and if it does
> only for the case where a session only works in one direction.

If it is rare, maybe it should be logged as a unique message:

"SendHoldTimer Expired".

Kind regards,

Job



Re: nosuid more partitions that no longer have suid executables

2020-12-16 Thread Theo de Raadt
the term "suid" includes setgid executables... so your conclusions
are way wrong.

Heck, you should give it a try.  Good luck logging in afterwards.

Josh Rickmar  wrote:

> Playing around with find ${dir} -type f -perm -04000, I see that there
> are no longer any more suid executables placed in /usr/libexec or
> /usr/X11R6, so I believe these can be removed from install.sub.  Based
> on the comments, this hasn't been touched in a long while.
> 
> diff bfe4739adb21458b6ca3fc690dff9c4b271f0330 /usr/src
> blob - 98d5313bcc8536593785d44cc15aa14faf4d92ee
> file + distrib/miniroot/install.sub
> --- distrib/miniroot/install.sub
> +++ distrib/miniroot/install.sub
> @@ -3021,14 +3021,11 @@ do_install() {
>   echo -n ",nodev"
>  
>   # The only directories that the install puts suid binaries into
> - # (as of 3.2) are:
> + # (as of 6.8) are:
>   #
>   # /sbin
>   # /usr/bin
>   # /usr/sbin
> - # /usr/libexec
> - # /usr/libexec/auth
> - # /usr/X11R6/bin
>   #
>   # and ports and users can do who knows what to /usr/local and
>   # sub directories thereof.
> @@ -3040,9 +3037,7 @@ do_install() {
>   case $_mp in
>   /sbin|/usr) ;;
>   /usr/bin|/usr/sbin) ;;
> - /usr/libexec|/usr/libexec/*);;
>   /usr/local|/usr/local/*);;
> - /usr/X11R6|/usr/X11R6/bin)  ;;
>   *)  echo -n ",nosuid"   ;;
>   esac
>   echo " 1 2"
> 



Re: sdmmc(4): sdmmc_io_function_enable(): don't sleep on lbolt

2020-12-16 Thread Claudio Jeker
On Wed, Dec 16, 2020 at 04:50:42PM -0300, Martin Pieuchot wrote:
> On 16/12/20(Wed) 12:50, Scott Cheloha wrote:
> > On Tue, Dec 15, 2020 at 01:47:24PM +0100, Mark Kettenis wrote:
> > > > Date: Tue, 15 Dec 2020 13:32:22 +0100
> > > > From: Claudio Jeker 
> > > > 
> > > > On Fri, Dec 11, 2020 at 07:07:56PM -0600, Scott Cheloha wrote:
> > > > > Hi,
> > > > > 
> > > > > I'd like to remove lbolt from the kernel.  I think having it in the
> > > > > kernel complicates otherwise simple code.
> > > > > 
> > > > > We can start with sdmmc(4).
> > > > > 
> > > > > The goal in sdmmc_io_function_enable() is calling 
> > > > > sdmmc_io_function_ready()
> > > > > up to six times and sleep 1 second between each attempt.  Here's 
> > > > > rewritten
> > > > > code that does with without lbolt.
> > > > > 
> > > > > ok?
> > > > > 
> > > > > Index: sdmmc_io.c
> > > > > ===
> > > > > RCS file: /cvs/src/sys/dev/sdmmc/sdmmc_io.c,v
> > > > > retrieving revision 1.41
> > > > > diff -u -p -r1.41 sdmmc_io.c
> > > > > --- sdmmc_io.c31 Dec 2019 10:05:33 -  1.41
> > > > > +++ sdmmc_io.c12 Dec 2020 01:04:59 -
> > > > > @@ -231,8 +231,8 @@ sdmmc_io_function_enable(struct sdmmc_fu
> > > > >  {
> > > > >   struct sdmmc_softc *sc = sf->sc;
> > > > >   struct sdmmc_function *sf0 = sc->sc_fn0;
> > > > > + int chan, retry = 5;
> > > > >   u_int8_t rv;
> > > > > - int retry = 5;
> > > > >  
> > > > >   rw_assert_wrlock(&sc->sc_lock);
> > > > >  
> > > > > @@ -244,7 +244,7 @@ sdmmc_io_function_enable(struct sdmmc_fu
> > > > >   sdmmc_io_write_1(sf0, SD_IO_CCCR_FN_ENABLE, rv);
> > > > >  
> > > > >   while (!sdmmc_io_function_ready(sf) && retry-- > 0)
> > > > > - tsleep_nsec(&lbolt, PPAUSE, "pause", INFSLP);
> > > > > + tsleep_nsec(&chan, PPAUSE, "pause", SEC_TO_NSEC(1));
> > > > >   return (retry >= 0) ? 0 : ETIMEDOUT;
> > > > >  }
> > > > >  
> > > > 
> > > > Why not use &retry as wait channel instead of adding a new variable
> > > > chan? Result is the same. Would it make sense to allow NULL as wait
> > > > channel to make the tsleep not wakeable. At least that could be used in 
> > > > a
> > > > few places where timeouts are implemented with tsleep and would make the
> > > > intent more obvious.
> > > 
> > > Or have an appropriately named global variable?  Something like "int 
> > > nowake"?
> > 
> > Something like the attached patch?
> > 
> > I think the idea of a "dead channel" communicates the intent.  Nobody
> > broadcasts wakeups on the dead channel.  If you don't want to receive
> > wakeup broadcasts you sleep on the dead channel.  Hence, "deadchan".
> 
> Why did we choose to use a variable over NULL?  Any technical reason?

The sleep subsytem requires a non-NULL value for ident. Changing this
seems not trivial.
 
> I'm wondering it the locality of the variable might not matter in a
> distant future.  Did you dig a bit deeper about the FreeBSD solution?
> Why did they choose a per-CPU value?

Currently all sleep channels are hashed into IIRC 128 buckets. If all
timeouts use the same sleep channel then this queue may get overcrowded.
I guess only instrumentation and measurements will tell us how bad the
sleep queue is hashed.

> > Index: kern/kern_synch.c
> > ===
> > RCS file: /cvs/src/sys/kern/kern_synch.c,v
> > retrieving revision 1.172
> > diff -u -p -r1.172 kern_synch.c
> > --- kern/kern_synch.c   7 Dec 2020 16:55:29 -   1.172
> > +++ kern/kern_synch.c   16 Dec 2020 18:50:12 -
> > @@ -87,6 +87,12 @@ sleep_queue_init(void)
> > TAILQ_INIT(&slpque[i]);
> >  }
> >  
> > +/*
> > + * Threads that do not want to receive wakeup(9) broadcasts should
> > + * sleep on deadchan.
> > + */
> > +static int __deadchan;
> > +int *deadchan = &__deadchan;
> >  
> >  /*
> >   * During autoconfiguration or after a panic, a sleep will simply
> > Index: sys/systm.h
> > ===
> > RCS file: /cvs/src/sys/sys/systm.h,v
> > retrieving revision 1.148
> > diff -u -p -r1.148 systm.h
> > --- sys/systm.h 26 Aug 2020 03:29:07 -  1.148
> > +++ sys/systm.h 16 Dec 2020 18:50:12 -
> > @@ -107,6 +107,8 @@ extern struct vnode *rootvp;/* vnode eq
> >  extern dev_t swapdev;  /* swapping device */
> >  extern struct vnode *swapdev_vp;/* vnode equivalent to above */
> >  
> > +extern int *deadchan;  /* dead wakeup(9) channel */
> > +
> >  struct proc;
> >  struct process;
> >  #define curproc curcpu()->ci_curproc
> > Index: dev/sdmmc/sdmmc_io.c
> > ===
> > RCS file: /cvs/src/sys/dev/sdmmc/sdmmc_io.c,v
> > retrieving revision 1.41
> > diff -u -p -r1.41 sdmmc_io.c
> > --- dev/sdmmc/sdmmc_io.c31 Dec 2019 10:05:33 -  1.41
> > +++ dev/sdmmc/sdmmc_io.c16

nosuid more partitions that no longer have suid executables

2020-12-16 Thread Josh Rickmar
Playing around with find ${dir} -type f -perm -04000, I see that there
are no longer any more suid executables placed in /usr/libexec or
/usr/X11R6, so I believe these can be removed from install.sub.  Based
on the comments, this hasn't been touched in a long while.

diff bfe4739adb21458b6ca3fc690dff9c4b271f0330 /usr/src
blob - 98d5313bcc8536593785d44cc15aa14faf4d92ee
file + distrib/miniroot/install.sub
--- distrib/miniroot/install.sub
+++ distrib/miniroot/install.sub
@@ -3021,14 +3021,11 @@ do_install() {
echo -n ",nodev"
 
# The only directories that the install puts suid binaries into
-   # (as of 3.2) are:
+   # (as of 6.8) are:
#
# /sbin
# /usr/bin
# /usr/sbin
-   # /usr/libexec
-   # /usr/libexec/auth
-   # /usr/X11R6/bin
#
# and ports and users can do who knows what to /usr/local and
# sub directories thereof.
@@ -3040,9 +3037,7 @@ do_install() {
case $_mp in
/sbin|/usr) ;;
/usr/bin|/usr/sbin) ;;
-   /usr/libexec|/usr/libexec/*);;
/usr/local|/usr/local/*);;
-   /usr/X11R6|/usr/X11R6/bin)  ;;
*)  echo -n ",nosuid"   ;;
esac
echo " 1 2"



Re: sdmmc(4): sdmmc_io_function_enable(): don't sleep on lbolt

2020-12-16 Thread Mark Kettenis
> Date: Wed, 16 Dec 2020 12:50:46 -0600
> From: Scott Cheloha 
> 
> On Tue, Dec 15, 2020 at 01:47:24PM +0100, Mark Kettenis wrote:
> > > Date: Tue, 15 Dec 2020 13:32:22 +0100
> > > From: Claudio Jeker 
> > > 
> > > On Fri, Dec 11, 2020 at 07:07:56PM -0600, Scott Cheloha wrote:
> > > > Hi,
> > > > 
> > > > I'd like to remove lbolt from the kernel.  I think having it in the
> > > > kernel complicates otherwise simple code.
> > > > 
> > > > We can start with sdmmc(4).
> > > > 
> > > > The goal in sdmmc_io_function_enable() is calling 
> > > > sdmmc_io_function_ready()
> > > > up to six times and sleep 1 second between each attempt.  Here's 
> > > > rewritten
> > > > code that does with without lbolt.
> > > > 
> > > > ok?
> > > > 
> > > > Index: sdmmc_io.c
> > > > ===
> > > > RCS file: /cvs/src/sys/dev/sdmmc/sdmmc_io.c,v
> > > > retrieving revision 1.41
> > > > diff -u -p -r1.41 sdmmc_io.c
> > > > --- sdmmc_io.c  31 Dec 2019 10:05:33 -  1.41
> > > > +++ sdmmc_io.c  12 Dec 2020 01:04:59 -
> > > > @@ -231,8 +231,8 @@ sdmmc_io_function_enable(struct sdmmc_fu
> > > >  {
> > > > struct sdmmc_softc *sc = sf->sc;
> > > > struct sdmmc_function *sf0 = sc->sc_fn0;
> > > > +   int chan, retry = 5;
> > > > u_int8_t rv;
> > > > -   int retry = 5;
> > > >  
> > > > rw_assert_wrlock(&sc->sc_lock);
> > > >  
> > > > @@ -244,7 +244,7 @@ sdmmc_io_function_enable(struct sdmmc_fu
> > > > sdmmc_io_write_1(sf0, SD_IO_CCCR_FN_ENABLE, rv);
> > > >  
> > > > while (!sdmmc_io_function_ready(sf) && retry-- > 0)
> > > > -   tsleep_nsec(&lbolt, PPAUSE, "pause", INFSLP);
> > > > +   tsleep_nsec(&chan, PPAUSE, "pause", SEC_TO_NSEC(1));
> > > > return (retry >= 0) ? 0 : ETIMEDOUT;
> > > >  }
> > > >  
> > > 
> > > Why not use &retry as wait channel instead of adding a new variable
> > > chan? Result is the same. Would it make sense to allow NULL as wait
> > > channel to make the tsleep not wakeable. At least that could be used in a
> > > few places where timeouts are implemented with tsleep and would make the
> > > intent more obvious.
> > 
> > Or have an appropriately named global variable?  Something like "int 
> > nowake"?
> 
> Something like the attached patch?
> 
> I think the idea of a "dead channel" communicates the intent.  Nobody
> broadcasts wakeups on the dead channel.  If you don't want to receive
> wakeup broadcasts you sleep on the dead channel.  Hence, "deadchan".

Yeah, that's a reasonable name.  Not sure if we need the indirection
though.  Using &deadchan isn't an enormous burden I'd say.

> Index: kern/kern_synch.c
> ===
> RCS file: /cvs/src/sys/kern/kern_synch.c,v
> retrieving revision 1.172
> diff -u -p -r1.172 kern_synch.c
> --- kern/kern_synch.c 7 Dec 2020 16:55:29 -   1.172
> +++ kern/kern_synch.c 16 Dec 2020 18:50:12 -
> @@ -87,6 +87,12 @@ sleep_queue_init(void)
>   TAILQ_INIT(&slpque[i]);
>  }
>  
> +/*
> + * Threads that do not want to receive wakeup(9) broadcasts should
> + * sleep on deadchan.
> + */
> +static int __deadchan;
> +int *deadchan = &__deadchan;
>  
>  /*
>   * During autoconfiguration or after a panic, a sleep will simply
> Index: sys/systm.h
> ===
> RCS file: /cvs/src/sys/sys/systm.h,v
> retrieving revision 1.148
> diff -u -p -r1.148 systm.h
> --- sys/systm.h   26 Aug 2020 03:29:07 -  1.148
> +++ sys/systm.h   16 Dec 2020 18:50:12 -
> @@ -107,6 +107,8 @@ extern struct vnode *rootvp;  /* vnode eq
>  extern dev_t swapdev;/* swapping device */
>  extern struct vnode *swapdev_vp;/* vnode equivalent to above */
>  
> +extern int *deadchan;/* dead wakeup(9) channel */
> +
>  struct proc;
>  struct process;
>  #define curproc curcpu()->ci_curproc
> Index: dev/sdmmc/sdmmc_io.c
> ===
> RCS file: /cvs/src/sys/dev/sdmmc/sdmmc_io.c,v
> retrieving revision 1.41
> diff -u -p -r1.41 sdmmc_io.c
> --- dev/sdmmc/sdmmc_io.c  31 Dec 2019 10:05:33 -  1.41
> +++ dev/sdmmc/sdmmc_io.c  16 Dec 2020 18:50:12 -
> @@ -244,7 +244,7 @@ sdmmc_io_function_enable(struct sdmmc_fu
>   sdmmc_io_write_1(sf0, SD_IO_CCCR_FN_ENABLE, rv);
>  
>   while (!sdmmc_io_function_ready(sf) && retry-- > 0)
> - tsleep_nsec(&lbolt, PPAUSE, "pause", INFSLP);
> + tsleep_nsec(deadchan, PPAUSE, "pause", SEC_TO_NSEC(1));
>   return (retry >= 0) ? 0 : ETIMEDOUT;
>  }
>  
> 



Re: sdmmc(4): sdmmc_io_function_enable(): don't sleep on lbolt

2020-12-16 Thread Martin Pieuchot
On 16/12/20(Wed) 12:50, Scott Cheloha wrote:
> On Tue, Dec 15, 2020 at 01:47:24PM +0100, Mark Kettenis wrote:
> > > Date: Tue, 15 Dec 2020 13:32:22 +0100
> > > From: Claudio Jeker 
> > > 
> > > On Fri, Dec 11, 2020 at 07:07:56PM -0600, Scott Cheloha wrote:
> > > > Hi,
> > > > 
> > > > I'd like to remove lbolt from the kernel.  I think having it in the
> > > > kernel complicates otherwise simple code.
> > > > 
> > > > We can start with sdmmc(4).
> > > > 
> > > > The goal in sdmmc_io_function_enable() is calling 
> > > > sdmmc_io_function_ready()
> > > > up to six times and sleep 1 second between each attempt.  Here's 
> > > > rewritten
> > > > code that does with without lbolt.
> > > > 
> > > > ok?
> > > > 
> > > > Index: sdmmc_io.c
> > > > ===
> > > > RCS file: /cvs/src/sys/dev/sdmmc/sdmmc_io.c,v
> > > > retrieving revision 1.41
> > > > diff -u -p -r1.41 sdmmc_io.c
> > > > --- sdmmc_io.c  31 Dec 2019 10:05:33 -  1.41
> > > > +++ sdmmc_io.c  12 Dec 2020 01:04:59 -
> > > > @@ -231,8 +231,8 @@ sdmmc_io_function_enable(struct sdmmc_fu
> > > >  {
> > > > struct sdmmc_softc *sc = sf->sc;
> > > > struct sdmmc_function *sf0 = sc->sc_fn0;
> > > > +   int chan, retry = 5;
> > > > u_int8_t rv;
> > > > -   int retry = 5;
> > > >  
> > > > rw_assert_wrlock(&sc->sc_lock);
> > > >  
> > > > @@ -244,7 +244,7 @@ sdmmc_io_function_enable(struct sdmmc_fu
> > > > sdmmc_io_write_1(sf0, SD_IO_CCCR_FN_ENABLE, rv);
> > > >  
> > > > while (!sdmmc_io_function_ready(sf) && retry-- > 0)
> > > > -   tsleep_nsec(&lbolt, PPAUSE, "pause", INFSLP);
> > > > +   tsleep_nsec(&chan, PPAUSE, "pause", SEC_TO_NSEC(1));
> > > > return (retry >= 0) ? 0 : ETIMEDOUT;
> > > >  }
> > > >  
> > > 
> > > Why not use &retry as wait channel instead of adding a new variable
> > > chan? Result is the same. Would it make sense to allow NULL as wait
> > > channel to make the tsleep not wakeable. At least that could be used in a
> > > few places where timeouts are implemented with tsleep and would make the
> > > intent more obvious.
> > 
> > Or have an appropriately named global variable?  Something like "int 
> > nowake"?
> 
> Something like the attached patch?
> 
> I think the idea of a "dead channel" communicates the intent.  Nobody
> broadcasts wakeups on the dead channel.  If you don't want to receive
> wakeup broadcasts you sleep on the dead channel.  Hence, "deadchan".

Why did we choose to use a variable over NULL?  Any technical reason?

I'm wondering it the locality of the variable might not matter in a
distant future.  Did you dig a bit deeper about the FreeBSD solution?
Why did they choose a per-CPU value?

> Index: kern/kern_synch.c
> ===
> RCS file: /cvs/src/sys/kern/kern_synch.c,v
> retrieving revision 1.172
> diff -u -p -r1.172 kern_synch.c
> --- kern/kern_synch.c 7 Dec 2020 16:55:29 -   1.172
> +++ kern/kern_synch.c 16 Dec 2020 18:50:12 -
> @@ -87,6 +87,12 @@ sleep_queue_init(void)
>   TAILQ_INIT(&slpque[i]);
>  }
>  
> +/*
> + * Threads that do not want to receive wakeup(9) broadcasts should
> + * sleep on deadchan.
> + */
> +static int __deadchan;
> +int *deadchan = &__deadchan;
>  
>  /*
>   * During autoconfiguration or after a panic, a sleep will simply
> Index: sys/systm.h
> ===
> RCS file: /cvs/src/sys/sys/systm.h,v
> retrieving revision 1.148
> diff -u -p -r1.148 systm.h
> --- sys/systm.h   26 Aug 2020 03:29:07 -  1.148
> +++ sys/systm.h   16 Dec 2020 18:50:12 -
> @@ -107,6 +107,8 @@ extern struct vnode *rootvp;  /* vnode eq
>  extern dev_t swapdev;/* swapping device */
>  extern struct vnode *swapdev_vp;/* vnode equivalent to above */
>  
> +extern int *deadchan;/* dead wakeup(9) channel */
> +
>  struct proc;
>  struct process;
>  #define curproc curcpu()->ci_curproc
> Index: dev/sdmmc/sdmmc_io.c
> ===
> RCS file: /cvs/src/sys/dev/sdmmc/sdmmc_io.c,v
> retrieving revision 1.41
> diff -u -p -r1.41 sdmmc_io.c
> --- dev/sdmmc/sdmmc_io.c  31 Dec 2019 10:05:33 -  1.41
> +++ dev/sdmmc/sdmmc_io.c  16 Dec 2020 18:50:12 -
> @@ -244,7 +244,7 @@ sdmmc_io_function_enable(struct sdmmc_fu
>   sdmmc_io_write_1(sf0, SD_IO_CCCR_FN_ENABLE, rv);
>  
>   while (!sdmmc_io_function_ready(sf) && retry-- > 0)
> - tsleep_nsec(&lbolt, PPAUSE, "pause", INFSLP);
> + tsleep_nsec(deadchan, PPAUSE, "pause", SEC_TO_NSEC(1));
>   return (retry >= 0) ? 0 : ETIMEDOUT;
>  }
>  
> 



Re: sdmmc(4): sdmmc_io_function_enable(): don't sleep on lbolt

2020-12-16 Thread Scott Cheloha
On Tue, Dec 15, 2020 at 01:47:24PM +0100, Mark Kettenis wrote:
> > Date: Tue, 15 Dec 2020 13:32:22 +0100
> > From: Claudio Jeker 
> > 
> > On Fri, Dec 11, 2020 at 07:07:56PM -0600, Scott Cheloha wrote:
> > > Hi,
> > > 
> > > I'd like to remove lbolt from the kernel.  I think having it in the
> > > kernel complicates otherwise simple code.
> > > 
> > > We can start with sdmmc(4).
> > > 
> > > The goal in sdmmc_io_function_enable() is calling 
> > > sdmmc_io_function_ready()
> > > up to six times and sleep 1 second between each attempt.  Here's rewritten
> > > code that does with without lbolt.
> > > 
> > > ok?
> > > 
> > > Index: sdmmc_io.c
> > > ===
> > > RCS file: /cvs/src/sys/dev/sdmmc/sdmmc_io.c,v
> > > retrieving revision 1.41
> > > diff -u -p -r1.41 sdmmc_io.c
> > > --- sdmmc_io.c31 Dec 2019 10:05:33 -  1.41
> > > +++ sdmmc_io.c12 Dec 2020 01:04:59 -
> > > @@ -231,8 +231,8 @@ sdmmc_io_function_enable(struct sdmmc_fu
> > >  {
> > >   struct sdmmc_softc *sc = sf->sc;
> > >   struct sdmmc_function *sf0 = sc->sc_fn0;
> > > + int chan, retry = 5;
> > >   u_int8_t rv;
> > > - int retry = 5;
> > >  
> > >   rw_assert_wrlock(&sc->sc_lock);
> > >  
> > > @@ -244,7 +244,7 @@ sdmmc_io_function_enable(struct sdmmc_fu
> > >   sdmmc_io_write_1(sf0, SD_IO_CCCR_FN_ENABLE, rv);
> > >  
> > >   while (!sdmmc_io_function_ready(sf) && retry-- > 0)
> > > - tsleep_nsec(&lbolt, PPAUSE, "pause", INFSLP);
> > > + tsleep_nsec(&chan, PPAUSE, "pause", SEC_TO_NSEC(1));
> > >   return (retry >= 0) ? 0 : ETIMEDOUT;
> > >  }
> > >  
> > 
> > Why not use &retry as wait channel instead of adding a new variable
> > chan? Result is the same. Would it make sense to allow NULL as wait
> > channel to make the tsleep not wakeable. At least that could be used in a
> > few places where timeouts are implemented with tsleep and would make the
> > intent more obvious.
> 
> Or have an appropriately named global variable?  Something like "int nowake"?

Something like the attached patch?

I think the idea of a "dead channel" communicates the intent.  Nobody
broadcasts wakeups on the dead channel.  If you don't want to receive
wakeup broadcasts you sleep on the dead channel.  Hence, "deadchan".

Index: kern/kern_synch.c
===
RCS file: /cvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.172
diff -u -p -r1.172 kern_synch.c
--- kern/kern_synch.c   7 Dec 2020 16:55:29 -   1.172
+++ kern/kern_synch.c   16 Dec 2020 18:50:12 -
@@ -87,6 +87,12 @@ sleep_queue_init(void)
TAILQ_INIT(&slpque[i]);
 }
 
+/*
+ * Threads that do not want to receive wakeup(9) broadcasts should
+ * sleep on deadchan.
+ */
+static int __deadchan;
+int *deadchan = &__deadchan;
 
 /*
  * During autoconfiguration or after a panic, a sleep will simply
Index: sys/systm.h
===
RCS file: /cvs/src/sys/sys/systm.h,v
retrieving revision 1.148
diff -u -p -r1.148 systm.h
--- sys/systm.h 26 Aug 2020 03:29:07 -  1.148
+++ sys/systm.h 16 Dec 2020 18:50:12 -
@@ -107,6 +107,8 @@ extern struct vnode *rootvp;/* vnode eq
 extern dev_t swapdev;  /* swapping device */
 extern struct vnode *swapdev_vp;/* vnode equivalent to above */
 
+extern int *deadchan;  /* dead wakeup(9) channel */
+
 struct proc;
 struct process;
 #define curproc curcpu()->ci_curproc
Index: dev/sdmmc/sdmmc_io.c
===
RCS file: /cvs/src/sys/dev/sdmmc/sdmmc_io.c,v
retrieving revision 1.41
diff -u -p -r1.41 sdmmc_io.c
--- dev/sdmmc/sdmmc_io.c31 Dec 2019 10:05:33 -  1.41
+++ dev/sdmmc/sdmmc_io.c16 Dec 2020 18:50:12 -
@@ -244,7 +244,7 @@ sdmmc_io_function_enable(struct sdmmc_fu
sdmmc_io_write_1(sf0, SD_IO_CCCR_FN_ENABLE, rv);
 
while (!sdmmc_io_function_ready(sf) && retry-- > 0)
-   tsleep_nsec(&lbolt, PPAUSE, "pause", INFSLP);
+   tsleep_nsec(deadchan, PPAUSE, "pause", SEC_TO_NSEC(1));
return (retry >= 0) ? 0 : ETIMEDOUT;
 }
 



Re: tht(4): more tsleep(9) -> tsleep_nsec(9) conversions

2020-12-16 Thread Scott Cheloha
On Thu, Dec 03, 2020 at 09:59:11PM -0600, Scott Cheloha wrote:
> Hi,
> 
> tht(4) is another driver still using tsleep(9).
> 
> It uses it to spin while it waits for the card to load the firmware.
> Then it uses it to spin for up to 2 seconds while waiting for
> THT_REG_INIT_STATUS.
> 
> In the firmware case we can sleep for 10 milliseconds each iteration.
> 
> In the THT_REG_INIT_STATUS loop we can sleep for 10 milliseconds each
> iteration again, but instead of using a timeout to set a flag after 2
> seconds we can just count how many milliseconds we've slept.  This is
> less precise than using the timeout but it is much simpler.  Obviously
> we then need to remove all the timeout-related stuff from the function
> and the file.
> 
> Thoughts?  ok?

Two week bump.

Index: if_tht.c
===
RCS file: /cvs/src/sys/dev/pci/if_tht.c,v
retrieving revision 1.142
diff -u -p -r1.142 if_tht.c
--- if_tht.c10 Jul 2020 13:26:38 -  1.142
+++ if_tht.c4 Dec 2020 03:57:21 -
@@ -582,7 +582,6 @@ voidtht_lladdr_read(struct 
tht_softc 
 void   tht_lladdr_write(struct tht_softc *);
 inttht_sw_reset(struct tht_softc *);
 inttht_fw_load(struct tht_softc *);
-void   tht_fw_tick(void *arg);
 void   tht_link_state(struct tht_softc *);
 
 /* interface operations */
@@ -1667,11 +1666,9 @@ tht_sw_reset(struct tht_softc *sc)
 int
 tht_fw_load(struct tht_softc *sc)
 {
-   struct timeout  ticker;
-   volatile intok = 1;
u_int8_t*fw, *buf;
size_t  fwlen, wrlen;
-   int error = 1;
+   int error = 1, msecs, ret;
 
if (loadfirmware("tht", &fw, &fwlen) != 0)
return (1);
@@ -1682,7 +1679,9 @@ tht_fw_load(struct tht_softc *sc)
buf = fw;
while (fwlen > 0) {
while (tht_fifo_writable(sc, &sc->sc_txt) <= THT_FIFO_GAP) {
-   if (tsleep(sc, PCATCH, "thtfw", 1) == EINTR)
+   ret = tsleep_nsec(sc, PCATCH, "thtfw",
+   MSEC_TO_NSEC(10));
+   if (ret == EINTR)
goto err;
}
 
@@ -1695,32 +1694,21 @@ tht_fw_load(struct tht_softc *sc)
buf += wrlen;
}
 
-   timeout_set(&ticker, tht_fw_tick, (void *)&ok);
-   timeout_add_sec(&ticker, 2);
-   while (ok) {
+   for (msecs = 0; msecs < 2000; msecs += 10) {
if (tht_read(sc, THT_REG_INIT_STATUS) != 0) {
error = 0;
break;
}
-
-   if (tsleep(sc, PCATCH, "thtinit", 1) == EINTR)
+   ret = tsleep_nsec(sc, PCATCH, "thtinit", MSEC_TO_NSEC(10));
+   if (ret == EINTR)
goto err;
}
-   timeout_del(&ticker);
 
tht_write(sc, THT_REG_INIT_SEMAPHORE, 0x1);
 
 err:
free(fw, M_DEVBUF, fwlen);
return (error);
-}
-
-void
-tht_fw_tick(void *arg)
-{
-   volatile int*ok = arg;
-
-   *ok = 0;
 }
 
 void



Re: Switch select(2) to kqueue-based implementation

2020-12-16 Thread Visa Hankala
On Tue, Dec 15, 2020 at 05:23:45PM +, Visa Hankala wrote:
> On Tue, Dec 15, 2020 at 07:46:01AM -0300, Martin Pieuchot wrote:
> > @@ -636,43 +651,59 @@ dopselect(struct proc *p, int nd, fd_set
> > if (sigmask)
> > dosigsuspend(p, *sigmask &~ sigcantmask);
> >  
> > -retry:
> > -   ncoll = nselcoll;
> > -   atomic_setbits_int(&p->p_flag, P_SELECT);
> > -   error = selscan(p, pibits[0], pobits[0], nd, ni, retval);
> > -   if (error || *retval)
> > +   /* Register kqueue events */
> > +   if ((error = pselregister(p, pibits[0], nd, ni, &nevents) != 0))
> > goto done;
> 
> The above has parentheses wrong and returns 1 (EPERM) as error.
> The lines should read:
> 
>   if ((error = pselregister(p, pibits[0], nd, ni, &nevents)) != 0)
>   goto done;
> 
> 
> In addition to the above, I noticed that select(2) behaves differently
> than before when a file descriptor that is being monitored is closed by
> another thread. The old implementation returns EBADF. The new code keeps
> on waiting on the underlying object.
> 
> The diff below makes kqueue clear kqpoll's fd event registration on
> fd close. However, it does not make select(2) return an error, the fd
> just will not cause a wakeup any longer. I think I have an idea on how
> to correct that but I need to consider it some more.

The following diff adds the ability to return EBADF on fd close. The
idea is to set the error pending on the kqueue instance and deliver it
as the return value of kqueue_scan().

The error condition could also be reported via a kevent struct that
kqueue_scan() returns: set EV_ERROR in kev->flags and error code
in kev->data. However, doing so is somewhat contrived. knote_remove()
would have to rewire the knote, or allocate a new one, with a suitable
f_event code and activate it. In addition, the code would have to use
a dedicated knlist for these knotes, or play some tricks with kq_knlist
or kq_knhash, so that the knotes are still reachable from the kqueue
through some knlist.

The diff assumes eager kqpoll purging after poll(2) or select(2).
Lazy purging needs an additional check that skips raising EBADF if
the knote is not part of current scan.

Index: kern/kern_event.c
===
RCS file: src/sys/kern/kern_event.c,v
retrieving revision 1.150
diff -u -p -r1.150 kern_event.c
--- kern/kern_event.c   16 Dec 2020 15:07:30 -  1.150
+++ kern/kern_event.c   16 Dec 2020 16:27:12 -
@@ -95,6 +95,8 @@ void  knote_enqueue(struct knote *kn);
 void   knote_dequeue(struct knote *kn);
 intknote_acquire(struct knote *kn);
 void   knote_release(struct knote *kn);
+void   knote_activate(struct knote *);
+void   knote_remove(struct proc *p, struct knlist *list, int purge);
 
 void   filt_kqdetach(struct knote *kn);
 intfilt_kqueue(struct knote *kn, long hint);
@@ -504,8 +506,14 @@ kqpoll_init(void)
struct proc *p = curproc;
struct filedesc *fdp;
 
-   if (p->p_kq != NULL)
+   if (p->p_kq != NULL) {
+   /*
+* Clear any pending error that was raised after
+* previous scan.
+*/
+   p->p_kq->kq_error = 0;
return;
+   }
 
p->p_kq = kqueue_alloc(p->p_fd);
p->p_kq_serial = arc4random();
@@ -962,6 +970,15 @@ retry:
}
 
s = splhigh();
+
+   if (kq->kq_error != 0) {
+   /* Deliver the pending error. */
+   error = kq->kq_error;
+   kq->kq_error = 0;
+   splx(s);
+   goto done;
+   }
+
if (kq->kq_count == 0) {
/*
 * Successive loops are only necessary if there are more
@@ -1173,10 +1190,10 @@ kqueue_purge(struct proc *p, struct kque
KERNEL_ASSERT_LOCKED();
 
for (i = 0; i < kq->kq_knlistsize; i++)
-   knote_remove(p, &kq->kq_knlist[i]);
+   knote_remove(p, &kq->kq_knlist[i], 1);
if (kq->kq_knhashmask != 0) {
for (i = 0; i < kq->kq_knhashmask + 1; i++)
-   knote_remove(p, &kq->kq_knhash[i]);
+   knote_remove(p, &kq->kq_knhash[i], 1);
}
 }
 
@@ -1356,9 +1373,10 @@ knote(struct klist *list, long hint)
  * remove all knotes from a specified knlist
  */
 void
-knote_remove(struct proc *p, struct knlist *list)
+knote_remove(struct proc *p, struct knlist *list, int purge)
 {
struct knote *kn;
+   struct kqueue *kq;
int s;
 
while ((kn = SLIST_FIRST(list)) != NULL) {
@@ -1369,6 +1387,21 @@ knote_remove(struct proc *p, struct knli
}
splx(s);
kn->kn_fop->f_detach(kn);
+
+   /*
+* Notify poll(2) and select(2) when a monitored
+* file descriptor is closed.
+*/
+   if (!purge && (kn->kn_flags & __EV_POLL) != 0) {
+   kq = kn->kn_kq;

Re: regress print target name

2020-12-16 Thread Claudio Jeker
On Wed, Dec 16, 2020 at 05:01:18PM +0100, Theo Buehler wrote:
> On Wed, Dec 16, 2020 at 04:42:59PM +0100, Alexander Bluhm wrote:
> > When debugging tests, it is useful to see the target name and which
> > output belongs to it.  A lot of my tests have echo lines, but I
> > think this is better done in the framework.  Then all tests behave
> > simmilar.  I would remove the echos from the Makefiles afterwards.
> 
> Agreed. While it's not exactly perfect for libssl regress tests due to
> some suboptimal target names, I think it's an improvement (provided you
> remove superfluous echos, as intended).
> 
> ok tb

I agree. OK claudio@ 

-- 
:wq Claudio



Re: netstat - proto ip record

2020-12-16 Thread Claudio Jeker
On Wed, Dec 16, 2020 at 03:54:04PM +, Stuart Henderson wrote:
> On 2020/12/16 16:43, Salvatore Cuzzilla wrote:
> > Hi folks,
> > 
> > is there any process associated with this netstat record?
> > btw, what's the meaning of the state field with value '17'?
> > 
> > ToTo@obsd ~ $ doas netstat -an -f inet
> > Active Internet connections (including servers)
> > Proto   Recv-Q Send-Q  Local Address  Foreign Address(state)
> > ip   0  0  *.**.*17
> 
> Are kernel and userland in sync?

This is a SOCK_RAW socket using protocol 17 (UDP). AFAIK this is dhclient.
You can see this also with fstat.
root dhclient   750245* internet dgram udp *:0

-- 
:wq Claudio



Re: regress print target name

2020-12-16 Thread Theo Buehler
On Wed, Dec 16, 2020 at 04:42:59PM +0100, Alexander Bluhm wrote:
> When debugging tests, it is useful to see the target name and which
> output belongs to it.  A lot of my tests have echo lines, but I
> think this is better done in the framework.  Then all tests behave
> simmilar.  I would remove the echos from the Makefiles afterwards.

Agreed. While it's not exactly perfect for libssl regress tests due to
some suboptimal target names, I think it's an improvement (provided you
remove superfluous echos, as intended).

ok tb



Re: kern.video.record - part 2

2020-12-16 Thread Laurence Tratt
On Wed, Dec 16, 2020 at 03:50:54PM +0100, Marcus Glocker wrote:

Hello Marcus,

> I reviewed this again, and I think this implementation should be consistent
> on a video(4) level now.

I've just tried this, and it works, and works better than my original (and
uvideo-only) patch. I've been flipping it on and off and video(1), ffplay,
Firefox, and Chrome all work fine with it even when they're midway through
using the stream. [Of mild interest, it seems that you turn kern.video.record
to 0, ffplay and Firefox keep displaying the last frame while video(1) and
Chrome display a solid green picture.]

This certainly supersedes my patch and has my OK, not that it needs it :)


Laurie



Re: netstat - proto ip record

2020-12-16 Thread Stuart Henderson
On 2020/12/16 16:43, Salvatore Cuzzilla wrote:
> Hi folks,
> 
> is there any process associated with this netstat record?
> btw, what's the meaning of the state field with value '17'?
> 
> ToTo@obsd ~ $ doas netstat -an -f inet
> Active Internet connections (including servers)
> Proto   Recv-Q Send-Q  Local Address  Foreign Address(state)
> ip   0  0  *.**.*17

Are kernel and userland in sync?



netstat - proto ip record

2020-12-16 Thread Salvatore Cuzzilla

Hi folks,

is there any process associated with this netstat record?
btw, what's the meaning of the state field with value '17'?

ToTo@obsd ~ $ doas netstat -an -f inet
Active Internet connections (including servers)
Proto   Recv-Q Send-Q  Local Address  Foreign Address(state)
ip   0  0  *.**.*17


---
:wq,
Salvatore.



regress print target name

2020-12-16 Thread Alexander Bluhm
Hi,

When debugging tests, it is useful to see the target name and which
output belongs to it.  A lot of my tests have echo lines, but I
think this is better done in the framework.  Then all tests behave
simmilar.  I would remove the echos from the Makefiles afterwards.

ok?

bluhm

Index: share/mk/bsd.regress.mk
===
RCS file: /data/mirror/openbsd/cvs/src/share/mk/bsd.regress.mk,v
retrieving revision 1.21
diff -u -p -r1.21 bsd.regress.mk
--- share/mk/bsd.regress.mk 17 Jun 2019 17:20:24 -  1.21
+++ share/mk/bsd.regress.mk 16 Dec 2020 15:33:04 -
@@ -84,6 +84,7 @@ regress: .SILENT
rm -f ${REGRESS_SETUP_ONCE:S/^/stamp-/}
 .endif
 .for RT in ${REGRESS_TARGETS}
+   echo ' ${RT} '
 .  if ${REGRESS_SKIP_TARGETS:M${RT}}
echo -n "SKIP " ${_REGRESS_OUT}
echo SKIPPED
@@ -106,9 +107,12 @@ regress: .SILENT
fi
 .  endif
echo ${_REGRESS_NAME}/${RT:S/^run-regress-//} ${_REGRESS_OUT}
+   echo
 .endfor
 .for RT in ${REGRESS_CLEANUP}
+   echo ' ${RT} '
${MAKE} -C ${.CURDIR} ${RT}
+   echo
 .endfor
rm -f ${REGRESS_SETUP_ONCE:S/^/stamp-/}
 



kern.video.record - part 2

2020-12-16 Thread Marcus Glocker
Hi,

In September Laurence Tratt came up with a diff to implement an
kern.video.record switch for sysctl(8).  The implementation was only
for uvideo(4), and hence we wanted to lift that up to the video(4)
level, but we were not able to get a good implementation together back
then.

I reviewed this again, and I think this implementation should be
consistent on a video(4) level now.  Some points of the
characteristics:

* The implementation acts on video(4) level, so it will serve
  all it's drivers without the need to modify an existing
  video(4) driver.
* The implementation is more or less equal to what
  kern.audio.record does.
* The video data will be blanked on the fly by zeroing the
  frame buffer data in video(4) before passing it to userland.
* Hence the switch can be applied also on a open video stream.

TODOs in a second step:

* I haven't done the man page updates in this diff.
* The blanking could be improved to not only zero out the frame
  buffer data, but generate a black image.  That would require
  some similar helper functions like audio(4) has, to create
  silence: audio_calc_sil(), audio_fill_sil().  Of course we
  would need to distinguish between compressed and uncompressed
  video data in there.

I have quickly tested the diff here with two cams, and it works fine
for me so far.

Comments, testers, OKs?

Cheers,
Marcus


Index: sys/dev/video.c
===
RCS file: /cvs/src/sys/dev/video.c,v
retrieving revision 1.44
diff -u -p -u -p -r1.44 video.c
--- sys/dev/video.c 16 May 2020 10:47:22 -  1.44
+++ sys/dev/video.c 16 Dec 2020 14:05:11 -
@@ -51,6 +51,7 @@ struct video_softc {
 
int  sc_fsize;
uint8_t *sc_fbuffer;
+   caddr_t  sc_fbuffer_mmap;
size_t   sc_fbufferlen;
int  sc_vidmode;/* access mode */
 #defineVIDMODE_NONE0
@@ -78,6 +79,11 @@ struct cfdriver video_cd = {
NULL, "video", DV_DULL
 };
 
+/*
+ * Global flag to control if video recording is enabled by kern.video.record.
+ */
+int video_record_enable = 0;
+
 int
 videoprobe(struct device *parent, void *match, void *aux)
 {
@@ -191,6 +197,8 @@ videoread(dev_t dev, struct uio *uio, in
 
/* move no more than 1 frame to userland, as per specification */
size = ulmin(uio->uio_resid, sc->sc_fsize);
+   if (!video_record_enable)
+   bzero(sc->sc_fbuffer, size);
error = uiomove(sc->sc_fbuffer, size, uio);
sc->sc_frames_ready--;
if (error)
@@ -205,6 +213,7 @@ int
 videoioctl(dev_t dev, u_long cmd, caddr_t data, int flags, struct proc *p)
 {
struct video_softc *sc;
+   struct v4l2_buffer *vb = (struct v4l2_buffer *)data;
int unit, error;
 
unit = VIDEOUNIT(dev);
@@ -299,6 +308,8 @@ videoioctl(dev_t dev, u_long cmd, caddr_
}
error = (sc->hw_if->dqbuf)(sc->hw_hdl,
(struct v4l2_buffer *)data);
+   if (!video_record_enable)
+   bzero(sc->sc_fbuffer_mmap + vb->m.offset, vb->length);
sc->sc_frames_ready--;
break;
case VIDIOC_STREAMON:
@@ -408,6 +419,10 @@ videommap(dev_t dev, off_t off, int prot
if (pmap_extract(pmap_kernel(), (vaddr_t)p, &pa) == FALSE)
panic("videommap: invalid page");
sc->sc_vidmode = VIDMODE_MMAP;
+
+   /* store frame buffer base address for later blanking */
+   if (off == 0)
+   sc->sc_fbuffer_mmap = p;
 
return (pa);
 }
Index: sys/kern/kern_sysctl.c
===
RCS file: /cvs/src/sys/kern/kern_sysctl.c,v
retrieving revision 1.384
diff -u -p -u -p -r1.384 kern_sysctl.c
--- sys/kern/kern_sysctl.c  7 Dec 2020 16:55:29 -   1.384
+++ sys/kern/kern_sysctl.c  16 Dec 2020 14:05:12 -
@@ -114,6 +114,7 @@
 #endif
 
 #include "audio.h"
+#include "video.h"
 #include "pf.h"
 
 extern struct forkstat forkstat;
@@ -125,6 +126,9 @@ extern  long numvnodes;
 #if NAUDIO > 0
 extern int audio_record_enable;
 #endif
+#if NVIDEO > 0
+extern int video_record_enable;
+#endif
 
 int allowkmem;
 int allowdt;
@@ -141,6 +145,9 @@ int sysctl_cptime2(int *, u_int, void *,
 #if NAUDIO > 0
 int sysctl_audio(int *, u_int, void *, size_t *, void *, size_t);
 #endif
+#if NVIDEO > 0
+int sysctl_video(int *, u_int, void *, size_t *, void *, size_t);
+#endif
 int sysctl_cpustats(int *, u_int, void *, size_t *, void *, size_t);
 int sysctl_utc_offset(void *, size_t *, void *, size_t);
 
@@ -379,6 +386,7 @@ kern_sysctl(int *name, u_int namelen, vo
case KERN_FILE:
case KERN_WITNESS:
case KERN_AUDIO:
+   case KERN_VIDEO:

Re: acme-client(1) make -F flag use more obvious

2020-12-16 Thread Renaud Allard



On 12/16/20 11:13 AM, Janne Johansson wrote:
Den ons 16 dec. 2020 kl 10:42 skrev Renaud Allard >:


 > While there, I propose to change the proposed crontab to once a day
 > instead of every hour. The certificates can be renewed 1 full month
 > before expiracy, I think trying to renew every hour is too much.

I think that, while waiting for someone to fix acme-client correctly as
suggested by Florian, this patch is worth committing.
The crontab change in particular is quite useful, there is really no
reason to check every hour (even every day is probably too much
already).


But it is a local check for the local date vs the date in the 
certificate, and perhaps your box is not on at 03:00 on Saturdays as you 
thought 3 months ago.




If your clock is 3 months off, it could also be off the other way round. 
That means you would try to renew every hour and get blacklisted for 
hitting rate limits. I don't think the example crontab should take into 
account a wrong config in the first place.




smime.p7s
Description: S/MIME Cryptographic Signature


Re: acme-client(1) make -F flag use more obvious

2020-12-16 Thread Renaud Allard



On 12/16/20 9:44 AM, Solene Rapenne wrote:

On Tue, 15 Dec 2020 10:18:41 +0100
Solene Rapenne :


This is a small change to acme-client(1) because I find
the explanation of -F flag not being obvious that you
need it when you add/remove an alternative name in your
domain config.

Maybe wording could be better, if a native English
speaker could give it a look.

ok?



I added 's to domain and specified -F only works for new domains.

While there, I propose to change the proposed crontab to once a day
instead of every hour. The certificates can be renewed 1 full month
before expiracy, I think trying to renew every hour is too much.



I think that, while waiting for someone to fix acme-client correctly as 
suggested by Florian, this patch is worth committing.
The crontab change in particular is quite useful, there is really no 
reason to check every hour (even every day is probably too much already).




smime.p7s
Description: S/MIME Cryptographic Signature


amd64 pamp panic messages

2020-12-16 Thread Alexander Bluhm
Hi,

during all my pmap crashes, I sometimes get this strange address.

panic: pmap_remove_pte: unmanaged page marked PG_PVLIST, va = 0x5d155753000, pa 
= 0xfdfdfdfdfd000

I think we should not clear bits in a panic messages.  Debugging
with the full picture is easier.  While there make the panics
more consistent.

ok?

bluhm

Index: arch/amd64//amd64/pmap.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/amd64/pmap.c,v
retrieving revision 1.140
diff -u -p -r1.140 pmap.c
--- arch/amd64//amd64/pmap.c26 Oct 2020 18:35:41 -  1.140
+++ arch/amd64//amd64/pmap.c16 Dec 2020 12:49:04 -
@@ -1626,17 +1626,18 @@ pmap_remove_ptes(struct pmap *pmap, stru
if ((opte & PG_PVLIST) == 0) {
 #ifdef DIAGNOSTIC
if (pg != NULL)
-   panic("%s: managed page without PG_PVLIST "
- "for 0x%lx", __func__, startva);
+   panic("%s: managed page without PG_PVLIST: "
+   "va 0x%lx, opte 0x%llx", __func__,
+   startva, opte);
 #endif
continue;
}
 
 #ifdef DIAGNOSTIC
if (pg == NULL)
-   panic("%s: unmanaged page marked PG_PVLIST, "
- "va = 0x%lx, pa = 0x%lx", __func__,
- startva, (u_long)(opte & PG_FRAME));
+   panic("%s: unmanaged page marked PG_PVLIST: "
+   "va 0x%lx, opte 0x%llx", __func__,
+   startva, opte);
 #endif
 
/* sync R/M bits */
@@ -1692,16 +1693,16 @@ pmap_remove_pte(struct pmap *pmap, struc
if ((opte & PG_PVLIST) == 0) {
 #ifdef DIAGNOSTIC
if (pg != NULL)
-   panic("%s: managed page without PG_PVLIST for 0x%lx",
- __func__, va);
+   panic("%s: managed page without PG_PVLIST: "
+   "va 0x%lx, opte 0x%llx", __func__, va, opte);
 #endif
return 1;
}
 
 #ifdef DIAGNOSTIC
if (pg == NULL)
-   panic("%s: unmanaged page marked PG_PVLIST, va = 0x%lx, "
- "pa = 0x%lx", __func__, va, (u_long)(opte & PG_FRAME));
+   panic("%s: unmanaged page marked PG_PVLIST: "
+   "va 0x%lx, opte 0x%llx", __func__, va, opte);
 #endif
 
/* sync R/M bits */
@@ -2724,18 +2725,19 @@ pmap_enter(struct pmap *pmap, vaddr_t va
pg = PHYS_TO_VM_PAGE(pa);
 #ifdef DIAGNOSTIC
if (pg == NULL)
-   panic("%s: same pa PG_PVLIST "
- "mapping with unmanaged page "
- "pa = 0x%lx (0x%lx)", __func__,
- pa, atop(pa));
+   panic("%s: same pa, PG_PVLIST "
+   "mapping with unmanaged page: "
+   "va 0x%lx, opte 0x%llx, pa 0x%lx",
+   __func__, va, opte, pa);
 #endif
pmap_sync_flags_pte(pg, opte);
} else {
 #ifdef DIAGNOSTIC
if (PHYS_TO_VM_PAGE(pa) != NULL)
-   panic("%s: same pa, managed "
-   "page, no PG_VLIST pa: 0x%lx",
-   __func__, pa);
+   panic("%s: same pa, no PG_PVLIST "
+   "mapping with managed page: "
+   "va 0x%lx, opte 0x%llx, pa 0x%lx",
+   __func__, va, opte, pa);
 #endif
}
goto enter_now;
@@ -2755,8 +2757,8 @@ pmap_enter(struct pmap *pmap, vaddr_t va
 #ifdef DIAGNOSTIC
if (pg == NULL)
panic("%s: PG_PVLIST mapping with unmanaged "
- "page pa = 0x%lx (0x%lx)",
- __func__, pa, atop(pa));
+   "page: va 0x%lx, opte 0x%llx, pa 0x%lx",
+   __func__, va, opte, pa);
 #endif
pmap_sync_flags_pte(pg, opte);
opve = pmap_remove_pv(pg, pmap, va);



Re: acme-client(1) make -F flag use more obvious

2020-12-16 Thread Solene Rapenne
On Tue, 15 Dec 2020 10:18:41 +0100
Solene Rapenne :

> This is a small change to acme-client(1) because I find
> the explanation of -F flag not being obvious that you
> need it when you add/remove an alternative name in your
> domain config.
> 
> Maybe wording could be better, if a native English
> speaker could give it a look.
> 
> ok?
> 

I added 's to domain and specified -F only works for new domains.

While there, I propose to change the proposed crontab to once a day
instead of every hour. The certificates can be renewed 1 full month
before expiracy, I think trying to renew every hour is too much.

Index: acme-client.1
===
RCS file: /home/reposync/src/usr.sbin/acme-client/acme-client.1,v
retrieving revision 1.36
diff -u -p -r1.36 acme-client.1
--- acme-client.1   4 Nov 2020 10:34:18 -   1.36
+++ acme-client.1   16 Dec 2020 08:42:36 -
@@ -68,6 +68,9 @@ The options are as follows:
 .Bl -tag -width Ds
 .It Fl F
 Force certificate renewal, even if it's too soon.
+This is required if new domain's alternatives names
+were added to
+.Xr acme-client.conf 5 .
 .It Fl f Ar configfile
 Specify an alternative configuration file.
 .It Fl n
@@ -123,7 +126,7 @@ On renewal,
 .Xr httpd 8
 is reloaded:
 .Bd -literal -offset indent
-~  *   *   *   *   acme-client example.com && \e
+~  ~   *   *   *   acme-client example.com && \e
rcctl reload httpd
 .Ed
 .Sh SEE ALSO



Re: Switch select(2) to kqueue-based implementation

2020-12-16 Thread Martin Pieuchot
On 15/12/20(Tue) 17:23, Visa Hankala wrote:
> On Tue, Dec 15, 2020 at 07:46:01AM -0300, Martin Pieuchot wrote:
> > @@ -636,43 +651,59 @@ dopselect(struct proc *p, int nd, fd_set
> > if (sigmask)
> > dosigsuspend(p, *sigmask &~ sigcantmask);
> >  
> > -retry:
> > -   ncoll = nselcoll;
> > -   atomic_setbits_int(&p->p_flag, P_SELECT);
> > -   error = selscan(p, pibits[0], pobits[0], nd, ni, retval);
> > -   if (error || *retval)
> > +   /* Register kqueue events */
> > +   if ((error = pselregister(p, pibits[0], nd, ni, &nevents) != 0))
> > goto done;
> 
> The above has parentheses wrong and returns 1 (EPERM) as error.
> The lines should read:
> 
>   if ((error = pselregister(p, pibits[0], nd, ni, &nevents)) != 0)
>   goto done;

Thanks fixed.
 
> In addition to the above, I noticed that select(2) behaves differently
> than before when a file descriptor that is being monitored is closed by
> another thread. The old implementation returns EBADF. The new code keeps
> on waiting on the underlying object.
> 
> The diff below makes kqueue clear kqpoll's fd event registration on
> fd close. However, it does not make select(2) return an error, the fd
> just will not cause a wakeup any longer. I think I have an idea on how
> to correct that but I need to consider it some more.

Are you saying that knote_fdclose() can't cleanup the knotes on the
per-thread kqueue?

If so, should we replace the call of kqueue_free() in kqpoll_exit() by
a KQRELE()?  The original intend was to not put the per-thread kqueue to
the fdp list.  As you just pointed out this is necessary.  So I don't
see any need for kqueue_free().  We could even assert that the refcount
is 1.

Diff below does that, feel free to commit it.

Index: kern/kern_event.c
===
RCS file: /cvs/src/sys/kern/kern_event.c,v
retrieving revision 1.148
diff -u -p -r1.148 kern_event.c
--- kern/kern_event.c   15 Dec 2020 04:48:18 -  1.148
+++ kern/kern_event.c   16 Dec 2020 11:19:01 -
@@ -168,12 +168,11 @@ KQREF(struct kqueue *kq)
 void
 KQRELE(struct kqueue *kq)
 {
-   struct filedesc *fdp;
+   struct filedesc *fdp = kq->kq_fdp;
 
if (atomic_dec_int_nv(&kq->kq_refs) > 0)
return;
 
-   fdp = kq->kq_fdp;
if (rw_status(&fdp->fd_lock) == RW_WRITE) {
LIST_REMOVE(kq, kq_next);
} else {
@@ -182,12 +181,6 @@ KQRELE(struct kqueue *kq)
fdpunlock(fdp);
}
 
-   kqueue_free(kq);
-}
-
-void
-kqueue_free(struct kqueue *kq)
-{
free(kq->kq_knlist, M_KEVENT, kq->kq_knlistsize *
sizeof(struct knlist));
hashfree(kq->kq_knhash, KN_HASHSIZE, M_KEVENT);
@@ -509,12 +502,17 @@ void
 kqpoll_init(void)
 {
struct proc *p = curproc;
+   struct filedesc *fdp;
 
if (p->p_kq != NULL)
return;
 
p->p_kq = kqueue_alloc(p->p_fd);
p->p_kq_serial = arc4random();
+   fdp = p->p_fd;
+   fdplock(fdp);
+   LIST_INSERT_HEAD(&fdp->fd_kqlist, p->p_kq, kq_next);
+   fdpunlock(fdp);
 }
 
 void
@@ -526,7 +524,8 @@ kqpoll_exit(void)
return;
 
kqueue_terminate(p, p->p_kq);
-   kqueue_free(p->p_kq);
+   KASSERT(p->p_kq->kq_refs == 1);
+   KQRELE(p->p_kq);
p->p_kq = NULL;
 }
 



Re: acme-client(1) make -F flag use more obvious

2020-12-16 Thread Stuart Henderson
On 2020/12/16 11:47, Renaud Allard wrote:
> On 12/16/20 11:13 AM, Janne Johansson wrote:
> > 
> > But it is a local check for the local date vs the date in the
> > certificate, and perhaps your box is not on at 03:00 on Saturdays as you
> > thought 3 months ago.
> > 
> 
> If your clock is 3 months off, it could also be off the other way round.
> That means you would try to renew every hour and get blacklisted for hitting
> rate limits. I don't think the example crontab should take into account a
> wrong config in the first place.
> 

JJ isn't talking about the clock being set incorrectly, he's talking
about the machine being turned off. Even part time servers (say, a test
server running on a laptop) may still need a signed certificate.

If the machine clock is correct then there's no issue, it is a very
quick local file check only.

If the clock is incorrect then, for letsencrypt, the relevant limit is
the Duplicate Certificate limit, which is 5 per week, so a daily check
will still trip this. I'd argue that it is better to know sooner rather
than later if there is a problem as it will give you more time to fix it
before the certificate expires.



Re: acme-client(1) make -F flag use more obvious

2020-12-16 Thread Janne Johansson
Den ons 16 dec. 2020 kl 10:42 skrev Renaud Allard :

> > While there, I propose to change the proposed crontab to once a day
> > instead of every hour. The certificates can be renewed 1 full month
> > before expiracy, I think trying to renew every hour is too much.
>
> I think that, while waiting for someone to fix acme-client correctly as
> suggested by Florian, this patch is worth committing.
> The crontab change in particular is quite useful, there is really no
> reason to check every hour (even every day is probably too much already).
>

But it is a local check for the local date vs the date in the certificate,
and perhaps your box is not on at 03:00 on Saturdays as you thought 3
months ago.

-- 
May the most significant bit of your life be positive.