Re: man find -exec -- a little bit more hand-holding

2020-08-14 Thread Jason McIntyre
On Fri, Aug 14, 2020 at 09:24:35AM -0600, Theo de Raadt wrote:
> Christian Weisgerber  wrote:
> 
> > On 2020-08-14, Jason McIntyre  wrote:
> > 
> > > - i cannot work out what is with the \! examples. i know we try to make
> > >   entries work for both csh and sh style shells, but stuff like this
> > >   works without escaping:
> > >
> > >   $ find . ! -type f
> > 
> > Going through the CVS and SCCS history, I see that the examples
> > came with a rewrite of find(1) at Berkeley 30 years ago.
> > 
> > csh's behavior that "for convenience, a `!' is passed unchanged
> > when it is followed by a blank, tab, newline, `=' or `('" has been
> > documented as such at least since the start of the CSRG repository
> > in 1985.
> > 
> > bash, whose history substitution was modeled on csh, also shares
> > this behavior.
> > 
> > I think it was never necessary to escape the '!' and the man page
> > examples were written with an abundance of caution and a lack of
> > understanding of csh's exact replacement mechanism.
> 
> Yep, so I think that particular escape should be deleted.
> 

done.
jmc



Re: TCP congestion control progression

2020-08-14 Thread Martin Pieuchot
On 13/08/20(Thu) 10:14, Brian Brombacher wrote:
> 
> 
> >> On Aug 9, 2020, at 6:29 PM, Chris Cappuccio  wrote:
> > Brian Brombacher [br...@planetunix.net] wrote:
> >> 
> >> I am wondering what approach the project is planning to use to modernize 
> >> the congestion control algorithms.  I'm interested in assisting the 
> >> project 
> >> with development effort in this area.  I've spent time making modifications
> >> for my own purposes and would prefer to understand the projects goals 
> >> before
> >> continuing, if possible.
> > 
> > Various improvements have been made over the years for dynamic window size,
> > also further tweaks for higher bandwidth over high latency connections. I'd
> > recommend sharing your current modifications here to get feedback.
> > 
> > Chris
> 
> Hi Chris,
> 
> The modifications I’ve made are around parameters related to the slow start 
> threshold logic, ENOBUF behavior on interface write, and each Reno congestion 
> control logic segment has also received some changes.  The majority of the 
> changes are implementing sysctl tunables to experiment with varying degrees 
> of network behavior I’ve been witnessing on my infrastructure.
> 
> I’ve been evaluating potential patterns to allow optimal maintenance and 
> addition of future CC algorithms.

mikeb@ has been working on this in the past.  His work can be used as a
solid basis, see:

https://github.com/mbelop/src/tree/tcpcc
https://github.com/mbelop/src/tree/tcpcc2



Re: switch: allow datapath_id and maxflow ioctls for non-root

2020-08-14 Thread Klemens Nanni
On Fri, Jul 31, 2020 at 06:28:32AM +0200, Klemens Nanni wrote:
> ifconfig(8) detects switch(4) through its unique SIOCSWSDPID ioctl and
> further does another switch specific ioctl for the default output
> regardless of configuration and/or members:
> 
>   SIOCSWSDPID struct ifbrparam
>   Set the datapath_id in the OpenFlow protocol of the switch named
>   in ifbrp_name to the value in the ifbrpu_datapath field.
>   
>   SIOCSWGMAXFLOW struct ifbrparam
>   Retrieve the maximum number of flows in the OpenFlow protocol of
>   the switch named in ifbrp_name into the ifbrp_maxflow field.
> 
> This is how it should look like:
> 
>   # ifconfig switch0 create
>   # ifconfig switch0
>   switch0: flags=0<>
>   index 29 llprio 3
>   groups: switch
>   datapath 0x5bea2b5b8e2456cf maxflow 1 maxgroup 1000
> 
> But using ifconfig as unprivileged user makes it fail switch(4)
> interfaces as such and thus interprets them as bridge(4) instead:
> 
>   $ ifconfig switch0
>   switch0: flags=0<>
>   index 29 llprio 3
>   groups: switch
>   priority 32768 hellotime 2 fwddelay 15 maxage 20 holdcnt 6 
> proto rstp
>   designated: id 00:00:00:00:00:00 priority 0
> 
> This is because the above mentioned ioctls are listed together with all
> other bridge and switch related ioctls that set or write things.
> Getting datapath_id and maxflow values however is read-only and crucial
> for ifconfig as demonstrated above, so I'd like to move them out of the
> root check to fix ifconfig.
> 
> Feedback? OK?
Ping.

Last diff after dlg's feedback reattached.


Index: if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.618
diff -u -p -r1.618 if.c
--- if.c5 Aug 2020 11:07:34 -   1.618
+++ if.c5 Aug 2020 22:31:44 -
@@ -2160,9 +2160,7 @@ ifioctl(struct socket *so, u_long cmd, c
case SIOCBRDGSIFCOST:
case SIOCBRDGSTXHC:
case SIOCBRDGSPROTO:
-   case SIOCSWGDPID:
case SIOCSWSPORTNO:
-   case SIOCSWGMAXFLOW:
 #endif
if ((error = suser(p)) != 0)
break;



Re: Enable arm64 PAN feature

2020-08-14 Thread Mark Kettenis
> Date: Fri, 14 Aug 2020 14:40:23 +0200 (CEST)
> From: Mark Kettenis 
> 
> I suppose a way to test this properly is to pick a system call and
> replace a copyin() with a direct access?  That will succeed without
> PAN but should fail with PAN enabled right?

So that does indeed work.  However, the result is a hard hang.  So
here as an additional diff that makes sure we panic instead.  The idea
is that all user-space access from the kernel should be done by the
special unprivileged load/store instructions.

Index: arch/arm64/arm64/trap.c
===
RCS file: /cvs/src/sys/arch/arm64/arm64/trap.c,v
retrieving revision 1.27
diff -u -p -r1.27 trap.c
--- arch/arm64/arm64/trap.c 6 Jan 2020 12:37:30 -   1.27
+++ arch/arm64/arm64/trap.c 14 Aug 2020 21:05:54 -
@@ -65,6 +65,14 @@ void do_el0_error(struct trapframe *);
 
 void dumpregs(struct trapframe*);
 
+/* Check whether we're executing an unprivileged load/store instruction. */
+static inline int
+is_unpriv_ldst(uint64_t elr)
+{
+   uint32_t insn = *(uint32_t *)elr;
+   return ((insn & 0x3f200c00) == 0x38000800);
+}
+
 static void
 data_abort(struct trapframe *frame, uint64_t esr, uint64_t far,
 int lower, int exe)
@@ -104,8 +112,18 @@ data_abort(struct trapframe *frame, uint
/* The top bit tells us which range to use */
if ((far >> 63) == 1)
map = kernel_map;
-   else
+   else {
+   /*
+* Only allow user-space access using
+* unprivileged load/store instructions.
+*/
+   if (!is_unpriv_ldst(frame->tf_elr)) {
+   panic("attempt to access user address"
+ " 0x%llx from EL1", far);
+   }
+
map = >p_vmspace->vm_map;
+   }
}
 
if (exe)



Re: getitimer(2), setitimer(2): merge critical sections

2020-08-14 Thread Theo de Raadt
> It has occurred to me that we could do a trial copyout(9) in
> sys_setitimer() before entering the critical section.  This is a *bit*
> wasteful, but is relatively inexpensive and narrows the behavior
> change I mentioned down to truly improbable cases involving multiple
> threads and munmap(2).

That sounds scary.  You are touching userland memory twice, and I could
imagine a situation where it isn't the same memory because it gets
swapped out in a shared storage situation.

It sounds terribly wrong to do that.



Re: bug on fmemopen(3)

2020-08-14 Thread Todd C . Miller
Fix append mode and expand regress.  I've added an append flag to
the state but we could just as easily store the open flag instead.

 - todd

Index: lib/libc/stdio/fmemopen.c
===
RCS file: /cvs/src/lib/libc/stdio/fmemopen.c,v
retrieving revision 1.4
diff -u -p -u -r1.4 fmemopen.c
--- lib/libc/stdio/fmemopen.c   14 Aug 2020 12:00:33 -  1.4
+++ lib/libc/stdio/fmemopen.c   14 Aug 2020 15:28:09 -
@@ -30,6 +30,7 @@ struct state {
size_t   size;  /* allocated size */
size_t   len;   /* length of the data */
int  update;/* open for update */
+   int  append;/* open for append */
 };
 
 static int
@@ -51,6 +52,9 @@ fmemopen_write(void *v, const char *b, i
struct state*st = v;
int i;
 
+   if (st->append)
+   st->pos = st->len;
+
for (i = 0; i < l && i + st->pos < st->size; i++)
st->string[st->pos + i] = b[i];
st->pos += i;
@@ -147,6 +151,7 @@ fmemopen(void *buf, size_t size, const c
st->len = (oflags & O_TRUNC) ? 0 : size;
st->size = size;
st->update = oflags & O_RDWR;
+   st->append = oflags & O_APPEND;
 
if (buf == NULL) {
if ((st->string = malloc(size)) == NULL) {
@@ -173,7 +178,7 @@ fmemopen(void *buf, size_t size, const c
 
fp->_flags = (short)flags;
fp->_file = -1;
-   fp->_cookie = (void *)st;
+   fp->_cookie = st;
fp->_read = (flags & __SWR) ? NULL : fmemopen_read;
fp->_write = (flags & __SRD) ? NULL : fmemopen_write;
fp->_seek = fmemopen_seek;

Index: regress/lib/libc/fmemopen/fmemopentest.c
===
RCS file: /cvs/src/regress/lib/libc/fmemopen/fmemopentest.c,v
retrieving revision 1.3
diff -u -p -u -r1.3 fmemopentest.c
--- regress/lib/libc/fmemopen/fmemopentest.c28 Mar 2013 09:35:58 -  
1.3
+++ regress/lib/libc/fmemopen/fmemopentest.c15 Aug 2020 01:02:47 -
@@ -70,7 +70,7 @@ simpletest(void)
 }
 
 int
-updatetest(void)
+appendtest(const char *mode)
 {
FILE*s1;
char string[] = "hello\0 test number 2";
@@ -78,35 +78,44 @@ updatetest(void)
size_t   len;
int  failures = 0;
 
-   s1 = fmemopen(string, 19, "a+");
+   s1 = fmemopen(string, 19, mode);
if (s1 == NULL)
return (1);
 
+   fseek(s1, 0, SEEK_SET);
+   if (ftell(s1) != 0) {
+   warnx("failed seek test [%s] (4)", mode);
+   failures++;
+   }
+
+   /* write should append even though seek position is 0 */
len = fwrite(" world", 1, 6, s1);
if (len != 6) {
-   warnx("failed write test (4)");
+   warnx("failed write test [%s] (5)", mode);
failures++;
}
 
-   fseek(s1, 0, SEEK_SET);
-   if (ftell(s1) != 0) {
-   warnx("failed seek test (5)");
+   if (ftell(s1) != strlen("hello world")) {
+   warnx("failed seek test [%s] (6)", mode);
failures++;
}
 
-   len = fread(buffer, 1, sizeof(buffer) - 1, s1);
-   if (strncmp(string, buffer, len)) {
-   warnx("failed compare test (6)");
-   failures++;
+   if (mode[1] == '+') {
+   fseek(s1, 0, SEEK_SET);
+   len = fread(buffer, 1, sizeof(buffer) - 1, s1);
+   if (len == 0 || strncmp(string, buffer, len)) {
+   warnx("failed compare test [%s] (7)", mode);
+   failures++;
+   }
}
 
if (strcmp(string, "hello world")) {
-   warnx("failed compare test (7)");
+   warnx("failed compare test [%s] (8)", mode);
failures++;
}
 
if (strcmp(string + strlen(string) + 1, "number 2")) {
-   warnx("failed compare test (8)");
+   warnx("failed compare test [%s] (9)", mode);
failures++;
}
 
@@ -114,7 +123,46 @@ updatetest(void)
 }
 
 int
-writetest(void)
+updatetest(void)
+{
+   FILE*s1;
+   char string[] = "bah, what a day";
+   char buffer[256];
+   size_t   len;
+   int  failures = 0;
+
+   s1 = fmemopen(string, 19, "r+");
+   if (s1 == NULL)
+   return (1);
+
+   if (ftell(s1) != 0) {
+   warnx("failed seek test (10)");
+   failures++;
+   }
+
+   len = fwrite("oh frabjous", 1, 11, s1);
+   if (len != 11) {
+   warnx("failed write test (11)");
+   failures++;
+   }
+
+   fseek(s1, 0, SEEK_SET);
+   len = fread(buffer, 1, sizeof(buffer) - 1, s1);
+   if (len == 0 || strncmp(string, buffer, len)) {
+   warnx("failed compare test (12)");
+   failures++;
+   }
+
+   if 

Re: getitimer(2), setitimer(2): merge critical sections

2020-08-14 Thread Scott Cheloha
On Wed, Aug 12, 2020 at 01:58:08PM -0500, Scott Cheloha wrote:
> 
> [...]
> 
> There is one behavior change: in the setitimer(2) swap case it is now
> possible to EFAULT on copyout(9) *after* you have written the new
> timer value and (possibly) started the ITIMER_REAL timeout.
> 
> For example, the following code now yields EFAULT even though a new
> oneshot timer has been started successfully.
> 
> [...]
> 
> I don't think there is a way to avoid this without introducing a bunch
> of extra complexity.  [...]

It has occurred to me that we could do a trial copyout(9) in
sys_setitimer() before entering the critical section.  This is a *bit*
wasteful, but is relatively inexpensive and narrows the behavior
change I mentioned down to truly improbable cases involving multiple
threads and munmap(2).

Updated patch below.

Index: kern_time.c
===
RCS file: /cvs/src/sys/kern/kern_time.c,v
retrieving revision 1.140
diff -u -p -r1.140 kern_time.c
--- kern_time.c 12 Aug 2020 15:31:27 -  1.140
+++ kern_time.c 14 Aug 2020 23:59:23 -
@@ -491,7 +491,7 @@ out:
 struct mutex itimer_mtx = MUTEX_INITIALIZER(IPL_CLOCK);
 
 /*
- * Get value of an interval timer.  The process virtual and
+ * Get and/or set value of an interval timer.  The process virtual and
  * profiling virtual time timers are kept internally in the
  * way they are specified externally: in time until they expire.
  *
@@ -509,6 +509,63 @@ struct mutex itimer_mtx = MUTEX_INITIALI
  * real time timers .it_interval.  Rather, we compute the next time in
  * absolute time the timer should go off.
  */
+void
+setitimer(int which, struct itimerval *itv, struct itimerval *olditv)
+{
+   struct itimerspec its, oldits;
+   struct itimerspec *itimer;
+   struct process *pr;
+   int timo;
+
+   KASSERT(which >= ITIMER_REAL && which <= ITIMER_PROF);
+
+   pr = curproc->p_p;
+   itimer = >ps_timer[which];
+
+   if (itv != NULL) {
+   TIMEVAL_TO_TIMESPEC(>it_value, _value);
+   TIMEVAL_TO_TIMESPEC(>it_interval, _interval);
+   }
+
+   if (which != ITIMER_REAL)
+   mtx_enter(_mtx);
+
+   if (olditv != NULL)
+   oldits = *itimer;
+   if (itv != NULL) {
+   if (which == ITIMER_REAL) {
+   struct timespec cts;
+   getnanouptime();
+   if (timespecisset(_value)) {
+   timo = tstohz(_value);
+   timeout_add(>ps_realit_to, timo);
+   timespecadd(_value, , _value);
+   } else
+   timeout_del(>ps_realit_to);
+   }
+   *itimer = its;
+   }
+
+   if (which != ITIMER_REAL)
+   mtx_leave(_mtx);
+
+   if (olditv != NULL) {
+   if (which == ITIMER_REAL) {
+   struct timespec now;
+   getnanouptime();
+   if (timespecisset(_value)) {
+   if (timespeccmp(_value, , <))
+   timespecclear(_value);
+   else
+   timespecsub(_value, ,
+   _value);
+   }
+   }
+   TIMESPEC_TO_TIMEVAL(>it_value, _value);
+   TIMESPEC_TO_TIMEVAL(>it_interval, _interval);
+   }
+}
+
 int
 sys_getitimer(struct proc *p, void *v, register_t *retval)
 {
@@ -516,44 +573,16 @@ sys_getitimer(struct proc *p, void *v, r
syscallarg(int) which;
syscallarg(struct itimerval *) itv;
} */ *uap = v;
-   struct itimerspec its;
struct itimerval aitv;
-   struct itimerspec *itimer;
int which;
 
which = SCARG(uap, which);
 
if (which < ITIMER_REAL || which > ITIMER_PROF)
return (EINVAL);
-   itimer = >p_p->ps_timer[which];
memset(, 0, sizeof(aitv));
 
-   if (which != ITIMER_REAL)
-   mtx_enter(_mtx);
-   its = *itimer;
-   if (which != ITIMER_REAL)
-   mtx_leave(_mtx);
-
-   if (which == ITIMER_REAL) {
-   struct timespec now;
-
-   getnanouptime();
-   /*
-* Convert from absolute to relative time in .it_value
-* part of real time timer.  If time for real time timer
-* has passed return 0, else return difference between
-* current time and time for the timer to go off.
-*/
-   if (timespecisset(_value)) {
-   if (timespeccmp(_value, , <))
-   timespecclear(_value);
-   else
-   timespecsub(_value, ,
-   _value);
-   

Re: pppac(4): destroy sessions the same way as pppx(4) does

2020-08-14 Thread YASUOKA Masahiko
On Wed, 12 Aug 2020 12:26:22 +0300
Vitaliy Makkoveev  wrote:
> We destroy pppx(4) related sessions while we performing PIPEXDSESSION
> command. But with pppac(4) we set session's state to
> PIPEX_STATE_CLOSE_WAIT2 and we wait garbage collector to do destruction.

pppac's PIPEXDSESSION set the states PIPEX_STATE_CLOSED.  It is to
wait until pipex{in,out}q becomes empty.

> We removed `pipex{in,out}q'. So we can safe destroy session in any time.
> I propose to make pppac(4) session destruction path the same as pppx(4)
> does. Now we destroy them while performing PIPEXDSESSION commad too.

Yes.  I agree this point.

> Also there is no in-kernel garbage collector for pppac(4) sessions.
> yasuoka@ pointed me that npppd(8) should kill expired sessions.
> 
> This not only makes pppac(4) closer to pppx(4) but simplify code and
> allow us to make safe pppx(4) session processing by pipex_timer().
> So this is preparation step to restore in-kernel timeout for pppx(4)
> too.

Below, I am asking to keep the timeout behavior.  There is a bug for
pppx(4) but it had been working for pppac(4) for long time.  If you
really want to change the behavior please provide a reason.  I have
not so strong opinion but I don't want to change the behavior without
a reason.

> Index: sys/net/pipex.c
> ===
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.124
> diff -u -p -r1.124 pipex.c
> --- sys/net/pipex.c   12 Aug 2020 08:41:39 -  1.124
> +++ sys/net/pipex.c   12 Aug 2020 09:07:12 -
> @@ -536,29 +536,6 @@ out:
>   return error;
>  }
>  
> -int
> -pipex_notify_close_session(struct pipex_session *session)
> -{
> - NET_ASSERT_LOCKED();
> - session->state = PIPEX_STATE_CLOSE_WAIT;
> - session->stat.idle_time = 0;
> - LIST_INSERT_HEAD(_close_wait_list, session, state_list);
> -
> - return (0);
> -}
> -

Unrelated but ok.

> -int
> -pipex_notify_close_session_all(void)
> -{
> - struct pipex_session *session;
> -
> - NET_ASSERT_LOCKED();
> - LIST_FOREACH(session, _session_list, session_list)
> - if (session->state == PIPEX_STATE_OPENED)
> - pipex_notify_close_session(session);
> - return (0);
> -}
> -

Unrelated but ok.  Since it's not used.

>  Static int
>  pipex_close_session(struct pipex_session_close_req *req,
>  struct pipex_iface_context *iface)
> @@ -573,13 +550,9 @@ pipex_close_session(struct pipex_session
>   if (session->pipex_iface != iface)
>   return (EINVAL);
>  
> - /* remove from close_wait list */
> - if (session->state == PIPEX_STATE_CLOSE_WAIT)
> - LIST_REMOVE(session, state_list);
> -

This must be kept.  Useland may PIPEXDSESSION before PIPEXGCLOSED for
this session.

>   /* get statistics before destroy the session */
>   req->pcr_stat = session->stat;
> - session->state = PIPEX_STATE_CLOSED;
> + pipex_destroy_session(session);
>  
>   return (0);
>  }

ok

> @@ -739,47 +712,25 @@ pipex_timer_stop(void)
>  Static void
>  pipex_timer(void *ignored_arg)
>  {
> - struct pipex_session *session, *session_tmp;
> + struct pipex_session *session;
>  
>   timeout_add_sec(_timer_ch, pipex_prune);
>  
>   NET_LOCK();
>   /* walk through */
> - LIST_FOREACH_SAFE(session, _session_list, session_list,
> - session_tmp) {
> - switch (session->state) {
> - case PIPEX_STATE_OPENED:
> - if (session->timeout_sec == 0)
> - continue;
> -
> - session->stat.idle_time++;
> - if (session->stat.idle_time < session->timeout_sec)
> - continue;
> -
> - pipex_notify_close_session(session);
> - break;
> -
> - case PIPEX_STATE_CLOSE_WAIT:
> - case PIPEX_STATE_CLOSE_WAIT2:
> - /* Wait PIPEXDSESSION from userland */
> - session->stat.idle_time++;
> - if (session->stat.idle_time < PIPEX_CLOSE_TIMEOUT)
> - continue;
> -
> - if (session->state == PIPEX_STATE_CLOSE_WAIT)
> - LIST_REMOVE(session, state_list);
> - session->state = PIPEX_STATE_CLOSED;
> - /* FALLTHROUGH */
> + LIST_FOREACH(session, _session_list, session_list) {
> + if (session->state != PIPEX_STATE_OPENED)
> + continue;
> + if (session->timeout_sec == 0)
> + continue;
>  
> - case PIPEX_STATE_CLOSED:
> - pipex_destroy_session(session);
> - break;
> + session->stat.idle_time++;
> + if (session->stat.idle_time < session->timeout_sec)
> + continue;
>  
> - default:
> - break;
> -  

Re: man find -exec -- a little bit more hand-holding

2020-08-14 Thread Jason McIntyre
On Fri, Aug 14, 2020 at 04:30:13AM +0100, ropers wrote:
> The find man page might benefit from adding a little bit more
> user-friendly hand-holding here:
> 
> Index: find.1
> ===
> RCS file: /cvs/src/usr.bin/find/find.1,v
> retrieving revision 1.98
> diff -u -r1.98 find.1
> --- find.12 Sep 2019 21:18:41 -   1.98
> +++ find.114 Aug 2020 02:59:48 -
> @@ -231,6 +231,10 @@
>  .Qq {}
>  appears anywhere in the utility name or the
>  arguments it is replaced by the pathname of the current file.
> +The semicolon will likely have to be escaped, depending on the shell.
> +See
> +.Sx CAVEATS
> +below.
>  .Pp
>  If terminated by a plus sign,
>  the pathnames for which the
> 
> 
> EXAMPLE:
> 
> $ find ~ -iname ".*" -exec echo {} \;
> works, but
> $ find ~ -iname ".*" -exec echo {} ;
> does not.
> 
> (Yes, this is a contrived example.  I wouldn't want to make people
> copy things all over the place.)
> 

hi.

first off, i do sympathise ;) shell escaping always hurts my head too.

regarding your diff: i don;t think this is the way to go. really, we
would have to add such a text every time we encounter a character that
may need escaping. so the whole point of the text in CAVEATS in to avoid
such a large text addition, and just remind people to watch out
(caveat!) the text in CAVEATS pretty much makes your diff redundant.

the EXAMPLES section additionally shows escaping.

i think there's enough there for people.

havng said that:

- i had hoped we'd have a specific \; example. that format is so
  commonly seen that it might make sense. so i'd not be against either
  adding such an example (currently EXAMPLES is fairly small) or
  altering another.

- i cannot work out what is with the \! examples. i know we try to make
  entries work for both csh and sh style shells, but stuff like this
  works without escaping:

$ find . ! -type f

jmc



kqueue_scan_setup/finish

2020-08-14 Thread Martin Pieuchot
The previous change introducing the kqueue_scan_setup()/finish() API
required to switch poll(2) internals to use the kqueue mechanism has
been backed out.  The reason for the regression is still unknown, so
let's take a baby step approach.

Diff below introduces the new API with only minimal changes.  It should
not introduce any change in behavior.

Comments?  Oks?

Index: kern/kern_event.c
===
RCS file: /cvs/src/sys/kern/kern_event.c,v
retrieving revision 1.142
diff -u -p -r1.142 kern_event.c
--- kern/kern_event.c   12 Aug 2020 13:49:24 -  1.142
+++ kern/kern_event.c   14 Aug 2020 10:13:38 -
@@ -64,9 +64,6 @@ void  KQREF(struct kqueue *);
 void   KQRELE(struct kqueue *);
 
 intkqueue_sleep(struct kqueue *, struct timespec *);
-intkqueue_scan(struct kqueue *kq, int maxevents,
-   struct kevent *ulistp, struct timespec *timeout,
-   struct kevent *kev, struct proc *p, int *retval);
 
 intkqueue_read(struct file *, struct uio *, int);
 intkqueue_write(struct file *, struct uio *, int);
@@ -554,6 +551,7 @@ out:
 int
 sys_kevent(struct proc *p, void *v, register_t *retval)
 {
+   struct kqueue_scan_state scan;
struct filedesc* fdp = p->p_fd;
struct sys_kevent_args /* {
syscallarg(int) fd;
@@ -635,11 +633,12 @@ sys_kevent(struct proc *p, void *v, regi
goto done;
}
 
-   KQREF(kq);
+   kqueue_scan_setup(, kq);
FRELE(fp, p);
-   error = kqueue_scan(kq, SCARG(uap, nevents), SCARG(uap, eventlist),
+   error = kqueue_scan(, SCARG(uap, nevents), SCARG(uap, eventlist),
tsp, kev, p, );
-   KQRELE(kq);
+   kqueue_scan_finish();
+
*retval = n;
return (error);
 
@@ -895,11 +894,13 @@ kqueue_sleep(struct kqueue *kq, struct t
 }
 
 int
-kqueue_scan(struct kqueue *kq, int maxevents, struct kevent *ulistp,
-struct timespec *tsp, struct kevent *kev, struct proc *p, int *retval)
+kqueue_scan(struct kqueue_scan_state *scan, int maxevents,
+struct kevent *ulistp, struct timespec *tsp, struct kevent *kev,
+struct proc *p, int *retval)
 {
+   struct kqueue *kq = scan->kqs_kq;
struct kevent *kevp;
-   struct knote mend, mstart, *kn;
+   struct knote *kn;
int s, count, nkev, error = 0;
 
nkev = 0;
@@ -909,9 +910,6 @@ kqueue_scan(struct kqueue *kq, int maxev
if (count == 0)
goto done;
 
-   memset(, 0, sizeof(mstart));
-   memset(, 0, sizeof(mend));
-
 retry:
KASSERT(count == maxevents);
KASSERT(nkev == 0);
@@ -939,18 +937,16 @@ retry:
goto done;
}
 
-   mstart.kn_filter = EVFILT_MARKER;
-   mstart.kn_status = KN_PROCESSING;
-   TAILQ_INSERT_HEAD(>kq_head, , kn_tqe);
-   mend.kn_filter = EVFILT_MARKER;
-   mend.kn_status = KN_PROCESSING;
-   TAILQ_INSERT_TAIL(>kq_head, , kn_tqe);
+   TAILQ_INSERT_TAIL(>kq_head, >kqs_end, kn_tqe);
+   TAILQ_INSERT_HEAD(>kq_head, >kqs_start, kn_tqe);
while (count) {
-   kn = TAILQ_NEXT(, kn_tqe);
+   kn = TAILQ_NEXT(>kqs_start, kn_tqe);
if (kn->kn_filter == EVFILT_MARKER) {
-   if (kn == ) {
-   TAILQ_REMOVE(>kq_head, , kn_tqe);
-   TAILQ_REMOVE(>kq_head, , kn_tqe);
+   if (kn == >kqs_end) {
+   TAILQ_REMOVE(>kq_head, >kqs_end,
+   kn_tqe);
+   TAILQ_REMOVE(>kq_head, >kqs_start,
+   kn_tqe);
splx(s);
if (count == maxevents)
goto retry;
@@ -958,8 +954,9 @@ retry:
}
 
/* Move start marker past another thread's marker. */
-   TAILQ_REMOVE(>kq_head, , kn_tqe);
-   TAILQ_INSERT_AFTER(>kq_head, kn, , kn_tqe);
+   TAILQ_REMOVE(>kq_head, >kqs_start, kn_tqe);
+   TAILQ_INSERT_AFTER(>kq_head, kn, >kqs_start,
+   kn_tqe);
continue;
}
 
@@ -1029,8 +1026,8 @@ retry:
break;
}
}
-   TAILQ_REMOVE(>kq_head, , kn_tqe);
-   TAILQ_REMOVE(>kq_head, , kn_tqe);
+   TAILQ_REMOVE(>kq_head, >kqs_end, kn_tqe);
+   TAILQ_REMOVE(>kq_head, >kqs_start, kn_tqe);
splx(s);
 done:
if (nkev != 0) {
@@ -1044,6 +1041,33 @@ done:
*retval = maxevents - count;
return (error);
 }
+
+void
+kqueue_scan_setup(struct kqueue_scan_state *scan, struct kqueue *kq)
+{
+   memset(scan, 0, sizeof(*scan));
+
+   KQREF(kq);
+   scan->kqs_kq = kq;
+   scan->kqs_start.kn_filter = EVFILT_MARKER;
+   

Re: man find -exec -- a little bit more hand-holding

2020-08-14 Thread Christian Weisgerber
On 2020-08-14, Jason McIntyre  wrote:

> - i cannot work out what is with the \! examples. i know we try to make
>   entries work for both csh and sh style shells, but stuff like this
>   works without escaping:
>
>   $ find . ! -type f

Going through the CVS and SCCS history, I see that the examples
came with a rewrite of find(1) at Berkeley 30 years ago.

csh's behavior that "for convenience, a `!' is passed unchanged
when it is followed by a blank, tab, newline, `=' or `('" has been
documented as such at least since the start of the CSRG repository
in 1985.

bash, whose history substitution was modeled on csh, also shares
this behavior.

I think it was never necessary to escape the '!' and the man page
examples were written with an abundance of caution and a lack of
understanding of csh's exact replacement mechanism.

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: man find -exec -- a little bit more hand-holding

2020-08-14 Thread Theo de Raadt
Christian Weisgerber  wrote:

> On 2020-08-14, Jason McIntyre  wrote:
> 
> > - i cannot work out what is with the \! examples. i know we try to make
> >   entries work for both csh and sh style shells, but stuff like this
> >   works without escaping:
> >
> > $ find . ! -type f
> 
> Going through the CVS and SCCS history, I see that the examples
> came with a rewrite of find(1) at Berkeley 30 years ago.
> 
> csh's behavior that "for convenience, a `!' is passed unchanged
> when it is followed by a blank, tab, newline, `=' or `('" has been
> documented as such at least since the start of the CSRG repository
> in 1985.
> 
> bash, whose history substitution was modeled on csh, also shares
> this behavior.
> 
> I think it was never necessary to escape the '!' and the man page
> examples were written with an abundance of caution and a lack of
> understanding of csh's exact replacement mechanism.

Yep, so I think that particular escape should be deleted.



softintr.h comment tweak

2020-08-14 Thread Mark Kettenis
Miod noticed that the powerpc64 version talked about AArch64.  I don't
think the "for all XXX platforms" makes sense so simply drop it from
all three versions of this header.

ok?


Index: arch/arm/include/softintr.h
===
RCS file: /cvs/src/sys/arch/arm/include/softintr.h,v
retrieving revision 1.5
diff -u -p -r1.5 softintr.h
--- arch/arm/include/softintr.h 21 Dec 2010 14:56:23 -  1.5
+++ arch/arm/include/softintr.h 14 Aug 2020 12:30:10 -
@@ -44,7 +44,7 @@
 #include 
 
 /*
- * Generic software interrupt support for all ARM platforms.
+ * Generic software interrupt support.
  *
  * To use this code, include  from your platform's
  * .
Index: arch/arm64/include/softintr.h
===
RCS file: /cvs/src/sys/arch/arm64/include/softintr.h,v
retrieving revision 1.1
diff -u -p -r1.1 softintr.h
--- arch/arm64/include/softintr.h   17 Dec 2016 23:38:33 -  1.1
+++ arch/arm64/include/softintr.h   14 Aug 2020 12:30:10 -
@@ -39,7 +39,7 @@
 #include 
 
 /*
- * Generic software interrupt support for all AArch64 platforms.
+ * Generic software interrupt support.
  *
  * To use this code, include  from your platform's
  * .
Index: arch/powerpc64/include/softintr.h
===
RCS file: /cvs/src/sys/arch/powerpc64/include/softintr.h,v
retrieving revision 1.1
diff -u -p -r1.1 softintr.h
--- arch/powerpc64/include/softintr.h   16 May 2020 17:11:14 -  1.1
+++ arch/powerpc64/include/softintr.h   14 Aug 2020 12:30:10 -
@@ -38,7 +38,7 @@
 #include 
 
 /*
- * Generic software interrupt support for all AArch64 platforms.
+ * Generic software interrupt support.
  *
  * To use this code, include  from your platform's
  * .



Re: softintr.h comment tweak

2020-08-14 Thread Patrick Wildt
On Fri, Aug 14, 2020 at 02:33:10PM +0200, Mark Kettenis wrote:
> Miod noticed that the powerpc64 version talked about AArch64.  I don't
> think the "for all XXX platforms" makes sense so simply drop it from
> all three versions of this header.
> 
> ok?
> 

ok patrick@

> 
> Index: arch/arm/include/softintr.h
> ===
> RCS file: /cvs/src/sys/arch/arm/include/softintr.h,v
> retrieving revision 1.5
> diff -u -p -r1.5 softintr.h
> --- arch/arm/include/softintr.h   21 Dec 2010 14:56:23 -  1.5
> +++ arch/arm/include/softintr.h   14 Aug 2020 12:30:10 -
> @@ -44,7 +44,7 @@
>  #include 
>  
>  /*
> - * Generic software interrupt support for all ARM platforms.
> + * Generic software interrupt support.
>   *
>   * To use this code, include  from your platform's
>   * .
> Index: arch/arm64/include/softintr.h
> ===
> RCS file: /cvs/src/sys/arch/arm64/include/softintr.h,v
> retrieving revision 1.1
> diff -u -p -r1.1 softintr.h
> --- arch/arm64/include/softintr.h 17 Dec 2016 23:38:33 -  1.1
> +++ arch/arm64/include/softintr.h 14 Aug 2020 12:30:10 -
> @@ -39,7 +39,7 @@
>  #include 
>  
>  /*
> - * Generic software interrupt support for all AArch64 platforms.
> + * Generic software interrupt support.
>   *
>   * To use this code, include  from your platform's
>   * .
> Index: arch/powerpc64/include/softintr.h
> ===
> RCS file: /cvs/src/sys/arch/powerpc64/include/softintr.h,v
> retrieving revision 1.1
> diff -u -p -r1.1 softintr.h
> --- arch/powerpc64/include/softintr.h 16 May 2020 17:11:14 -  1.1
> +++ arch/powerpc64/include/softintr.h 14 Aug 2020 12:30:10 -
> @@ -38,7 +38,7 @@
>  #include 
>  
>  /*
> - * Generic software interrupt support for all AArch64 platforms.
> + * Generic software interrupt support.
>   *
>   * To use this code, include  from your platform's
>   * .
> 



Re: Enable arm64 PAN feature

2020-08-14 Thread Mark Kettenis
> Date: Fri, 14 Aug 2020 12:29:51 +1000
> From: Jonathan Gray 
> 
> On Thu, Aug 13, 2020 at 09:17:41PM +0200, Mark Kettenis wrote:
> > ARMv8.1 introduced PAN (Priviliged Access Never) which prevents the
> > kernel from accessing userland data.  This can be bypassed by using
> > special instructions which we already use in copyin(9) and friends.
> > So we can simply turn this feature on if the CPU supports it.
> > 
> > Tested on an Odroid-C4 which has Cortex-A55 cores that have PAN
> > support.
> > 
> > ok?
> 
> This should be changing PSTATE.PAN as well.  Can you force an
> acess of this type to be sure the permission fault occurs?
> 
> >From the Arm ARM:
> 
> "When the value of PSTATE.PAN is 1, any privileged data access from
> EL1, or EL2 when HCR_EL2.E2H is 1, to a virtual memory address that
> is accessible at EL0, generates a Permission fault.
> 
> When the value of PSTATE.PAN is 0, the translation system is the
> same as in Armv8.0.
> 
> When ARMv8.1-PAN is implemented, the SPSR_EL1.PAN, SPSR_EL2.PAN, and
> SPSR_EL3.PAN bits are used for exception returns, and the DSPSR_EL0
> register is used for entry to or exit from Debug state.
> 
> When ARMv8.1-PAN is implemented, the SCTLR_EL1.SPAN and SCTLR_EL2.SPAN
> bits are used to control whether the PAN bit is set on an exception
> to EL1 or EL2."

By clearing SCTRL_EL0.SPAN, PSTATE.PAN will be set on an exception to
EL1.  Yes, this does mean that PSTATE.PAN won't be set until a CPU
returns to userland for the first time.  But from then on PSTATE.PAN
will be set.

I suppose a way to test this properly is to pick a system call and
replace a copyin() with a direct access?  That will succeed without
PAN but should fail with PAN enabled right?

> > Index: arch/arm64/arm64/cpu.c
> > ===
> > RCS file: /cvs/src/sys/arch/arm64/arm64/cpu.c,v
> > retrieving revision 1.38
> > diff -u -p -r1.38 cpu.c
> > --- arch/arm64/arm64/cpu.c  4 Jun 2020 21:18:16 -   1.38
> > +++ arch/arm64/arm64/cpu.c  13 Aug 2020 19:12:30 -
> > @@ -321,6 +321,7 @@ cpu_attach(struct device *parent, struct
> > struct fdt_attach_args *faa = aux;
> > struct cpu_info *ci;
> > uint64_t mpidr = READ_SPECIALREG(mpidr_el1);
> > +   uint64_t id_aa64mmfr1, sctlr;
> > uint32_t opp;
> >  
> > KASSERT(faa->fa_nreg > 0);
> > @@ -393,6 +394,14 @@ cpu_attach(struct device *parent, struct
> > cpu_cpuspeed = cpu_clockspeed;
> > }
> >  
> > +   /* Enable PAN. */
> > +   id_aa64mmfr1 = READ_SPECIALREG(id_aa64mmfr1_el1);
> > +   if (ID_AA64MMFR1_PAN(id_aa64mmfr1) != ID_AA64MMFR1_PAN_NONE) {
> > +   sctlr = READ_SPECIALREG(sctlr_el1);
> > +   sctlr &= ~SCTLR_SPAN;
> > +   WRITE_SPECIALREG(sctlr_el1, sctlr);
> > +   }
> > +
> > /* Initialize debug registers. */
> > WRITE_SPECIALREG(mdscr_el1, DBG_MDSCR_TDCC);
> > WRITE_SPECIALREG(oslar_el1, 0);
> > @@ -522,6 +531,7 @@ cpu_boot_secondary(struct cpu_info *ci)
> >  void
> >  cpu_start_secondary(struct cpu_info *ci)
> >  {
> > +   uint64_t id_aa64mmfr1, sctlr;
> > uint64_t tcr;
> > int s;
> >  
> > @@ -543,6 +553,14 @@ cpu_start_secondary(struct cpu_info *ci)
> > tcr |= TCR_T0SZ(64 - USER_SPACE_BITS);
> > tcr |= TCR_A1;
> > WRITE_SPECIALREG(tcr_el1, tcr);
> > +
> > +   /* Enable PAN. */
> > +   id_aa64mmfr1 = READ_SPECIALREG(id_aa64mmfr1_el1);
> > +   if (ID_AA64MMFR1_PAN(id_aa64mmfr1) != ID_AA64MMFR1_PAN_NONE) {
> > +   sctlr = READ_SPECIALREG(sctlr_el1);
> > +   sctlr &= ~SCTLR_SPAN;
> > +   WRITE_SPECIALREG(sctlr_el1, sctlr);
> > +   }
> >  
> > /* Initialize debug registers. */
> > WRITE_SPECIALREG(mdscr_el1, DBG_MDSCR_TDCC);
> > Index: arch/arm64/include/armreg.h
> > ===
> > RCS file: /cvs/src/sys/arch/arm64/include/armreg.h,v
> > retrieving revision 1.11
> > diff -u -p -r1.11 armreg.h
> > --- arch/arm64/include/armreg.h 5 Jun 2020 22:14:25 -   1.11
> > +++ arch/arm64/include/armreg.h 13 Aug 2020 19:12:30 -
> > @@ -451,6 +451,7 @@
> >  #defineSCTLR_nTWI  0x0001
> >  #defineSCTLR_nTWE  0x0004
> >  #defineSCTLR_WXN   0x0008
> > +#defineSCTLR_SPAN  0x0080
> >  #defineSCTLR_EOE   0x0100
> >  #defineSCTLR_EE0x0200
> >  #defineSCTLR_UCI   0x0400
> > @@ -478,6 +479,7 @@
> >  #definePSR_D   0x0200
> >  #definePSR_IL  0x0010
> >  #definePSR_SS  0x0020
> > +#definePSR_PAN 0x0040
> >  #definePSR_V   0x1000
> >  #definePSR_C   0x2000
> >  #definePSR_Z   0x4000
> > 
> > 
> 



radeondrm(4) timing issue

2020-08-14 Thread Marcus Glocker
Hi,

Recently I took over the old iMac11,2 of my son, and what else to do
with it other than installing OpenBSD and see what happens.  The first
thing which happened after the installation was that the screen
remained dark after the radeondrm(4) KMS initialization.

After some painful debugging, and exchanging forth and back with jsg@,
I've noticed that when enabling DRMDEBUG, radeondrm(4) just works fine!
Smells like a timing issue.

After doing some more investigations I've figured out that when
removing a specific debug printf in drm_dp_helper.c, just before an
usleep_range() statement, it fails again:

  case DP_AUX_NATIVE_REPLY_DEFER:
->  DRM_DEBUG_KMS("native defer\n");
/*
 * We could check for I2C bit rate capabilities and if
 * available adjust this interval. We could also be
 * more careful with DP-to-legacy adapters where a
 * long legacy cable may force very low I2C bit rates.
 *
 * For now just defer for long enough to hopefully be
 * safe for all use-cases.
 */
 usleep_range(AUX_RETRY_INTERVAL, AUX_RETRY_INTERVAL + 100);
 continue;

This usleep_range() expands to usleep_range(500, 600), and we wrap it in
this function on OpenBSD:

  static inline void
  usleep_range(unsigned long min, unsigned long max)
  {
  DELAY(min);
  }

In my case the 500us was too short, while the 600us works fine.

Based on that jsg@ told me that there is an discussion on-going to
change usleep_range() to calculate (min + max) / 2, which would change
this specific usleep_range() line to 550us, which is enough in my case
to make radeondrm(4) work fine on my iMac.

We would like to understand if this diff has any negative impact on
other devices.  Can you please give it a spin therefore?


Index: dev/pci/drm/include/linux/delay.h
===
RCS file: /cvs/src/sys/dev/pci/drm/include/linux/delay.h,v
retrieving revision 1.2
diff -u -p -u -p -r1.2 delay.h
--- dev/pci/drm/include/linux/delay.h   8 Jun 2020 04:48:14 -   1.2
+++ dev/pci/drm/include/linux/delay.h   14 Aug 2020 11:44:43 -
@@ -20,7 +20,7 @@ ndelay(unsigned long nsecs)
 static inline void
 usleep_range(unsigned long min, unsigned long max)
 {
-   DELAY(min);
+   DELAY((min + max) / 2);
 }
 
 static inline void