from:"Philip Guenther"

Re: missing warning in wireguard manual page

2022-07-25 Thread Philip Guenther

On Mon, Jul 25, 2022 at 7:20 AM Theo de Raadt  wrote:

> I've been watching conversation on a mailing list, and it leads me to
> wonder if we should inform the userbase better.
>

Too true.  Certification *is* the key thing that protects users, not
careful, well engineered designs.

We should be giving this warning in many other places too; for example:

 Index: stdlib/malloc.3
===
RCS file: /data/src/openbsd/src/lib/libc/stdlib/malloc.3,v
retrieving revision 1.129
diff -u -p -r1.129 malloc.3
--- stdlib/malloc.3 31 Mar 2022 17:27:16 -  1.129
+++ stdlib/malloc.3 25 Jul 2022 20:00:07 -
@@ -766,6 +766,11 @@ and
 functions appeared in
 .Ox 6.6 .
 .Sh CAVEATS
+Layout randomization in
+.Nm malloc
+uses uncertified random number generators,
+so the security properties cannot be guaranteed.
+.Pp
 When using
 .Fn malloc ,
 be wary of signed integer and

Re: move PRU_BIND request to (*pru_bind)() handler

2022-08-19 Thread Philip Guenther

On Fri, Aug 19, 2022 at 12:42 PM Vitaliy Makkoveev  wrote:

> bluhm@ pointed, that many KASSERT()s are not welcomed, so I didn't
> insert them into newly introduced handlers. Anyway except the tcp(4)
> protocol, `so_pcb' cant be NULL here. But the socket lock assertion
> looks reasonable.
>
> Some unp_*() functions could be merged with newly introduced uipc_*(),
> but I want to do this after (*pru_usrreq)() split finished.
>

Having multiple PROTO_bind() routines that just return EOPNOTSUPP seems
like overkill to me.  I think I would tend to just have the pru_bind()
inline do a NULL test and return EOPNOTSUPP if it is and leave the callback
NULL for all those protocols, but I could also see having a single
prubind_eopnotsupp()  (or whatever you think is a clear name)
implementation in some sys/kern/ file which could then be used by all the
protocols that don't implement it.  Consider how many protocols and
callbacks you're going to be working through and how many stubs that would
be if there's no sharing...

Philip

Re: move PRU_RCVD request to (*pru_rcvd)()

2022-08-23 Thread Philip Guenther

Since pru_rcvd() is only invoked if the protocol has the PR_WANTRCVD flag
set, there should be no need to test whether the callback is set: a
protocol without the callback MUST NOT have PR_WANTRCVD.

(I guess this could, alternatively, go the other direction and eliminate
PR_WANTRCVD and use the presence of the callback to decide whether the
protocol needs anything to be done.)

Side note: pru_rcvd() (and the pru_rcvd implementations) should have a
return type of void.


Philip Guenther



On Mon, Aug 22, 2022 at 1:40 PM Vitaliy Makkoveev  wrote:

> Another one.
>
> Since we never use `flags' arg within handlers, remove it from the
> pru_rcvd() args.
>
> Index: sys/sys/protosw.h
> ===
> RCS file: /cvs/src/sys/sys/protosw.h,v
> retrieving revision 1.43
> diff -u -p -r1.43 protosw.h
> --- sys/sys/protosw.h   22 Aug 2022 21:18:48 -  1.43
> +++ sys/sys/protosw.h   22 Aug 2022 22:27:08 -
> @@ -72,6 +72,7 @@ struct pr_usrreqs {
> int (*pru_accept)(struct socket *, struct mbuf *);
> int (*pru_disconnect)(struct socket *);
> int (*pru_shutdown)(struct socket *);
> +   int (*pru_rcvd)(struct socket *);
>  };
>
>  struct protosw {
> @@ -314,10 +315,11 @@ pru_shutdown(struct socket *so)
>  }
>
>  static inline int
> -pru_rcvd(struct socket *so, int flags)
> +pru_rcvd(struct socket *so)
>  {
> -   return (*so->so_proto->pr_usrreqs->pru_usrreq)(so,
> -   PRU_RCVD, NULL, (struct mbuf *)(long)flags, NULL, curproc);
> +   if (so->so_proto->pr_usrreqs->pru_rcvd)
> +   return (*so->so_proto->pr_usrreqs->pru_rcvd)(so);
> +   return (EOPNOTSUPP);
>  }
>
>  static inline int
> Index: sys/kern/uipc_socket.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_socket.c,v
> retrieving revision 1.284
> diff -u -p -r1.284 uipc_socket.c
> --- sys/kern/uipc_socket.c  21 Aug 2022 16:22:17 -  1.284
> +++ sys/kern/uipc_socket.c  22 Aug 2022 22:27:08 -
> @@ -1156,7 +1156,7 @@ dontblock:
> SBLASTRECORDCHK(&so->so_rcv, "soreceive 4");
> SBLASTMBUFCHK(&so->so_rcv, "soreceive 4");
> if (pr->pr_flags & PR_WANTRCVD && so->so_pcb)
> -   pru_rcvd(so, flags);
> +   pru_rcvd(so);
> }
> if (orig_resid == uio->uio_resid && orig_resid &&
> (flags & MSG_EOR) == 0 && (so->so_state & SS_CANTRCVMORE) ==
> 0) {
> @@ -1521,7 +1521,7 @@ somove(struct socket *so, int wait)
> if (m == NULL) {
> sbdroprecord(so, &so->so_rcv);
> if (so->so_proto->pr_flags & PR_WANTRCVD && so->so_pcb)
> -   pru_rcvd(so, 0);
> +   pru_rcvd(so);
> goto nextpkt;
> }
>
> @@ -1627,7 +1627,7 @@ somove(struct socket *so, int wait)
>
> /* Send window update to source peer as receive buffer has
> changed. */
> if (so->so_proto->pr_flags & PR_WANTRCVD && so->so_pcb)
> -   pru_rcvd(so, 0);
> +   pru_rcvd(so);
>
> /* Receive buffer did shrink by len bytes, adjust oob. */
> state = so->so_state;
> Index: sys/kern/uipc_usrreq.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
> retrieving revision 1.174
> diff -u -p -r1.174 uipc_usrreq.c
> --- sys/kern/uipc_usrreq.c  22 Aug 2022 21:18:48 -  1.174
> +++ sys/kern/uipc_usrreq.c  22 Aug 2022 22:27:08 -
> @@ -136,6 +136,7 @@ const struct pr_usrreqs uipc_usrreqs = {
> .pru_accept = uipc_accept,
> .pru_disconnect = uipc_disconnect,
> .pru_shutdown   = uipc_shutdown,
> +   .pru_rcvd   = uipc_rcvd,
>  };
>
>  void
> @@ -243,32 +244,6 @@ uipc_usrreq(struct socket *so, int req,
> }
> break;
>
> -   case PRU_RCVD:
> -   switch (so->so_type) {
> -
> -   case SOCK_DGRAM:
> -   panic("uipc 1");
> -   /*NOTREACHED*/
> -
> -   case SOCK_STREAM:
> -   case SOCK_SEQPACKET:
> -   if ((so2 = unp_solock_peer(so)) == NULL)
> -   break;
> -   /*
> -* Adjust backpressure on sender
> -* and wakeup any waiting to write.

Re: static inline, not inline static

2022-08-28 Thread Philip Guenther

On Sun, Aug 28, 2022 at 2:11 PM Anders Andersson  wrote:

> On Sun, Aug 28, 2022 at 3:15 PM Jonathan Gray  wrote:
> >
> > diff --git lib/libc/locale/wctoint.h lib/libc/locale/wctoint.h
> > index ea50c5ae1b6..14c7f0c466d 100644
> > --- lib/libc/locale/wctoint.h
> > +++ lib/libc/locale/wctoint.h
> > @@ -30,7 +30,7 @@
> >   */
> >
> >
> > -inline static int
> > +static inline int
> >  wctoint(wchar_t wc)
> >  {
> > int n;
> > [...]
>
> Why this change? As far as I can see, the standard allows for any order.
>

C99 standard stated:
"The placement of a storage-class speciﬁer other than at the beginning
of the declaration
 speciﬁers in a declaration is an obsolescent feature.

My recall is that it was officially removed in C11.

ok guenther@

Re: use volatile not volatile

2022-08-28 Thread Philip Guenther

On Sat, Aug 27, 2022 at 6:41 PM Jonathan Gray  wrote:

> directly use ansi volatile keyword not __volatile__ builtin
>

Yes, please.  ok guenther@

Re: use volatile not __volatile

2022-08-28 Thread Philip Guenther

On Sat, Aug 27, 2022 at 6:31 PM Jonathan Gray  wrote:

> directly use ansi volatile keyword not __volatile from cdefs.h
>

Yay!
ok guenther@

Re: add sendmmsg and recvmmsg systemcalls

2022-08-31 Thread Philip Guenther

On Tue, Aug 30, 2022 at 11:18 AM Moritz Buhl  wrote:

> the following diff only contains recvmmsg which should be the more useful
> syscall of the two.
>

Comments inline.

> --- sys/kern/syscalls.master1 Aug 2022 14:56:59 -   1.229
> +++ sys/kern/syscalls.master30 Aug 2022 15:44:29 -
> @@ -575,3 +575,6 @@
>  328OBSOL   __tfork51
>  329STD NOLOCK  { void sys___set_tcb(void *tcb); }
>  330STD NOLOCK  { void *sys___get_tcb(void); }
> +331STD NOLOCK  { int sys_recvmmsg(int s, struct mmsghdr *mmsg, \
> +   unsigned int vlen, unsigned int flags, \
> +   struct timespec *timeout); }
>

We believe in packing syscall numbers down, modulus having them group
nicely.  So, instead of putting recvmmsg() as a new high-water mark, I
would put it at 116 ("t32_gettimeofday") and mark 117 ("t32_getrusage") as
sendmmsg(), UNIMPL if not part of this round.

(Typically, when a diff changes syscalls.master then you leave out the diff
chunks for the generated files when sending for review, because they are
100% implied and are just noise.  Not a big deal)

--- sys/kern/uipc_syscalls.c14 Aug 2022 01:58:28 -  1.201
> +++ sys/kern/uipc_syscalls.c30 Aug 2022 17:03:09 -
> @@ -805,6 +805,135 @@ done:
>  }
>
>  int
> +sys_recvmmsg(struct proc *p, void *v, register_t *retval)
> +{
> +   struct sys_recvmmsg_args /* {
> +   syscallarg(int) s;
> +   syscallarg(struct mmsghdr *)mmsg;
> +   syscallarg(unsigned int)vlen;
> +   syscallarg(unsigned int)flags;
> +   syscallarg(struct timespec *)   timeout;
> +   } */ *uap = v;
> +   struct mmsghdr mmsg;
> +   struct timespec ts, now;
> +   struct iovec aiov[UIO_SMALLIOV], *uiov, *iov = aiov;
> +   struct file *fp;
> +   struct socket *so;
> +   struct timespec *timeout;
> +   unsigned int vlen, dg;
> +   int error = 0, flags, s;
> +
> +   timeout = SCARG(uap, timeout);
> +   if (timeout != NULL) {
> +   error = copyin(SCARG(uap, timeout), &ts, sizeof(ts));
> +   if (error != 0)
> +   return error;
>

Should have a KTRACE ktrreltimmespec() block here.
Should validate the timespec.
(Follow what sys_kevent() does)

+   getnanotime(&now);
> +   timespecadd(&now, &ts, &ts);
> +   }
> +
> +   s = SCARG(uap, s);
> +   if ((error = getsock(p, s, &fp)) != 0)
> +   return (error);
> +   so = (struct socket *)fp->f_data;
> +
> +   flags = SCARG(uap, flags);
> +
> +   vlen = SCARG(uap, vlen);
> +   if (vlen > 1024)
> +   vlen = 1024;
> +
> +   for (dg = 0; dg < vlen;) {
> +   error = copyin(SCARG(uap, mmsg) + dg, &mmsg, sizeof(mmsg));
> +   if (error != 0)
> +   break;
>

Hmm.  This copies in each mmsghdr structure when it gets to it.  Is that
how the Linux version behaves, lazily accessing them such that an early
exit (from timeout, signal, whatever) means later values aren't read?  Or
do they copy them all in, update any that are changed, then copy them out
at the end?

(Not sure it matters, but it's an interesting corner case to think
carefully about.)

 ...

> +   mmsg.msg_hdr.msg_iov = uiov;
> +   mmsg.msg_len = *retval;
> +#ifdef KTRACE
> +   if (KTRPOINT(p, KTR_STRUCT)) {
> +   ktrmsghdr(p, &mmsg.msg_hdr);
>

I think you should go ahead and define ktrmmsghdr() taking that full
struct, so kdump can report the msg_len value that is being returned.

> +   if (iov != aiov) {
> +   free(iov, M_IOV, sizeof(struct iovec) *
> +   mmsg.msg_hdr.msg_iovlen);
> +   iov = aiov;
> +   }
>

The iov freeing, IMO,, should be done once, at the end of the loop.  Just
keep growing as necessary (tracking the currently allocated size) and free
once.

kdump.c will need at least a SYS_recvmmsg line in the big table, and if you
do a ktrmmsghdr() bit in the kernel a matching decoder will be needed in
kdump.

Philip Guenther

Re: move PRU_CONTROL request to (*pru_control)()

2022-09-01 Thread Philip Guenther

Yes, splitting {tcp,udp}_usrreqs and {tcp,udp}6_usrreqs is clearly the
Right Thing.
ok guenther@

Unrelated to this specific callback, but I think you should consider
splitting out uipc_dgram_usrreqs out from uipc_usrreqs, as the SOCK_DGRAM
case differs from the other two for multiple callbacks.

Philip

On Thu, Sep 1, 2022 at 10:54 AM Vitaliy Makkoveev  wrote:

> The 'proc *' is not used for PRU_CONTROL request, so remove it from
> pru_control() wrapper.
>
> I want to use existing in{6,}_control for tcp(4) and udp(4) sockets, so
> for inet6 case I introduced `tcp6_usrreqs' and `udp6_usrreqs'
> structures. I also want to use them for the following PRU_SOCKADDR and
> PRU_PEERADDR.
>
> Since the PRU_SOCKADDR and PRU_PEERADDR are the last ones, I want to
> move the corresponding (*pru_)() handlers and kill (*pru_usrreq)() with
> single diff. Is it ok to review?
>
> Index: sys/sys/protosw.h
> ===
> RCS file: /cvs/src/sys/sys/protosw.h,v
> retrieving revision 1.51
> diff -u -p -r1.51 protosw.h
> --- sys/sys/protosw.h   1 Sep 2022 18:21:23 -   1.51
> +++ sys/sys/protosw.h   1 Sep 2022 19:35:08 -
> @@ -59,6 +59,7 @@ struct socket;
>  struct domain;
>  struct proc;
>  struct stat;
> +struct ifnet;
>
>  struct pr_usrreqs {
> /* user request: see list below */
> @@ -77,6 +78,8 @@ struct pr_usrreqs {
> int (*pru_send)(struct socket *, struct mbuf *, struct mbuf *,
> struct mbuf *);
> int (*pru_abort)(struct socket *);
> +   int (*pru_control)(struct socket *, u_long, caddr_t,
> +   struct ifnet *);
> int (*pru_sense)(struct socket *, struct stat *);
> int (*pru_rcvoob)(struct socket *, struct mbuf *, int);
> int (*pru_sendoob)(struct socket *, struct mbuf *, struct mbuf
> *,
> @@ -343,12 +346,12 @@ pru_abort(struct socket *so)
>  }
>
>  static inline int
> -pru_control(struct socket *so, u_long cmd, caddr_t data,
> -struct ifnet *ifp, struct proc *p)
> +pru_control(struct socket *so, u_long cmd, caddr_t data, struct ifnet
> *ifp)
>  {
> -   return (*so->so_proto->pr_usrreqs->pru_usrreq)(so,
> -   PRU_CONTROL, (struct mbuf *)cmd, (struct mbuf *)data,
> -   (struct mbuf *)ifp, p);
> +   if (so->so_proto->pr_usrreqs->pru_control)
> +   return (*so->so_proto->pr_usrreqs->pru_control)(so,
> +   cmd, data, ifp);
> +   return (EOPNOTSUPP);
>  }
>
>  static inline int
> Index: sys/kern/sys_socket.c
> ===
> RCS file: /cvs/src/sys/kern/sys_socket.c,v
> retrieving revision 1.53
> diff -u -p -r1.53 sys_socket.c
> --- sys/kern/sys_socket.c   14 Aug 2022 01:58:28 -  1.53
> +++ sys/kern/sys_socket.c   1 Sep 2022 19:35:07 -
> @@ -137,7 +137,7 @@ soo_ioctl(struct file *fp, u_long cmd, c
> if (IOCGROUP(cmd) == 'r')
> return (EOPNOTSUPP);
> KERNEL_LOCK();
> -   error = pru_control(so, cmd, data, NULL, p);
> +   error = pru_control(so, cmd, data, NULL);
> KERNEL_UNLOCK();
> break;
> }
> Index: sys/kern/uipc_usrreq.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
> retrieving revision 1.182
> diff -u -p -r1.182 uipc_usrreq.c
> --- sys/kern/uipc_usrreq.c  1 Sep 2022 18:21:22 -   1.182
> +++ sys/kern/uipc_usrreq.c  1 Sep 2022 19:35:07 -
> @@ -219,8 +219,6 @@ uipc_usrreq(struct socket *so, int req,
> struct socket *so2;
> int error = 0;
>
> -   if (req == PRU_CONTROL)
> -   return (EOPNOTSUPP);
> if (req != PRU_SEND && control && control->m_len) {
> error = EOPNOTSUPP;
> goto release;
> Index: sys/net/if.c
> ===
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.663
> diff -u -p -r1.663 if.c
> --- sys/net/if.c13 Aug 2022 21:01:46 -  1.663
> +++ sys/net/if.c1 Sep 2022 19:35:07 -
> @@ -2360,7 +2360,7 @@ forceup:
> break;
> /* FALLTHROUGH */
> default:
> -   error = pru_control(so, cmd, data, ifp, p);
> +   error = pru_control(so, cmd, data, ifp);
> if (error != EOPNOTSUPP)
> break;
> switch (cmd) {
> Index: sys/net/pfkeyv2.c
> ===
> RCS file: /cvs/src/sys/net/pfkeyv2.c,v
> retrieving revision 1.249
> diff -u -p -r1.249 pfkeyv2.c
> --- sys/net/pfkeyv2.c   1 Sep 2022 18:21:23 -   1.249
> +++ sys/net/pfkeyv2.c   1 Sep 2022 19:35:07 -
> @@ -395,9 +395,6 @@ pfkeyv2_usrreq(struct socket *so, int re
> struct p

Re: Change pru_rcvd() return type to the type of void

2022-09-10 Thread Philip Guenther

ok guenther@

(Thanks!)

On Sat, Sep 10, 2022 at 10:20 AM Vitaliy Makkoveev  wrote:

> We have no interest on pru_rcvd() return value. Also, we call pru_rcvd()
> only if the socket's protocol have PR_WANTRCVD flag set. Such sockets
> are route domain, tcp(4) and unix(4) sockets.
>
> This diff keeps the PR_WANTRCVD check. In other hand we could always
> call pru_rcvd() and do "pru_rcvd != NULL" check within, but in the
> future with per buffer locking, we could have some re-locking around
> pru_rcvd() call and I want to do it outside wrapper.
>
>
> Index: sys/kern/uipc_usrreq.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_usrreq.c,v
> retrieving revision 1.185
> diff -u -p -r1.185 uipc_usrreq.c
> --- sys/kern/uipc_usrreq.c  3 Sep 2022 22:43:38 -   1.185
> +++ sys/kern/uipc_usrreq.c  10 Sep 2022 18:51:42 -
> @@ -363,7 +363,7 @@ uipc_shutdown(struct socket *so)
> return (0);
>  }
>
> -int
> +void
>  uipc_rcvd(struct socket *so)
>  {
> struct socket *so2;
> @@ -390,8 +390,6 @@ uipc_rcvd(struct socket *so)
> default:
> panic("uipc 2");
> }
> -
> -   return (0);
>  }
>
>  int
> Index: sys/net/rtsock.c
> ===
> RCS file: /cvs/src/sys/net/rtsock.c,v
> retrieving revision 1.355
> diff -u -p -r1.355 rtsock.c
> --- sys/net/rtsock.c8 Sep 2022 10:22:06 -   1.355
> +++ sys/net/rtsock.c10 Sep 2022 18:51:42 -
> @@ -115,7 +115,7 @@ int route_attach(struct socket *, int);
>  introute_detach(struct socket *);
>  introute_disconnect(struct socket *);
>  introute_shutdown(struct socket *);
> -introute_rcvd(struct socket *);
> +void   route_rcvd(struct socket *);
>  introute_send(struct socket *, struct mbuf *, struct mbuf *,
> struct mbuf *);
>  introute_abort(struct socket *);
> @@ -299,7 +299,7 @@ route_shutdown(struct socket *so)
> return (0);
>  }
>
> -int
> +void
>  route_rcvd(struct socket *so)
>  {
> struct rtpcb *rop = sotortpcb(so);
> @@ -314,8 +314,6 @@ route_rcvd(struct socket *so)
> ((sbspace(rop->rop_socket, &rop->rop_socket->so_rcv) ==
> rop->rop_socket->so_rcv.sb_hiwat)))
> rop->rop_flags &= ~ROUTECB_FLAG_FLUSH;
> -
> -   return (0);
>  }
>
>  int
> Index: sys/netinet/tcp_usrreq.c
> ===
> RCS file: /cvs/src/sys/netinet/tcp_usrreq.c,v
> retrieving revision 1.207
> diff -u -p -r1.207 tcp_usrreq.c
> --- sys/netinet/tcp_usrreq.c3 Sep 2022 22:43:38 -   1.207
> +++ sys/netinet/tcp_usrreq.c10 Sep 2022 18:51:42 -
> @@ -792,18 +792,17 @@ out:
>  /*
>   * After a receive, possibly send window update to peer.
>   */
> -int
> +void
>  tcp_rcvd(struct socket *so)
>  {
> struct inpcb *inp;
> struct tcpcb *tp;
> -   int error;
> short ostate;
>
> soassertlocked(so);
>
> -   if ((error = tcp_sogetpcb(so, &inp, &tp)))
> -   return (error);
> +   if (tcp_sogetpcb(so, &inp, &tp))
> +   return;
>
> if (so->so_options & SO_DEBUG)
> ostate = tp->t_state;
> @@ -820,7 +819,6 @@ tcp_rcvd(struct socket *so)
>
> if (so->so_options & SO_DEBUG)
> tcp_trace(TA_USER, ostate, tp, tp, NULL, PRU_RCVD, 0);
> -   return (0);
>  }
>
>  /*
> Index: sys/netinet/tcp_var.h
> ===
> RCS file: /cvs/src/sys/netinet/tcp_var.h,v
> retrieving revision 1.157
> diff -u -p -r1.157 tcp_var.h
> --- sys/netinet/tcp_var.h   3 Sep 2022 22:43:38 -   1.157
> +++ sys/netinet/tcp_var.h   10 Sep 2022 18:51:42 -
> @@ -725,7 +725,7 @@ int  tcp_connect(struct socket *, struct
>  int tcp_accept(struct socket *, struct mbuf *);
>  int tcp_disconnect(struct socket *);
>  int tcp_shutdown(struct socket *);
> -int tcp_rcvd(struct socket *);
> +voidtcp_rcvd(struct socket *);
>  int tcp_send(struct socket *, struct mbuf *, struct mbuf *,
>  struct mbuf *);
>  int tcp_abort(struct socket *);
> Index: sys/sys/protosw.h
> ===
> RCS file: /cvs/src/sys/sys/protosw.h,v
> retrieving revision 1.55
> diff -u -p -r1.55 protosw.h
> --- sys/sys/protosw.h   5 Sep 2022 14:56:09 -   1.55
> +++ sys/sys/protosw.h   10 Sep 2022 18:51:42 -
> @@ -72,7 +72,7 @@ struct pr_usrreqs {
> int (*pru_accept)(struct socket *, struct mbuf *);
> int (*pru_disconnect)(struct socket *);
> int (*pru_shutdown)(struct socket *);
> -   int (*pru_rcvd)(struct socket *);
> +   void(*pru_rcvd)(struct socket *);
> int (*pru_send)(struct socket *, struct mbuf *, struct mbuf *,
> struct mbuf *);
> int (*pru_abort)(struct socket *

Re: sparc64: constify ddb instruction table

2022-10-21 Thread Philip Guenther

On Fri, Oct 21, 2022 at 11:04 AM Klemens Nanni  wrote:

> --- a/sys/arch/sparc64/sparc64/db_disasm.c
> +++ b/sys/arch/sparc64/sparc64/db_disasm.c
>
...

> @@ -877,7 +877,7 @@ struct sparc_insn sparc_i[] = {
>  vaddr_t
>  db_disasm(vaddr_t loc, int altfmt)
>  {
> -   struct sparc_insn*  i_ptr = (struct sparc_insn *)&sparc_i;
> +   const struct sparc_insn *i_ptr = (const struct sparc_insn
> *)&sparc_i;
>

What's with that cast?  Is it only there because sparc_i is an array and
it's wrong to take its address when we just want a pointer to its first
element?  I mean, shouldn't that line just (with const) be:
const struct sparc_insn *i_ptr = sparc_i;/* or &sparc_i[0] */

?

Philip Guenther

Re: OpenBSD::MkTemp vs Devel::Cover

2023-07-09 Thread Philip Guenther

Yeah, I don't really get what's going on here that Devel::Cover is unhappy
about.

Maybe it's something about how my mkstemps_real() implementation creates
the filehandle that it returns.  I see perlxstut(1) now talks about
{Input,InOut,Output}Stream and PerlIO* in the typemap, so maybe I should
change it to skip the newGVgen()+sv_bless() dance, switch from PPCODE: to
CODE: with a return type of InOutStream, and return the PerlIO_fdopen()
result?

(I also suspect it should be using mkostemps(..., O_CLOEXEC), given how
newer perls have switched to setting the close-on-exec flag by default in
the low-level calls, but it's 100% unclear to me what perl's expectations
are for the close-on-exec flag of an fd returned to it by an XS.)

Philip


On Sun, Jul 9, 2023 at 12:30 PM Andrew Hewus Fresh 
wrote:

> On Sat, Jul 08, 2023 at 12:09:01PM -0700, Andrew Hewus Fresh wrote:
> > On Sat, Jul 08, 2023 at 11:18:00AM +0200, Marc Espie wrote:
> > > Hey, Philip, you wrote this a long time ago.
> > >
> > > Now, I'm trying to get some coverage out of Devel::Cover on pkg_add,
> > > and somehow, it gets in the way.
> > >
> > > # perl -MDevel::Cover=+select,OpenBSD/.* /usr/sbin/pkg_add random_run
> > 
> > >   the problem was:
> > >
> > > Found type 9 GLOB(0x64a7292fc10), but it is not representable by the
> Sereal encoding format at
> /usr/local/libdata/perl5/site_perl/amd64-openbsd/Devel/Cover/DB/IO/Sereal.pm
> line 46.
> > >
> > >
> > > I have zero idea if Devel::Cover is to blame or if OpenBSD::MkTemp is
> missing
> > > some magic annotation to cover for _GEN_1, _GEN_2, but the end result
> is
> > > that NO coverage data gets written, none at all (I suspect some missing
> > > annotations, see below)
> >
> > Same, especially when I can't reproduce on my laptop that needs to be
> > updated or on my sparc64 that just got updated today:
>
>
> I take that back, running the same command as root I see the same
> problem.
>
> However, trying to reproduce with a simple case doesn't show me the same
> issue, so not quite sure what Devel::Cover is trying to store.
>
> $ cat mktemp.pl
> #!/usr/bin/perl
> use v5.36;
>
> use OpenBSD::MkTemp qw< mkstemp mkdtemp >;
>
> sub output($fh, $content) {
> print $fh $content;
> }
>
> my $d = mkdtemp("fooX");
> my ($fh, $file) = mkstemp("fooX");
>
> output($fh, "Hello World\n");
>
> close $fh;
>
> $ doas perl -MDevel::Cover=+select,OpenBSD.* ./mktemp.pl
>
> This version of Devel::Cover was built with Perl version 5.036000.
> It is now being run with Perl version 5.036001.
> Attempting to make adjustments, but you may find that some of your modules
> do
> not have coverage data collected.  You may need to alter the +-inc,
> +-ignore
> and +-select options.
>
> Devel::Cover 1.40: Collecting coverage data for branch, condition, pod,
> statement, subroutine and time.
> Selecting packages matching:
> OpenBSD.*
> Ignoring packages matching:
> /Devel/Cover[./]
> Ignoring packages in:
> /usr/local/libdata/perl5/site_perl/sparc64-openbsd
> /usr/local/libdata/perl5/site_perl
> /usr/libdata/perl5/sparc64-openbsd
> /usr/libdata/perl5
> Devel::Cover: Deleting old coverage for changed file mktemp.pl
> Devel::Cover: getting BEGIN block coverage
> Devel::Cover: 100% - 0s taken
> Devel::Cover: getting CHECK block coverage
> Devel::Cover: 100% - 0s taken
> Devel::Cover: getting END/INIT block coverage
> Devel::Cover: 100% - 0s taken
> Devel::Cover: getting CV coverage
> Devel::Cover: 100% - 0s taken
> Devel::Cover: Writing coverage database to
> /tmp/x/cover_db/runs/1688930470.15647.20075
>  -- -- -- -- -- --
> --
> File   stmt   bran   condsubpod   time
> total
>  -- -- -- -- -- --
> --
> ...openbsd/OpenBSD/MkTemp.pm   77.7   25.0   25.0   83.30.00.7
>  61.7
> mktemp.pl 100.0n/an/a  100.0n/a   99.3
> 100.0
> Total  87.5   25.0   25.0   88.80.0  100.0
>  74.5
>  -- -- -- -- -- --
> --
>
>

Re: remove extra parentheses

2023-07-14 Thread Philip Guenther

On Tue, Jul 11, 2023 at 9:21 PM Masato Asou  wrote:

> ok ?
>

ok guenther@

(I think this was simply from how the original example was simplified in
rev 1.25)

Philip Guenther

Re: Stop using direct syscall(2) from perl(1)

2023-07-20 Thread Philip Guenther

On Thu, Jul 20, 2023 at 10:59 PM Theo de Raadt  wrote:

> Andrew Hewus Fresh  wrote:
>
> > One thing to note, on the advice of miod@, I adjusted the mapping of
> > size_t from u_int to u_long, but that means we no longer recognize
> > getlogin_r.  When I asked, miod@ suggested that syscalls.master was
> > confused.
> >
> > $ grep getlogin_r /usr/src/sys/kern/syscalls.master /usr/include/unistd.h
> > /usr/src/sys/kern/syscalls.master:141   STD { int
> sys_getlogin_r(char *namebuf, u_int namelen); }
> > /usr/include/unistd.h:intgetlogin_r(char *, size_t)
>
> That's quite a surprise.
>
> syscalls.master will need to be changed, which will change
> struct sys_getlogin_r_args, which is used inside sys_getlogin_r() in
> kern_prot.c.  Don't get fooled by the advisory /* */ block you see
> there, it is using the generated struct.  But the offset to load the
> field isn't changing, and luckily the incorrect field is smaller than
> the correct field, so I *think* there is no major ABI crank needed.
>

Yeah, at least for syscall args a u_int and a size_t both get passed in a
register-sized space on all our platforms' ABIs, so sys_getlogin_r_args
doesn't actually change size or offsets.

(If it was off_t or long long vs an int or long-sized arg, that would be
different, of course, on 32bit archs, but this change is in the clear.)

It does mean the syscall is implicitly reducing the len argument mod 2^32
on ILP32 archs, so it may fail with ERANGE if passed a 4GB buffer when it
should succeed, but I don't think that merits any sort of syspatch or such.

Philip Guenther

Re: kqueue(2) and close-on-exec

2023-08-13 Thread Philip Guenther

I think that changing the behavior of the existing API in a way that
gratuitously increases the differences between BSDs is unwise.  IMHO, we
should follow NetBSD on this and add kqueue1(), that being obviously
consistent with the previous 'add flags argument' extensions: pipe2(),
dup3(), and accept4().

(As you know, close-on-fork is imposed because the "take a descriptor
number" filters are specified in a way tied to a struct filedesc and
there's no obvious extension to the kqueue behavior for sharing it across
processes.  Oh, and as a result trying to do so would result in
use-after-frees and/or assertion failures in the kernel.  Using a kqueue
after exec() is weird, but not logically undefined in that way.)

Philip

On Sun, Aug 13, 2023 at 4:55 AM Visa Hankala  wrote:

> FreeBSD and NetBSD have variants of the kqueue(2) system call that
> allow setting the close-on-exec flag on the returned file descriptor.
>
> In general, I think it is good that the flag can be set atomically
> for new descriptors. However, it seems to me that it is almost surely
> a mistake if a kqueue descriptor is passed over an exec.
>
> Instead of adding a new system call, maybe close-on-exec should be
> enabled automatically by kqueue(2). Today it feels backwards that
> close-on-exec is off by default.
>
> Note that kqueue cannot be inherited by accident in fork-then-exec
> situations because fork(2) closes kqueue descriptors for the child
> process.
>
> Index: sys/kern/kern_event.c
> ===
> RCS file: src/sys/kern/kern_event.c,v
> retrieving revision 1.197
> diff -u -p -r1.197 kern_event.c
> --- sys/kern/kern_event.c   13 Aug 2023 08:29:28 -  1.197
> +++ sys/kern/kern_event.c   13 Aug 2023 10:42:45 -
> @@ -932,7 +932,7 @@ sys_kqueue(struct proc *p, void *v, regi
> *retval = fd;
> LIST_INSERT_HEAD(&fdp->fd_kqlist, kq, kq_next);
> kq = NULL;
> -   fdinsert(fdp, fd, 0, fp);
> +   fdinsert(fdp, fd, UF_EXCLOSE, fp);
> FRELE(fp, p);
>  out:
> fdpunlock(fdp);
> Index: lib/libc/sys/kqueue.2
> ===
> RCS file: src/lib/libc/sys/kqueue.2,v
> retrieving revision 1.48
> diff -u -p -r1.48 kqueue.2
> --- lib/libc/sys/kqueue.2   13 Aug 2023 08:29:28 -  1.48
> +++ lib/libc/sys/kqueue.2   13 Aug 2023 10:42:45 -
> @@ -74,6 +74,7 @@ on a file descriptor will remove any kev
>  .Pp
>  .Fn kqueue
>  creates a new kernel event queue and returns a descriptor.
> +The new descriptor has close-on-exec flag set.
>  The queue is not inherited by a child created with
>  .Xr fork 2 .
>  Similarly, kqueues cannot be passed across UNIX-domain sockets.
>
>

Re: ps.1/kvm documentation

2023-09-11 Thread Philip Guenther

On Mon, Sep 11, 2023 at 5:29 AM Marc Espie 
wrote:

> On Mon, Sep 11, 2023 at 12:10:17PM +0200, Claudio Jeker wrote:
> > On Mon, Sep 11, 2023 at 11:02:00AM +0200, Marc Espie wrote:
> > > I was reading through ps.1, which has two slightly different options
> > >  -H  Also display information about kernel visible threads.
> > >  -k  Also display information about kernel threads.
> > >
> > > It's not at all obvious what the difference between these options
> might be.
> >
> > kernel threads == kthread(9) created threads
> >
> > Those should have K in STAT and the name is in () like:
> >  3141 ??  RK/1   4057:57.90 (idle1)
>

Fun fact: at the current time, all kernel threads are actually full
processes, though this just wastes some memory and cache.
Part of my (evil) plan in adding thread names was removing an impediment in
changing kthread_create(9) to create just a thread inside process 0.

> kernel visible threads == __tfork_thread(3) created threads for userland
> > applications. For example:
> >
> > 43838  556612 ??  IpU  0:01.58 firefox (firefox/Cache2 I/O)
> > 43838  415551 ??  IpU  0:00.01 firefox (firefox/Cookie)
> > 43838  377915 ??  IpU  0:00.01 firefox (firefox/Worker Launcher)
> >
> > These threads all share the same PID but have different TID.
> >
> > I think the "kernel visible" is there to tell that pure userland threads
> > can not be reported by ps. I think go routines are such an example.
>

Claudio's descriptions are correct.  How to massage those into the ps(1)
manpage is unclear to me.

> Any of you guys willing to provide a patch to ps(1) and kvm_getprocs(3) ?
>

I think I've written a diff for kvm_getprocs(3) at least 3 times and each
previous time the way the API makes me grit my teeth has resulted in
incomplete rewrites of the API and its callers that I haven't been happy
enough with to complete and push through.  Enough stalling on documenting
the current behavior: how about something like the diff below?

Philip

 Index: lib/libkvm/kvm_getprocs.3
===
RCS file: /data/src/openbsd/src/lib/libkvm/kvm_getprocs.3,v
retrieving revision 1.21
diff -u -p -r1.21 kvm_getprocs.3
--- lib/libkvm/kvm_getprocs.3   11 Aug 2019 15:48:08 -  1.21
+++ lib/libkvm/kvm_getprocs.3   12 Sep 2023 04:25:32 -
@@ -91,6 +91,13 @@ processes with real user ID
 .Fa arg
 .El
 .Pp
+In addition, if
+.Dv KERN_PROC_SHOW_THREADS
+is bitwise ORed into
+.Fa op
+then additional entries for matched processes' associated threads
+are returned.
+.Pp
 Only the first
 .Fa elemsize
 bytes of each array entry are returned.

Re: Add ENOTDIR as possible error to unveil(2)

2023-09-16 Thread Philip Guenther

On Sat, Sep 16, 2023 at 9:47 AM Ingo Schwarze  wrote:

> Janne Johansson wrote on Sat, Sep 16, 2023 at 11:49:10AM +0200:
>
> > In case someone wants the change in a diff format, those 5 seconds
> > of work are available here:
> > http://c66.it.su.se:8080/obsd/unveil.2.diff
>
> How come jj@ forgot we want diffs inline?  
>
> > +.It Bq Er ENOTDIR
> > +.Fa path
> > +points to a file that is not a directory.
>
> That diff is certainly not OK.
> It is misleading because a "path" argument pointing to a file
> that is not a directory is actually valid.
>
> The problem only occurs when you add a trailing slash to the name
> of a file that is not a directory, like this:
>
>$ touch tmp.txt
>$ ls -l tmp.txt
>   -rw-r--r--  1 schwarze  schwarze  0 Sep 16 18:00 tmp.txt
>$ ls -l tmp.txt/
>   ls: tmp.txt/: Not a directory
>$ ls -l tmp.txt/foobar
>   ls: tmp.txt/foobar: Not a directory
>
> Looking at sys_unveil in /sys/kern/vfs_syscalls.c, i suspect that
> errno actually comes from a lower layer, likely namei(9), rather than
> from unveil(2) itself.
>

Yes, from namei(9)/vfs_lookup(9) and the VOP_LOOKUP() calls they make.
Basically all the syscalls that take a path should document the standard
set of errnos that paths can prompt: ENOTDIR, ENAMETOOLONG, ENOENT, EACCES,
ELOOP, EIO, (and EFAULT I guess).  Those in the *at() family should
document EBADF, and the fd-related causes of ENOTDIR and EACCES.  And then
there are additional errors that those calls which may create or modify the
target path probably need to document: EISDIR and EROFS, maybe ENXIO or
EPERM...



> In library manual pages, we occasionally have sentences like this:
>
>  execv() may fail and set errno for any of the errors specified
>  for the library function execve(2).
>
>  The fgetln() function may also fail and set errno for any of the
>  errors specified for the routines fflush(3), malloc(3), read(2),
>  stat(2), or realloc(3).
>
> But it doesn't look like we do that in system call manuals, and
> besides, namei(9) appears to lack the RETURN VALUES section.
>

Hmm, would it be useful to fill in namei(9)'s RETURN VALUES section with
the template of the "so you took a path argument..." errnos that syscall
manpages can start from?



> So, what is our policy with syscall ERRORS sections?
> For standardized syscalls, POSIX probably needs to be taken
> into account in addition to our implementation.
> But for our very own syscalls?  Hm...
>
> access(2), acct(2), chdir(2), chflags(2), chmod(2), chown(2), chroot(2),
> execve(2), getfh(2), ktrace(2), link(2), mkdir(2), mkfifo(2), mknod(2),
> mount(2), open(2), pathconf(2), quotactl(2), readlink(2), rename(2),
> revoke(2), rmdir(2), stat(2), statfs(2), swapctl(2), symlink(2),
> truncate(2), unlink(2), utimes(2) all list ENOTDIR, and i don't see
> any right now taking a path argument that don't.
>
> Do we want the following diff?
>

Yes, I think we want at least that.  ok guenther@


> I did not sort the existing return values in alphabetical order,
> > there were two out of order in there, but that is a separate
> > commit I guess.
>
> NetBSD has a policy of sorting ERRORS entries alphabetically, we don't.
>
> Yours,
>   Ingo
>
>
> Index: unveil.2
> ===
> RCS file: /cvs/src/lib/libc/sys/unveil.2,v
> retrieving revision 1.22
> diff -u -r1.22 unveil.2
> --- unveil.26 Sep 2021 08:03:08 -   1.22
> +++ unveil.216 Sep 2023 16:38:24 -
> @@ -158,6 +158,10 @@
>  A directory in
>  .Fa path
>  did not exist.
> +.It Bq Er ENOTDIR
> +A component of the
> +.Fa path
> +prefix is not a directory.
>  .It Bq Er EINVAL
>  An invalid value of
>  .Fa permissions
>
>

Re: ktrwriteraw & vget

2020-03-17 Thread Philip Guenther

On Tue, Mar 17, 2020 at 1:07 AM Martin Pieuchot  wrote:

> On 16/03/20(Mon) 14:01, Martin Pieuchot wrote:
> > vget(9) might fail, stop right away if that happens.
> >
> > CID 1453020 Unchecked return value.
>
> Updated diff that stops tracing if vget(9) fails, similar to what's
> currently done if VOP_WRITE(9) fails, suggested by visa@.
>
> Code shuffling is there to avoid calling vput(9) if vget(9) doesn't
> succeed.
>
> ok?
>
> Index: kern/kern_ktrace.c
> ===
> RCS file: /cvs/src/sys/kern/kern_ktrace.c,v
> retrieving revision 1.100
> diff -u -p -r1.100 kern_ktrace.c
> --- kern/kern_ktrace.c  6 Oct 2019 16:24:14 -   1.100
> +++ kern/kern_ktrace.c  17 Mar 2020 09:46:03 -
> @@ -649,12 +649,17 @@ ktrwriteraw(struct proc *curp, struct vn
> auio.uio_iovcnt++;
> auio.uio_resid += kth->ktr_len;
> }
> -   vget(vp, LK_EXCLUSIVE | LK_RETRY);
> +   error = vget(vp, LK_EXCLUSIVE | LK_RETRY);
> +   if (error)
> +   goto bad;
> error = VOP_WRITE(vp, &auio, IO_UNIT|IO_APPEND, cred);
> -   if (!error) {
> -   vput(vp);
> -   return (0);
> -   }
> +   vput(vp);
> +   if (error)
> +   goto bad;
> +
> +   return (0);
> +
> +bad:
> /*
>  * If error encountered, give up tracing on this vnode.
>  */
> @@ -663,8 +668,6 @@ ktrwriteraw(struct proc *curp, struct vn
> LIST_FOREACH(pr, &allprocess, ps_list)
> if (pr->ps_tracevp == vp && pr->ps_tracecred == cred)
> ktrcleartrace(pr);
> -
> -   vput(vp);
> return (error);
>  }
>

This looks unsafe to me: isn't ktrcleartrace() only safe if the caller
holds a reference to the vnode?  Once ktrcleartrace() clears the reference
from the current thread's process and it goes on the freelist, can't the
vnode vp points to be invalidated and reused?

Maybe this vget() should be split into vref() + vn_lock(), and the vput()
similarly split into VOP_UNLOCK() + vrele(), so the vrele() can be left
after this LIST_FOREACH() while the VOP_UNLOCK() is hoisted to the
"vn_lock() succeeded" path.


Philip Guenther

Re: ktrwriteraw & vget

2020-03-17 Thread Philip Guenther

On Tue, Mar 17, 2020 at 5:18 AM Martin Pieuchot  wrote:

> On 17/03/20(Tue) 04:02, Philip Guenther wrote:
> > On Tue, Mar 17, 2020 at 1:07 AM Martin Pieuchot  wrote:
> > [...]
> > > @@ -663,8 +668,6 @@ ktrwriteraw(struct proc *curp, struct vn
> > > LIST_FOREACH(pr, &allprocess, ps_list)
> > > if (pr->ps_tracevp == vp && pr->ps_tracecred == cred)
> > > ktrcleartrace(pr);
> > > -
> > > -   vput(vp);
> > > return (error);
> > >  }
> > >
> >
> > This looks unsafe to me: isn't ktrcleartrace() only safe if the caller
> > holds a reference to the vnode?  Once ktrcleartrace() clears the
> reference
> > from the current thread's process and it goes on the freelist, can't the
> > vnode vp points to be invalidated and reused?
>
> As long as a process holds a reference to the vnode, via `ps_tracevp',
> it wont be recycle.  Only the last call of ktrcleartrace() will release
> the vnode via vrele(9).
>

...and after that last reference is released this code will continue to
walk the allprocess list, comparing a possibly-recycled pointer to
ps_tracevp pointers in the remaining processes.  Good thing that vrele(9)
is guaranteed to never sleep and thereby let another process start
ktracing, reusing that vnode.??

Re: EV_SET(2) shadows variable

2020-04-02 Thread Philip Guenther

On Tue, Mar 31, 2020 at 11:24 PM Martin Pieuchot  wrote:

> The current form of EV_SET(2) declares a `kevp' variable.  This can
> cause to subtle bugs hard to discover if one uses a variable of the
> same to retrieve events.
>
> Diff below prefixes the locally declared variable by an underscore,
> like it it done in FD_ZERO(), which should be good enough to prevent
> surprises.
>
> Is it the right way to correct such issue?  How many underscores are
> enough?  Can the standards gurus tell me?
>

tl;dr: this should use a variable that starts with either two underbars
(__kevp) or an underbar and a capital (_Kevp).  The same is true of
FD_ZERO().

The namespace of identifiers that begin with a single underbar is only
"reserved for use as identifiers with file scope in both the ordinary and
tag name spaces".  That means it's perfectly legal for an application to
declare a *local* variable with such a name.  So, this should be legal:

#include 
#include 
#include 
#include 
int main(void)
{
struct fd_set _p;
struct kevent _kevp;
FD_ZERO(&_p):
EV_SET(&_kevp, 0, 0, 0, 0, 0, 0);
return 0;
}

...but it currently blows up on the FD_ZERO() and would blow up in the
EV_SET() with your proposed diff.

Philip Guenther

Re: EV_SET(2) shadows variable

2020-04-03 Thread Philip Guenther

On Fri, 3 Apr 2020, Martin Pieuchot wrote:
> Thanks, here it is, ok?

ok guenther@

Re: kdump futex fix

2020-04-03 Thread Philip Guenther

On Fri, 3 Apr 2020, Martin Pieuchot wrote:
> Depending on the operation requested futex(2) might return the number of 
> woken threads or an error.  That means the following...
> 
> mpv  CALL  futex(0xa58935899b0,0x82,1,0,0)
> mpv  RET   futex -1 errno 1 Operation not permitted
> 
> ...is not an error but it indicates that 1 thread has been awoken.
> 
> Diff below would have saved me quite some time looking in the wrong 
> direction.
> 
> ok?

Isn't that just a complicated way to get the same effect as deleting the 
"case SYS_futex:" line shortly above there?

The real problem is that futex(2) is actually 3 different syscalls wrapped 
into one.  It was split into three then kdump could properly report 
futex_wake(2) and futex_requeue(2) as returning a count, while 
futex_wait(2) returns an errno.  The existing 'switch' in sys_futex() 
would just move to userspace's futex(3), provided for linux compat.

(And our syscalls would have proper prototypes and futex_wait() could take 
a clockid_t, and)

Re: split futex into three

2020-04-04 Thread Philip Guenther

On Sat, 4 Apr 2020, Paul Irofti wrote:
> Here is a proper diff (both sys and libc into one).

Okay, bunch o' comments and thoughts of varying strength of opinion.

> diff --git lib/libc/Symbols.list lib/libc/Symbols.list
> index f9aa62ab6e8..4fa37a835aa 100644
> --- lib/libc/Symbols.list
> +++ lib/libc/Symbols.list
> @@ -86,7 +86,10 @@ _thread_sys_fstatat
>  _thread_sys_fstatfs
>  _thread_sys_fsync
>  _thread_sys_ftruncate
> -_thread_sys_futex
> +_thread_sys_ofutex

The ofutex syscall should not be built (or exported) by libc. The kernel 
needs to continue to provide it for the normal period to support older 
libc's where futex(2) is a direct syscall,but no application does, would, 
or should invoke ofutex() under that name.

> +_thread_sys_futex_wait
> +_thread_sys_futex_wake
> +_thread_sys_futex_requeue

glibc has internal inline functions futex_wait() and futex_wake() and 
there has been at least discussion about exporting some version of them.  
If our signatures matched the last-best-proposal over there (which was 
dropped, mind you) then I would be tempted to use those names.  If not, 
then maybe go with _futex_wait() and _futex_wake()?

FUTEX_REQUEUE is the old bad one, with no val3 argument that's checked 
before the operation.  Our libc/libpthread don't actually use them, and in 
the Linux world glibc switched completely to FUTEX_CMP_REQUEUE.  Perhaps 
we should drop support for FUTEX_REQUEUE (major bump, yah) and add 
_futex_cmp_requeue(2) when we need it?

> --- lib/libc/gen/sigwait.c
> +++ lib/libc/gen/sigwait.c

This is unrelated to futex stuff.  :)

> --- lib/libc/hidden/sys/futex.h
> +++ lib/libc/hidden/sys/futex.h
> @@ -20,6 +20,8 @@
>  
>  #include_next 
>  
> -PROTO_NORMAL(futex);

You need to keep this as long as futex() is used internally.  Once futex() 
is no longer used in libc then this should change to 
"PROTO_DEPRECATED(futex);" and the "DEF_STRONG(futex);" line should be 
deleted.

(If you have DEF_STRONG(foo) or DEF_WEAK(foo), then you must have 
PROTO_NORMAL(foo).  c.f libc/include/README II.B.2)

> --- lib/libc/thread/synch.h
> +++ lib/libc/thread/synch.h
> @@ -19,10 +19,33 @@
>  #include 
>  #include 
>  
> +static inline int
> +_futex(volatile uint32_t *p, int op, int val, const struct timespec 
> *timeout, uint32_t *g)
> +{
> + int flags = 0;
> +
> + if (op & FUTEX_PRIVATE_FLAG)
> + flags |= FT_PRIVATE;
> +
> + switch (op) {
> + case FUTEX_WAIT:
> + case FUTEX_WAIT_PRIVATE:
> + return futex_wait(p, val, timeout, flags);
> + case FUTEX_WAKE:
> + case FUTEX_WAKE_PRIVATE:
> + return futex_wake(p, val, flags);
> + case FUTEX_REQUEUE:
> + case FUTEX_REQUEUE_PRIVATE:
> + return futex_requeue(p, val, g, timeout, flags);
> + }
> +
> + return ENOSYS;
> +}

I would put this logic directly into futex()'s definition, and then...

>  static inline int
>  _wake(volatile uint32_t *p, int n)
>  {
> - return futex(p, FUTEX_WAKE_PRIVATE, n, NULL, NULL);
> + return _futex(p, FUTEX_WAKE_PRIVATE, n, NULL, NULL);
>  }

have this just be
return futex_wake(p, n, FUTEX_WAKE_PRIVATE);

and similar for _twait() below.

(I would be inclined to use FUTEX_WAKE_PRIVATE as the flag for the new 
syscalls instead of exposing FT_PRIVATE into the userspace API.)

> --- sys/kern/sys_futex.c
> +++ sys/kern/sys_futex.c
> @@ -83,9 +83,74 @@ futex_init(void)
>  }
>  
>  int
> -sys_futex(struct proc *p, void *v, register_t *retval)
> +sys_futex_wait(struct proc *p, void *v, register_t *retval)
>  {
> - struct sys_futex_args /* {
> + struct sys_futex_wait_args /* {
> + syscallarg(uint32_t *) f;
> + syscallarg(inr) val;
> + syscallarg(const struct timespec *) timeout;
> + syscallarg(int) flags;
> + } */ *uap = v;
> + uint32_t *uaddr = SCARG(uap, f);
> + uint32_t val = SCARG(uap, val);
> + const struct timespec *timeout = SCARG(uap, timeout);
> + int flags = SCARG(uap, flags);
> +
> + KERNEL_LOCK();
> + rw_enter_write(&ftlock);
> + *retval = futex_wait(uaddr, val, timeout, flags);
> + rw_exit_write(&ftlock);
> + KERNEL_UNLOCK();
> +
> + return 0;
> +}
> +
> +int
> +sys_futex_wake(struct proc *p, void *v, register_t *retval)
> +{
> + struct sys_futex_wake_args /* {
> + syscallarg(uint32_t *) f;
> + syscallarg(int) val;
> + syscallarg(int) flags;
> + } */ *uap = v;
> + uint32_t *uaddr = SCARG(uap, f);
> + uint32_t val = SCARG(uap, val);
> + int flags = SCARG(uap, flags);
> +
> + rw_enter_write(&ftlock);
> + *retval = futex_wake(uaddr, val, flags);
> + rw_exit_write(&ftlock);
> +
> + return 0;
> +}
> +
> +int
> +sys_futex_requeue(struct proc *p, void *v, register_t *retval)
> +{
> + struct sys_futex_requeue_args /* {
> + syscallarg(uint32_t *) f;
> + syscallarg(int) val;
> + syscallarg(uint32_t *) g;

Re: kdump futex fix

2020-04-04 Thread Philip Guenther

On Sat, 4 Apr 2020, Martin Pieuchot wrote:
> On 03/04/20(Fri) 19:26, Philip Guenther wrote:
> > On Fri, 3 Apr 2020, Martin Pieuchot wrote:
> > > Depending on the operation requested futex(2) might return the number of 
> > > woken threads or an error.  That means the following...
> > > 
> > > mpv  CALL  
> > > futex(0xa58935899b0,0x82,1,0,0)
> > > mpv  RET   futex -1 errno 1 Operation not permitted
> > > 
> > > ...is not an error but it indicates that 1 thread has been awoken.
> > > 
> > > Diff below would have saved me quite some time looking in the wrong 
> > > direction.
> > > 
> > > ok?
> > 
> > Isn't that just a complicated way to get the same effect as deleting the 
> > "case SYS_futex:" line shortly above there?
> 
> Thanks!  Updated below :)

ok guenther@

Re: split futex into three

2020-04-05 Thread Philip Guenther

On Sun, 5 Apr 2020, Stuart Henderson wrote:
> On 2020/04/05 10:28, Martin Pieuchot wrote:
> > Another way to proceed would be to do a port grep for futex and see what
> > the ecosystem is using.
> 
> Sorry it's not filtered, but :
> 
> https://junkpile.org/grep.futex.gz

Sure looks like the only occurence of futex() used with FUTEX_REQUEUE (== 
3) is the linux kernel test program.  Everything else, including rust, is 
using FUTEX_CMP_REQUEUE or one of the PI operations 
(FUTEX_WAIT_REQUEUE_PI, FUTEX_CMP_REQUEUE_PI).

Philip

Re: EV_SET(2) shadows variable

2020-04-05 Thread Philip Guenther

On Sat, 4 Apr 2020, Theo de Raadt wrote:
> Philip Guenther  wrote:
> 
> > On Fri, 3 Apr 2020, Martin Pieuchot wrote:
> > > Thanks, here it is, ok?
> > 
> > ok guenther@
> 
> Should we do the same to all other macros, just in case?

Checking /usr/include/{,sys/}*.h, the diff below fixes the only ones I 
found to be potential problems

/usr/include/net* and some others have not-completely-safe macros, like 
IP6_EXTHDR_GET()

Index: include/bitstring.h
===
RCS file: /data/src/openbsd/src/include/bitstring.h,v
retrieving revision 1.5
diff -u -p -r1.5 bitstring.h
--- include/bitstring.h 2 Jun 2003 19:34:12 -   1.5
+++ include/bitstring.h 6 Apr 2020 00:37:52 -
@@ -83,46 +83,46 @@ typedef unsigned char bitstr_t;
 
/* clear bits start ... stop in bitstring */
 #definebit_nclear(name, start, stop) do { \
-   register bitstr_t *_name = name; \
-   register int _start = start, _stop = stop; \
-   while (_start <= _stop) { \
-   bit_clear(_name, _start); \
-   _start++; \
+   register bitstr_t *__name = (name); \
+   register int ___start = (start), __stop = (stop); \
+   while (__start <= __stop) { \
+   bit_clear(__name, __start); \
+   __start++; \
} \
 } while(0)
 
/* set bits start ... stop in bitstring */
 #definebit_nset(name, start, stop) do { \
-   register bitstr_t *_name = name; \
-   register int _start = start, _stop = stop; \
-   while (_start <= _stop) { \
-   bit_set(_name, _start); \
-   _start++; \
+   register bitstr_t *__name = (name); \
+   register int __start = (start), __stop = (stop); \
+   while (__start <= __stop) { \
+   bit_set(__name, __start); \
+   __start++; \
} \
 } while(0)
 
/* find first bit clear in name */
 #definebit_ffc(name, nbits, value) do { \
-   register bitstr_t *_name = name; \
-   register int _bit, _nbits = nbits, _value = -1; \
-   for (_bit = 0; _bit < _nbits; ++_bit) \
-   if (!bit_test(_name, _bit)) { \
-   _value = _bit; \
+   register bitstr_t *__name = (name); \
+   register int __bit, __nbits = (nbits), __value = -1; \
+   for (__bit = 0; __bit < __nbits; ++__bit) \
+   if (!bit_test(__name, __bit)) { \
+   __value = __bit; \
break; \
} \
-   *(value) = _value; \
+   *(value) = __value; \
 } while(0)
 
/* find first bit set in name */
 #definebit_ffs(name, nbits, value) do { \
-   register bitstr_t *_name = name; \
-   register int _bit, _nbits = nbits, _value = -1; \
-   for (_bit = 0; _bit < _nbits; ++_bit) \
-   if (bit_test(_name, _bit)) { \
-   _value = _bit; \
+   register bitstr_t *__name = (name); \
+   register int __bit, __nbits = (nbits), __value = -1; \
+   for (__bit = 0; __bit < __nbits; ++__bit) \
+   if (bit_test(__name, __bit)) { \
+   __value = __bit; \
break; \
} \
-   *(value) = _value; \
+   *(value) = __value; \
 } while(0)
 
 #endif /* !_BITSTRING_H_ */
Index: sys/sys/disklabel.h
===
RCS file: /data/src/openbsd/src/sys/sys/disklabel.h,v
retrieving revision 1.75
diff -u -p -r1.75 disklabel.h
--- sys/sys/disklabel.h 24 Oct 2017 09:36:13 -  1.75
+++ sys/sys/disklabel.h 6 Apr 2020 00:52:08 -
@@ -156,37 +156,37 @@ struct__partitionv0 { /* old (v0) part
 
 #define DL_GETPSIZE(p) (((u_int64_t)(p)->p_sizeh << 32) + (p)->p_size)
 #define DL_SETPSIZE(p, n)  do { \
-   u_int64_t x = (n); \
-   (p)->p_sizeh = x >> 32; \
-   (p)->p_size = x; \
+   u_int64_t __x = (n); \
+   (p)->p_sizeh = __x >> 32; \
+   (p)->p_size = __x; \
} while (0)
 #define DL_GETPOFFSET(p)   (((u_int64_t)(p)->p_offseth << 32) + 
(p)->p_offset)
 #define DL_SETPOFFSET(p, n)do { \
-   u_int64_t x = (n); \
-   (p)->p_offseth = x >> 32; \
-   (p)->p_offset = x; \
+   u_int64_t __x = (n); \
+   (p)->p_offseth = __x >> 32; \
+

Re: SSE in kernel?

2020-06-23 Thread Philip Guenther

On Mon, Jun 22, 2020 at 9:12 PM  wrote:

> Are SSE instructions allowed in the AMD64 kernel?  Is #ifdef __SSE__
> a sufficient guard?
>
> I have a rasops32 putchar with SSE that is 2x faster.
>

As Bryan and Patrick noted: it's possible, but there are restrictions and
costs.

The main restriction is that the code must not permit a context-switch
between the fpu_kernel_enter() and fpu_kernel_exit() calls.  No taking an
rwlock or calling any of the sleep functions, for example.

If you're using more than the minimal level of SSE which is already
required by the kernel (for lfence, etc) then you should also check whether
the necessary extension bits are present in curcpu()->ci_feature_* and fall
back to the current code if not present.

The cost is that if the thread doing this isn't a system thread, then the
first fpu_kernel_enter() call after the userspace->kernel transition has to
save and reset the FPU registers (XSAVEOPT + XRSTOR on newish CPUs).  Every
fpu_kernel_exit(), regardless of thread type, resets them again (XRSTOR).

If the restriction isn't a problem and the cost of those is worth the gain,
then sure, go for it.  We already do it for AES stuff in the kernel, for
example.  c.f. /usr/src/sys/arch/amd64/amd64/aesni.c

Philip Guenther

Re: printf(1) Fix hex numbers

2021-04-18 Thread Philip Guenther

I'll just throw in a note that the current POSIX spec does not include
support for \x in the printf(1) format or in the argument to the %b format
specifier.  At least for \x in the format string it appears to require that
it _not_ be interpreted, such that
printf '\x61\n'
must output
\x61
 and not
a

FreeBSD 12's /usr/bin/printf does this, for example, however at least some
on the list seemed to feel the spec should be changed to make it
unspecified behavior.

The behavior of
printf '%b\n' '\x61'

is clearly unspecified already.

Philip Guenther

On Sun, Apr 18, 2021 at 7:02 AM Martijn van Duren <
openbsd+t...@list.imperialat.at> wrote:

> On Thu, 2021-04-15 at 19:48 -0600, Todd C. Miller wrote:
> > On Fri, 16 Apr 2021 03:34:04 +0200, Martijn van Duren wrote:
> >
> > > We currently have NetBSD's behaviour when it comes to to no characters,
> > > so no need to change there imho, but the 2 character limit seems like
> > > a good place to stop, especially since there's no way to break out of
> > > this hungry hungry hippo.
> > >
> > > limiter inspiration taken from octal notation.
> > >
> > > While here, also document \x notation.
> >
> > No objection from me but I'm not sure it needs to make 6.9.
> >
> >  - todd
> I found a (different) bug in NetBSD's implementation, which got fixed by
> christos. But he also added a check that at least 1 isxdigit(3) needs to
> be present. From the commit message:
> Also add a warning if the conversion fails (like the gnu printf does)
>
> Do we want something similar now that we are the odd ones out, or keep
> the diff as is? I still think that the current diff is fine, but since
> the landscape changed I wanted to throw it out there.
>
> martijn@
>
>

Re: printf(1) Fix hex numbers

2021-04-18 Thread Philip Guenther

On Sun, Apr 18, 2021 at 12:04 PM Martijn van Duren <
openbsd+t...@list.imperialat.at> wrote:

> On Sun, 2021-04-18 at 11:17 -0900, Philip Guenther wrote:
> >
> > I'll just throw in a note that the current POSIX spec does not include
> support for \x in the printf(1) format or in the argument to the %b format
> specifier.  At least for \x in the format string it
> > appears to require that it _not_ be interpreted, such that
> > printf '\x61\n'
> > must output
> > \x61
> >  and not
> > a
>
> Could you point out where it states that an escape-sequence not
> specified must output the original text? The way I read it[0][1]:
>
> In addition to the escape sequences shown in XBD File Format Notation (
> '\\', '\a', '\b', '\f', '\n', '\r', '\t', '\v' ), "\ddd", where ddd is a
> one, two, or three-digit octal number, shall be written as a byte with
> the numeric value specified by the octal number.
>
> and
>
> Escape Sequences and Associated Actions lists escape sequences and
> associated actions on display devices capable of the action.
>

Before that statement in XBD 5 is this:
-
The format is a character string that contains three types of objects
defined below:

   1. Characters that are not "escape sequences" or "conversion
specifications", as described
 below, shall be copied to the output.

   2. Escape Sequences represent non-graphic characters and the escape
character ().

   3. Conversion Specifications specify the output format of each argument;
see below.
-

So, per (1) above, if it's not an escape sequence or conversion
specification, it SHALL be copied to the output.  Unfortunately, the list
of escape sequences is just a table, and while the printf(1) specification
has the clause you cited that adds \ddd to the list, there's nothing to
state that, for formats,  followed by anything else results in
unspecified behavior.  Ergo \x isn't an escape sequence under (2) and,
since it's not a converspecification, it must follow (1) and be copied to
the output.

Anyway, POSIX wording will probably change, it's just a place to note in
the manual that handling \xXX is an extension to the POSIX standard and
that compliant (or portable) applications must use \ddd instead.

Philip Guenther

Re: ld.so: NULL dereference on corrupt library

2021-05-05 Thread Philip Guenther

On Tue, May 4, 2021 at 12:43 PM Klemens Nanni  wrote:
...

> I compared my corrupted shared library with an intact copy from ports
> and it showed that the corrupted one was simply zeroed out at some point
> (not truncated) until the end, e.g. readelf(1)'s `-h' or `-l' report
> "Error: no .dynamic section in the dynamic segment".
>
> So this isn't a case of some badly linked library or one that has a few
> bits flipped, it's simply a partial one... seems like bad luck?
>

IMHO, the benefit of adding this check is almost zero: it gives a slightly
better experience for a small set of possible data corruption cases, when
similar corruptions that affect other pages aren't helped at all as it'll
crash when it executes zeroed text, or accesses zeroed data, or fails to
find a required symbol because the symbol table was zeroed out.

If we want to protect against that sort of hardware lossage, then a
filesystem which does so is the way to go, not an alarm on one window of a
glass house.

Philip Guenther

Re: Use atomic op for UVM map refcount

2021-05-21 Thread Philip Guenther

On Wed, May 19, 2021 at 5:19 AM Mark Kettenis 
wrote:

> > Date: Tue, 18 May 2021 13:24:42 +0200
> > From: Martin Pieuchot 
>
...

> > There's only a couple of 'volatile' usages in sys/sys.  These annotations
> > do not explicitly indicate which piece of code requires it.  Maybe it
> would
> > be clearer to use a cast or a macro where necessary.  This might help us
> > understand why and where "volatile" is needed.
>
> There are the READ_ONCE() and WRITE_ONCE() macros.  I'm not a big fan
> of those (since they add clutter) but they do take care of dependency
> ordering issues that exist in the alpha memory model.  Must admit that
> I only vaguely understand that issue, but I think it involves ordered
> access to two atomic variables which doesn't seem to be the case.
>

> On non-alpha systems, READ_ONCE() and WRITE_ONCE() just do a volatile
> pointer cast.
>

READ/WRITE_ONCE() are 99% about keeping the compiler from deciding to
rearrange code such that the indicated variable is read/written more than
once.  To point to Linus posts from 2008/2009, when it was still
ACCESS_ONCE():
 https://yarchive.net/comp/linux/ACCESS_ONCE.html

ISTR a paper by Ousterhout describing this same problem to the C standard
committee(?) in the early 90's, which kinda opened the "memory model"
rathole.

If the variable is actually being protected by a lock then these are indeed
noise/pessimization; it's the lock-less accesses where the compiler can
pull a rabbit from its hat and stab you with it.

Philip Guenther

Re: Use atomic op for UVM map refcount

2021-05-21 Thread Philip Guenther

On Wed, May 19, 2021 at 11:29 PM Martin Pieuchot  wrote:

> On 19/05/21(Wed) 16:17, Mark Kettenis wrote:
>
...

> > There are the READ_ONCE() and WRITE_ONCE() macros.  I'm not a big fan
> > of those (since they add clutter) but they do take care of dependency
> > ordering issues that exist in the alpha memory model.  Must admit that
> > I only vaguely understand that issue, but I think it involves ordered
> > access to two atomic variables which doesn't seem to be the case.
>
> These macros are used in places where declaring the field as "volatile"
> could also work, no?  We can look at __mp_lock and SMR implementations.
> So could we agree one way to do things?
>
> Visa, David, why did you pick READ_ONCE() in SMR and veb(4)?  Anything
> we overlooked regarding the use of "volatile"?
>

If _all_ references to a member/variable use READ/WRITE_ONCE, then
declaring it volatile should be equivalent, but if there are any uses
which have a lock for protection instead than making the member volatile
will result in worse object code for the protected sequences where *_ONCE
are not needed.  Also, initialization of structs with volatile members can
be pessimal: the compiler has to assume this could be in uncached memory
mapped from a device where writes to the member have to be of the size of
the member: no paving with larger writes or deferring initializations.

volatile is a *really* blunt hammer.  READ/WRITE_ONCE use it carefully to
build a sharper tool.  Unifying on "just plain volatile" when the work has
already been done to use a sharper tool correctly..well, if that's a good
idea then why have SMR at all when locks would be easier for everyone to
think about, despite being a blunter hammer?  /s

Philip Guenther

Re: ld.so: program headers: do not rely on DYNAMIC coming before GNU_RELRO

2021-05-25 Thread Philip Guenther

On Mon, May 24, 2021 at 4:59 AM Klemens Nanni  wrote:

> When tinkering with ld.so crashes due to file corruption the other day
> I tested a few changes but did not want to replace /usr/libexec/ld.so
> and since recompiling executable to change their interpreter is not
> always an option, I went for https://github.com/NixOS/patchelf which
> allows me to manipulate executables in place.
>
> The tool works just fine and as a byproduct rearanges program headers;
> that in itself is fine except our run-time link-editor is not happy with
> such valid executables:
>
>
>  ELF mandates nothing but the file header be at a fixed location, hence
>  ld.so(1) must not assume any specific order for headers, segments, etc.
>

(Not quite: it does place ordering requirements in some specific cases,
such as that PT_INTERP and PT_PHDR, if present, must appear before any
PT_LOAD segments in the program header, and that PT_LOAD segments must
appear in p_vaddr order.)


 Looping over the program header table to parse segment headers,
>  _dl_boot() creates the executable object upon DYNAMIC and expects it to
>  be set upon GNU_RELRO, resulting in a NULL dereference iff that order is
>  reversed.
>
>  Store relocation bits in temporary variables and update the executable
>  object once all segment headers are parsed to lift this dependency.
>
>  Under __mips__ _dl_boot() later on uses the same temporary variable, so
>  move nothing but the declaration out of MI code so as to not alter the
>  MD code's logic/behaviour.
>
>
> This fix is needed for the following work on OpenBSD:
>
> $ patchelf --set-interpreter $PWD/obj/ld.so ./my-ldso-test
> $ readelf -l ./my-ldso-test | grep interpreter
>   [Requesting program interpreter: /usr/src/libexec/
> ld.so/obj/ld.so]
> $ ./my-ldso-test
> it works!
>
> amd64 and arm64 regress is happy and all my patched executables work
> with this.
>
> Feedback? Objections? OK?
>
> diff d7231fb4fb547dd287a884c56ae7c8b10f9145fe
> f023dbe355bef379d55eb93eddbb2702559d5bdb
> blob - 18bd30af759bffbc4e3fbfee9ffc29906f0d1bb0
> blob + b66dbb169aad9afffa1283d480ad9276aff9072a
> --- libexec/ld.so/loader.c
> +++ libexec/ld.so/loader.c
> @@ -464,6 +464,7 @@ _dl_boot(const char **argv, char **envp, const long dy
> int failed;
> struct dep_node *n;
> Elf_Addr minva, maxva, exe_loff, exec_end, cur_exec_end;
> +   Elf_Addr relro_addr = 0, relro_size = 0;
> Elf_Phdr *ptls = NULL;
> int align;
>
> @@ -552,8 +553,8 @@ _dl_boot(const char **argv, char **envp, const long dy
> ptls = phdp;
> break;
> case PT_GNU_RELRO:
> -   exe_obj->relro_addr = phdp->p_vaddr + exe_loff;
> -   exe_obj->relro_size = phdp->p_memsz;
> +   relro_addr = phdp->p_vaddr + exe_loff;
> +   relro_size = phdp->p_memsz;
> break;
> }
> phdp++;
> @@ -561,6 +562,8 @@ _dl_boot(const char **argv, char **envp, const long dy
> exe_obj->load_list = load_list;
> exe_obj->obj_flags |= DF_1_GLOBAL;
> exe_obj->load_size = maxva - minva;
> +   exe_obj->relro_addr = relro_addr;
> +   exe_obj->relro_size = relro_size;
> _dl_set_sod(exe_obj->load_name, &exe_obj->sod);
>
>  #ifdef __i386__
> @@ -638,7 +641,7 @@ _dl_boot(const char **argv, char **envp, const long dy
> debug_map->r_ldbase = dyn_loff;
> _dl_debug_map = debug_map;
>  #ifdef __mips__
> -   Elf_Addr relro_addr = exe_obj->relro_addr;
> +   relro_addr = exe_obj->relro_addr;
> if (dynp->d_tag == DT_DEBUG &&
> ((Elf_Addr)map_link + sizeof(*map_link) <= relro_addr
> ||
>  (Elf_Addr)map_link >= relro_addr +
> exe_obj->relro_size)) {
>
>
ok guenther@

(This still assumes PT_GNU_RELRO comes after PT_PHDR, for exe_loff, but I
suspect even patchelf wouldn't screw with that.)

Re: rsync fix symlink discovery

2021-07-05 Thread Philip Guenther

Based on the fts_open(3) manpage and other base source usage, shouldn't
this use fts_accpath instead of fts_name?

The use of fts_statp in this code seems a bit loose vs ftp_info: instead of
using S_ISLNK() on fts_statp I would expect this code to check for fts_info
== FTS_SL: according the manpage fts_statp's value is undefined for various
values of fts_info.

Philip Guenther

On Fri, Jul 2, 2021 at 4:46 AM Claudio Jeker 
wrote:

> Hit this today while doing some tests. symlink_read() needs to use just
> the filename and not the full path because fts_read(3) does chdir
> internally.
>
> Without this I got:
> openrsync: error: ./obj/openrsync.1: readlink: No such file or directory
> openrsync: error: symlink_read
> openrsync: error: flist_gen_dirent
> openrsync: error: flist_gen
> openrsync: error: rsync_sender
>
> --
> :wq Claudio
>
> Index: flist.c
> ===
> RCS file: /cvs/src/usr.bin/rsync/flist.c,v
> retrieving revision 1.32
> diff -u -p -r1.32 flist.c
> --- flist.c 30 Jun 2021 13:10:04 -  1.32
> +++ flist.c 2 Jul 2021 13:14:01 -
> @@ -972,7 +972,7 @@ flist_gen_dirent(struct sess *sess, char
> /* Optionally copy link information. */
>
> if (S_ISLNK(ent->fts_statp->st_mode)) {
> -   f->link = symlink_read(f->path);
> +   f->link = symlink_read(ent->fts_name);
> if (f->link == NULL) {
> ERRX1("symlink_read");
> goto out;
>
>

Re: CVS: cvs.openbsd.org: src

2018-07-12 Thread Philip Guenther

On Thu, Jul 12, 2018 at 2:11 PM Philip Guenther 
wrote:

> CVSROOT:/cvs
> Module name:src
> Changes by: guent...@cvs.openbsd.org2018/07/12 08:11:11
>
> Modified files:
> sys/arch/amd64/amd64: cpu.c identcpu.c locore.S machdep.c pmap.c
>   vector.S
> sys/arch/amd64/conf: ld.script
> sys/arch/amd64/include: asm.h codepatch.h
>
> Log message:
> Reorganize the Meltdown entry and exit trampolines for syscall and
> traps so that the "mov %rax,%cr3" is followed by an infinite loop
> which is avoided because the mapping of the code being executed is
> changed.  This means the sysretq/iretq isn't even present in that
> flow of instructions in the kernel mapping, so userspace code can't
> be speculatively reached on the kernel mapping and totally eliminates
> the conditional jump over the the %cr3 change that supported CPUs
> without the Meltdown vulnerability.  The return paths were probably
> vulnerable to Spectre v1 (and v1.1/1.2) style attacks, speculatively
> executing user code post-system-call with the kernel mappings, thus
> creating cache/TLB/etc side-effects.


Damnit, I left out that since this evolves the _Meltdown_ fix with mapping
_over_ the trampoline, we're calling this the Tuna Meltover.

Because you can tuna meltover, but you can't tune a fish.
(hat tip to the author of the tunefs(8) manpage.)


Philip Guenther

Re: Kill MD-specific interrupt enable/disable API on amd64

2018-07-24 Thread Philip Guenther

On Tue, Jul 24, 2018 at 1:02 PM Mark Kettenis 
wrote:

> Diff below switches to the MI equivalent and kills the MD-specific
> API.
>
> ok?
>

ok guenther@

Re: Floating point exception on boot after using syspatch(8)

2018-08-01 Thread Philip Guenther

On Wed, 1 Aug 2018, Zbyszek Żółkiewski wrote:
> OpenBSD 6.3 GENERIC.MP running on XEN (AWS m4.large) - dmesg at the end 
> of this mail. Image build from https://github.com/kolargol/openbsd-aws 
> which is fork from https://github.com/ajacoutot/aws-openbsd - my fork 
> have some minor modifications.
...
> forgive me if i missed something obvious here - but any idea what is 
> wrong here?  What can cause Floating point exception during boot on 
> system that otherwise works without patching?

This smells like some compatibility issue with the eager-FPU change.  One 
thought is whether the Xen presents the guest with a correctly reset FPU 
state on all CPUs, or if there's some odd state that we're not clearing.

I made a later commit in that area (sys/arch/amd64/amd64/cpu.c rev 1.124) 
so it would be interesting to know if -current shows the same issue or if 
the symptoms have improved (or changed at all) there.

Philip Guenther

Re: Linux DRM

2018-09-03 Thread Philip Guenther

On Mon, Sep 3, 2018 at 11:46 AM Thomas de Grivel  wrote:

> I was browsing the DRM code ported from Linux and it's a terrible
> mess, is there any ongoing project to clean up that codebase or
> rewrite it entirely ?
>

No.  OpenBSD doesn't have the resources to reimplement the DRM subsystem or
maintain a non-trivial fork of the Linux version.  We don't want to get
stuck with a code base that doesn't support close-to-current hardware, so
the porting work has concentrated on minimizing the changes necessary to
make the upstream code base work in OpenBSD.

It's clear that the hardware support in the upstream has large
contributions from developers with inside access at the hardware vendors;
without such access it's doubtful that all the hardware bugs^Wlimitations
can be worked around with non-infinite resource.

Improvements in the DRM code itself should be done in the upstream, not
just to minimize OpenBSD costs in this area, but so that all OSes that draw
from that base can benefit.

Philip Guenther

Re: add uid_from_user/gid_from_group [1/3]

2018-09-11 Thread Philip Guenther

On Mon, 10 Sep 2018, Todd C. Miller wrote:
> This is a port of uid_from_user() and gid_from_group() from NetBSD (also 
> present in FreeBSD).
...
> --- lib/libc/gen/pwcache.c25 Nov 2015 23:16:01 -  1.13
> +++ lib/libc/gen/pwcache.c10 Sep 2018 15:04:35 -
...
> +/*
> + * routines that control user, group, uid and gid caches (for the archive
> + * member print routine).

The parenthetical side comment should be deleted.  I suggest capitalizing 
'routines' while there.

...
> +const char *
> +group_from_gid(gid_t gid, int noname)
> +{
> + struct group grstore, *gr = NULL;
> + char grbuf[_GR_BUF_LEN];
> + GIDC *ptr, **pptr;
> +
> + if ((gidtb == NULL) && (gidtb_start() < 0))
> + return NULL;

One question is whether these routines should fall back to uncached 
operation if the allocation fails.  "tar -tv" will segfault currently if a 
[ug]idtb_start() call fails.

...
> Index: lib/libc/gen/pwcache.h
> ===
> RCS file: lib/libc/gen/pwcache.h

I merged cache.h into cache.c in pax because the separate header was 
pointless; why split these bits into a single-use header again?

(...specially because I would maul these data structure a bit...)

> Index: lib/libc/gen/getgrent.c
> Index: lib/libc/Symbols.list
> Index: lib/libc/hidden/grp.h
> Index: lib/libc/hidden/pwd.h

These bits are all correct and do what you desire.

Philip

Re: add uid_from_user/gid_from_group [1/3]

2018-09-11 Thread Philip Guenther

On Mon, 10 Sep 2018, Todd C. Miller wrote:
...
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libc/gen/pwcache.h10 Sep 2018 00:46:44 -
...
> +/*
> + * Constants and data structures used to implement group and password file
> + * caches. Traditional passwd/group cache routines perform quite poorly with
> + * archives. The chances of hitting a valid lookup with an archive is quite a
> + * bit worse than with files already resident on the file system. These 
> misses
> + * create a MAJOR performance cost. To adress this problem, these routines
> + * cache both hits and misses.
> + *
> + * NOTE:  name lengths must be as large as those stored in ANY PROTOCOL and
> + * as stored in the passwd and group files. CACHE SIZES MUST BE PRIME
> + */

Oh yeah, this comment should be massaged to match the new context,
perhaps noting that the limits on these functions are set to provide
effective caching, both positive and negative, for the standard
archive formats, blah blah blah.

(I mentioned "maul"ing before.  I'm been tempted to make malloc earn its
pay and use
struct uidc {
uid_t uid;
char valid;
char name[];
};

instead, mallocing the correct size for the name...and eliminating the
UNMLEN/GNMLEN limits!)

Philip

Re: add uid_from_user/gid_from_group [2/3]

2018-09-11 Thread Philip Guenther

On Mon, 10 Sep 2018, Todd C. Miller wrote:
> This removes cache.c from pax in favor of using the new uid_from_user() 
> and gid_from_group() functions in libc.  I've added explicit calls to 
> setpassent() and setgroupent() since they are no longer implicitly 
> called.

This part is ok guenther@ (once part 1 is haggled in)

Re: add uid_from_user/gid_from_group [1/3]

2018-09-12 Thread Philip Guenther

On Wed, 12 Sep 2018, Todd C. Miller wrote:
> Thanks for the feedback, here's an updated diff that eliminates 
> pwcache.h, gracefully handles table allocation failure and massages the 
> comments to be a bit more general.
> 
> I can take a look at supporting arbitrary length names in the future.

ok guenther@

> +/*
> + * Constants and data structures used to implement group and password file
> + * caches.  Name lengths have been chosen to be as large as those supported
> + * by the passwd and group files as well as the standard archive formats.
> + * CACHE SIZES MUST BE PRIME
> + */

Side note: I seem to recall reading an analysis somewhere that found that 
with a good hash function, the size of the table didn't matter.  I 
remember Kernighan & Pike's The Practice of Programming showing the 
primality of the size didn't matter for the hash table they whip up in the 
algorithms section.  Maybe the current hash just needs updating...

Re: [patch] Fix an error in chmod.1

2018-09-13 Thread Philip Guenther

On Thu, Sep 13, 2018 at 8:17 PM Nan Xiao  wrote:

> The following patch fixes an error in chmod.1, and very sorry if I am
> wrong, thanks!
>
...

> --- chmod.1 7 Jun 2017 09:41:57 -   1.42
> +++ chmod.1 14 Sep 2018 05:10:44 -
> @@ -243,7 +243,7 @@ If no value is supplied for
>  each permission bit specified in
>  .Ar perm ,
>  for which the corresponding bit in the file mode creation mask
> -is clear, is cleared.
> +is set, is cleared.
>  Otherwise, the mode bits represented by the specified
>  .Ar who
>  and
>

The current documentation matches both the POSIX specification and the
utility behavior.  Consider:

$ touch file
$ chmod 666 file
$ mkdir -m 777 dir
$ ls -l
total 2
drwxrwxrwx  2 guenther  wheel  512 Sep 13 22:24 dir
-rw-rw-rw-  1 guenther  wheel0 Sep 13 22:24 file
$ umask
002
$ chmod -rw *
$ ls -l
total 2
d--x--x-wx  2 guenther  wheel  512 Sep 13 22:24 dir
w-  1 guenther  wheel0 Sep 13 22:24 file
$

The other-write bit was set in my file mode creation mask, so the "-rw" was
treated like "ug-rw,o-r" and did not change the other-write permission on
the file or directory.


Philip Guenther

Re: bsd.rd failure in VirtualBox

2018-09-15 Thread Philip Guenther

On Sat, Sep 15, 2018 at 11:59 AM David Higgs  wrote:

> I often use VirtualBox (version 5.2.18 on OS X) to familiarize myself
> with new features in snapshots, before upgrading my physical hardware.
>
> This afternoon, I tried updating bsd.rd (amd64, 6.4-beta RAMDISK_CD
> #281) and wasn't able to successfully boot it.  I had to rely on the
> video capture ability of VirtualBox to even notice there was a panic
> (typed out below) before it rebooted to the "BIOS" splash screen.
>
...

> Also attached is the dmesg from a prior working snapshot.  I haven't
> tried updating since this prior snapshot, so I don't have further
> insight into when the issue first appeared.
>

Thank you for the complete and clear report!

I have a diff in the amd64 snapshots to use the CPU's PCID support in many
cases and this VirtualBox setup found a bug in it.  I've generated a new
diff that should fix this, so a future snap should fix this, though when
that'll happend depends on the snap builder's schedule.

Philip Guenther

Re: copy and paste error in rb tree code

2018-10-08 Thread Philip Guenther

On Mon, Oct 8, 2018 at 6:03 PM David Gwynne  wrote:

> According to https://github.com/FRRouting/frr/pull/3106, Coverity thinks
> I made a copy and paste error regarding one of the augment calls in
> RBT_REMOVE. I'm inclined to agree.
>
> The augment code is only used by uvm_map.c, and it seems to be fine with
> this change. I believe it coped because the augment handler it provides
> is very conservative and walks all parents on every augment call, so
> the code missing one wasn't a problem.
>
> ok?
>

Looking at the macro code this was based on in tree.h rev 1.14, I agree
it's wrong and the diff is correct:
...
 if (RB_LEFT(RB_PARENT(old, field), field) == old)\
RB_LEFT(RB_PARENT(old, field), field) =
elm;\
else\
RB_RIGHT(RB_PARENT(old, field), field) =
elm;\
RB_AUGMENT(RB_PARENT(old, field));  \
...

ok guenther@

Re: multicast.4: drop unneeded (void *) casts

2018-10-18 Thread Philip Guenther

On Thu, Oct 18, 2018 at 4:00 PM  wrote:

> getsockopt(2) and setsockopt(2) take a void pointer for the third
> parameter so there is no need for any of these casts.
>

Yep.


> I can do other style(9)-type cleanup in a subsequent diff or I can
> cook this diff and do it all in one.
>

> Thoughts, preferences?
>

I would prefer line-wrapping adjustment that's enabled by the deletion of
the casts be done in the same commit.  Those to change, IMHO, called out
below.


> @@ -168,7 +168,7 @@ memcpy(&vc.vifc_lcl_addr, &vif_local_add
>  if (vc.vifc_flags & VIFF_TUNNEL)
>  memcpy(&vc.vifc_rmt_addr, &vif_remote_address,
> sizeof(vc.vifc_rmt_addr));
> -setsockopt(mrouter_s4, IPPROTO_IP, MRT_ADD_VIF, (void *)&vc,
> +setsockopt(mrouter_s4, IPPROTO_IP, MRT_ADD_VIF, &vc,
> sizeof(vc));
>

Unwrap completely.

@@ -205,7 +205,7 @@ memset(&mc, 0, sizeof(mc));
>  mc.mif6c_mifi = mif_index;
>  mc.mif6c_flags = mif_flags;
>  mc.mif6c_pifi = pif_index;
> -setsockopt(mrouter_s6, IPPROTO_IPV6, MRT6_ADD_MIF, (void *)&mc,
> +setsockopt(mrouter_s6, IPPROTO_IPV6, MRT6_ADD_MIF, &mc,
> sizeof(mc));
>

Ditto

 ...

> @@ -302,7 +302,7 @@ mc.mfcc_parent = iif_index;
>  for (i = 0; i < maxvifs; i++)
>  mc.mfcc_ttls[i] = oifs_ttl[i];
>  setsockopt(mrouter_s4, IPPROTO_IP, MRT_ADD_MFC,
> -   (void *)&mc, sizeof(mc));
> +   &mc, sizeof(mc));
>

Ditto and the next three.


>  setsockopt(mrouter_s4, IPPROTO_IPV6, MRT6_ADD_MFC,
> -   (void *)&mc, sizeof(mc));
> +   &mc, sizeof(mc));
>
...

>  setsockopt(mrouter_s4, IPPROTO_IP, MRT_DEL_MFC,
> -   (void *)&mc, sizeof(mc));
> +   &mc, sizeof(mc));

...

>  setsockopt(mrouter_s4, IPPROTO_IPV6, MRT6_DEL_MFC,
> -   (void *)&mc, sizeof(mc));
> +   &mc, sizeof(mc));
>


@@ -478,7 +478,7 @@ would fail.
>  To modify the API, and to set some specific feature in the kernel, then:
>  .Bd -literal -offset 3n
>  uint32_t v = MRT_MFC_FLAGS_DISABLE_WRONGVIF;
> -if (setsockopt(sock, IPPROTO_IP, MRT_API_CONFIG, (void *)&v, sizeof(v))
> +if (setsockopt(sock, IPPROTO_IP, MRT_API_CONFIG, &v, sizeof(v))
>  != 0) {
>

Unwrap the !=0{ line.

Re: multicast.4: drop unneeded (void *) casts

2018-10-19 Thread Philip Guenther

On Fri, Oct 19, 2018 at 6:20 AM Scott Cheloha 
wrote:
...

> Sure, looks cleaner.  ok?


ok guenther@

Re: cron.c patch

2018-10-20 Thread Philip Guenther

On Sat, Oct 20, 2018 at 2:34 PM Edgar Pettijohn III 
wrote:

> I'm guessing the if block will never hit since listen returns either 0 or
> -1.
>

Uh, but -1 is true.

Philip

Re: cron.c patch

2018-10-20 Thread Philip Guenther

On Sat, Oct 20, 2018 at 2:41 PM Edgar Pettijohn III 
wrote:

> On 10/20/18 6:40 PM, Philip Guenther wrote:
>
> On Sat, Oct 20, 2018 at 2:34 PM Edgar Pettijohn III <
> ed...@pettijohn-web.com> wrote:
>
>> I'm guessing the if block will never hit since listen returns either 0 or
>> -1.
>>
>
> Uh, but -1 is true.
>
> my bad. -1 just feels so untrue.
>

On two's-complement systems (i.e., all modern CPUs), -1 is all bits set
making it the *truest* value!  :-)

Re: httpd: fix/consistency cast for ctype function

2018-11-03 Thread Philip Guenther

On Fri, Nov 2, 2018 at 4:14 AM Hiltjo Posthuma 
wrote:

> I noticed many ctype functions (such as isalpha, isdigit, tolower) are
> cast to
> unsigned char in httpd. This patch changes it also for one remaining check.
>

Yep.  Committed.  Thanks!

Philip Guenther

Re: AMD64 buffer cache 4GB cap anything new, multiqueueing plans? ("64bit DMA on amd64" cont)

2018-11-06 Thread Philip Guenther

On Tue, Nov 6, 2018 at 9:51 PM Joseph Mayer 
wrote:

> Previously there was a years-long thread about a 4GB (32bit) buffer
> cache constraint on AMD64, ref
> https://marc.info/?t=14682443664&r=1&w=2 .
>
> What I gather is,
>
>  * The problematique is that on AMD64, DMA is limited to 32bit
>addressing, I guess because unlike AMD64 arch CPU:s which all have
>64bit DMA support, popular PCI accessories and supporting hardware
>out there like bridges, have DMA functionality limited to 32bit
>addressing.
>

My read of that thread, particularly Theo's comments, is that no one
actually demonstrated a case where lack of 64bit DMA caused any problems or
limitations.

If you have a system and use where lack of 64bit DMA creates a performance
limitation, then describe it and, *more importantly*, *why* you think the
DMA limit is involved.

Philip Guenther

Re: tap(4) minor number too large

2018-11-11 Thread Philip Guenther

On Sun, Nov 11, 2018 at 9:45 PM David Gwynne  wrote:

> If you can trick someone into implementing 64bit device ids you can have
> more than your 16 millionth tap device.
>

Hhahahahahahahhahahahahah.

That would involve rolling six syscall numbers, not to mention handling the
64bit padding in one of mknod() or mknodat().

As the guy who did the last type bump that invasive, Just Say No.

Philip Guenther

Re: tap(4) minor number too large

2018-11-11 Thread Philip Guenther

On Sun, Nov 11, 2018 at 10:03 PM David Gwynne  wrote:
...

> so like this?
>
...

> --- if_tun.c24 Feb 2018 07:20:04 -  1.181
> +++ if_tun.c12 Nov 2018 06:02:51 -
> @@ -193,6 +193,9 @@ tun_create(struct if_clone *ifc, int uni
> struct tun_softc*tp;
> struct ifnet*ifp;
>
> +   if (unit > minor(~0U))
> +   return (ENXIO);
>

I would be inclined to write minor(~(dev_t)0) Just In Case, but ok either
way.

Re: ksh: be even more posixly correct on local/typeset

2018-11-20 Thread Philip Guenther

On Tue, Nov 20, 2018 at 12:47 AM Martijn van Duren <
openbsd+t...@list.imperialat.at> wrote:

> When running a script with -o posix I usually do so to make sure that
> the script will run anywhere with a POSIX compliant shell. Currently we
> support typeset when running in posix mode, where POSIX states:
> If the command name matches the name of a utility listed in the
> following table, the results are unspecified[0].
> >snip<
> local
> >snip<
> typeset
> >snip<
>
> When looking at the rationale[1] I found the following:
> Local variables within a function were considered and included in
> another early proposal (controlled by the special built-in local), but
> were removed because they do not fit the simple model developed for
> functions and because there was some opposition to adding yet another
> new special built-in that was not part of historical practice.
> Implementations should reserve the identifier local (as well as typeset,
> as used in the KornShell) in case this local variable mechanism is
> adopted in a future version of this standard.
>
> I know bash supports local and dash has the keyword, but ignores its
> meaning (the variables are still global), but since POSIX is pretty
> vague in how the keyword should be reserved I reckon it would help
> people who want to write portable scripts.
>
> Thoughts? Are people running -o posix in production where this could
> breakage?
>

"set -o posix" is, IMO, about dealing with the places where ksh and posix
diverge.  Using it as a hammer for strict compliance eliminates that middle
ground of "posix behavior with extensions".  I would not consider that a
positive direction.


Philip Guenther

Re: [patch] sh.1 getopts inaccurate

2018-11-28 Thread Philip Guenther

On Wed, Nov 28, 2018 at 10:52 PM Solene Rapenne  wrote:

> I'm not familiar with getopts, first time I use it and the man page was
> confusing. It says "a colon following an option indicates it may take an
> argument".
> I understand this as: if I use the string "s:" then I can use "-s" or "-s
> something" as it **may** take an argument.
>
> But if you use "s:", getopts reports an error "-s requires argument".
>
> I propose to modify that sentence (not sure for the s in requires).
>
> Index: sh.1
> ===
> RCS file: /data/cvs/src/bin/ksh/sh.1,v
> retrieving revision 1.149
> diff -u -p -r1.149 sh.1
> --- sh.128 Sep 2018 18:32:39 -  1.149
> +++ sh.129 Nov 2018 06:44:50 -
> @@ -495,7 +495,7 @@ to the index of the next variable to be
>  The string
>  .Ar optstring
>  contains a list of acceptable options;
> -a colon following an option indicates it may take an argument.
> +a colon following an option indicates it requires an argument.
>  If an option not recognised by
>  .Ar optstring
>  is found,
>

"requires" is correct there.
ok guenther@

Re: ldapd, fix warnings when compiling with gcc

2018-12-04 Thread Philip Guenther

On Tue, Dec 4, 2018 at 5:40 AM Claudio Jeker 
wrote:

> Gcc is unhappy about the void * usage in printf:
> search.c:325: warning: format '%s' expects type 'char *', but argument 2
> has type 'void *'
> search.c:345: warning: format '%.*s' expects type 'char *', but argument 3
> has type 'void *'
> search.c:365: warning: format '%s' expects type 'char *', but argument 2
> has type 'void *'
>
> This diff fixes those. Now neither clang nor gcc complain.
> OK?
>

ok guenther@

Re: mi_switch() & setting `p_stat'

2021-10-02 Thread Philip Guenther

On Sat, Oct 2, 2021 at 8:57 AM Martin Pieuchot  wrote:

> When a thread running on a CPU schedules itself out, it does the following
> (pseudo_code):
>
> SCHED_LOCK()
> curproc->p_stat = SSLEEP;
> // some more operations
> mi_switch()
>
> The problem with this is that any instrumentation between setting `p_stat'
> and cpu_switchto() is incorrect because 'curproc' is still being executed
> and is not yet sleeping.  Its `p_stat' should be SONPROC and not SSLEEP.
>
...

> To fix this we should set `p_stat' as late a possible, diff below does that
> just before calling cpu_switchto().
>
...

> --- kern/kern_sig.c 28 Sep 2021 10:00:18 -  1.283
> +++ kern/kern_sig.c 2 Oct 2021 17:00:52 -
> @@ -1347,7 +1347,6 @@ proc_stop(struct proc *p, int sw)
> SCHED_ASSERT_LOCKED();
>  #endif
>
> -   p->p_stat = SSTOP;
> atomic_clearbits_int(&pr->ps_flags, PS_WAITED);
> atomic_setbits_int(&pr->ps_flags, PS_STOPPED);
> atomic_setbits_int(&p->p_flag, P_SUSPSIG);
> @@ -1357,7 +1356,7 @@ proc_stop(struct proc *p, int sw)
>  */
> softintr_schedule(proc_stop_si);
> if (sw)
> -   mi_switch();
> +   mi_switch(SSTOP);
>

This pair of chunks is wrong, as then the proc_stop(p, 0) call in
ptsignal() doesn't change the process from SSLEEP to SSTOP.  Sending a stop
signal to a blocked process needs to change its state.


Philip Guenther

Re: [PATCH] /etc/netstart: unquoted command substitution inside arithmetic expression

2021-10-07 Thread Philip Guenther

On Thu, Oct 7, 2021 at 5:57 PM bm1les  wrote:

> --- netstart2 Sep 2021 19:38:20 -   1.216
> +++ netstart8 Oct 2021 02:43:30 -
> @@ -365,7 +365,7 @@ ifmstart "tun tap gif etherip gre egre p
>  if [[ $ip6kernel == YES ]]; then
> # Ensure IPv6 Duplicate Address Detection (DAD) is completed.
> count=0
> -   while ((count++ < 10 && $(sysctl -n net.inet6.ip6.dad_pending) !=
> 0)); do
> +   while ((count++ < 10 && "$(sysctl -n net.inet6.ip6.dad_pending)"
> != 0)); do
> sleep 1
> done
>  fi
>

I can't figure out what problem you think this could solve.  Can you
explain the circumstances under which those quotes could make a difference?


Philip Guenther

Re: [PATCH] /etc/netstart: unquoted command substitution inside arithmetic expression

2021-10-08 Thread Philip Guenther

On Fri, Oct 8, 2021 at 8:57 AM Theo de Raadt  wrote:

> Philip Guenther  wrote:
>
> > On Thu, Oct 7, 2021 at 5:57 PM bm1les  wrote:
> >
> > > --- netstart2 Sep 2021 19:38:20 -   1.216
> > > +++ netstart8 Oct 2021 02:43:30 -
> > > @@ -365,7 +365,7 @@ ifmstart "tun tap gif etherip gre egre p
> > >  if [[ $ip6kernel == YES ]]; then
> > > # Ensure IPv6 Duplicate Address Detection (DAD) is completed.
> > > count=0
> > > -   while ((count++ < 10 && $(sysctl -n net.inet6.ip6.dad_pending)
> !=
> > > 0)); do
> > > +   while ((count++ < 10 && "$(sysctl -n
> net.inet6.ip6.dad_pending)"
> > > != 0)); do
> > > sleep 1
> > > done
> > >  fi
> > >
> >
> > I can't figure out what problem you think this could solve.  Can you
> > explain the circumstances under which those quotes could make a
> difference?
>
> Not the OP's issue, but I think a kernels compiled without option INET6
> will return an errno, and I cannot tell if sysctl prints out an error
> message
> or converts to "", the empty string, which would conceivably mis-parse.
>

AFAICT, an empty quoted string there results in the exact same error.  As I
wrote off-list to the original submitter:

Can you be clearer about how the quoting makes the result any better when
> run under bsd.rd?  Doesn't it fail in the same way?  Testing with 'echo'
> instead would seem to indicate so:
> : bleys; (( 1 < 10 && $(echo) != 0 )); echo $?
> /bin/ksh:  1 < 10 &&  != 0 : unexpected `!='
> 2
> : bleys; (( 1 < 10 && $(echo -n) != 0 )); echo $?
> /bin/ksh:  1 < 10 &&  != 0 : unexpected `!='
> 2
> : bleys; (( 1 < 10 && "$(echo)" != 0 )); echo $?
> /bin/ksh:  1 < 10 &&  != 0 : unexpected `!='
> 2
> : bleys; (( 1 < 10 && "$(echo -n)" != 0 )); echo $?
> /bin/ksh:  1 < 10 &&  != 0 : unexpected `!='
> 2
> : bleys;



Philip

Re: [PATCH] /etc/netstart: unquoted command substitution inside arithmetic expression

2021-10-10 Thread Philip Guenther

On Sun, Oct 10, 2021 at 1:48 PM bm1les  wrote:

> Exactly my point. Even if the circumstances were contrived, I think it
> would good to fix it just for the sake of correctness.
>

Sure, knowing what circumstances could cause a problem assists in
achieving correctness.

> The issue is actually a pattern I found not only in /etc/netstart but also
> in /etc/rc. (( )) cannot deal with an empty result yet it sometimes
> includes calls to sysctl which apparently can return an empty string in
> some cases.
>
> So options are:
>
> 1. ensure that sysctl always returns a number where it is expected
> 2. work around the issue by using /bin/[ in place of or in addition to,
> the arithmetic expression (depending on the expression being tested), so
> that whatever returns empty string can be tested instead of blowing up.
>

3. use syntax with works when the expansion is empty, such as
  (( $(sysctl blah) + 0 != 0 ))

   which is unary plus when it's empty, or
   (( $(sysctl blah)0 != 0 ))

   which multiplies the value by 10 when not empty and is zero when it's
empty.

Philip Guenther

(Per POSIX rules, any arithmetic expression is effectively evaluated in
double-quotes.  If you follow the spec from there, you can work out that if
an arithmetic expression that directly** contains double-quotes parses
correctly, then it must also parse correctly with the double-quotes
removed; adding double-quotes can't fix an arithmetic expression.  The
reverse is not true: adding double-quotes can break the parsing.  ** i.e.,
not counting quotes inside a nested parameter expansion or command
substitution.)

Re: New hw.perfpolicy behavior

2021-11-09 Thread Philip Guenther

On Tue, Nov 9, 2021 at 3:29 PM Jan Stary  wrote:

> On Nov 10 00:21:59, h...@stare.cz wrote:
> > Regarding C states, this machine has
> >
> > acpicpu0 at acpi0: C3(350@96 mwait.1@0x20), C2(500@64 mwait.1@0x10),
> C1(1000@1 mwait.1), PSS
> >
> > I suppose the cpu supports C1, C2, C3, but can someone please kindly
> > explain (or point to an explanation of) what the whole line says?
>
> For comparison, my Thinkpad T400 has
> acpicpu0 at acpi0: !C3(100@57 mwait.3@0x30), !C2(500@1 mwait.1@0x10),
> C1(1000@1 mwait.1), PSS
> What does the ! mean?
>

This is generated by sys/dev/acpi/acpicpu.c:acpicpu_print_one_cst() and at
this point should either be suppressed by default or documented in the
manpage; I'm not sure which.

To enumerate it from right to left:

 !C3(100@57 mwait.3@0x30)
 ^  ^   ^   ^ ^ ^   ^
 |   ||| |  |   |
 |   ||| |  |   \-- 'hints' passed to the actual
MWAIT instruction, or address for 'io' method
 |   ||| |  \-- fixed-function level bit flags: 1="HW
coordinated" 2="bus-master avoidance required"
 |   ||| |   OR '!' if this is a faked-up fallback
to the pre-CST C1-via-HALT (bios is broken)
 |   ||| \-- sleep method, one of mwait, io, or halt
 |   ||\-- advertised latency when exiting the state
 |   |\-- advertised power-consumption
 |   \-- the C state, duh
 \-- '!' to indicate this state is disabled because the CPU's LAPIC stops
in C2 or deeper (no ARAT)

Re: moncontrol(3): remove hertz()

2021-12-06 Thread Philip Guenther

On Mon, Dec 6, 2021 at 10:30 AM Scott Cheloha 
wrote:

> In the moncontrol(3) code we have this fallback function, hertz().
> The idea is to use an undocumented side effect of setitimer(2) to
> determine the width of the hardclock(9) tick.
>
> hertz() has a number of problems:
>


> So, I want to get rid of hertz() and replace it with nothing.  This is
> an extreme edge case.  sysctl(2) has to fail.  I do not think that is
> possible here, outside of stack corruption.
>

I agree: sysctl(KERN_CLOCKRATE) isn't blocked by pledge(2), so how would
they manage to break it?

(I'm not even sure the failover from profhz to hz should be kept: the
kernel does that itself in initclocks(9) so if profhz is 0 then hz is too,
sysctl(2) failed, and your CPU is on fire.)

ok guenther@

warning for -current builders: update your kernel before libc/ld.so!

2021-12-23 Thread Philip Guenther

For those building -current themselves, when you update past the commit
below you must be sure to build *and reboot to* a new kernel with the
change before you install a new libc or ld.so!

If you fail to do so then anything using the newer-than-kernel libc/ld.so
will coredump immediately, generally on the first mmap(2), and you'll need
to reboot to a bsd.rd or similar and put a matching kernel+libc+ld.so in
place.

This might be a good time to just install an official snapshot instead.

-- Forwarded message -
From: Philip Guenther 
Date: Thu, Dec 23, 2021 at 10:51 AM
Subject: CVS: cvs.openbsd.org: src
To: 


CVSROOT:/cvs
Module name:src
Changes by: guent...@cvs.openbsd.org2021/12/23 11:50:33

Modified files:
sys/kern   : syscalls.master vfs_syscalls.c kern_ktrace.c
 kern_pledge.c
sys/arch/sh/sh : trap.c
sys/arch/hppa/hppa: trap.c
sys/uvm: uvm_mmap.c
lib/libc/sys   : Makefile.inc
libexec/ld.so  : Makefile loader.c
libexec/ld.so/m88k: rtld_machine.c
usr.bin/kdump  : kdump.c
Added files:
libexec/ld.so  : syscall.h
Removed files:
lib/libc/sys   : ftruncate.c lseek.c mmap.c mquery.c pread.c
 preadv.c pwrite.c pwritev.c truncate.c
libexec/ld.so/m88k: syscall.h
libexec/ld.so/aarch64: syscall.h
libexec/ld.so/alpha: syscall.h
libexec/ld.so/amd64: syscall.h
libexec/ld.so/arm: syscall.h
libexec/ld.so/hppa: syscall.h
libexec/ld.so/i386: syscall.h
libexec/ld.so/mips64: syscall.h
libexec/ld.so/powerpc: syscall.h
libexec/ld.so/powerpc64: syscall.h
libexec/ld.so/riscv64: syscall.h
libexec/ld.so/sh: syscall.h
libexec/ld.so/sparc64: syscall.h

Log message:
Roll the syscalls that have an off_t argument to remove the explicit
padding.
Switch libc and ld.so to the generic stubs for these calls.
WARNING: reboot to updated kernel before installing libc or ld.so!

Time for a story...

When gcc (back in 1.x days) first implemented long long, it didn't (always)
pass 64bit arguments in 'aligned' registers/stack slots, with the result
that
argument offsets didn't match structure offsets.  This affected the nine
system
calls that pass off_t arguments:
ftruncate lseek mmap mquery pread preadv pwrite pwritev truncate

To avoid having to do custom ASM wrappers for those, BSD put an explicit pad
argument in so that the off_t argument would always start on a even slot and
thus be naturally aligned.  Thus those odd wrappers in lib/libc/sys/ that
use
__syscall() and pass an extra '0' argument.

The ABIs for different CPUs eventually settled how things should be passed
on
each and gcc 2.x followed them.  The only arch now where it helps is
landisk,
which needs to skip the last argument register if it would be the first
half of
a 64bit argument.  So: add new syscalls without the pad argument and on
landisk
do that skipping directly in the syscall handler in the kernel.  Keep compat
support for the existing syscalls long enough for the transition.

ok deraadt@

Re: hangman(6): skip retguard symbols

2021-12-24 Thread Philip Guenther

Skipping all leading double-underbar symbols makes sense to me.

Tempting to skip all symbols with more than one digit (or maybe just more
than one consecutive digit?), as guessing among per-chip symbols from, say,
the ar* or dc* families is an exercise in futility.

On Fri, Dec 24, 2021 at 5:23 AM Theo Buehler  wrote:

> __retguard_ symbols are easy to recognize and no real fun to guess.
> I suggest we skip them. Perhaps we should even skip all __* symbols?
>
> Index: ksyms.c
> ===
> RCS file: /cvs/src/games/hangman/ksyms.c,v
> retrieving revision 1.12
> diff -u -p -r1.12 ksyms.c
> --- ksyms.c 28 Jun 2019 13:32:52 -  1.12
> +++ ksyms.c 23 Dec 2021 18:28:02 -
> @@ -70,6 +70,9 @@ sym_getword(void)
> /* ignore symbols containing dots or dollar signs */
> if (strchr(sym, '.') != NULL || strchr(sym, '$') != NULL)
> continue;
> +   /* guessing retguard symbols is no fun */
> +   if (strncmp(sym, "__retguard", 10) == 0)
> +   continue;
>
> break;
> }
>
>

Re: Fix GNUism in bsd.dep.mk

2021-12-31 Thread Philip Guenther

On Fri, Dec 31, 2021 at 7:44 AM Christian Ehrhardt 
wrote:

> Here at genua, trying to build libpcap sometimes breaks in
> libpcap with the following error message:
>
> | Using $< in a non-suffix rule context is a GNUmake idiom \
> |(/data/git/ehrhardt/genuos/os/mk/bsd.dep.mk:47)
>
> The bug is in bsd.dep.mk where ${.IMPSRC} (aka $<) is used
> in a context where it is not neccessarily defined by OpenBSD make.
>
> Below is a diff to Use ${.ALLSRC:M*.y} instead of ${.IMPSRC} as
> the input file for yacc.
>
> The issue is with the rule for the grammar.h file that is generated
> by yacc from grammar.c. You can easily reproduce the bug with the
> following steps:
> - build libpcap from scratch: cd .../src/lib/libpcap && make clean all
> - remove the generated grammar.h file: rm obj*/grammar.h
> - build libpcap again (incremental build): make
> In normal builds this does not trigger as grammar.h is implicitly
> generated by the rule for grammar.c and when make checks for
> dependencies it simply finds grammar.h uptodate. However,
> incremental or parallel builds might decide to make grammar.h
> from grammar.y.
>
> Now, why is this only a problem for grammar.h but not for
> grammar.c? The answer to this question is burried deeply in
> OpenBSD's mk files.
>
> The snippet in bsd.dep.mk that triggers the error is a single
> rule statement that generates foo.c and foo.h from foo.y with a
> call to yacc -d. The rule is generated with a loop, i.e. it is not
> a prefix rule. However, a prefix rule context is required for
> the use of ${.IMPSRC} aka $<. For the .c file such a prefix
> rule is provided by bsd.sys.mk and this rule is in scope when
> make evaluates the yacc rule. However, for .h file generation
> from a .y file there is no such prefix rule defined in any of
> the Makefiles. Even if it were the .h suffix is missing from
> .SUFFIXES and the rule would not be considered.
>
> NOTE: The obvious way to fix this would be to use $f instead of
> ${.IMPSRC}. However, this does not work as $f is then missing
> the path prefix and yacc won't find it if an obj directory is
> used. This is probably the reason for the use of ${.IMPSRC} in
> the first place.
>

Ah, nice analysis!  The suggested fix looks safe to me (can't see how a .c
could have two .y direct prerequisites, so this can't be ambiguous).

ok guenther@

Re: less: fix use after free bug

2021-12-31 Thread Philip Guenther

On Fri, Dec 31, 2021 at 6:22 AM Tobias Stoeckmann 
wrote:

> Hi,
>
> it is possible to trigger a use after free bug in less with huge
> files or tight memory constraints. PoC with 100 MB file:
>
> dd if=/dev/zero bs=1024 count=102400 | tr '\0' 'a' > less-poc.txt
> ulimit -d 157286
> less less-poc.txt
>
> The linebuf and attr buffers in line.c are supposed to never be freed,
> since they are used for keeping (meta) data of current line in RAM.
>
> The linebuf and attr buffers are expanded in expand_linebuf. If linebuf
> could be expanded but attr could not, then returned pointers are freed,
> but linebuf variable still points to previous location. Subsequent
> accesses will therefore trigger a use after free bug.
>
> The easiest fix is to keep reallocated linebuf. The size of both buffers
> differ, but the code still assumes that the previous (smaller) size is
> the correct one.
>
> Thoughts on this approach?
>

Analysis makes sense.

To bikeshed slightly I would be inclined to do the work progressively,
perhaps like the diff below...but your diff works too.


Philip Guenther


Index: line.c
===
RCS file: /data/src/openbsd/src/usr.bin/less/line.c,v
retrieving revision 1.34
diff -u -p -r1.34 line.c
--- line.c  25 Oct 2021 19:54:29 -  1.34
+++ line.c  1 Jan 2022 06:24:31 -
@@ -96,16 +96,16 @@ expand_linebuf(void)

/* Just realloc to expand the buffer, if we can. */
char *new_buf = recallocarray(linebuf, size_linebuf, new_size, 1);
-   char *new_attr = recallocarray(attr, size_linebuf, new_size, 1);
-   if (new_buf == NULL || new_attr == NULL) {
-   free(new_attr);
-   free(new_buf);
-   return (1);
+   if (new_buf != NULL) {
+   char *new_attr = recallocarray(attr, size_linebuf,
new_size, 1);
+   linebuf = new_buf;
+   if (new_attr != NULL) {
+   attr = new_attr;
+   size_linebuf = new_size;
+   return (0);
+   }
}
-   linebuf = new_buf;
-   attr = new_attr;
-   size_linebuf = new_size;
-   return (0);
+   return (1);
 }

 /*

Re: WTERMSIG behavior difference

2022-01-07 Thread Philip Guenther

On Thu, Jan 6, 2022 at 10:47 PM Greg Steuck  wrote:

> I started by debugging a weird test failure in ghc. I narrowed it
> down to a simple C program which behaves differently between OpenBSD and
> FreeBSD. I stared at the headers of both systems for a bit and still
> don't see why on OpenBSD the program prints:
>
> uname -a ; cc ./a.c && ./a.out
> OpenBSD home.nest.cx 7.0 GENERIC.MP#28 amd64
> x 0 pret 130
>
> Yet on FreeBSD:
> uname -a ; cc ./a.c && ./a.out
> FreeBSD ... 13.0-STABLE FreeBSD 13.0-STABLE
> x 0 pret -2
>
> Where's the signed/unsigned confusion hiding?
>

No where.  The difference in behavior is that of 'sh' when signaled.  Run
your test programs under "ktrace -i" and compare the behavioral difference
of the child 'sh' process after reception of the SIGINT.

Philip Guenther

Re: new: lang/polyml

2022-01-10 Thread Philip Guenther

Yeah, it makes sense to move our base C environment to match the values
seen in the output of 'readelf' and in the broader programming environment.


Philip

On Mon, Jan 10, 2022 at 3:34 PM Mark Kettenis 
wrote:

> > Date: Sun, 09 Jan 2022 23:54:23 +0100
> > From: Leo Larnack 
> >
> > Daniel Dickman  wrote:
> >
> > > elfexport.cpp:352:56: error: use of undeclared identifier 'R_386_32'
> > > reloc.r_info = ELFXX_R_INFO(symbolNum, isFuncPtr ?
> > > HOST_DIRECT_FPTR_RELOC : HOST_DIRECT_DATA_RELOC);
> > >^
> > > elfexport.cpp:142:33: note: expanded from macro
> 'HOST_DIRECT_FPTR_RELOC'
> > > # define HOST_DIRECT_FPTR_RELOC R_386_32
> >
> > It looks like Poly/ML expects R_386_32 to be either in  of
> >  (it's located in  on Void Linux, for
> > example).  After a bit of searching, what I understand is that
> > OpenBSD/i386's (and other platforms')  contains
> > the right values, but doesn't follow the same naming convention
> > (RELOC_32), while other platforms do follow that convention.
> >
> > Hoping that I'm not doing something dumb, here is a diff that
> > standardizes the constants' names for these platforms. The new names
> > come from src/gnu/usr.bin/binutils-2.17/include/elf.
>
> Yes, I think this is a diff we want.  But it is probably best to
> review and test this on a per-architecture basis.
>
> Can you feed them to us one-by-one?
>
>
> > Index: libexec/ld.so/hppa/rtld_machine.c
> > ===
> > RCS file: /cvs/src/libexec/ld.so/hppa/rtld_machine.c,v
> > retrieving revision 1.43
> > diff -u -p -r1.43 rtld_machine.c
> > --- libexec/ld.so/hppa/rtld_machine.c 8 Jan 2022 06:49:41 -
>  1.43
> > +++ libexec/ld.so/hppa/rtld_machine.c 9 Jan 2022 22:03:27 -
> > @@ -164,7 +164,7 @@ _dl_md_reloc(elf_object_t *object, int r
> >   int type;
> >
> >   type = ELF_R_TYPE(rela->r_info);
> > - if (type == RELOC_NONE)
> > + if (type == R_TYPE(NONE))
> >   continue;
> >
> >   sym = object->dyn.symtab + ELF_R_SYM(rela->r_info);
> > @@ -191,7 +191,7 @@ _dl_md_reloc(elf_object_t *object, int r
> >  #endif
> >
> >   switch (type) {
> > - case RELOC_DIR32:
> > + case R_TYPE(DIR32):
> >   if (ELF_R_SYM(rela->r_info) && sym->st_name) {
> >   *pt = sr.obj->obj_base + sr.sym->st_value +
> >   rela->r_addend;
> > @@ -213,7 +213,7 @@ _dl_md_reloc(elf_object_t *object, int r
> >   }
> >   break;
> >
> > - case RELOC_PLABEL32:
> > + case R_TYPE(PLABEL32):
> >   if (ELF_R_SYM(rela->r_info)) {
> >   if (ELF_ST_TYPE(sr.sym->st_info) !=
> STT_FUNC) {
> >   DL_DEB(("[%x]PLABEL32: bad\n", i));
> > @@ -236,7 +236,7 @@ _dl_md_reloc(elf_object_t *object, int r
> >   }
> >   break;
> >
> > - case RELOC_IPLT:
> > + case R_TYPE(IPLT):
> >   if (ELF_R_SYM(rela->r_info)) {
> >   pt[0] = sr.obj->obj_base +
> sr.sym->st_value +
> >   rela->r_addend;
> > @@ -256,7 +256,7 @@ _dl_md_reloc(elf_object_t *object, int r
> >   }
> >   break;
> >
> > - case RELOC_COPY:
> > + case R_TYPE(COPY):
> >   {
> >   sr = _dl_find_symbol(symn,
> >   SYM_SEARCH_OTHER|SYM_WARNNOTFOUND|SYM_NOTPLT,
> > @@ -381,7 +381,7 @@ _dl_md_reloc_got(elf_object_t *object, i
> >   for (i = 0; i < numrela; i++, rela++) {
> >   Elf_Addr *r_addr = (Elf_Addr *)(ooff +
> rela->r_offset);
> >
> > - if (ELF_R_TYPE(rela->r_info) != RELOC_IPLT) {
> > + if (ELF_R_TYPE(rela->r_info) != R_TYPE(IPLT)) {
> >   _dl_printf("unexpected reloc 0x%x\n",
> >   ELF_R_TYPE(rela->r_info));
> >   return 1;
> > Index: libexec/ld.so/m88k/rtld_machine.c
> > ===
> > RCS file: /cvs/src/libexec/ld.so/m88k/rtld_machine.c,v
> > retrieving revision 1.31
> > diff -u -p -r1.31 rtld_machine.c
> > --- libexec/ld.so/m88k/rtld_machine.c 8 Jan 2022 06:49:42 -
>  1.31
> > +++ libexec/ld.so/m88k/rtld_machine.c 9 Jan 2022 22:03:27 -
> > @@ -99,17 +99,17 @@ _dl_md_reloc(elf_object_t *object, int r
> >
> >   type = ELF_R_TYPE(relas->r_info);
> >
> > - if (type == RELOC_GOTP_ENT && rel != DT_JMPREL)
> > + if (type == R_TYPE(GOTP_ENT) && rel != DT_JMPREL)
> >   continue;
> >
> > - if (type == RELOC_NONE)
> > +

Re: new: lang/polyml

2022-01-11 Thread Philip Guenther

On Tue, Jan 11, 2022 at 4:09 PM Daniel Dickman  wrote:

> On Mon, Jan 10, 2022 at 8:12 PM Leo Larnack  wrote:
> >
> > i386
> >
>
>
>
> with this diff I was able to install includes, rebuild ld.so and
> ctfconv. I've not managed to build a release yet.

...

Umm, with what diff?  There was no diff in nor attached to that message.
:-/

(That was a lot of lines of output.  I don't know about ports@, but my
expectation would be there would be *zero* reviewers of anything before,
say, the last 50 lines of output before the switch to actual compilation.
Standard "make lots of noise so when a failure occurs we can see the
leadup, but we'll ignore it otherwise" style of output, like a base build.
You read the lead up to the warnings and errors only.  )

Philip Guenther

Re: sbin/isakmpd: fix -Wunused-but-set-variable warnings

2022-01-15 Thread Philip Guenther

On Sat, Jan 15, 2022 at 12:36 PM Christian Weisgerber 
wrote:

> sbin/isakmpd: fix -Wunused-but-set-variable warnings
>
> The one in pf_key_v2.c could use an extra set of eyes, but I don't
> think there are any side effects.
>

The involved variables in pf_key_v2.c were added in 2000 as part of some
sort of sync-with-upstream, weren't used then, and never used after.
Deleting them seems perfectly fine to me.

ok guenther@ on the entire diff

Re: dt: make vmm tracepoints amd64 only

2022-01-17 Thread Philip Guenther

On Mon, Jan 17, 2022 at 5:02 AM Klemens Nanni  wrote:

> These don't hurt on !VMM architectures but I was still surprised to see
> them on e.g. sparc64:
>
> # arch -s ; btrace -l | grep vmm
> sparc64
> tracepoint:vmm:guest_enter
> tracepoint:vmm:guest_exit
>
> Like some network drivers, we could use __amd64__ to limit those to
> amd64 and save a few bits in all other kernels.
>
> Is this approach feasible or should we just ignore such one-offs?
>

If there's a non-trivial number of trace points in drivers, then it would
suggest that such tracepoints should be added/registered/pick-a-term by the
driver's attach routine.

Philip

Re: usr.sbin/ospf6ctl: fix -Wunused-but-set-variable warning

2022-01-19 Thread Philip Guenther

On Mon, Jan 17, 2022 at 6:36 AM Christian Weisgerber 
wrote:

> usr.sbin/ospf6ctl: fix -Wunused-but-set-variable warning
>
> Maybe this is uncompleted code, but I think we can remove it for
> the time being.
>
> M  usr.sbin/ospf6ctl/ospf6ctl.c
>

Agreed.  ok guenther@

Re: usr.sbin/eigrpd: fix -Wunused-but-set-variable warning

2022-01-19 Thread Philip Guenther

On Sun, Jan 16, 2022 at 1:51 PM Christian Weisgerber 
wrote:

> usr.sbin/eigrpd: fix -Wunused-but-set-variable warning
>

ok guenther@

Re: usr.sbin/dvmrpctl: fix -Wunused-but-set-variable warning

2022-01-19 Thread Philip Guenther

On Sun, Jan 16, 2022 at 12:51 PM Christian Weisgerber 
wrote:

> usr.sbin/dvmrpctl: fix -Wunused-but-set-variable warning
>
> This looks like /* not yet implemented */, but the companion functions
> show_rib_detail_msg() and show_mfc_detail_msg() are equally empty.
>

ok guenther@

Re: usr.bin/mg: fix -Wunused-but-set-variable warnings

2022-01-19 Thread Philip Guenther

On Sun, Jan 16, 2022 at 8:10 AM Christian Weisgerber 
wrote:

> usr.bin/mg: fix -Wunused-but-set-variable warnings
>
> * strtonum() is only called to verify that a string is numerical, the
>   return value is unused.
> * inlist is no longer used after the code was refactored.
>
> OK?
>

ok guenther@

Re: Properly check if ACPI devices are enabled

2022-01-24 Thread Philip Guenther

On Mon, Jan 24, 2022 at 11:41 AM Mark Kettenis 
wrote:

> > Date: Mon, 24 Jan 2022 20:19:46 +0100
> > From: Anton Lindqvist 
> >
> > On Mon, Jan 24, 2022 at 05:31:49PM +0100, Mark Kettenis wrote:
> > > Currently we attach ACPI devices that are present in a machine.
> > > However, in some cases ACPI devices can be present, but not enabled.
> > > Attaching a device driver to devices that are not enabled is not a
> > > good idea since reading and writing from/to its registers will fail
> > > and the driver will malfunction in interesting ways.  Such as a com(4)
> > > serial port that is misdetected and hangs the kernel when it is
> > > actually opened.
> > >
> > > The diff below makes sure we only enable devices that are actually
> > > enabled.  This may cause some devices to disappear in OpenBSD.
> > > However those devices should have been unusable anyway, so that isn't
> > > an issue.
> > >
> > > ok?
> >
> > According to the ACPI specification[1]:
> >
> > > A device can only decode its hardware resources if both bits 0 and 1
> are
> > > set. If the device is not present (bit [0] cleared) or not enabled (bit
> > > [1] cleared), then the device must not decode its resources.
>
> Just before that it says:
>
>   If bit [0] is cleared, then bit 1 must also be cleared (in other
>   words, a device that is not present cannot be enabled).
>
> > Should we therefore check for presence of both STA_PRESENT and
> > STA_ENABLED?
>
> So according to the ACPI specification we don't need to do that.
> Should we do it just to be safe?
>

Unless you're taking money bets about this being the one thing in the ACPI
spec that some vendor won't screw up, doing both seems "can't be worse; can
be better".

Philip

Re: touch(1): don't leak descriptor if futimens(2) fails

2022-01-27 Thread Philip Guenther

On Thu, Jan 27, 2022 at 4:03 PM Scott Cheloha 
wrote:

> If futimens(2) fails here then close(2) is not called and we leak the
> descriptor.
>

Good catch.

I think futimens(2) and close(2) failures are exotic enough to warrant
> printing the system call name.
>

I don't understand this.  Can you give an example of an error message that
touch currently might emit where knowing that the failed call was
futimens() or close() would affect the analysis of how to deal with it?  I
mean, it looks like the only errors that futimens() could really return are
EROFS, EIO, and EPERM (implies a race by different users to create the
file), and close() could only return EIO.  For any of those errors, you're
going to handle them the same whether they're from open, futimens, or
close, no?

Philip Guenther

Re: touch(1): don't leak descriptor if futimens(2) fails

2022-01-28 Thread Philip Guenther

On Fri, Jan 28, 2022 at 7:37 AM Scott Cheloha 
wrote:

> On Fri, Jan 28, 2022 at 07:28:40AM -0700, Todd C. Miller wrote:
> > On Thu, 27 Jan 2022 20:02:18 -0800, Philip Guenther wrote:
> >
> > > > I think futimens(2) and close(2) failures are exotic enough to
> warrant
> > > > printing the system call name.
> > >
> > > I don't understand this.  Can you give an example of an error message
> that
> > > touch currently might emit where knowing that the failed call was
> > > futimens() or close() would affect the analysis of how to deal with
> it?  I
> > > mean, it looks like the only errors that futimens() could really
> return are
> > > EROFS, EIO, and EPERM (implies a race by different users to create the
> > > file), and close() could only return EIO.  For any of those errors,
> you're
> > > going to handle them the same whether they're from open, futimens, or
> > > close, no?
> >
> > I agree.  The actual syscall in this case is pretty much irrelevant.
> > The mostly likely failure is due to an I/O error of some kind.
>
> Alright, you have both convinced me.
>
> We'll go with this patch?
>

Sure (but I think the rval assignment and warn() call should be in a
consistent order).

Philip Guenther

Re: [PATCH v2 3/3] script(1): fix exit status wording, use 125 for general failure

2022-01-28 Thread Philip Guenther

On Fri, Jan 28, 2022 at 5:28 AM наб 
wrote:

> This is a base-line attempt at separating errors from the child from the
> ones from script itself ‒ 125 is the general-purpose code in POSIX
> utilities that exec() (with 126 being ENOEXEC and 127 ‒ ENOENT)
>

I just checked the draft of the next revision of the POSIX spec and can
find no reference to 125 being either recommended or required as the status
for general exec failures.  For example, the spec for xargs includes this:

EXIT STATUS
The following exit values shall be returned:
 0All invocations of utility returned exit
status zero.
 1-125 A command line meeting the specified
requirements could
   not be assembled, one or more of the
invocations of utility
   returned a non-zero exit status, or some
other error occurred.
 126The utility specified by utility was found but
could not be invoked.
 127The utility specified by utility could not be
found.

I'm confident that this isn't a change from previous versions.  Where is
this proposed use of 125 documented?

Philip Guenther

Re: disable libc sys wrappers?

2020-07-08 Thread Philip Guenther

On Wed, Jul 8, 2020 at 8:06 AM Theo de Raadt  wrote:

> Mark Kettenis  wrote:
>
> > > From: "Theo de Raadt" 
> > > Date: Wed, 08 Jul 2020 09:42:41 -0600
> > >
> > > I think we need something like this.
> > >
> > > Documenting it will be a challenge.
> > >
> > > I really don't like the name as is too generic, when the control is
> only
> > > for a narrow set of "current time" system calls.
> >
> > I'm not sure we should be using getenv() in this early initialization
> > function though.
>
> Ah, you worry about the static "#ifndef PIC / early_static_init" versus
> "PIC ld.so" environ setup, and this very early getenv() call might not be
> looking at the environ global.
>

It's late enough in the process (after a possible call
to early_static_init(), and definitely after any fixup by ld.so) that it
should work Just Fine.

I would flip the test to check the environment before running issetugid(2)
because the syscall is more expensive and it's nice not to clutter the
kdump output.  ;-)

Philip Guenther

Re: timekeep: fixing large skews on amd64 with RDTSCP

2020-07-17 Thread Philip Guenther

On Thu, Jul 16, 2020 at 4:55 PM Scott Cheloha 
wrote:

> > On Jul 16, 2020, at 19:36, Theo de Raadt  wrote:
> >
> >> Note the third sentence.
> >>
> >> Given that, I reason that a serializing instruction before *and* after
> >> the RDTSC should freeze it in place.
> >
> > I haven't seen anyone read it that way.
>
> They say that instructions after RDTSC can run before it because
> it isn't a serializing instruction.
>
> Do we want that?
>
> And then, consider this bit of programming advice.  Also from the
> ISA reference (Vol. 2B 4-547):
>
> > If software requires RDTSC to be executed prior to execution of any
> > subsequent instruction (including any memory accesses), it can
> > execute the sequence LFENCE immediately after RDTSC.
>
> What other reading is possible given this documentation?
>
> What is your interpretation?  Am I missing something?
>

If you're trying to time a sequence of instructions via rdtsc, then you
have to put an lfence after the first rdtsc (so that the sequence being
timed doesn't get lifted to before it completes) and an lfence before the
second rdtsc (so that the sequence is actually complete before the second
one).  In this case, is it a problem if instructions after the rdtsc that
are not dependent on its result may be started before it's actually
complete?  If not, then there's no reason to bracket that side.

Philip Guenther

Re: incorrect result from getppid for ptraced processes

2020-09-04 Thread Philip Guenther

On Fri, Sep 4, 2020 at 1:06 PM Mateusz Guzik  wrote:

> On 9/4/20, Vitaliy Makkoveev  wrote:
> > On Fri, Sep 04, 2020 at 05:24:42PM +0200, Mateusz Guzik wrote:
> >> getppid blindly follows the parent pointer and reads the pid.
> >>
> >> The problem is that ptrace reparents the traced process, so in
> >> particular if you gdb -p $something, the target proc will start seeing
> >> gdb instead of its actual parent.
> >>
> >> There is a lot to say about the entire reparenting business or storing
> >> the original pid in ps_oppid (instead of some form of a reference to
> >> the process).
> >>
> >> However, I think the most feasible fix for now is the same thing
> >> FreeBSD did: *always* store the actual parent pid in ps_oppid. This
> >> means all repareting will keep updating it (most notably when
> >> abandoning children on exit), while ptrace will skip that part.
> >>
> >> Side effect of such a change be that getppid will stop requiring the
> >> kernel lock.
> >>
> >
> > Thanks for report. But we are in beta stage now so such modification is
> > impossible until next iteration.
> >
> > Since original parent identifier is stored as `ps_oppid' while process
> > is traced we just return it to userland for this case. This is the way I
> > propose to fix this bug for now.
> >
> > Comments? OKs?
> >
> > Index: sys/kern/kern_prot.c
> > ===
> > RCS file: /cvs/src/sys/kern/kern_prot.c,v
> > retrieving revision 1.76
> > diff -u -p -r1.76 kern_prot.c
> > --- sys/kern/kern_prot.c  9 Jul 2019 12:23:25 -   1.76
> > +++ sys/kern/kern_prot.c  4 Sep 2020 21:12:15 -
> > @@ -84,7 +84,11 @@ int
> >  sys_getppid(struct proc *p, void *v, register_t *retval)
> >  {
> >
> > - *retval = p->p_p->ps_pptr->ps_pid;
> > + if (p->p_p->ps_flags & PS_TRACED)
> > + *retval = p->p_p->ps_oppid;
> > + else
> > +     *retval = p->p_p->ps_pptr->ps_pid;
> > +
> >   return (0);
> >  }
>
> This is definitely a bare minimum fix, but it does the job.
>

ptrace() has behaved like this for the life of OpenBSD and an indefinite
number of years previous in the BSD releases.  What has happened that a
definitely incomplete fix is needed Right Now?


Philip Guenther

Re: incorrect result from getppid for ptraced processes

2020-09-04 Thread Philip Guenther

On Fri, Sep 4, 2020 at 2:59 PM Mateusz Guzik  wrote:

> On 9/5/20, Philip Guenther  wrote:
> > On Fri, Sep 4, 2020 at 1:06 PM Mateusz Guzik  wrote:
> >
> >> On 9/4/20, Vitaliy Makkoveev  wrote:
> >> > On Fri, Sep 04, 2020 at 05:24:42PM +0200, Mateusz Guzik wrote:
> >> >> getppid blindly follows the parent pointer and reads the pid.
> >> >>
> >> >> The problem is that ptrace reparents the traced process, so in
> >> >> particular if you gdb -p $something, the target proc will start
> seeing
> >> >> gdb instead of its actual parent.
> >> >>
> >> >> There is a lot to say about the entire reparenting business or
> storing
> >> >> the original pid in ps_oppid (instead of some form of a reference to
> >> >> the process).
> >> >>
> >> >> However, I think the most feasible fix for now is the same thing
> >> >> FreeBSD did: *always* store the actual parent pid in ps_oppid. This
> >> >> means all repareting will keep updating it (most notably when
> >> >> abandoning children on exit), while ptrace will skip that part.
> >> >>
> >> >> Side effect of such a change be that getppid will stop requiring the
> >> >> kernel lock.
> >> >>
> >> >
> >> > Thanks for report. But we are in beta stage now so such modification
> is
> >> > impossible until next iteration.
> >> >
> >> > Since original parent identifier is stored as `ps_oppid' while process
> >> > is traced we just return it to userland for this case. This is the way
> >> > I
> >> > propose to fix this bug for now.
> >> >
> >> > Comments? OKs?
> >> >
> >> > Index: sys/kern/kern_prot.c
> >> > ===
> >> > RCS file: /cvs/src/sys/kern/kern_prot.c,v
> >> > retrieving revision 1.76
> >> > diff -u -p -r1.76 kern_prot.c
> >> > --- sys/kern/kern_prot.c  9 Jul 2019 12:23:25 -   1.76
> >> > +++ sys/kern/kern_prot.c  4 Sep 2020 21:12:15 -
> >> > @@ -84,7 +84,11 @@ int
> >> >  sys_getppid(struct proc *p, void *v, register_t *retval)
> >> >  {
> >> >
> >> > - *retval = p->p_p->ps_pptr->ps_pid;
> >> > + if (p->p_p->ps_flags & PS_TRACED)
> >> > + *retval = p->p_p->ps_oppid;
> >> > + else
> >> > + *retval = p->p_p->ps_pptr->ps_pid;
> >> > +
> >> >   return (0);
> >> >  }
> >>
> >> This is definitely a bare minimum fix, but it does the job.
> >>
> >
> > ptrace() has behaved like this for the life of OpenBSD and an indefinite
> > number of years previous in the BSD releases.  What has happened that a
> > definitely incomplete fix is needed Right Now?
>
> I don't see how this reads as a demand this is fixed Right Now.
>

I didn't call it a demand, but the point stands: what has changed?


I don't see how the fix is incomplete either. It can be done better
> with more effort, but AFAICS the above results in correct behavior.
>

There are at least 2 other uses of ps_pptr->ps_pid that should also change,
unless you like coredumps and ps disagreeing with getppid(), and someone
needs to think how it affects doas.

Philip Guenther

Re: Infinite loop in uvm protection mapping

2020-10-19 Thread Philip Guenther

On Mon, Oct 19, 2020 at 3:13 PM Tom Rollet  wrote:

> Hi,
>
> I'm starting to help in the development of the dt device.
>
> I'm stuck on permission handling of memory. I'm trying to allocate a
> page in kernel with read/write protections, fill the allocated page
> with data then change the permissions to  read/exec.
>
> Snippet of my code:
>
>   addr = uvm_km_alloc(kernel_map, PAGE_SIZE);
>
>  [...] (memcpy data in allocated page)
>
>   uvm_map_protect(kernel_map, addr, addr + PAGE_SIZE, PROT_READ
>  | PROT_EXEC, FALSE)))
>

This is same usage as seen in the 'sti' driver...which is on hppa only, so
while it's presumably the correct usage of uvm_km_alloc() and
uvm_map_protect(), I don't think uvm_map_protect() has been used on
kernel-space on amd64 (or possibly all non-hppa archs) before in OpenBSD.
Whee?


It triggers the following error at boot time when executing
> the uvm_map_protect function.
>
> uvm_fault(0x81fb2c90, 0x7ffec0008000, 0, 2) -> e kernel: page fault
> trap, code=0 Stopped atpmap_write_protect+0x1f5:  lock andq
> $-0x3,0(%rdi)
>
> Trace:
>
> pmap_write_protect(82187b28,80002255b000,80002255c000,
>  5,50e8b70481f4f622,fd81b6567e70) at pmap_write_protect+0x212
> uvm_map_protect(82129ae0,80002255b000,80002255c000
>  ,5,0,82129ae0) at uvm_map_protect+0x501
> dt_alloc_kprobe(815560e0,80173900,e7ef01a2855152cc,
>  82395c98,0,815560e0) at dt_alloc_kprobe+0x1ff
> dt_prov_kprobe_init(2333e28db00d3edd,0,82121150,0,0,
>  824d9008) at dt_prov_kprobe_init+0x1d9
> dtattach(1,821fb384,f,1,c2ee1c3f472154e,2dda28) at dtattach+0x5d
> main(0,0,0,0,0,1) at main+0x419
>
> The problem comes from the loop in pmap_write_protect
> (sys/arch/amd64/amd64/pmap.c:2108) that is executed
> infinity in my case.
>
> Entry of function pmap_write_protect:
>  sva:  80002250A000
>  eva:  80002250B000
>
> After &= PG_FRAME (line 2098-2099)
>  sva= F80002250A000
>  eva= F80002250B000
>
>   loop:  (ligne 2108)
>
>   first iteration:
>  va   = F80002250A000
>  eva = F80002250B000
>  blockend = 080012240
>
...

> Does anyone have an idea how to fix this issue?


So, blockend is clearly wrong for va and eva.  I suspect the use of
L2_FRAME here:
   blockend = (va & L2_FRAME) + NBPD_L2;

is wrong here and it should be
   blockend = (va & VA_SIGN_NEG(L2_FRAME)) + NBPD_L2;

or some equivalent expression to keep all the bits above the frame.


Philip Guenther

Re: [patch] const cclasses reference in ksh

2020-10-26 Thread Philip Guenther

On Mon, Oct 26, 2020 at 8:35 AM Matthew Martin  wrote:

> Recently cclasses in lib/libc/gen/charclass.h was made const.[1]
> Mark the pointer used to walk the array in ksh const as well.
>
> 1: https://marc.info/?l=openbsd-cvs&m=160256416506433&w=2
>

Ugh, I totally missed that reach-around.

ok guenther@



> diff --git misc.c misc.c
> index 9e6e9db5e76..7226f74eccf 100644
> --- misc.c
> +++ misc.c
> @@ -713,7 +713,7 @@ do_gmatch(const unsigned char *s, const unsigned char
> *se,
>  static int
>  posix_cclass(const unsigned char *pattern, int test, const unsigned char
> **ep)
>  {
> -   struct cclass *cc;
> +   const struct cclass *cc;
> const unsigned char *colon;
> size_t len;
> int rval = 0;
>
>

Re: [PATCH] ifconfig: keep track of allowed ips pointer correctly

2020-10-27 Thread Philip Guenther

On Tue, Oct 27, 2020 at 9:18 AM Jason A. Donenfeld  wrote:
...

> @@ -5721,16 +5720,18 @@ growwgdata(size_t by)
> if (wg_aip != NULL)
> wg_aip = (void *)wg_interface + aip_offset;
>
> -   ret = (void *)wg_interface + wgdata.wgd_size - by;
> -   bzero(ret, by);
> -
> -   return ret;
> +   bzero((void *)wg_interface + wgdata.wgd_size - by, by);
>  }
>

Total side note, but this caught my eye as relying on the gcc extension to
do pointer arithmetic on "void *" pointers.  At least historically we've
avoided relying on that, which is easy in this case by just casting to
"char *" instead.

Philip Guenther

Re: Diff: Introductory Clause Comma Crap

2020-10-31 Thread Philip Guenther

On Sat, Oct 31, 2020 at 10:19 AM VARIK VALEFOR  wrote:

> This message contains even more grammatical fixes for the OpenBSD
> manual pages.  To ensure that the changes which are proposed in this
> message can be justified, this message primarily fixes only a single
> type of error: the absence of commas after introductory clauses.
>

As a procedural side-note, I would like to suggest that before going on a
quest to make a change that touches so many files cross OpenBSD's code
base, that it would be wise to send out a diff with just a couple examples
and verify that the particular item of concern and the proposed fix is
agreed upon before spending the time to search and edit many other pages.

This is true not just for documentation changes but code changes, of
course: doing lots of work before there's "buy-in" is risking your time.

Philip Guenther

Re: powerpc ld.lld fix

2020-11-01 Thread Philip Guenther

Makes sense.  This code is just the space reservation, the relocation
generation or whatever fills them in is suppressed already, yes?  Assuming
so, r+

On Sat, Oct 31, 2020 at 2:46 PM Mark Kettenis 
wrote:

> > Date: Sat, 10 Oct 2020 23:19:19 +0200 (CEST)
> > From: Mark Kettenis 
> >
> > On powerpc with the secure-plt ABI we need a .got section, even if the
> > _GLOBAL_OFFSET_TABLE_ symbol isn't referenced.  This is needed because
> > the first three entries of the GOT are used by the dynamic linker.
> >
> > With this fix I can build executables of all flavours (including
> > -static/-nopie).
>
> Turns out that adding these GOT entries when using "ld -r" is a bad
> idea.  Dif below fixes that.
>
> ok?
>
>
> Index: gnu/llvm/lld/ELF/SyntheticSections.cpp
> ===
> RCS file: /cvs/src/gnu/llvm/lld/ELF/SyntheticSections.cpp,v
> retrieving revision 1.2
> diff -u -p -r1.2 SyntheticSections.cpp
> --- gnu/llvm/lld/ELF/SyntheticSections.cpp  11 Oct 2020 13:10:13
> -  1.2
> +++ gnu/llvm/lld/ELF/SyntheticSections.cpp  31 Oct 2020 23:37:11 -
> @@ -604,7 +604,7 @@ GotSection::GotSection()
>// ElfSym::globalOffsetTable.
>if (ElfSym::globalOffsetTable && !target->gotBaseSymInGotPlt)
>  numEntries += target->gotHeaderEntriesNum;
> -  else if (config->emachine == EM_PPC)
> +  else if (config->emachine == EM_PPC && !config->relocatable)
>  numEntries += target->gotHeaderEntriesNum;
>  }
>
>
>

Re: Fix ilogb(3)

2020-11-07 Thread Philip Guenther

On Fri, Nov 6, 2020 at 4:51 PM George Koehler  wrote:

> Your ilogb fix is ok gkoehler@
>

It's annoying that C and/or ieee754 and the original hardware
implementation in the x87 instructions diverged in their definitions, but
the former is what matters and libm needs to follow that.  ok guenther@

> On Sat, 31 Oct 2020 16:09:07 +0100 (CET)
> Mark Kettenis  wrote:
>
> > - Dropping the amd64 and i386 versions.  Fixing the corner cases in
> >   assembly is hard, and the C implementation should be fast enough for
> >   regular floating-point values.
>
> The amd64 and i386 assembly uses the x87 fxtract instruction.  I feel
> that x87 instructions are obsolete on amd64.  <...>

Umm, no?  The amd64 ABI defines "long double" as matching the format used
by the x87 instructions; a function returning that type returns it in the
x87 %st(0) register and a review of libm finds many of the functions are
naturally implemented by the native x87 instructions.

I believe the main issue in this case is that the standard evolved away
from what Kahan originally did as a consultant with Intel.  For a note on
how the ieee754 standard originated in (large) part from what Kahan did
with Intel, see
  https://people.eecs.berkeley.edu/~wkahan/ieee754status/754story.html

I'm not nearly enough of a numerical analyst to judge the decision of the
standard.

Philip Guenther

Re: stdio: fclose(3), fopen(3): intended locking hierarchy?

2020-11-27 Thread Philip Guenther

lflush() safe
without
   a lock?  Mumble...

Now, back to that first assumption: if you're not willing to assume
it then the "is allocated" test needs to not use _flags but be some
other state change which can be relied on to not have false-positives,
but otherwise the other bits above still apply, I believe.  Would
be a major ABI change...and if we do that there's like 3 other
things we should do at the same time (merge __sfileext into FILE,
grow _file to an int, and maybe move the recursive mutex for
f{,un}lockfile() into FILE?)


Philip Guenther

Re: stdio: fclose(3), fopen(3): intended locking hierarchy?

2020-11-28 Thread Philip Guenther

On Fri, Nov 27, 2020 at 10:35 PM Philip Guenther  wrote:
...

> So yeah, maybe it does work to:
> 1) make __sfp() FLOCKFILE() the allocated FILE before unlocking sfp_mutex
> 2) update f{,d,mem,un}open() and open_*memstream() to match (1), unlocking
>in all paths when the FILE is safe to be accessed by _fwalk(), and
> locking
>sfp_mutex around the zeroing of _flags.
> 3) make fclose() and freopen() also lock sfp_mutex around the zeroing of
> _flags
>(should add an _frelease() to findfp.c that does this dance for (2) and
> (3))
> 4) make _fwalk() take sfp_mutex, and maybe also a FILE* so the setting of
>__SIGN can be done under the lock?
>
5) decide how/whether to handle adjust the FLOCKFILE placement in the
> _fwalk()
>tree: is the testing of the "is line-buffered" flag in lflush() safe
> without
>a lock?  Mumble...
>

After thinking through states more, #4 isn't safe: _fwalk() can't take
sfp_mutex because it's called from __srefill() which has its FILE locked,
which would reverse the order: two concurrent calls to __srefill() from
line-buffered FILEs could have one in _fwalk() blocking on locking the
other, while the other blocks on the sfp_mutex for _fwalk().

Hmm, there's currently a loop between two __srefill() calls like that, as
there's nothing to force visibility of the __SIGN flag between CPUs so they
could try to lock each other.  Grrr.

Time to check other BSDs and see if they have a good solution to this...

Philip

>
> Now, back to that first assumption: if you're not willing to assume
> it then the "is allocated" test needs to not use _flags but be some
> other state change which can be relied on to not have false-positives,
> but otherwise the other bits above still apply, I believe.  Would
> be a major ABI change...and if we do that there's like 3 other
> things we should do at the same time (merge __sfileext into FILE,
> grow _file to an int, and maybe move the recursive mutex for
> f{,un}lockfile() into FILE?)
>
>
> Philip Guenther
>
>

Re: kvm_getfiles and KERN_FILE_BYFILE

2020-11-28 Thread Philip Guenther

On Thu, Nov 26, 2020 at 1:08 PM Martijn van Duren <
openbsd+t...@list.imperialat.at> wrote:

> I'm currently playing around a bit with kvm_getfiles and found that I
> couldn't use KERN_FILE_BYFILE with DTYPE_SOCKET.
> According to kvm_getfiles(3):
>  For KERN_FILE_BYFILE the recognized file types are defined in
>  :
>
>DTYPE_VNODE   files and devices
>DTYPE_SOCKET  sockets, regardless of domain
>DTYPE_PIPEpipes and FIFOs
>DTYPE_KQUEUE  kqueues
>
> But these defines are under ifdef _KERNEL.
>
> So is the manpage lying here, or should the defines be hoisted out
> of the ifdef?
>

Let's go ahead and hoist them: FreeBSD and NetBSD already have.  If
possible, the diff to do that should also simplify the #include bits in
these files:
usr.bin/netstat/inet.c
usr.bin/fstat/fstat.c
    usr.bin/fstat/fuser.c
usr.bin/systat/netstat.c


Philip Guenther

Re: kvm_getfiles and KERN_FILE_BYFILE

2020-11-29 Thread Philip Guenther

On Sun, Nov 29, 2020 at 12:14 PM Martijn van Duren <
openbsd+t...@list.imperialat.at> wrote:

> On Sat, 2020-11-28 at 16:23 -0800, Philip Guenther wrote:
> > On Thu, Nov 26, 2020 at 1:08 PM Martijn van Duren <
> openbsd+t...@list.imperialat.at> wrote:
> > > I'm currently playing around a bit with kvm_getfiles and found that I
> > > couldn't use KERN_FILE_BYFILE with DTYPE_SOCKET.
> > > According to kvm_getfiles(3):
> > >  For KERN_FILE_BYFILE the recognized file types are defined in
> > >  :
> > >
> > >DTYPE_VNODE   files and devices
> > >DTYPE_SOCKET  sockets, regardless of domain
> > >DTYPE_PIPEpipes and FIFOs
> > >DTYPE_KQUEUE  kqueues
> > >
> > > But these defines are under ifdef _KERNEL.
> > >
> > > So is the manpage lying here, or should the defines be hoisted out
> > > of the ifdef?
> > >
> >
> >
> > Let's go ahead and hoist them: FreeBSD and NetBSD already have.  If
> possible, the diff to do that should also simplify the #include bits in
> these files:
> > usr.bin/netstat/inet.c
> > usr.bin/fstat/fstat.c
> > usr.bin/fstat/fuser.c
> > usr.bin/systat/netstat.c
> >
> >
> > Philip Guenther
> >
>
> The others have the #endif/#ifdef break rather low in the file.
> Personally I reckon it's better reading if the common code is more
> towards the top.
>
> OK?
>

ok guenther@

How do the userland clean up bits look?


Philip Guenther

Re: stdio: fclose(3), fopen(3): intended locking hierarchy?

2020-11-30 Thread Philip Guenther

On Mon, Nov 30, 2020 at 6:10 PM Scott Cheloha 
wrote:

> On Sat, Nov 28, 2020 at 01:41:50PM -0800, Philip Guenther wrote:
>
...

> > After thinking through states more, #4 isn't safe: _fwalk() can't take
> > sfp_mutex because it's called from __srefill() which has its FILE locked,
> > which would reverse the order: two concurrent calls to __srefill() from
> > line-buffered FILEs could have one in _fwalk() blocking on locking the
> > other, while the other blocks on the sfp_mutex for _fwalk().
>
> This part in __srefill():
>
> /*
>  * Before reading from a line buffered or unbuffered file,
>  * flush all line buffered output files, per the ANSI C
>  * standard.
>  */
>
...

> Where in the standard(s) is this behavior required?  I'm not even sure
> how to look for it.
>

Pick a version of the C standard and search for "buffered" until you find
something like this, which is part of 7.19.3p3 in the draft I'm looking at:

   <...>  When a stream is line buffered, characters are intended to be
   transmitted to or from the host environment as a block when a new-line
character is
   encountered. Furthermore, characters are intended to be transmitted as a
block to the host
   environment when a buffer is filled, when input is requested on an
unbuffered stream, or
   when input is requested on a line buffered stream that requires the
transmission of
   characters from the host environment. Support for these characteristics
is
   implementation-defined, and may be affected via the setbuf and setvbuf
functions.

Effectively the same text appears in the POSIX standard in XSH 2.5p6.

Basically, you're allowed to stop doing that by instead printing out your
cell-phone number so that everyone who wants to complain that their program
failed to output its prompt before blocking for input can call and scream
at you.  :D

Philip Guenther

Re: Use SMR_TAILQ for `ps_threads'

2020-12-02 Thread Philip Guenther

On Tue, Dec 1, 2020 at 5:48 AM Martin Pieuchot  wrote:
...

> exit1() for a multi-threaded process does the following:
>
> atomic_setbits_int(&p->p_flag, P_WEXIT);
> single_thread_check(p, 0);  // directly or via single_thread_set()
> SMR_TAILQ_REMOVE_LOCKED(&pr->ps_threads, p, p_thr_link);
>
> Note that exit1() can't be called in parallel.
>

Well, multiple threads in a process can easily hit exit1() practically
concurrently, under the limitations of the kernel lock.  If anything in
exit1() sleeps, it has to be ready for some other thread to start into
exit1()!


If exit1() is called with EXIT_NORMAL, the current thread will start by
> setting `ps_single' then iterate over `ps_thread'.  This is done before
> updating `ps_thread' so there's no race in that case.
>
> If exit1() is called with EXIT_THREAD, the current thread might end up
> removing itself from `ps_thread' while another one is iterating over it.
> Since we're currently concerned about the iteration in single_thread_set()
> notice that `P_WEXIT' is checked first.  So if my understanding is correct
> is should be enough to do the following:
>
> SMR_TAILQ_REMOVE_LOCKED(&pr->ps_threads, p, p_thr_link);
> smr_barrier();
>
> That would delays exit1() a bit but doesn't involve callback which is
> IMHO simpler.
>
> Did I miss anything?
>

What does it take to be sure that an "is empty" (or "only contains one")
test is _really_ true with SMR?

Philip

Re: rw_lock_held() & friends

2020-12-07 Thread Philip Guenther

On Mon, Dec 7, 2020 at 11:16 AM Alexandr Nedvedicky <
alexandr.nedvedi...@oracle.com> wrote:

> What's the plan for rw_write_held()? Currently macro is true, if whoever
> holds
> the lock.
>
> >
> > +#define  rw_write_held(rwl)  (rw_status(rwl) == RW_WRITE)
>

Nope.  rw_status() returns RW_WRITE_OTHER if a different thread holds the
lock.


Philip

Re: Make pthread_np.h standalone

2019-05-27 Thread Philip Guenther

On Mon, May 27, 2019 at 7:36 AM Jeremie Courreges-Anglas 
wrote:

>
> A bunch of our ports expect pthread_np.h to be standalone, alas our
> version doesn't include pthread.h.  The diff below should help us get
> rid of some patches in (at least) mongodb, mono, gnustep and webkitgtk4.
>
> ok?
>

ok guenther@

Re: usr.bin/getconf: Add reporting LONG_BIT

2019-05-27 Thread Philip Guenther

On Mon, May 27, 2019 at 3:43 PM Brian Callahan  wrote:

> Below is a small diff in response to some configuration attempts
> found in ports land.
> lang/ponyc uses $(shell getconf LONG_BIT) in its Makefile to
> determine whether or not we're on a 64-bit platform.
>

It needs to know that at the scripting level, outside of the C code itself
where it can directly test LONG_BIT (or perhaps better, _LP64)?  Huh.

> However, our getconf(1) doesn't support reporting LONG_BIT.
> This diff enables it. GNU/FreeBSD/DragonFly/MacOS getconf reports
> LONG_BIT, but NetBSD getconf does not.
> Desirable? OK?
>

That's the way to do it, I just have this vague "what tempting lunacy has
led them to this?" lurking in my mind.

Philip Guenther

logger(1): add -c option for LOG_CONS

2019-06-16 Thread Philip Guenther



Testing something else I needed to call syslog(3) with LOG_CONS.  Diff 
below adds support to logger(1) for doing that.  Option choice is 
compatible with NetBSD.

ok?

Philip Guenther

Index: logger.1
===
RCS file: /data/src/openbsd/src/usr.bin/logger/logger.1,v
retrieving revision 1.21
diff -u -p -r1.21 logger.1
--- logger.15 Jun 2013 17:35:12 -   1.21
+++ logger.116 Jun 2019 19:19:26 -
@@ -38,7 +38,7 @@
 .Nd make entries in the system log
 .Sh SYNOPSIS
 .Nm logger
-.Op Fl is
+.Op Fl cis
 .Op Fl f Ar file
 .Op Fl p Ar pri
 .Op Fl t Ar tag
@@ -52,6 +52,10 @@ system log module.
 .Pp
 The options are as follows:
 .Bl -tag -width "-f file"
+.It Fl c
+If unable to pass the message to
+.Xr syslogd 8 ,
+attempt to write the message to the console.
 .It Fl f Ar file
 Read from the specified
 .Ar file .
@@ -102,5 +106,5 @@ utility is compliant with the
 specification.
 .Pp
 The flags
-.Op Fl ifpst
+.Op Fl cfipst
 are extensions to that specification.
Index: logger.c
===
RCS file: /data/src/openbsd/src/usr.bin/logger/logger.c,v
retrieving revision 1.17
diff -u -p -r1.17 logger.c
--- logger.c28 Mar 2016 18:18:52 -  1.17
+++ logger.c16 Jun 2019 19:18:33 -
@@ -61,8 +61,11 @@ main(int argc, char *argv[])
tag = NULL;
pri = LOG_NOTICE;
logflags = 0;
-   while ((ch = getopt(argc, argv, "f:ip:st:")) != -1)
+   while ((ch = getopt(argc, argv, "cf:ip:st:")) != -1)
switch(ch) {
+   case 'c':   /* log to console */
+   logflags |= LOG_CONS;
+   break;
case 'f':   /* file to log */
if (freopen(optarg, "r", stdin) == NULL) {
(void)fprintf(stderr, "logger: %s: %s.\n",
@@ -180,6 +183,6 @@ void
 usage(void)
 {
(void)fprintf(stderr,
-   "usage: logger [-is] [-f file] [-p pri] [-t tag] [message ...]\n");
+   "usage: logger [-cis] [-f file] [-p pri] [-t tag] [message ...]\n");
exit(1);
 }

Re: Comment fix: No mcontext

2019-07-13 Thread Philip Guenther

On Thu, Jul 11, 2019 at 10:12 PM Benjamin Baier 
wrote:

> there is no mcontext, and since V1.2 of process_machdep.c struct reg
> and struct trapframe don't have to be closely synced.
> process_machdep.c no longer memcpy's one to the other.
>

Yep.  Committed.  Thanks!


Philip Guenther

Re: remove duplicate definitions of LAPIC_ID_MASK and LAPIC_ID_SHIFT

2019-07-25 Thread Philip Guenther

On Thu, Jul 25, 2019 at 4:44 PM Kevin Lo  wrote:

> ok?
>

Yes, please.  guenther@

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1071 matches

Mail list logo