Re: simple pledge for xeyes(1)

2023-09-07 Thread Sebastien Marie
On Thu, Sep 07, 2023 at 11:30:11PM -0400, Thomas Frohwein wrote:
> Very basic pledge(2) for the whole program. I didn't dive too much into
> the details and maybe this can be refined some more. This is kind of a
> product of me trying a tool I made `abstain` [1] for usefulness of
> pledge(2) execpromises and it helped quickly find that xeyes(1) can run
> with a very limited set of promises. I tested all permutations of
> running xeyes(1) that are listed in the man page and none of them break
> with this configuration.
> 
> ok to add?

Runtime testing isn't the best way to work with pledge, as you could easily
miss cases.

Here, you are dealing with an X11 program: does it still work with a remote
DISPLAY? (hint: no, you missed the "inet" promise). So the program will no
longer work with ssh -X (to take the most common example).

"prot_exec" is also suspicious. Usually it is required for dlopen() stuff. I
bet it is a problem due to inferring promises from the execpromises set by the
parent process (or else, libX11 is doing dlopen(3) early).

"rpath" is a bit odd for xeyes(1) normal behaviour (but it will be required on
X11 errors as, if I remember correctly, error codes are "translated" to
messages by reading some file).

For me, you are pledging too early (before initialization). It should be done,
at the earliest, after calling XtAppInitialize(3).

It will be the main limitation for a tool like `abstain`. pledge(2) should be 
called *after* initialization, and not at the beginning of the program.
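
The ordering described above can be sketched as follows. This is only an
illustration under assumptions: app_init() and app_loop() are hypothetical
stand-ins for XtAppInitialize(3) and the Xt main loop, the promise string used
in the usage note is a guess rather than a vetted set for xeyes(1), and the
pledge(2) call is compiled only on OpenBSD so the sketch still builds elsewhere.

```c
#ifdef __OpenBSD__
#include <unistd.h>	/* pledge(2) */
#endif

static int step;	/* records the order in which the phases run */
static int init_step, pledge_step, loop_step;

/* Hypothetical stand-in for XtAppInitialize(3): may open files, sockets... */
static int
app_init(void)
{
	init_step = ++step;
	return 0;
}

/* Restrict the process; a no-op on systems without pledge(2). */
static int
app_restrict(const char *promises)
{
	pledge_step = ++step;
#ifdef __OpenBSD__
	return pledge(promises, NULL);
#else
	(void)promises;
	return 0;
#endif
}

/* Hypothetical stand-in for the steady-state event loop. */
static int
app_loop(void)
{
	loop_step = ++step;
	return 0;
}

/* Initialize first, then pledge, then enter the main loop. */
static int
app_run(const char *promises)
{
	if (app_init() == -1)
		return -1;
	if (app_restrict(promises) == -1)
		return -1;
	return app_loop();
}
```

Calling app_run("stdio rpath unix inet") runs the three phases in that order.
Whatever initialization needs (for example an early dlopen(3)) happens before
the restriction and so does not have to appear in the promise list.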

> 
> Index: xeyes.c
> ===
> RCS file: /cvs/xenocara/app/xeyes/xeyes.c,v
> retrieving revision 1.5
> diff -u -p -r1.5 xeyes.c
> --- xeyes.c   29 Aug 2021 17:50:32 -  1.5
> +++ xeyes.c   8 Sep 2023 03:23:51 -
> @@ -38,6 +38,8 @@ from the X Consortium.
>  #include "Eyes.h"
>  #include 
>  #include 
> +#include 
> +#include 
>  #include "eyes.bit"
>  #include "eyesmask.bit"
>  
> @@ -111,6 +113,8 @@ main(int argc, char **argv)
>  Arg arg[2];
>  Cardinal i;
>  
> +if(pledge("stdio rpath unix prot_exec", NULL) == -1)
> + err(1, "pledge");
>  XtSetLanguageProc(NULL, (XtLanguageProc) NULL, NULL);
>  
>  toplevel = XtAppInitialize(&app_context, "XEyes",
> 

Thanks.
-- 
Sebastien Marie



Re: vnode: drop comment, nonsensical where it is

2023-07-17 Thread Sebastien Marie
On Mon, Jul 17, 2023 at 11:40:55AM +0200, Claudio Jeker wrote:
> On Wed, Jul 12, 2023 at 12:25:19PM +0200, thib4711 wrote:
> > The line comment in struct vnode is fine;
> > 
> > diff --git sys/sys/vnode.h sys/sys/vnode.h
> > index 30787afddd8..b2f0fa4b60c 100644
> > --- sys/sys/vnode.h
> > +++ sys/sys/vnode.h
> > @@ -74,12 +74,7 @@ enum vtagtype{
> >  "unused", "unused", "unused", "ISOFS", "unused",   \
> >  "EXT2FS", "VFS", "NTFS", "UDF", "FUSEFS", "TMPFS"
> >  
> > -/*
> > - * Each underlying filesystem allocates its own private area and hangs
> > - * it from v_data.  If non-null, this area is freed in getnewvnode().
> > - */
> >  LIST_HEAD(buflists, buf);
> > -
> >  RBT_HEAD(buf_rb_bufs, buf);
> >  
> >  struct namecache;
> > 
> 
> Yes, this comment is not helpful (especially since v_data is cleaned up by
> the reclaim function).
> 
> OK claudio@

it is fine with me too. ok semarie@
-- 
Sebastien Marie



Re: vfs: drop unnecessary cache_purge()s

2023-07-17 Thread Sebastien Marie
On Sat, Jul 15, 2023 at 09:21:40AM +0200, Thordur Bjornsson wrote:
> VOP_RECLAIM is only ever called from vclean() to cleanup fs dependent
> data, and vclean() calls cache_purge().
> 
> Makes all of the reclaim implementations the same in this regard.

for now, I am still unsure about the change.

yes, vclean() will call cache_purge() after calling VOP_RECLAIM(), so we end up
calling cache_purge() several times.

but the vnode isn't in the same state inside VOP_RECLAIM() and after calling
it. it seems fine, as *_reclaim() is freeing the v_data contents, and
cache_purge() doesn't touch that.

also, you didn't change ufs_reclaim() to not call cache_purge(). is that on
purpose?

thanks.
-- 
Sebastien Marie

> diff --git sys/isofs/cd9660/cd9660_node.c sys/isofs/cd9660/cd9660_node.c
> index bce99d77c22..300277f3b37 100644
> --- sys/isofs/cd9660/cd9660_node.c
> +++ sys/isofs/cd9660/cd9660_node.c
> @@ -218,7 +218,6 @@ cd9660_reclaim(void *v)
>   /*
>* Purge old data structures associated with the inode.
>*/
> - cache_purge(vp);
>   if (ip->i_devvp) {
>   vrele(ip->i_devvp);
>   ip->i_devvp = 0;
> diff --git sys/msdosfs/msdosfs_denode.c sys/msdosfs/msdosfs_denode.c
> index 7a33212b648..3707c97458e 100644
> --- sys/msdosfs/msdosfs_denode.c
> +++ sys/msdosfs/msdosfs_denode.c
> @@ -600,7 +600,6 @@ msdosfs_reclaim(void *v)
>   /*
>* Purge old data structures associated with the denode.
>*/
> - cache_purge(vp);
>   if (dep->de_devvp) {
>   vrele(dep->de_devvp);
>   dep->de_devvp = 0;
> diff --git sys/nfs/nfs_node.c sys/nfs/nfs_node.c
> index c8ac3b9bb14..38ad5db82fc 100644
> --- sys/nfs/nfs_node.c
> +++ sys/nfs/nfs_node.c
> @@ -237,7 +237,6 @@ nfs_reclaim(void *v)
>   if (np->n_wcred)
>   crfree(np->n_wcred);
>  
> - cache_purge(vp);
>   pool_put(&nfs_node_pool, vp->v_data);
>   vp->v_data = NULL;
>  
> diff --git sys/ntfs/ntfs_vnops.c sys/ntfs/ntfs_vnops.c
> index d239112e991..d40e3d254f6 100644
> --- sys/ntfs/ntfs_vnops.c
> +++ sys/ntfs/ntfs_vnops.c
> @@ -221,8 +221,6 @@ ntfs_reclaim(void *v)
>   return (error);
>   
>   /* Purge old data structures associated with the inode. */
> - cache_purge(vp);
> -
>   ntfs_frele(fp);
>   ntfs_ntput(ip);
>  
> diff --git sys/tmpfs/tmpfs_vnops.c sys/tmpfs/tmpfs_vnops.c
> index bc1390d72c9..6ec13e686b2 100644
> --- sys/tmpfs/tmpfs_vnops.c
> +++ sys/tmpfs/tmpfs_vnops.c
> @@ -1079,8 +1079,6 @@ tmpfs_reclaim(void *v)
>   racing = TMPFS_NODE_RECLAIMING(node);
>   rw_exit_write(&node->tn_nlock);
>  
> - cache_purge(vp);
> -
>   /*
>* If inode is not referenced, i.e. no links, then destroy it.
>* Note: if racing - inode is about to get a new vnode, leave it.
> diff --git sys/ufs/ext2fs/ext2fs_vnops.c sys/ufs/ext2fs/ext2fs_vnops.c
> index 235590d7c74..006a06b0dc8 100644
> --- sys/ufs/ext2fs/ext2fs_vnops.c
> +++ sys/ufs/ext2fs/ext2fs_vnops.c
> @@ -1247,7 +1247,6 @@ ext2fs_reclaim(void *v)
>   /*
>* Purge old data structures associated with the inode.
>*/
> - cache_purge(vp);
>   if (ip->i_devvp)
>   vrele(ip->i_devvp);
>  
> 



Re: vfs: drop a bunch of cast macros

2023-07-17 Thread Sebastien Marie
On Wed, Jul 12, 2023 at 12:26:01PM +0200, thib4711 wrote:
> make it obvious in the vfsops assignment that an op isn't supported.

I agree that it is more readable.

ok semarie@

thanks.
-- 
Sebastien Marie

> diff --git sys/isofs/cd9660/cd9660_extern.h sys/isofs/cd9660/cd9660_extern.h
> index 2a5348e1768..bd8154a27bd 100644
> --- sys/isofs/cd9660/cd9660_extern.h
> +++ sys/isofs/cd9660/cd9660_extern.h
> @@ -94,10 +94,8 @@ int cd9660_vptofh(struct vnode *, struct fid *);
>  int cd9660_init(struct vfsconf *);
>  int cd9660_check_export(struct mount *, struct mbuf *, int *,
>   struct ucred **);
> -#define cd9660_sysctl ((int (*)(int *, u_int, void *, size_t *, void *, \
> -size_t, struct proc *))eopnotsupp)
>  
> -int cd9660_mountroot(void); 
> +int cd9660_mountroot(void);
>  
>  extern const struct vops cd9660_vops;
>  extern const struct vops cd9660_specvops;
> diff --git sys/isofs/cd9660/cd9660_vfsops.c sys/isofs/cd9660/cd9660_vfsops.c
> index ef0ffbbb152..b844a2ff709 100644
> --- sys/isofs/cd9660/cd9660_vfsops.c
> +++ sys/isofs/cd9660/cd9660_vfsops.c
> @@ -72,7 +72,7 @@ const struct vfsops cd9660_vfsops = {
>   .vfs_fhtovp = cd9660_fhtovp,
>   .vfs_vptofh = cd9660_vptofh,
>   .vfs_init   = cd9660_init,
> - .vfs_sysctl = cd9660_sysctl,
> + .vfs_sysctl = (void *)eopnotsupp,
>   .vfs_checkexp   = cd9660_check_export,
>  };
>  
> diff --git sys/msdosfs/msdosfs_vfsops.c sys/msdosfs/msdosfs_vfsops.c
> index 0de37665dfd..6b90195b5e5 100644
> --- sys/msdosfs/msdosfs_vfsops.c
> +++ sys/msdosfs/msdosfs_vfsops.c
> @@ -762,27 +762,18 @@ msdosfs_check_export(struct mount *mp, struct mbuf *nam, int *exflagsp,
>   return (0);
>  }
>  
> -#define msdosfs_vget ((int (*)(struct mount *, ino_t, struct vnode **)) \
> -   eopnotsupp)
> -
> -#define msdosfs_quotactl ((int (*)(struct mount *, int, uid_t, caddr_t, \
> - struct proc *))eopnotsupp)
> -
> -#define msdosfs_sysctl ((int (*)(int *, u_int, void *, size_t *, void *, \
> -size_t, struct proc *))eopnotsupp)
> -
>  const struct vfsops msdosfs_vfsops = {
>   .vfs_mount  = msdosfs_mount,
>   .vfs_start  = msdosfs_start,
>   .vfs_unmount= msdosfs_unmount,
>   .vfs_root   = msdosfs_root,
> - .vfs_quotactl   = msdosfs_quotactl,
> + .vfs_quotactl   = (void *)eopnotsupp,
>   .vfs_statfs = msdosfs_statfs,
>   .vfs_sync   = msdosfs_sync,
> - .vfs_vget   = msdosfs_vget,
> + .vfs_vget   = (void *)eopnotsupp,
>   .vfs_fhtovp = msdosfs_fhtovp,
>   .vfs_vptofh = msdosfs_vptofh,
>   .vfs_init   = msdosfs_init,
> - .vfs_sysctl = msdosfs_sysctl,
> + .vfs_sysctl = (void *)eopnotsupp,
>   .vfs_checkexp   = msdosfs_check_export,
>  };
> 



Re: patch: profiling using utrace(2) (compatible with pledge and unveil)

2023-05-20 Thread Sebastien Marie
Hi,

I am getting back to gmon/profil(2), replying to all feedback. Sorry for the 
delay.

On Wed, May 03, 2023 at 08:45:39AM -0500, Scott Cheloha wrote:
> On Tue, Apr 11, 2023 at 09:28:31AM +0200, Sebastien Marie wrote:
> 
> > One place where I needed to cheat a bit is at moncontrol(0) call in
> > _mcleanup() function.
> > 
> > moncontrol(0) will stop profiling (change the state to GMON_PROF_OFF), and
> > it is calling profil(2) to disable program counter profiling. _mcleanup()
> > is the atexit(3) handler.
> > 
> > Instead of changing pledge(2) permission for profil(2) (which could go deep
> > in the kernel to change the clockrate with setstatclockrate()), I just
> > assumed that it isn't necessary to disable it here: we are in atexit(3), so
> > the process is about to call _exit(2), and the kernel will stop the
> > profiling itself if still running.
> 
> This is an acceptable workaround if you are using moncontrol(3).
> profil(2) is set up before main() and the teardown is transparent to
> the process.  We can probably neglect to turn profil(2) off.
> 
> If you wanted to be *really* safe you could remove the write
> permissions from the profil(2) buffer with mprotect(2) during
> _mcleanup().  This would prevent the kernel from racing with you.
> 
> On the other hand, it would be really nice if profil(2) could fit into
> the "stdio" pledge(2) promise.  I don't quite know what the
> implications of that would be.
> 
> profil(2) tells the kernel: write sample data at *this* address.
> Later, if that piece of memory is not readable and writable by the
> calling process, the copyin(9) or copyout(9) will fail when the thread
> is returning from the clock interrupt and profil(2) is quietly
> disabled for the process.
> 
> Is that set of actions a security concern?  Sorry if that's a dumb
> question.
> 
> It seems like the process is basically just operating on its own
> memory, but maybe it violates the spirit of the "stdio" promise?
> 

I didn't change profil(2) regarding pledge(2) as I basically don't understand
the underlying kernel code (which is a very good reason for me). All I saw is
that it is playing with {start,stop}profclock(9), which goes deep in the
kernel and regularly writes to a user-supplied memory address.

So, as I had a somewhat valid workaround (have profil(2) enabled before any
pledge(2) call, and don't call it at the end), I didn't look into adding
profil(2) to "stdio". It wasn't strictly required.

Please note that adding it to "stdio" means almost every program (even the
sshd sandbox) will gain it.

> > For post-execution gmon.out extraction, a helper might be needed:
> > ktrace.out could contain multiple gmon.out files.
> > 
> > ## compile and collect profil information (-tu option on ktrace is optional)
> > $ cc -static -pg test.c
> > $ ktrace -di -tu ./a.out
> > 
> > ## get gmon.out file
> > $ kdump -u gmon.out | unvis > gmon.out
> > 
> > ## get gmon.out.$name.$pid for multiple processes
> > ##  - first get pid process-name
> > ##  - extract each gmon.out for each pid and store in "gmon.out.$name.$pid" file
> > $ kdump -tu | sed -ne 's/^ \([0-9][0-9]*\) \([^ ]*\) .*/\1 \2/p' | sort -u \
> > | while read pid name; do kdump -u gmon.out -p $pid | unvis > gmon.out.$name.$pid ; done
> > 
> > kdump diff from otto@ mallocdump is needed for 'kdump -u label'.
> 
> Have you taken a shot at changing kdump(8) to produce the gmon.out
> files gprof(1) expects?
> 
> If not, I will take a shot at it.

I looked a bit at kdump(1) to see how to make it more user-friendly. Currently,
I only added an option to produce binary data with -u label (basically to not
use vis(3)).

When you have data from only one process, it means using:

$ kdump -U -u gmon.out > gmon.out

I didn't look into making kdump(1) create files directly, as it is currently a
read-only tool (pledged "stdio rpath getpw"). But it might be a way.


On Thu, May 04, 2023 at 11:09:34AM +0200, Omar Polo wrote:
> On 2023/04/11 09:28:31 +0200, Sebastien Marie  wrote:
> 
> I've used this to profile `gotadmin pack' a couple of times.  got uses
> several "libexec helpers" to parse data on disk, so profiling usually
> gets awkward.  You'd need to build only the program you're interested
> in (e.g. got-read-pack) with PROFILE=1 and collect the data.  $PROFDIR
> doesn't work out-of-the-box due to unveil (PROFILE builds disable
> pledge but leave unveil and permit "gmon.out" rwc.)  With gotd it gets
> a little more tricky since it's a single binary that gets fork+exec'd

Re: unwind(8): fix (some?) bad packet log messages

2023-04-18 Thread Sebastien Marie
> @@ -1545,12 +1545,6 @@ check_resolver_done(struct uw_resolver *res, void *arg, int rcode,
>  
>   prev_state = checked_resolver->state;
>  
> - if (answer_len < LDNS_HEADER_SIZE) {
> - checked_resolver->state = DEAD;
> - log_warnx("%s: bad packet: too short", __func__);
> - goto out;
> - }
> -
>   if (rcode == LDNS_RCODE_SERVFAIL) {
>   log_debug("%s: %s rcode: SERVFAIL", __func__,
>   uw_resolver_type_str[checked_resolver->type]);
> @@ -1559,6 +1553,12 @@ check_resolver_done(struct uw_resolver *res, void *arg, int rcode,
>   goto out;
>   }
>  
> + if (answer_len < LDNS_HEADER_SIZE) {
> + checked_resolver->state = DEAD;
> + log_warnx("%s: bad packet: too short", __func__);
> + goto out;
> + }
> +
>   if (sec == SECURE) {
>   if (dns64_present && (res->type == UW_RES_AUTOCONF ||
>   res->type == UW_RES_ODOT_AUTOCONF)) {
> @@ -1902,6 +1902,11 @@ trust_anchor_resolve_done(struct uw_resolver *res, void *arg, int rcode,
>   uint16_t dnskey_flags;
>   char rdata_buf[1024], *ta;
>  
> + if (rcode == LDNS_RCODE_SERVFAIL) {
> + log_debug("%s: rcode: SERVFAIL", __func__);
> + goto out;
> + }
> +
>   if (answer_len < LDNS_HEADER_SIZE) {
>   log_warnx("bad packet: too short");
>   goto out;
> 
> 
> -- 
> In my defence, I have been left unsupervised.
> 

-- 
Sebastien Marie



patch: profiling using utrace(2) (compatible with pledge and unveil)

2023-04-11 Thread Sebastien Marie
Hi,

After otto@'s work on mallocdump using utrace(2), I started to look again at
profiling (see moncontrol(3)).

The current implementation tries to write a gmon.out file at program exit(3) 
time, which isn't compatible with pledge(2) or unveil(2).

This diff changes the way the runtime profiling information is extracted by 
using utrace(2) (which is permitted while pledged).

The information is collected using ktrace(2), and the gmon.out file could be 
recreated from ktrace.out file post-execution (so without unveil(2) 
restriction).
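
As a rough, portable illustration of the idea (not the code from the diff
below): a profiling buffer larger than one trace record has to be cut into
pieces before it is handed to utrace(2). In this sketch emit_fn and count_sink()
are hypothetical stand-ins for utrace(2), and CHUNK_MAX assumes that
KTR_USER_MAXLEN is 2048. Unlike montrace() in the diff, it never emits an empty
record.

```c
#include <stddef.h>

#define CHUNK_MAX 2048	/* assumed value of KTR_USER_MAXLEN */

/* Hypothetical sink; on OpenBSD this role would be played by utrace(2). */
typedef int (*emit_fn)(const char *label, const void *buf, size_t len);

static size_t emitted_chunks;	/* bookkeeping for the demo sink */
static size_t emitted_bytes;

/* Demo sink: only counts what it is given. */
static int
count_sink(const char *label, const void *buf, size_t len)
{
	(void)label;
	(void)buf;
	emitted_chunks++;
	emitted_bytes += len;
	return 0;
}

/*
 * Send len bytes starting at addr as records of at most CHUNK_MAX bytes.
 * Returns 0 on success, -1 as soon as the sink reports an error.
 */
static int
emit_chunked(emit_fn emit, const char *label, const void *addr, size_t len)
{
	const unsigned char *p = addr;

	while (len > 0) {
		size_t n = len < CHUNK_MAX ? len : CHUNK_MAX;

		if (emit(label, p, n) == -1)
			return -1;
		p += n;
		len -= n;
	}
	return 0;
}
```

The reader side then only has to concatenate, in order, the payloads of all
records carrying the same label to get the original buffer back.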


One place where I needed to cheat a bit is the moncontrol(0) call in the
_mcleanup() function.

moncontrol(0) stops profiling (changes the state to GMON_PROF_OFF), and it
calls profil(2) to disable program counter profiling. _mcleanup() is the
atexit(3) handler.

Instead of changing the pledge(2) permission for profil(2) (which could go deep
in the kernel to change the clockrate with setstatclockrate()), I just assumed
that it isn't necessary to disable it here: we are in atexit(3), so the process
is about to call _exit(2), and the kernel will stop the profiling itself if it
is still running.


For post-execution gmon.out extraction, a helper might be needed: ktrace.out
could contain multiple gmon.out files.

## compile and collect profil information (-tu option on ktrace is optional)
$ cc -static -pg test.c
$ ktrace -di -tu ./a.out

## get gmon.out file
$ kdump -u gmon.out | unvis > gmon.out

## get gmon.out.$name.$pid for multiple processes
##  - first get pid process-name
##  - extract each gmon.out for each pid and store in "gmon.out.$name.$pid" file
$ kdump -tu | sed -ne 's/^ \([0-9][0-9]*\) \([^ ]*\) .*/\1 \2/p' | sort -u \
| while read pid name; do kdump -u gmon.out -p $pid | unvis > gmon.out.$name.$pid ; done


kdump diff from otto@ mallocdump is needed for 'kdump -u label'.


Feedback would be appreciated.
-- 
Sebastien Marie


diff /home/semarie/repos/openbsd/src
commit - 7639070d00278d5f3d76cbc265d756c39619d7a8
path + /home/semarie/repos/openbsd/src
blob - f09ffd91837040d44fb17a9e8c0197c72ba9459a
file + lib/libc/gmon/gmon.c
--- lib/libc/gmon/gmon.c
+++ lib/libc/gmon/gmon.c
@@ -29,6 +29,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -51,6 +52,32 @@ void
 PROTO_NORMAL(moncontrol);
 PROTO_DEPRECATED(monstartup);
 
+static void
+montrace(void *addr, size_t len)
+{
+   for (;;) {
+   if (len < KTR_USER_MAXLEN) {
+   if (utrace("gmon.out", addr, len) == -1)
+   ERR("error on utrace(), truncated gmon.out");
+   return;
+   }
+   if (utrace("gmon.out", addr, KTR_USER_MAXLEN) == -1)
+   ERR("error on utrace(), truncated gmon.out");
+   
+   len -= KTR_USER_MAXLEN;
+   addr += KTR_USER_MAXLEN;
+   }
+}
+
+#ifdef DEBUG
+static int
+monlog(char *msg)
+{
+   size_t len = strlen(msg);
+   return utrace("gmon.log", msg, len);
+}
+#endif
+
 void
 monstartup(u_long lowpc, u_long highpc)
 {
@@ -136,7 +163,6 @@ _mcleanup(void)
 void
 _mcleanup(void)
 {
-   int fd;
int fromindex;
int endfrom;
u_long frompc;
@@ -147,11 +173,8 @@ _mcleanup(void)
struct clockinfo clockinfo;
const int mib[2] = { CTL_KERN, KERN_CLOCKRATE };
size_t size;
-   char *profdir;
-   char *proffile;
-   char  buf[PATH_MAX];
 #ifdef DEBUG
-   int log, len;
+   int len;
char dbuf[200];
 #endif
 
@@ -169,67 +192,16 @@ _mcleanup(void)
clockinfo.profhz = clockinfo.hz;/* best guess */
}
 
-   moncontrol(0);
+   /*
+* Do not call moncontrol(0) (neither profil(2)) as we might be pledged.
+* We are in _mcleanup(), so the process is inside exit(3).
+*/
+   p->state = GMON_PROF_OFF;
 
-   if (issetugid() == 0 && (profdir = getenv("PROFDIR")) != NULL) {
-   char *s, *t, *limit;
-   pid_t pid;
-   long divisor;
-
-   /* If PROFDIR contains a null value, no profiling
-  output is produced */
-   if (*profdir == '\0') {
-   return;
-   }
-
-   limit = buf + sizeof buf - 1 - 10 - 1 -
-   strlen(__progname) - 1;
-   t = buf;
-   s = profdir;
-   while((*t = *s) != '\0' && t < limit) {
-   t++;
-   s++;
-   }
-   *t++ = '/';
-
-   /*
-* Copy and convert pid from a pid_t to a string.  For
-* best performance, divisor should be initialized to
-* the largest power of 10 less than PID_MAX.
-*/
-   pid = getpid();
-   divisor=1

Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Sebastien Marie
gc, argv, "f:dHlm:np:RTt:u:xX")) != -1)
>   switch (ch) {
>   case 'f':
>   tracefile = optarg;
> @@ -211,6 +212,11 @@ main(int argc, char *argv[])
>   trpoints = getpoints(optarg, DEF_POINTS);
>   if (trpoints < 0)
>   errx(1, "unknown trace point in %s", optarg);
> + utracefilter = NULL;
> + break;
> + case 'u':
> + utracefilter = optarg;
> + trpoints = KTRFAC_USER;
>   break;
>   case 'x':
>   iohex = 1;
> @@ -246,7 +252,7 @@ main(int argc, char *argv[])
>   silent = 0;
>   if (pid_opt != -1 && pid_opt != ktr_header.ktr_pid)
>   silent = 1;
> - if (silent == 0 && trpoints & (1<<ktr_header.ktr_type))
> + if (utracefilter == NULL && silent == 0 && trpoints & (1<<ktr_header.ktr_type))
>   dumpheader(&ktr_header);
>   if (ktrlen > size) {
> @@ -1254,10 +1260,16 @@ showbufc(int col, unsigned char *dp, siz
>  static void
>  showbuf(unsigned char *dp, size_t datalen)
>  {
> - int i, j;
> + size_t i, j;
>   int col = 0, bpl;
>   unsigned char c;
> + char visbuf[4 * KTR_USER_MAXLEN + 1];
>  
> + if (utracefilter != NULL) {
> + strvisx(visbuf, dp, datalen, VIS_SAFE | VIS_OCTAL);
> + printf("%s", visbuf);
> + return;
> + }
>   if (iohex == 1) {
>   putchar('\t');
>   col = 8;
> @@ -1280,7 +1292,7 @@ showbuf(unsigned char *dp, size_t datale
>   if (bpl <= 0)
>   bpl = 1;
>   for (i = 0; i < datalen; i += bpl) {
> - printf("   %04x:  ", i);
> + printf("   %04zx:  ", i);
>   for (j = 0; j < bpl; j++) {
>   if (i+j >= datalen)
>   printf("   ");
> @@ -1413,9 +1425,13 @@ ktruser(struct ktr_user *usr, size_t len
>   if (len < sizeof(struct ktr_user))
>   errx(1, "invalid ktr user length %zu", len);
>   len -= sizeof(struct ktr_user);
> - printf("%.*s:", KTR_USER_MAXIDLEN, usr->ktr_id);
> - printf(" %zu bytes\n", len);
> - showbuf((unsigned char *)(usr + 1), len);
> + if (utracefilter == NULL) {
> + printf("%.*s:", KTR_USER_MAXIDLEN, usr->ktr_id);
> + printf(" %zu bytes\n", len);
> + showbuf((unsigned char *)(usr + 1), len);
> + }
> + else if (strncmp(usr->ktr_id, utracefilter, KTR_USER_MAXIDLEN) == 0)
> + showbuf((unsigned char *)(usr + 1), len);
>  }
>  
>  static void
> @@ -1473,8 +1489,8 @@ usage(void)
>  
>   extern char *__progname;
>   fprintf(stderr, "usage: %s "
> - "[-dHlnRTXx] [-f file] [-m maxdata] [-p pid] [-t trstr]\n",
> - __progname);
> + "[-dHlnRTXx] [-f file] [-m maxdata] [-p pid] [-t trstr] "
> + "[-u label]\n", __progname);
>   exit(1);
>  }
>  

-- 
Sebastien Marie



Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Sebastien Marie
On Sun, Apr 09, 2023 at 10:08:25AM +0200, Claudio Jeker wrote:
> On Sun, Apr 09, 2023 at 09:15:12AM +0200, Otto Moerbeek wrote:
> > On Sun, Apr 09, 2023 at 08:20:43AM +0200, Otto Moerbeek wrote:
> 
> I would prefer if every utrace() call is a full line (in other words ulog
> should be line buffered). It makes the regular kdump output more useable.
> Right now you depend on kdump -u to put the lines back together.
> 
> Whenever I used utrace() I normally passed binary objects to the call so I
> could enrich the ktrace with userland trace info.  So if kdump -u is used
> for more than just mallocstats it should have a true binary mode. For
> example gmon.out is a binary format so the above example by semarie@ would
> not work.  As usual this can be solved in tree once that problem is hit.

Yes, I saw my example wasn't right for gmon.out, but it could easily be worked
around with unvis(1), as the vis(3) options used preserve reversibility.

$ kdump -u profil | unvis > gmon.out
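
To illustrate why a reversible text encoding is enough to carry binary data,
here is a small sketch. It is a simplified stand-in and not the vis(3) format:
octal_encode() and octal_decode() are hypothetical helpers that turn every
byte outside printable ASCII, and the backslash itself, into a three-digit
octal escape. The worst case is four output characters per input byte plus the
terminating NUL.

```c
#include <stddef.h>

/*
 * Encode src into dst: printable ASCII other than '\\' is copied, anything
 * else becomes "\ooo" (three octal digits). dst needs 4 * len + 1 bytes.
 * Returns the encoded length, not counting the terminating NUL.
 */
static size_t
octal_encode(char *dst, const unsigned char *src, size_t len)
{
	size_t i, o = 0;

	for (i = 0; i < len; i++) {
		unsigned char c = src[i];

		if (c != '\\' && c >= 0x20 && c < 0x7f) {
			dst[o++] = (char)c;
		} else {
			dst[o++] = '\\';
			dst[o++] = (char)('0' + ((c >> 6) & 7));
			dst[o++] = (char)('0' + ((c >> 3) & 7));
			dst[o++] = (char)('0' + (c & 7));
		}
	}
	dst[o] = '\0';
	return o;
}

/*
 * Decode the output of octal_encode() into dst, which must be able to hold
 * the original length. Returns the number of bytes written.
 */
static size_t
octal_decode(unsigned char *dst, const char *src)
{
	size_t o = 0;

	while (*src != '\0') {
		if (src[0] == '\\' && src[1] && src[2] && src[3]) {
			dst[o++] = (unsigned char)(((src[1] - '0') << 6) |
			    ((src[2] - '0') << 3) | (src[3] - '0'));
			src += 4;
		} else {
			dst[o++] = (unsigned char)*src++;
		}
	}
	return o;
}
```

Because every escaped byte has exactly one textual form and the escape
character is itself escaped, decoding the encoded text gives back the original
bytes, which is the property the unvis(1) workaround relies on.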

I agree it could be done in tree if a binary output in kdump(1) is preferred.

Regarding profiling, I wonder if it is currently broken. I am investigating,
and will file a bug report once I have enough information.
-- 
Sebastien Marie



Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Sebastien Marie
is(3) as dp is fixed size (KTR_USER_MAXLEN). so visbuf would be 
KTR_USER_MAXLEN*4+1 in size.

but the way you did is fine too.

> + }
> + return;
> + }
>   if (iohex == 1) {
>   putchar('\t');
>   col = 8;
> @@ -1280,7 +1292,7 @@ showbuf(unsigned char *dp, size_t datale
>   if (bpl <= 0)
>   bpl = 1;
>   for (i = 0; i < datalen; i += bpl) {
> - printf("   %04x:  ", i);
> + printf("   %04zx:  ", i);
>   for (j = 0; j < bpl; j++) {
>   if (i+j >= datalen)
>   printf("   ");
> @@ -1413,9 +1425,13 @@ ktruser(struct ktr_user *usr, size_t len
>   if (len < sizeof(struct ktr_user))
>   errx(1, "invalid ktr user length %zu", len);
>   len -= sizeof(struct ktr_user);
> - printf("%.*s:", KTR_USER_MAXIDLEN, usr->ktr_id);
> - printf(" %zu bytes\n", len);
> - showbuf((unsigned char *)(usr + 1), len);
> + if (utracefilter == NULL) {
> + printf("%.*s:", KTR_USER_MAXIDLEN, usr->ktr_id);
> + printf(" %zu bytes\n", len);
> + showbuf((unsigned char *)(usr + 1), len);
> + }
> + else if (strncmp(usr->ktr_id, utracefilter, KTR_USER_MAXIDLEN) == 0)
> + showbuf((unsigned char *)(usr + 1), len);
>  }
>  
>  static void
> @@ -1473,8 +1489,8 @@ usage(void)
>  
>   extern char *__progname;
>   fprintf(stderr, "usage: %s "
> - "[-dHlnRTXx] [-f file] [-m maxdata] [-p pid] [-t trstr]\n",
> - __progname);
> + "[-dHlnRTXx] [-f file] [-m maxdata] [-p pid] [-t trstr] "
> + "[-u word]\n", __progname);
>   exit(1);
>  }
>  
> 
> 
> 

-- 
Sebastien Marie



Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-08 Thread Sebastien Marie
On Fri, Apr 07, 2023 at 09:52:52AM +0200, Otto Moerbeek wrote:
> > Hi,
> > 
> > This is work in progress. I have to think if the flags to kdump I'm
> > introducing should be two or a single one.
> > 
> > Currently, malloc.c can be compiled with MALLOC_STATS defined. If run
> > with option D it dumps its state to a malloc.out file at exit. This
> > state can be used to find leaks amongst other things.
> > 
> > This is not ideal for pledged processes, as they often have no way to
> > write files.
> > 
> > This changes malloc to use utrace(2) for that.
> > 
> > kdump has no nice way to show those lines without all the extras it
> > normally shows, so add two options to it to just show the lines.
> > 
> > To use, compile and install libc with MALLOC_STATS defined.
> > 
> > Run :
> > 
> > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > ...
> > $ kdump -hu
> > 
> > Feedback appreciated.

I can't really comment on malloc(3) stuff, but I agree that utrace(2) is a good
way to get information out of a pledged process.

I tend to think it is safe to use, as the pledged process needs cooperation
from outside to exfiltrate information (a process with permission to call
ktrace(2) on this pid).

Please note it is a somewhat generic problem: at least profiled processes would
also benefit from using it.


Regarding kdump options, I think that the -u option should imply -h (no header).

Would it make sense to consider a process using utrace(2) with several
interleaved records from different sources? A process with MALLOC_OPTIONS=D and
profiling enabled, for example? An (additional) option on kdump to filter on
the utrace label would be useful in such a case, or have -u mandate a label to
filter on.

$ MALLOC_OPTIONS=D ktrace -tu your_program
$ kdump -u mallocdumpline

and for profiling:

$ kdump -u profil > gmon.out
$ gprof your_program gmon.out
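
The label filter suggested above could look roughly like this. It is a sketch
under assumptions, not kdump code: struct urecord is a hypothetical, simplified
record, and ID_MAX assumes that KTR_USER_MAXIDLEN is 20.

```c
#include <string.h>

#define ID_MAX 20	/* assumed value of KTR_USER_MAXIDLEN */

/* Hypothetical, simplified record: fixed-size label plus payload length. */
struct urecord {
	char	id[ID_MAX];	/* not necessarily NUL-terminated */
	size_t	len;
};

/* Nonzero when the record should be shown for the filter (NULL = all). */
static int
label_matches(const struct urecord *r, const char *filter)
{
	if (filter == NULL)
		return 1;
	/* Bounded compare: the label may fill the whole array. */
	return strncmp(r->id, filter, ID_MAX) == 0;
}

/* Sum the payload sizes of the records selected by filter. */
static size_t
selected_bytes(const struct urecord *recs, size_t n, const char *filter)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++)
		if (label_matches(&recs[i], filter))
			total += recs[i].len;
	return total;
}
```

With interleaved "mallocdumpline" and "profil" records, filtering on one label
leaves the other source out entirely, so each consumer only sees its own data.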

Thanks.
-- 
Sebastien Marie



Re: help wanted for a specific clang diff

2023-01-19 Thread Sebastien Marie
On Tue, Jan 17, 2023 at 05:57:41PM +0100, Sebastien Marie wrote:
> On Tue, Jan 17, 2023 at 02:14:20PM +0100, Sebastien Marie wrote:
> > On Mon, Jan 16, 2023 at 12:00:30PM -0700, Theo de Raadt wrote:
> > > For this xonly work, we are having to one-by-one find .S files that
> > > are putting data tables into the .text segment
> > > 
> > > I am hoping to find someone who can do c++ well enough, and maybe
> > > has some familiarity with the clang code, to add a warning message
> > > for this
> > > 
> > > if a .long, .quad, .byte are placed into a .text section, issue
> > > a warning, then we'll be able to identify all these in a ports build
> > > and decide which need manual fixing, and move the objects into .rodata
> > > and apply __PIC__ handling as necessary
> > > 
> > > Yes, there are cases where people use .long to inject an instruction
> > > they don't believe the assembler has.  We can ignore those by inspect.
> > > 
> > > Can anyone help?  It doesn't need to be fancy, it just needs to get
> > > us moving faster.
> > > 
> > > Thanks
> > > 

again, a new diff.

in C code, when using asm("") with a .byte directive, the warning is sensitive
to -Werror, and it broke the tree (on some archs, we are using such directives
to emit instructions unknown to LLVM 13).

the main drawback of the diff is that some context (like code from a macro)
could be missing now:

$ cat -n test.c
 1  #define TEST() asm(".byte 0xAA")
 2
 3  int
 4  test(void)
 5  {
 6  TEST();
 7  return 0;
 8  }
$ cc -c test.c -Werror
<inline asm>:1:8: warning: directive value inside .text section: directive '.byte', section '.text'
.byte 0xAA
  ^
$ echo $?
0


before, we had TEST() macro information...

$ cc -c test.c -Werror
test.c:6:2: error: directive value inside .text section: directive '.byte', section '.text' [-Werror,-Winline-asm]
TEST();
^
test.c:1:20: note: expanded from macro 'TEST'
#define TEST() asm(".byte 0xAA")
       ^
<inline asm>:1:8: note: instantiated into assembly here
.byte 0xAA
  ^
1 error generated.
$ echo $?
1

I will see if I can do a bit better, but the following diff is immune to
-Werror.
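
For reference, the section-name test used by the diff (an exact ".text" or a
".text." prefix) can be restated in plain C. is_text_section() is a
hypothetical helper written for this illustration, not part of LLVM.

```c
#include <string.h>

/* Nonzero for ".text" and for subsections such as ".text.unlikely". */
static int
is_text_section(const char *name)
{
	static const char prefix[] = ".text.";

	if (strcmp(name, ".text") == 0)
		return 1;
	/* sizeof(prefix) - 1 skips the terminating NUL of the prefix. */
	return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}
```

Requiring the trailing dot in the prefix keeps unrelated names that merely
start with ".text" (for example ".textual") from being reported.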
-- 
Sebastien Marie

diff /home/semarie/repos/openbsd/src
commit - b18ed08defc1c5e5b4be701ce5ef913bda8ae66a
path + /home/semarie/repos/openbsd/src
blob - 047ed1660a837e77561b1f2839ec9601f9582a7e
file + gnu/llvm/llvm/lib/MC/MCParser/AsmParser.cpp
--- gnu/llvm/llvm/lib/MC/MCParser/AsmParser.cpp
+++ gnu/llvm/llvm/lib/MC/MCParser/AsmParser.cpp
@@ -3168,6 +3168,18 @@ bool AsmParser::parseDirectiveValue(StringRef IDVal, u
 SMLoc ExprLoc = getLexer().getLoc();
 if (checkForValidSection() || parseExpression(Value))
   return true;
+// Check for directive inside .text
+const MCSection *Section = getStreamer().getCurrentSectionOnly();
+const StringRef sectionName = Section->getName();
+if (sectionName.equals(".text") || sectionName.startswith(".text.")) {
+  const SMDiagnostic diag = SrcMgr.GetMessage(ExprLoc, SourceMgr::DK_Warning,
+"directive value inside .text section: "
+"directive '" + Twine(IDVal) + "', section '" + Twine(sectionName) + "'",
+None, None);
+  unsigned CurBuf = SrcMgr.FindBufferContainingLoc(diag.getLoc());
+  SrcMgr.PrintIncludeStack(SrcMgr.getBufferInfo(CurBuf).IncludeLoc, errs());
+  diag.print(nullptr, errs(), false);
+}
 // Special case constant expressions to match code generator.
 if (const MCConstantExpr *MCE = dyn_cast<MCConstantExpr>(Value)) {
   assert(Size <= 8 && "Invalid size");
blob - 7b4d6e529cc2c3c4efce05f62f41620658d7a8e0
file + gnu/llvm/llvm/lib/MC/MCParser/MasmParser.cpp
--- gnu/llvm/llvm/lib/MC/MCParser/MasmParser.cpp
+++ gnu/llvm/llvm/lib/MC/MCParser/MasmParser.cpp
@@ -3809,6 +3809,21 @@ bool MasmParser::parseDirectiveValue(StringRef IDVal, 
 /// parseDirectiveValue
 ///  ::= (byte | word | ... ) [ expression (, expression)* ]
 bool MasmParser::parseDirectiveValue(StringRef IDVal, unsigned Size) {
+  // Check for directive inside .text
+  const MCSection *Section = getStreamer().getCurrentSectionOnly();
+  if (Section != NULL) {
+const StringRef sectionName = Section->getName();
+if (sectionName.equals(".text") || sectionName.startswith(".text.")) {
+  const SMLoc ExprLoc = getLexer().getLoc();
+  const SMDiagnostic diag = SrcMgr.GetMessage(ExprLoc, SourceMgr::DK_Warning,
+"directive value inside .text section: "
+"directive '" + Twine(IDVal) + "', section '" + Twine(sectionName) + "'",
+None, None);
+  unsigned CurBuf = SrcMgr.FindBuf

Re: help wanted for a specific clang diff

2023-01-17 Thread Sebastien Marie
On Tue, Jan 17, 2023 at 02:14:20PM +0100, Sebastien Marie wrote:
> On Mon, Jan 16, 2023 at 12:00:30PM -0700, Theo de Raadt wrote:
> > For this xonly work, we are having to one-by-one find .S files that
> > are putting data tables into the .text segment
> > 
> > I am hoping to find someone who can do c++ well enough, and maybe
> > has some familiarity with the clang code, to add a warning message
> > for this
> > 
> > if a .long, .quad, .byte are placed into a .text section, issue
> > a warning, then we'll be able to identify all these in a ports build
> > and decide which need manual fixing, and move the objects into .rodata
> > and apply __PIC__ handling as necessary
> > 
> > Yes, there are cases where people use .long to inject an instruction
> > they don't believe the assembler has.  We can ignore those by inspect.
> > 
> > Can anyone help?  It doesn't need to be fancy, it just needs to get
> > us moving faster.
> > 
> > Thanks
> > 
> 
> The following diff seems to work. It adds a warning inside assembler parsers.
> 

New diff, with a warning message that is simpler to grep for:

  test.S:29:8: warning: directive value inside .text section: directive '.byte', section '.text'

$ grep -F "directive value inside .text section:"
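For a whole ports build, the matches could be summarized per file. What follows is only a rough sketch: the function name text_data_files is made up for illustration, and it assumes the build log is fed on stdin and that file names contain neither ':' nor spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of the diff): read a build log on stdin and
# print "file count" for every file that triggered the new warning.
# usage: text_data_files < build.log
text_data_files() {
	grep -F "directive value inside .text section:" |
	    cut -d: -f1 |		# keep only the file name before the first ':'
	    sort | uniq -c |		# count the hits per file
	    awk '{ print $2, $1 }'	# normalize to "file count"
}
```

Sorting by file name keeps the output stable between builds, so two logs can be compared with diff(1).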

-- 
Sebastien Marie

diff /home/semarie/repos/openbsd/src
commit - b18ed08defc1c5e5b4be701ce5ef913bda8ae66a
path + /home/semarie/repos/openbsd/src
blob - 047ed1660a837e77561b1f2839ec9601f9582a7e
file + gnu/llvm/llvm/lib/MC/MCParser/AsmParser.cpp
--- gnu/llvm/llvm/lib/MC/MCParser/AsmParser.cpp
+++ gnu/llvm/llvm/lib/MC/MCParser/AsmParser.cpp
@@ -3168,6 +3168,12 @@ bool AsmParser::parseDirectiveValue(StringRef IDVal, u
 SMLoc ExprLoc = getLexer().getLoc();
 if (checkForValidSection() || parseExpression(Value))
   return true;
+// Check for directive inside .text
+const MCSection *Section = getStreamer().getCurrentSectionOnly();
+const StringRef sectionName = Section->getName();
+if (sectionName.equals(".text") || sectionName.startswith(".text."))
+  Warning(ExprLoc, "directive value inside .text section: "
+"directive '" + Twine(IDVal) + "', section '" + Twine(sectionName) + "'");
 // Special case constant expressions to match code generator.
 if (const MCConstantExpr *MCE = dyn_cast<MCConstantExpr>(Value)) {
   assert(Size <= 8 && "Invalid size");
blob - 7b4d6e529cc2c3c4efce05f62f41620658d7a8e0
file + gnu/llvm/llvm/lib/MC/MCParser/MasmParser.cpp
--- gnu/llvm/llvm/lib/MC/MCParser/MasmParser.cpp
+++ gnu/llvm/llvm/lib/MC/MCParser/MasmParser.cpp
@@ -3809,6 +3809,14 @@ bool MasmParser::parseDirectiveValue(StringRef IDVal, 
 /// parseDirectiveValue
 ///  ::= (byte | word | ... ) [ expression (, expression)* ]
 bool MasmParser::parseDirectiveValue(StringRef IDVal, unsigned Size) {
+  // Check for directive inside .text
+  const MCSection *Section = getStreamer().getCurrentSectionOnly();
+  if (Section != NULL) {
+const StringRef sectionName = Section->getName();
+if (sectionName.equals(".text") || sectionName.startswith(".text."))
+  Warning(getLexer().getLoc(), "directive value inside .text section: "
+"directive '" + Twine(IDVal) + "', section '" + Twine(sectionName) + 
"'");
+  }
   if (StructInProgress.empty()) {
 // Initialize data value.
 if (emitIntegralValues(Size))
@@ -3825,6 +3833,14 @@ bool MasmParser::parseDirectiveNamedValue(StringRef Ty
 bool MasmParser::parseDirectiveNamedValue(StringRef TypeName, unsigned Size,
   StringRef Name, SMLoc NameLoc) {
   if (StructInProgress.empty()) {
+// Check for directive inside .text
+const MCSection *Section = getStreamer().getCurrentSectionOnly();
+if (Section != NULL) {
+  const StringRef sectionName = Section->getName();
+  if (sectionName.equals(".text") || sectionName.startswith(".text."))
+Warning(getLexer().getLoc(), "directive value inside .text section: "
+  "named directive '" + Twine(Name) + "', section '" + Twine(sectionName) + "'");
+}
 // Initialize named data value.
 MCSymbol *Sym = getContext().getOrCreateSymbol(Name);
 getStreamer().emitLabel(Sym);



Re: help wanted for a specific clang diff

2023-01-17 Thread Sebastien Marie
On Mon, Jan 16, 2023 at 12:00:30PM -0700, Theo de Raadt wrote:
> For this xonly work, we are having to one-by-one find .S files that
> are putting data tables into the .text segment
> 
> I am hoping to find someone who can do c++ well enough, and maybe
> has some familiarity with the clang code, to add a warning message
> for this
> 
> if a .long, .quad, .byte are placed into a .text section, issue
> a warning, then we'll be able to identify all these in a ports build
> and decide which need manual fixing, and move the objects into .rodata
> and apply __PIC__ handling as neccessary
> 
> Yes, there are cases where people use .long to inject an instruction
> they don't believe the assembler has.  We can ignore those by inspect.
> 
> Can anyone help?  It doesn't need to be fancy, it just needs to get
> us moving faster.
> 
> Thanks
> 

The following diff seems to work. It adds a warning inside the assembler parsers.

Please note that it is a warning at the assembler level, so -Werror doesn't 
influence it, whereas -Wa,--fatal-warnings will.


The check is done for "directive value" and "directive named value". It doesn't 
cover ascii directives (like .asci{i,z}, .string) nor real directives 
(.real{4,8,10}).


I checked it on some fairly simple code:

$ clang -c test.S
test.S:1:10: warning: directive value '.long' inside '.text' section
a: .long 0x90909090
         ^
test.S:2:10: warning: directive value '.word' inside '.text' section
b: .word 0x9090
         ^
test.S:3:10: warning: directive value '.quad' inside '.text' section
c: .quad 0x90
         ^
test.S:4:10: warning: directive value '.byte' inside '.text' section
d: .byte 0x90
         ^


-- 
Sebastien Marie


diff /home/semarie/repos/openbsd/src
commit - b18ed08defc1c5e5b4be701ce5ef913bda8ae66a
path + /home/semarie/repos/openbsd/src
blob - 047ed1660a837e77561b1f2839ec9601f9582a7e
file + gnu/llvm/llvm/lib/MC/MCParser/AsmParser.cpp
--- gnu/llvm/llvm/lib/MC/MCParser/AsmParser.cpp
+++ gnu/llvm/llvm/lib/MC/MCParser/AsmParser.cpp
@@ -3168,6 +3168,12 @@ bool AsmParser::parseDirectiveValue(StringRef IDVal, u
 SMLoc ExprLoc = getLexer().getLoc();
 if (checkForValidSection() || parseExpression(Value))
   return true;
+// Check for directive inside .text
+const MCSection *Section = getStreamer().getCurrentSectionOnly();
+const StringRef sectionName = Section->getName();
+if (sectionName.equals(".text") || sectionName.startswith(".text."))
+  Warning(ExprLoc, "directive value '" + Twine(IDVal) + "' inside '"
++ Twine(sectionName) + "' section");
 // Special case constant expressions to match code generator.
 if (const MCConstantExpr *MCE = dyn_cast<MCConstantExpr>(Value)) {
   assert(Size <= 8 && "Invalid size");
blob - 7b4d6e529cc2c3c4efce05f62f41620658d7a8e0
file + gnu/llvm/llvm/lib/MC/MCParser/MasmParser.cpp
--- gnu/llvm/llvm/lib/MC/MCParser/MasmParser.cpp
+++ gnu/llvm/llvm/lib/MC/MCParser/MasmParser.cpp
@@ -3809,6 +3809,14 @@ bool MasmParser::parseDirectiveValue(StringRef IDVal, 
 /// parseDirectiveValue
 ///  ::= (byte | word | ... ) [ expression (, expression)* ]
 bool MasmParser::parseDirectiveValue(StringRef IDVal, unsigned Size) {
+  // Check for directive inside .text
+  const MCSection *Section = getStreamer().getCurrentSectionOnly();
+  if (Section != NULL) {
+const StringRef sectionName = Section->getName();
+if (sectionName.equals(".text") || sectionName.startswith(".text."))
+  Warning(getLexer().getLoc(), "directive value '" + Twine(IDVal)
++ "' inside '" + Twine(sectionName) + "' section");
+  }
   if (StructInProgress.empty()) {
 // Initialize data value.
 if (emitIntegralValues(Size))
@@ -3825,6 +3833,14 @@ bool MasmParser::parseDirectiveNamedValue(StringRef Ty
 bool MasmParser::parseDirectiveNamedValue(StringRef TypeName, unsigned Size,
   StringRef Name, SMLoc NameLoc) {
   if (StructInProgress.empty()) {
+// Check for directive inside .text
+const MCSection *Section = getStreamer().getCurrentSectionOnly();
+if (Section != NULL) {
+  const StringRef sectionName = Section->getName();
+  if (sectionName.equals(".text") || sectionName.startswith(".text."))
+Warning(getLexer().getLoc(), "directive named value '" + Twine(Name)
+  + "' inside '" + Twine(sectionName) + "' section");
+}
 // Initialize named data value.
 MCSymbol *Sym = getContext().getOrCreateSymbol(Name);
 getStreamer().emitLabel(Sym);



Re: patch: change swblk_t type and use it in blist

2022-08-06 Thread Sebastien Marie
On Sat, Aug 06, 2022 at 02:19:31AM +0200, Jeremie Courreges-Anglas wrote:
> On Fri, Aug 05 2022, Sebastien Marie  wrote:
> > Hi,
> >
> > When initially ported blist from DragonFlyBSD, we used custom type bsblk_t 
> > and 
> > bsbmp_t instead of the one used by DragonFlyBSD (swblk_t and u_swblk_t).
> >
> > The reason was swblk_t is already defined on OpenBSD, and was incompatible 
> > with 
> > blist (int32_t). It is defined, but not used (outside some regress file 
> > which 
> > seems to be not affected by type change).
> >
> > This diff changes the __swblk_t definition in sys/_types.h to be 'unsigned 
> > long', and switch back blist to use swblk_t (and u_swblk_t, even if it 
> > isn't 
> > 'unsigned swblk_t').
> >
> > It makes the diff with DragonFlyBSD more thin. I added a comment with the 
> > git id 
> > used for the initial port.
> >
> > I tested it on i386 and amd64 (kernel and userland).
> >
> > By changing bitmap type from 'u_long' to 'u_swblk_t' ('u_int64_t'), it 
> > makes the 
> > regress the same on 64 and 32bits archs (and it success on both).
> >
> > Comments or OK ?
> 
> This seems fair, but maybe we should just zap the type from sys/types.h and
> define it only in sys/blist.h, as done in DragonflyBSD?

Yes, it makes a lot of sense.

Updated diff below.
 
> I'm building a release on amd64 with the type removed (also from
> regress).  I don't expect fallout in ports (and I can take care of it if
> there is any).

I also have a release building on i386.

Thanks.
-- 
Sebastien Marie

diff /home/semarie/repos/openbsd/src
commit - 73f52ef7130cefbe5a8fe028eedaad0e54be7303
path + /home/semarie/repos/openbsd/src
blob - 4cf6259417df583dadc5d63e7bb1753628eb8b50
file + sys/kern/subr_blist.c
--- sys/kern/subr_blist.c
+++ sys/kern/subr_blist.c
@@ -1,4 +1,5 @@
 /* $OpenBSD: subr_blist.c,v 1.1 2022/07/29 17:47:12 semarie Exp $ */
+/* DragonFlyBSD:7b80531f545c7d3c51c1660130c71d01f6bccbe0:/sys/kern/subr_blist.c */
 /*
  * BLIST.C -   Bitmap allocator/deallocator, using a radix tree with hinting
  * 
@@ -133,29 +134,29 @@
  * static support functions
  */
 
-static bsblk_t blst_leaf_alloc(blmeta_t *scan, bsblk_t blkat,
-   bsblk_t blk, bsblk_t count);
-static bsblk_t blst_meta_alloc(blmeta_t *scan, bsblk_t blkat,
-   bsblk_t blk, bsblk_t count,
-   bsblk_t radix, bsblk_t skip);
-static void blst_leaf_free(blmeta_t *scan, bsblk_t relblk, bsblk_t count);
-static void blst_meta_free(blmeta_t *scan, bsblk_t freeBlk, bsblk_t count,
-   bsblk_t radix, bsblk_t skip,
-   bsblk_t blk);
-static bsblk_t blst_leaf_fill(blmeta_t *scan, bsblk_t blk, bsblk_t count);
-static bsblk_t blst_meta_fill(blmeta_t *scan, bsblk_t fillBlk, bsblk_t count,
-   bsblk_t radix, bsblk_t skip,
-   bsblk_t blk);
-static void blst_copy(blmeta_t *scan, bsblk_t blk, bsblk_t radix,
-   bsblk_t skip, blist_t dest, bsblk_t count);
-static bsblk_t blst_radix_init(blmeta_t *scan, bsblk_t radix,
-   bsblk_t skip, bsblk_t count);
-static int blst_radix_gapfind(blmeta_t *scan, bsblk_t blk, bsblk_t radix, bsblk_t skip,
-int state, bsblk_t *maxbp, bsblk_t *maxep, bsblk_t *bp, bsblk_t *ep);
+static swblk_t blst_leaf_alloc(blmeta_t *scan, swblk_t blkat,
+   swblk_t blk, swblk_t count);
+static swblk_t blst_meta_alloc(blmeta_t *scan, swblk_t blkat,
+   swblk_t blk, swblk_t count,
+   swblk_t radix, swblk_t skip);
+static void blst_leaf_free(blmeta_t *scan, swblk_t relblk, swblk_t count);
+static void blst_meta_free(blmeta_t *scan, swblk_t freeBlk, swblk_t count, 
+   swblk_t radix, swblk_t skip,
+   swblk_t blk);
+static swblk_t blst_leaf_fill(blmeta_t *scan, swblk_t blk, swblk_t count);
+static swblk_t blst_meta_fill(blmeta_t *scan, swblk_t fillBlk, swblk_t count,
+   swblk_t radix, swblk_t skip,
+   swblk_t blk);
+static void blst_copy(blmeta_t *scan, swblk_t blk, swblk_t radix,
+   swblk_t skip, blist_t dest, swblk_t count);
+static swblk_t blst_radix_init(blmeta_t *scan, swblk_t radix,
+   swblk_t skip, swblk_t count);
+static int blst_radix_gapfind(blmeta_t *scan, swblk_t blk, swblk_t radix, swblk_t skip,
+int state, swblk_t *maxbp, swblk_t *maxep, swblk_t *bp, swblk_t *ep);
 
 #if defined(BLIST_DEBUG) || defined(DDB)
-static void

patch: change swblk_t type and use it in blist

2022-08-05 Thread Sebastien Marie
Hi,

When we initially ported blist from DragonFlyBSD, we used the custom types 
bsblk_t and bsbmp_t instead of the ones used by DragonFlyBSD (swblk_t and 
u_swblk_t).

The reason was that swblk_t is already defined on OpenBSD, and was incompatible 
with blist (int32_t). It is defined, but not used (outside some regress files 
which seem to be unaffected by the type change).

This diff changes the __swblk_t definition in sys/_types.h to be 'unsigned 
long', and switches blist back to using swblk_t (and u_swblk_t, even if it isn't 
'unsigned swblk_t').

It makes the diff with DragonFlyBSD smaller. I added a comment with the git id 
used for the initial port.

I tested it on i386 and amd64 (kernel and userland).

Changing the bitmap type from 'u_long' to 'u_swblk_t' ('u_int64_t') makes the 
regress behave the same on 64-bit and 32-bit archs (and it succeeds on both).

Comments or OK ?
-- 
Sebastien Marie

diff /home/semarie/repos/openbsd/src
commit - 73f52ef7130cefbe5a8fe028eedaad0e54be7303
path + /home/semarie/repos/openbsd/src
blob - e05867429cdd81c434f9ca589c1fb8c6d25957f8
file + sys/sys/_types.h
--- sys/sys/_types.h
+++ sys/sys/_types.h
@@ -60,7 +60,7 @@ typedef   __uint8_t   __sa_family_t;  /* sockaddr address f
 typedef __int32_t   __segsz_t;  /* segment size */
 typedef __uint32_t  __socklen_t;/* length type for network syscalls */
 typedef long        __suseconds_t;  /* microseconds (signed) */
-typedef __int32_t   __swblk_t;  /* swap offset */
+typedef unsigned long   __swblk_t;  /* swap offset */
 typedef __int64_t   __time_t;   /* epoch time */
 typedef __int32_t   __timer_t;  /* POSIX timer identifiers */
 typedef __uint32_t  __uid_t;/* user id */
blob - 102ca95dd45ba6d9cab0f3fcbb033d6043ec1606
file + sys/sys/blist.h
--- sys/sys/blist.h
+++ sys/sys/blist.h
@@ -1,4 +1,5 @@
 /* $OpenBSD: blist.h,v 1.1 2022/07/29 17:47:12 semarie Exp $ */
+/* DragonFlyBSD:7b80531f545c7d3c51c1660130c71d01f6bccbe0:/sys/sys/blist.h */
 /*
  * Copyright (c) 2003,2004 The DragonFly Project.  All rights reserved.
  * 
@@ -65,15 +66,13 @@
 #include 
 #endif
 
-#define SWBLK_BITS 64
-typedef u_long bsbmp_t;
-typedef u_long bsblk_t;
+typedef u_int64_t  u_swblk_t;
 
 /*
  * note: currently use SWAPBLK_NONE as an absolute value rather then
  * a flag bit.
  */
-#define SWAPBLK_NONE   ((bsblk_t)-1)
+#define SWAPBLK_NONE   ((swblk_t)-1)
 
 /*
  * blmeta and bl_bitmap_t MUST be a power of 2 in size.
@@ -81,39 +80,39 @@ typedef u_long bsblk_t;
 
 typedef struct blmeta {
union {
-   bsblk_t bmu_avail;  /* space available under us */
-   bsbmp_t bmu_bitmap; /* bitmap if we are a leaf  */
+   swblk_t bmu_avail;  /* space available under us */
+   u_swblk_t   bmu_bitmap; /* bitmap if we are a leaf  */
} u;
-   bsblk_t bm_bighint; /* biggest contiguous block hint*/
+   swblk_t bm_bighint; /* biggest contiguous block hint*/
 } blmeta_t;
 
 typedef struct blist {
-   bsblk_t bl_blocks;  /* area of coverage */
+   swblk_t bl_blocks;  /* area of coverage */
/* XXX int64_t bl_radix */
-   bsblk_t bl_radix;   /* coverage radix   */
-   bsblk_t bl_skip;/* starting skip*/
-   bsblk_t bl_free;/* number of free blocks*/
+   swblk_t bl_radix;   /* coverage radix   */
+   swblk_t bl_skip;/* starting skip*/
+   swblk_t bl_free;/* number of free blocks*/
blmeta_t*bl_root;   /* root of radix tree   */
-   bsblk_t bl_rootblks;/* bsblk_t blks allocated for tree */
+   swblk_t bl_rootblks;/* swblk_t blks allocated for tree */
 } *blist_t;
 
-#define BLIST_META_RADIX   (sizeof(bsbmp_t)*8/2)   /* 2 bits per */
-#define BLIST_BMAP_RADIX   (sizeof(bsbmp_t)*8) /* 1 bit per */
+#define BLIST_META_RADIX   (sizeof(u_swblk_t)*8/2) /* 2 bits per */
+#define BLIST_BMAP_RADIX   (sizeof(u_swblk_t)*8)   /* 1 bit per */
 
 /*
  * The radix may exceed the size of a 64 bit signed (or unsigned) int
- * when the maximal number of blocks is allocated.  With a 32-bit bsblk_t
+ * when the maximal number of blocks is allocated.  With a 32-bit swblk_t
  * this corresponds to ~1G x PAGE_SIZE = 4096GB.  The swap code usually
  * divides this by 4, leaving us with a capability of up to four 1TB swap
  * devices.
  *
- * With a 64-bit bsblk_t the limitation is some insane number.
+ * With a 64-bit swblk_t the limitation is some insane number.
  *
  * NOTE: For now I don't trust that we overflow-detect properly so we divide
  *  out to ensure that no overflow occurs.
  */
 
-#if SWBLK_BITS == 64
+#if defined(_LP64)
 #define

Re: Introduce uvm_pagewait()

2022-07-11 Thread Sebastien Marie
  continue;
>   }
> Index: uvm/uvm_page.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_page.c,v
> retrieving revision 1.166
> diff -u -p -r1.166 uvm_page.c
> --- uvm/uvm_page.c12 May 2022 12:48:36 -  1.166
> +++ uvm/uvm_page.c28 Jun 2022 11:57:42 -
> @@ -1087,6 +1087,23 @@ uvm_page_unbusy(struct vm_page **pgs, in
>   }
>  }
>  
> +/*
> + * uvm_pagewait: wait for a busy page
> + *
> + * => page must be known PG_BUSY
> + * => object must be locked
> + * => object will be unlocked on return
> + */
> +void
> +uvm_pagewait(struct vm_page *pg, struct rwlock *lock, const char *wmesg)
> +{
> + KASSERT(rw_lock_held(lock));

The KASSERT(rw_lock_held(lock)) shouldn't be required: rwsleep_nsec() will
already call rw_assert_anylock() (via the rwsleep function).

rw_status() (used by both rw_lock_held and rw_assert_anylock) can only return
4 values:
 - 0 (not locked)
 - RW_READ
 - RW_WRITE
 - RW_WRITE_OTHER

Using KASSERT(rw_lock_held(lock)) or rw_assert_anylock() permits only RW_READ 
or RW_WRITE (it asserts on 0 and on RW_WRITE_OTHER).

> + KASSERT((pg->pg_flags & PG_BUSY) != 0);
> +
> + atomic_setbits_int(&pg->pg_flags, PG_WANTED);
> + rwsleep_nsec(pg, lock, PVM | PNORELOCK, wmesg, INFSLP);
> +}
> +
>  #if defined(UVM_PAGE_TRKOWN)
>  /*
>   * uvm_page_own: set or release page ownership
> Index: uvm/uvm_page.h
> ===
> RCS file: /cvs/src/sys/uvm/uvm_page.h,v
> retrieving revision 1.68
> diff -u -p -r1.68 uvm_page.h
> --- uvm/uvm_page.h12 May 2022 12:48:36 -  1.68
> +++ uvm/uvm_page.h28 Jun 2022 11:53:08 -
> @@ -233,7 +233,7 @@ void  uvm_pagefree(struct vm_page *);
>  void uvm_page_unbusy(struct vm_page **, int);
>  struct vm_page   *uvm_pagelookup(struct uvm_object *, voff_t);
>  void uvm_pageunwire(struct vm_page *);
> -void uvm_pagewait(struct vm_page *, int);
> +void uvm_pagewait(struct vm_page *, struct rwlock *, const char *);
>  void uvm_pagewake(struct vm_page *);
>  void uvm_pagewire(struct vm_page *);
>  void uvm_pagezero(struct vm_page *);
> Index: uvm/uvm_vnode.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
> retrieving revision 1.124
> diff -u -p -r1.124 uvm_vnode.c
> --- uvm/uvm_vnode.c   3 May 2022 21:20:35 -   1.124
> +++ uvm/uvm_vnode.c   28 Jun 2022 11:53:08 -
> @@ -684,11 +684,10 @@ uvn_flush(struct uvm_object *uobj, voff_
>   }
>   } else if (flags & PGO_FREE) {
>   if (pp->pg_flags & PG_BUSY) {
> - atomic_setbits_int(&pp->pg_flags,
> - PG_WANTED);
>   uvm_unlock_pageq();
> - rwsleep_nsec(pp, uobj->vmobjlock, PVM,
> - "uvn_flsh", INFSLP);
> + uvm_pagewait(pp, uobj->vmobjlock,
> + "uvn_flsh");
> + rw_enter(uobj->vmobjlock, RW_WRITE);
>   uvm_lock_pageq();
>   curoff -= PAGE_SIZE;
>   continue;
> @@ -1054,9 +1053,8 @@ uvn_get(struct uvm_object *uobj, voff_t 
>  
>   /* page is there, see if we need to wait on it */
>   if ((ptmp->pg_flags & PG_BUSY) != 0) {
> - atomic_setbits_int(&ptmp->pg_flags, PG_WANTED);
> - rwsleep_nsec(ptmp, uobj->vmobjlock, PVM,
> - "uvn_get", INFSLP);
> + uvm_pagewait(ptmp, uobj->vmobjlock, "uvn_get");
> + rw_enter(uobj->vmobjlock, RW_WRITE);
>   continue;   /* goto top of pps while loop */
>   }
>  
> 

Thanks.
-- 
Sebastien Marie



patch: xlock: unveil issue: login.conf

2022-07-06 Thread Sebastien Marie
Hi,

I am currently debugging some unveil issues reported when process accounting is 
enabled (see acct(2)).

xlock currently triggers some unveil(2) violations:

$ lastcomm | grep U
xlock-FU semarie  __ 0.00 secs Wed Jul  6 07:13 (1:21:00.00)

I tracked them down to "login.conf" access (due to auth_userokay(3) usage).

The diff below adds all "login.conf" related files to the set of files readable 
by the process:

- /etc/login.conf
- /etc/login.conf.db
- /etc/login.conf.d/*


diff 7f83513b277728e78b173796466b04c2373f0b55 /home/semarie/repos/openbsd/xenocara
blob - 7fdf4f11d18bdb1bab730f008ef7ea10e0e482ca
file + app/xlockmore/xlock/privsep.c
--- app/xlockmore/xlock/privsep.c
+++ app/xlockmore/xlock/privsep.c
@@ -255,8 +255,14 @@ priv_init(gid_t gid)
 
imsg_init(_ibuf, socks[0]);
 
+   if (unveil(_PATH_LOGIN_CONF, "r") == -1)
+   err(1, "unveil %s", _PATH_LOGIN_CONF);
+   if (unveil(_PATH_LOGIN_CONF ".db", "r") == -1)
+   err(1, "unveil %s.db", _PATH_LOGIN_CONF);
+   if (unveil(_PATH_LOGIN_CONF_D, "r") == -1)
+   err(1, "unveil %s", _PATH_LOGIN_CONF_D);
if (unveil(_PATH_AUTHPROGDIR, "rx") == -1)
-   err(1, "unveil");
+   err(1, "unveil %s", _PATH_AUTHPROGDIR);
if (pledge("stdio rpath getpw proc exec", NULL) == -1)
    err(1, "pledge");


With it, I no longer get any unveil(2) violation when running xlock(1).
 
Comments or OK ?
-- 
Sebastien Marie



Re: acpitz(4): perform passive cooling only when perfpolicy is AUTO

2022-06-27 Thread Sebastien Marie
On Mon, Jun 27, 2022 at 11:47:58PM +0200, Stefan Hagen wrote:
> Bryan Steele wrote (2022-06-27 23:12 CEST):
> > On Mon, Jun 27, 2022 at 11:01:31PM +0200, Stefan Hagen wrote:
> > > acpitz(4) implements passive cooling, which starts throttling the CPU to 
> > > keep it under the temperature reported by the _PSV trip point.
> > > 
> > > https://uefi.org/specs/ACPI/6.4/11_Thermal_Management/thermal-control.html
> > > 
> > > The specs (1.1.5.1) leave the decision to activate passive cooling to
> > > the OS. It is a way to limit noise and heat rather than to protect the
> > > CPU (for which _HOT and _CRT are the better trip points).
> > > 
> > > I would like to restrict passive cooling to the AUTO perfpolicy.
> > > 
> > > For low (apm -L) and high (apm -H) it doesn't make much sense, because
> > >  - low is setting a low pstate anyway
> > >  - high is probably never set with the intention to have a cool and 
> > >quiet machine.
> > 
> > Shouldn't this also take into consideration hw.power as well? If it
> > doesn't make sense for perfpolicy=high then it probably doesn't for
> > perfpolicy=auto when on AC power?
> 
> I would say yes.

No particular opinion for now regarding the logic, but please avoid magic 
numbers when a constant/define exists.

The AUTO perfpolicy is PERFPOL_AUTO (which is 1).

Thanks.
 
> Index: sys/dev/acpi/acpitz.c
> ===
> RCS file: /home/cvs/src/sys/dev/acpi/acpitz.c,v
> retrieving revision 1.58
> diff -u -p -r1.58 acpitz.c
> --- sys/dev/acpi/acpitz.c 6 Apr 2022 18:59:27 -   1.58
> +++ sys/dev/acpi/acpitz.c 27 Jun 2022 21:23:06 -
> @@ -89,7 +89,9 @@ voidacpitz_init(struct acpitz_softc *, 
>  void (*acpitz_cpu_setperf)(int);
>  int  acpitz_perflevel = -1;
>  extern void  (*cpu_setperf)(int);
> +extern int   hw_power;
>  extern int   perflevel;
> +extern int   perfpolicy;
>  #define PERFSTEP 10
>  
>  #define ACPITZ_TRIPS (1L << 0)
> @@ -381,7 +383,7 @@ acpitz_refresh(void *arg)
>   sc->sc_tc1, sc->sc_tc2, sc->sc_psv);
>  
>   nperf = acpitz_perflevel;
> - if (sc->sc_psv <= sc->sc_tmp) {
> + if (sc->sc_psv <= sc->sc_tmp && perfpolicy == 1 && hw_power == 0) {
>   /* Passive cooling enabled */
>   dnprintf(1, "%s: enabling passive %d %d\n",
>   DEVNAME(sc), sc->sc_tmp, sc->sc_psv);
> 

-- 
Sebastien Marie



Re: match recent Intel CPUs in fw_update(8)

2022-06-21 Thread Sebastien Marie
On Tue, Jun 21, 2022 at 02:58:46PM +1000, Jonathan Gray wrote:
> Intel CPUs used to have strings like
> cpu0: Intel(R) Pentium(R) M processor 1200MHz ("GenuineIntel" 686-class) 1.20 
> GHz
> cpu0: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz, 2494.61 MHz, 06-3d-04
> recent CPUs use
> cpu0: 11th Gen Intel(R) Core(TM) i5-1130G7 @ 1.10GHz, 30009.37 MHz, 06-8c-01
> cpu0: 12th Gen Intel(R) Core(TM) i5-12400, 4390.71 MHz, 06-97-02
> cpu0: 12th Gen Intel(R) Core(TM) i7-1260P, 1995.55 MHz, 06-9a-03
> 
> Index: patterns.c
> ===
> RCS file: /cvs/src/usr.sbin/fw_update/patterns.c,v
> retrieving revision 1.3
> diff -u -p -r1.3 patterns.c
> --- patterns.c10 Mar 2022 07:12:13 -  1.3
> +++ patterns.c21 Jun 2022 04:31:24 -
> @@ -94,7 +94,7 @@ main(void)
>   printf("%s\n", "bwfm");
>   printf("%s\n", "bwi");
>   printf("%s\n", "intel");
> - printf("%s\n", "intel ^cpu0: Intel");
> + printf("%s\n", "intel ^cpu0:*Intel(R)");
>   printf("%s\n", "inteldrm");
>   print_devices("inteldrm", i915_devices, nitems(i915_devices));
>   printf("%s\n", "ipw");
> 

If I understood properly how the patterns file is used in fw_update, having a 
'*' in the middle of the search string could be hazardous and could result in 
false positives (installing firmware that is not needed).

With the pattern "intel ^cpu0:*Intel(R)", the search is done using:

_d="intel"
_m="^cpu0:*Intel(R)"
_nl='
' # newline: \n

[ "$_m" ] || _m="${_nl}${_d}[0-9] at "  # no change
[ "$_m" = "${_m#^}" ] || _m="${_nl}${_m#^}" # changed to _m="\ncpu0:*Intel(R)"

if [[ ${_dmesgtail} = *$_m* ]] ; then
	echo "$_d"
fi

The final search string (glob) is "*\ncpu0:*Intel(R)*".

It means that if the dmesg contains "\ncpu0:" and "Intel(R)" (in this order, 
but not necessarily on the same line), it will match.

On dmesglog, I found only a few candidates for an "Intel(R)" string that is not 
on a "cpuX:" line:

bios0: Intel(R) Client Systems NUC8i5BEH
ugen0 at uhub2 port 2 "Intel Corporation Intel(R) Centrino(R) Wireless-N + WiMAX 6150" rev 2.00/0.00 addr 3
ugen0 at uhub2 port 6 "Intel(R) Corporation Intel(R) Centrino(R) Advanced-N + WiMAX 6250" rev 2.00/0.00 addr 4
uhidev0 at uhub2 port 1 configuration 1 interface 0 "Intel Corporation Intel(R) Sensor Solution" rev 2.00/0.01 addr 3

The bios0 entry is *before* cpu0, so it will not match. But the other entries 
will match (assuming the dmesg contains a cpu0: line, which is expected).
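
The false positive can be reproduced outside of fw_update with a small sketch. It is not the fw_update code: the function name matches is made up, case is used instead of [[ ]] so that it runs in any POSIX shell, and the cpu0 line of the sample dmesg is invented (only the ugen0 line comes from the list above).

```shell
#!/bin/sh
# Sketch of the matching step only; not the real fw_update code.
nl='
'

# matches PATTERN DMESG: succeed if DMESG matches the glob *PATTERN*.
# A leading ^ in PATTERN is rewritten to a newline, as described above.
matches() {
	_m=$1
	[ "$_m" = "${_m#^}" ] || _m="${nl}${_m#^}"
	case $2 in
	*$_m*) return 0 ;;	# unquoted on purpose: glob chars stay active
	esac
	return 1
}

# Made-up machine: non-Intel CPU, but an Intel USB device later in the dmesg.
dmesg="${nl}cpu0: Some Other CPU, 2000.00 MHz${nl}ugen0 at uhub2 port 2 \"Intel Corporation Intel(R) Centrino(R) Wireless-N + WiMAX 6150\" rev 2.00/0.00 addr 3"

matches '^cpu0:*Intel(R)' "$dmesg" && echo "wildcard pattern: matches (false positive)"
matches '^cpu0: Intel' "$dmesg" || echo "anchored pattern: no match"
```

The '*' happily spans the newline between the cpu0 line and the ugen0 line, which is why a pattern without '*' in the middle is safer.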

Below is an alternative diff to match the Intel CPUs without using '*':

diff 5177244f5c04ec0ff6b9b724b540740d62d2dcfa /home/semarie/repos/openbsd/src
blob - d7cbab70c1654d8dc6ca067cea55b440fd70d426
file + usr.sbin/fw_update/patterns.c
--- usr.sbin/fw_update/patterns.c
+++ usr.sbin/fw_update/patterns.c
@@ -95,6 +95,7 @@ main(void)
printf("%s\n", "bwi");
printf("%s\n", "intel");
printf("%s\n", "intel ^cpu0: Intel");
+   printf("%s\n", "intel ^cpu0: [0-9][0-9]th Gen Intel");
printf("%s\n", "inteldrm");
print_devices("inteldrm", i915_devices, nitems(i915_devices));
printf("%s\n", "ipw");

Comments ?
-- 
Sebastien Marie



Re: vers.c: make kernel date in UTC

2022-05-02 Thread Sebastien Marie
Just a simple ping.

I already have one formal ok, and bluhm@ said it would make his life easier.

Any objections ?
-- 
Sebastien Marie

On Sat, Apr 23, 2022 at 09:20:06AM +0200, Sebastien Marie wrote:
> Hi,
> 
> When I want to check if a particular commit is expected to be present in a 
> particular kernel, I usually check the dates.
> 
> $ what bsd.mp
>   OpenBSD 7.1-current (GENERIC.MP) #481: Thu Apr 21 21:11:42 MDT 2022
> 
> $ cvs log dev/usb/if_ral.c
> [...]
> 
> revision 1.149
> date: 2022/04/21 21:03:03;  author: stsp;  state: Exp;  lines: +2 -3;  
> commitid: 71y4MGxVttLEpbpK;
> Use memset() to initialize struct ieee80211_rxinfo properly.
> [...]
> 
> 
> The problem is cvs (or got) is using date using UTC whereas the kernel is 
> using 
> localized date (here MDT, but could be CEST in my case too if I build the 
> kernel 
> myself).
> 
> Whereas I know that the build time isn't necessary a perfect indication of 
> the 
> presence of a particular commit (specifically if the time interval is small), 
> it 
> might be more simple to compare date if both are in the same timezone.
> 
> Currently, I am using such command to convert MDT to UTC:
> 
> $ TZ=Canada/Mountain date -j -z UTC 202204212111
> Fri Apr 22 03:11:00 UTC 2022
> 
> but the input format (202204212111 for 'Apr 21 21:11:42 MDT 2022') isn't 
> particulary straightfor to write.
> 
> Does such diff to force UTC timezone in kernel buildate would be acceptable ?
> 
> Please note that both cvs(1) and got(1) are ignoring TZ environment variable, 
> so 
> I can't get them to show directly MDT dates.
> 
> Thanks.
> -- 
> Sebastien Marie
> 
> 
diff 62198fa5a9d005ca1c651b3df2c33ce50d333b27 /home/semarie/repos/openbsd/src
blob - ab97ce4c59639a6b357120b9a953c3584211aea2
file + sys/conf/newvers.sh
--- sys/conf/newvers.sh
+++ sys/conf/newvers.sh
@@ -40,7 +40,7 @@ then
 fi
 
 touch version
-v=`cat version` u=`logname` d=${PWD%/obj} h=`hostname` t=`date`
+v=`cat version` u=`logname` d=${PWD%/obj} h=`hostname` t=`date -z UTC`
 id=`basename "${d}"`
 
 # additional things which need version number upgrades:



Re: uvmpd_scan(): Recheck PG_BUSY after locking the page

2022-04-29 Thread Sebastien Marie
On Thu, Apr 28, 2022 at 12:28:45PM +0200, Martin Pieuchot wrote:
> rw_enter(9) can sleep.  When the lock is finally acquired by the
> pagedaemon the previous check might no longer be true and the page
> could be busy.  In this case we shouldn't touch it.
> 
> Diff below recheck for PG_BUSY after acquiring the lock and also
> use a variable for the lock to reduce the differences with NetBSD.
> 
> ok?

yep. ok semarie@

Just a cosmetic comment below, but if it makes the code diverge from NetBSD, 
feel free to ignore it.
 
> Index: uvm/uvm_pdaemon.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
> retrieving revision 1.96
> diff -u -p -r1.96 uvm_pdaemon.c
> --- uvm/uvm_pdaemon.c 11 Apr 2022 16:43:49 -  1.96
> +++ uvm/uvm_pdaemon.c 28 Apr 2022 10:22:52 -
> @@ -879,6 +879,8 @@ uvmpd_scan(void)
>   int free, inactive_shortage, swap_shortage, pages_freed;
>   struct vm_page *p, *nextpg;
>   struct uvm_object *uobj;
> + struct vm_anon *anon;
> + struct rwlock *slock;
>   boolean_t got_it;
>  
>   MUTEX_ASSERT_LOCKED();
> @@ -947,20 +949,34 @@ uvmpd_scan(void)
>p != NULL && (inactive_shortage > 0 || swap_shortage > 0);
>p = nextpg) {
>   nextpg = TAILQ_NEXT(p, pageq);
> -
> - /* skip this page if it's busy. */
> - if (p->pg_flags & PG_BUSY)
> + if (p->pg_flags & PG_BUSY) {
>   continue;
> + }
>  
> - if (p->pg_flags & PQ_ANON) {
> - KASSERT(p->uanon != NULL);
> - if (rw_enter(p->uanon->an_lock, RW_WRITE|RW_NOSLEEP))
> + /*
> +  * lock the page's owner.
> +  */
> + if (p->uobject != NULL) {
> + uobj = p->uobject;
> + slock = uobj->vmobjlock;
> + if (rw_enter(slock, RW_WRITE|RW_NOSLEEP)) {
>   continue;
> + }
>   } else {
> - KASSERT(p->uobject != NULL);
> - if (rw_enter(p->uobject->vmobjlock,
> - RW_WRITE|RW_NOSLEEP))
> + anon = p->uanon;
> + KASSERT(p->uanon != NULL);
> + slock = anon->an_lock;
> + if (rw_enter(slock, RW_WRITE|RW_NOSLEEP)) {
>   continue;
> + }
> + }

I would move the rw_enter() call after the if/else block, for better 
readability:

	if (p->uobject != NULL) {
		uobj = p->uobject;
		slock = uobj->vmobjlock;
	} else {
		anon = p->uanon;
		KASSERT(p->uanon != NULL);
		slock = anon->an_lock;
	}

	if (rw_enter(slock, RW_WRITE|RW_NOSLEEP)) {
		continue;
	}

> +
> + /*
> +  * skip this page if it's busy.
> +  */
> + if ((p->pg_flags & PG_BUSY) != 0) {
> + rw_exit(slock);
> + continue;
>   }
>  
>   /*
> @@ -997,10 +1013,11 @@ uvmpd_scan(void)
>   uvmexp.pddeact++;
>   inactive_shortage--;
>   }
> -     if (p->pg_flags & PQ_ANON)
> - rw_exit(p->uanon->an_lock);
> - else
> - rw_exit(p->uobject->vmobjlock);
> +
> + /*
> +  * we're done with this page.
> +  */
> + rw_exit(slock);
>   }
>  }
>  
> 

-- 
Sebastien Marie



Re: Use vgonel() in vop_generic_revoke

2022-04-27 Thread Sebastien Marie
On Wed, Apr 27, 2022 at 12:27:13PM +0200, Claudio Jeker wrote:
> This is just a mini cleanup. Switch from vgone() to vgonel() like it is
> done a bit later already. vgone() is just a wrapper around vgonel() using
> curproc (which is cached in vop_generic_revoke()).
> 
> -- 
> :wq Claudio

ok semarie@
 
> Index: vfs_default.c
> ===
> RCS file: /cvs/src/sys/kern/vfs_default.c,v
> retrieving revision 1.50
> diff -u -p -r1.50 vfs_default.c
> --- vfs_default.c 15 Oct 2021 06:30:06 -  1.50
> +++ vfs_default.c 18 Mar 2022 10:56:02 -
> @@ -107,7 +107,7 @@ vop_generic_revoke(void *v)
>   if (vq->v_rdev != vp->v_rdev ||
>   vq->v_type != vp->v_type || vp == vq)
>   continue;
> - vgone(vq);
> + vgonel(vq, p);
>       break;
>   }
>   }
> 

-- 
Sebastien Marie



vers.c: make kernel date in UTC

2022-04-23 Thread Sebastien Marie
Hi,

When I want to check if a particular commit is expected to be present in a 
particular kernel, I usually check the dates.

$ what bsd.mp
OpenBSD 7.1-current (GENERIC.MP) #481: Thu Apr 21 21:11:42 MDT 2022

$ cvs log dev/usb/if_ral.c
[...]

revision 1.149
date: 2022/04/21 21:03:03;  author: stsp;  state: Exp;  lines: +2 -3;  
commitid: 71y4MGxVttLEpbpK;
Use memset() to initialize struct ieee80211_rxinfo properly.
[...]


The problem is that cvs (or got) reports dates in UTC, whereas the kernel uses
the local timezone of the build machine (here MDT, but it would be CEST if I
built the kernel myself).

I know that the build time isn't necessarily a perfect indication of the
presence of a particular commit (specifically if the time interval is small),
but it would be simpler to compare dates if both were in the same timezone.

Currently, I use a command like this to convert MDT to UTC:

$ TZ=Canada/Mountain date -j -z UTC 202204212111
Fri Apr 22 03:11:00 UTC 2022

but the input format (202204212111 for 'Apr 21 21:11:42 MDT 2022') isn't
particularly straightforward to write.
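
For illustration only, the same conversion can be sketched in C. This is not
part of the proposed diff: local_to_utc() is a hypothetical helper, and the
POSIX TZ string "MDT6" (a fixed offset of six hours west of UTC, with no DST
rule) is an assumption standing in for the builder's real timezone.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/*
 * Interpret a wall-clock time as being in the fixed-offset zone described
 * by the POSIX TZ string `tz` (for example "MDT6") and return the matching
 * UTC broken-down time.  Hypothetical helper, for illustration only.
 */
static struct tm
local_to_utc(const char *tz, int year, int mon, int mday,
    int hour, int min, int sec)
{
	struct tm local = { 0 }, utc;
	time_t t;

	assert(tz != NULL);

	/* Make mktime() use the requested zone instead of the system one. */
	setenv("TZ", tz, 1);
	tzset();

	local.tm_year = year - 1900;	/* struct tm years count from 1900 */
	local.tm_mon = mon - 1;		/* struct tm months are 0-based */
	local.tm_mday = mday;
	local.tm_hour = hour;
	local.tm_min = min;
	local.tm_sec = sec;
	local.tm_isdst = 0;		/* "MDT6" declares no DST rule */

	t = mktime(&local);		/* local wall clock -> epoch seconds */
	assert(t != (time_t)-1);
	gmtime_r(&t, &utc);		/* epoch seconds -> UTC fields */
	return utc;
}
```

Called as local_to_utc("MDT6", 2022, 4, 21, 21, 11, 42), it should give the
UTC fields for Apr 22 03:11:42 2022, which lines up with the date(1) output
above (date(1) was given no seconds, hence :00 there).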

Would a diff like the following, which forces the UTC timezone for the kernel
build date, be acceptable ?

Please note that both cvs(1) and got(1) ignore the TZ environment variable, so
I can't get them to show MDT dates directly.

Thanks.
-- 
Sebastien Marie


diff 62198fa5a9d005ca1c651b3df2c33ce50d333b27 /home/semarie/repos/openbsd/src
blob - ab97ce4c59639a6b357120b9a953c3584211aea2
file + sys/conf/newvers.sh
--- sys/conf/newvers.sh
+++ sys/conf/newvers.sh
@@ -40,7 +40,7 @@ then
 fi
 
 touch version
-v=`cat version` u=`logname` d=${PWD%/obj} h=`hostname` t=`date`
+v=`cat version` u=`logname` d=${PWD%/obj} h=`hostname` t=`date -z UTC`
 id=`basename "${d}"`
 
 # additional things which need version number upgrades:



Re: vfs: document (and correct) the protection required for manipulating v_numoutput

2022-04-12 Thread Sebastien Marie
On Sun, Mar 27, 2022 at 03:36:20PM +0200, Sebastien Marie wrote:
> Hi,
> 
> v_numoutput is a struct member of vnode which is used to keep track the 
> number 
> of writes in progress.
> 
> in several function comments, it is marked as "Manipulates v_numoutput. Must 
> be 
> called at splbio()".
> 
> So I added a "[B]" mark in the comment to properly document the need of 
> IPL_BIO 
> protection.
> 
> Next, I audited the tree for usage. I found 2 occurrences of v_numoutput 
> (modification) without the required protection, inside dev/softraid.c. I 
> added 
> them.
> 
> Comments or OK ?

anyone ?

the purpose of splbio() is to protect v_numoutput manipulation from interrupts.

for example, bread_cluster_callback() is called from interrupt context (per the
code documentation), and could modify v_numoutput (code path:
bread_cluster_callback -> biodone -> vwakeup).

to check that the protection is effectively required (i.e. that the code is not
already called at IPL_BIO level), I added some splassert() calls (the machine is
using softraid with encrypting discipline):

- for sr_rw:

splassert_check(6,81fde589,30859b2dea1d26e8,400,8000,8000) at 
splassert_check+0x4d
sr_rw(8041f000,400,80472000,8000,10,0) at sr_rw+0x1fe
sr_meta_save(80422000,1,628df3ebb476eac1,80422000,82712a68,0)
 at sr_meta_save+0x23d
sr_ioctl_createraid(8041f000,82712a68,0,822fe290,cae617dbeff84e1f,82712d50)
 at sr_ioctl_createraid+0xe03
sr_boot_assembly(8041f000,8041f000,e2eb5961e3789080,8041f328,8041f000,8229fd40)
 at sr_boot_assembly+0x936
sr_attach(0,8041f000,0,0,ab529b23aa62cd2b,0) at sr_attach+0x157
config_attach(0,822bff60,0,0,33d5fbe4fc2cb990,c00) at 
config_attach+0x1f4
main(0,0,c00,1001000,100,8000fa40) at main+0x4fb

splassert: sr_rw: want 6 have 0
Starting stack trace...
splassert_check(6,81fde589) at splassert_check+0x4d
sr_rw(8041f000,413,81a52000,8000,10,0) at sr_rw+0x1fe
sr_meta_save(81a0b000,1) at sr_meta_save+0x23d
sr_ioctl_createraid(8041f000,8199b800,1,0) at 
sr_ioctl_createraid+0xe03
sr_bio_handler(8041f000,0,c2e84226,8199b800) at 
sr_bio_handler+0x223
VOP_IOCTL(8000a7ae9730,c2e84226,8199b800,3,fd879e7e4f00,8000a76acd28)
 at VOP_IOCTL+0x5c
vn_ioctl(fd8746f39080,c2e84226,8199b800,8000a76acd28) at 
vn_ioctl+0x75
sys_ioctl(8000a76acd28,8000a7d277e0,8000a7d27830) at sys_ioctl+0x2c4
syscall(8000a7d278a0) at syscall+0x374
Xsyscall() at Xsyscall+0x128
end of kernel

- for sr_ccb_rw, I had some trouble getting a readable stack trace, as the
console is spammed so much that the stack traces are interleaved.

splassert: sr_ccb_rw: want 6 have 0
Starting stack trace...
splassert_check(6,81f0b74d) at splassert_check+0x4d
sr_ccb_rw(80422000,0,836b80,3800,800021bff000,1001,f0cbd67d55f39eb0)
 at sr_ccb_rw+0x194
sr_crypto_dev_rw(80423300,80423300) at sr_crypto_dev_rw+0x4a
sr_crypto_rw(80423300) at sr_crypto_rw+0xb6
sr_scsi_cmd(fd86af9f11c0) at sr_scsi_cmd+0x30c
scsi_xs_exec(fd86af9f11c0) at scsi_xs_exec+0x3f
sdstart(fd86af9f11c0) at sdstart+0x2d1
scsi_iopool_run(80422aa8) at scsi_iopool_run+0x159
scsi_xsh_runqueued86af9f11c0) at scsi_xs_exec+0x3f
sdstart(fd86af9f11c0) 0419d80) at scsi_xsh_add+0x91
sdstrategy(fd86b037fec8) at sdstrategy+0x112
spec_strategy(8000a7689c88) at spec_strategy+0x56
VOP_STRATEGY(8000fffda770,fd86b037fec8) at VOP_STRATEGY+0x4c
ufs_strategy(8000a7689d18) at ufs_strategy+0xe5
VOP_STRATEGY(8000a78bab98,fd86b037fec8) at VOP_STRATEGY+0x4c
bwrite(fd86b037fec8) at bwrite+0x138
ufs_dirremove(8000a78bab98,fd874cda31e0,800c,0) at ufs_dirremove+0x19f
ufs_remove(8000a7689e50) at ufs_remove+0xbe
VOP_REMOVE(80s_truncate+0x7d0
ufs_inactive(8000a7689d98) at ufs_inactive+
dounlinkat(8000a766e540,ff9c,146b649b778,0) at dounlinkat+0xbd
syscall(8000a768a070) at syscall+0x374
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7bffa0, count: 235
End of stack trace.



> -- 
> Sebastien Marie
> 
> Index: dev/softraid.c
> ===
> RCS file: /cvs/src/sys/dev/softraid.c,v
> retrieving revision 1.422
> diff -u -p -r1.422 softraid.c
> --- dev/softraid.c20 Mar 2022 13:14:02 -  1.422
> +++ dev/softraid.c27 Mar 2022 13:28:55 -
> @@ -437,8 +437,12 @@ sr_rw(struct sr_softc *sc, dev_t dev, ch
>   b.b_resid = bufsize;
>   b.b_vp = vp;
>  
> - if ((b.b_flags & B_READ) == 0)
> + if ((b.b_flags & B_READ) == 0) {
> + int s;
> + s = splbio();
>   

vfs: document (and correct) the protection required for manipulating v_numoutput

2022-03-27 Thread Sebastien Marie
Hi,

v_numoutput is a member of struct vnode which is used to keep track of the
number of writes in progress.

in several function comments, it is marked as "Manipulates v_numoutput. Must be 
called at splbio()".

So I added a "[B]" mark in the comment to properly document the need of IPL_BIO 
protection.

Next, I audited the tree for usage. I found 2 places in dev/softraid.c that
modify v_numoutput without the required protection, and added it there.

Comments or OK ?
-- 
Sebastien Marie

Index: dev/softraid.c
===
RCS file: /cvs/src/sys/dev/softraid.c,v
retrieving revision 1.422
diff -u -p -r1.422 softraid.c
--- dev/softraid.c  20 Mar 2022 13:14:02 -  1.422
+++ dev/softraid.c  27 Mar 2022 13:28:55 -
@@ -437,8 +437,12 @@ sr_rw(struct sr_softc *sc, dev_t dev, ch
b.b_resid = bufsize;
b.b_vp = vp;
 
-   if ((b.b_flags & B_READ) == 0)
+   if ((b.b_flags & B_READ) == 0) {
+   int s;
+   s = splbio();
vp->v_numoutput++;
+   splx(s);
+   }
 
LIST_INIT(&b.b_dep);
VOP_STRATEGY(vp, &b);
@@ -2006,8 +2010,12 @@ sr_ccb_rw(struct sr_discipline *sd, int 
ccb->ccb_buf.b_vp = sc->src_vn;
ccb->ccb_buf.b_bq = NULL;
 
-   if (!ISSET(ccb->ccb_buf.b_flags, B_READ))
+   if (!ISSET(ccb->ccb_buf.b_flags, B_READ)) {
+   int s;
+   s = splbio();
ccb->ccb_buf.b_vp->v_numoutput++;
+   splx(s);
+   }
 
LIST_INIT(&ccb->ccb_buf.b_dep);
 
Index: sys/vnode.h
===
RCS file: /cvs/src/sys/sys/vnode.h,v
retrieving revision 1.163
diff -u -p -r1.163 vnode.h
--- sys/vnode.h 12 Dec 2021 09:14:59 -  1.163
+++ sys/vnode.h 27 Mar 2022 13:28:56 -
@@ -89,6 +89,7 @@ RBT_HEAD(namecache_rb_cache, namecache);
  * Locks used to protect struct members in struct vnode:
  * a   atomic
  * V   vnode_mtx
+ * B   IPL_BIO
  */
 struct uvm_vnode;
 struct vnode {
@@ -113,7 +114,7 @@ struct vnode {
struct  buf_rb_bufs v_bufs_tree;/* lookup of all bufs */
struct  buflists v_cleanblkhd;  /* clean blocklist head */
struct  buflists v_dirtyblkhd;  /* dirty blocklist head */
-   u_int   v_numoutput;/* num of writes in progress */
+   u_int   v_numoutput;/* [B] num of writes in progress */
LIST_ENTRY(vnode) v_synclist;   /* vnode with dirty buffers */
union {
struct mount*vu_mountedhere;/* ptr to mounted vfs (VDIR) */



vfs: do not export vnode_{hold,free}_list outside kern/vfs_subr.c

2022-03-27 Thread Sebastien Marie
Hi,

The following diff removes the vnode_hold_list and vnode_free_list extern
references, and the `struct freelst` definition, from sys/vnode.h.

The `struct freelst` definition (TAILQ_HEAD) is moved to where it is used,
inside vfs_subr.c.

The kernel (GENERIC.MP i386) still builds (and runs). No other usage found via
grep on the whole tree. A full release(8) build is pending.

No intended behaviour change.

Comments or OK ?
-- 
Sebastien Marie

Index: kern/vfs_subr.c
===
RCS file: /cvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.314
diff -u -p -r1.314 vfs_subr.c
--- kern/vfs_subr.c 25 Jan 2022 04:04:40 -  1.314
+++ kern/vfs_subr.c 27 Mar 2022 12:56:20 -
@@ -98,6 +98,7 @@ int suid_clear = 1;   /* 1 => clear SUID 
LIST_NEXT(bp, b_vnbufs) = NOLIST;   \
 }
 
+TAILQ_HEAD(freelst, vnode);
 struct freelst vnode_hold_list;/* list of vnodes referencing buffers */
 struct freelst vnode_free_list;/* vnode free list */
 
Index: sys/vnode.h
===
RCS file: /cvs/src/sys/sys/vnode.h,v
retrieving revision 1.163
diff -u -p -r1.163 vnode.h
--- sys/vnode.h 12 Dec 2021 09:14:59 -  1.163
+++ sys/vnode.h 27 Mar 2022 12:56:20 -
@@ -243,10 +243,6 @@ extern int vttoif_tab[];
 #define REVOKEALL  0x0001  /* vop_revoke: revoke all aliases */
 
 
-TAILQ_HEAD(freelst, vnode);
-extern struct freelst vnode_hold_list; /* free vnodes referencing buffers */
-extern struct freelst vnode_free_list; /* vnode free list */
-
 #define VATTR_NULL(vap) vattr_null(vap)
 #define NULLVP  ((struct vnode *)NULL)
 #define VN_KNOTE(vp, b) \



Re: Use km_alloc(9) in drm

2022-02-06 Thread Sebastien Marie
On Sat, Feb 05, 2022 at 11:44:03AM +0100, Mark Kettenis wrote:
> We want to get rid of the uvm_km_valloc() interfaces in favour of
> km_alloc().  This changes the calls in drm(4) over.  The kv_physwait
> struct is made static to prevent collission with a symbol in
> vm_machdep.c on some architectures.  The goal is to move this into
> uvm/uvm_km.c eventually.
> 
> Just to make sure I didn't screw the conversion up somehow a few tests
> on a mix of inteldrm(4) and amdgpu(4) systems would be good.

I have been running it since Feb 5 on amd64 (amdgpu0: RAVEN2 3 CU rev 0x09)
and have seen no new problems.
 

> Index: dev/pci/drm/drm_linux.c
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.c,v
> retrieving revision 1.89
> diff -u -p -r1.89 drm_linux.c
> --- dev/pci/drm/drm_linux.c   21 Jan 2022 23:49:36 -  1.89
> +++ dev/pci/drm/drm_linux.c   5 Feb 2022 10:40:22 -
> @@ -564,6 +564,11 @@ __pagevec_release(struct pagevec *pvec)
>   pagevec_reinit(pvec);
>  }
>  
> +static struct kmem_va_mode kv_physwait = {
> + .kv_map = &phys_map,
> + .kv_wait = 1,
> +};
> +
>  void *
>  kmap(struct vm_page *pg)
>  {
> @@ -572,7 +577,7 @@ kmap(struct vm_page *pg)
>  #if defined (__HAVE_PMAP_DIRECT)
>   va = pmap_map_direct(pg);
>  #else
> - va = uvm_km_valloc_wait(phys_map, PAGE_SIZE);
> + va = (vaddr_t)km_alloc(PAGE_SIZE, &kv_physwait, &kp_none, &kd_waitok);
>   pmap_kenter_pa(va, VM_PAGE_TO_PHYS(pg), PROT_READ | PROT_WRITE);
>   pmap_update(pmap_kernel());
>  #endif
> @@ -589,7 +594,7 @@ kunmap_va(void *addr)
>  #else
>   pmap_kremove(va, PAGE_SIZE);
>   pmap_update(pmap_kernel());
> - uvm_km_free_wakeup(phys_map, va, PAGE_SIZE);
> + km_free((void *)va, PAGE_SIZE, &kv_physwait, &kp_none);
>  #endif
>  }
>  
> @@ -624,7 +629,8 @@ vmap(struct vm_page **pages, unsigned in
>   paddr_t pa;
>   int i;
>  
> - va = uvm_km_valloc(kernel_map, PAGE_SIZE * npages);
> + va = (vaddr_t)km_alloc(PAGE_SIZE * npages, &kv_any, &kp_none,
> + &kd_nowait);
>   if (va == 0)
>   return NULL;
>   for (i = 0; i < npages; i++) {
> @@ -645,7 +651,7 @@ vunmap(void *addr, size_t size)
>  
>   pmap_remove(pmap_kernel(), va, va + size);
>   pmap_update(pmap_kernel());
> - uvm_km_free(kernel_map, va, size);
> + km_free((void *)va, size, &kv_any, &kp_none);
>  }
>  
>  bool
> 

-- 
Sebastien Marie



patch: move kern_unveil.c to use DPRINTF()

2022-01-09 Thread Sebastien Marie
Hi,

The following diff changes (but not too much) the way printf debug is
done in kern_unveil.c

Currently, each printf() is enclosed in #ifdef DEBUG_UNVEIL. The diff
moves to using DPRINTF(). It reduces the number of #ifdef inside the
file.

I also changed some strings to use __func__ instead of using the
function name verbatim.
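
As a standalone illustration of the pattern (not the kernel code itself), a
debug macro that compiles away when the switch is undefined could look like
this. DEBUG_DEMO and lookup() are made-up names, and the sketch uses the
standard C99 __VA_ARGS__ form rather than the GNU named-variadic form
(x...) used in the diff.

```c
#include <stdio.h>
#include <string.h>

#define DEBUG_DEMO	/* remove to compile the debug output away */

#ifdef DEBUG_DEMO
#define DPRINTF(...)	do { printf(__VA_ARGS__); } while (0)
#else
#define DPRINTF(...)	do { } while (0)
#endif

/* Hypothetical lookup: only the name "bin" is known. */
static const char *
lookup(const char *name)
{
	const char *ret = (strcmp(name, "bin") == 0) ? name : NULL;

	/* __func__ expands to "lookup", so the name is never hardcoded. */
	DPRINTF("%s: %s name %s\n", __func__,
	    (ret == NULL) ? "no match for" : "matched", name);
	return ret;
}
```

The call sites stay free of #ifdef, and renaming the function does not leave a
stale name in the message.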

Build tested with and without DEBUG_UNVEIL defined.

No intended behaviour change.

Comments or OK ?
-- 
Sebastien Marie

diff de3a27964b222c4be979b46c1662d48f67059711 /home/semarie/repos/openbsd/src
blob - 50a043e63d5ae40167428b894994ae10f5b705c1
file + sys/kern/kern_unveil.c
--- sys/kern/kern_unveil.c
+++ sys/kern/kern_unveil.c
@@ -56,6 +56,11 @@ struct unveil {
 };
 
 /* #define DEBUG_UNVEIL */
+#ifdef DEBUG_UNVEIL
+#define DPRINTF(x...)   do { printf(x); } while (0)
+#else
+#define DPRINTF(x...)
+#endif
 
 #define UNVEIL_MAX_VNODES  128
 #define UNVEIL_MAX_NAMES   128
@@ -103,9 +108,8 @@ unveil_delete_names(struct unveil *uv)
ret++;
}
rw_exit_write(&uv->uv_lock);
-#ifdef DEBUG_UNVEIL
-   printf("deleted %d names\n", ret);
-#endif
+
+   DPRINTF("deleted %d names\n", ret);
return ret;
 }
 
@@ -120,9 +124,8 @@ unveil_add_name_unlocked(struct unveil *uv, char *name
unvname_delete(unvn);
return 0;
}
-#ifdef DEBUG_UNVEIL
-   printf("added name %s underneath vnode %p\n", name, uv->uv_vp);
-#endif
+
+   DPRINTF("added name %s underneath vnode %p\n", name, uv->uv_vp);
return 1;
 }
 
@@ -144,10 +147,8 @@ unveil_namelookup(struct unveil *uv, char *name)
 
rw_enter_read(&uv->uv_lock);
 
-#ifdef DEBUG_UNVEIL
-   printf("unveil_namelookup: looking up name %s (%p) in vnode %p\n",
-   name, name, uv->uv_vp);
-#endif
+   DPRINTF("%s: looking up name %s (%p) in vnode %p\n",
+   __func__, name, name, uv->uv_vp);
 
KASSERT(uv->uv_vp != NULL);
 
@@ -158,14 +159,9 @@ unveil_namelookup(struct unveil *uv, char *name)
 
rw_exit_read(&uv->uv_lock);
 
-#ifdef DEBUG_UNVEIL
-   if (ret == NULL)
-   printf("unveil_namelookup: no match for name %s in vnode %p\n",
-   name, uv->uv_vp);
-   else
-   printf("unveil_namelookup: matched name %s in vnode %p\n",
-   name, uv->uv_vp);
-#endif
+   DPRINTF("%s: %s name %s in vnode %p\n", __func__,
+   (ret == NULL) ? "no match for" : "matched",
+   name, uv->uv_vp);
return ret;
 }
 
@@ -181,11 +177,10 @@ unveil_destroy(struct process *ps)
/* skip any vnodes zapped by unveil_removevnode */
if (vp != NULL) {
vp->v_uvcount--;
-#ifdef DEBUG_UNVEIL
-   printf("unveil: %s(%d): removing vnode %p uvcount %d "
+
+   DPRINTF("unveil: %s(%d): removing vnode %p uvcount %d "
"in position %ld\n",
ps->ps_comm, ps->ps_pid, vp, vp->v_uvcount, i);
-#endif
vrele(vp);
}
ps->ps_uvncount -= unveil_delete_names(uv);
@@ -291,7 +286,7 @@ unveil_find_cover(struct vnode *dp, struct proc *p)
 * This corner case should not happen because
 * we have not set LOCKPARENT in the flags
 */
-   printf("vnode %p PDIRUNLOCK on error\n", vp);
+   DPRINTF("vnode %p PDIRUNLOCK on error\n", vp);
vrele(vp);
}
break;
@@ -372,9 +367,7 @@ unveil_setflags(u_char *flags, u_char nflags)
 {
 #if 0
if (((~(*flags)) & nflags) != 0) {
-#ifdef DEBUG_UNVEIL
-   printf("Flags escalation %llX -> %llX\n", *flags, nflags);
-#endif
+   DPRINTF("Flags escalation %llX -> %llX\n", *flags, nflags);
return 1;
}
 #endif
@@ -465,11 +458,10 @@ unveil_add(struct proc *p, struct nameidata *ndp, cons
 * unrestrict it.
 */
if (directory_add) {
-#ifdef DEBUG_UNVEIL
-   printf("unveil: %s(%d): updating directory vnode %p"
+   DPRINTF("unveil: %s(%d): updating directory vnode %p"
" to unrestricted uvcount %d\n",
pr->ps_comm, pr->ps_pid, vp, vp->v_uvcount);
-#endif
+
if (!unveil_setflags(&uv->uv_flags, flags))
ret = EPERM;
else
@@ -485,12 +477,11 @@ unveil_add(struct proc *p, struct nameidata

Re: scramble printf(9) %p

2022-01-07 Thread Sebastien Marie
On Fri, Jan 07, 2022 at 12:25:25PM -0700, Theo de Raadt wrote:
> > I agree, but the intent is replacing a debugging method with another
> > debugging method (hoping it is more useful). The messages showed here
> > are the same that the ones which would be shown on the console before
> > the diff.
> 
> the thing about the kernel messages, is you see the pointers and you
> know to go fix the code.
> 
> The problem with hiding it in some optional-visibility interface, is
> that these things may show up without the more paranoid developers
> noticing the problem.
> 
> How do we prevent non-#define DEBUG_FOO situations from growing?

As the kernel currently contains (at least) 1946 pointer leaks through
printf(9) with %p, it might be more efficient to scramble %p with a
static value (randomly assigned at boot time).

Using xor with such a static random value preserves the uniqueness and
reproducibility of printf(9) output for the lifetime of the kernel.
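
A minimal userland sketch of that property, with hypothetical names (ptr_key,
scramble_ptr) and a fixed key instead of a random one so that the example is
deterministic:

```c
#include <stdint.h>

/*
 * Stand-in for the per-boot random value.  It is a fixed constant here only
 * to keep the sketch deterministic.
 */
static uintptr_t ptr_key = (uintptr_t)0x5bd1e995u;

/* Scramble a pointer value the way the proposed %p would. */
static uintptr_t
scramble_ptr(const void *p)
{
	return (uintptr_t)p ^ ptr_key;
}
```

Since xor with a constant is a bijection, two different pointers always print
as two different values, the same pointer always prints as the same value
while the key is unchanged, and xoring again with the key gives back the
original address.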

The following diff does it.

If the real value is still wanted (though it would have to be for unmentionable
reasons, as it seems to be unwanted even in debug code), a
%P format option could be introduced for that case.

Alternatively, %P could be used to print the scrambled value, but that
would imply reviewing 1946 uses of %p.
-- 
Sebastien Marie

diff 9fbb8b8358a4a6c4112136a2bc60d2cee1a7a243 /home/semarie/repos/openbsd/src
blob - a7de554db1973d60c53eae927b51febc6f547b8e
file + share/man/man9/printf.9
--- share/man/man9/printf.9
+++ share/man/man9/printf.9
@@ -214,6 +214,9 @@ encountered
 distinguishable by its value being \*(Le 32 or \*(Ge 128
 .Pc ,
 or the end of the decoding directive string itself.
+.It Li %p
+The value shown is anonymized to prevent leaking pointer information.
+Uniqueness and reproducibility are preserved for the lifetime of the kernel.
 .El
 .Sh RETURN VALUES
 The
blob - 3ca2a9226b57519b65852db170309d169ee9e751
file + sys/kern/init_main.c
--- sys/kern/init_main.c
+++ sys/kern/init_main.c
@@ -173,6 +173,7 @@ main(void *framep)
struct pdevinit *pdev;
extern struct pdevinit pdevinit[];
extern void disk_init(void);
+   extern u_long prf_rand;
 
/*
 * Initialize the current process pointer (curproc) before
@@ -213,6 +214,9 @@ main(void *framep)
 
random_start(boothowto & RB_GOODRANDOM);/* Start the flow */
 
+   /* Initialize prf_rand with a random value */
+   arc4random_buf(&prf_rand, sizeof(prf_rand));
+   
/*
 * Initialize mbuf's.  Do this now because we might attempt to
 * allocate mbufs or mbuf clusters during autoconfiguration.
blob - e2ad6cd97b37ad844cddca0af596e862d0ef8060
file + sys/kern/subr_prf.c
--- sys/kern/subr_prf.c
+++ sys/kern/subr_prf.c
@@ -99,6 +99,8 @@ struct mutex kprintf_mutex =
 extern int log_open;   /* subr_log: is /dev/klog open? */
 const  char *panicstr; /* arg to first call to panic (used as a flag
   to indicate that panic has already been called). */
+u_long prf_rand;   /* printf(9) %p pointer anonymizer */
+
 #ifdef DDB
 /*
  * Enter ddb on panic.
@@ -903,7 +905,7 @@ reswitch:   switch (ch) {
 * defined manner.''
 *  -- ANSI X3J11
 */
-   _uquad = (u_long)va_arg(ap, void *);
+   _uquad = (u_long)va_arg(ap, void *) ^ prf_rand;
base = HEX;
xdigs = "0123456789abcdef";
flags |= HEXPREFIX;



Re: patch: add a new ktrace facility for replacing some printf-debug

2022-01-07 Thread Sebastien Marie
On Fri, Jan 07, 2022 at 11:06:30AM +, Visa Hankala wrote:
> On Fri, Jan 07, 2022 at 10:54:54AM +0100, Sebastien Marie wrote:
> > Debugging some code paths is complex: for example, unveil(2) code is
> > involved inside VFS, and using DEBUG_UNVEIL means that the kernel is
> > spamming printf() for all processes using unveil(2) (a lot of
> > processes) instead of just the interested cases I want to follow.
> > 
> > So I cooked the following diff to add a KTRLOG() facility to be able
> > to replace printf()-like debugging with a more process-limited method.
> > 
> > From ktrace(2) point of vue, it adds a new KTR_LOG record type, which
> > is just a string. And it adds a function ktrlog(struct proc *, const char 
> > *, ...)
> > which is a printf-like function.
> > 
> > The following diff includes unveil(2) conversion from printf-debug
> > message to KTRLOG-debug message.
> > 
> > 
> > kdump output (LOG entries are new):
> > 
> >  10388 a.outCALL  unveil(0x149833ac,0x149833c3)
> >  10388 a.outSTRU  flags="rx"
> >  10388 a.outNAMI  "/usr/bin/id"
> >  10388 a.outLOG   "added name id underneath vnode 0xd580c9a0"
> >  10388 a.outLOG   "unveil: added name id beneath restricted vnode 
> > 0xd580c9a0, uvcount 6"
> >  10388 a.outRET   unveil 0
> >  10388 a.outCALL  unveil(0,0)
> >  10388 a.outRET   unveil 0
> >  10388 a.outCALL  kbind(0xcf7e4724,12,0x35c420ed57689eb8)
> >  10388 a.outRET   kbind 0
> >  10388 a.outCALL  stat(0x149833ba,0xcf7e47a0)
> >  10388 a.outNAMI  "/usr/bin"
> >  10388 a.outLOG   "unveil: component directory match for vnode 
> > 0xd58537f8"
> >  10388 a.outLOG   "unveil: no match for vnode 0xd580c9a0"
> >  10388 a.outLOG   "unveil: matched \"bin\" underneath/at vnode 
> > 0xd58537f8"
> >  10388 a.outSTRU  struct stat { dev=3, ino=77952, mode=drwxr-xr-x , 
> > nlink=2, uid=0<"root">, gid=0<"wheel">, rdev=319024, atime=1641543393<"Jan  
> > 7 09:16:33 2022">.860011379, mtime=1641542018<"Jan  7 08:53:38 
> > 2022">.406865980, ctime=1641542018<"Jan  7 08:53:38 2022">.406865980, 
> > size=5632, blocks=12, blksize=16384, flags=0x0, gen=0x0 }
> >  10388 a.outRET   stat 0
> > 
> > Additionnally, it permits to properly link the message string with the
> > syscall involved and the process code path.
> > 
> > 
> > Does it might be interesting for people ? or should I just keep it
> > locally ?
> 
> If this gets added, more care is needed with the messages. For example,
> kernel memory addresses should not be shown because they are privileged
> information.

I agree, but the intent is to replace one debugging method with another
(hopefully more useful) one. The messages shown here
are the same as the ones that would be shown on the console before
the diff.

And to see them you need to run a kernel with DEBUG_UNVEIL defined (in
kern/kern_unveil.c file).
-- 
Sebastien Marie



patch: add a new ktrace facility for replacing some printf-debug

2022-01-07 Thread Sebastien Marie
Hi,

Debugging some code paths is complex: for example, unveil(2) code is
deeply involved in VFS, and enabling DEBUG_UNVEIL means that the kernel
spams printf() for all processes using unveil(2) (a lot of
processes) instead of just the cases I am interested in following.

So I cooked the following diff to add a KTRLOG() facility to be able
to replace printf()-like debugging with a more process-limited method.

From the ktrace(2) point of view, it adds a new KTR_LOG record type, which
is just a string. It also adds a function,
ktrlog(struct proc *, const char *, ...), which is printf-like.
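
The formatting part of such a function can be sketched in userland as follows:
format into a fixed-size buffer and, when the message does not fit, end it with
"..." so that the truncation is visible. format_log() is a hypothetical name,
and the sketch leaves out the ktrace record writing itself.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * printf-like formatting into buf[size].  A message that does not fit is
 * cut and ends with "...".  Returns the length of the stored string.
 */
static size_t
format_log(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(size >= 4);	/* room for "..." plus the terminator */

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0) {
		/* Formatting error: store an empty message. */
		buf[0] = '\0';
	} else if ((size_t)n >= size) {
		/* Truncated: overwrite the tail, terminator included. */
		memcpy(buf + size - 4, "...", 4);
	}
	return strlen(buf);
}
```

With an 8-byte buffer, formatting the string "abcdefghijkl" should leave
"abcd..." in the buffer.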

The following diff includes unveil(2) conversion from printf-debug
message to KTRLOG-debug message.


kdump output (LOG entries are new):

 10388 a.out    CALL  unveil(0x149833ac,0x149833c3)
 10388 a.out    STRU  flags="rx"
 10388 a.out    NAMI  "/usr/bin/id"
 10388 a.out    LOG   "added name id underneath vnode 0xd580c9a0"
 10388 a.out    LOG   "unveil: added name id beneath restricted vnode 0xd580c9a0, uvcount 6"
 10388 a.out    RET   unveil 0
 10388 a.out    CALL  unveil(0,0)
 10388 a.out    RET   unveil 0
 10388 a.out    CALL  kbind(0xcf7e4724,12,0x35c420ed57689eb8)
 10388 a.out    RET   kbind 0
 10388 a.out    CALL  stat(0x149833ba,0xcf7e47a0)
 10388 a.out    NAMI  "/usr/bin"
 10388 a.out    LOG   "unveil: component directory match for vnode 0xd58537f8"
 10388 a.out    LOG   "unveil: no match for vnode 0xd580c9a0"
 10388 a.out    LOG   "unveil: matched \"bin\" underneath/at vnode 0xd58537f8"
 10388 a.out    STRU  struct stat { dev=3, ino=77952, mode=drwxr-xr-x , nlink=2, uid=0<"root">, gid=0<"wheel">, rdev=319024, atime=1641543393<"Jan  7 09:16:33 2022">.860011379, mtime=1641542018<"Jan  7 08:53:38 2022">.406865980, ctime=1641542018<"Jan  7 08:53:38 2022">.406865980, size=5632, blocks=12, blksize=16384, flags=0x0, gen=0x0 }
 10388 a.out    RET   stat 0

Additionally, it makes it possible to link the message string with the
syscall involved and the process code path.


Would this be interesting for people, or should I just keep it
locally ?

Thanks.
-- 
Sebastien Marie

diff aab2d6589bc1ae2859ab3800e0ad61fbd5f87fd1 
598aa538be37ceddf77c1c352553462362219e2c
blob - 23b2af5c88dfbc4858fe14718beed80c0ebb0d48
blob + 23fd09812f14fedc59f0a971507715e21c3b9441
--- sys/kern/kern_ktrace.c
+++ sys/kern/kern_ktrace.c
@@ -405,6 +405,34 @@ ktrpledge(struct proc *p, int error, uint64_t code, in
atomic_clearbits_int(&p->p_flag, P_INKTR);
 }
 
+void
+ktrlog(struct proc *p, const char *fmt, ...)
+{
+   struct ktr_header kth;
+   char buf[1024];
+   int n;
+   va_list ap;
+
+   atomic_setbits_int(&p->p_flag, P_INKTR);
+   ktrinitheader(&kth, p, KTR_LOG);
+
+   va_start(ap, fmt);
+   n = vsnprintf(buf, sizeof(buf), fmt, ap);
+   va_end(ap);
+   
+   if (n >= sizeof(buf)) {
+   buf[sizeof(buf) - 4] = '.';
+   buf[sizeof(buf) - 3] = '.';
+   buf[sizeof(buf) - 2] = '.';
+   buf[sizeof(buf) - 1] = '\0';
+   }
+   
+   KERNEL_LOCK();
+   ktrwrite(p, &kth, buf, strlen(buf));
+   KERNEL_UNLOCK();
+   atomic_clearbits_int(&p->p_flag, P_INKTR);
+}
+
 /* Interface and common routines */
 
 int
blob - 801c210c113666aacf826b52a6a430e334f0a09d
blob + 32d3284ca932ec7a504647dd424502ea8a0a1f7b
--- sys/kern/kern_unveil.c
+++ sys/kern/kern_unveil.c
@@ -55,7 +55,9 @@ struct unveil {
u_char  uv_flags;
 };
 
-/* #define DEBUG_UNVEIL */
+#ifdef KTRACE
+/*#define DEBUG_UNVEIL */
+#endif
 
 #define UNVEIL_MAX_VNODES  128
 #define UNVEIL_MAX_NAMES   128
@@ -104,7 +106,7 @@ unveil_delete_names(struct unveil *uv)
}
rw_exit_write(&uv->uv_lock);
 #ifdef DEBUG_UNVEIL
-   printf("deleted %d names\n", ret);
+   KTRLOG(curproc, "deleted %d names", ret);
 #endif
return ret;
 }
@@ -121,7 +123,7 @@ unveil_add_name_unlocked(struct unveil *uv, char *name
return 0;
}
 #ifdef DEBUG_UNVEIL
-   printf("added name %s underneath vnode %p\n", name, uv->uv_vp);
+   KTRLOG(curproc, "added name %s underneath vnode %p", name, uv->uv_vp);
 #endif
return 1;
 }
@@ -145,7 +147,7 @@ unveil_namelookup(struct unveil *uv, char *name)
rw_enter_read(&uv->uv_lock);
 
 #ifdef DEBUG_UNVEIL
-   printf("unveil_namelookup: looking up name %s (%p) in vnode %p\n",
+   KTRLOG(curproc, "unveil_namelookup: looking up name %s (%p) in vnode %p",
name, name, uv->uv_vp);
 #endif
 
@@ -160,10 +162,10 @@ unveil_namelookup(struct unveil *uv, char *name)
 
 #ifdef DEBUG_UNVEIL
if (ret == NULL)
-   printf("unveil_namelookup: no match for name %s in vnode %p\n",
+   KTRLOG(c

Re: Please test: UVM fault unlocking (aka vmobjlock)

2021-11-30 Thread Sebastien Marie
On Tue, Nov 30, 2021 at 07:17:14PM +0100, Jan Stary wrote:
> 
> While here, I notice that sysctl.witness.watch is documented
> in witness(4), but kern.witness.locktrace is not - is that intended?
> 

It is documented in options(4):

   option WITNESS_LOCKTRACE
Enable witness(4) lock stack trace saving at boot.  The feature is
disabled by default and has to be enabled by setting the
kern.witness.locktrace sysctl(8) variable.

Thanks.
-- 
Sebastien Marie



Re: Retry sleep in poll/select

2021-11-18 Thread Sebastien Marie
On Thu, Nov 18, 2021 at 07:50:01PM -0600, Scott Cheloha wrote:
> On Thu, Nov 18, 2021 at 12:30:30PM +0100, Martin Pieuchot wrote:
> > On 17/11/21(Wed) 09:51, Scott Cheloha wrote:
> > > > On Nov 17, 2021, at 03:22, Martin Pieuchot  wrote:
> > > > 
> > > > On 16/11/21(Tue) 13:55, Visa Hankala wrote:
> > > >> Currently, dopselect() and doppoll() call tsleep_nsec() without retry.
> > > >> cheloha@ asked if the functions should handle spurious wakeups. I guess
> > > >> such wakeups are unlikely with the nowake wait channel, but I am not
> > > >> sure if that is a safe guess.
> > > > 
> > > > I'm not sure to understand, are we afraid a thread sleeping on `nowake'
> > > > can be awaken?  Is it the assumption here?
> > > 
> > > Yes, but I don't know how.
> > 
> > Then I'd suggest we start with understanding how this can happen otherwise
> > I fear we are adding more complexity for reasons we don't understands.
> > 
> > > kettenis@ said spurious wakeups were
> > > possible on a similar loop in sigsuspend(2)
> > > so I mentioned this to visa@ off-list.
> > 
> > I don't understand how this can happen.
> > 
> > > If we added an assert to panic in wakeup(9)
> > > > if the channel is &nowake, would that be
> > > sufficient?
> > 
> > I guess so.
> 
> So, something like the attached patch?  All variants of wakeup(9) end
> up in wakeup_proc(), right?
> 
> Wondering if it'd be better (and cheaper) to do the assert at the top
> of wakeup_n(9)...
> 
> kettenis: Can you explain how a spurious wakeup would actually happen
> here or in sigsuspend(2)?
> 
> Index: kern_synch.c
> ===
> RCS file: /cvs/src/sys/kern/kern_synch.c,v
> retrieving revision 1.180
> diff -u -p -r1.180 kern_synch.c
> --- kern_synch.c  7 Oct 2021 08:51:00 -   1.180
> +++ kern_synch.c  19 Nov 2021 01:41:21 -
> @@ -493,6 +493,8 @@ wakeup_proc(struct proc *p, const volati
>  {
>   int s, awakened = 0;
>  
> + KASSERT(chan != &nowake);
> +
>   SCHED_LOCK(s);
>   if (p->p_wchan != NULL &&
>  ((chan == NULL) || (p->p_wchan == chan))) {
> 

To cover all wakeup points, the KASSERT() should be done in:
- unsleep()
- wakeup_n()

note: wakeup_proc() ends up calling unsleep() (and setrunnable()
calls unsleep() too)

Thanks.
-- 
Sebastien Marie

diff a5dfec8fdaf0b4c0fdc18ec79317f74db550548a /home/semarie/repos/openbsd/src
blob - f0190261fe60789beb42bd76509517c604a487a3
file + sys/kern/kern_synch.c
--- sys/kern/kern_synch.c
+++ sys/kern/kern_synch.c
@@ -534,6 +534,8 @@ unsleep(struct proc *p)
 {
SCHED_ASSERT_LOCKED();
 
+   KASSERT(p->p_wchan != &nowake);
+   
if (p->p_wchan != NULL) {
TAILQ_REMOVE(&slpque[LOOKUP(p->p_wchan)], p, p_runq);
p->p_wchan = NULL;
@@ -553,6 +555,8 @@ wakeup_n(const volatile void *ident, int n)
struct proc *pnext;
int s;
 
+   KASSERT(ident != &nowake);
+   
SCHED_LOCK(s);
qp = &slpque[LOOKUP(ident)];
for (p = TAILQ_FIRST(qp); p != NULL && n != 0; p = pnext) {



Re: [PATCH] [www] faq/current.html - docoment recent changes in Xenocara [Was: Re: X server updated to version 21.1.1]

2021-11-15 Thread Sebastien Marie
On Mon, Nov 15, 2021 at 02:55:35PM +0100, Marcus MERIGHI wrote:
> Hello, 
> 
> [...]
> > +A more detailed cleanup can be done with the aid of the sysclean package.
> 
> FWIW, sysclean(8) (that I just ran) did not remove
> 
> /etc/fonts/conf.d/70-no-bitmaps.conf
> 

it will, once snapshots are up to date (and the sets lists are in sync).

sysclean uses the locate database installed by the snapshots. you
can check the content of this database with:

$ locate -d /usr/X11R6/lib/locate/xorg.db '70-no-bitmaps.conf'
xbase70:/etc/fonts/conf.avail/70-no-bitmaps.conf
xetc70:/etc/fonts/conf.d/70-no-bitmaps.conf

$ sysctl kern.version
kern.version=OpenBSD 7.0-current (GENERIC.MP) #96: Sun Nov 14 09:45:13 MST 2021
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

it means that in the amd64 snapshot from Sun Nov 14 (GENERIC.MP #96), the
file /etc/fonts/conf.d/70-no-bitmaps.conf still exists in the xetc70
set. so sysclean will not report it as removable.

thanks.
-- 
Sebastien Marie



Re: give sppp(4) its own RTM_PROPOSAL priority

2021-11-10 Thread Sebastien Marie
On Wed, Nov 10, 2021 at 04:22:49PM +0100, Bjorn Ketelaars wrote:
> sppp(4) is currently using RTP_PROPOSAL_STATIC for sending DNS
> proposals, whereas all others sources, e.g. umb(4), are using a specific
> value. Diff below fixes this by adding RTP_PROPOSAL_PPP.
> 
> Although the diff is limited in size it touches several pieces:
> - sppp(4)
> - route(4)
> - route(8)
> - unwindctl(8)
> 
> Thanks to semarie@ for noting the above. Thanks to claudio@ for
> suggesting RTP_PROPOSAL_PPP.
> 
> Comments/OK?

ok semarie@

I just wonder about the system behaviour after building a new kernel
and rebooting to build userland: the value of RTP_PROPOSAL_SOLICIT
changes, so kernel and userland will mismatch until userland is rebuilt.

But the UMB proposal was done this way too (moving RTP_PROPOSAL_SOLICIT
to the next id), so the disruption shouldn't be big.
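
To make the mismatch concrete, here is a minimal sketch. The OLD_/NEW_
prefixes and both function names are hypothetical and only exist for
this illustration; the numeric values are the ones from the diff below.
A userland built against the old headers still names priority 61
"solicit", while a kernel built with the diff uses 61 for PPP.

```c
/* Values before the diff (old headers). */
#define OLD_RTP_PROPOSAL_UMB		60
#define OLD_RTP_PROPOSAL_SOLICIT	61

/* Values after the diff (new headers). */
#define NEW_RTP_PROPOSAL_UMB		60
#define NEW_RTP_PROPOSAL_PPP		61
#define NEW_RTP_PROPOSAL_SOLICIT	62

/* How a userland built against the OLD headers names a priority. */
const char *
old_userland_name(int prio)
{
	switch (prio) {
	case OLD_RTP_PROPOSAL_UMB:
		return "umb";
	case OLD_RTP_PROPOSAL_SOLICIT:
		return "solicit";
	}
	return "other";
}

/* How a userland built against the NEW headers names a priority. */
const char *
new_userland_name(int prio)
{
	switch (prio) {
	case NEW_RTP_PROPOSAL_UMB:
		return "umb";
	case NEW_RTP_PROPOSAL_PPP:
		return "ppp";
	case NEW_RTP_PROPOSAL_SOLICIT:
		return "solicit";
	}
	return "other";
}
```

Between the reboot and the end of the userland build, only the first
function reflects what the installed binaries do.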


> diff --git sbin/route/route.c sbin/route/route.c
> index 952d9446943..f92d1419125 100644
> --- sbin/route/route.c
> +++ sbin/route/route.c
> @@ -1513,6 +1513,9 @@ print_rtmsg(struct rt_msghdr *rtm, int msglen)
>   case RTP_PROPOSAL_UMB:
>   printf("umb");
>   break;
> + case RTP_PROPOSAL_PPP:
> + printf("ppp");
> + break;
>   case RTP_PROPOSAL_SOLICIT:
>   printf("solicit");
>   break;
> diff --git share/man/man4/route.4 share/man/man4/route.4
> index 5085487a1fb..a09b6e87e12 100644
> --- share/man/man4/route.4
> +++ share/man/man4/route.4
> @@ -282,7 +282,8 @@ The predefined constants for the routing priorities are:
>  #define RTP_PROPOSAL_DHCLIENT58
>  #define RTP_PROPOSAL_SLAAC   59
>  #define RTP_PROPOSAL_UMB 60
> -#define RTP_PROPOSAL_SOLICIT 61  /* request reply of all RTM_PROPOSAL */
> +#define RTP_PROPOSAL_PPP 61
> +#define RTP_PROPOSAL_SOLICIT 62  /* request reply of all RTM_PROPOSAL */
>  #define RTP_MAX  63  /* maximum priority */
>  #define RTP_ANY  64  /* any of the above */
>  #define RTP_MASK 0x7f
> diff --git sys/net/if_spppsubr.c sys/net/if_spppsubr.c
> index 6cff6a3585e..882c97d5ce1 100644
> --- sys/net/if_spppsubr.c
> +++ sys/net/if_spppsubr.c
> @@ -4931,7 +4931,7 @@ sppp_update_dns(struct ifnet *ifp)
>   rtdns.sr_len = 2 + i * sz;
>   info.rti_info[RTAX_DNS] = srtdnstosa();
>  
> - rtm_proposal(ifp, , flag, RTP_PROPOSAL_STATIC);
> + rtm_proposal(ifp, , flag, RTP_PROPOSAL_PPP);
>  }
>  
>  void
> diff --git sys/net/route.h sys/net/route.h
> index 914581aa6bb..acc07f06f7d 100644
> --- sys/net/route.h
> +++ sys/net/route.h
> @@ -171,7 +171,8 @@ struct rtentry {
>  #define RTP_PROPOSAL_DHCLIENT58
>  #define RTP_PROPOSAL_SLAAC   59
>  #define RTP_PROPOSAL_UMB 60
> -#define RTP_PROPOSAL_SOLICIT 61  /* request reply of all RTM_PROPOSAL */
> +#define RTP_PROPOSAL_PPP 61
> +#define RTP_PROPOSAL_SOLICIT 62  /* request reply of all RTM_PROPOSAL */
>  #define RTP_MAX  63  /* maximum priority */
>  #define RTP_ANY  64  /* any of the above */
>  #define RTP_MASK 0x7f
> diff --git usr.sbin/unwindctl/unwindctl.c usr.sbin/unwindctl/unwindctl.c
> index 9380abb937e..10db694e414 100644
> --- usr.sbin/unwindctl/unwindctl.c
> +++ usr.sbin/unwindctl/unwindctl.c
> @@ -67,6 +67,8 @@ prio2str(int prio)
>   return "STATIC";
>   case RTP_PROPOSAL_UMB:
>   return "UMB";
> + case RTP_PROPOSAL_PPP:
> + return "PPP";
>   }
>   return "OTHER";
>  }

-- 
Sebastien Marie



Re: sppp(4)/pppoe(4) - DNS configuration via resolvd(8)

2021-11-09 Thread Sebastien Marie
On Wed, Nov 10, 2021 at 07:35:26AM +0100, Bjorn Ketelaars wrote:
> On Mon 08/11/2021 11:52, Bjorn Ketelaars wrote:
> > Diff below does two things:
> > 1. add PPP IPCP extensions for name server addresses (rfc1877) to
> >sppp(4)
> > 2. propose negotiated name servers from sppp(4) to resolvd(8) using
> >RTM_PROPOSAL_STATIC route messages.
> 
> 
> Updated diff below, based on feedback from kn@ and claudio@:
> 
> - fix forgotten parentheses with `sizeof`
> - instead of using `u_int32_t` use `struct in_addr` for holding dns
>   addresses. Makes it more clear what the data is
> - decouple `IPCP_OPT` definitions from the bitmask values to
>   enable/disable an option. Makes the code look a bit better
> - use `memcpy`
> - fit code within 80 columns
> 
> While here add RFC to sppp(4)'s STANDARDS section.
> 
> @kn, is this still OK for you?
> 
> Other OK's?

There is one point which bothers me a bit: you are using
RTP_PROPOSAL_STATIC for sending the proposal, whereas all other
sources (dhcpleased/dhclient, slaacd, umb) use a specific value.

Using RTP_PROPOSAL_STATIC also means that the route(8) nameserver
subcommand might interfere with it.

Using a new specific value (like RTP_PROPOSAL_SPPP) would make sense
to me. But no objection if RTP_PROPOSAL_STATIC is preferred.
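
A toy model of the interference, as a sketch only: it assumes that a
consumer keeps one proposal per priority value, which is a
simplification and not necessarily how resolvd(8) does its bookkeeping.
The table type, the PRIO_* ids and propose() are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical priority ids for the illustration. */
#define PRIO_STATIC	0
#define PRIO_SPPP	1
#define NPRIO		2

/* One slot per priority: a new proposal replaces the previous one. */
struct proposal_table {
	const char *dns[NPRIO];
};

/* Record a proposal; returns -1 if the priority is out of range. */
int
propose(struct proposal_table *t, int prio, const char *dns)
{
	if (prio < 0 || prio >= NPRIO)
		return -1;
	t->dns[prio] = dns;
	return 0;
}
```

If sppp(4) and "route nameserver" both propose with PRIO_STATIC, the
second call overwrites the first; with a dedicated id both survive.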

> +void
> +sppp_update_dns(struct ifnet *ifp)
> +{
> + struct rt_addrinfo info;
> + struct sockaddr_rtdns rtdns;
> + struct sppp *sp = ifp->if_softc;
> + size_t sz = 0;
> + int i, flag = 0;
> +
> + memset(, 0, sizeof(rtdns));
> + memset(, 0, sizeof(info));
> +
> + for (i = 0; i < IPCP_MAX_DNSSRV; i++) {
> + if (sp->ipcp.dns[i].s_addr == INADDR_ANY)
> + break;
> + sz = sizeof(sp->ipcp.dns[i].s_addr);
> + memcpy(rtdns.sr_dns + i * sz, >ipcp.dns[i].s_addr, sz);
> + flag = RTF_UP;
> + }
> +
> + rtdns.sr_family = AF_INET;
> +     rtdns.sr_len = 2 + i * sz;
> + info.rti_info[RTAX_DNS] = srtdnstosa();
> +
> + rtm_proposal(ifp, , flag, RTP_PROPOSAL_STATIC);
> +}

Thanks.
-- 
Sebastien Marie



patch: nm(1): add support for symbols created with -ffunction-sections

2021-11-06 Thread Sebastien Marie
Hi,

aja@ showed me some problems with the x11/gnome/librsvg update (the port
is Rust based), and I finally tracked the problem down to nm(1).

I will not speak of Rust anymore, and will use only C for the example.

When an object is compiled with -ffunction-sections, the compiler
emits one section per function (if I understood the usual purpose
correctly, it lets the linker easily discard unused
sections/functions at link time).

$ cat test.c
#include <stdio.h>

void
test_fn(void)
{
printf("test_fn()\n");
}

$ cc -Wall -c test.c -ffunction-sections
$ readelf --sections test.o | grep -A1 test_fn
  [ 3] .text.test_fn PROGBITS   0040
   0040    AX   0 0 16
$ readelf -s test.o

Symbol table '.symtab' contains 8 entries:
   Num:Value  Size TypeBind   Vis  Ndx Name
 0:  0 NOTYPE  LOCAL  DEFAULT  UND
 1:  0 FILELOCAL  DEFAULT  ABS test.c
 2: 11 OBJECT  LOCAL  DEFAULT7 .L.str
 3:  0 SECTION LOCAL  DEFAULT3
 4: 24 FUNCWEAK   HIDDEN 6 __llvm_retpoline_r11
 5:  8 OBJECT  WEAK   HIDDEN 9 __retguard_759
 6:  0 NOTYPE  GLOBAL DEFAULT  UND printf
 7: 64 FUNCGLOBAL DEFAULT3 test_fn
   

The problem is that nm(1) doesn't recognize test_fn as a TEXT symbol:

$ nm test.o
 d .L.str
 W __llvm_retpoline_r11
 W __retguard_759
 U printf
 F test.c
 ? test_fn

The test_fn symbol should be 'T', but it is reported as '?'.


llvm-nm(1) works correctly (but we don't have it in base):

$ llvm-nm test.o
 r .L.str
 W __llvm_retpoline_r11
 V __retguard_759
 U printf
 T test_fn
 


The following diff makes nm(1) properly mark the function as 'T', by
recognizing ".text.*" sections:

diff cecccd4b3c548875286ca2b010c95cbce6c0e359 /home/semarie/repos/openbsd/src
blob - 5aeef7a01a7cbff029299cfc5562cfcec085347f
file + usr.bin/nm/elf.c
--- usr.bin/nm/elf.c
+++ usr.bin/nm/elf.c
@@ -274,6 +274,8 @@ elf_shn2type(Elf_Ehdr *eh, u_int shn, const char *sn)
return (-1);
else if (!strcmp(sn, ELF_TEXT))
return (N_TEXT);
+   else if (!strncmp(sn, ".text.", 6))
+   return (N_TEXT);
else if (!strcmp(sn, ELF_RODATA))
return (N_SIZE);
else if (!strcmp(sn, ELF_OPENBSDRANDOMDATA))
@@ -355,6 +357,7 @@ elf2nlist(Elf_Sym *sym, Elf_Ehdr *eh, Elf_Shdr *shdr, 
} else if (sn != NULL && *sn != 0 &&
strcmp(sn, ELF_INIT) &&
strcmp(sn, ELF_TEXT) &&
+   strncmp(sn, ".text.", 6) &&
strcmp(sn, ELF_FINI))   /* XXX GNU compat */
np->nl.n_other = '?';
break;

The change to elf_shn2type() isn't strictly necessary for my use case,
but it should improve .text.* support (recognizing N_TEXT for
STT_NOTYPE, STT_OBJECT, STT_TLS).
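
The name test added by the diff can be sketched on its own as below.
is_text_section() is a hypothetical helper that doesn't exist in nm(1);
it only isolates the strcmp()/strncmp() logic and spells out ELF_TEXT as
the literal ".text".

```c
#include <string.h>

/*
 * Return 1 if the section name denotes code: either ".text" itself or
 * a per-function ".text.<name>" section as created by
 * -ffunction-sections. Return 0 otherwise.
 */
int
is_text_section(const char *sn)
{
	if (sn == NULL)
		return 0;
	if (strcmp(sn, ".text") == 0)
		return 1;
	/* 6 == strlen(".text."): the trailing dot must be present */
	return strncmp(sn, ".text.", 6) == 0;
}
```

Comparing six characters, dot included, keeps unrelated names which
merely start with ".text" from being classified as code.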


Afterwards, nm(1) properly recognizes the symbol:

$ /usr/obj/usr.bin/nm/nm test.o
 d .L.str
 W __llvm_retpoline_r11
 W __retguard_759
 U printf
 F test.c
 T test_fn

and it makes libtool(1) happy (LT/Archive.pm: get_symbollist
function), and it makes the librsvg build happy (which plays with
symbols at build time), and it should make aja@ happy too.

Comments or OK ?
-- 
Sebastien Marie



Re: some warnings in prep for LLVM 13

2021-10-25 Thread Sebastien Marie
On Mon, Oct 25, 2021 at 12:35:03PM +0100, Jeremie Courreges-Anglas wrote:
> On Mon, Oct 25 2021, Theo Buehler  wrote:
> >> index 664a5200037..e33763e7420 100644
> >> --- a/usr.bin/openssl/Makefile
> >> +++ b/usr.bin/openssl/Makefile
> >> @@ -17,6 +17,7 @@ CFLAGS+= -Wuninitialized
> >>  CFLAGS+= -Wunused
> >>  .if ${COMPILER_VERSION:L} == "clang"
> >>  CFLAGS+= -Werror
> >> +CFLAGS+= -Wno-unused-but-set-variable
> >
> > This will break the build with LLVM 11 because of -Werror:
> >
> > error: unknown warning option '-Wno-unused-but-set-variable'; did you mean 
> > '-Wno-unused-const-variable'? [-Werror,-Wunknown-warning-option]
> > *** Error 1 in /usr/src/usr.bin/openssl (:87 'apps.o')
> >
> > Also, it would be nice to know what triggered this addition.

I have a working llvm13 here for building zig 0.9.0-dev.

/usr/src/usr.bin/openssl/s_client.c:897:16: error: variable 'pbuf_off' set but 
not used [-Werror,-Wunused-but-set-variable]
int pbuf_len, pbuf_off;
  ^
/usr/src/usr.bin/openssl/s_client.c:897:6: error: variable 'pbuf_len' set but 
not used [-Werror,-Wunused-but-set-variable]
int pbuf_len, pbuf_off;
^
> You can use egcc to spot why clang 13 errors out, but they may not warn
> exactly about the same problems.  This works for easy stuff, not so much
> for the kernel.  We really need the clang 13 errors.
> 
> Here's an attempt to use egcc over openssl and libcrypto.  The s_client
> diff probably fixes what clang 13 complains about.
> 
> The two latter diffs for libcrypto remove unneeded includes.  This looks
> unrelated to the addition of -Wno-unused-but-set-variable by mortimer@,
> but I thought maybe you would be interested.
> egcc -Wno-unused-but-set-variable doesn't seem to find anything else in
> libcrypto.
> 
> In file included from /usr/src/lib/libcrypto/x509/x509_asid.c:28:
> /usr/src/lib/libcrypto/x509/ext_dat.h:81:33: error: 'standard_exts' defined 
> but not used [-Werror=unused-variable]
>  static const X509V3_EXT_METHOD *standard_exts[] = {
> 
> ok for the s_client diff?

ok semarie@ for s_client diff

but you could also remove pbuf_off :)
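
For readers who have not met this warning yet, here is the pattern in
miniature. It is a sketch with a hypothetical first_byte() function, not
code from s_client.c: the "before" variant is kept inside the comment so
the block still builds cleanly with -Werror.

```c
/*
 * Before (clang 13 reports -Wunused-but-set-variable for 'len'):
 *
 *	int len;
 *	len = n;	assigned here, never read afterwards
 *	return buf[0];
 *
 * After: the dead variable is simply removed, which is the same kind
 * of change as dropping pbuf_len (and pbuf_off) in s_client_main().
 */
int
first_byte(const unsigned char *buf, int n)
{
	if (n <= 0)
		return -1;	/* nothing to read */
	return buf[0];
}
```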


> Your call regarding the 2nd & 3rd. If that structure really can be used
> in multiple files then "static" is probably not appropriate.  Or maybe
> the structure should just be moved to x509/x509_lib.c.
> 
> 
> Index: usr.bin/openssl/s_client.c
> ===
> RCS file: /home/cvs/src/usr.bin/openssl/s_client.c,v
> retrieving revision 1.55
> diff -u -p -r1.55 s_client.c
> --- usr.bin/openssl/s_client.c22 Oct 2021 09:44:58 -  1.55
> +++ usr.bin/openssl/s_client.c25 Oct 2021 10:53:06 -
> @@ -894,7 +894,6 @@ s_client_main(int argc, char **argv)
>   char *cbuf = NULL, *sbuf = NULL, *mbuf = NULL, *pbuf = NULL;
>   int cbuf_len, cbuf_off;
>   int sbuf_len, sbuf_off;
> - int pbuf_len;
>   int full_log = 1;
>   char *pass = NULL;
>   X509 *cert = NULL;
> @@ -1195,7 +1194,6 @@ s_client_main(int argc, char **argv)
>   cbuf_off = 0;
>   sbuf_len = 0;
>   sbuf_off = 0;
> - pbuf_len = 0;
>  
>   /* This is an ugly hack that does a lot of assumptions */
>   /*
> @@ -1502,7 +1500,6 @@ s_client_main(int argc, char **argv)
>   if (SSL_get_error(con, p) == SSL_ERROR_NONE) {
>   if (p <= 0)
>   goto end;
> - pbuf_len = p;
>  
>   k = SSL_read(con, sbuf, p);
>   }

-- 
Sebastien Marie



Re: use NULL instead of 0 for pointers in sys/net

2021-10-23 Thread Sebastien Marie


ok semarie@

And while reviewing the code, I saw that the last argument of
sppp_get_ip_addrs() (u_int32_t *srcmask) is always passed as NULL.

In fact, both sppp_get_ip_addrs() and sppp_get_ip6_addrs() have the
same never-used argument.

Is it something worth simplifying? The code inside the functions is
relatively small, so I am not sure it would be helpful to remove it.

Thanks
-- 
Sebastien Marie



On Sun, Oct 24, 2021 at 12:14:22PM +1100, Jonathan Gray wrote:
> diff --git sys/net/bpf.c sys/net/bpf.c
> index 87a9d726423..87418c3dc17 100644
> --- sys/net/bpf.c
> +++ sys/net/bpf.c
> @@ -1019,7 +1019,7 @@ bpf_setf(struct bpf_d *d, struct bpf_program *fp, int 
> wf)
>  
>   KERNEL_ASSERT_LOCKED();
>  
> - if (fp->bf_insns == 0) {
> + if (fp->bf_insns == NULL) {
>   if (fp->bf_len != 0)
>   return (EINVAL);
>   bps = NULL;
> diff --git sys/net/if.c sys/net/if.c
> index 8fe99eff4df..6bd8899561c 100644
> --- sys/net/if.c
> +++ sys/net/if.c
> @@ -1440,7 +1440,8 @@ ifaof_ifpforaddr(struct sockaddr *addr, struct ifnet 
> *ifp)
>   continue;
>   if (ifa_maybe == NULL)
>   ifa_maybe = ifa;
> - if (ifa->ifa_netmask == 0 || ifp->if_flags & IFF_POINTOPOINT) {
> + if (ifa->ifa_netmask == NULL ||
> + ifp->if_flags & IFF_POINTOPOINT) {
>   if (equal(addr, ifa->ifa_addr) ||
>   (ifa->ifa_dstaddr && equal(addr, ifa->ifa_dstaddr)))
>   return (ifa);
> diff --git sys/net/if_mpip.c sys/net/if_mpip.c
> index a8daeeea314..fe155eb18d8 100644
> --- sys/net/if_mpip.c
> +++ sys/net/if_mpip.c
> @@ -96,7 +96,7 @@ mpip_clone_create(struct if_clone *ifc, int unit)
>  
>   sc->sc_txhprio = 0;
>   sc->sc_rxhprio = IF_HDRPRIO_PACKET;
> - sc->sc_neighbor = 0;
> + sc->sc_neighbor = NULL;
>   sc->sc_cword = 0; /* default to no control word */
>   sc->sc_fword = 0; /* both sides have to agree on FAT first */
>   sc->sc_flow = arc4random() & 0xf;
> diff --git sys/net/if_ppp.c sys/net/if_ppp.c
> index fb32d9ea9ef..c3b6d9051b7 100644
> --- sys/net/if_ppp.c
> +++ sys/net/if_ppp.c
> @@ -324,21 +324,21 @@ pppdealloc(struct ppp_softc *sc)
>   sc->sc_rc_state = NULL;
>  #endif /* PPP_COMPRESS */
>  #if NBPFILTER > 0
> - if (sc->sc_pass_filt.bf_insns != 0) {
> + if (sc->sc_pass_filt.bf_insns != NULL) {
>   free(sc->sc_pass_filt.bf_insns, M_DEVBUF, 0);
> - sc->sc_pass_filt.bf_insns = 0;
> + sc->sc_pass_filt.bf_insns = NULL;
>   sc->sc_pass_filt.bf_len = 0;
>   }
> - if (sc->sc_active_filt.bf_insns != 0) {
> + if (sc->sc_active_filt.bf_insns != NULL) {
>   free(sc->sc_active_filt.bf_insns, M_DEVBUF, 0);
> - sc->sc_active_filt.bf_insns = 0;
> + sc->sc_active_filt.bf_insns = NULL;
>   sc->sc_active_filt.bf_len = 0;
>   }
>  #endif
>  #ifdef VJC
> - if (sc->sc_comp != 0) {
> + if (sc->sc_comp != NULL) {
>   free(sc->sc_comp, M_DEVBUF, 0);
> - sc->sc_comp = 0;
> + sc->sc_comp = NULL;
>   }
>  #endif
>   NET_UNLOCK();
> @@ -538,7 +538,7 @@ pppioctl(struct ppp_softc *sc, u_long cmd, caddr_t data, 
> int flag,
>   return EINVAL;
>   }
>   } else
> - newcode = 0;
> + newcode = NULL;
>   bp = (cmd == PPPIOCSPASS) ?
>   >sc_pass_filt : >sc_active_filt;
>   oldcode = bp->bf_insns;
> @@ -546,7 +546,7 @@ pppioctl(struct ppp_softc *sc, u_long cmd, caddr_t data, 
> int flag,
>   bp->bf_len = nbp->bf_len;
>   bp->bf_insns = newcode;
>   splx(s);
> - if (oldcode != 0)
> + if (oldcode != NULL)
>   free(oldcode, M_DEVBUF, 0);
>   break;
>  #endif
> @@ -730,7 +730,7 @@ pppoutput(struct ifnet *ifp, struct mbuf *m0, struct 
> sockaddr *dst,
>* but only if it is a data packet.
>*/
>   *mtod(m0, u_char *) = 1;/* indicates outbound */
> - if (sc->sc_pass_filt.bf_insns != 0 &&
> + if (sc->sc_pass_filt.bf_insns != NULL &&
>   bpf_filter(sc->sc_pass_filt.bf_insns, (u_char *)m0,
>   len, 0) == 0) {
>   error = 0; /* drop this packet */
> @@ -7

Re: is rpath needed on rad(8)?

2021-10-23 Thread Sebastien Marie
On Sat, Oct 23, 2021 at 06:55:02PM +0100, Ricardo Mestre wrote:
> does rad(8) actually need rpath here? as far as i can see it needs to open the
> config, plus its includes, but that is already done before pledge(2) by imsg
> sending IMSG_RECONF_CONF.
> 
> florian@ is this correct or am i trusting too much on my eyes?

If I followed the code path correctly, rad(8) is split into 3 processes:
- 'main'
- 'frontend'
- 'engine'

Here, you are modifying pledge(2) for the 'main' process.

The 'main' process has a signal handler (main_sig_handler) which calls
main_reload() on SIGHUP.

main_reload() opens `conffile` (using fopen(3)) before sending the
config it read to the children.

Because of fopen(3), "rpath" is required for the 'main' process.

So florian@ is correct :)
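
The dependency can be sketched as below. This is not rad(8) code:
restrict_main() and reload_config() are hypothetical stand-ins, and the
promise string is only illustrative (the real 'main' process needs more
promises than these two). The pledge(2) call is guarded so the sketch
also builds on other systems.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Restrict the process. "rpath" is kept because the config file is
 * reopened later, long after initialization.
 */
int
restrict_main(void)
{
#ifdef __OpenBSD__
	return pledge("stdio rpath", NULL);
#else
	return 0;	/* pledge(2) is OpenBSD-specific */
#endif
}

/*
 * Stand-in for main_reload(): called on SIGHUP, it has to open the
 * config file again. Without "rpath" the fopen(3) below would be a
 * pledge violation. Returns 0 on success, -1 if the file can't be
 * opened.
 */
int
reload_config(const char *conffile)
{
	FILE *f;

	f = fopen(conffile, "r");
	if (f == NULL)
		return -1;
	/* parsing would happen here */
	fclose(f);
	return 0;
}
```

Runtime testing which never sends SIGHUP would not exercise
reload_config() at all, which is why reading the code path matters here.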

Thanks.
-- 
Sebastien Marie



Re: patch: nullify v_data with NULL (and not with 0)

2021-10-21 Thread Sebastien Marie
; + (so3 = sonewconn(so2, 0)) == NULL) {
>   error = ECONNREFUSED;
>   goto put_locked;
>   }
> diff --git sys/kern/vfs_biomem.c sys/kern/vfs_biomem.c
> index b3c41c6551a..1f517296484 100644
> --- sys/kern/vfs_biomem.c
> +++ sys/kern/vfs_biomem.c
> @@ -238,7 +238,7 @@ buf_unmap(struct buf *bp)
>   TAILQ_REMOVE(_valist, bp, b_valist);
>   bcstats.kvaslots_avail--;
>   va = (vaddr_t)bp->b_data;
> - bp->b_data = 0;
> + bp->b_data = NULL;
>   pmap_kremove(va, bp->b_bufsize);
>   pmap_update(pmap_kernel());
>  
> diff --git sys/kern/vfs_lookup.c sys/kern/vfs_lookup.c
> index 94943dad1e1..b094cf577aa 100644
> --- sys/kern/vfs_lookup.c
> +++ sys/kern/vfs_lookup.c
> @@ -381,7 +381,7 @@ int
>  vfs_lookup(struct nameidata *ndp)
>  {
>   char *cp;   /* pointer into pathname argument */
> - struct vnode *dp = 0;   /* the directory we are searching */
> + struct vnode *dp = NULL;/* the directory we are searching */
>   struct vnode *tdp;  /* saved dp */
>   struct mount *mp;   /* mount table entry */
>   int docache;/* == 0 do not cache last component */
> @@ -725,7 +725,7 @@ bad:
>  int
>  vfs_relookup(struct vnode *dvp, struct vnode **vpp, struct componentname 
> *cnp)
>  {
> - struct vnode *dp = 0;   /* the directory we are searching */
> + struct vnode *dp = NULL;/* the directory we are searching */
>   int wantparent; /* 1 => wantparent or lockparent flag */
>   int rdonly; /* lookup read-only flag bit */
>   int error = 0;
> diff --git sys/kern/vfs_subr.c sys/kern/vfs_subr.c
> index f807760ea9d..6541408ea17 100644
> --- sys/kern/vfs_subr.c
> +++ sys/kern/vfs_subr.c
> @@ -271,8 +271,8 @@ vfs_rootmountalloc(char *fstypename, char *devname, 
> struct mount **mpp)
>   mp = vfs_mount_alloc(NULLVP, vfsp);
>   mp->mnt_flag |= MNT_RDONLY;
>   mp->mnt_stat.f_mntonname[0] = '/';
> - copystr(devname, mp->mnt_stat.f_mntfromname, MNAMELEN, 0);
> - copystr(devname, mp->mnt_stat.f_mntfromspec, MNAMELEN, 0);
> + copystr(devname, mp->mnt_stat.f_mntfromname, MNAMELEN, NULL);
> + copystr(devname, mp->mnt_stat.f_mntfromspec, MNAMELEN, NULL);
>   *mpp = mp;
>   return (0);
>   }
> @@ -427,7 +427,7 @@ getnewvnode(enum vtagtype tag, struct mount *mp, const 
> struct vops *vops,
>   if (vp == NULL) {
>   splx(s);
>   tablefull("vnode");
> - *vpp = 0;
> + *vpp = NULL;
>   return (ENFILE);
>   }
>  
> @@ -464,7 +464,7 @@ getnewvnode(enum vtagtype tag, struct mount *mp, const 
> struct vops *vops,
>   insmntque(vp, mp);
>   *vpp = vp;
>   vp->v_usecount = 1;
> - vp->v_data = 0;
> + vp->v_data = NULL;
>   return (0);
>  }
>  
> @@ -530,7 +530,7 @@ getdevvp(dev_t dev, struct vnode **vpp, enum vtype type)
>   }
>   vp = nvp;
>   vp->v_type = type;
> - if ((nvp = checkalias(vp, dev, NULL)) != 0) {
> + if ((nvp = checkalias(vp, dev, NULL)) != NULL) {
>   vput(vp);
>   vp = nvp;
>   }
> @@ -1147,7 +1147,8 @@ vgonel(struct vnode *vp, struct proc *p)
>* If special device, remove it from special device alias list
>* if it is on one.
>*/
> - if ((vp->v_type == VBLK || vp->v_type == VCHR) && vp->v_specinfo != 0) {
> + if ((vp->v_type == VBLK || vp->v_type == VCHR) &&
> + vp->v_specinfo != NULL) {
>   if ((vp->v_flag & VALIASED) == 0 && vp->v_type == VCHR &&
>   (cdevsw[major(vp->v_rdev)].d_flags & D_CLONE) &&
>   (minor(vp->v_rdev) >> CLONE_SHIFT == 0)) {
> @@ -1427,7 +1428,7 @@ vfs_hang_addrlist(struct mount *mp, struct netexport 
> *nep,
>   struct radix_node_head *rnh;
>   int nplen, i;
>   struct radix_node *rn;
> - struct sockaddr *saddr, *smask = 0;
> + struct sockaddr *saddr, *smask = NULL;
>   int error;
>  
>   if (argp->ex_addrlen == 0) {
> @@ -1480,7 +1481,7 @@ vfs_hang_addrlist(struct mount *mp, struct netexport 
> *nep,
>   goto out;
>   }
>   rn = rn_addroute(saddr, smask, rnh, np->netc_rnodes, 0);
> - if (rn == 0 || np != (struct netcred *)rn) { /* already exists */
> + if (rn == NULL || np != (struct netcred *)rn) { /* already exists */
>   error = EPERM;
>   goto out;
>   }
> @@ -1749,7 +1750,7 @@ vfs_shutdown(struct proc *p)
>  
>   printf("syncing disks...");
>  
> - if (panicstr == 0) {
> + if (panicstr == NULL) {
>   /* Sync before unmount, in case we hang on something. */
>   sys_sync(p, NULL, NULL);
>   vfs_unmountall();

-- 
Sebastien Marie



Re: patch: nullify v_data with NULL (and not with 0)

2021-10-19 Thread Sebastien Marie
On Tue, Oct 19, 2021 at 06:08:04PM +1100, Jonathan Gray wrote:
> On Tue, Oct 19, 2021 at 08:32:57AM +0200, Sebastien Marie wrote:
> > Hi,
> > 
> > Simple online diff for properly nullify v_data (which is `void *`)
> > with NULL instead of 0.
> > 
> > Comments or OK ?
> > -- 
> > Sebastien Marie
> > 
> 
> There are many others along those lines in the kernel, for example
> sparse complains about these in vfs_subr.c
> 
> /sys/kern/vfs_subr.c:274:64: warning: Using plain integer as NULL pointer
> /sys/kern/vfs_subr.c:275:64: warning: Using plain integer as NULL pointer
> /sys/kern/vfs_subr.c:430:32: warning: Using plain integer as NULL pointer
> /sys/kern/vfs_subr.c:467:22: warning: Using plain integer as NULL pointer
> /sys/kern/vfs_subr.c:533:50: warning: Using plain integer as NULL pointer
> /sys/kern/vfs_subr.c:1150:77: warning: Using plain integer as NULL pointer
> /sys/kern/vfs_subr.c:1430:42: warning: Using plain integer as NULL pointer
> /sys/kern/vfs_subr.c:1483:19: warning: Using plain integer as NULL pointer
> /sys/kern/vfs_subr.c:1752:25: warning: Using plain integer as NULL pointer
> 
> If this is something worth changing should it be a larger diff?

I am playing^Wworking inside the vnode code, so my eyes were only hurt
by the v_data initialisation.

But I would be happy to avoid using 0 and use NULL in other places.
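
A tiny sketch of the style point, using a hypothetical struct node
rather than struct vnode: both spellings compile to the same thing, but
NULL tells the reader (and tools like sparse) that a pointer is meant.

```c
#include <stddef.h>

struct node {
	void *data;	/* plays the role of v_data */
};

/* Clear the pointer with NULL rather than a plain integer 0. */
void
node_reset(struct node *n)
{
	n->data = NULL;
}

/* Compare against NULL for the same reason. */
int
node_is_empty(const struct node *n)
{
	return n->data == NULL;
}
```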
-- 
Sebastien Marie



Re: patch: move vnode lock from FS implementation to struct vnode - FS part (cd9660)

2021-10-19 Thread Sebastien Marie
And here is the cd9660 part.

I preferred showing cd9660 over ffs, as the ffs lock is shared between
several FS implementations (ext2fs, ffs).

Otherwise, cd9660 is similar to the other implementations.


blob - 5082a7ad64a7c73a250023dd9e833b7ff91893b2
blob + 368b0a5b1252ba6b57401f28a8e0aad585828abb
--- sys/isofs/cd9660/cd9660_node.h
+++ sys/isofs/cd9660/cd9660_node.h
@@ -65,7 +65,6 @@ struct iso_node {
doff_t  i_diroff;   /* offset in dir, where we found last entry */
doff_t  i_offset;   /* offset of free space in directory */
cdino_t i_ino;  /* inode number of found directory */
-   struct  rrwlock i_lock; /* node lock */
 
doff_t  iso_extent; /* extent of file */
doff_t  i_size;
@@ -110,11 +109,8 @@ intcd9660_reclaim(void *);
 intcd9660_link(void *);
 intcd9660_symlink(void *);
 intcd9660_bmap(void *);
-intcd9660_lock(void *);
-intcd9660_unlock(void *);
 intcd9660_strategy(void *);
 intcd9660_print(void *);
-intcd9660_islocked(void *);
 intcd9660_pathconf(void *);
 
 intcd9660_bufatoff(struct iso_node *, off_t, char **, struct buf **);
blob - 0bcfbd0270d8276b3569ade303fdd9b6d90a9505
blob + 8ca74c6685f468f3856f82fad2993065488bdaea
--- sys/isofs/cd9660/cd9660_vfsops.c
+++ sys/isofs/cd9660/cd9660_vfsops.c
@@ -715,7 +715,6 @@ retry:
return (error);
}
ip = malloc(sizeof(*ip), M_ISOFSNODE, M_WAITOK | M_ZERO);
-   rrw_init_flags(&ip->i_lock, "isoinode", RWL_DUPOK | RWL_IS_VNODE);
vp->v_data = ip;
ip->i_vnode = vp;
ip->i_dev = dev;
@@ -867,10 +866,10 @@ retry:
if ((nvp = checkalias(vp, ip->inode.iso_rdev, mp)) != NULL) {
/*
 * Discard unneeded vnode, but save its iso_node.
-* Note that the lock is carried over in the iso_node
 */
nvp->v_data = vp->v_data;
vp->v_data = NULL;
+   VOP_UNLOCK(vp); /* unlock before changing v_op */
vp->v_op = _vops;
vrele(vp);
vgone(vp);
@@ -879,6 +878,7 @@ retry:
 */
vp = nvp;
ip->i_vnode = vp;
+   vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
}
break;
case VLNK:
blob - ed1b5f39c7122cdcb37c133fae35d243a2995d90
blob + e9f20249ac4825abe12d206ac25d52c98af7ad29
--- sys/isofs/cd9660/cd9660_vnops.c
+++ sys/isofs/cd9660/cd9660_vnops.c
@@ -684,31 +684,6 @@ cd9660_symlink(void *v)
 }
 
 /*
- * Lock an inode.
- */
-int
-cd9660_lock(void *v)
-{
-   struct vop_lock_args *ap = v;
-   struct vnode *vp = ap->a_vp;
-
-   return rrw_enter((vp)->i_lock, ap->a_flags & LK_RWFLAGS);
-}
-
-/*
- * Unlock an inode.
- */
-int
-cd9660_unlock(void *v)
-{
-   struct vop_unlock_args *ap = v;
-   struct vnode *vp = ap->a_vp;
-
-   rrw_exit((vp)->i_lock);
-   return 0;
-}
-
-/*
  * Calculate the logical to physical mapping if not done already,
  * then call the device strategy routine.
  */
@@ -762,17 +737,6 @@ cd9660_print(void *v)
 }
 
 /*
- * Check for a locked inode.
- */
-int
-cd9660_islocked(void *v)
-{
-   struct vop_islocked_args *ap = v;
-
-   return rrw_status((ap->a_vp)->i_lock);
-}
-
-/*
  * Return POSIX pathconf information applicable to cd9660 filesystems.
  */
 int
@@ -840,12 +804,12 @@ const struct vops cd9660_vops = {
.vop_abortop= vop_generic_abortop,
.vop_inactive   = cd9660_inactive,
.vop_reclaim= cd9660_reclaim,
-   .vop_lock   = cd9660_lock,
-   .vop_unlock = cd9660_unlock,
+   .vop_lock   = vop_generic_lock,
+   .vop_unlock = vop_generic_unlock,
+   .vop_islocked   = vop_generic_islocked,
.vop_bmap   = cd9660_bmap,
.vop_strategy   = cd9660_strategy,
.vop_print  = cd9660_print,
-   .vop_islocked   = cd9660_islocked,
.vop_pathconf   = cd9660_pathconf,
.vop_advlock= eopnotsupp,
.vop_bwrite = vop_generic_bwrite,
@@ -858,10 +822,10 @@ const struct vops cd9660_specvops = {
.vop_setattr= cd9660_setattr,
.vop_inactive   = cd9660_inactive,
.vop_reclaim= cd9660_reclaim,
-   .vop_lock   = cd9660_lock,
-   .vop_unlock = cd9660_unlock,
+   .vop_lock   = vop_generic_lock,
+   .vop_unlock = vop_generic_unlock,
+   .vop_islocked   = vop_generic_islocked,
.vop_print  = cd9660_print,
-   .vop_islocked   = cd9660_islocked,
 
/* XXX: Keep in sync with spec_vops. */
.vop_lookup = vop_generic_lookup,
@@ -899,10 +863,10 @@ const struct vops cd9660_fifovops = {
.vop_setattr= cd9660_setattr,
.vop_inactive   = cd9660_inactive,
.vop_reclaim= cd9660_reclaim,
-   .vop_lock   = 

Re: patch: move vnode lock from FS implementation to struct vnode - new vnode code only

2021-10-19 Thread Sebastien Marie
Below is the part of the full diff which adds the generic vnode lock
inside struct vnode.


blob - 7df5a5757b90244ab361f0687bd2eabf3e7093c2
blob + 53fb67ace69ada6faf768a6e7b20a48e2b6e9740
--- sys/kern/vfs_default.c
+++ sys/kern/vfs_default.c
@@ -167,6 +167,34 @@ vop_generic_abortop(void *v)
return (0);
 }
 
+int
+vop_generic_lock(void *v)
+{
+   struct vop_lock_args *ap = v;
+   struct vnode *vp = ap->a_vp;
+
+   return rrw_enter(&vp->v_lock, ap->a_flags & LK_RWFLAGS);
+}
+
+int
+vop_generic_unlock(void *v)
+{
+   struct vop_unlock_args *ap = v;
+   struct vnode *vp = ap->a_vp;
+
+   rrw_exit(&vp->v_lock);
+   return 0;
+}
+
+int
+vop_generic_islocked(void *v)
+{
+   struct vop_islocked_args *ap = v;
+   struct vnode *vp = ap->a_vp;
+
+   return rrw_status(&vp->v_lock);
+}
+
 const struct filterops generic_filtops = {
.f_flags= FILTEROP_ISFD,
.f_attach   = NULL,
blob - 4861468362592c649c579e0f3a47d630a39d051e
blob + 1f7409235f4696765c6b8b22e65d953b1d6e5100
--- sys/kern/vfs_subr.c
+++ sys/kern/vfs_subr.c
@@ -408,6 +408,7 @@ getnewvnode(enum vtagtype tag, struct mount *mp, const
((TAILQ_FIRST(listhd = _hold_list) == NULL) || toggle))) {
splx(s);
vp = pool_get(_pool, PR_WAITOK | PR_ZERO);
+   rrw_init_flags(&vp->v_lock, "vnode", RWL_DUPOK | RWL_IS_VNODE);
vp->v_uvm = pool_get(_vnode_pool, PR_WAITOK | PR_ZERO);
vp->v_uvm->u_vnode = vp;
RBT_INIT(buf_rb_bufs, >v_bufs_tree);
@@ -463,6 +464,10 @@ getnewvnode(enum vtagtype tag, struct mount *mp, const
vp->v_op = vops;
insmntque(vp, mp);
*vpp = vp;
+#ifdef DIAGNOSTIC
+   if (rrw_status(&vp->v_lock) != 0)
+   panic("%s: free vnode %p isn't lock free", __func__, vp);
+#endif
vp->v_usecount = 1;
vp->v_data = 0;
return (0);
blob - 490f3e367cf322eff38ac7a184b7ea49a7cef7fa
blob + 3a9ff1f58d50e005ea9f87c974647d1f56fcb397
--- sys/sys/vnode.h
+++ sys/sys/vnode.h
@@ -102,6 +102,7 @@ struct vnode {
u_int   v_uvcount;  /* unveil references */
u_int   v_writecount;   /* reference count of writers */
u_int   v_lockcount;/* [V] # threads waiting on lock */
+   struct rrwlock v_lock;  /* generic vnode lock */
 
/* Flags that can be read/written in interrupts */
u_int   v_bioflag;
@@ -632,6 +633,9 @@ int vop_generic_bwrite(void *);
 intvop_generic_revoke(void *);
 intvop_generic_kqfilter(void *);
 intvop_generic_lookup(void *);
+intvop_generic_lock(void *);
+intvop_generic_unlock(void *);
+intvop_generic_islocked(void *);
 
 /* vfs_vnops.c */
 intvn_isunder(struct vnode *, struct vnode *, struct proc *);



patch: move vnode lock from FS implementation to struct vnode

2021-10-19 Thread Sebastien Marie
Hi,

The following diff is a bit large. It could be split for easier
review, but I thought having the full view would be good too.

It moves the current vnode lock mechanism, implemented inside FS
specific structs, to `struct vnode`.

I used `vop_generic_lock`, `vop_generic_unlock` and
`vop_generic_islocked` functions for this generic implementation (see
kern/vfs_default.c for implementation and kern/vfs_subr.c for
initialisation).

As usual, `struct vops` controls which functions are called when using
VOP wrappers. This makes it possible to move each FS to the generic
implementation gradually.
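
The dispatch idea can be sketched in user space as below. Everything
prefixed toy_ is hypothetical, and the lock is a plain int instead of a
struct rrwlock, so this only shows the function-pointer plumbing and not
real locking.

```c
#include <stddef.h>

struct toy_vnode;

/* Per-FS operations table, in the spirit of struct vops. */
struct toy_vops {
	int (*vop_lock)(struct toy_vnode *);
	int (*vop_unlock)(struct toy_vnode *);
	int (*vop_islocked)(struct toy_vnode *);
};

struct toy_vnode {
	const struct toy_vops *v_op;
	int v_lock;	/* toy lock: 0 = free, 1 = held */
};

/* Generic implementations working on the lock embedded in the vnode. */
int
toy_generic_lock(struct toy_vnode *vp)
{
	vp->v_lock = 1;
	return 0;
}

int
toy_generic_unlock(struct toy_vnode *vp)
{
	vp->v_lock = 0;
	return 0;
}

int
toy_generic_islocked(struct toy_vnode *vp)
{
	return vp->v_lock;
}

/* An FS opts in by pointing its table at the generic functions. */
const struct toy_vops toy_generic_vops = {
	.vop_lock	= toy_generic_lock,
	.vop_unlock	= toy_generic_unlock,
	.vop_islocked	= toy_generic_islocked,
};

/* Wrappers: callers never know which implementation they reach. */
int
toy_vop_lock(struct toy_vnode *vp)
{
	return vp->v_op->vop_lock(vp);
}

int
toy_vop_unlock(struct toy_vnode *vp)
{
	return vp->v_op->vop_unlock(vp);
}

int
toy_vop_islocked(struct toy_vnode *vp)
{
	return vp->v_op->vop_islocked(vp);
}
```

An FS which isn't converted yet keeps its own functions in its table,
and callers of the wrappers don't need to change.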

The FS implementation part is mostly about removing the current code
(rrw_init, lock/unlock functions, vop_{lock,unlock} setting).

As part of the diff, ntfs and mfs now use the generic vnode lock
(setting vop_{lock,unlock,islocked} in vops).


One tricky part is the vget() FS implementation for special
devices. The alias mechanism is a bit tricky as it moves the
v_data from one vnode to another.

With the new code, the lock isn't moved anymore (as it is part of the
vnode, and not inside v_data) and it must be reacquired. This makes me
unlock the old vnode and lock the new vnode. I think this part might be
wrong and/or racy (though I am unsure the current version is right
either), but we don't have a way to properly "move" the lock (WITNESS
tracks pointers, so just copying the data doesn't work with
WITNESS).
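
A user-space sketch of that handover, under the same toy assumptions (a
plain int as lock, hypothetical alias_ names, no real concurrency): the
FS node travels with v_data, the lock stays with each vnode.

```c
#include <stddef.h>

struct alias_vnode {
	int	 v_lock;	/* toy lock embedded in the vnode: 1 = held */
	void	*v_data;	/* FS private node */
};

/*
 * Hand the FS node from the locked vnode 'vp' to its alias 'nvp'.
 * The lock can't follow v_data, so one lock is released and the other
 * is taken. Returns the vnode the caller now holds locked.
 */
struct alias_vnode *
alias_takeover(struct alias_vnode *vp, struct alias_vnode *nvp)
{
	nvp->v_data = vp->v_data;
	vp->v_data = NULL;
	vp->v_lock = 0;		/* VOP_UNLOCK(vp) in the diff */
	/* window: at this point neither vnode is locked */
	nvp->v_lock = 1;	/* vn_lock(vp = nvp, ...) in the diff */
	return nvp;
}
```

The comment in the middle marks the window I am worried about.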

Comments would be welcome.
-- 
Sebastien Marie


diff 0def1dfbf35c717f500a4280ac4e33a8e31279e2 
767ac1f1e81fa96b73dd2a7955e4eb5bbce3e538
blob - 5082a7ad64a7c73a250023dd9e833b7ff91893b2
blob + 368b0a5b1252ba6b57401f28a8e0aad585828abb
--- sys/isofs/cd9660/cd9660_node.h
+++ sys/isofs/cd9660/cd9660_node.h
@@ -65,7 +65,6 @@ struct iso_node {
doff_t  i_diroff;   /* offset in dir, where we found last entry */
doff_t  i_offset;   /* offset of free space in directory */
cdino_t i_ino;  /* inode number of found directory */
-   struct  rrwlock i_lock; /* node lock */
 
doff_t  iso_extent; /* extent of file */
doff_t  i_size;
@@ -110,11 +109,8 @@ intcd9660_reclaim(void *);
 intcd9660_link(void *);
 intcd9660_symlink(void *);
 intcd9660_bmap(void *);
-intcd9660_lock(void *);
-intcd9660_unlock(void *);
 intcd9660_strategy(void *);
 intcd9660_print(void *);
-intcd9660_islocked(void *);
 intcd9660_pathconf(void *);
 
 intcd9660_bufatoff(struct iso_node *, off_t, char **, struct buf **);
blob - 0bcfbd0270d8276b3569ade303fdd9b6d90a9505
blob + 8ca74c6685f468f3856f82fad2993065488bdaea
--- sys/isofs/cd9660/cd9660_vfsops.c
+++ sys/isofs/cd9660/cd9660_vfsops.c
@@ -715,7 +715,6 @@ retry:
return (error);
}
ip = malloc(sizeof(*ip), M_ISOFSNODE, M_WAITOK | M_ZERO);
-   rrw_init_flags(&ip->i_lock, "isoinode", RWL_DUPOK | RWL_IS_VNODE);
vp->v_data = ip;
ip->i_vnode = vp;
ip->i_dev = dev;
@@ -867,10 +866,10 @@ retry:
if ((nvp = checkalias(vp, ip->inode.iso_rdev, mp)) != NULL) {
/*
 * Discard unneeded vnode, but save its iso_node.
-* Note that the lock is carried over in the iso_node
 */
nvp->v_data = vp->v_data;
vp->v_data = NULL;
+   VOP_UNLOCK(vp); /* unlock before changing v_op */
vp->v_op = _vops;
vrele(vp);
vgone(vp);
@@ -879,6 +878,7 @@ retry:
 */
vp = nvp;
ip->i_vnode = vp;
+   vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
}
break;
case VLNK:
blob - ed1b5f39c7122cdcb37c133fae35d243a2995d90
blob + e9f20249ac4825abe12d206ac25d52c98af7ad29
--- sys/isofs/cd9660/cd9660_vnops.c
+++ sys/isofs/cd9660/cd9660_vnops.c
@@ -684,31 +684,6 @@ cd9660_symlink(void *v)
 }
 
 /*
- * Lock an inode.
- */
-int
-cd9660_lock(void *v)
-{
-   struct vop_lock_args *ap = v;
-   struct vnode *vp = ap->a_vp;
-
-   return rrw_enter((vp)->i_lock, ap->a_flags & LK_RWFLAGS);
-}
-
-/*
- * Unlock an inode.
- */
-int
-cd9660_unlock(void *v)
-{
-   struct vop_unlock_args *ap = v;
-   struct vnode *vp = ap->a_vp;
-
-   rrw_exit((vp)->i_lock);
-   return 0;
-}
-
-/*
  * Calculate the logical to physical mapping if not done already,
  * then call the device strategy routine.
  */
@@ -762,17 +737,6 @@ cd9660_print(void *v)
 }
 
 /*
- * Check for a locked inode.
- */
-int
-cd9660_islocked(void *v)
-{
-   struct vop_islocked_args *ap = v;
-
-   return rrw_status((ap->a_vp)->i_lock);
-}
-
-/*
  * Return POSIX pathconf information applicable to cd9660 filesystems.
  */
 int
@@ -840,12 +804,12 @@ const struct vops cd9660_vops = {
.vop_

patch: nullify v_data with NULL (and not with 0)

2021-10-19 Thread Sebastien Marie
Hi,

Simple one-line diff to properly nullify v_data (which is `void *`)
with NULL instead of 0.

Comments or OK ?
-- 
Sebastien Marie


blob - 1f7409235f4696765c6b8b22e65d953b1d6e5100
blob + 8cb011e9c3e48fcc34b8ce62974cafd430bb4e6a
--- sys/kern/vfs_subr.c
+++ sys/kern/vfs_subr.c
@@ -469,7 +469,7 @@ getnewvnode(enum vtagtype tag, struct mount *mp, const
panic("%s: free vnode %p isn't lock free", __func__, vp);
 #endif
vp->v_usecount = 1;
-   vp->v_data = 0;
+   vp->v_data = NULL;
return (0);
 }
 



vnode lock: remove VLOCKSWORK flag

2021-10-15 Thread Sebastien Marie
Hi,

The following diff removes VLOCKSWORK flag.

This flag is currently used to mark or unmark a vnode so that vnode
locking semantics are actively checked (when compiled with VFSLCKDEBUG).

Currently, the VLOCKSWORK flag isn't properly set for several FS
implementations which have full locking support, notably:
 - cd9660
 - udf
 - fuse
 - msdosfs
 - tmpfs

Instead of using a particular flag, I propose to directly check whether
v_op->vop_islocked is nullop to decide whether the vnode locking checks
are active.

Some alternative methods might be possible, like having a specific
member inside struct vops. But it would only duplicate the fact that
nullop is used as the lock mechanism.

I also slightly changed the ASSERT_VP_ISLOCKED(vp) macro:
- evaluate the vp argument only once
- explicitly check if VOP_ISLOCKED() != LK_EXCLUSIVE (it might return an
  error or 'locked by someone else', which doesn't mean "locked by me")
- show the VOP_ISLOCKED return code in the panic message
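
As a userland illustration of the points above (a sketch only, not the
kernel code: `struct obj`, `struct ops`, `noop`, `real_islocked`,
`LOCKED_BY_ME` and `CHECK_LOCKED` are hypothetical names, and a counter
stands in for panic(9)), the check can be skipped by comparing the
function pointer with the shared stub, and the macro argument is
evaluated only once:

```c
#include <stddef.h>

#define LOCKED_BY_ME 1	/* hypothetical stand-in for LK_EXCLUSIVE */

struct obj;

struct ops {
	int (*islocked)(struct obj *);
};

struct obj {
	const struct ops *op;
	int lockstate;
};

/* Shared stub, used by objects that have no real lock. */
static int
noop(struct obj *o)
{
	(void)o;
	return 0;
}

/* Lock status of an object that does implement locking. */
static int
real_islocked(struct obj *o)
{
	return o->lockstate;
}

static const struct ops nolock_ops = { .islocked = noop };
static const struct ops lock_ops = { .islocked = real_islocked };

/* Counts failed checks; the kernel macro would panic instead. */
static int check_failures;

#define CHECK_LOCKED(o) do {						\
	struct obj *_o = (o);	/* evaluate the argument only once */	\
	int _r;								\
	if (_o->op->islocked == noop)					\
		break;		/* no locking: nothing to check */	\
	if ((_r = _o->op->islocked(_o)) != LOCKED_BY_ME)		\
		check_failures++;					\
} while (0)
```

With this shape, CHECK_LOCKED(arr[i++]) increments i exactly once, and an
object using nolock_ops never counts as a failure.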

Some code uses ASSERT_VP_ISLOCKED()-like checks. I kept those simple.

The direct impact on snapshots should be low as VFSLCKDEBUG isn't set
by default.

Comments or OK ?
-- 
Sebastien Marie


diff e44725a8dd99f82f94f37ecff5c0e710c4dba97e 
/home/semarie/repos/openbsd/sys-clean
blob - c752dd99e9ef62b05162cfeda67913ab5bccf06e
file + kern/vfs_subr.c
--- kern/vfs_subr.c
+++ kern/vfs_subr.c
@@ -1075,9 +1075,6 @@ vclean(struct vnode *vp, int flags, struct proc *p)
vp->v_op = _vops;
VN_KNOTE(vp, NOTE_REVOKE);
vp->v_tag = VT_NON;
-#ifdef VFSLCKDEBUG
-   vp->v_flag &= ~VLOCKSWORK;
-#endif
mtx_enter(_mtx);
vp->v_lflag &= ~VXLOCK;
if (vp->v_lflag & VXWANT) {
@@ -1930,7 +1927,7 @@ vinvalbuf(struct vnode *vp, int flags, struct ucred *c
int s, error;
 
 #ifdef VFSLCKDEBUG
-   if ((vp->v_flag & VLOCKSWORK) && !VOP_ISLOCKED(vp))
+   if ((vp->v_op->vop_islocked != nullop) && !VOP_ISLOCKED(vp))
panic("%s: vp isn't locked, vp %p", __func__, vp);
 #endif
 
blob - caf2dc327bfc2f5a001bcee80edd90938497ef99
file + kern/vfs_vops.c
--- kern/vfs_vops.c
+++ kern/vfs_vops.c
@@ -48,11 +48,15 @@
 #include 
 
 #ifdef VFSLCKDEBUG
-#define ASSERT_VP_ISLOCKED(vp) do {\
-   if (((vp)->v_flag & VLOCKSWORK) && !VOP_ISLOCKED(vp)) { \
-   VOP_PRINT(vp);  \
-   panic("vp not locked"); \
-   }   \
+#define ASSERT_VP_ISLOCKED(vp) do {\
+   struct vnode *_vp = (vp);   \
+   int r;  \
+   if (_vp->v_op->vop_islocked == nullop)  \
+   break;  \
+   if ((r = VOP_ISLOCKED(_vp)) != LK_EXCLUSIVE) {  \
+   VOP_PRINT(_vp); \
+   panic("%s: vp not locked, vp %p, %d", __func__, _vp, r);\
+   }   \
 } while (0)
 #else
 #define ASSERT_VP_ISLOCKED(vp)  /* nothing */
blob - 81b900e83d2071d8450f35cfae42c6cb91f1a414
file + nfs/nfs_node.c
--- nfs/nfs_node.c
+++ nfs/nfs_node.c
@@ -133,9 +133,6 @@ loop:
}
 
vp = nvp;
-#ifdef VFSLCKDEBUG
-   vp->v_flag |= VLOCKSWORK;
-#endif
rrw_init_flags(&np->n_lock, "nfsnode", RWL_DUPOK | RWL_IS_VNODE);
vp->v_data = np;
/* we now have an nfsnode on this vnode */
blob - 3668f954a9aab3fd49ed5e41e7d4ab51b4bf0a90
file + sys/vnode.h
--- sys/vnode.h
+++ sys/vnode.h
@@ -146,8 +146,7 @@ struct vnode {
 #defineVCLONED 0x0400  /* vnode was cloned */
 #defineVALIASED0x0800  /* vnode has an alias */
 #defineVLARVAL 0x1000  /* vnode data not yet set up by higher 
level */
-#defineVLOCKSWORK  0x4000  /* FS supports locking discipline */
-#defineVCLONE  0x8000  /* vnode is a clone */
+#defineVCLONE  0x4000  /* vnode is a clone */
 
 /*
  * (v_bioflag) Flags that may be manipulated by interrupt handlers
blob - d859d216b40ebb2f5cce1eb5cf0becbfff21a638
file + ufs/ext2fs/ext2fs_subr.c
--- ufs/ext2fs/ext2fs_subr.c
+++ ufs/ext2fs/ext2fs_subr.c
@@ -170,9 +170,6 @@ ext2fs_vinit(struct mount *mp, struct vnode **vpp)
nvp->v_data = vp->v_data;
vp->v_data = NULL;
vp->v_op = _vops;
-#ifdef VFSLCKDEBUG
-   vp->v_flag &= ~VLOCKSWORK;
-#endif
vrele(vp);
vgone(vp);
/* Reinitialize aliased vnode. */
blob - 2aedef06acfdc500a4019c3f75a986b648c5b36a
file + ufs/ffs/ffs_subr.

vnode lock: avoid manipulating vnode lock directly

2021-10-15 Thread Sebastien Marie
Hi,

The following diff is still without intended behaviour changes.

The first chunk replaces a direct vp->v_op->vop_lock() call with the
VOP_LOCK() wrapper. It only adds a safety check on vop_lock being
NULL (and MUTEX_ASSERT_UNLOCKED on vnode_mtx).

The other chunks replace several direct manipulations of the vnode lock
with VOP_LOCK / VOP_UNLOCK calls (instead of rrw_enter / rrw_exit). I
preferred VOP_LOCK() over vn_lock() to keep the code bug-for-bug
equivalent (if any).

Comments or OK ?
-- 
Sebastien Marie


diff e44725a8dd99f82f94f37ecff5c0e710c4dba97e 
/home/semarie/repos/openbsd/sys-clean
blob - a2a4643c4649ece502b8af46328cd953a7a93450
file + miscfs/deadfs/dead_vnops.c
--- miscfs/deadfs/dead_vnops.c
+++ miscfs/deadfs/dead_vnops.c
@@ -227,7 +227,7 @@ dead_lock(void *v)
if (ap->a_flags & LK_DRAIN || !chkvnlock(vp))
return (0);
 
-   return ((vp->v_op->vop_lock)(ap));
+   return VOP_LOCK(vp, ap->a_flags);
 }
 
 /*
blob - 38ca5b3e196592ea7d5a66d1d1bef52f99d9ecb2
file + isofs/cd9660/cd9660_node.c
--- isofs/cd9660/cd9660_node.c
+++ isofs/cd9660/cd9660_node.c
@@ -140,7 +140,7 @@ cd9660_ihashins(struct iso_node *ip)
*ipp = ip;
/* XXX locking unlock hash list? */
 
-   rrw_enter(&ip->i_lock, RW_WRITE);
+   VOP_LOCK(ITOV(ip), LK_EXCLUSIVE);
 
return (0);
 }
blob - 81b900e83d2071d8450f35cfae42c6cb91f1a414
file + nfs/nfs_node.c
--- nfs/nfs_node.c
+++ nfs/nfs_node.c
@@ -146,7 +146,7 @@ loop:
bcopy(fh, np->n_fhp, fhsize);
np->n_fhsize = fhsize;
/* lock the nfsnode, then put it on the rbtree */
-   rrw_enter(&np->n_lock, RW_WRITE);
+   VOP_LOCK(vp, LK_EXCLUSIVE);
np2 = RBT_INSERT(nfs_nodetree, >nm_ntree, np);
KASSERT(np2 == NULL);
np->n_accstamp = -1;
blob - 2fef0bc139da82177ec809f76c30da4afef249f3
file + ufs/ufs/ufs_ihash.c
--- ufs/ufs/ufs_ihash.c
+++ ufs/ufs/ufs_ihash.c
@@ -137,7 +137,7 @@ ufs_ihashins(struct inode *ip)
ufsino_t inum = ip->i_number;
 
/* lock the inode, then put it on the appropriate hash list */
-   rrw_enter(&ip->i_lock, RW_WRITE);
+   VOP_LOCK(ITOV(ip), LK_EXCLUSIVE);
 
/* XXXLOCKING lock hash list */
 
@@ -145,7 +145,7 @@ ufs_ihashins(struct inode *ip)
LIST_FOREACH(curip, ipp, i_hash) {
if (inum == curip->i_number && dev == curip->i_dev) {
/* XXXLOCKING unlock hash list? */
-   rrw_exit(&ip->i_lock);
+   VOP_UNLOCK(ITOV(ip));
return (EEXIST);
}
}



patch: vnode lock: remove vop_generic_{,is,un}lock functions

2021-10-13 Thread Sebastien Marie
Hi,

The following diff removes vop_generic_{,un,is}lock functions.

These functions are only stubs (returning 0). Replace them with the
nullop function (same behaviour). There are no intended behaviour
changes.

While here, I reordered some vop_islocked members in structs to be next
to the other vop_{,un}lock members.

Note that I intend to reintroduce vop_generic_{,un,is}lock functions
later, but for now it is simpler to just remove them.

Comments or OK ?
-- 
Sebastien Marie


diff 5543f5ef435017650e5c7febf3b39d036a3c0b60 /home/semarie/repos/openbsd/src
blob - c018508380a9c91644585eec77e5070cf0c4f00c
file + sys/kern/spec_vnops.c
--- sys/kern/spec_vnops.c
+++ sys/kern/spec_vnops.c
@@ -89,9 +89,9 @@ const struct vops spec_vops = {
.vop_abortop= vop_generic_badop,
.vop_inactive   = spec_inactive,
.vop_reclaim= nullop,
-   .vop_lock   = vop_generic_lock,
-   .vop_unlock = vop_generic_unlock,
-   .vop_islocked   = vop_generic_islocked,
+   .vop_lock   = nullop,
+   .vop_unlock = nullop,
+   .vop_islocked   = nullop,
.vop_bmap   = vop_generic_bmap,
.vop_strategy   = spec_strategy,
.vop_print  = spec_print,
blob - b661ba724de5453b6489d74935f3155ba7771de9
file + sys/kern/vfs_default.c
--- sys/kern/vfs_default.c
+++ sys/kern/vfs_default.c
@@ -167,37 +167,6 @@ vop_generic_abortop(void *v)
return (0);
 }
 
-/*
- * Stubs to use when there is no locking to be done on the underlying object.
- * A minimal shared lock is necessary to ensure that the underlying object
- * is not revoked while an operation is in progress. So, an active shared
- * count should be maintained in an auxiliary vnode lock structure. However,
- * that's not done now.
- */
-int
-vop_generic_lock(void *v)
-{
-   return (0);
-}
- 
-/*
- * Decrement the active use count. (Not done currently)
- */
-int
-vop_generic_unlock(void *v)
-{
-   return (0);
-}
-
-/*
- * Return whether or not the node is in use. (Not done currently)
- */
-int
-vop_generic_islocked(void *v)
-{
-   return (0);
-}
-
 const struct filterops generic_filtops = {
.f_flags= FILTEROP_ISFD,
.f_attach   = NULL,
blob - 65ef86619a77d7a6858595757eb52a4308604ebb
file + sys/kern/vfs_sync.c
--- sys/kern/vfs_sync.c
+++ sys/kern/vfs_sync.c
@@ -267,9 +267,9 @@ const struct vops sync_vops = {
.vop_fsync  = sync_fsync,
.vop_inactive   = sync_inactive,
.vop_reclaim= nullop,
-   .vop_lock   = vop_generic_lock,
-   .vop_unlock = vop_generic_unlock,
-   .vop_islocked   = vop_generic_islocked,
+   .vop_lock   = nullop,
+   .vop_unlock = nullop,
+   .vop_islocked   = nullop,
.vop_print  = sync_print
 };
 
blob - a2a4643c4649ece502b8af46328cd953a7a93450
file + sys/miscfs/deadfs/dead_vnops.c
--- sys/miscfs/deadfs/dead_vnops.c
+++ sys/miscfs/deadfs/dead_vnops.c
@@ -89,11 +89,11 @@ const struct vops dead_vops = {
.vop_inactive   = dead_inactive,
.vop_reclaim= nullop,
.vop_lock   = dead_lock,
-   .vop_unlock = vop_generic_unlock,
+   .vop_unlock = nullop,
+   .vop_islocked   = nullop,
.vop_bmap   = dead_bmap,
.vop_strategy   = dead_strategy,
.vop_print  = dead_print,
-   .vop_islocked   = vop_generic_islocked,
.vop_pathconf   = dead_ebadf,
.vop_advlock= dead_ebadf,
.vop_bwrite = nullop,
blob - f2d49e4322df91b95dbe4ae650cdc9abee4bd1ef
file + sys/miscfs/fifofs/fifo_vnops.c
--- sys/miscfs/fifofs/fifo_vnops.c
+++ sys/miscfs/fifofs/fifo_vnops.c
@@ -91,12 +91,12 @@ const struct vops fifo_vops = {
.vop_abortop= vop_generic_badop,
.vop_inactive   = fifo_inactive,
.vop_reclaim= fifo_reclaim,
-   .vop_lock   = vop_generic_lock,
-   .vop_unlock = vop_generic_unlock,
+   .vop_lock   = nullop,
+   .vop_unlock = nullop,
+   .vop_islocked   = nullop,
.vop_bmap   = vop_generic_bmap,
.vop_strategy   = vop_generic_badop,
.vop_print  = fifo_print,
-   .vop_islocked   = vop_generic_islocked,
.vop_pathconf   = fifo_pathconf,
.vop_advlock= fifo_advlock,
.vop_bwrite = nullop
blob - fa334e23c17fe3ad5ef07a32f5b25807d7225ae8
file + sys/ntfs/ntfs_vnops.c
--- sys/ntfs/ntfs_vnops.c
+++ sys/ntfs/ntfs_vnops.c
@@ -668,9 +668,9 @@ const struct vops ntfs_vops = {
.vop_reclaim= ntfs_reclaim,
.vop_print  = ntfs_print,
.vop_pathconf   = ntfs_pathconf,
-   .vop_islocked   = vop_generic_islocked,
-   .vop_unlock = vop_generic_unlock,
-   .vop_lock   = vop_generic_lock,
+   .vop_islocked   = nullop,
+   .vop_unlock = nullop,
+   .vop_lock   = nullop,
.vop_lookup = ntfs_lookup,
.vop_access = ntfs_access,
.vop_close  = ntfs_close,
blob

Re: patch: remove dead variable from sys___realpath()

2021-10-02 Thread Sebastien Marie
On Sat, Oct 02, 2021 at 12:31:29PM +0200, Sebastien Marie wrote:
> Hi,
> 
> The following diff removes a dead variable `c' which is not used.
> 
> it is a leftover from LOCKPARENT removal in NDINIT().
> 
> See:
>  - 
> https://github.com/openbsd/src/commit/0eff3f09cea339207a839f716e75649e24b667b9
>  - 
> https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/kern/vfs_syscalls.c.diff?r1=1.349&r2=1.350

cvsweb is hard. it is:
  
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/kern/vfs_syscalls.c.diff?r1=1.336&r2=1.337

-- 
Sebastien Marie



patch: remove dead variable from sys___realpath()

2021-10-02 Thread Sebastien Marie
Hi,

The following diff removes a dead variable `c' which is not used.

It is a leftover from LOCKPARENT removal in NDINIT().

See:
 - 
https://github.com/openbsd/src/commit/0eff3f09cea339207a839f716e75649e24b667b9
 - 
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/kern/vfs_syscalls.c.diff?r1=1.349&r2=1.350

Comments or OK ?
-- 
Sebastien Marie

 
diff 1006e50b83c18340095bb2af9ff6ea6f01c9a107 
72e4555fa87bffa757ba4a351ce114e513650faf
blob - 3a5db48992eac3fb8adaf98e1ed4db186d76ab33
blob + 7e8dbd9bd21f2d3646d6c9fdafb0f5c289174f1f
--- sys/kern/vfs_syscalls.c
+++ sys/kern/vfs_syscalls.c
@@ -862,7 +862,7 @@ sys___realpath(struct proc *p, void *v, register_t *re
syscallarg(const char *) pathname;
syscallarg(char *) resolved;
} */ *uap = v;
-   char *pathname, *c;
+   char *pathname;
char *rpbuf;
struct nameidata nd;
size_t pathlen;
@@ -916,11 +916,6 @@ sys___realpath(struct proc *p, void *v, register_t *re
free(cwdbuf, M_TEMP, cwdlen);
}
 
-   /* find root "/" or "//" */
-   for (c = pathname; *c != '\0'; c++) {
-   if (*c != '/')
-   break;
-   }
NDINIT(&nd, LOOKUP, FOLLOW | SAVENAME | REALPATH, UIO_SYSSPACE,
pathname, p);
 



patch: vfs: merge *_badop to vop_generic_badop

2021-10-02 Thread Sebastien Marie
Hi,

The following diff was suggested by mpi@ some months ago.

It replaces spec_badop, fifo_badop, dead_badop and mfs_badop, which
are only calls to panic(9), with a single function,
vop_generic_badop().

No intended behaviour changes (apart from the panic message, which isn't
the same).

Comments or OK ?
-- 
Sebastien Marie


diff 5dc4728f64d299884a31fd6bf0344fc428f17017 /home/semarie/repos/openbsd/sys
blob - 06ab7114aa889fb2ceca3a3a7105acf017fa4396
file + isofs/cd9660/cd9660_vnops.c
--- isofs/cd9660/cd9660_vnops.c
+++ isofs/cd9660/cd9660_vnops.c
@@ -865,8 +865,8 @@ const struct vops cd9660_specvops = {
 
/* XXX: Keep in sync with spec_vops. */
.vop_lookup = vop_generic_lookup,
-   .vop_create = spec_badop,
-   .vop_mknod  = spec_badop,
+   .vop_create = vop_generic_badop,
+   .vop_mknod  = vop_generic_badop,
.vop_open   = spec_open,
.vop_close  = spec_close,
.vop_read   = spec_read,
@@ -876,15 +876,15 @@ const struct vops cd9660_specvops = {
.vop_kqfilter   = spec_kqfilter,
.vop_revoke = vop_generic_revoke,
.vop_fsync  = spec_fsync,
-   .vop_remove = spec_badop,
-   .vop_link   = spec_badop,
-   .vop_rename = spec_badop,
-   .vop_mkdir  = spec_badop,
-   .vop_rmdir  = spec_badop,
-   .vop_symlink= spec_badop,
-   .vop_readdir= spec_badop,
-   .vop_readlink   = spec_badop,
-   .vop_abortop= spec_badop,
+   .vop_remove = vop_generic_badop,
+   .vop_link   = vop_generic_badop,
+   .vop_rename = vop_generic_badop,
+   .vop_mkdir  = vop_generic_badop,
+   .vop_rmdir  = vop_generic_badop,
+   .vop_symlink= vop_generic_badop,
+   .vop_readdir= vop_generic_badop,
+   .vop_readlink   = vop_generic_badop,
+   .vop_abortop= vop_generic_badop,
.vop_bmap   = vop_generic_bmap,
.vop_strategy   = spec_strategy,
.vop_pathconf   = spec_pathconf,
@@ -907,8 +907,8 @@ const struct vops cd9660_fifovops = {
 
/* XXX: Keep in sync with fifo_vops. */
.vop_lookup = vop_generic_lookup,
-   .vop_create = fifo_badop,
-   .vop_mknod  = fifo_badop,
+   .vop_create = vop_generic_badop,
+   .vop_mknod  = vop_generic_badop,
.vop_open   = fifo_open,
.vop_close  = fifo_close,
.vop_read   = fifo_read,
@@ -918,17 +918,17 @@ const struct vops cd9660_fifovops = {
.vop_kqfilter   = fifo_kqfilter,
.vop_revoke = vop_generic_revoke,
.vop_fsync  = nullop,
-   .vop_remove = fifo_badop,
-   .vop_link   = fifo_badop,
-   .vop_rename = fifo_badop,
-   .vop_mkdir  = fifo_badop,
-   .vop_rmdir  = fifo_badop,
-   .vop_symlink= fifo_badop,
-   .vop_readdir= fifo_badop,
-   .vop_readlink   = fifo_badop,
-   .vop_abortop= fifo_badop,
+   .vop_remove = vop_generic_badop,
+   .vop_link   = vop_generic_badop,
+   .vop_rename = vop_generic_badop,
+   .vop_mkdir  = vop_generic_badop,
+   .vop_rmdir  = vop_generic_badop,
+   .vop_symlink= vop_generic_badop,
+   .vop_readdir= vop_generic_badop,
+   .vop_readlink   = vop_generic_badop,
+   .vop_abortop= vop_generic_badop,
.vop_bmap   = vop_generic_bmap,
-   .vop_strategy   = fifo_badop,
+   .vop_strategy   = vop_generic_badop,
.vop_pathconf   = fifo_pathconf,
.vop_advlock= fifo_advlock,
 };
blob - 0e0071ce7194dc5ebf5a59b0c6f14224a6b2b55d
file + kern/spec_vnops.c
--- kern/spec_vnops.c
+++ kern/spec_vnops.c
@@ -64,8 +64,8 @@ struct vnodechain speclisth[SPECHSZ];
 
 const struct vops spec_vops = {
.vop_lookup = vop_generic_lookup,
-   .vop_create = spec_badop,
-   .vop_mknod  = spec_badop,
+   .vop_create = vop_generic_badop,
+   .vop_mknod  = vop_generic_badop,
.vop_open   = spec_open,
.vop_close  = spec_close,
.vop_access = spec_access,
@@ -78,15 +78,15 @@ const struct vops spec_vops = {
.vop_kqfilter   = spec_kqfilter,
.vop_revoke = vop_generic_revoke,
.vop_fsync  = spec_fsync,
-   .vop_remove = spec_badop,
-   .vop_link   = spec_badop,
-   .vop_rename = spec_badop,
-   .vop_mkdir  = spec_badop,
-   .vop_rmdir  = spec_badop,
-   .vop_symlink= spec_badop,
-   .vop_readdir= spec_badop,
-   .vop_readlink   = spec_badop,
-   .vop_abortop= spec_badop,
+   .vop_remove = vop_generic_badop,
+   .vop_link   = vop_generic_badop,
+   .vop_rename = vop_generic_badop,
+   .vop_mkdir  = vop_generic_badop,
+   .vop_rmdir  = vop_generic_badop,
+   .vop_symlink= vop_generic_badop,
+   .vop_readdir= vop_generic_badop

Re: xdpyinfo: can't load library 'libdmx.so.2.0'

2021-09-15 Thread Sebastien Marie
On Wed, Sep 15, 2021 at 05:11:12AM +, Renato Aguiar wrote:
> I noticed this after sysupgrading to latest snapshot:
> 
> $ ldd /usr/X11R6/bin/xdpyinfo
> /usr/X11R6/bin/xdpyinfo:
> ld.so: xdpyinfo: can't load library 'libdmx.so.2.0'
> /usr/X11R6/bin/xdpyinfo: signal 9
> 
> It works fine after manually building and installing xdpyinfo from
> latest xenocara code.
> 

libdmx.so has been removed recently. According to
xenocara/app/xdpyinfo/configure, the dmx library can be detected at
build time.

I suppose the host used to build the snapshots still has the old dmx
library in its path, and the build of xdpyinfo detected it and linked
against it, whereas the library isn't in the sets anymore.

Thanks for the report.
-- 
Sebastien Marie



Re: Change vm_dsize to vsize_t

2021-09-10 Thread Sebastien Marie
On Fri, Sep 10, 2021 at 08:12:58AM -0600, Theo de Raadt wrote:
> > for Rust programs. It should even be free as Rust libc crate doesn't
> > have `struct kinfo_proc` definition. I can't comment on Go side.
> 
> The structure has nothing to do with libc!
> 
> So why would it occur there?

Because Rust libc is where Rust developers put all the OS stuff.

The `struct kinfo_vmentry` or `struct kinfo_proc` on FreeBSD is
defined in the Rust libc crate, for example.

https://github.com/rust-lang/libc/blob/master/src/unix/bsd/freebsdlike/freebsd/mod.rs#L173

-- 
Sebastien Marie



Re: Change vm_dsize to vsize_t

2021-09-10 Thread Sebastien Marie
On Thu, Sep 09, 2021 at 12:01:14PM -0600, Theo de Raadt wrote:
> Stuart Henderson  wrote:
> 
> > On 2021/09/09 06:47, Greg Steuck wrote:
> > > Mark Kettenis  writes:
> > > 
> > > >> From: "Theo de Raadt" 
> > > >> Date: Tue, 07 Sep 2021 07:08:19 -0600
> > > >> 
> > > >> Or we could coordinate the Greg approach as a sysctl ABI change near a
> > > >> libc major bump.  On the other side of such a bump, all kernel + base +
> > > >> packages are updated to use the new storage ABI.  We get orderly .h
> > > >> files without a confusing glitch, and kern_sysctl.c doesn't need to
> > > >> store the value into two fields (32bit and 64bit) for the forseeable
> > > >> future.
> > > >> 
> > > >> Over the years I've arrived at the conclusion that maintaining binary
> > > >> compatibility at all costs collects too much confusing damage.  
> > > >> Instead,
> > > >> we've built an software ecosystem where ABI changes are expected and
> > > >> carry minimal consequence.
> > 
> > Sadly rust and especially go made a different decision about that.
> 
> I don't understand why this matters.
> 
> 3 libraries are cranking tomorrow.  For binary compatibility, people must
> upgrade their packages.
> 
> As to the binaries they built by hand?  Shrug!
> 

Speaking for Rust (which I know better, but I would readily assume Go
did the same), developers easily assume that the world is Linux, and
that backward compatibility is mandatory everywhere.

I had a hard time explaining to them (1) that ABI and API could break on
platforms where they expect it wouldn't (like Windows), and that
there are platforms where ABI/API breaks are normal (like BSDs).

Rust built a lot of code on this (wrong) assumption. The simplest
example is the Rust libc crate (2), which is a one-time translation of
libc headers into Rust. The difficulty is that you can't easily
update definitions in breaking ways (3), because it isn't expected to
happen.

So for us, it means we have to deal with broken definitions in ports
and maintain local patches. But please don't misread me: I am not
asking to avoid breaking changes. I am just explaining why it
matters a bit: the transition might not be free for ports
developers. It just needs some synchronization to measure the potential
impact and prepare patches if needed (like we already do for several
changes in base).

Regarding Rust, a badly prepared transition could impact more than
2000 ports, so ~20% of ports for amd64, with runtime breakages.

$ show-reverse-deps lang/rust | wc -l
2091

This particular change to vm_dsize is not expected to have such an
impact on Rust programs. It should even be free, as the Rust libc crate
doesn't have a `struct kinfo_proc` definition. I can't comment on the
Go side.

Thanks.


(1) Extend Rust target specification to follow more closely LLVM triple
specification.
The underlying purpose is to allow the Rust compiler to more easily
target a BSD OS version.

https://github.com/semarie/rust-rfcs/blob/target-extension/text/-target-extension.md#motivation
    https://github.com/rust-lang/rfcs/pull/2048

(2) How to deal with breaking changes on platform ? [BSDs related]
https://github.com/rust-lang/libc/issues/570

(3) List of libc issues labeled as "breakage-candidate"
https://github.com/rust-lang/libc/labels/breakage-candidate

-- 
Sebastien Marie



timeout: execvp(2) return is always an error

2021-09-02 Thread Sebastien Marie
Hi,

If execvp(2) returns, it is always an error: there is no need to check
if the return value is -1. Just unconditionally call err(3).
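
To illustrate the reasoning (a minimal userland sketch, not the
timeout(1) code: `run_child` is a hypothetical helper, and it uses
_exit(127) where timeout(1) calls err(3)), any statement placed after
the exec call is reached only on failure, so no return value check is
needed:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] in a child process and return its exit status,
 * or -1 if fork/waitpid failed or the child was killed by a signal.
 */
static int
run_child(char *const argv[])
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		/* Only reached when the exec failed: report it unconditionally. */
		_exit(127);
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child() with a program that cannot be executed yields 127 from
the fallthrough path, while a successful exec never returns to that line.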

Comments or OK ?
-- 
Sebastien Marie


 timeout: execvp(2) should not return except on error
 
diff 7821223d9093e8d64d2bab7db6b91e28360fdab3 
e4e5edd4879c3b7829d67cfb2b2fa1f7064c7b60
blob - 768df18bc0d2acc1b2ef46ea2693e4ef3c1d9d3a
blob + e45382c8f28e58754a579a6491580ebaab1d9a26
--- usr.bin/timeout/timeout.c
+++ usr.bin/timeout/timeout.c
@@ -165,7 +165,7 @@ main(int argc, char **argv)
int ch;
unsigned long   i;
int foreground = 0, preserve = 0;
-   int error, pstat, status;
+   int pstat, status;
int killsig = SIGTERM;
pid_t   pgid = 0, pid, cpid = 0;
double  first_kill;
@@ -251,9 +251,8 @@ main(int argc, char **argv)
signal(SIGTTIN, SIG_DFL);
signal(SIGTTOU, SIG_DFL);
 
-   error = execvp(argv[0], argv);
-   if (error == -1)
-   err(1, "execvp");
+   execvp(argv[0], argv);
+   err(1, "execvp");
}
 
if (pledge("stdio", NULL) == -1)



atactl: remove few printf("%s", NULL)

2021-09-02 Thread Sebastien Marie
Hi,

While looking at my worktree, I found some old changes I have locally
which could be committed.

Here is one to remove a few printf("%s", NULL) I saw a long time ago in
atactl. I don't remember exactly when I saw them, which field triggered
the syslog entry, or whether I would still see them.
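
The idea of the diff, as a standalone sketch (the `struct valinfo` layout
and the `demo_states` table are assumptions made for illustration; only
the valtostr() shape follows the diff below): the lookup takes a default
string, so a caller never passes NULL to a "%s" conversion.

```c
#include <stddef.h>

/* Assumed layout: a string paired with the value it names. */
struct valinfo {
	const char *string;
	int value;
};

/* Hypothetical table; an entry with a NULL string terminates it. */
static const struct valinfo demo_states[] = {
	{ "idle", 0 },
	{ "running", 1 },
	{ NULL, 0 }
};

/*
 * Return the string associated with val, or def when no entry
 * matches, so the result is safe to print with "%s".
 */
static const char *
valtostr(int val, const struct valinfo *vinfo, const char *def)
{
	for (; vinfo->string != NULL; vinfo++)
		if (val == vinfo->value)
			return vinfo->string;
	return def;
}
```

For example, valtostr(7, demo_states, "?") yields "?" instead of NULL.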

Comments or OK ?
-- 
Sebastien Marie



diff c9d76f17cafb34f4739ba697be46a54e300798de 
84d6df4e4168c371fb46e81b1c44c947a6b0b3f6
blob - 625e16f3087d2400445c1f69d41eb130703758b9
blob + b8858f311a522577502b0faf15a1c121e3525f8c
--- sbin/atactl/atactl.c
+++ sbin/atactl/atactl.c
@@ -74,7 +74,7 @@ __dead void usage(void);
 void ata_command(struct atareq *);
 void print_bitinfo(const char *, u_int, struct bitinfo *);
 int  strtoval(const char *, struct valinfo *);
-const char *valtostr(int, struct valinfo *);
+const char *valtostr(int, struct valinfo *, const char *);
 
 intfd; /* file descriptor for device */
 
@@ -453,15 +453,15 @@ strtoval(const char *str, struct valinfo *vinfo)
 /*
  * valtostr():
  *returns string associated with given value,
- *if no string found NULL is returned.
+ *if no string found def value is returned.
  */
 const char *
-valtostr(int val, struct valinfo *vinfo)
+valtostr(int val, struct valinfo *vinfo, const char *def)
 {
for (; vinfo->string != NULL; vinfo++)
if (val == vinfo->value)
return (vinfo->string);
-   return (NULL);
+   return (def);
 }
 
 /*
@@ -1338,14 +1338,14 @@ device_smart_read(int argc, char *argv[])
 
printf("Off-line data collection:\n");
printf("status: %s\n",
-   valtostr(data.offstat & 0x7f, smart_offstat));
+   valtostr(data.offstat & 0x7f, smart_offstat, "?"));
printf("activity completion time: %d seconds\n",
letoh16(data.time));
printf("capabilities:\n");
print_bitinfo("\t%s\n", data.offcap, smart_offcap);
printf("Self-test execution:\n");
printf("status: %s\n", valtostr(SMART_SELFSTAT_STAT(data.selfstat),
-   smart_selfstat));
+   smart_selfstat, "?"));
if (SMART_SELFSTAT_STAT(data.selfstat) == SMART_SELFSTAT_PROGRESS)
printf("remains %d%% of total time\n",
SMART_SELFSTAT_PCNT(data.selfstat));
@@ -1511,7 +1511,7 @@ device_smart_readlog(int argc, char *argv[])
printf("status: %s\n",
valtostr(SMART_SELFSTAT_STAT(
data->desc[i].selfstat),
-   smart_selfstat));
+   smart_selfstat, "?"));
printf("timestamp: %d\n",
MAKEWORD(data->desc[i].time1,
 data->desc[i].time2));
@@ -1551,7 +1551,7 @@ smart_print_errdata(struct smart_log_errdata *data)
printf("LBA High register: 0x%x\n", data->err.reg_lbahi);
printf("device register: 0x%x\n", data->err.reg_dev);
printf("status register: 0x%x\n", data->err.reg_stat);
-   printf("state: %s\n", valtostr(data->err.state, smart_logstat));
+   printf("state: %s\n", valtostr(data->err.state, smart_logstat, 
"?"));
printf("timestamp: %d\n", MAKEWORD(data->err.time1,
   data->err.time2));
printf("history:\n");
@@ -1643,9 +1643,8 @@ device_attr(int argc, char *argv[])
printf("ID\tAttribute name\t\t\tThreshold\tValue\tRaw\n");
for (i = 0; i < 30; i++) {
if (thr[i].id != 0 && thr[i].id == attr[i].id) {
-   attr_name = valtostr(thr[i].id, ibm_attr_names);
-   if (attr_name == NULL)
-   attr_name = "Unknown";
+   attr_name = valtostr(thr[i].id, ibm_attr_names,
+   "Unknown");
 
for (k = 0; k < 6; k++) {
u_int8_t b;



Re: [Patch] Document /upgrade.site in sysupgrade(8) man page

2021-08-28 Thread Sebastien Marie
On Sat, Aug 28, 2021 at 05:05:18PM +, Klemens Nanni wrote:
> On Sat, Aug 28, 2021 at 10:44:48AM -0500, Aaron Poffenberger wrote:
> > Based on conversations in another thread, here's a patch documenting
> > use of /upgrade.site in the sysupgrade(8) man page.
> > 
> > The revised doc references /upgrade.site and includes examples for
> > updating packages from Sebastien Marie.
> 
> Documenting is the right approach, imho (I didn't even know about
> $MODE.site) but this should probably be done in autoinstall(8).
> 
> This feature has nothing to do with sysupgrade per se and next to
> upgrade.site there's also install.site.

$MODE.site isn't specifically related to autoinstall(8) either :)

> I'd amend autoinstall(8) and briefly mention it in sysupgrade(8), just
> via EXAMPLES or so to avoid duplication but showing a neat usecase.

Currently, these scripts seem to be documented only in the FAQ
(https://www.openbsd.org/faq/faq4.html#site), so having some
additional references to them in a few man pages would be good.

Having examples in sysupgrade(8) and in autoinstall(8) makes sense to
me.

FAQ could be expanded a bit too.

Thanks.
-- 
Sebastien Marie



Re: [Patch] - Add -u (update packages) to sysupgrade(8)

2021-08-28 Thread Sebastien Marie
On Fri, Aug 27, 2021 at 08:17:51PM -0500, Aaron Poffenberger wrote:
> Following is patch to add a flag to upgrade packages during
> rc.firsttime after a sysupgrade.
> 

If you need this flag, is it for occasional use (running sysupgrade
with or without the flag, depending on the moment)? Or will you use it
every time you use sysupgrade?

For permanent use, I would recommend using the existing facility: the
/upgrade.site script.

For example:

# cat /upgrade.site
#!/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin

# upgrade packages
echo 'pkg_add -Iu' >>/etc/rc.firsttime

# run sysclean (if installed)
echo '[ -x /usr/local/sbin/sysclean ] && /usr/local/sbin/sysclean | mail -Es sysclean root &' >>/etc/rc.firsttime

exit 0
#

The bsd.rd upgrade will run /upgrade.site (from the target root disk) at
the end of the upgrade process (it is run chrooted in the target root
disk). During the upgrade it adds lines to /etc/rc.firsttime, and at
reboot "pkg_add -Iu" and "sysclean" (mailing output to root) will be
run, each time.

It would be more convenient than using a flag at sysupgrade invocation
(no risk of omitting it).

Thanks.
-- 
Sebastien Marie



Re: allow KARL with config(8)'d kernels

2021-08-25 Thread Sebastien Marie
On Tue, Aug 24, 2021 at 01:53:41PM +0200, Paul de Weerd wrote:
> I have a new machine where I'd like to use IPMI.  Of course, doing
> `config -e -f /bsd` will break KARL, so I tried to find a minimal way
> of supporting this.  Done by introducing a new config file,
> /etc/kernel.conf, which gets applied to the kernel reorder_kernel
> builds and installs.

I like it: it is simple.

But here are some thoughts (with questions and some answers; it is
open for discussion):

- does it integrate well with syspatch ?

  yes, syspatch will call /usr/libexec/reorder_kernel (and the
  configuration will be applied)


- after install or upgrade, does the kernel installed (by the installer)
  have the configuration applied ?

  it seems not: install.sub has its own logic and directly calls
  "make newbsd ; make install" and does not use the reorder_kernel script.


  I could see reasons to avoid it, and reasons to require it.


  For install, no problem: the file is controlled (it doesn't
  exist). It doesn't change anything.

  For upgrade, the file could exist. Should the installer use it or
  not ?


  If using it, it makes the upgrade process dependent on the file: how
  to deal with an invalid file ? should the file be ignored (kernel
  installed but not configured) ?

  how to deal with a format change in config(8) (file on disk in the old
  format, config(8) in use understanding the new format) ? but config(8)
  doesn't change often, and could take care of both formats for one
  release, and/or current.html could mention the required changes.


  If not using it, is it a problem that the first boot will be
  different from the next boots ?

  I could imagine some changes made to make the machine bootable, so the
  first boot could lead to an unbootable machine (but it isn't different
  from now).


Questions are open: I have no problem with using or not using the
configuration file on upgrade, but I think it should be documented (to
avoid questions and/or surprises).

Thanks
-- 
Sebastien Marie



> Index: reorder_kernel.sh
> ===
> RCS file: /home/OpenBSD/cvs/src/libexec/reorder_kernel/reorder_kernel.sh,v
> retrieving revision 1.9
> diff -u -p -r1.9 reorder_kernel.sh
> --- reorder_kernel.sh 28 Sep 2019 17:30:07 -  1.9
> +++ reorder_kernel.sh 24 Aug 2021 07:01:10 -
> @@ -63,6 +63,7 @@ fi
>  
>  cd $KERNEL_DIR/$KERNEL
>  make newbsd
> +[ -f /etc/kernel.conf ] && config -e -c /etc/kernel.conf -f bsd
>  make newinstall
>  sync
>  
> Index: Makefile
> ===
> RCS file: /home/OpenBSD/cvs/src/libexec/reorder_kernel/Makefile,v
> retrieving revision 1.1
> diff -u -p -r1.1 Makefile
> --- Makefile  21 Aug 2017 21:24:11 -  1.1
> +++ Makefile  24 Aug 2021 07:23:38 -
> @@ -1,6 +1,7 @@
>  #$OpenBSD: Makefile,v 1.1 2017/08/21 21:24:11 rpe Exp $
>  
>  SCRIPT=  reorder_kernel.sh
> +MAN= kernel.conf.5
>  
>  realinstall:
>   ${INSTALL} ${INSTALL_COPY} -o ${BINOWN} -g ${BINGRP} -m ${BINMODE} \
> Index: kernel.conf.5
> ===
> RCS file: kernel.conf.5
> diff -N kernel.conf.5
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ kernel.conf.5 24 Aug 2021 07:23:07 -
> @@ -0,0 +1,46 @@
> +.\"  $OpenBSD$
> +.\"
> +.\" Copyright (c) 2021 Paul de Weerd 
> +.\"
> +.\" Permission to use, copy, modify, and distribute this software for any
> +.\" purpose with or without fee is hereby granted, provided that the above
> +.\" copyright notice and this permission notice appear in all copies.
> +.\"
> +.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
> +.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> +.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> +.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> +.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> +.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> +.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> +.\"
> +.Dd $Mdocdate: August 24 2021 $
> +.Dt KERNEL.CONF 5
> +.Os
> +.Sh NAME
> +.Nm kernel.conf
> +.Nd kernel configuration file
> +.Sh DESCRIPTION
> +The
> +.Nm
> +file contains configuration information for the kernel.
> +If present, it is used during system startup to configure the kernel
> +that will be running at the next boot.
> +It can be used to enable or disable specific devices in the kernel.
> +.Sh EXAMPLES
> +To

Re: dhcpleased/resolved: omit ISP DNS in DHCP offer

2021-07-18 Thread Sebastien Marie
On Sun, Jul 18, 2021 at 07:10:06PM +0200, stolen data wrote:
> After reading the man pages of the replacements for dhclient(8) it looks
> like it will no longer be possible to prevent DNS entries in a DHCP offer
> from ending up in the resolv.conf - as is practically always the case with
> a residential Internet connection with DHCP. I don't want my ISP's DNS
> servers *anywhere* in my resolv.conf, so I tell dhclient(8) to SUPERSEDE
> DOMAIN-NAME-SERVERS, but now it looks like OpenBSD will no longer allow me
> the choice and instead decides for me that DNS entries in DHCPOFFER must
> always be included and made available for eventual use. Why? This is taking
> control out of the hands of users and I cannot see any sense in the
> rationale.

You don't want /etc/resolv.conf to be controlled by dhcp ? Just disable
the resolvd daemon, and /etc/resolv.conf is yours.

$ doas rcctl stop resolvd
$ doas rcctl disable resolvd

alternatively, if you are using unwind(8), resolvd(8) will
automatically use unwind nameserver instead of dhcp/slaacd ones.

Regards.
-- 
Sebastien Marie



patch: __realpath: no need of LOCKLEAF

2021-06-25 Thread Sebastien Marie
Hi,

The following diff removes LOCKLEAF from NDINIT. The code
doesn't need it: the returned vnode is released immediately. The
path string is built by the namei() call using REALPATH, during
directory traversal.

Without LOCKLEAF, calling vrele() only is enough if namei() found a
file, instead of calling VOP_UNLOCK() + vrele().

Comments or OK ?
-- 
Sebastien Marie

diff dca1bd8d5621788b92aad3f944ac965773e2 
545f8aacd74a1f793d1289475eb1c3f84d649e06
blob - 840ea5453e13604cb38b6a5d580370053386ce71
blob + fe240bb520b142726f358dfec4eaf6f26de151c1
--- sys/kern/vfs_syscalls.c
+++ sys/kern/vfs_syscalls.c
@@ -916,36 +916,35 @@ sys___realpath(struct proc *p, void *v, register_t *re
}
 
free(cwdbuf, M_TEMP, cwdlen);
}
 
/* find root "/" or "//" */
for (c = pathname; *c != '\0'; c++) {
if (*c != '/')
break;
}
-   NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | SAVENAME | REALPATH,
-   UIO_SYSSPACE, pathname, p);
+   NDINIT(&nd, LOOKUP, FOLLOW | SAVENAME | REALPATH, UIO_SYSSPACE,
+   pathname, p);
 
nd.ni_cnd.cn_rpbuf = rpbuf;
nd.ni_cnd.cn_rpi = strlen(rpbuf);
 
nd.ni_pledge = PLEDGE_RPATH;
nd.ni_unveil = UNVEIL_READ;
if ((error = namei(&nd)) != 0)
goto end;
 
-   /* release lock and reference from namei */
-   if (nd.ni_vp) {
-   VOP_UNLOCK(nd.ni_vp);
+   /* release reference from namei */
+   if (nd.ni_vp)
vrele(nd.ni_vp);
-   }
+
error = copyoutstr(nd.ni_cnd.cn_rpbuf, SCARG(uap, resolved),
MAXPATHLEN, NULL);
 
 #ifdef KTRACE
if (KTRPOINT(p, KTR_NAMEI))
ktrnamei(p, nd.ni_cnd.cn_rpbuf);
 #endif
pool_put(&namei_pool, nd.ni_cnd.cn_pnbuf);
 end:
pool_put(&namei_pool, rpbuf);



Re: Match ps pledge name order with pledge(2)

2021-06-09 Thread Sebastien Marie
On Wed, Jun 09, 2021 at 09:01:34AM -0600, Theo de Raadt wrote:
> Josh Rickmar  wrote:
> 
> > I figure that the manpage is probably the more consulted reference,
> > and the order that is preferred, so the patch below reorders the
> > promise names in pledge.h to match.
> 
> The current array was value-sorted (by the bit value) to allow binary
> search.  However no code is actually using binary search.  Honestly it
> will be hard to maintain this correctly in the future because of the
> symbolic names overlaying the bit values.
> 
> The order of the manual pages has come up in discussion before.  Some folk
> wanted them to be in alphabetic order, but I pushed back, because the order
> we use is better for learning incrementally.
> 
> So we have 3 orders to consider:  bit order, name order, or man page order.
> 
> My gut reaction is to agree -- man page order is the way to go.
> 
> Let's wait a little while and see what others say.

it is fine with me.

thanks.
-- 
Sebastien Marie



Re: ftpd(8): add pledge(2)

2021-05-14 Thread Sebastien Marie
On Fri, May 14, 2021 at 07:29:48AM +0200, Matthias Pressfreund wrote:
> Interesting. How do I figure the correct order of keywords? So far I thought 
> it
> didn't matter.

for the kernel, the order doesn't matter.

for people reviewing code, it matters.
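
To illustrate the first point: two promise strings holding the same
names in a different order describe the same set. A minimal sketch,
assuming a hypothetical helper same_promise_set() (it is not part of
pledge(2), and it is not how the kernel parses the string):

```c
#include <assert.h>
#include <string.h>

/* Return 1 if word (of length len) is a whole token of list. */
static int
has_promise(const char *list, const char *word, size_t len)
{
	const char *p = list;

	while (*p != '\0') {
		size_t n;

		p += strspn(p, " ");	/* skip separators */
		n = strcspn(p, " ");	/* length of current token */
		if (n == len && strncmp(p, word, len) == 0)
			return 1;
		p += n;
	}
	return 0;
}

/* Return 1 if every token of a is also a token of b. */
static int
subset(const char *a, const char *b)
{
	const char *p = a;

	while (*p != '\0') {
		size_t n;

		p += strspn(p, " ");
		n = strcspn(p, " ");
		if (n > 0 && !has_promise(b, p, n))
			return 0;
		p += n;
	}
	return 1;
}

/* Hypothetical helper: same promises, regardless of order. */
int
same_promise_set(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	return subset(a, b) && subset(b, a);
}
```

Applied to the two strings quoted below, it reports them as equal: only
the reader sees a difference.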
 
> On 2021-05-13 18:40, Theo de Raadt wrote:
> > +   if (pledge("stdio rpath inet recvfd sendfd "
> > +   "wpath cpath proc tty getpw", NULL) == 
> > -1)
> > 
> > Please change the order:
> > 
> > stdio rpath wpath cpath inet recvfd sendfd proc tty getpw
> > 
> > (It remains extremely permissive).
> > 
> 

-- 
Sebastien Marie



patch: add support for RTLD_NODELETE

2021-05-10 Thread Sebastien Marie
Hi,

The following diff adds support for RTLD_NODELETE in ld.so(1).

It helps Qt programs, which use RTLD_NODELETE by default when
loading plugins.

Without this patch, qgis (for example) crashes systematically on
exit. With it, it is fine.

Although RTLD_NODELETE isn't POSIX, it is widely deployed: at least
linux, freebsd, dragonfly, netbsd, solaris, illumos, apple, and fuchsia
have it.

I built a full release on i386 with it and built several packages
(most of the dependencies of qgis, which include qt5).

One drawback will be for ports: a build with the diff might change
built code as RTLD_NODELETE will be present in headers. So it might
deserve a libc bump to correctly update installed ports.
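
To make the ports concern concrete: portable code often tests for the
macro at build time, so the same source compiles differently once the
header defines it. A minimal sketch, where plugin_open_mode() is a
hypothetical helper and is not taken from Qt or from any port:

```c
#include <assert.h>
#include <dlfcn.h>

/* Hypothetical helper: compute the dlopen(3) mode for a plugin. */
int
plugin_open_mode(int binding)
{
	int mode;

	/* The caller picks exactly one binding mode. */
	assert(binding == RTLD_LAZY || binding == RTLD_NOW);

	mode = binding | RTLD_LOCAL;
#ifdef RTLD_NODELETE
	/* Only compiled in when <dlfcn.h> provides the flag. */
	mode |= RTLD_NODELETE;
#endif
	return mode;
}
```

A package built before the header change would silently take the
branch without the flag, hence the need to rebuild.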

Comments or OK ?
-- 
Sebastien Marie


diff 393e7b397988bb6abe46729de1794883d2b9d5cf /home/semarie/repos/openbsd/src
blob - 431065f3eab32299ad39766592e72a1765c8e8dc
file + include/dlfcn.h
--- include/dlfcn.h
+++ include/dlfcn.h
@@ -42,6 +42,7 @@
 #define RTLD_GLOBAL    0x100
 #define RTLD_LOCAL     0x000
 #define RTLD_TRACE     0x200
+#define RTLD_NODELETE  0x400
 
 /*
  * Special handle arguments for dlsym().
blob - b8d5512e32bf50351b432a539106b1695a51f10f
file + libexec/ld.so/dlfcn.c
--- libexec/ld.so/dlfcn.c
+++ libexec/ld.so/dlfcn.c
@@ -54,7 +54,7 @@ dlopen(const char *libname, int flags)
int failed = 0;
int obj_flags;
 
-   if (flags & ~(RTLD_TRACE|RTLD_LAZY|RTLD_NOW|RTLD_GLOBAL)) {
+   if (flags & ~(RTLD_TRACE|RTLD_LAZY|RTLD_NOW|RTLD_GLOBAL|RTLD_NODELETE)) {
_dl_errno = DL_INVALID_MODE;
return NULL;
}
@@ -89,6 +89,9 @@ dlopen(const char *libname, int flags)
 
_dl_link_dlopen(object);
 
+   if (flags & RTLD_NODELETE)
+   object->obj_flags |= DF_1_NODELETE;
+   
if (OBJECT_REF_CNT(object) > 1) {
/* if opened but grpsym_vec has not been filled in */
if (object->grpsym_vec.len == 0)
blob - afdf60ff428680eabc76f667442934511a8576fb
file + share/man/man3/dlfcn.3
--- share/man/man3/dlfcn.3
+++ share/man/man3/dlfcn.3
@@ -124,6 +124,19 @@ each of the above values together.
 If an object was opened with RTLD_LOCAL and later opened with RTLD_GLOBAL,
 then it is promoted to RTLD_GLOBAL.
 .Pp
+Additionally, the following flag may be ORed into the mode argument:
+.Pp
+.Bl -tag -width "RTLD_NODELETE" -compact -offset indent
+.It Sy RTLD_NODELETE
+Prevents unload of the loaded object on
+.Fn dlclose .
+The same behaviour may be requested by
+.Fl z
+.Cm nodelete
+option of the static linker
+.Xr ld 1 .
+.El
+.Pp
 The main executable's symbols are normally invisible to
 .Fn dlopen
 symbol resolution.



emutls and dlopen(3) problem - Re: patch: make ld.so aware of pthread_key_create destructor - Re: multimedia/mpv debug vo=gpu crash on exit

2021-05-08 Thread Sebastien Marie
On Thu, May 06, 2021 at 09:32:28AM +0200, Sebastien Marie wrote:
> Hi,
> 
> Anindya, did a good analysis of the problem with mpv using gpu video
> output backend (it is using EGL and mesa if I correctly followed).
> 
> 
> For people not reading ports@ here a resume: the destructor function
> used in pthread_key_create() needs to be present in memory until
> _rthread_tls_destructors() is called.
> 
> in the case of mesa, eglInitialize() function could load, via
> dlopen(), code which will use pthread_key_create() with destructor.
> 
> once dlclose() is called, the object is unloaded from memory, but a
> reference to destructor is kept, leading to segfault when
> _rthread_tls_destructors() run and use the destructor (because
> pointing to unloaded code).
>

I went deeper into the analysis.

At first, I thought that the pthread_key_create() call was coming from
the mesa driver (radeonsi_dri.so on my machine), as pinning the DSO in
memory (using LD_PRELOAD) avoided the segfault.

In fact, it isn't directly radeonsi_dri.so but another library it
depends on: libLLVM.so.5.0 in this case (with
LD_PRELOAD=.../libLLVM.so.5.0, the crash disappears).

Searching for where the pthread_key_create() call is located, I found
that it comes from the emutls implementation (which uses
pthread_key_create + destructor), which is statically linked from
compiler-rt.a

By instrumenting pthread_key_create, I got the following backtrace
(the abort(3) is mine):

(gdb) bt
#0  thrkill () at /tmp/-:3
#1  0x05188f550abe in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51
#2  0x05191c7e8c2b in pthread_key_create () from /home/semarie/Documents/devel/libhijacking/libthread.so
#3  0x0519399e6a87 in emutls_init () at /usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/emutls.c:118
#4  0x05188f55b4f7 in pthread_once (once_control=0x51939d00b30, init_routine=0x27240efb23d627ef) at /usr/src/lib/libc/thread/rthread_once.c:26
#5  0x0519399e68dd in emutls_init_once () at /usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/emutls.c:125
#6  emutls_get_index (control=0x51939cae5c8 <__emutls_v._ZL25TimeTraceProfilerInstance>) at /usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/emutls.c:316
#7  __emutls_get_address (control=0x51939cae5c8 <__emutls_v._ZL25TimeTraceProfilerInstance>) at /usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/emutls.c:379
#8  0x0519387f296e in llvm::getTimeTraceProfilerInstance() () from /usr/lib/libLLVM.so.5.0
#9  0x051938ec2bf2 in llvm::legacy::PassManagerImpl::run(llvm::Module&) () from /usr/lib/libLLVM.so.5.0
#10 0x05193974eb67 in LLVMRunPassManager () from /usr/lib/libLLVM.so.5.0
#11 0x0518d11276d8 in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
#12 0x0518d1082761 in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
#13 0x0518d110b1ea in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
#14 0x0518d0c7939c in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
#15 0x0518d0c794ed in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
#16 0x0518d1cfbec1 in _rthread_start (v=) at /usr/src/lib/librthread/rthread.c:96
#17 0x05188f52bd2a in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:84

It means that the emutls implementation we are using cannot be used
safely in code that is loaded with dlopen(3).


I made the following PoC using __thread:

$ cat lib.c
#include <stdio.h>

__thread int value = 0;

void
fn()
{
	printf("entering:  %s\n", __func__);
	value = 1;
	printf("returning: %s\n", __func__);
}

$ cat main.c
#include <dlfcn.h>
#include <err.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

void *
loadcode(void *arg)
{
	void *lib;
	void (*fn)();

	printf("thread: entering\n");

	printf("dlopen(3)\n");
	if ((lib = dlopen("./lib.so", 0)) == NULL)
		errx(EXIT_FAILURE, "dlopen: %s", dlerror());

	if ((fn = dlsym(lib, "fn")) == NULL)
		errx(EXIT_FAILURE, "dlsym: %s", dlerror());

	fn();

	printf("dlclose(3)\n");
	if (dlclose(lib) != 0)
		errx(EXIT_FAILURE, "dlclose: %s", dlerror());

	printf("thread: returning\n");
	return arg;
}

int
main(int argc, char *argv[])
{
	int error;
	pthread_t th;

	/* errc(3) takes the exit status first, then the error code */
	if ((error = pthread_create(&th, NULL, &loadcode, NULL)) != 0)
		errc(EXIT_FAILURE, error, "pthread_create");

	if ((error = pthread_join(th, NULL)) != 0)
		errc(EXIT_FAILURE, error, "pthread_join");

	return EXIT_SUCCESS;
}

$ cc lib.c -Wall -lpthread -shared -fPIC -o lib.so
$ cc main.c -Wall -lpthread
$ ./a.out
thread: entering
dlopen(3)
entering:  fn
returning

Re: patch: make ld.so aware of pthread_key_create destructor - Re: multimedia/mpv debug vo=gpu crash on exit

2021-05-07 Thread Sebastien Marie
On Thu, May 06, 2021 at 06:23:08PM -0700, Anindya Mukherjee wrote:
> On Thu, May 06, 2021 at 08:00:56AM -0600, Todd C. Miller wrote:
> > On Thu, 06 May 2021 09:32:28 +0200, Sebastien Marie wrote:
> > 
> > > We already take care of such situation with __cxa_thread_atexit_impl
> > > (in libc/stdlib/thread_atexit.c), by keeping an additionnal reference
> > > on object loaded (it makes ld.so aware that it is still used and so
> > > dlclose() doesn't unload it).
> > >
> > > I used the same idiom for pthread_key_create() and used dlctl(3) in
> > > the same way with the destructor address.
> > 
> > This will set STAT_NODELETE so the DSO will never really get unloaded.
> > That's not a problem for atexit() since the process is headed for
> > the exit.
> > 
> > I'm less sure about using it here since we don't have a way to
> > unreference the DSO upon pthread_key_delete().
> > 
> >  - todd
> 
> I did a quick investigation on my Linux machine and there mpv seems to
> be using libEGL_mesa.so instead of iris_dri.so. In this case I am not
> seeing a call to pthread_key_create at the start of video playback
> (there are some other places where pthread_key_create is called from but
> they don't cause a problem). So, not sure what happens in Linux when
> iris_dri.so is used.

libEGL_mesa.so seems to be used when mesa is built with the 'with_glvnd'
option. glvnd is the "vendor-neutral libGL" :
  https://gitlab.freedesktop.org/glvnd/libglvnd


> However, the Linux implementation of
> pthread_key_create seems to also not increment the refcount when the
> destructor is set so I don't yet see how it's solved there, assuming
> iris_dri.so behaves identically.

glibc seems to have the same problem with pthread_key_create():
  https://sourceware.org/bugzilla/show_bug.cgi?id=21032
and the bug report references a simple PoC at
  https://github.com/Aaron1011/pthread_dlopen


-- 
Sebastien Marie



Re: patch: make ld.so aware of pthread_key_create destructor - Re: multimedia/mpv debug vo=gpu crash on exit

2021-05-07 Thread Sebastien Marie
On Fri, May 07, 2021 at 04:41:50PM -0700, Anindya Mukherjee wrote:
> On Thu, May 06, 2021 at 08:00:56AM -0600, Todd C. Miller wrote:
> > On Thu, 06 May 2021 09:32:28 +0200, Sebastien Marie wrote:
> > 
> > > We already take care of such situation with __cxa_thread_atexit_impl
> > > (in libc/stdlib/thread_atexit.c), by keeping an additionnal reference
> > > on object loaded (it makes ld.so aware that it is still used and so
> > > dlclose() doesn't unload it).
> > >
> > > I used the same idiom for pthread_key_create() and used dlctl(3) in
> > > the same way with the destructor address.
> > 
> > This will set STAT_NODELETE so the DSO will never really get unloaded.
> > That's not a problem for atexit() since the process is headed for
> > the exit.
> > 
> > I'm less sure about using it here since we don't have a way to
> > unreference the DSO upon pthread_key_delete().
> 
> For sure I don't fully appreciate the complexities involved here, but is
> it possible to store the shared object handle along with the destructor
> when the reference count is incremented in the above patch? Then we
> could use that to decrement the reference.
> 

It needs help from ld.so to decrement the reference and unload the DSO
if the refcount is 0. It is something doable (by extending dlctl(3)
command set), but it could be tricky to implement properly.
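
As a rough model of what such an extension would have to do (demo_dso,
demo_ref() and demo_unref() are hypothetical names for illustration,
this is not ld.so code and not an existing dlctl(3) command):

```c
#include <assert.h>

/* Toy model of a loaded shared object and its reference count. */
struct demo_dso {
	int refcnt;	/* references held (dlopen, destructors, ...) */
	int loaded;	/* 1 while the object is mapped */
};

/* Take an extra reference, e.g. when a destructor is registered. */
void
demo_ref(struct demo_dso *d)
{
	assert(d->loaded);
	d->refcnt++;
}

/*
 * Drop a reference, e.g. on pthread_key_delete() or dlclose().
 * The object is unloaded only when the last reference goes away.
 * Returns 1 if the object is still loaded, 0 otherwise.
 */
int
demo_unref(struct demo_dso *d)
{
	assert(d->refcnt > 0);
	if (--d->refcnt == 0)
		d->loaded = 0;
	return d->loaded;
}
```

The tricky part in the real thing is not the counter itself but doing
the unload safely from libc, with the proper locking in ld.so.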

It would be helpful to understand why EGL uses one code path on
Linux and another on OpenBSD: it would let us use a more
common/tested code path and avoid the issue in the first place.

Thanks.
-- 
Sebastien Marie



patch: make ld.so aware of pthread_key_create destructor - Re: multimedia/mpv debug vo=gpu crash on exit

2021-05-06 Thread Sebastien Marie
Hi,

Anindya did a good analysis of the problem with mpv using the gpu video
output backend (it uses EGL and mesa, if I followed correctly).


For people not reading ports@, here is a summary: the destructor function
passed to pthread_key_create() needs to stay present in memory until
_rthread_tls_destructors() is called.

In the case of mesa, the eglInitialize() function could load, via
dlopen(), code which will use pthread_key_create() with a destructor.

Once dlclose() is called, the object is unloaded from memory, but a
reference to the destructor is kept, leading to a segfault when
_rthread_tls_destructors() runs and calls the destructor (because it
points to unloaded code).
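
The lifetime requirement can be seen from userland with a small sketch
(the names key, destructor(), worker() and count_destructor_calls() are
mine, for illustration only): the destructor is called at thread exit,
after the thread function has returned, so its code must still be
mapped at that point.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t key;
static int destructor_calls;

/* Called by the thread library when a thread holding a value exits. */
static void
destructor(void *value)
{
	(void)value;
	destructor_calls++;
}

static void *
worker(void *arg)
{
	/* The destructor only runs for threads with a non-NULL value. */
	int rc = pthread_setspecific(key, arg);

	assert(rc == 0);
	(void)rc;
	return NULL;
}

/* Return how many times the destructor ran for one thread, or -1. */
int
count_destructor_calls(void)
{
	pthread_t th;
	int marker = 0;

	destructor_calls = 0;
	if (pthread_key_create(&key, destructor) != 0)
		return -1;
	if (pthread_create(&th, NULL, worker, &marker) != 0)
		return -1;
	if (pthread_join(th, NULL) != 0)
		return -1;
	pthread_key_delete(key);
	return destructor_calls;
}
```

If destructor() lived in a dlopen()ed object which was dlclose()d
before the thread exited, that late call would jump into unmapped
memory.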

We already take care of such a situation with __cxa_thread_atexit_impl
(in libc/stdlib/thread_atexit.c), by keeping an additional reference
on the loaded object (it makes ld.so aware that it is still used and so
dlclose() doesn't unload it).

I used the same idiom for pthread_key_create() and used dlctl(3) in
the same way with the destructor address.

With the following diff, I am not able to reproduce the segfault
anymore with mpv.

diff 393e7b397988bb6abe46729de1794883d2b9d5cf /home/semarie/repos/openbsd/src
blob - 58b39bb7df70f54c708d2e2b11a3978806a86005
file + lib/libc/thread/rthread_tls.c
--- lib/libc/thread/rthread_tls.c
+++ lib/libc/thread/rthread_tls.c
@@ -22,10 +22,8 @@
 #include 
 #include 
 #include 
+#include 
 
-#include 
-#include 
-
 #include "rthread.h"
 
 
@@ -58,6 +56,9 @@ pthread_key_create(pthread_key_t *key, void (*destruct
rkeys[hint].used = 1;
rkeys[hint].destructor = destructor;
 
+   if (destructor != NULL)
+   dlctl(NULL, DL_REFERENCE, destructor);
+   
*key = hint++;
if (hint >= PTHREAD_KEYS_MAX)
hint = 0;


I am still unsure if we want to solve this problem this way, and if
this diff would introduce more problems than it tries to solve.

In the first place, it might be better not to register a destructor
when using dlopen()...

Comments would be appreciated.

Thanks.
-- 
Sebastien Marie


On Wed, May 05, 2021 at 01:20:02AM -0700, Anindya Mukherjee wrote:
> Hi,
> 
> I have been investigating the crash on exit problem with mpv in ports
> with vo=gpu. I think I made a little bit of progress and thought I'd
> share my findings.
> 
> The crash (SIGSEGV) happens when thread local destructors
> are called from /usr/src/lib/libc/thread/rthread_tls.c:182 in
> _rthread_tls_destructors after the gpu thread exits: vo_thread in
> video/out/vo.c:1067. The crashing call stack looks like this:
> 
> #0  0x0176ffdc9680 in ?? ()
> #1  0x017748d347b5 in _rthread_tls_destructors (thread=0x17798917840) at 
> /usr/src/lib/libc/thread/rthread_tls.c:182
> #2  0x017748d98623 in _libc_pthread_exit (retval= Unhandled dwarf expression opcode 0xa3>) at 
> /usr/src/lib/libc/thread/rthread.c:150
> #3  0x017795b22189 in _rthread_start (v= Unhandled dwarf expression opcode 0xa3>) at 
> /usr/src/lib/librthread/rthread.c:97
> #4  0x017748d0c5ba in __tfork_thread () at 
> /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:84
> 
> Note that some of the traces were taken from different runs so there
> might be some mismatch between the handles/addresses.
> 
> It crashes because the destructor is dangling. This mystified me because
> if I look at the mpv source, there is no thread local data for the gpu
> thread. Indeed, right after the gpu thread starts running if we look
> inside the thread structure, local_storage is null. However, if we look
> at the same thread at the point of the crash, its local_storage is
> populated:
> 
> (gdb) p *(*thread).local_storage
> $3 = {
>   keyid = 7,
>   next = 0x177353442e0,
>   data = 0x1770c276000
> }
> 
> The keys are indexed by the keyid in the rkeys array, from where the
> destructor is fetched in _rthread_tls_destructors:
> 
> (gdb) p rkeys[7]
> $6 = {
>   used = 1,
>   destructor = 0x43dd33e2680
> }
> 
> This destructor now points to invalid memory. It turns out the thread
> local storage is being initialised here:
> 
> #0  _libc_pthread_key_create (key=0x43dd380da08, destructor=0x43dd33e2680) at 
> /usr/src/lib/libc/thread/rthread_tls.c:42
> #1  0x043dd33e2667 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so
> #2  0x043e793f82f7 in pthread_once (once_control=0x43dd380d9f8, 
> init_routine=0x43e793db3c0 <_libc_pthread_key_create>) at 
> /usr/src/lib/libc/thread/rthread_once.c:26
> #3  0x043dd33e24bd in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so
> #4  0x043dd305475f in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so
> #5  0x043dd3036c70 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so
> #6  0x043dd30e3ca3 in ?? () from /usr/X11R6/lib/modules/dri/iris_dri.so
> #7  0x043dd30e4b96 in ?? 

Re: enable dt(4)

2021-04-28 Thread Sebastien Marie
On Wed, Apr 28, 2021 at 01:11:07AM +0200, Alexander Bluhm wrote:
> On Mon, Apr 26, 2021 at 05:13:55PM +0200, Sebastien Marie wrote:
> > > I can't vouch that it builds for all architectures... Did anyone do
> > > that?  Number 1 rule: don't break Theo's build.
> >
> > One test would be to build on i386 (with full release process): we are
> > near the limit currently, so it could overflow the available disk
> > space very easily.
> 
> I did a release build on i386.
> 
> > Regarding usefulness on archs, dt(4) has proper frames-skip support
> > for amd64, prowerpc64 and sparc64 only (others archs would default to
> > "don't skip frames in stack traces").
> 
> On i386 dt(4) does not strip the stack trace.  But it builds and
> the device works.  The output is useful, the upper layers of the
> graph are redundant.
> 
> http://bluhm.genua.de/files/kstack-i386.svg
> 
> Better commit the architectures where it works instead of waiting
> until someone fixes alpha, loongson, ...

yep. it makes sense to do it this way.

> ok?

ok semarie@

> bluhm
> 
> Index: conf/GENERIC
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/conf/GENERIC,v
> retrieving revision 1.276
> diff -u -p -r1.276 GENERIC
> --- conf/GENERIC  22 Apr 2021 10:23:07 -  1.276
> +++ conf/GENERIC  26 Apr 2021 13:37:19 -
> @@ -82,7 +82,6 @@ pseudo-device   msts    1       # MSTS line discipl
>  pseudo-device  endrun  1       # EndRun line discipline
>  pseudo-device  vnd     4       # vnode disk devices
>  pseudo-device  ksyms   1       # kernel symbols device
> -#pseudo-device dt              # Dynamic Tracer
> 
>  # clonable devices
>  pseudo-device  bpfilter        # packet filter
> Index: arch/i386/conf/GENERIC
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/arch/i386/conf/GENERIC,v
> retrieving revision 1.855
> diff -u -p -r1.855 GENERIC
> --- arch/i386/conf/GENERIC    4 Feb 2021 16:25:39 -   1.855
> +++ arch/i386/conf/GENERIC    26 Apr 2021 13:35:33 -
> @@ -763,6 +763,7 @@ owctr*    at onewire?     # Counter device
>  pseudo-device  pctr    1
>  pseudo-device  nvram   1
>  pseudo-device  hotplug 1       # devices hot plugging
> +pseudo-device  dt
> 
>  # mouse & keyboard multiplexor pseudo-devices
>  pseudo-device  wsmux   2
> Index: arch/amd64/conf/GENERIC
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/arch/amd64/conf/GENERIC,v
> retrieving revision 1.497
> diff -u -p -r1.497 GENERIC
> --- arch/amd64/conf/GENERIC   4 Feb 2021 16:25:38 -   1.497
> +++ arch/amd64/conf/GENERIC   26 Apr 2021 13:36:10 -
> @@ -685,6 +685,7 @@ owctr*    at onewire?     # Counter device
>  pseudo-device  pctr    1
>  pseudo-device  nvram   1
>  pseudo-device  hotplug 1       # devices hot plugging
> +pseudo-device  dt
> 
>  # mouse & keyboard multiplexor pseudo-devices
>  pseudo-device  wsmux   2
> Index: arch/arm64/conf/GENERIC
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/arch/arm64/conf/GENERIC,v
> retrieving revision 1.194
> diff -u -p -r1.194 GENERIC
> --- arch/arm64/conf/GENERIC   7 Apr 2021 17:12:22 -   1.194
> +++ arch/arm64/conf/GENERIC   26 Apr 2021 13:36:22 -
> @@ -490,6 +490,7 @@ owctr*    at onewire?     # Counter device
>  # Pseudo-Devices
>  pseudo-device  openprom
>  pseudo-device  hotplug 1       # devices hot plugging
> +pseudo-device  dt
> 
>  # mouse & keyboard multiplexor pseudo-devices
>  pseudo-device  wsmux   2
> Index: arch/powerpc64/conf/GENERIC
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/arch/powerpc64/conf/GENERIC,v
> retrieving revision 1.25
> diff -u -p -r1.25 GENERIC
> --- arch/powerpc64/conf/GENERIC   4 Feb 2021 16:25:39 -   1.25
> +++ arch/powerpc64/conf/GENERIC   26 Apr 2021 13:37:00 -
> @@ -186,4 +186,5 @@ brgphy*   at mii? # Broadcom Gigabit PH
> 
>  # Pseudo-Devices
>  pseudo-device  openprom
> +pseudo-device  dt
>  pseudo-device  wsmux 2

-- 
Sebastien Marie



Re: enable dt(4)

2021-04-26 Thread Sebastien Marie
On Mon, Apr 26, 2021 at 12:35:11PM +0200, Patrick Wildt wrote:
> Hi,
> 
> as proposed by bluhm@ recently, this is the diff to enable dt(4) in
> GENERIC.  The overhead should be small, and I have been using it on
> arm64 to successfully debug issues for a while now.

I would be fine with dt(4) enabled too.

> I can't vouch that it builds for all architectures... Did anyone do
> that?  Number 1 rule: don't break Theo's build.

One test would be to build on i386 (with full release process): we are
near the limit currently, so it could overflow the available disk
space very easily.

Regarding usefulness on archs, dt(4) has proper frames-skip support
for amd64, powerpc64 and sparc64 only (other archs would default to
"don't skip frames in stack traces").

Thanks.
-- 
Sebastien Marie



Re: Syspatch man page -c behaviour

2021-04-10 Thread Sebastien Marie
On Sat, Apr 10, 2021 at 01:14:53PM +0100, SW wrote:
> On 10/04/2021 12:29, Hiltjo Posthuma wrote:
> >
> > It's already documented in the "EXIT STATUS" section, isn't it?
> EXIT STATUS
>  The syspatch utility exits 0 on success, and >0 if an error occurs.
> 
> That doesn't really sound clear to me- it'd not be unreasonable to argue
> that listing available patches is successful whether there are some
> patches or no patches.
> Also, I think in this case it makes sense for the specific exit statuses
> for that flag to be next to the flag itself, rather than putting
> conditionals in the exit status section.

The man page changed recently (Mon Dec 7 21:19:28 2020) and the EXIT
STATUS section currently has the following text:

EXIT STATUS
 The syspatch utility exits 0 on success, and >0 if an error occurs.  In
 particular, 2 indicates that applying patches was requested but no
 additional patch was installed.

see http://man.openbsd.org/syspatch#EXIT_STATUS
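
For a caller, the text above boils down to three cases. A minimal
sketch, assuming a hypothetical helper syspatch_exit_meaning() (the
strings are mine, only the 0 / 2 / other >0 distinction comes from the
man page):

```c
#include <assert.h>

/* Map a syspatch(8) exit status to a short description. */
const char *
syspatch_exit_meaning(int status)
{
	assert(status >= 0);

	if (status == 0)
		return "success";
	if (status == 2)
		return "no additional patch installed";
	return "error";
}
```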

Thanks.
-- 
Sebastien Marie



mfs + cd9660: vop #define cleanup

2021-03-24 Thread Sebastien Marie
Hi,

The following diff removes some #defines which hide generic vop
functions behind a filesystem-specific name.

It makes it clearer which vop functions are real
filesystem implementations and which ones are only stubs.

Only mfs and cd9660 are using such #defines.

No functional changes are intended.
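
The idea, reduced to a userland sketch (struct demo_vops and all the
demo_* functions are hypothetical stand-ins, not the kernel's struct
vops): with the generic stubs named directly in the table, a reader
sees at once which entries are real implementations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for a vnode operations table. */
struct demo_vops {
	int (*vop_read)(void *);
	int (*vop_write)(void *);
	int (*vop_fsync)(void *);
};

/* Generic stubs, shared by every filesystem. */
static int demo_eopnotsupp(void *v) { (void)v; return EOPNOTSUPP; }
static int demo_nullop(void *v) { (void)v; return 0; }

/* The only real implementation of this read-only filesystem. */
static int demo_read(void *v) { (void)v; return 0; }

/* No "#define demo_write demo_eopnotsupp" indirection needed. */
const struct demo_vops demo_rofs_vops = {
	.vop_read	= demo_read,
	.vop_write	= demo_eopnotsupp,
	.vop_fsync	= demo_nullop,
};

/* Return 1 if op is one of the generic stubs. */
int
demo_is_stub(int (*op)(void *))
{
	assert(op != NULL);
	return op == demo_eopnotsupp || op == demo_nullop;
}
```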

Comments or OK ?
-- 
Sebastien Marie


blob - a662a6787ef9a6b06363441fb8778d66a29c04d0
blob + 2c54a18f3ff43c3eac65a4e520f4dcb625fa2473
--- sys/ufs/mfs/mfs_extern.h
+++ sys/ufs/mfs/mfs_extern.h
@@ -61,6 +61,5 @@ int mfs_close(void *);
 int mfs_inactive(void *);
 int mfs_reclaim(void *);
 int mfs_print(void *);
-#define mfs_revoke vop_generic_revoke
 int mfs_badop(void *);
 
blob - 00c0c24efbe9fa4046b1569a12be47c74d3d180f
blob + b5b1d430cc9f0dac3eb40bdc71ccc16f5efa67cf
--- sys/ufs/mfs/mfs_vnops.c
+++ sys/ufs/mfs/mfs_vnops.c
@@ -61,9 +61,9 @@ const struct vops mfs_vops = {
 .vop_ioctl  = mfs_ioctl,
 .vop_poll   = mfs_badop,
 .vop_kqfilter   = mfs_badop,
-.vop_revoke = mfs_revoke,
+.vop_revoke = vop_generic_revoke,
 .vop_fsync  = spec_fsync,
 .vop_remove = mfs_badop,
 .vop_link   = mfs_badop,

blob - b7cc93f86c5c8d28c28b3ad81286600941c23875
blob + 3c70501643e8bf15af237c3056919bd222dc4206
--- sys/isofs/cd9660/cd9660_vnops.c
+++ sys/isofs/cd9660/cd9660_vnops.c
@@ -811,44 +811,29 @@ cd9660_pathconf(void *v)
 /*
  * Global vfs data structures for isofs
  */
-#define cd9660_create   eopnotsupp
-#define cd9660_mknod    eopnotsupp
-#define cd9660_write    eopnotsupp
-#define cd9660_fsync    nullop
-#define cd9660_remove   eopnotsupp
-#define cd9660_rename   eopnotsupp
-#define cd9660_mkdir    eopnotsupp
-#define cd9660_rmdir    eopnotsupp
-#define cd9660_advlock  eopnotsupp
-#define cd9660_valloc   eopnotsupp
-#define cd9660_vfree    eopnotsupp
-#define cd9660_truncate eopnotsupp
-#define cd9660_update   eopnotsupp
-#define cd9660_bwrite   eopnotsupp
-#define cd9660_revoke   vop_generic_revoke
 
 /* Global vfs data structures for cd9660. */
 const struct vops cd9660_vops = {
.vop_lookup = cd9660_lookup,
-   .vop_create = cd9660_create,
-   .vop_mknod  = cd9660_mknod,
+   .vop_create = eopnotsupp,
+   .vop_mknod  = eopnotsupp,
.vop_open   = cd9660_open,
.vop_close  = cd9660_close,
.vop_access = cd9660_access,
.vop_getattr= cd9660_getattr,
.vop_setattr= cd9660_setattr,
.vop_read   = cd9660_read,
-   .vop_write  = cd9660_write,
+   .vop_write  = eopnotsupp,
.vop_ioctl  = cd9660_ioctl,
.vop_poll   = cd9660_poll,
.vop_kqfilter   = cd9660_kqfilter,
-   .vop_revoke = cd9660_revoke,
-   .vop_fsync  = cd9660_fsync,
-   .vop_remove = cd9660_remove,
+   .vop_revoke = vop_generic_revoke,
+   .vop_fsync  = nullop,
+   .vop_remove = eopnotsupp,
.vop_link   = cd9660_link,
-   .vop_rename = cd9660_rename,
-   .vop_mkdir  = cd9660_mkdir,
-   .vop_rmdir  = cd9660_rmdir,
+   .vop_rename = eopnotsupp,
+   .vop_mkdir  = eopnotsupp,
+   .vop_rmdir  = eopnotsupp,
.vop_symlink= cd9660_symlink,
.vop_readdir= cd9660_readdir,
.vop_readlink   = cd9660_readlink,
@@ -862,8 +847,8 @@ const struct vops cd9660_vops = {
.vop_print  = cd9660_print,
.vop_islocked   = cd9660_islocked,
.vop_pathconf   = cd9660_pathconf,
-   .vop_advlock= cd9660_advlock,
-   .vop_bwrite = vop_generic_bwrite
+   .vop_advlock= eopnotsupp,
+   .vop_bwrite = vop_generic_bwrite,
 };
 
 /* Special device vnode ops */



Re: namei/execve: entanglement's simplification

2021-03-24 Thread Sebastien Marie
Please withdraw for now.

sys/exec.h change doesn't fit well in userland.

Thanks.

On Mon, Mar 22, 2021 at 11:05:06AM +0100, Sebastien Marie wrote:
> Hi,
> 
> The following diff tries to simplify the entanglement between namei
> data and the execve(2) syscall.
> 
> Currently, sys_execve() might make recursive calls to check_exec()
> with a shared `struct nameidata' (`ep_ndp' inside
> `struct exec_package').
> 
> I would like to dissociate them, and make check_exec() "own" the
> `struct nameidata' data. execve(2) is complex enough; there is no need
> to add namei() complexity to it.
> 
> check_exec() will initialize nameidata, call namei() (as now), extract
> the information useful to the caller (only the vnode and the
> command name, returned via the exec_package struct), and free the
> other namei resources.
> 
> To proceed, it only needs to know the wanted path (to properly init
> `nd' with NDINIT).
> 
> As the call to check_exec() can be recursive (when scripts are
> involved), the path can come:
> - directly from sys_execve() via SCARG(uap, path) (from userspace)
> - from exec_script_makecmds() via the shellname (from system space)
> 
> In `struct exec_package', I opted to reuse `ep_name' for passing the
> path: it is what sys_execve() is already doing at the first call. Later
> it is only used to construct the fake-args list in
> exec_script_makecmds(), and it can be overridden after that (and
> restored on error). I reordered the steps a bit to make it fit.
> 
> Eventually I could also introduce a new struct field for the wanted
> path.
> 
> `ep_segflg' is introduced to mark whether ep_name comes from
> UIO_USERSPACE or UIO_SYSSPACE. It will be used by NDINIT() in
> check_exec().
> 
> `ep_comm' is the other piece of information (along with ep_vp) wanted
> as a result of the check_exec() call. It will be copied to `ps_comm'.
> 
> 
> Comments or OK ?
> -- 
> Sebastien Marie



namei/execve: entanglement's simplification

2021-03-22 Thread Sebastien Marie
Hi,

The following diff tries to simplify the entanglement between namei
data and the execve(2) syscall.

Currently, sys_execve() might make recursive calls to check_exec()
with a shared `struct nameidata' (`ep_ndp' inside
`struct exec_package').

I would like to dissociate them, and make check_exec() "own" the
`struct nameidata' data. execve(2) is complex enough; there is no need
to add namei() complexity to it.

check_exec() will initialize nameidata, call namei() (as now), extract
the information useful to the caller (only the vnode and the
command name, returned via the exec_package struct), and free the
other namei resources.

To proceed, it only needs to know the wanted path (to properly init
`nd' with NDINIT).

As the call to check_exec() can be recursive (when scripts are
involved), the path can come:
- directly from sys_execve() via SCARG(uap, path) (from userspace)
- from exec_script_makecmds() via the shellname (from system space)

In `struct exec_package', I opted to reuse `ep_name' for passing the
path: it is what sys_execve() is already doing at the first call. Later
it is only used to construct the fake-args list in
exec_script_makecmds(), and it can be overridden after that (and
restored on error). I reordered the steps a bit to make it fit.

Eventually I could also introduce a new struct field for the wanted
path.

`ep_segflg' is introduced to mark whether ep_name comes from
UIO_USERSPACE or UIO_SYSSPACE. It will be used by NDINIT() in
check_exec().

`ep_comm' is the other piece of information (along with ep_vp) wanted
as a result of the check_exec() call. It will be copied to `ps_comm'.
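
To illustrate the save/override/restore idea outside the kernel, here
is a minimal userland sketch. It is not the code from the diff: the
names `struct pkg', check() and check_interpreter() are hypothetical
stand-ins for `struct exec_package', check_exec() and the script
handler, and COMM_MAX only plays the role of MAXCOMLEN+1.

```c
#include <stdio.h>
#include <string.h>

#define COMM_MAX 16	/* hypothetical; plays the role of MAXCOMLEN+1 */

enum seg { SEG_USER, SEG_SYS };	/* stand-in for enum uio_seg */

/* Hypothetical, reduced analogue of struct exec_package. */
struct pkg {
	const char *name;	/* wanted path (like ep_name) */
	enum seg segflg;	/* where name lives (like ep_segflg) */
	char comm[COMM_MAX];	/* command name set by check() (like ep_comm) */
};

/*
 * Stand-in for check_exec(): it owns the lookup and only hands back
 * the command name. It clobbers comm first, then fails on an empty path.
 */
int
check(struct pkg *p)
{
	const char *base;

	p->comm[0] = '\0';
	if (p->name == NULL || p->name[0] == '\0')
		return -1;
	base = strrchr(p->name, '/');
	base = (base != NULL) ? base + 1 : p->name;
	snprintf(p->comm, sizeof(p->comm), "%s", base);
	return 0;
}

/* Point the package at the interpreter; restore everything on failure. */
int
check_interpreter(struct pkg *p, const char *shellname)
{
	struct pkg saved = *p;	/* remember the old values */

	p->segflg = SEG_SYS;	/* shellname is not a userspace pointer */
	p->name = shellname;
	if (check(p) == 0)
		return 0;
	*p = saved;		/* failure: put the caller's values back */
	return -1;
}
```

On failure the caller sees its original name, segment flag and command
name again; on success they describe the interpreter.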


Comments or OK ?
-- 
Sebastien Marie

diff 8de782433999755fc9356c8ed8dc7b327e532351 /home/semarie/repos/openbsd/src
blob - a31fa5a3269a3b568450066a1bd001317d710354
file + sys/kern/exec_script.c
--- sys/kern/exec_script.c
+++ sys/kern/exec_script.c
@@ -64,19 +64,24 @@ exec_script_makecmds(struct proc *p, struct exec_packa
 {
int error, hdrlinelen, shellnamelen, shellarglen;
char *hdrstr = epp->ep_hdr;
-   char *cp, *shellname, *shellarg, *oldpnbuf;
+   char *cp, *shellname, *shellarg;
char **shellargp = NULL, **tmpsap;
struct vnode *scriptvp;
+   char old_comm[MAXCOMLEN+1];
+   char *old_name;
+   enum uio_seg old_segflg;
uid_t script_uid = -1;
gid_t script_gid = -1;
u_short script_sbits;
 
/*
-* remember the old vp and pnbuf for later, so we can restore
+* remember the old epp values for later, so we can restore
 * them if check_exec() fails.
 */
+   old_segflg = epp->ep_segflg;
+   old_name = epp->ep_name;
scriptvp = epp->ep_vp;
-   oldpnbuf = epp->ep_ndp->ni_cnd.cn_pnbuf;
+   strlcpy(old_comm, epp->ep_comm, sizeof(old_comm));
 
/*
 * if the magic isn't that of a shell script, or we've already
@@ -186,13 +191,7 @@ check_shell:
FRELE(fp, p);
}
 
-   /* set up the parameters for the recursive check_exec() call */
-   epp->ep_ndp->ni_dirfd = AT_FDCWD;
-   epp->ep_ndp->ni_dirp = shellname;
-   epp->ep_ndp->ni_segflg = UIO_SYSSPACE;
-   epp->ep_flags |= EXEC_INDIR;
-
-   /* and set up the fake args list, for later */
+   /* set up the fake args list, for later */
shellargp = mallocarray(4, sizeof(char *), M_EXEC, M_WAITOK);
tmpsap = shellargp;
*tmpsap = malloc(shellnamelen + 1, M_EXEC, M_WAITOK);
@@ -214,6 +213,11 @@ check_shell:
tmpsap++;
*tmpsap = NULL;
 
+   /* set up the parameters for the recursive check_exec() call */
+   epp->ep_segflg = UIO_SYSSPACE;
+   epp->ep_name = shellname;
+   epp->ep_flags |= EXEC_INDIR;
+   
/*
 * mark the header we have as invalid; check_exec will read
 * the header from the new executable
@@ -233,9 +237,6 @@ check_shell:
if ((epp->ep_flags & EXEC_HASFD) == 0)
vn_close(scriptvp, FREAD, p->p_ucred, p);
 
-   /* free the old pathname buffer */
-   pool_put(&namei_pool, oldpnbuf);
-
epp->ep_flags |= (EXEC_HASARGL | EXEC_SKIPARG);
epp->ep_fa = shellargp;
/*
@@ -250,8 +251,13 @@ check_shell:
return (0);
}
 
-   /* XXX oldpnbuf not set for "goto fail" path */
-   epp->ep_ndp->ni_cnd.cn_pnbuf = oldpnbuf;
+   /* 
+* no need to restore these ep_* in 'fail' as there were not
+* changed at this point
+*/
+   epp->ep_segflg = old_segflg;
+   epp->ep_name = old_name;
+   strlcpy(epp->ep_comm, old_comm, sizeof(epp->ep_comm));
 fail:
/* note that we've clobbered the header */
epp->ep_flags |= EXEC_DESTR;
@@ -265,8 +271,6 @@ fail:
} else
vn_close(scriptvp, FREAD, p->p_ucred, p);
 
-   

patch: constify and use C99-style initialization for struct execsw

2021-03-20 Thread Sebastien Marie
Hi,

The following diff changes `struct execsw' to:

- use C99-style initialization (grep works better with that)
- be const, as execsw is not modified at runtime
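
As a standalone illustration of the two points (designated
initializers plus a const table), here is a small userland sketch. The
types and names (`struct handler', find_handler(), the h_* fields) are
made up for the example and are not the kernel's `struct execsw'.

```c
#include <stddef.h>

/* Hypothetical stand-in for the data a checker looks at. */
struct pkg {
	const char *hdr;
	size_t hdrlen;
};

/* Hypothetical analogue of struct execsw: header size plus a checker. */
struct handler {
	size_t h_hdrsz;				/* header bytes needed */
	int (*h_check)(const struct pkg *);	/* returns 0 on match */
};

static int
check_script(const struct pkg *p)
{
	return (p->hdrlen >= 2 && p->hdr[0] == '#' && p->hdr[1] == '!') ?
	    0 : -1;
}

static int
check_elf(const struct pkg *p)
{
	return (p->hdrlen >= 4 && p->hdr[0] == 0x7f && p->hdr[1] == 'E' &&
	    p->hdr[2] == 'L' && p->hdr[3] == 'F') ? 0 : -1;
}

/* const: never modified at runtime; field names are easy to grep. */
static const struct handler handlers[] = {
	{	/* shell scripts */
		.h_hdrsz = 2,
		.h_check = check_script,
	},
	{	/* elf binaries */
		.h_hdrsz = 4,
		.h_check = check_elf,
	},
};
static const size_t nhandlers = sizeof(handlers) / sizeof(handlers[0]);

/* Return the index of the first handler accepting the header, or -1. */
int
find_handler(const struct pkg *p)
{
	for (size_t i = 0; i < nhandlers; i++)
		if (p->hdrlen >= handlers[i].h_hdrsz &&
		    handlers[i].h_check(p) == 0)
			return (int)i;
	return -1;
}
```

With named fields, reordering or adding a member to the struct does not
silently shift the initializers.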

Comments or OK ?
-- 
Sebastien Marie

diff 2533c50dc3c36fe283749b7fcaef52891806c13c /home/semarie/repos/openbsd/src
blob - 43b0ebb9c128d7352b7b74119d3ea315654e03d2
file + sys/kern/exec_conf.c
--- sys/kern/exec_conf.c
+++ sys/kern/exec_conf.c
@@ -38,9 +38,15 @@
 
 extern struct emul emul_native;
 
-struct execsw execsw[] = {
-   { EXEC_SCRIPT_HDRSZ, exec_script_makecmds },/* shell scripts */
-   { sizeof(Elf_Ehdr), exec_elf_makecmds },/* elf binaries */
+const struct execsw execsw[] = {
+   {   /* shell scripts */
+   .es_hdrsz = EXEC_SCRIPT_HDRSZ,
+   .es_check = exec_script_makecmds,
+   },
+   {   /* elf binaries */
+   .es_hdrsz = sizeof(Elf_Ehdr),
+   .es_check = exec_elf_makecmds,
+   },
 };
 int nexecs = (sizeof execsw / sizeof(*execsw));
 int exec_maxhdrsz;
blob - 5158eb25e5c720731deb9bdd04b65f8525140c49
file + sys/sys/exec.h
--- sys/sys/exec.h
+++ sys/sys/exec.h
@@ -212,7 +212,7 @@ void new_vmcmd(struct exec_vmcmd_set *evsp,
  * Functions for specific exec types should be defined in their own
  * header file.
  */
-extern struct  execsw execsw[];
+extern const struct execsw execsw[];
 extern int nexecs;
 extern int exec_maxhdrsz;
 



patch: unveil: remove some leftover of UNVEIL_INSPECT usage with ni_unveil

2021-03-11 Thread Sebastien Marie
Hi,

The following diff is a cleanup that removes two leftover checks,
which date from the time ni_unveil could carry UNVEIL_INSPECT.

UNVEIL_INSPECT was used by:
- readlink(2) - removed 2019-08-31

  Make readlink require UNVEIL_READ instead of UNVEIL_INSPECT only
  since realpath() is now a system call
  
- stat(2) and access(2) - removed 2019-03-24

  Make stat(2) and access(2) need UNVEIL_READ instead of UNVEIL_INSPECT

  UNVEIL_INSPECT is a hack we added to get chrome/glib working. It silently
  adds permission for stat(2), access(2), and readlink(2) to be used on
  all path components of any unveil'ed path. robert@ has successfully now
  fixed chrome/glib to not require excessive TOC vs TOU stat(2) and access(2)
  calls on the paths it uses, so that this is no longer needed there.

  readlink(2) is the sole call that is now permitted by UNVEIL_INSPECT,
  and this is only needed so that realpath(3) can work. Going forward we will
  likely make a realpath(2), after which we can completely deprecate
  UNVEIL_INSPECT.
 

I audited the values set in ni_unveil, and UNVEIL_INSPECT is
effectively never stored in this variable.

The diff removes two checks:
- one in unveil_flagmatch(), for a debug printf
- one in pledge_namei(), for "getpw" usage when calling
  access("/var/run/ypbind.lock")

Comments or OK ?
-- 
Sebastien Marie

diff 48cf7af2deddb13b1f53f18782fd5612c3fdc34a /home/semarie/repos/openbsd/src
blob - 2de0d500e39367046a93c951aeded70bcdeb097d
file + sys/kern/kern_pledge.c
--- sys/kern/kern_pledge.c
+++ sys/kern/kern_pledge.c
@@ -619,8 +619,7 @@ pledge_namei(struct proc *p, struct nameidata *ni, cha
/* when avoiding YP mode, getpw* functions touch this */
if (ni->ni_pledge == PLEDGE_RPATH &&
strcmp(path, "/var/run/ypbind.lock") == 0) {
-   if ((p->p_p->ps_pledge & PLEDGE_GETPW) ||
-   (ni->ni_unveil == UNVEIL_INSPECT)) {
+   if (p->p_p->ps_pledge & PLEDGE_GETPW) {
ni->ni_cnd.cn_flags |= BYPASSUNVEIL;
return (0);
} else
blob - 0822248e435b45baf4fa2640cc1a89d85f632cad
file + sys/kern/kern_unveil.c
--- sys/kern/kern_unveil.c
+++ sys/kern/kern_unveil.c
@@ -720,11 +720,6 @@ unveil_flagmatch(struct nameidata *ni, u_char flags)
return 0;
}
}
-   if (ni->ni_unveil & UNVEIL_INSPECT) {
-#ifdef DEBUG_UNVEIL
-   printf("any unveil allows UNVEIL_INSPECT\n");
-#endif
-   }
return 1;
 }
 



distrib: arm64: remove gratuitous customization from mr.fs

2021-02-14 Thread Sebastien Marie
Hi,

The arm64 ramdisk has a customization in the mr.fs target, in order to
create the usr/mdec/pine64 and usr/mdec/rpi directories (files will be
copied into them by runlist.sh).

I would argue that the mr.fs target isn't the place for such per-arch
customization, as runlist.sh supports a MKDIR directive in the list
file.

The diff below makes the mr.fs target identical to the other archs
again, and creates the directories with directives from the list file.

Please note that armv7 already uses such MKDIR directives for the same
purpose (usr/mdec/xxx directory creation).

Comments or OK ?
-- 
Sebastien Marie

diff 1c9ddf109f6795c0adfcf5b19fd0bc61b3839042 /home/semarie/repos/openbsd/src
blob - 04b89cbda8d32468e5e91d2785b43d8e6094d90b
file + distrib/arm64/ramdisk/Makefile
--- distrib/arm64/ramdisk/Makefile
+++ distrib/arm64/ramdisk/Makefile
@@ -24,10 +24,6 @@ UTILS=   ${.CURDIR}/../../miniroot
 MRFSDISKTYPE=  rdroot
 MRMAKEFSARGS=  -o disklabel=${MRFSDISKTYPE},minfree=0,density=4096
 
-DIRS=\
-   pine64 \
-   rpi
-
 PIFILES=\
bootcode.bin \
start.elf \
@@ -85,9 +81,6 @@ bsd:
 mr.fs: instbin
rm -rf $@.d
install -d -o root -g wheel $@.d
-.for DIR in ${DIRS}
-   mkdir -p $@.d/usr/mdec/${DIR}
-.endfor
mtree -def ${MTREE} -p $@.d -u
CURDIR=${.CURDIR} OBJDIR=${.OBJDIR} OSrev=${OSrev} \
TARGDIR=$@.d UTILS=${UTILS} RELEASEDIR=${RELEASEDIR} \
blob - 6e7a37085a9024988fd24d7d63a5689ea78aed4c
file + distrib/arm64/ramdisk/list
--- distrib/arm64/ramdisk/list
+++ distrib/arm64/ramdisk/list
@@ -91,6 +91,7 @@ COPY  ${DESTDIR}/etc/firmware/atu-rfmd2958-int etc/firm
 COPY   ${DESTDIR}/etc/firmware/atu-rfmd2958smc-ext etc/firmware/atu-rfmd2958smc-ext
 COPY   ${DESTDIR}/etc/firmware/atu-rfmd2958smc-int etc/firmware/atu-rfmd2958smc-int
 
+MKDIR  usr/mdec/rpi
 COPY   /usr/local/share/raspberrypi-firmware/boot/bcm2710-rpi-3-b.dtb usr/mdec/rpi/bcm2710-rpi-3-b.dtb
 COPY   /usr/local/share/raspberrypi-firmware/boot/bcm2710-rpi-3-b-plus.dtb usr/mdec/rpi/bcm2710-rpi-3-b-plus.dtb
 COPY   /usr/local/share/raspberrypi-firmware/boot/bcm2710-rpi-cm3.dtb usr/mdec/rpi/bcm2710-rpi-cm3.dtb
@@ -100,6 +101,7 @@ COPY /usr/local/share/raspberrypi-firmware/boot/fixup.
 COPY   /usr/local/share/raspberrypi-firmware/boot/overlays/disable-bt.dtbo usr/mdec/rpi/disable-bt.dtbo
 COPY   /usr/local/share/u-boot/rpi_3/u-boot.bin usr/mdec/rpi/u-boot.bin
 
+MKDIR  usr/mdec/pine64
 COPY   /usr/local/share/u-boot/pine64_plus/u-boot-sunxi-with-spl.bin usr/mdec/pine64/u-boot-sunxi-with-spl.bin
 
 # copy the MAKEDEV script and make some devices



Re: distrib: make rdsetroot -x to work again

2021-02-14 Thread Sebastien Marie
On Sun, Feb 14, 2021 at 09:46:32AM -0700, Theo de Raadt wrote:
> Sure.
> 
> I guess the next thing to do would be using gzopen type functions,
> to edit a gzip'd version of the file.

I am unsure it would be doable in a sane manner. rdsetroot uses
libelf for the ELF manipulation, and the read-write support works on a
plain file descriptor.

And adding gz support to libelf is a no for me.

Gunzipping the file before manipulation isn't a big deal.

> Then the recent move to .gz files would be more transparent for this
> specific use case (which I think is very rare, honestly).  I'm not sure
> if it matters.

I would not say it is rare, but I agree that it isn't a common use
case.

I recently asked on ports@ about withdrawing the upobsd tool I wrote (we
now have sysupgrade in base, which does upgrades better than my tool),
and I got several replies from people using it for unattended
installation.

Thanks.
-- 
Sebastien Marie



Re: distrib: make rdsetroot -x to work again

2021-02-14 Thread Sebastien Marie
On Sun, Feb 14, 2021 at 09:54:15AM -0500, Daniel Jakots wrote:
> On Sun, 14 Feb 2021 15:23:05 +0100, Sebastien Marie 
> wrote:
> 
> In the alpha diff, I would put the "-R .eh_frame -R .shstrtab \" line
> before the -K line so the -R things are grouped together.

I put it after in order to make it more obvious that it is an MD part
(alpha is the sole arch which removes such sections).

Some investigations:

-R .shstrtab was added in 2017

commit e8a1490d19dbdd7d68fc1a5e96fc7c5115e2c42a
from: deraadt 
date: Sun Sep 17 16:33:07 2017 UTC
 
 Some further shrinking, but obviously not enough.  Something unknown
 caused bloat about a month ago (and it wasn't purely the ctf additions
 since those are being stripped).  Maybe the compiler generates
 different code when stronger debugging information is requested?
 

-R .eh_frame was added in 2004

commit ec828d0bd39e2262e00f331f64c0a74a91cf0e0c
from: miod 
date: Fri Nov  5 21:22:37 2004 UTC
 
 Binutils 2.15 require more aggressive stripping for installation media
 binaries, if we want to still fit on floppies (binaries carry one extra
 section now, which we don't need on installation media).
 ok deraadt@


I would be interested to know how many bytes are gained by removing
these sections. We might remove them on other archs too.

Thanks.
-- 
Sebastien Marie



distrib: make rdsetroot -x to work again

2021-02-14 Thread Sebastien Marie
Hi,

The following diff makes rdsetroot -x (extract the disk.fs image) work
again on a stripped bsd.rd.

It passes options to keep the rd_root_size and rd_root_image symbols
while stripping. These symbols are the ones rdsetroot uses to insert a
disk image into, or extract it from, the RAMDISK kernel.

If it matters, in my i386 test the bsd.rd size grows by 284 bytes
before gzip and by 113 bytes after gzip.

While here, make the set of removed sections a bit more uniform (the
.comment section wasn't removed on some archs while stripping).

Comments or OK ?
-- 
Sebastien Marie

Index: distrib/alpha/miniroot/Makefile
===
RCS file: /cvs/src/distrib/alpha/miniroot/Makefile,v
retrieving revision 1.21
diff -u -p -r1.21 Makefile
--- distrib/alpha/miniroot/Makefile 13 Feb 2021 18:48:23 -  1.21
+++ distrib/alpha/miniroot/Makefile 14 Feb 2021 13:58:49 -
@@ -62,7 +62,10 @@ ${CDROM}: bsd.gz
rm -f vnd
 
 bsd.gz: bsd.rd
-   objcopy -S -R .comment -R .SUNW_ctf -R .eh_frame -R .shstrtab bsd.rd bsd.strip
+   objcopy -S -R .comment -R .SUNW_ctf \
+   -K rd_root_size -K rd_root_image \
+   -R .eh_frame -R .shstrtab \
+   bsd.rd bsd.strip
gzip -9cn bsd.strip > bsd.gz
 
 bsd.rd: mr.fs bsd
Index: distrib/amd64/ramdiskA/Makefile
===
RCS file: /cvs/src/distrib/amd64/ramdiskA/Makefile,v
retrieving revision 1.14
diff -u -p -r1.14 Makefile
--- distrib/amd64/ramdiskA/Makefile 13 Feb 2021 18:52:08 -  1.14
+++ distrib/amd64/ramdiskA/Makefile 14 Feb 2021 13:58:49 -
@@ -33,7 +33,9 @@ MRDISKTYPE=   rdroot
 MRMAKEFSARGS=  -o disklabel=${MRDISKTYPE},minfree=0,density=4096
 
 bsd.gz: bsd.rd
-   objcopy -S -R .comment -R .SUNW_ctf bsd.rd bsd.strip
+   objcopy -S -R .comment -R .SUNW_ctf \
+   -K rd_root_size -K rd_root_image \
+   bsd.rd bsd.strip
gzip -9cn bsd.strip > bsd.gz
 
 bsd.rd: mr.fs bsd
Index: distrib/amd64/ramdisk_cd/Makefile
===
RCS file: /cvs/src/distrib/amd64/ramdisk_cd/Makefile,v
retrieving revision 1.28
diff -u -p -r1.28 Makefile
--- distrib/amd64/ramdisk_cd/Makefile   13 Feb 2021 18:52:08 -  1.28
+++ distrib/amd64/ramdisk_cd/Makefile   14 Feb 2021 13:58:49 -
@@ -56,7 +56,9 @@ MRDISKTYPE=   rdrootb
 MRMAKEFSARGS=  -o disklabel=${MRDISKTYPE},minfree=0,density=4096
 
 bsd.gz: bsd.rd
-   objcopy -S -R .comment -R .SUNW_ctf bsd.rd bsd.strip
+   objcopy -S -R .comment -R .SUNW_ctf \
+   -K rd_root_size -K rd_root_image \
+   bsd.rd bsd.strip
gzip -9cn bsd.strip > bsd.gz
 
 bsd.rd: mr.fs bsd
Index: distrib/i386/ramdisk/Makefile
===
RCS file: /cvs/src/distrib/i386/ramdisk/Makefile,v
retrieving revision 1.13
diff -u -p -r1.13 Makefile
--- distrib/i386/ramdisk/Makefile   13 Feb 2021 18:52:08 -  1.13
+++ distrib/i386/ramdisk/Makefile   14 Feb 2021 13:58:49 -
@@ -34,7 +34,9 @@ MRDISKTYPE=   rdroot
 MRMAKEFSARGS=  -o disklabel=${MRDISKTYPE},minfree=0,density=4096
 
 bsd.gz: bsd.rd
-   objcopy -S -R .comment -R .SUNW_ctf bsd.rd bsd.strip
+   objcopy -S -R .comment -R .SUNW_ctf \
+   -K rd_root_size -K rd_root_image \
+   bsd.rd bsd.strip
gzip -9cn bsd.strip > bsd.gz
 
 bsd.rd: mr.fs bsd
Index: distrib/i386/ramdisk_cd/Makefile
===
RCS file: /cvs/src/distrib/i386/ramdisk_cd/Makefile,v
retrieving revision 1.22
diff -u -p -r1.22 Makefile
--- distrib/i386/ramdisk_cd/Makefile13 Feb 2021 18:52:08 -  1.22
+++ distrib/i386/ramdisk_cd/Makefile14 Feb 2021 13:58:49 -
@@ -53,7 +53,9 @@ MRDISKTYPE=   rdrootb
 MRMAKEFSARGS=  -o disklabel=${MRDISKTYPE},minfree=0,density=4096
 
 bsd.gz: bsd.rd
-   objcopy -S -R .comment -R .SUNW_ctf bsd.rd bsd.strip
+   objcopy -S -R .comment -R .SUNW_ctf \
+   -K rd_root_size -K rd_root_image \
+   bsd.rd bsd.strip
gzip -9cn bsd.strip > bsd.gz
 
 bsd.rd: mr.fs bsd
Index: distrib/macppc/ramdisk/Makefile
===
RCS file: /cvs/src/distrib/macppc/ramdisk/Makefile,v
retrieving revision 1.48
diff -u -p -r1.48 Makefile
--- distrib/macppc/ramdisk/Makefile 5 Jan 2021 15:10:43 -   1.48
+++ distrib/macppc/ramdisk/Makefile 14 Feb 2021 13:58:49 -
@@ -35,9 +35,10 @@ bsd.gz: bsd.rd
gzip -9cn bsd.rd > bsd.gz
 
 bsd.rd: mr.fs bsd
-   cp bsd bsd.rd
+   objcopy -S -R .comment -R .SUNW_ctf \
+   -K rd_root_size -K rd_root_image \
+   bsd bsd.rd
rdsetroot bsd.rd mr.fs
-   strip -R .SUNW_ctf bsd.rd
 
 bsd:
cd ${.CURDIR}/../../../sys/arch/${MACHINE}/compile/${RAMDISK} && \
Index: distrib

distrib: reduce differences between archs: use ${MACHINE} when possible

2021-02-14 Thread Sebastien Marie
Hi,

The following diff makes various distrib Makefiles use ${MACHINE}
instead of a hardcoded value.

Currently, some archs use ${MACHINE} and others use a hardcoded
value.

Please note I am unsure about sgi: disklabel -w is using
"OpenBSD/sgi " as packid, and I changed it to "OpenBSD/${MACHINE}".
But the packid used isn't uniform across archs, so it might not matter.

The purpose is to gradually reduce the differences between the archs'
Makefiles.

Comments or OK ?
-- 
Sebastien Marie

Index: amd64/ramdisk_cd/Makefile
===
RCS file: /cvs/src/distrib/amd64/ramdisk_cd/Makefile,v
retrieving revision 1.28
diff -u -p -r1.28 Makefile
--- amd64/ramdisk_cd/Makefile   13 Feb 2021 18:52:08 -  1.28
+++ amd64/ramdisk_cd/Makefile   14 Feb 2021 13:40:13 -
@@ -38,18 +38,18 @@ ${FS}: bsd.gz
 
 ${CDROM}: bsd.rd
rm -rf ${.OBJDIR}/cd-dir
-   mkdir -p ${.OBJDIR}/cd-dir/${OSREV}/amd64
+   mkdir -p ${.OBJDIR}/cd-dir/${OSREV}/${MACHINE}
mkdir -p ${.OBJDIR}/cd-dir/etc
-   echo "set image /${OSREV}/amd64/bsd.rd" > ${.OBJDIR}/cd-dir/etc/boot.conf
-   cp ${.OBJDIR}/bsd.rd ${.OBJDIR}/cd-dir/${OSREV}/amd64
-   cp ${DESTDIR}/usr/mdec/cdbr ${.OBJDIR}/cd-dir/${OSREV}/amd64
-   cp ${DESTDIR}/usr/mdec/cdboot ${.OBJDIR}/cd-dir/${OSREV}/amd64/cdboot
+   echo "set image /${OSREV}/${MACHINE}/bsd.rd" > ${.OBJDIR}/cd-dir/etc/boot.conf
+   cp ${.OBJDIR}/bsd.rd ${.OBJDIR}/cd-dir/${OSREV}/${MACHINE}
+   cp ${DESTDIR}/usr/mdec/cdbr ${.OBJDIR}/cd-dir/${OSREV}/${MACHINE}
+   cp ${DESTDIR}/usr/mdec/cdboot ${.OBJDIR}/cd-dir/${OSREV}/${MACHINE}/cdboot
mkhybrid -a -R -T -L -l -d -D -N -o ${.OBJDIR}/${CDROM} \
-   -A "OpenBSD ${OSREV} amd64 bootonly CD" \
+   -A "OpenBSD ${OSREV} ${MACHINE} bootonly CD" \
-P "Copyright (c) `date +%Y` Theo de Raadt, The OpenBSD project" \
-p "Theo de Raadt " \
-   -V "OpenBSD/amd64   ${OSREV} boot-only CD" \
-   -b ${OSREV}/amd64/cdbr -c ${OSREV}/amd64/boot.catalog \
+   -V "OpenBSD/${MACHINE}   ${OSREV} boot-only CD" \
+   -b ${OSREV}/${MACHINE}/cdbr -c ${OSREV}/${MACHINE}/boot.catalog \
${.OBJDIR}/cd-dir
 
 MRDISKTYPE=rdrootb
Index: hppa/ramdisk/Makefile
===
RCS file: /cvs/src/distrib/hppa/ramdisk/Makefile,v
retrieving revision 1.46
diff -u -p -r1.46 Makefile
--- hppa/ramdisk/Makefile   17 May 2020 17:04:27 -  1.46
+++ hppa/ramdisk/Makefile   14 Feb 2021 13:40:13 -
@@ -20,10 +20,10 @@ ${CDROM}: bsd.rd
rm -rf ${.OBJDIR}/cd-dir/
mkdir -p ${.OBJDIR}/cd-dir/
cp bsd.rd ${.OBJDIR}/cd-dir/bsd.rd
-   mkhybrid -A "OpenBSD ${OSREV} hppa bootonly CD" \
+   mkhybrid -A "OpenBSD ${OSREV} ${MACHINE} bootonly CD" \
-P "Copyright (c) `date +%Y` Theo de Raadt, The OpenBSD project" \
-p "Theo de Raadt " \
-   -V "OpenBSD/hppa ${OSREV} boot-only CD" \
+   -V "OpenBSD/${MACHINE} ${OSREV} boot-only CD" \
-o ${.OBJDIR}/${CDROM} ${.OBJDIR}/cd-dir
dd if=${DESTDIR}/usr/mdec/cdboot of=${.OBJDIR}/${CDROM} \
bs=32k count=1 conv=notrunc
Index: i386/ramdisk_cd/Makefile
===
RCS file: /cvs/src/distrib/i386/ramdisk_cd/Makefile,v
retrieving revision 1.22
diff -u -p -r1.22 Makefile
--- i386/ramdisk_cd/Makefile13 Feb 2021 18:52:08 -  1.22
+++ i386/ramdisk_cd/Makefile14 Feb 2021 13:40:13 -
@@ -35,18 +35,18 @@ ${FS}: bsd.gz
 
 ${CDROM}: bsd.rd
rm -rf ${.OBJDIR}/cd-dir
-   mkdir -p ${.OBJDIR}/cd-dir/${OSREV}/i386
+   mkdir -p ${.OBJDIR}/cd-dir/${OSREV}/${MACHINE}
mkdir -p ${.OBJDIR}/cd-dir/etc
-   echo "set image /${OSREV}/i386/bsd.rd" > ${.OBJDIR}/cd-dir/etc/boot.conf
-   cp ${.OBJDIR}/bsd.rd ${.OBJDIR}/cd-dir/${OSREV}/i386
-   cp ${DESTDIR}/usr/mdec/cdbr ${.OBJDIR}/cd-dir/${OSREV}/i386
-   cp ${DESTDIR}/usr/mdec/cdboot ${.OBJDIR}/cd-dir/${OSREV}/i386/cdboot
+   echo "set image /${OSREV}/${MACHINE}/bsd.rd" > ${.OBJDIR}/cd-dir/etc/boot.conf
+   cp ${.OBJDIR}/bsd.rd ${.OBJDIR}/cd-dir/${OSREV}/${MACHINE}
+   cp ${DESTDIR}/usr/mdec/cdbr ${.OBJDIR}/cd-dir/${OSREV}/${MACHINE}
+   cp ${DESTDIR}/usr/mdec/cdboot ${.OBJDIR}/cd-dir/${OSREV}/${MACHINE}/cdboot
mkhybrid -a -R -T -L -l -d -D -N -o ${.OBJDIR}/${CDROM} \
-   -A "OpenBSD ${OSREV} i386 bootonly CD" \
+   -A "OpenBSD ${OSREV} ${MACHINE} bootonly CD" \
-P "Copyright (c) `date +%Y` Theo de Raadt, The OpenBSD project" \
-p "Theo de Raadt " \
-   -V "OpenBSD/

Re: [diff] src/usr.sbin/smtpd: add a forward-file option

2020-12-20 Thread Sebastien Marie
On Sat, Dec 19, 2020 at 11:19:10PM -0700, Theo de Raadt wrote:
> There are thousands of people with smtpd configurations, and sysmerge
> is not going to handle this.
> 
> We cannot expect them all to change their files.  This is madness.

Well, it wouldn't be the first time. But I agree that such changes
should be rare and have a really good reason.

So yes, even if the option is desirable and being off by default would
be a good default, handling it the flag-day way is complex.


Regarding the option itself, if I correctly recall some descriptions
of smtpd made by Gilles, opening ~/.forward is one of the few tasks
done by the privileged process of smtpd. So it could make sense to
avoid it when it is not needed.

Gilles, could you confirm that having an option to remove the .forward
capability (whatever the default value of the option is) could
effectively help to reduce the attack surface of smtpd ?

For example, as an immediate consequence, I see no reason for the
smtpd privileged process to keep a full filesystem view: it might be
possible to restrict it to a few directories with unveil(2) (I assume
the privileged process is still needed for bsd_auth, and bsd_auth will
have some requirements).

Thanks.
-- 
Sebastien Marie



Re: [diff] src/usr.sbin/smtpd: add a forward-file option

2020-12-19 Thread Sebastien Marie
On Sat, Dec 19, 2020 at 10:36:32PM +, gil...@poolp.org wrote:
> Hello,
> 
> Whenever a rule with a local action (mbox, maildir, lmtp or mda) is
> matched, smtpd will attempt to search for a ~/.forward file in the
> recipient directory and process it. This may be convenient for some
> setups but it is an implicit behavior that's not overridable and not
> always wanted.
> 
> This diff changes this behavior by requiring the admins to explicitly
> allow the forward files processing in the actions when desired:
> 
> action "local_users" maildir forward-file
> 
> 
> With this diff, if forward-file is not specified, code to request
> parent process for an fd is bypassed and the expansion layer just
> pretends parent couldn't find one. This let the code fallback in an
> already existing code path with the proper behavior and is very
> uninvasive.
> 

While I understand the direction (which is fine, as it makes the
daemon's behaviour less dependent on user settings), the default seems
wrong to me (at least for now, and for OpenBSD base specifically).

Currently, root@ mail delivery is based on the /root/.forward file:
the installer writes this file to redirect root@ mail to the user (if
a user was created at install time). It has been done this way since
2011 (see distrib/miniroot/install.sh rev 1.218). So I assume that all
installs done with a user configured since 2011 could be relying on it.

As a first step, I would ship the default smtpd.conf with the
"forward-file" option set. smtpd(8) itself would default to no
"forward-file" when the option is absent (what your diff does), but
OpenBSD base would default to having "forward-file" enabled.

Admins could remove the option if they don't use it.

Thanks.
-- 
Sebastien Marie



Re: wireguard + witness

2020-12-01 Thread Sebastien Marie
On Tue, Dec 01, 2020 at 06:59:22AM +0100, Sebastien Marie wrote:
> On Mon, Nov 30, 2020 at 11:14:46PM +, Stuart Henderson wrote:
> > Thought I'd try a WITNESS kernel to see if that gives any clues about
> > what's going on with my APU crashing all over the place (long shot but
> > I got bored with trying different older kernels..)
> > 
> > I see these from time to time (one during netstart, and another 4 in
> > 15 mins uptime), anyone know if they're important?
> 
> This check ("lock_object uninitialized") was recently added to witness.
> 
> It means that the rwlock was uninitialized (the witness flag
> LO_INITIALIZED isn't present whereas it should be) when witness
> checked the lock.
> 
> It could be that:
> - someone omitted to call rw_init() or RWLOCK_INITIALIZER() (a possible
>   bug if the memory isn't zeroed)
> - the struct has been zeroed (a possible bug too)
> 
> > witness: lock_object uninitialized: 0x80bcf0d8
> > Starting stack trace...
> > witness_checkorder(80bcf0d8,9,0) at witness_checkorder+0xab
> > rw_enter_write(80bcf0c8) at rw_enter_write+0x43
> > noise_remote_decrypt(80bcea48,978dc3d,0,fd805e22cb7c,10) at noise_remote_decrypt+0x135
> > wg_decap(8054a000,fd8061835200) at wg_decap+0xda
> > wg_decap_worker(8054a000) at wg_decap_worker+0x7a
> > taskq_thread(8012d900) at taskq_thread+0x9f
> > end trace frame: 0x0, count: 251
> > End of stack trace.
> 
> From the trace, the sole rw_enter_write() use in noise_remote_decrypt() is
> 
>  rw_enter_write(&r->r_keypair_lock)
> 
> but I do see a few rw_init() calls on r_keypair_lock. I will try to
> find the source of the problem.
> 

Jason, Matt,

sthen@ told me that the same lock is reported several times (more
precisely, two locks are reported alternately: lock1, lock2, lock1,
lock2...)

witness(4) reports a lock when it doesn't have the LO_INITIALIZED flag
set in the internal part of `struct rwlock'. It then sets the flag, and
skips further analysis for this particular lock.

If the same lock is reported several times, it means the memory is
being zeroed (and so the flag is removed). It could be intended or not,
but in all cases a rwlock should be properly initialized with rw_init()
or RWLOCK_INITIALIZER() before use.
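
To make the reasoning concrete, here is a tiny userland model of that
behaviour. It is only a sketch under assumptions: `struct lk',
LK_INITIALIZED and the lk_*() functions are invented for the example
and merely mimic what is described above, they are not witness(4) code.

```c
#include <string.h>

#define LK_INITIALIZED	0x1	/* hypothetical analogue of LO_INITIALIZED */

/* Hypothetical lock with an internal flags word, like struct rwlock. */
struct lk {
	unsigned int flags;
	int owner;
};

static int lk_reports;	/* how many times the checker complained */

void
lk_init(struct lk *l)
{
	l->flags = LK_INITIALIZED;
	l->owner = 0;
}

/* Simulates the enclosing struct being zeroed behind the lock's back. */
void
lk_wipe(struct lk *l)
{
	memset(l, 0, sizeof(*l));
}

/*
 * Checker: report a lock whose flag is missing, then set the flag so
 * the same lock is not reported again unless its memory is wiped.
 */
void
lk_check(struct lk *l)
{
	if ((l->flags & LK_INITIALIZED) == 0) {
		lk_reports++;
		l->flags |= LK_INITIALIZED;
	}
}

int
lk_report_count(void)
{
	return lk_reports;
}
```

A lock that is initialized once is never reported; each report after
the first one implies that something cleared the flag in between.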

I don't have enough experience with wg(4) to fully understand the
issue. In wg_decap(), the peer is obtained from a `struct wg_tag`
(from wg_tag_get()). If I am not mistaken, the p_remote could come from
wg_send_initiation() or wg_send_response(). But in both cases it comes
from wg_peer_create(), where p_remote is initialized with
noise_remote_init() (and so the lock gets a proper rw_init()).

Do you have any idea what causes the rwlock flag to be lost ?

If you want to test it with witness(4) enabled, you will need to build
a kernel with "option WITNESS" in the config file. You can uncomment it
in sys/arch/amd64/conf/GENERIC.MP, and run make config, make clean,
make.

Thanks.
-- 
Sebastien Marie



Re: wireguard + witness

2020-11-30 Thread Sebastien Marie
On Mon, Nov 30, 2020 at 11:14:46PM +, Stuart Henderson wrote:
> Thought I'd try a WITNESS kernel to see if that gives any clues about
> what's going on with my APU crashing all over the place (long shot but
> I got bored with trying different older kernels..)
> 
> I see these from time to time (one during netstart, and another 4 in
> 15 mins uptime), anyone know if they're important?

This check ("lock_object uninitialized") was recently added to witness.

It means that the rwlock was uninitialized (the witness flag
LO_INITIALIZED isn't present whereas it should be) when witness
checked the lock.

It could be that:
- someone omitted to call rw_init() or RWLOCK_INITIALIZER() (a possible
  bug if the memory isn't zeroed)
- the struct has been zeroed (a possible bug too)

> witness: lock_object uninitialized: 0x80bcf0d8
> Starting stack trace...
> witness_checkorder(80bcf0d8,9,0) at witness_checkorder+0xab
> rw_enter_write(80bcf0c8) at rw_enter_write+0x43
> noise_remote_decrypt(80bcea48,978dc3d,0,fd805e22cb7c,10) at noise_remote_decrypt+0x135
> wg_decap(8054a000,fd8061835200) at wg_decap+0xda
> wg_decap_worker(8054a000) at wg_decap_worker+0x7a
> taskq_thread(8012d900) at taskq_thread+0x9f
> end trace frame: 0x0, count: 251
> End of stack trace.

From the trace, the sole rw_enter_write() use in noise_remote_decrypt() is

 rw_enter_write(&r->r_keypair_lock)

but I do see a few rw_init() calls on r_keypair_lock. I will try to
find the source of the problem.

thanks.
-- 
Sebastien Marie



Re: myx(4): add initialization of sc_sff_lock rwlock

2020-11-26 Thread Sebastien Marie
On Thu, Nov 26, 2020 at 02:54:11PM +0800, Kevin Lo wrote:
> Ok?

yes please.

ok semarie@

> Index: sys/dev/pci/if_myx.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_myx.c,v
> retrieving revision 1.111
> diff -u -p -u -p -r1.111 if_myx.c
> --- sys/dev/pci/if_myx.c  17 Jul 2020 03:37:36 -  1.111
> +++ sys/dev/pci/if_myx.c  26 Nov 2020 06:49:04 -
> @@ -270,6 +270,8 @@ myx_attach(struct device *parent, struct
>   char part[32];
>   pcireg_t memtype;
>  
> + rw_init(&sc->sc_sff_lock, "myxsff");
> +
>   sc->sc_pc = pa->pa_pc;
>   sc->sc_tag = pa->pa_tag;
>   sc->sc_dmat = pa->pa_dmat;
> 

-- 
Sebastien Marie



Re: drm: avoid possible deadlock in kthread_stop

2020-10-17 Thread Sebastien Marie
On Wed, Oct 14, 2020 at 08:58:04PM +0200, Mark Kettenis wrote:
> > Date: Thu, 1 Oct 2020 09:09:50 +0200
> > From: Sebastien Marie 
> > 
> > Hi,
> > 
> > Currently, when a process calls kthread_stop(), it sets a flag
> > asking the thread to stop and goes to sleep, but the code
> > doing the stop doesn't wake up the caller of kthread_stop().
> > 
> > The thread should also be unparked, as otherwise it will not see the
> > KTHREAD_SHOULDSTOP flag. This follows what Linux is doing.
> > 
> > While here, I added some comments in the locking logic for park/unpark
> > and stop.
> > 
> > Comments or OK ?
> 
> I don't think adding all those comments makes a lot of sense.  This
> uses a fairly standard tsleep/wakeup pattern and some of the
> comments really state the obvious.

it was my way of auditing the code and understanding what it did.

> Can you do a diff that just adds
> the missing wakeup() and kthread_unpark() call?

here is a new diff.

diff 4efbe95c75086b3a7b0074651bfa04fd58990a98 /home/semarie/repos/openbsd/src
blob - fd797effc74d6eb4a172c81be8feac0ed168ec5d
file + sys/dev/pci/drm/drm_linux.c
--- sys/dev/pci/drm/drm_linux.c
+++ sys/dev/pci/drm/drm_linux.c
@@ -207,6 +217,7 @@ kthread_func(void *arg)
 
ret = thread->func(thread->data);
thread->flags |= KTHREAD_STOPPED;
+   wakeup(thread);
kthread_exit(ret);
 }
 
@@ -298,8 +327,9 @@ kthread_stop(struct proc *p)
 
while ((thread->flags & KTHREAD_STOPPED) == 0) {
thread->flags |= KTHREAD_SHOULDSTOP;
+   kthread_unpark(p);
wake_up_process(thread->proc);
tsleep_nsec(thread, PPAUSE, "stop", INFSLP);
}
LIST_REMOVE(thread, next);
 

Thanks.
-- 
Sebastien Marie



Re: amap: KASSERT()s and local variables

2020-10-11 Thread Sebastien Marie
On Wed, Oct 07, 2020 at 04:49:28PM +0200, Martin Pieuchot wrote:
> On 01/10/20(Thu) 14:18, Martin Pieuchot wrote:
> > Use more KASSERT()s instead of the "if (x) panic()" idiom for sanity
> > checks and add a couple of local variables to reduce the difference
> > with NetBSD and help for upcoming locking.
> 
> deraadt@ mentioned that KASSERT()s are not effective in RAMDISK kernels.
> 
> So the revisited diff below only converts checks that are redundant with
> NULL dereferences.
> 
> ok?

overall ok, but there is one problem, see below.

> Index: uvm/uvm_amap.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_amap.c,v
> retrieving revision 1.84
> diff -u -p -r1.84 uvm_amap.c
> --- uvm/uvm_amap.c  25 Sep 2020 08:04:48 -  1.84
> +++ uvm/uvm_amap.c  7 Oct 2020 14:40:53 -
> @@ -669,9 +669,7 @@ ReStart:
>   pg = anon->an_page;
>  
>   /* page must be resident since parent is wired */
> - if (pg == NULL)
> - panic("amap_cow_now: non-resident wired page"
> - " in anon %p", anon);
> + KASSERT(pg != NULL);
>  
>   /*
>* if the anon ref count is one, we are safe (the child
> @@ -740,6 +738,7 @@ ReStart:
>  void
>  amap_splitref(struct vm_aref *origref, struct vm_aref *splitref, vaddr_t offset)
>  {
> + struct vm_amap *amap = origref->ar_amap;
>   int leftslots;
>  
>   AMAP_B2SLOT(leftslots, offset);
> @@ -747,17 +746,18 @@ amap_splitref(struct vm_aref *origref, s
>   panic("amap_splitref: split at zero offset");
>  
>   /* now: we have a valid am_mapped array. */
> - if (origref->ar_amap->am_nslot - origref->ar_pageoff - leftslots <= 0)
> + if (amap->am_nslot - origref->ar_pageoff - leftslots <= 0)
>   panic("amap_splitref: map size check failed");
>  
>  #ifdef UVM_AMAP_PPREF
> -/* establish ppref before we add a duplicate reference to the amap */
> - if (origref->ar_amap->am_ppref == NULL)
> - amap_pp_establish(origref->ar_amap);
> +/* Establish ppref before we add a duplicate reference to the amap. */
> + if (amap->am_ppref == NULL)
> + amap_pp_establish(amap);
>  #endif
>  
> - splitref->ar_amap = origref->ar_amap;
> - splitref->ar_amap->am_ref++;/* not a share reference */
> + /* Note: not a share reference. */
> + amap->am_ref++;
> + splitref->ar_amap = amap;
>   splitref->ar_pageoff = origref->ar_pageoff + leftslots;
>  }
>  
> @@ -1104,12 +1104,11 @@ amap_add(struct vm_aref *aref, vaddr_t o
>  
>   slot = UVM_AMAP_SLOTIDX(slot);
>   if (replace) {
> - if (chunk->ac_anon[slot] == NULL)
> - panic("amap_add: replacing null anon");
> - if (chunk->ac_anon[slot]->an_page != NULL &&
> - (amap->am_flags & AMAP_SHARED) != 0) {
> - pmap_page_protect(chunk->ac_anon[slot]->an_page,
> - PROT_NONE);
> + struct vm_anon *oanon  = chunk->ac_anon[slot];
> +
> + KASSERT(oanon != NULL);
> + if (oanon->an_page && (amap->am_flags & AMAP_SHARED) != 0) {
> + pmap_page_protect(oanon->an_page, PROT_NONE);
>   /*
>* XXX: suppose page is supposed to be wired somewhere?
>*/
> @@ -1138,14 +1137,13 @@ amap_unadd(struct vm_aref *aref, vaddr_t
>  
>   AMAP_B2SLOT(slot, offset);
>   slot += aref->ar_pageoff;
> - KASSERT(slot < amap->am_nslot);
> + if (chunk->ac_anon[slot] == NULL)
> + panic("amap_unadd: nothing there");

this change seems wrong. you're removing a KASSERT() and re-adding an
explicit panic(9) (taken from a few lines below), which also uses chunk
before it is assigned.

>   chunk = amap_chunk_get(amap, slot, 0, PR_NOWAIT);
> - if (chunk == NULL)
> - panic("amap_unadd: chunk for slot %d not present", slot);
> + KASSERT(chunk != NULL);
>  
>   slot = UVM_AMAP_SLOTIDX(slot);
> - if (chunk->ac_anon[slot] == NULL)
> - panic("amap_unadd: nothing there");
> + KASSERT(chunk->ac_anon[slot] != NULL);
>  
>   chunk->ac_anon[slot] = NULL;
>   chunk->ac_usedmap &= ~(1 << slot);
> 

Thanks.
-- 
Sebastien Marie



drm: avoid possible deadlock in kthread_stop

2020-10-01 Thread Sebastien Marie
Hi,

Currently, when a process calls kthread_stop(), it sets a flag
asking the thread to stop and goes to sleep, but the code
doing the stop doesn't wake up the caller of kthread_stop().

The thread should also be unparked, as otherwise it will not see the
KTHREAD_SHOULDSTOP flag. This follows what Linux is doing.

While here, I added some comments in the locking logic for park/unpark
and stop.
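
For readers less familiar with the pattern, here is a userland sketch of the same handshake using a pthread mutex and condition variable in place of tsleep/wakeup. It is only a model under my own assumptions: the kt_* names and flag values are hypothetical and this is not the drm_linux.c code.

```c
#include <pthread.h>

#define KT_SHOULDSTOP	0x1
#define KT_STOPPED	0x2
#define KT_SHOULDPARK	0x4
#define KT_PARKED	0x8

struct kt {
	pthread_mutex_t	mtx;
	pthread_cond_t	cv;
	int		flags;
};

/* Thread body: park when asked, exit when asked to stop. */
static void *
kt_func(void *arg)
{
	struct kt *t = arg;

	pthread_mutex_lock(&t->mtx);
	while ((t->flags & KT_SHOULDSTOP) == 0) {
		if (t->flags & KT_SHOULDPARK) {
			t->flags |= KT_PARKED;
			pthread_cond_broadcast(&t->cv);	/* wake kt_park() */
			while (t->flags & KT_SHOULDPARK)
				pthread_cond_wait(&t->cv, &t->mtx);
			t->flags &= ~KT_PARKED;
			continue;
		}
		pthread_cond_wait(&t->cv, &t->mtx);	/* idle */
	}
	t->flags |= KT_STOPPED;
	pthread_cond_broadcast(&t->cv);	/* wake the kt_stop() caller */
	pthread_mutex_unlock(&t->mtx);
	return NULL;
}

/* Ask the thread to park and wait until it has done so. */
static void
kt_park(struct kt *t)
{
	pthread_mutex_lock(&t->mtx);
	t->flags |= KT_SHOULDPARK;
	pthread_cond_broadcast(&t->cv);
	while ((t->flags & KT_PARKED) == 0)
		pthread_cond_wait(&t->cv, &t->mtx);
	pthread_mutex_unlock(&t->mtx);
}

/* Ask the thread to stop; unpark it first so it can see the request. */
static void
kt_stop(struct kt *t)
{
	pthread_mutex_lock(&t->mtx);
	t->flags |= KT_SHOULDSTOP;
	t->flags &= ~KT_SHOULDPARK;	/* the "unpark" step */
	pthread_cond_broadcast(&t->cv);
	while ((t->flags & KT_STOPPED) == 0)
		pthread_cond_wait(&t->cv, &t->mtx);
	pthread_mutex_unlock(&t->mtx);
}

/* Park the thread, then stop it; returns the final flags, or -1. */
static int
kt_demo(void)
{
	struct kt t;
	pthread_t th;
	int flags;

	pthread_mutex_init(&t.mtx, NULL);
	pthread_cond_init(&t.cv, NULL);
	t.flags = 0;
	if (pthread_create(&th, NULL, kt_func, &t) != 0)
		return -1;
	kt_park(&t);
	kt_stop(&t);
	pthread_join(th, NULL);
	flags = t.flags;
	pthread_cond_destroy(&t.cv);
	pthread_mutex_destroy(&t.mtx);
	return flags;
}
```

In this model, dropping the broadcast at the end of kt_func() leaves kt_stop() waiting forever, and dropping the unpark step leaves a parked thread unable to notice KT_SHOULDSTOP.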

Comments or OK ?

Thanks.
-- 
Sebastien Marie

---
commit 70e71461c8598e28820f1743923cac40670f7c33
from: Sébastien Marie 
date: Thu Oct  1 07:02:46 2020 UTC
 
 properly support kthread_stop()
 - wakeup kthread_stop() caller
 - unpark the thread if parked
 
 while here, add comments for locking logic for park/unpark/stop
 
diff ec329a4429e2542bc24dd017b8001b22df43564c ce2b5031503711bbdd7a3067c76c4f18b1d8da82
blob - 2cbd0905406ccc9d89c86cee38673a4e9c3fcf42
blob + f0e5a5a1b282c071c97505556510952ee7a6282a
--- sys/dev/pci/drm/drm_linux.c
+++ sys/dev/pci/drm/drm_linux.c
@@ -206,6 +206,10 @@ kthread_func(void *arg)
 
ret = thread->func(thread->data);
thread->flags |= KTHREAD_STOPPED;
+
+   /* wakeup thread waiting in kthread_stop() */
+   wakeup(thread);
+
kthread_exit(ret);
 }
 
@@ -256,7 +260,14 @@ kthread_parkme(void)
 
while (thread->flags & KTHREAD_SHOULDPARK) {
thread->flags |= KTHREAD_PARKED;
+
+   /* 
+* wakeup kthread_park() caller
+* to signal I am parked as asked.
+*/
wakeup(thread);
+
+   /* wait for someone to kthread_unpark() me */
tsleep_nsec(thread, PPAUSE, "parkme", INFSLP);
thread->flags &= ~KTHREAD_PARKED;
}
@@ -269,7 +280,13 @@ kthread_park(struct proc *p)
 
while ((thread->flags & KTHREAD_PARKED) == 0) {
thread->flags |= KTHREAD_SHOULDPARK;
+
wake_up_process(thread->proc);
+
+   /*
+* wait for thread to be parked.
+* the asked thread should call kthread_parkme()
+*/
tsleep_nsec(thread, PPAUSE, "park", INFSLP);
}
 }
@@ -280,6 +297,8 @@ kthread_unpark(struct proc *p)
struct kthread *thread = kthread_lookup(p);
 
thread->flags &= ~KTHREAD_SHOULDPARK;
+
+   /* wakeup kthread_parkme() caller */
wakeup(thread);
 }
 
@@ -297,7 +316,13 @@ kthread_stop(struct proc *p)
 
while ((thread->flags & KTHREAD_STOPPED) == 0) {
thread->flags |= KTHREAD_SHOULDSTOP;
+
+   /* kthread_unpark() the thread if parked */
+   kthread_unpark(p);
+
wake_up_process(thread->proc);
+   
+   /* wait for thread to stop (func() should return) */
tsleep_nsec(thread, PPAUSE, "stop", INFSLP);
}
LIST_REMOVE(thread, next);




Re: btrace: add boolean AND and OR operators

2020-09-14 Thread Sebastien Marie
ort("invalid argument type %d", ba->ba_type);
> Index: regress/usr.sbin/btrace/Makefile
> ===
> RCS file: /cvs/src/regress/usr.sbin/btrace/Makefile,v
> retrieving revision 1.4
> diff -u -p -r1.4 Makefile
> --- regress/usr.sbin/btrace/Makefile  19 Mar 2020 15:53:09 -  1.4
> +++ regress/usr.sbin/btrace/Makefile  14 Sep 2020 15:14:10 -
> @@ -3,8 +3,8 @@
>  BTRACE?=  /usr/sbin/btrace
>  
>  # scripts that don't need /dev/dt
> -BT_LANG_SCRIPTS= arithm beginend comments delete exit map map-unnamed \
> - maxoperand min+max+sum multismts nsecs+var
> +BT_LANG_SCRIPTS= arithm beginend boolean comments delete exit map \
> + map-unnamed maxoperand min+max+sum multismts nsecs+var
>  
>  BT_KERN_SCRIPTS=
>  
> Index: regress/usr.sbin/btrace/boolean.bt
> ===
> RCS file: regress/usr.sbin/btrace/boolean.bt
> diff -N regress/usr.sbin/btrace/boolean.bt
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ regress/usr.sbin/btrace/boolean.bt  14 Sep 2020 15:14:10 -
> @@ -0,0 +1,8 @@
> +BEGIN
> +{
> + @a = 9;
> + @b = 1;
> +
> + printf("a & b = %d\n", @a & @b);
> + printf("a | b = %d\n", @a | @b);
> +}
> Index: regress/usr.sbin/btrace/boolean.ok
> ===
> RCS file: regress/usr.sbin/btrace/boolean.ok
> diff -N regress/usr.sbin/btrace/boolean.ok
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ regress/usr.sbin/btrace/boolean.ok  14 Sep 2020 15:14:10 -
> @@ -0,0 +1,2 @@
> +a & b = 1
> +a | b = 9
> -- 
> jasper
> 

-- 
Sebastien Marie



Re: go/rust vs uvm_map_inentry()

2020-09-14 Thread Sebastien Marie
On Mon, Sep 14, 2020 at 01:25:03PM +0200, Mark Kettenis wrote:
> > Date: Sun, 13 Sep 2020 19:48:19 +0200
> > From: Sebastien Marie 
> > 
> > On Sun, Sep 13, 2020 at 04:49:48PM +0200, Sebastien Marie wrote:
> > > On Sun, Sep 13, 2020 at 03:29:57PM +0200, Martin Pieuchot wrote:
> > > > I'm no longer able to reproduce the corruption while building lang/go
> > > > with the diff below.  Something relevant to threading change in go since
> > > > march?
> > > > 
> > > > Can someone try this diff and tell me if go and/or rust still fail?
> > > 
> > > quickly tested with rustc build (nightly here), and it is failing at 
> > > random places (not always at the same) with memory errors (signal 11, 
> > > compiler ICE signal 6...)
> > > 
> > 
> > A first hint.
> > 
> > > With the help of deraadt@, it was found that disabling the
> > > uvm_map_inentry() call in usertrap() is enough to avoid the crashes.
> > 
> > To be clear, I am using the following diff:
> 
> The diff below fixes it (for amd64).
> 
> What's happening is that uvm_map_inentry() may sleep to grab the lock
> of the map.  The fault address is read from cr2 in pageflttrap() which
> gets called after this check and if the check sleeps, cr2 is likely to
> be clobbered by a page fault in another process.
> 
> Diff below fixes this by reading cr2 early and passing it to pageflttrap().
> 
> ok?

it makes sense, and I can't trigger the crashes with the rustc build anymore.

ok semarie@
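
To illustrate the ordering issue described above, here is a small deterministic userland simulation. It is only a sketch: fake_cr2, take_fault() and check_that_may_sleep() are hypothetical stand-ins, not kernel interfaces.

```c
#include <stdint.h>

/* Hypothetical stand-in for the CPU's %cr2 register. */
static uint64_t fake_cr2;

/* A page fault stores the faulting address in the register. */
static void
take_fault(uint64_t addr)
{
	fake_cr2 = addr;
}

static uint64_t
read_cr2(void)
{
	return fake_cr2;
}

/*
 * Stand-in for a check that may sleep: while we are asleep, another
 * process faults on the same CPU and overwrites the register.
 */
static void
check_that_may_sleep(uint64_t other_fault_addr)
{
	take_fault(other_fault_addr);
}

/* Late read: the address is fetched after the check, so it is stale. */
static uint64_t
handle_fault_late(uint64_t other_fault_addr)
{
	check_that_may_sleep(other_fault_addr);
	return read_cr2();
}

/* Early read: the address is saved first and passed along. */
static uint64_t
handle_fault_early(uint64_t other_fault_addr)
{
	uint64_t cr2 = read_cr2();

	check_that_may_sleep(other_fault_addr);
	return cr2;
}
```

With a fault at 0x1000 followed by a foreign fault at 0x2000 during the check, the late variant reports 0x2000 and the early variant reports 0x1000.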
 
> 
> Index: arch/amd64/amd64/trap.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v
> retrieving revision 1.80
> diff -u -p -r1.80 trap.c
> --- arch/amd64/amd64/trap.c   19 Aug 2020 10:10:57 -  1.80
> +++ arch/amd64/amd64/trap.c   14 Sep 2020 11:17:35 -
> @@ -92,7 +92,7 @@
>  
>  #include "isa.h"
>  
> -int  pageflttrap(struct trapframe *, int _usermode);
> +int  pageflttrap(struct trapframe *, uint64_t, int _usermode);
>  void kerntrap(struct trapframe *);
>  void usertrap(struct trapframe *);
>  void ast(struct trapframe *);
> @@ -157,12 +157,11 @@ fault(const char *format, ...)
>   * if something was so broken that we should panic.
>   */
>  int
> -pageflttrap(struct trapframe *frame, int usermode)
> +pageflttrap(struct trapframe *frame, uint64_t cr2, int usermode)
>  {
>   struct proc *p = curproc;
>   struct pcb *pcb;
>   int error;
> - uint64_t cr2;
>   vaddr_t va;
>   struct vm_map *map;
>   vm_prot_t ftype;
> @@ -172,7 +171,6 @@ pageflttrap(struct trapframe *frame, int
>  
>   map = &p->p_vmspace->vm_map;
>   pcb = &p->p_addr->u_pcb;
> - cr2 = rcr2();
>   va = trunc_page((vaddr_t)cr2);
>  
>   KERNEL_LOCK();
> @@ -280,6 +278,7 @@ void
>  kerntrap(struct trapframe *frame)
>  {
>   int type = (int)frame->tf_trapno;
> + uint64_t cr2 = rcr2();
>  
>   verify_smap(__func__);
>   uvmexp.traps++;
> @@ -299,7 +298,7 @@ kerntrap(struct trapframe *frame)
>   /*NOTREACHED*/
>  
>   case T_PAGEFLT: /* allow page faults in kernel mode */
> - if (pageflttrap(frame, 0))
> + if (pageflttrap(frame, cr2, 0))
>   return;
>   goto we_re_toast;
>  
> @@ -333,6 +332,7 @@ usertrap(struct trapframe *frame)
>  {
>   struct proc *p = curproc;
>   int type = (int)frame->tf_trapno;
> + uint64_t cr2 = rcr2();
>   union sigval sv;
>   int sig, code;
>  
> @@ -381,7 +381,7 @@ usertrap(struct trapframe *frame)
>   break;
>  
>   case T_PAGEFLT: /* page fault */
> - if (pageflttrap(frame, 1))
> + if (pageflttrap(frame, cr2, 1))
>   goto out;
>   /* FALLTHROUGH */
>  

-- 
Sebastien Marie



Re: go/rust vs uvm_map_inentry()

2020-09-13 Thread Sebastien Marie
On Sun, Sep 13, 2020 at 04:49:48PM +0200, Sebastien Marie wrote:
> On Sun, Sep 13, 2020 at 03:29:57PM +0200, Martin Pieuchot wrote:
> > I'm no longer able to reproduce the corruption while building lang/go
> > with the diff below.  Something relevant to threading change in go since
> > march?
> > 
> > Can someone try this diff and tell me if go and/or rust still fail?
> 
> quickly tested with rustc build (nightly here), and it is failing at random 
> places (not always at the same) with memory errors (signal 11, compiler ICE 
> signal 6...)
> 

A first hint.

With the help of deraadt@, it was found that disabling the
uvm_map_inentry() call in usertrap() is enough to avoid the crashes.

To be clear, I am using the following diff:

diff 3e16148d8fe176d83ff415f6c03a79618da4401e /data/semarie/repos/openbsd/src
blob - 7f195a5309280943e0138953c61fffcb6a80c6bf
file + sys/arch/amd64/conf/GENERIC.MP
--- sys/arch/amd64/conf/GENERIC.MP
+++ sys/arch/amd64/conf/GENERIC.MP
@@ -4,6 +4,8 @@ include "arch/amd64/conf/GENERIC"
 
 option MULTIPROCESSOR
 #option MP_LOCKDEBUG
-#option WITNESS
+option WITNESS
+
+pseudo-device dt
 
 cpu*   at mainbus?
blob - fc23bc67e305a1a1edc7d6f08ecb982dccdc4a45
file + sys/uvm/uvm_map.c
--- sys/uvm/uvm_map.c
+++ sys/uvm/uvm_map.c
@@ -1893,16 +1893,16 @@ uvm_map_inentry(struct proc *p, struct p_inentry *ie, 
boolean_t ok = TRUE;
 
if (uvm_map_inentry_recheck(serial, addr, ie)) {
-   KERNEL_LOCK();
ok = uvm_map_inentry_fix(p, ie, addr, fn, serial);
if (!ok) {
+   KERNEL_LOCK();
printf(fmt, p->p_p->ps_comm, p->p_p->ps_pid, p->p_tid,
addr, ie->ie_start, ie->ie_end);
p->p_p->ps_acflag |= AMAP;
sv.sival_ptr = (void *)PROC_PC(p);
trapsignal(p, SIGSEGV, 0, SEGV_ACCERR, sv);
+   KERNEL_UNLOCK();
}
-   KERNEL_UNLOCK();
}
return (ok);
 }
blob - 4a4c6275aa766fe2e4f5c9d913d1257f41a9d578
file + sys/arch/amd64/amd64/trap.c
--- sys/arch/amd64/amd64/trap.c
+++ sys/arch/amd64/amd64/trap.c
@@ -343,10 +343,12 @@ usertrap(struct trapframe *frame)
p->p_md.md_regs = frame;
refreshcreds(p);
 
+#if 0
if (!uvm_map_inentry(p, &p->p_spinentry, PROC_STACK(p),
"[%s]%d/%d sp=%lx inside %lx-%lx: not MAP_STACK\n",
uvm_map_inentry_sp, p->p_vmspace->vm_map.sserial))
goto out;
+#endif
 
    switch (type) {
case T_PROTFLT: /* protection fault */


Thanks.
-- 
Sebastien Marie



Re: go/rust vs uvm_map_inentry()

2020-09-13 Thread Sebastien Marie
On Sun, Sep 13, 2020 at 09:15:15AM -0600, Theo de Raadt wrote:
> crashes -- but without any kernel printfs?

crashes and no kernel printfs

-- 
Sebastien Marie



Re: go/rust vs uvm_map_inentry()

2020-09-13 Thread Sebastien Marie
On Sun, Sep 13, 2020 at 03:29:57PM +0200, Martin Pieuchot wrote:
> I'm no longer able to reproduce the corruption while building lang/go
> with the diff below.  Something relevant to threading change in go since
> march?
> 
> Can someone try this diff and tell me if go and/or rust still fail?

quickly tested with rustc build (nightly here), and it is failing at random 
places (not always at the same) with memory errors (signal 11, compiler ICE 
signal 6...)


> Index: uvm/uvm_map.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_map.c,v
> retrieving revision 1.266
> diff -u -p -r1.266 uvm_map.c
> --- uvm/uvm_map.c 12 Sep 2020 17:08:50 -  1.266
> +++ uvm/uvm_map.c 13 Sep 2020 10:12:25 -
> @@ -1893,16 +1893,16 @@ uvm_map_inentry(struct proc *p, struct p
>   boolean_t ok = TRUE;
>  
>   if (uvm_map_inentry_recheck(serial, addr, ie)) {
> - KERNEL_LOCK();
>   ok = uvm_map_inentry_fix(p, ie, addr, fn, serial);
>   if (!ok) {
> + KERNEL_LOCK();
>   printf(fmt, p->p_p->ps_comm, p->p_p->ps_pid, p->p_tid,
>   addr, ie->ie_start, ie->ie_end);
>   p->p_p->ps_acflag |= AMAP;
>   sv.sival_ptr = (void *)PROC_PC(p);
>   trapsignal(p, SIGSEGV, 0, SEGV_ACCERR, sv);
> + KERNEL_UNLOCK();
>   }
> - KERNEL_UNLOCK();
>   }
>   return (ok);
>  }
> 

-- 
Sebastien Marie



Re: who(1) patch for unveil violation

2020-08-27 Thread Sebastien Marie
On Thu, Aug 27, 2020 at 07:00:22AM -0400, David Goerger wrote:
> Hello,
> 
> This morning I was surprised to see a who(1) unveil violation in a
> lastcomm(1) report, so I looked into it and found that when requesting
> show_idle (-u flag) or show_term (-T flag), we indeed try to read
> _PATH_DEV, which isn't unveiled yet.
> 
> I'm not an unveil(2) expert, and there might be a better way to handle
> this, but I confirmed this fixes both case 0 (no file arg) and case 1
> (e.g. `who -u /var/log/wtmp`). Tested on a -current snapshot from
> yesterday, as well as on an up-to-date 6.7-stable box.
> 
> Cheers,
> David

The diff is ok semarie@

who(1) calls stat(2) on the line (tty) to determine the +/- mode of the
tty (for show_term) or to determine the idle time using st_atime
(show_idle).
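
As a rough sketch of why the device directory has to be readable, this is the kind of stat(2)-based lookup involved. The helper names are hypothetical and this is not the who.c code; in particular, using the group-write bit for the +/- state is my assumption here.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <paths.h>
#include <stdio.h>
#include <time.h>

/* '+' if the tty is group-writable (messages allowed), '-' otherwise. */
static char
tty_msg_state(const struct stat *sb)
{
	return (sb->st_mode & S_IWGRP) ? '+' : '-';
}

/* Idle time in seconds, derived from the last access time. */
static time_t
tty_idle_seconds(const struct stat *sb, time_t now)
{
	return (now > sb->st_atime) ? now - sb->st_atime : 0;
}

/*
 * stat(2) the tty named by a utmp line, e.g. "ttyp0" -> "/dev/ttyp0".
 * This is the access that needs _PATH_DEV to be unveiled for reading.
 * Returns 0 on success, -1 on error.
 */
static int
tty_lookup(const char *line, struct stat *sb)
{
	char path[256];
	int n;

	n = snprintf(path, sizeof(path), "%s%s", _PATH_DEV, line);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return stat(path, sb);
}
```

Without the extra unveil(2) call, the stat(2) in tty_lookup() would fail once the filesystem view is restricted to _PATH_UTMP.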

> ===
> --- who.c.orig  Thu Aug 27 06:24:18 2020
> +++ who.c   Thu Aug 27 06:40:52 2020
> @@ -124,6 +124,10 @@
> 
> if (unveil(_PATH_UTMP, "r") == -1)
> err(1, "unveil");
> +   if (show_term || show_idle) {
> +   if (unveil(_PATH_DEV, "r") == -1)
> +   err(1, "unveil");
> +   }
> switch (argc) {
> case 0:     /* who */
> if (pledge("stdio rpath getpw", NULL) == -1)
> 

-- 
Sebastien Marie



Re: smtpd: make smarthost to use SNI when relaying

2020-05-31 Thread Sebastien Marie
Hi,

updated diff after millert@ and beck@ remarks:
- use a union to collapse in_addr + in6_addr
- don't allocate a buffer; use s->relay->domain->name directly
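
The address-literal test can be sketched on its own as below. is_ip_literal() and should_set_sni() are hypothetical helper names for illustration; the diff does the check inline.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Returns 1 if name parses as a literal IPv4 or IPv6 address, else 0.
 * One union is large enough for either result, so a single scratch
 * buffer serves both inet_pton(3) calls.
 */
static int
is_ip_literal(const char *name)
{
	union {
		struct in_addr	in4;
		struct in6_addr	in6;
	} addrbuf;

	return inet_pton(AF_INET, name, &addrbuf) == 1 ||
	    inet_pton(AF_INET6, name, &addrbuf) == 1;
}

/* SNI must only carry host names, never address literals. */
static int
should_set_sni(const char *name)
{
	return !is_ip_literal(name);
}
```

The parsed address is never used; only the return value of inet_pton(3) matters here.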

Thanks.
-- 
Sebastien Marie


diff 73b535ef4537e8454483912fc3420bc304759e96 /home/semarie/repos/openbsd/src
blob - d384692a0e43de47d645142a6b99e72b7d83b687
file + usr.sbin/smtpd/mta_session.c
--- usr.sbin/smtpd/mta_session.c
+++ usr.sbin/smtpd/mta_session.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -1604,6 +1605,10 @@ mta_cert_init_cb(void *arg, int status, const char *na
struct mta_session *s = arg;
void *ssl;
char *xname = NULL, *xcert = NULL;
+   union {
+   struct in_addr in4;
+   struct in6_addr in6;
+   } addrbuf;
 
if (s->flags & MTA_WAIT)
mta_tree_pop(_tls_init, s->id);
@@ -1623,6 +1628,22 @@ mta_cert_init_cb(void *arg, int status, const char *na
free(xcert);
if (ssl == NULL)
fatal("mta: ssl_mta_init");
+
+	/*
+	 * RFC4366 (SNI): Literal IPv4 and IPv6 addresses are not
+	 * permitted in "HostName".
+	 */
+	if (s->relay->domain->as_host == 1) {
+		if (inet_pton(AF_INET, s->relay->domain->name, &addrbuf) != 1 &&
+		    inet_pton(AF_INET6, s->relay->domain->name, &addrbuf) != 1) {
+			log_debug("%016"PRIx64" mta tls setting SNI name=%s",
+			    s->id, s->relay->domain->name);
+			if (SSL_set_tlsext_host_name(ssl, s->relay->domain->name) == 0)
+				log_warnx("%016"PRIx64" mta tls setting SNI failed",
+				    s->id);
+		}
+	}
+
io_start_tls(s->io, ssl);
 }
 



smtpd: make smarthost to use SNI when relaying

2020-05-30 Thread Sebastien Marie
Hi,

I would like smtpd to set SNI (SSL_set_tlsext_host_name) when connecting
to a smarthost to relay mail.

After digging a bit in libtls (to steal the right code) and smtpd (to see where
to put the stolen code), I have the following diff:


diff 73b535ef4537e8454483912fc3420bc304759e96 /home/semarie/repos/openbsd/src
blob - d384692a0e43de47d645142a6b99e72b7d83b687
file + usr.sbin/smtpd/mta_session.c
--- usr.sbin/smtpd/mta_session.c
+++ usr.sbin/smtpd/mta_session.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -1604,6 +1605,8 @@ mta_cert_init_cb(void *arg, int status, const char *na
struct mta_session *s = arg;
void *ssl;
char *xname = NULL, *xcert = NULL;
+   struct in_addr addrbuf4;
+   struct in6_addr addrbuf6;
 
if (s->flags & MTA_WAIT)
mta_tree_pop(_tls_init, s->id);
@@ -1623,6 +1626,24 @@ mta_cert_init_cb(void *arg, int status, const char *na
free(xcert);
if (ssl == NULL)
fatal("mta: ssl_mta_init");
+
+	/*
+	 * RFC4366 (SNI): Literal IPv4 and IPv6 addresses are not
+	 * permitted in "HostName".
+	 */
+	if (s->relay->domain->as_host == 1) {
+		xname = xstrdup(s->relay->domain->name);
+		if (inet_pton(AF_INET, xname, &addrbuf4) != 1 &&
+		    inet_pton(AF_INET6, xname, &addrbuf6) != 1) {
+			log_info("%016"PRIx64" mta setting SNI name=%s",
+			    s->id, xname);
+			if (SSL_set_tlsext_host_name(ssl, xname) == 0)
+				log_warnx("%016"PRIx64" mta setting SNI failed",
+				    s->id);
+		}
+		free(xname);
+	}
+
io_start_tls(s->io, ssl);
 }
 


From what I understood:

the mta_cert_init_cb() function is responsible for preparing a connection. the
SSL initialization (SSL_new() call) occurs in ssl_mta_init(), which was just
called, so it seems to be the right place to call SSL_set_tlsext_host_name().

We just need the hostname to configure it.

Regarding the mta_session structure, relay->domain->as_host is set to 1 when the
domain is linked to a smarthost configuration (or when the mx is an ip address,
I think). And in the smarthost case, domain->name is the hostname. For SNI, we
are excluding ip addresses, so I assume it should cope with domain->name being
an ip.

Could someone with a better understanding of the smtpd source code confirm that
the approach is right and comment?

Please note I have only tested it on a simple configuration.

Thanks.
-- 
Sebastien Marie



Re: random(4): use arc4random_ctx_buf() for large device reads

2020-05-25 Thread Sebastien Marie
On Mon, May 25, 2020 at 05:27:37PM +0200, Christian Weisgerber wrote:
> Sebastien Marie:
> 
> > > For large reads from /dev/random, use the arc4random_ctx_*() functions
> > > instead of hand-rolling the same code to set up a temporary ChaCha
> > > instance.
> > 
> > > Eventually, I would get rid of myctx, initialize lctx to NULL, and use
> > (lctx == NULL) to replace (myctx == 0).
> 
> Right.  Here we go:

Thanks. ok semarie@
 
> Index: rnd.c
> ===
> RCS file: /cvs/src/sys/dev/rnd.c,v
> retrieving revision 1.213
> diff -u -p -r1.213 rnd.c
> --- rnd.c 18 May 2020 15:00:16 -  1.213
> +++ rnd.c 25 May 2020 14:58:43 -
> @@ -589,6 +589,9 @@ arc4random_ctx_free(struct arc4random_ct
>  void
>  arc4random_ctx_buf(struct arc4random_ctx *ctx, void *buf, size_t n)
>  {
> +#ifndef KEYSTREAM_ONLY
> + memset(buf, 0, n);
> +#endif
>   chacha_encrypt_bytes((chacha_ctx *)ctx, buf, buf, n);
>  }
>  
> @@ -701,40 +704,31 @@ randomclose(dev_t dev, int flag, int mod
>  int
>  randomread(dev_t dev, struct uio *uio, int ioflag)
>  {
> - u_char  lbuf[KEYSZ+IVSZ];
> - chacha_ctx  lctx;
> + struct arc4random_ctx *lctx = NULL;
>   size_t  total = uio->uio_resid;
>   u_char  *buf;
> - int myctx = 0, ret = 0;
> + int ret = 0;
>  
>   if (uio->uio_resid == 0)
>   return 0;
>  
>   buf = malloc(POOLBYTES, M_TEMP, M_WAITOK);
> - if (total > RND_MAIN_MAX_BYTES) {
> - arc4random_buf(lbuf, sizeof(lbuf));
> - chacha_keysetup(&lctx, lbuf, KEYSZ * 8);
> - chacha_ivsetup(&lctx, lbuf + KEYSZ, NULL);
> - explicit_bzero(lbuf, sizeof(lbuf));
> - myctx = 1;
> - }
> + if (total > RND_MAIN_MAX_BYTES)
> + lctx = arc4random_ctx_new();
>  
>   while (ret == 0 && uio->uio_resid > 0) {
>   size_t  n = ulmin(POOLBYTES, uio->uio_resid);
>  
> - if (myctx) {
> -#ifndef KEYSTREAM_ONLY
> - memset(buf, 0, n);
> -#endif
> - chacha_encrypt_bytes(&lctx, buf, buf, n);
> - } else
> + if (lctx != NULL)
> + arc4random_ctx_buf(lctx, buf, n);
> + else
>   arc4random_buf(buf, n);
>   ret = uiomove(buf, n, uio);
>   if (ret == 0 && uio->uio_resid > 0)
>   yield();
>   }
> - if (myctx)
> - explicit_bzero(&lctx, sizeof(lctx));
> + if (lctx != NULL)
> + arc4random_ctx_free(lctx);
>   explicit_bzero(buf, POOLBYTES);
>   free(buf, M_TEMP, POOLBYTES);
>   return ret;
> -- 
> Christian "naddy" Weisgerber  na...@mips.inka.de

-- 
Sebastien Marie


