Almost two decades ago we started work on W^X.  The concept was simple:
pages that are writable should not be executable.  We applied this
concept object by object, trying to separate objects with different
qualities onto different pages.  The first one we handled was the signal
trampoline at the top of the stack.  We just kept making changes in
the same vein.  Eventually W^X came to some of our kernel address
spaces as well.

This also led to a push for creating more .rodata objects.

The fundamental concept is that an object should only have the
permissions necessary, and any other operation should fault.  The
only permission separations we have are kernel vs userland, and
then read, write, and execute.

How about we add another new permission!  This is not a hardware
permission, but a software permission.  It is opportunistically
enforced by the kernel.

The permission is MAP_STACK.  If you want to use memory as a stack,
you must mmap it with that flag bit.  The kernel does so automatically
for the main stack region of a process.  Two other types of stack
occur: thread stacks, and alternate signal stacks.  Those are handled
in clever ways: sigaltstack(2) and pthread_attr_setstack(3) replace
the memory they are given with a fresh MAP_STACK mapping.

When a system call happens, we check if the stack-pointer register
points to such a page.  If it doesn't, the program is killed.  We
have tightened the ABI.  You may no longer point your stack register
at non-stack memory.  You'll be killed.  This checking code is
machine-independent (MI), so it works on all platforms.

We can also perform this check on standard synchronous traps, for
instance page faults.  (We cannot yet perform it at standard
interrupts.)  This checking code has been written for i386, amd64, and
mips64.  Others are being worked on.

The check is fast.  We compare 5 variables for bounds.  If that
fast-path check fails we call a slower check which requires locking,
and which checks whether the process has other stacks that may be valid
in the region of the stack pointer location (for instance: threads).
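
The two-level check can be modelled in userland roughly as follows.
This is only a sketch: the struct and function names are hypothetical,
the cache fields loosely mirror p_spserial, p_spstart and p_spend from
the diff, and the linear search stands in for the locked map lookup the
kernel really performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-thread cache of the last stack region that passed the check. */
struct stack_cache {
	unsigned int serial;	/* map serial when the cache was filled */
	uintptr_t start;	/* inclusive low bound, 0 if never filled */
	uintptr_t end;		/* exclusive high bound */
};

/* One region marked as stack memory. */
struct stack_region {
	uintptr_t start;
	uintptr_t end;
};

/* Stand-in for the address map; serial changes when stacks change. */
struct addr_map {
	unsigned int serial;
	const struct stack_region *stacks;
	size_t nstacks;
};

/* Fast path: a handful of comparisons, no locking. */
static bool
sp_cache_hit(const struct addr_map *map, const struct stack_cache *c,
    uintptr_t sp)
{
	return map->serial == c->serial && c->start != 0 &&
	    sp >= c->start && sp < c->end;
}

/* Slow path: search all stack regions and refill the cache on success. */
static bool
sp_lookup(const struct addr_map *map, struct stack_cache *c, uintptr_t sp)
{
	size_t i;

	for (i = 0; i < map->nstacks; i++) {
		if (sp >= map->stacks[i].start && sp < map->stacks[i].end) {
			c->start = map->stacks[i].start;
			c->end = map->stacks[i].end;
			c->serial = map->serial;
			return true;
		}
	}
	return false;
}

/* Returns true if sp points into stack memory. */
bool
sp_is_valid(const struct addr_map *map, struct stack_cache *c, uintptr_t sp)
{
	if (sp_cache_hit(map, c, sp))
		return true;
	return sp_lookup(map, c, sp);
}
```

Bumping the serial whenever a stack mapping is created or removed
invalidates every cached range at once, so a thread can never keep
passing the fast path with bounds that describe a stack that no longer
exists.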


Since page permissions are generally applied on page boundaries, there
is a caveat: thread stacks and altstacks must now be page-aligned and a
multiple of the page size, so that we can enforce the MAP_STACK
attribute correctly.  It is possible that a few ports need some
massaging to satisfy this condition, but we haven't found any which
break yet.  A syslog_r call has been added so that we can identify
these failure cases.  Also, the faulting cases are quite verbose for
now, to help identify the programs we need to repair.
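
A port that supplies its own thread stack or altstack can test this
condition before handing the memory over.  The following is a minimal
sketch; the function name is hypothetical, and it queries the page size
with sysconf(), whereas the diff below tests against PTHREAD_STACK_MIN
in libpthread and PAGE_SIZE in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the diff): return true if the range
 * [addr, addr + size) starts on a page boundary and covers a non-zero
 * whole number of pages.
 */
bool
stack_is_page_aligned(const void *addr, size_t size)
{
	long r = sysconf(_SC_PAGESIZE);
	size_t pagesz;

	if (r <= 0)
		return false;
	pagesz = (size_t)r;

	return size != 0 &&
	    ((uintptr_t)addr % pagesz) == 0 &&
	    (size % pagesz) == 0;
}
```

Memory returned by mmap() passes this test whenever the requested size
is a multiple of the page size; a buffer carved out of the middle of a
malloc() allocation generally will not.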

This work has been done with Stefan Kempf (stefan@).

Index: lib/libc/sys/sigaltstack.2
===================================================================
RCS file: /cvs/src/lib/libc/sys/sigaltstack.2,v
retrieving revision 1.20
diff -u -p -u -r1.20 sigaltstack.2
--- lib/libc/sys/sigaltstack.2  11 Feb 2018 04:50:25 -0000      1.20
+++ lib/libc/sys/sigaltstack.2  5 Mar 2018 00:38:31 -0000
@@ -93,14 +93,13 @@ if the thread is currently on a signal s
 .Dv SS_DISABLE
 if the signal stack is currently disabled.
 .Pp
-The stack must be allocated using
-.Xr mmap 2
-with
+On OpenBSD
+the provided memory must be page-aligned.  It will be
+replaced (meaning zeroed) with a new
+.Ar MAP_ANON | Ar MAP_STACK
+mapping; the
 .Ar MAP_STACK
-to inform the kernel that the memory is being used as a stack.
-Otherwise, the first system call performed while operating on
-that stack will deliver
-.Dv SIGABRT .
+attribute will be disabled afterwards.
 .Sh NOTES
 The value
 .Dv SIGSTKSZ
@@ -157,6 +156,8 @@ member pointed to by the
 .Fa ss
 argument contains flags other than
 .Dv SS_DISABLE .
+.It Bq Er EINVAL
+Memory is not page aligned.
 .It Bq Er ENOMEM
 Size of alternate stack area is less than or equal to
 .Dv MINSIGSTKSZ .
Index: lib/libpthread/man/pthread_attr_setstack.3
===================================================================
RCS file: /cvs/src/lib/libpthread/man/pthread_attr_setstack.3,v
retrieving revision 1.4
diff -u -p -u -r1.4 pthread_attr_setstack.3
--- lib/libpthread/man/pthread_attr_setstack.3  5 Jun 2013 03:44:50 -0000       1.4
+++ lib/libpthread/man/pthread_attr_setstack.3  5 Mar 2018 00:41:00 -0000
@@ -39,6 +39,17 @@ and the size of the storage shall be
 bytes.
 The stacksize shall be at least
 .Dv PTHREAD_STACK_MIN .
+.Pp
+On OpenBSD
+the provided stack must be page-aligned.
+It will be replaced (meaning zeroed) with a new
+.Ar MAP_ANON | Ar MAP_STACK
+mapping.
+It is recommended that the initial mapping be allocated using
+an allocator which has a matching deallocator that discards whole
+pages, to clear the
+.Ar MAP_STACK
+attribute afterwards.
 .Sh RETURN VALUES
 Upon successful completion,
 .Fn pthread_attr_setstack
Index: lib/libpthread/man/pthread_attr_setstackaddr.3
===================================================================
RCS file: /cvs/src/lib/libpthread/man/pthread_attr_setstackaddr.3,v
retrieving revision 1.11
diff -u -p -u -r1.11 pthread_attr_setstackaddr.3
--- lib/libpthread/man/pthread_attr_setstackaddr.3      5 Jun 2013 03:44:50 -0000       1.11
+++ lib/libpthread/man/pthread_attr_setstackaddr.3      5 Mar 2018 00:46:29 -0000
@@ -34,6 +34,16 @@ attribute specifies the location of stor
 used for the created thread's stack.
 The size of the storage is at least
 .Dv PTHREAD_STACK_MIN .
+.Pp
+On OpenBSD
+the stack must have been allocated using
+.Xr mmap
+with the
+.Va MAP_STACK
+attribute.
+Otherwise, use of the stack will cause BUSERR faults.
+.Xr pthread_attr_setstack 3
+can avoid this problem because it knows the size of the stack to remap.
 .Sh RETURN VALUES
 Upon successful completion,
 .Fn pthread_attr_setstackaddr
Index: lib/librthread/rthread_attr.c
===================================================================
RCS file: /cvs/src/lib/librthread/rthread_attr.c,v
retrieving revision 1.23
diff -u -p -u -r1.23 rthread_attr.c
--- lib/librthread/rthread_attr.c       5 Sep 2017 02:40:54 -0000       1.23
+++ lib/librthread/rthread_attr.c       4 Mar 2018 20:08:23 -0000
@@ -19,8 +19,11 @@
  * generic attribute support
  */
 
+#include <sys/mman.h>
+
 #include <stdint.h>
 #include <stdlib.h>
+#include <syslog.h>
 #include <unistd.h>
 #include <errno.h>
 
@@ -118,13 +121,54 @@ int
 pthread_attr_setstack(pthread_attr_t *attrp, void *stackaddr, size_t stacksize)
 {
        int error;
+       volatile char *p = stackaddr;
+       size_t i;
+       struct syslog_data data = SYSLOG_DATA_INIT;
+
+       if (stacksize < PTHREAD_STACK_MIN) {
+               syslog_r(LOG_USER, &data,
+                   "pthread_attr_setstack(%p, %zu): "
+                   "stack size below min size %d",
+                   stackaddr, stacksize, PTHREAD_STACK_MIN);
+               return (EINVAL);
+       }
 
        /*
-        * XXX Add an alignment test, on stackaddr for stack-grows-up
-        * archs or on stackaddr+stacksize for stack-grows-down archs
+        * Make sure that the stack is page-aligned and a multiple
+        * of the page size
         */
-       if (stacksize < PTHREAD_STACK_MIN)
+       if (((uintptr_t)stackaddr % PTHREAD_STACK_MIN) != 0
+           || (stacksize % PTHREAD_STACK_MIN) != 0) {
+               syslog_r(LOG_USER, &data,
+                   "pthread_attr_setstack(%p, 0x%zx): "
+                   "unaligned thread stack start and/or size",
+                   stackaddr, stacksize);
                return (EINVAL);
+       }
+
+       /*
+        * We are going to re-mmap() stackaddr to MAP_STACK, but only
+        * if the entire range [stackaddr, stackaddr+stacksize) consists
+        * of valid addresses that are mapped PROT_READ|PROT_WRITE.
+        * Test this by reading and writing every page.
+        *
+        * XXX: What if the caller has SIGSEGV blocked or ignored?
+        * Then we won't crash here when entering an invalid mapping.
+        */
+       for (i = 0; i < stacksize; i += PTHREAD_STACK_MIN) {
+               char val = p[i];
+
+               p[i] = val;
+       }
+
+       if (mmap(stackaddr, stacksize, PROT_READ|PROT_WRITE,
+           MAP_FIXED|MAP_STACK|MAP_ANON|MAP_PRIVATE, -1, 0) == MAP_FAILED) {
+               syslog_r(LOG_USER, &data,
+                   "pthread_attr_setstack(%p, %zu): mmap error %m",
+                   stackaddr, stacksize);
+               return (errno);
+       }
+
        if ((error = pthread_attr_setstackaddr(attrp, stackaddr)))
                return (error);
        (*attrp)->stack_size = stacksize;
Index: sys/arch/amd64/amd64/trap.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v
retrieving revision 1.64
diff -u -p -u -r1.64 trap.c
--- sys/arch/amd64/amd64/trap.c 21 Feb 2018 19:24:15 -0000      1.64
+++ sys/arch/amd64/amd64/trap.c 4 Mar 2018 20:08:23 -0000
@@ -175,9 +175,33 @@ trap(struct trapframe *frame)
 #endif
 
        if (!KERNELMODE(frame->tf_cs, frame->tf_rflags)) {
+               vaddr_t sp = PROC_STACK(p);
+
                type |= T_USER;
                p->p_md.md_regs = frame;
                refreshcreds(p);
+
+               if (p->p_vmspace->vm_map.serial != p->p_spserial ||
+                   p->p_spstart == 0 || sp < p->p_spstart ||
+                   sp >= p->p_spend) {
+                       KERNEL_LOCK();
+                       if (!uvm_map_check_stack_range(p, sp)) {
+                               struct sigaction sa;
+
+                               printf("trap pid %d tid %d type %d: sp %lx not inside %lx-%lx\n",
+                                   curproc->p_p->ps_pid, p->p_tid,
+                                   (int)frame->tf_trapno, sp, p->p_spstart, p->p_spend);
+
+                               memset(&sa, 0, sizeof sa);
+                               sa.sa_handler = SIG_DFL;
+                               setsigvec(p, SIGABRT, &sa);
+                               sv.sival_ptr = (void *)frame->tf_rip;
+                               trapsignal(p, SIGABRT, type & ~T_USER,
+                                   ILL_BADSTK, sv);
+                       }
+
+                       KERNEL_UNLOCK();
+               }
        }
 
        switch (type) {
Index: sys/arch/i386/i386/trap.c
===================================================================
RCS file: /cvs/src/sys/arch/i386/i386/trap.c,v
retrieving revision 1.136
diff -u -p -u -r1.136 trap.c
--- sys/arch/i386/i386/trap.c   4 Oct 2017 17:41:01 -0000       1.136
+++ sys/arch/i386/i386/trap.c   5 Mar 2018 06:22:48 -0000
@@ -154,9 +154,36 @@ trap(struct trapframe *frame)
 #endif
 
        if (!KERNELMODE(frame->tf_cs, frame->tf_eflags)) {
+#if 1
+               vaddr_t sp = frame->tf_esp;
+#endif
+
                type |= T_USER;
                p->p_md.md_regs = frame;
                refreshcreds(p);
+
+#if 1
+               if (p->p_vmspace->vm_map.serial != p->p_spserial ||
+                   p->p_spstart == 0 || sp < p->p_spstart ||
+                   sp >= p->p_spend) {
+                       KERNEL_LOCK();
+                       if (!uvm_map_check_stack_range(p, sp)) {
+                               struct sigaction sa;
+
+                               printf("trap pid %d tid %d type %d: sp %lx not inside %lx-%lx\n",
+                                   curproc->p_p->ps_pid, p->p_tid,
+                                   (int)frame->tf_trapno, sp, p->p_spstart, p->p_spend);
+
+                               memset(&sa, 0, sizeof sa);
+                               sa.sa_handler = SIG_DFL;
+                               setsigvec(p, SIGABRT, &sa);
+                               sv.sival_ptr = (void *)frame->tf_eip;
+                               trapsignal(p, SIGABRT, type & ~T_USER,
+                                   ILL_BADSTK, sv);
+                       }
+                       KERNEL_UNLOCK();
+               }
+#endif
        }
 
        switch (type) {
Index: sys/arch/mips64/mips64/trap.c
===================================================================
RCS file: /cvs/src/sys/arch/mips64/mips64/trap.c,v
retrieving revision 1.130
diff -u -p -u -r1.130 trap.c
--- sys/arch/mips64/mips64/trap.c       2 Sep 2017 15:56:29 -0000       1.130
+++ sys/arch/mips64/mips64/trap.c       5 Mar 2018 16:58:57 -0000
@@ -251,8 +251,32 @@ trap(struct trapframe *trapframe)
        }
 #endif
 
-       if (type & T_USER)
+       if (type & T_USER) {
+               vaddr_t sp = trapframe->sp;
+
                refreshcreds(p);
+
+               if (p->p_vmspace->vm_map.serial != p->p_spserial ||
+                   p->p_spstart == 0 || sp < p->p_spstart ||
+                   sp >= p->p_spend) {
+                       KERNEL_LOCK();
+                       if (!uvm_map_check_stack_range(p, sp)) {
+                               struct sigaction sa;
+                               union sigval sv;
+
+                               printf("trap pid %d tid %d type %d: sp %lx not inside %lx-%lx\n",
+                                   curproc->p_p->ps_pid, p->p_tid, type,
+                                   sp, p->p_spstart, p->p_spend);
+
+                               memset(&sa, 0, sizeof sa);
+                               sa.sa_handler = SIG_DFL;
+                               setsigvec(p, SIGABRT, &sa);
+                               sv.sival_ptr = (void *)trapframe->pc;
+                               trapsignal(p, SIGABRT, 0, ILL_BADSTK, sv);
+                       }
+                       KERNEL_UNLOCK();
+               }
+       }
 
        itsa(trapframe, ci, p, type);
 
Index: sys/arch/powerpc/powerpc/trap.c
===================================================================
RCS file: /cvs/src/sys/arch/powerpc/powerpc/trap.c,v
retrieving revision 1.106
diff -u -p -u -r1.106 trap.c
--- sys/arch/powerpc/powerpc/trap.c     20 Dec 2016 12:08:01 -0000      1.106
+++ sys/arch/powerpc/powerpc/trap.c     5 Mar 2018 04:50:06 -0000
@@ -234,8 +234,33 @@ trap(struct trapframe *frame)
        db_expr_t offset;
 
        if (frame->srr1 & PSL_PR) {
+               vaddr_t sp = PROC_STACK(p);
+
                type |= EXC_USER;
                refreshcreds(p);
+
+               if (p->p_vmspace->vm_map.serial != p->p_spserial ||
+                   p->p_spstart == 0 || sp < p->p_spstart ||
+                   sp >= p->p_spend) {
+                       KERNEL_LOCK();
+                       if (!uvm_map_check_stack_range(p, sp)) {
+                               struct sigaction sa;
+
+                               printf("trap pid %d tid %d type %d: sp %lx not inside %lx-%lx\n",
+                                   curproc->p_p->ps_pid, p->p_tid,
+                                   (int)frame->tf_trapno, sp, p->p_spstart, p->p_spend);
+
+                               memset(&sa, 0, sizeof sa);
+                               sa.sa_handler = SIG_DFL;
+                               setsigvec(p, SIGABRT, &sa);
+                               sv.sival_ptr = (void *)frame->srr0;
+                               trapsignal(p, SIGABRT, type & ~T_USER,
+                                   ILL_BADSTK, sv);
+                       }
+
+                       KERNEL_UNLOCK();
+               }
+
        }
 
        switch (type) {
Index: sys/arch/sparc64/sparc64/trap.c
===================================================================
RCS file: /cvs/src/sys/arch/sparc64/sparc64/trap.c,v
retrieving revision 1.98
diff -u -p -u -r1.98 trap.c
--- sys/arch/sparc64/sparc64/trap.c     22 Jul 2017 15:17:49 -0000      1.98
+++ sys/arch/sparc64/sparc64/trap.c     5 Mar 2018 04:48:06 -0000
@@ -427,6 +427,29 @@ trap(struct trapframe64 *tf, unsigned ty
        p->p_md.md_tf = tf;     /* for ptrace/signals */
        refreshcreds(p);
 
+       vaddr_t sp = PROC_STACK(p);
+       if (p->p_vmspace->vm_map.serial != p->p_spserial ||
+           p->p_spstart == 0 || sp < p->p_spstart ||
+           sp >= p->p_spend) {
+               KERNEL_LOCK();
+               if (!uvm_map_check_stack_range(p, sp)) {
+                       struct sigaction sa;
+
+                       printf("trap pid %d tid %d type %d: sp %lx not inside %lx-%lx\n",
+                           curproc->p_p->ps_pid, p->p_tid,
+                           (int)frame->tf_trapno, sp, p->p_spstart, p->p_spend);
+
+                       memset(&sa, 0, sizeof sa);
+                       sa.sa_handler = SIG_DFL;
+                       setsigvec(p, SIGABRT, &sa);
+                       sv.sival_ptr = (void *)tf->tf_pc;
+                       trapsignal(p, SIGABRT, type & ~T_USER,
+                           ILL_BADSTK, sv);
+               }
+
+               KERNEL_UNLOCK();
+       }
+
        switch (type) {
 
        default:
Index: sys/kern/exec_subr.c
===================================================================
RCS file: /cvs/src/sys/kern/exec_subr.c,v
retrieving revision 1.54
diff -u -p -u -r1.54 exec_subr.c
--- sys/kern/exec_subr.c        10 Feb 2018 02:54:33 -0000      1.54
+++ sys/kern/exec_subr.c        4 Mar 2018 20:08:23 -0000
@@ -276,7 +276,8 @@ vmcmd_map_zero(struct proc *p, struct ex
        return (uvm_map(&p->p_vmspace->vm_map, &cmd->ev_addr,
            round_page(cmd->ev_len), NULL, UVM_UNKNOWN_OFFSET, 0,
            UVM_MAPFLAG(cmd->ev_prot, PROT_MASK, MAP_INHERIT_COPY,
-           MADV_NORMAL, UVM_FLAG_FIXED|UVM_FLAG_COPYONW)));
+           MADV_NORMAL, UVM_FLAG_FIXED|UVM_FLAG_COPYONW |
+           (cmd->ev_flags & VMCMD_STACK ? UVM_FLAG_STACK : 0))));
 }
 
 /*
@@ -379,17 +380,19 @@ exec_setup_stack(struct proc *p, struct 
 #ifdef MACHINE_STACK_GROWS_UP
        NEW_VMCMD(&epp->ep_vmcmds, vmcmd_map_zero,
            ((epp->ep_minsaddr - epp->ep_ssize) - epp->ep_maxsaddr),
-           epp->ep_maxsaddr + epp->ep_ssize, NULLVP, 0, PROT_NONE);
-       NEW_VMCMD(&epp->ep_vmcmds, vmcmd_map_zero, epp->ep_ssize,
+           epp->ep_maxsaddr + epp->ep_ssize, NULLVP, 0,
+           PROT_NONE);
+       NEW_VMCMD2(&epp->ep_vmcmds, vmcmd_map_zero, epp->ep_ssize,
            epp->ep_maxsaddr, NULLVP, 0,
-           PROT_READ | PROT_WRITE);
+           PROT_READ | PROT_WRITE, VMCMD_STACK);
 #else
        NEW_VMCMD(&epp->ep_vmcmds, vmcmd_map_zero,
            ((epp->ep_minsaddr - epp->ep_ssize) - epp->ep_maxsaddr),
-           epp->ep_maxsaddr, NULLVP, 0, PROT_NONE);
-       NEW_VMCMD(&epp->ep_vmcmds, vmcmd_map_zero, epp->ep_ssize,
+           epp->ep_maxsaddr, NULLVP, 0,
+           PROT_NONE);
+       NEW_VMCMD2(&epp->ep_vmcmds, vmcmd_map_zero, epp->ep_ssize,
            (epp->ep_minsaddr - epp->ep_ssize), NULLVP, 0,
-           PROT_READ | PROT_WRITE);
+           PROT_READ | PROT_WRITE, VMCMD_STACK);
 #endif
 
        return (0);
Index: sys/kern/init_main.c
===================================================================
RCS file: /cvs/src/sys/kern/init_main.c,v
retrieving revision 1.274
diff -u -p -u -r1.274 init_main.c
--- sys/kern/init_main.c        28 Feb 2018 18:47:33 -0000      1.274
+++ sys/kern/init_main.c        4 Mar 2018 20:08:23 -0000
@@ -651,7 +651,7 @@ start_init(void *arg)
        if (uvm_map(&p->p_vmspace->vm_map, &addr, PAGE_SIZE, 
            NULL, UVM_UNKNOWN_OFFSET, 0,
            UVM_MAPFLAG(PROT_READ | PROT_WRITE, PROT_MASK, MAP_INHERIT_COPY,
-           MADV_NORMAL, UVM_FLAG_FIXED|UVM_FLAG_OVERLAY|UVM_FLAG_COPYONW)))
+           MADV_NORMAL, UVM_FLAG_FIXED|UVM_FLAG_OVERLAY|UVM_FLAG_COPYONW|UVM_FLAG_STACK)))
                panic("init: couldn't allocate argument space");
 
        for (pathp = &initpaths[0]; (path = *pathp) != NULL; pathp++) {
Index: sys/kern/kern_sig.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_sig.c,v
retrieving revision 1.216
diff -u -p -u -r1.216 kern_sig.c
--- sys/kern/kern_sig.c 26 Feb 2018 13:33:25 -0000      1.216
+++ sys/kern/kern_sig.c 4 Mar 2018 20:08:23 -0000
@@ -554,6 +554,11 @@ sys_sigaltstack(struct proc *p, void *v,
        }
        if (ss.ss_size < MINSIGSTKSZ)
                return (ENOMEM);
+
+       error = uvm_map_remap_as_stack(p, (vaddr_t)ss.ss_sp, ss.ss_size);
+       if (error)
+               return (error);
+
        p->p_sigstk = ss;
        return (0);
 }
Index: sys/sys/exec.h
===================================================================
RCS file: /cvs/src/sys/sys/exec.h,v
retrieving revision 1.36
diff -u -p -u -r1.36 exec.h
--- sys/sys/exec.h      8 Feb 2017 21:04:44 -0000       1.36
+++ sys/sys/exec.h      4 Mar 2018 20:08:23 -0000
@@ -98,6 +98,7 @@ struct exec_vmcmd {
        int     ev_flags;
 #define VMCMD_RELATIVE  0x0001  /* ev_addr is relative to base entry */
 #define VMCMD_BASE      0x0002  /* marks a base entry */
+#define VMCMD_STACK     0x0004  /* create with UVM_FLAG_STACK */
 };
 
 #define        EXEC_DEFAULT_VMCMD_SETSIZE      8       /* # of cmds in set to start */
Index: sys/sys/proc.h
===================================================================
RCS file: /cvs/src/sys/sys/proc.h,v
retrieving revision 1.247
diff -u -p -u -r1.247 proc.h
--- sys/sys/proc.h      26 Feb 2018 13:43:51 -0000      1.247
+++ sys/sys/proc.h      4 Mar 2018 20:08:23 -0000
@@ -330,6 +330,10 @@ struct proc {
 #define        p_startcopy     p_sigmask
        sigset_t p_sigmask;     /* Current signal mask. */
 
+       u_int    p_spserial;
+       vaddr_t  p_spstart;
+       vaddr_t  p_spend;
+
        u_char  p_priority;     /* Process priority. */
        u_char  p_usrpri;       /* User-priority based on p_estcpu and ps_nice. */
        int     p_pledge_syscall;       /* Cache of current syscall */
Index: sys/sys/syscall_mi.h
===================================================================
RCS file: /cvs/src/sys/sys/syscall_mi.h,v
retrieving revision 1.18
diff -u -p -u -r1.18 syscall_mi.h
--- sys/sys/syscall_mi.h        20 Apr 2017 15:21:51 -0000      1.18
+++ sys/sys/syscall_mi.h        4 Mar 2018 20:08:23 -0000
@@ -32,6 +32,7 @@
  */
 
 #include <sys/pledge.h>
+#include <uvm/uvm_extern.h>
 
 #ifdef KTRACE
 #include <sys/ktrace.h>
@@ -48,6 +49,7 @@ mi_syscall(struct proc *p, register_t co
        uint64_t tval;
        int lock = !(callp->sy_flags & SY_NOLOCK);
        int error, pledged;
+       vaddr_t sp = PROC_STACK(p);
 
        /* refresh the thread's cache of the process's creds */
        refreshcreds(p);
@@ -65,7 +67,25 @@ mi_syscall(struct proc *p, register_t co
        }
 #endif
 
-       if (lock)
+       if (p->p_vmspace->vm_map.serial != p->p_spserial ||
+           p->p_spstart == 0 || sp < p->p_spstart || sp >= p->p_spend) {
+               KERNEL_LOCK();
+
+               if (!uvm_map_check_stack_range(p, sp)) {
+                       struct sigaction sa;
+
+                       printf("syscall: sp %lx not inside %lx-%lx\n",
+                           sp, p->p_spstart, p->p_spend);
+
+                       memset(&sa, 0, sizeof sa);
+                       sa.sa_handler = SIG_DFL;
+                       setsigvec(p, SIGABRT, &sa);
+                       psignal(p, SIGABRT);
+                       KERNEL_UNLOCK();
+                       return (EPERM);
+               }
+               lock = 1;
+       } else if (lock)
                KERNEL_LOCK();
        pledged = (p->p_p->ps_flags & PS_PLEDGE);
        if (pledged && (error = pledge_syscall(p, code, &tval))) {
Index: sys/uvm/uvm.h
===================================================================
RCS file: /cvs/src/sys/uvm/uvm.h,v
retrieving revision 1.61
diff -u -p -u -r1.61 uvm.h
--- sys/uvm/uvm.h       11 Aug 2016 01:17:33 -0000      1.61
+++ sys/uvm/uvm.h       4 Mar 2018 20:08:23 -0000
@@ -88,6 +88,7 @@ struct uvm {
 #define UVM_ET_NEEDSCOPY       0x08    /* needs_copy */
 #define UVM_ET_HOLE            0x10    /* no backend */
 #define UVM_ET_NOFAULT         0x20    /* don't fault */
+#define UVM_ET_STACK           0x40    /* this is a stack */
 #define UVM_ET_FREEMAPPED      0x80    /* map entry is on free list (DEBUG) */
 
 #define UVM_ET_ISOBJ(E)                (((E)->etype & UVM_ET_OBJ) != 0)
@@ -96,6 +97,7 @@ struct uvm {
 #define UVM_ET_ISNEEDSCOPY(E)  (((E)->etype & UVM_ET_NEEDSCOPY) != 0)
 #define UVM_ET_ISHOLE(E)       (((E)->etype & UVM_ET_HOLE) != 0)
 #define UVM_ET_ISNOFAULT(E)    (((E)->etype & UVM_ET_NOFAULT) != 0)
+#define UVM_ET_ISSTACK(E)      (((E)->etype & UVM_ET_STACK) != 0)
 
 #ifdef _KERNEL
 
Index: sys/uvm/uvm_extern.h
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_extern.h,v
retrieving revision 1.142
diff -u -p -u -r1.142 uvm_extern.h
--- sys/uvm/uvm_extern.h        30 Apr 2017 13:04:49 -0000      1.142
+++ sys/uvm/uvm_extern.h        4 Mar 2018 20:08:23 -0000
@@ -111,7 +111,7 @@ typedef int         vm_prot_t;
 #define UVM_FLAG_QUERY   0x0400000 /* do everything, except actual execution */
 #define UVM_FLAG_NOFAULT 0x0800000 /* don't fault */
 #define UVM_FLAG_UNMAP   0x1000000 /* unmap to make space */
-
+#define UVM_FLAG_STACK   0x2000000 /* page may contain a stack */
 
 /* macros to extract info */
 #define UVM_PROTECTION(X)      ((X) & PROT_MASK)
Index: sys/uvm/uvm_fault.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_fault.c,v
retrieving revision 1.92
diff -u -p -u -r1.92 uvm_fault.c
--- sys/uvm/uvm_fault.c 20 Jul 2017 18:22:25 -0000      1.92
+++ sys/uvm/uvm_fault.c 4 Mar 2018 20:08:23 -0000
@@ -234,7 +234,8 @@ uvmfault_amapcopy(struct uvm_faultinfo *
 
                /* copy if needed. */
                if (UVM_ET_ISNEEDSCOPY(ufi->entry))
-                       amap_copy(ufi->map, ufi->entry, M_NOWAIT, TRUE, 
+                       amap_copy(ufi->map, ufi->entry, M_NOWAIT,
+                               UVM_ET_ISSTACK(ufi->entry) ? FALSE : TRUE,
                                ufi->orig_rvaddr, ufi->orig_rvaddr + 1);
 
                /* didn't work?  must be out of RAM.  sleep. */
Index: sys/uvm/uvm_map.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_map.c,v
retrieving revision 1.233
diff -u -p -u -r1.233 uvm_map.c
--- sys/uvm/uvm_map.c   30 Nov 2017 00:36:10 -0000      1.233
+++ sys/uvm/uvm_map.c   6 Mar 2018 16:38:59 -0000
@@ -1000,8 +1000,14 @@ uvm_mapanon(struct vm_map *map, vaddr_t 
                KASSERT((*addr & PAGE_MASK) == 0);
 
                /* Check that the space is available. */
-               if (flags & UVM_FLAG_UNMAP)
+               if (flags & UVM_FLAG_UNMAP) {
+                       if ((flags & UVM_FLAG_STACK) &&
+                           !uvm_map_is_stack_remappable(map, *addr, sz)) {
+                               error = EINVAL;
+                               goto unlock;
+                       }
                        uvm_unmap_remove(map, *addr, *addr + sz, &dead, FALSE, TRUE);
+               }
                if (!uvm_map_isavail(map, NULL, &first, &last, *addr, sz)) {
                        error = ENOMEM;
                        goto unlock;
@@ -1064,6 +1070,11 @@ uvm_mapanon(struct vm_map *map, vaddr_t 
        entry->inheritance = inherit;
        entry->wired_count = 0;
        entry->advice = advice;
+       if (flags & UVM_FLAG_STACK) {
+               entry->etype |= UVM_ET_STACK;
+               if (flags & (UVM_FLAG_FIXED | UVM_FLAG_UNMAP))
+                       map->serial++;
+       }
        if (flags & UVM_FLAG_COPYONW) {
                entry->etype |= UVM_ET_COPYONWRITE;
                if ((flags & UVM_FLAG_OVERLAY) == 0)
@@ -1320,6 +1331,11 @@ uvm_map(struct vm_map *map, vaddr_t *add
        entry->inheritance = inherit;
        entry->wired_count = 0;
        entry->advice = advice;
+       if (flags & UVM_FLAG_STACK) {
+               entry->etype |= UVM_ET_STACK;
+               if (flags & UVM_FLAG_UNMAP)
+                       map->serial++;
+       }
        if (uobj)
                entry->etype |= UVM_ET_OBJ;
        else if (flags & UVM_FLAG_HOLE)
@@ -1746,6 +1762,139 @@ uvm_map_lookup_entry(struct vm_map *map,
 }
 
 /*
+ * Inside a vm_map, find the sp address and verify MAP_STACK, and also
+ * remember the low and high bounds of the region which is marked
+ * with MAP_STACK.  Return TRUE.
+ * If sp isn't in a MAP_STACK region, return FALSE.
+ */
+boolean_t
+uvm_map_check_stack_range(struct proc *p, vaddr_t sp)
+{
+       vm_map_t map = &p->p_vmspace->vm_map;
+       vm_map_entry_t entry;
+
+       if (sp < map->min_offset || sp >= map->max_offset)
+               return(FALSE);
+
+       /* lock map */
+       vm_map_lock_read(map);
+
+       /* lookup */
+       if (!uvm_map_lookup_entry(map, trunc_page(sp), &entry)) {
+               vm_map_unlock_read(map);
+               return(FALSE);
+       }
+
+       if ((entry->etype & UVM_ET_STACK) == 0) {
+               vm_map_unlock_read(map);
+               return (FALSE);
+       }
+       p->p_spstart = entry->start;
+       p->p_spend = entry->end;
+       p->p_spserial = map->serial;
+       vm_map_unlock_read(map);
+       return(TRUE);
+}
+
+/*
+ * Check whether the given address range can be converted to a MAP_STACK
+ * mapping.
+ *
+ * Must be called with map locked.
+ */
+boolean_t
+uvm_map_is_stack_remappable(struct vm_map *map, vaddr_t addr, vaddr_t sz)
+{
+       vaddr_t end = addr + sz;
+       struct vm_map_entry *first, *iter, *prev = NULL;
+
+       if (!uvm_map_lookup_entry(map, addr, &first)) {
+               printf("map stack 0x%lx-0x%lx of map %p failed: no mapping\n",
+                   addr, end, map);
+               return FALSE;
+       }
+
+       /*
+        * Check that the address range exists, is contiguous, and
+        * has the right protection.
+        */
+       for (iter = first; iter != NULL && iter->start < end;
+           prev = iter, iter = RBT_NEXT(uvm_map_addr, iter)) {
+               /*
+                * Make sure that we do not have holes in the range.
+                */
+#if 0
+               if (prev != NULL) {
+                       printf("prev->start 0x%lx, prev->end 0x%lx, "
+                           "iter->start 0x%lx, iter->end 0x%lx\n",
+                           prev->start, prev->end, iter->start, iter->end);
+               }
+#endif
+
+               if (prev != NULL && prev->end != iter->start) {
+                       printf("map stack 0x%lx-0x%lx of map %p failed: "
+                           "hole in range\n", addr, end, map);
+                       return FALSE;
+               }
+               if (iter->start == iter->end || UVM_ET_ISHOLE(iter)) {
+                       printf("map stack 0x%lx-0x%lx of map %p failed: "
+                           "hole in range\n", addr, end, map);
+                       return FALSE;
+               }
+
+               /*
+                * Now check the protection.
+                */
+#if 0
+               printf("iter prot: 0x%x\n", iter->protection);
+#endif
+               if (iter->protection != (PROT_READ | PROT_WRITE)) {
+                       printf("map stack 0x%lx-0x%lx of map %p failed: "
+                           "bad protection\n", addr, end, map);
+                       return FALSE;
+               }
+       }
+
+       return TRUE;
+}
+
+/*
+ * Remap an existing mapping as a stack range. If there exists
+ * a previous contiguous mapping with the given range [addr, addr + sz),
+ * with protection PROT_READ|PROT_WRITE, then the mapping is dropped,
+ * and a new anon mapping is created and marked as a stack.
+ *
+ * addr and sz must be a multiple of the page size.
+ *
+ * Must be called with map unlocked.
+ */
+int
+uvm_map_remap_as_stack(struct proc *p, vaddr_t addr, vaddr_t sz)
+{
+       vm_map_t map = &p->p_vmspace->vm_map;
+       vaddr_t end = addr + sz;
+
+       int error;
+       int flags = UVM_MAPFLAG(PROT_READ | PROT_WRITE,
+           PROT_READ | PROT_WRITE | PROT_EXEC,
+           MAP_INHERIT_COPY, MADV_NORMAL,
+           UVM_FLAG_STACK | UVM_FLAG_FIXED | UVM_FLAG_UNMAP |
+           UVM_FLAG_COPYONW);
+
+       if ((addr % PAGE_SIZE) != 0 || (sz % PAGE_SIZE) != 0) {
+               return EINVAL;
+       }
+       if (addr < map->min_offset || end >= map->max_offset || end < addr)
+               return EINVAL;
+
+       error = uvm_mapanon(map, &addr, sz, 0, flags);
+       if (error != 0)
+               printf("map stack for pid %d failed\n", p->p_p->ps_pid);
+
+       return error;
+}
+
+/*
  * uvm_map_pie: return a random load address for a PIE executable
  * properly aligned.
  */
@@ -1975,6 +2124,10 @@ uvm_unmap_remove(struct vm_map *map, vad
                        }
                }
 
+               /* A stack has been removed. */
+               if (UVM_ET_ISSTACK(entry) && (map->flags & VM_MAP_ISVMSPACE))
+                       map->serial++;
+
                /* Kill entry. */
                uvm_unmap_kill_entry(map, entry);
 
@@ -2084,7 +2237,8 @@ uvm_map_pageable_wire(struct vm_map *map
                    UVM_ET_ISNEEDSCOPY(iter) &&
                    ((iter->protection & PROT_WRITE) ||
                    iter->object.uvm_obj == NULL)) {
-                       amap_copy(map, iter, M_WAITOK, TRUE,
+                       amap_copy(map, iter, M_WAITOK,
+                           UVM_ET_ISSTACK(iter) ? FALSE : TRUE,
                            iter->start, iter->end);
                }
                iter->wired_count++;
@@ -2853,11 +3007,12 @@ uvm_map_printit(struct vm_map *map, bool
                    entry, entry->start, entry->end, entry->object.uvm_obj,
                    (long long)entry->offset, entry->aref.ar_amap,
                    entry->aref.ar_pageoff);
-               (*pr)("\tsubmap=%c, cow=%c, nc=%c, prot(max)=%d/%d, inh=%d, "
+               (*pr)("\tsubmap=%c, cow=%c, nc=%c, stack=%c, prot(max)=%d/%d, inh=%d, "
                    "wc=%d, adv=%d\n",
                    (entry->etype & UVM_ET_SUBMAP) ? 'T' : 'F',
                    (entry->etype & UVM_ET_COPYONWRITE) ? 'T' : 'F', 
                    (entry->etype & UVM_ET_NEEDSCOPY) ? 'T' : 'F',
+                   (entry->etype & UVM_ET_STACK) ? 'T' : 'F',
                    entry->protection, entry->max_protection,
                    entry->inheritance, entry->wired_count, entry->advice);
 
@@ -4165,7 +4320,8 @@ uvm_map_extract(struct vm_map *srcmap, v
        for (entry = first; entry != NULL && entry->start < end;
            entry = RBT_NEXT(uvm_map_addr, entry)) {
                if (UVM_ET_ISNEEDSCOPY(entry))
-                       amap_copy(srcmap, entry, M_NOWAIT, TRUE, start, end);
+                       amap_copy(srcmap, entry, M_NOWAIT,
+                           UVM_ET_ISSTACK(entry) ? FALSE : TRUE, start, end);
                if (UVM_ET_ISNEEDSCOPY(entry)) {
                        /*
                         * amap_copy failure
Index: sys/uvm/uvm_map.h
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_map.h,v
retrieving revision 1.59
diff -u -p -u -r1.59 uvm_map.h
--- sys/uvm/uvm_map.h   16 Sep 2016 03:39:25 -0000      1.59
+++ sys/uvm/uvm_map.h   4 Mar 2018 20:08:23 -0000
@@ -292,6 +292,7 @@ struct vm_map {
        struct pmap *           pmap;           /* Physical map */
        struct rwlock           lock;           /* Lock for map data */
        struct mutex            mtx;
+       u_int                   serial;         /* signals stack changes */
 
        struct uvm_map_addr     addr;           /* Entry tree, by addr */
 
@@ -393,6 +394,9 @@ int         uvm_map_inherit(vm_map_t, vaddr_t, 
 int            uvm_map_advice(vm_map_t, vaddr_t, vaddr_t, int);
 void           uvm_map_init(void);
 boolean_t      uvm_map_lookup_entry(vm_map_t, vaddr_t, vm_map_entry_t *);
+boolean_t      uvm_map_check_stack_range(struct proc *, vaddr_t sp);
+boolean_t      uvm_map_is_stack_remappable(vm_map_t, vaddr_t, vsize_t);
+int            uvm_map_remap_as_stack(struct proc *, vaddr_t, vsize_t);
 int            uvm_map_replace(vm_map_t, vaddr_t, vaddr_t,
                    vm_map_entry_t, int);
 int            uvm_map_reserve(vm_map_t, vsize_t, vaddr_t, vsize_t,
Index: sys/uvm/uvm_mmap.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_mmap.c,v
retrieving revision 1.147
diff -u -p -u -r1.147 uvm_mmap.c
--- sys/uvm/uvm_mmap.c  19 Feb 2018 08:59:53 -0000      1.147
+++ sys/uvm/uvm_mmap.c  4 Mar 2018 20:08:23 -0000
@@ -375,7 +375,6 @@ sys_mmap(struct proc *p, void *v, regist
        size = (vsize_t) SCARG(uap, len);
        prot = SCARG(uap, prot);
        flags = SCARG(uap, flags);
-       flags &= ~MAP_STACK;    /* XXX MAP_STACK coming in 2018 */
        fd = SCARG(uap, fd);
        pos = SCARG(uap, pos);
 
@@ -394,6 +393,16 @@ sys_mmap(struct proc *p, void *v, regist
                return (EINVAL);
        if ((flags & (MAP_FIXED|__MAP_NOREPLACE)) == __MAP_NOREPLACE)
                return (EINVAL);
+       if (flags & MAP_STACK) {
+               if ((flags & (MAP_ANON|MAP_PRIVATE)) != (MAP_ANON|MAP_PRIVATE))
+                       return (EINVAL);
+               if (flags & ~(MAP_STACK|MAP_FIXED|MAP_ANON|MAP_PRIVATE))
+                       return (EINVAL);
+               if (pos != 0)
+                       return (EINVAL);
+               if ((prot & (PROT_READ|PROT_WRITE)) != (PROT_READ|PROT_WRITE))
+                       return (EINVAL);
+       }
        if (size == 0)
                return (EINVAL);
 
@@ -671,7 +680,6 @@ sys_munmap(struct proc *p, void *v, regi
 
        TAILQ_INIT(&dead_entries);
        uvm_unmap_remove(map, addr, addr + size, &dead_entries, FALSE, TRUE);
-
        vm_map_unlock(map);     /* and unlock */
 
        uvm_unmap_detach(&dead_entries, 0);
@@ -1049,6 +1057,8 @@ uvm_mmapanon(vm_map_t map, vaddr_t *addr
        else
                /* shared: create amap now */
                uvmflag |= UVM_FLAG_OVERLAY;
+       if (flags & MAP_STACK)
+               uvmflag |= UVM_FLAG_STACK;
 
        /* set up mapping flags */
        uvmflag = UVM_MAPFLAG(prot, maxprot,
@@ -1155,6 +1165,8 @@ uvm_mmapfile(vm_map_t map, vaddr_t *addr
                uvmflag |= UVM_FLAG_COPYONW;
        if (flags & __MAP_NOFAULT)
                uvmflag |= (UVM_FLAG_NOFAULT | UVM_FLAG_OVERLAY);
+       if (flags & MAP_STACK)
+               uvmflag |= UVM_FLAG_STACK;
 
        /* set up mapping flags */
        uvmflag = UVM_MAPFLAG(prot, maxprot,
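
For userland code that sets up its own stacks, the sys_mmap() checks in
the diff above boil down to: MAP_STACK must come with MAP_ANON and
MAP_PRIVATE, may additionally carry only MAP_FIXED, needs a zero offset,
and needs at least PROT_READ|PROT_WRITE.  Below is a minimal sketch of
what a caller might do.  The helper names stack_mmap_args_ok() and
alloc_stack() are hypothetical and not part of the patch; the first only
mirrors the kernel checks so a caller can see which combinations would
get EINVAL.

```c
#define _DEFAULT_SOURCE		/* exposes MAP_ANON/MAP_STACK on glibc */
#include <sys/types.h>
#include <sys/mman.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userland mirror of the MAP_STACK argument checks
 * added to sys_mmap().  Returns 0 if those checks would accept the
 * arguments, EINVAL otherwise.
 */
int
stack_mmap_args_ok(int prot, int flags, off_t pos)
{
	if ((flags & MAP_STACK) == 0)
		return 0;	/* the checks only apply to MAP_STACK */
	if ((flags & (MAP_ANON|MAP_PRIVATE)) != (MAP_ANON|MAP_PRIVATE))
		return EINVAL;
	if (flags & ~(MAP_STACK|MAP_FIXED|MAP_ANON|MAP_PRIVATE))
		return EINVAL;
	if (pos != 0)
		return EINVAL;
	if ((prot & (PROT_READ|PROT_WRITE)) != (PROT_READ|PROT_WRITE))
		return EINVAL;
	return 0;
}

/*
 * Hypothetical helper: allocate len bytes usable as a stack.
 * Returns MAP_FAILED on error, like mmap(2).
 */
void *
alloc_stack(size_t len)
{
	return mmap(NULL, len, PROT_READ | PROT_WRITE,
	    MAP_STACK | MAP_ANON | MAP_PRIVATE, -1, 0);
}
```

A thread library or a sigaltstack(2) caller that allocates its own
memory would use something like alloc_stack() instead of malloc(3), so
that the stack pointer lands in a region the kernel has marked as a
stack.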
