Otto Moerbeek <o...@drijf.net> wrote:

> On Mon, Dec 06, 2021 at 05:59:41AM +0000, slembcke wrote:
>
> > So this is a fairly esoteric question, and I expect the answer might
> > be just as esoteric.
> >
> > I have a little toy fiber/stackless coroutine library that I made a
> > few years ago and have been using in some of my hobby projects.
> > (https://github.com/slembcke/Tina) The recent 7.0 release rekindled my
> > interest in OpenBSD, so I tried building my current game project for
> > it, and I was quite pleased that it mostly just needed a few ifdefs
> > changed. Well, except for the fiber library... There were two issues,
> > the first was trivial to fix. I just had to switch to mmap() +
> > MAP_STACK to allocate the stack buffers since OpenBSD requires it. The
> > second I found a workaround for easily enough, but I'm really curious
> > why it happens. If I mmap() a buffer, I can mark the final page as
> > PROT_NONE as a guard page. Then if I set %rsp to the guard page's
> > address and push a value, it subtracts and then writes the value just
> > below the guard page as expected on Linux/FreeBSD/Mac. However, on
> > OpenBSD this fails with a segfault. Either starting 16 bytes lower, or
> > replacing the push instruction with a sub and a mov to emulate it
> > works fine. So there is an easy workaround, but my question is why? I
> > assumed pagefaults were implemented in hardware by the MMU as an
> > interrupt or something during reads and writes. Is this a subtle bug
> > in OpenBSD, and if it is, how does it even know since the write
> > doesn't even affect the protected page?! (Protected memory is black
> > magic already! Hah!) I would assume that whatever is happening here
> > would have to be handled when initializing the stack for processes or
> > threads too.
> >
> > - Scott
> >
>
> We prefer to see code instead of stories.
>
> From your story it is not clear if you are changing the protection
> flags of a general buffer or a MAP_STACK region.
> If it is the latter,
> there are extra restrictions on the stack pointer that are enforced in
> the kernel on behalf of the user process. You should check
> /var/log/messages
or type dmesg

Upon every system call entry, both the PC and SP are range-checked
against the object they point to, vaguely providing an additional kind
of MMU flag bit. This check hinders a variety of ROP pivot methods.

Setting the SP just outside the object is considered invalid, since it
does not point *inside* an identifiable object; instead it points to
the first byte inside unmapped or differently mapped address space, and
a non-predecrement storage operation will act upon the wrong object.

Otto is suggesting you have something like this in your logs:

    [program]10549/243766 sp=7fa43021000 inside 7fa4301b000-7fa43021000: not MAP_STACK

(For now, these errors are printed, like pledge errors are; in a few
more years I hope to delete these publicly visible identifiers and have
authors recognize the failure mode without blatant logging.)

It is true the message says "inside". The check is
[7fa4301b000,7fa43021000), but I don't want to use bracket notation, so
the 7fa43021000 should be printed as 7fa43020fff. This should improve
the message, and lets us keep using the easy-to-understand word
"inside". (The format string for this message is in
sys/sys/syscall_mi.h.)

Index: sys/uvm/uvm_map.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_map.c,v
retrieving revision 1.279
diff -u -r1.279 uvm_map.c
--- sys/uvm/uvm_map.c	24 Oct 2021 15:23:52 -0000	1.279
+++ sys/uvm/uvm_map.c	6 Dec 2021 16:26:26 -0000
@@ -1894,7 +1894,7 @@
 	if (!ok) {
 		KERNEL_LOCK();
 		printf(fmt, p->p_p->ps_comm, p->p_p->ps_pid, p->p_tid,
-		    addr, ie->ie_start, ie->ie_end);
+		    addr, ie->ie_start, ie->ie_end-1);
 		p->p_p->ps_acflag |= AMAP;
 		sv.sival_ptr = (void *)PROC_PC(p);
 		trapsignal(p, SIGSEGV, 0, SEGV_ACCERR, sv);
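
For anyone following along, the interval arithmetic behind that message
can be sketched in a few lines of C. This is only an illustration under
stated assumptions, not the kernel code: `struct region`, `sp_inside()`
and `last_inside()` are hypothetical names invented here, and the
addresses are the ones from the example message. It shows why an SP
equal to the end address fails a half-open check, and why printing
end-1 makes the word "inside" accurate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a mapping; the kernel's real entry differs. */
struct region {
	uint64_t start;	/* first byte of the mapping (inclusive) */
	uint64_t end;	/* one past the last byte (exclusive) */
};

/* Half-open range check: start <= sp < end. */
static bool
sp_inside(const struct region *r, uint64_t sp)
{
	return sp >= r->start && sp < r->end;
}

/* Highest address that passes sp_inside(); this is the end-1 value. */
static uint64_t
last_inside(const struct region *r)
{
	return r->end - 1;
}
```

With the region 7fa4301b000-7fa43021000, `sp_inside()` rejects
sp=7fa43021000 and accepts an SP 16 bytes lower, which matches the
workaround described in the original question.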