Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Fri, Sep 7, 2012 at 9:40 PM, John Baldwin j...@freebsd.org wrote: On Friday, September 07, 2012 2:41:20 pm Konstantin Belousov wrote: I think these would be rare? There's no good reason for anything to write to a shared library that I can think of. install(1) does an atomic rename to swap in the new libraries already. After a second thought, I do not like your proposal as well. +x is set for shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means that such scripts are subject for write denial. Yeah, that's fair. Also, I hunted around to find the description of MAP_TEXT in Solaris 11. It seems from reading that that MAP_TEXT on Solaris isn't used to prevent writes ala VV_TEXT. Instead, it is used as a hint that is apparently used to use superpages for text. -- John Baldwin Hi, I'd like to finish this thread somehow. For security sake, it looks that bounding VV_TEXT with MAP_TEXT is not good idea. Now, I see only two possibilities how to solve the shared libraries issue in general. 1. To have one more permission flag, first for files on which VV_TEXT can be set and second for files on which VV_TEXT may not be set. 2. To activate shared libraries in kernel. The whole situation is following. There are two basic kinds of binaries in system. The first ones only need to be activated, the second ones need to be interpreted by an interpreter which is activated already. While activation is a concern of kernel and should be done in kernel, an interpretation is a concern of an interpreter and as such is done in userland. Unfortunately, even so different in nature, both share x+ permission and can't be distinguished by it. The shared libraries issue is that even they can be activated only, they are interpreted by dynamic linker instead. As VV_TEXT is kernel flag and can be set safely by kernel only, there is no way how to protect them by the flag in this situation. Svata ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Tue, Sep 4, 2012 at 3:00 PM, Konstantin Belousov kostik...@gmail.com wrote: On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote: On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote: Hi, I found out that while the running excecutables and a dynamic linker are protected against writing (ETXTBSY), the loaded shared libraries are not protected. The libraries are mapped by mmap() in dynamic linker (rtld) and there is no way how to set VV_TEXT flag on the libraries vnodes in mmap() code. In linux compability code \compat\linux\linux_misc.c, linux_uselib() sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag exists which informs mmap() that the mapped region will be used primarily for executing instructions (for better MMU utilization). With these on mind, I propose to implement MAP_TEXT option in mmap() and in case that underlying object is a vnode, set VV_TEXT flag on it. I already have implemented it and with rtld map_object() patch it works fine for me (of course). The rtld patch looks easy, however I'm not sure about mmap patch. After some investigation, it looks that VV_TEXT once set on a vnode remains set until last reference on the vnode is left. So, I don't bother with VV_TEXT unset in munmap() to be consistent. The executables and dynamic linker are activated in kernel, so VV_TEXT is set before activation and cleared if something failed. Shared library activation is done in dynamic linker (i.e., in userland). It's done in steps and mmaping the library is one from them. So, I think that VV_TEXT can be set in mmap() just after everything is finished successfully. This is right, the object reference counter is also used as VV_TEXT counter. It is somewhat unaccurate, but in practice does not cause issues. The patch itself is implemented in vm_mmap_vnode(). If I want to set VV_TEXT flag on a vnode, I need an exclusive lock. In current code, the exclusive lock flag is (mis)used as a flag for vnode_pager_update_writecount() call. (I hope that I didn't miss something.) So, the patch is bigger slightly. I defined the MAP_TEXT flag in extented flags sections. However, I'm feeling the relation to MAP_STACK flag, but not sure if and when reserved flags (in other flags section) can be re-used. Svata Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -199,7 +199,8 @@ data_prot = convert_prot(segs[i]-p_flags); data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED; if (mmap(data_addr, data_vlimit - data_vaddr, data_prot, - data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) { + data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) == + (caddr_t) -1) { I am not sure that we shall mark all segments mappings with MAP_TEXT. I understand the logic of the change, since we do not want data segment to be changed under us. Still, having MAP_TEXT for non-text segments looks strange. I agree. However, only way how to recognize a text segment is an executable flag set. The new patch for map_object.c is following: Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -442,5 +442,10 @@ */ if (!(elfflags PF_W)) flags |= MAP_NOCORE; +/* + * Executable mappings are marked MAP_TEXT. + */ +if (elfflags PF_X) + flags |= MAP_TEXT; return flags; } _rtld_error(%s: mmap of data failed: %s, path, rtld_strerror(errno)); goto error1; Index: sys/vm/vm_mmap.c === --- sys/vm/vm_mmap.c (revision 239770) +++ sys/vm/vm_mmap.c (working copy) @@ -1258,10 +1258,13 @@ struct mount *mp; struct ucred *cred; int error, flags, locktype, vfslocked; + int writeable_shared; mp = vp-v_mount; cred = td-td_ucred; - if ((*maxprotp VM_PROT_WRITE) (*flagsp MAP_SHARED)) + flags = *flagsp; + writeable_shared = ((*maxprotp VM_PROT_WRITE) (flags MAP_SHARED)); + if (writeable_shared || ((flags MAP_TEXT) != 0)) locktype = LK_EXCLUSIVE; else locktype = LK_SHARED; @@ -1271,7 +1274,6 @@ return (error); } foff = *foffp; - flags = *flagsp; obj = vp-v_object; if (vp-v_type == VREG) { /* @@ -1294,7 +1296,7 @@ return (error);
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote: On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote: On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote: On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote: Hi, I found out that while the running excecutables and a dynamic linker are protected against writing (ETXTBSY), the loaded shared libraries are not protected. The libraries are mapped by mmap() in dynamic linker (rtld) and there is no way how to set VV_TEXT flag on the libraries vnodes in mmap() code. In linux compability code \compat\linux\linux_misc.c, linux_uselib() sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag exists which informs mmap() that the mapped region will be used primarily for executing instructions (for better MMU utilization). With these on mind, I propose to implement MAP_TEXT option in mmap() and in case that underlying object is a vnode, set VV_TEXT flag on it. I already have implemented it and with rtld map_object() patch it works fine for me (of course). The rtld patch looks easy, however I'm not sure about mmap patch. After some investigation, it looks that VV_TEXT once set on a vnode remains set until last reference on the vnode is left. So, I don't bother with VV_TEXT unset in munmap() to be consistent. The executables and dynamic linker are activated in kernel, so VV_TEXT is set before activation and cleared if something failed. Shared library activation is done in dynamic linker (i.e., in userland). It's done in steps and mmaping the library is one from them. So, I think that VV_TEXT can be set in mmap() just after everything is finished successfully. This is right, the object reference counter is also used as VV_TEXT counter. It is somewhat unaccurate, but in practice does not cause issues. The patch itself is implemented in vm_mmap_vnode(). If I want to set VV_TEXT flag on a vnode, I need an exclusive lock. In current code, the exclusive lock flag is (mis)used as a flag for vnode_pager_update_writecount() call. (I hope that I didn't miss something.) So, the patch is bigger slightly. I defined the MAP_TEXT flag in extented flags sections. However, I'm feeling the relation to MAP_STACK flag, but not sure if and when reserved flags (in other flags section) can be re-used. Svata Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -199,7 +199,8 @@ data_prot = convert_prot(segs[i]-p_flags); data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED; if (mmap(data_addr, data_vlimit - data_vaddr, data_prot, - data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) { + data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) == + (caddr_t) -1) { I am not sure that we shall mark all segments mappings with MAP_TEXT. I understand the logic of the change, since we do not want data segment to be changed under us. Still, having MAP_TEXT for non-text segments looks strange. I agree. However, only way how to recognize a text segment is an executable flag set. The new patch for map_object.c is following: Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -442,5 +442,10 @@ */ if (!(elfflags PF_W)) flags |= MAP_NOCORE; +/* + * Executable mappings are marked MAP_TEXT. + */ +if (elfflags PF_X) + flags |= MAP_TEXT; return flags; } _rtld_error(%s: mmap of data failed: %s, path, rtld_strerror(errno)); goto error1; Index: sys/vm/vm_mmap.c === --- sys/vm/vm_mmap.c (revision 239770) +++ sys/vm/vm_mmap.c (working copy) @@ -1258,10 +1258,13 @@ struct mount *mp; struct ucred *cred; int error, flags, locktype, vfslocked; + int writeable_shared; mp = vp-v_mount; cred = td-td_ucred; - if ((*maxprotp VM_PROT_WRITE) (*flagsp MAP_SHARED)) + flags = *flagsp; + writeable_shared = ((*maxprotp VM_PROT_WRITE) (flags MAP_SHARED)); + if (writeable_shared || ((flags MAP_TEXT) != 0)) locktype = LK_EXCLUSIVE; else locktype = LK_SHARED; @@ -1271,7 +1274,6 @@ return (error); } foff = *foffp; - flags =
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote: On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote: On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote: On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote: On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote: Hi, I found out that while the running excecutables and a dynamic linker are protected against writing (ETXTBSY), the loaded shared libraries are not protected. The libraries are mapped by mmap() in dynamic linker (rtld) and there is no way how to set VV_TEXT flag on the libraries vnodes in mmap() code. In linux compability code \compat\linux\linux_misc.c, linux_uselib() sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag exists which informs mmap() that the mapped region will be used primarily for executing instructions (for better MMU utilization). With these on mind, I propose to implement MAP_TEXT option in mmap() and in case that underlying object is a vnode, set VV_TEXT flag on it. I already have implemented it and with rtld map_object() patch it works fine for me (of course). The rtld patch looks easy, however I'm not sure about mmap patch. After some investigation, it looks that VV_TEXT once set on a vnode remains set until last reference on the vnode is left. So, I don't bother with VV_TEXT unset in munmap() to be consistent. The executables and dynamic linker are activated in kernel, so VV_TEXT is set before activation and cleared if something failed. Shared library activation is done in dynamic linker (i.e., in userland). It's done in steps and mmaping the library is one from them. So, I think that VV_TEXT can be set in mmap() just after everything is finished successfully. This is right, the object reference counter is also used as VV_TEXT counter. It is somewhat unaccurate, but in practice does not cause issues. The patch itself is implemented in vm_mmap_vnode(). If I want to set VV_TEXT flag on a vnode, I need an exclusive lock. In current code, the exclusive lock flag is (mis)used as a flag for vnode_pager_update_writecount() call. (I hope that I didn't miss something.) So, the patch is bigger slightly. I defined the MAP_TEXT flag in extented flags sections. However, I'm feeling the relation to MAP_STACK flag, but not sure if and when reserved flags (in other flags section) can be re-used. Svata Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -199,7 +199,8 @@ data_prot = convert_prot(segs[i]-p_flags); data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED; if (mmap(data_addr, data_vlimit - data_vaddr, data_prot, - data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) { + data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) == + (caddr_t) -1) { I am not sure that we shall mark all segments mappings with MAP_TEXT. I understand the logic of the change, since we do not want data segment to be changed under us. Still, having MAP_TEXT for non-text segments looks strange. I agree. However, only way how to recognize a text segment is an executable flag set. The new patch for map_object.c is following: Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -442,5 +442,10 @@ */ if (!(elfflags PF_W)) flags |= MAP_NOCORE; +/* + * Executable mappings are marked MAP_TEXT. + */ +if (elfflags PF_X) + flags |= MAP_TEXT; return flags; } _rtld_error(%s: mmap of data failed: %s, path, rtld_strerror(errno)); goto error1; Index: sys/vm/vm_mmap.c === --- sys/vm/vm_mmap.c (revision 239770) +++ sys/vm/vm_mmap.c (working copy) @@ -1258,10 +1258,13 @@ struct mount *mp; struct ucred *cred; int error, flags, locktype, vfslocked; + int writeable_shared; mp = vp-v_mount; cred = td-td_ucred; - if ((*maxprotp VM_PROT_WRITE) (*flagsp MAP_SHARED)) + flags = *flagsp; + writeable_shared = ((*maxprotp VM_PROT_WRITE) (flags MAP_SHARED)); + if (writeable_shared || ((flags MAP_TEXT) != 0)) locktype =
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Friday, September 07, 2012 12:02:08 pm Konstantin Belousov wrote: On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote: On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote: On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote: 2. I do not see what would prevent malicious local user from mmaping arbitrary file readonly with MAP_TEXT, thus blocking any modifications to the file. Note that this is not a problem for executables, because kernel only sets VV_TEXT on executables if +x permission is set and file is valid binary which kernel is able to execute. E.g. you might block log writes with VV_TEXT, or other user editing session or whatever, having just read access to corresponding files. Am I wrong ? Hmm, I do think 2) is a bit of a show-stopper. I do wonder why one needs MAP_TEXT at all or if you could key this off of mmap() with PROT_EXEC? Do we require +x permissions for PROT_EXEC? No, it seems we only require a file opened with FREAD. Hmm, perhaps rtld could open a separate fd for PROT_EXEC mappings that used O_EXEC and mmap()'ing an O_EXEC fd could enable VV_TEXT? That would require a file to have +x permisson for an mmap() to enable VV_TEXT. It would also make MAP_TEXT unneeded. It sounds good for me. I will try to patch it this way. However, do you think that will be acceptable to set +x permission to shared libraries in general? Shared libraries have +x by default already. You could make rtld fall back to using the O_RDONLY fd if it can't open an O_EXEC fd in which case you don't get the VV_TEXT protection, but rtld would be backwards compatible. Setting +x on shared libraries can be done. But setting VV_TEXT for such mappings is definitely non-standard behaviour, that could cause locking surprises for unaware system administrator. The issuw would be very hard to diagnose. I'm not sure it would be that obscure. It's the same as getting an error that a binary is busy. In that case you would resort to 'ps' to find the offending process. For this case (a shared library being busy), you could look at the output of 'procstat -af' to find which processes have executable mappings of the library. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Fri, Sep 07, 2012 at 12:21:37PM -0400, John Baldwin wrote: On Friday, September 07, 2012 12:02:08 pm Konstantin Belousov wrote: On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote: On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote: On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote: 2. I do not see what would prevent malicious local user from mmaping arbitrary file readonly with MAP_TEXT, thus blocking any modifications to the file. Note that this is not a problem for executables, because kernel only sets VV_TEXT on executables if +x permission is set and file is valid binary which kernel is able to execute. E.g. you might block log writes with VV_TEXT, or other user editing session or whatever, having just read access to corresponding files. Am I wrong ? Hmm, I do think 2) is a bit of a show-stopper. I do wonder why one needs MAP_TEXT at all or if you could key this off of mmap() with PROT_EXEC? Do we require +x permissions for PROT_EXEC? No, it seems we only require a file opened with FREAD. Hmm, perhaps rtld could open a separate fd for PROT_EXEC mappings that used O_EXEC and mmap()'ing an O_EXEC fd could enable VV_TEXT? That would require a file to have +x permisson for an mmap() to enable VV_TEXT. It would also make MAP_TEXT unneeded. It sounds good for me. I will try to patch it this way. However, do you think that will be acceptable to set +x permission to shared libraries in general? Shared libraries have +x by default already. You could make rtld fall back to using the O_RDONLY fd if it can't open an O_EXEC fd in which case you don't get the VV_TEXT protection, but rtld would be backwards compatible. Where is it ? On fresh stable/9 machine, I have in /lib: -r--r--r-- 1 root wheel 7216 Dec 3 2011 libipx.so.5 -r--r--r-- 1 root wheel19800 Jun 28 18:33 libjail.so.1 -r--r--r-- 1 root wheel13752 Jun 28 18:33 libkiconv.so.4 ... Setting +x on shared libraries can be done. But setting VV_TEXT for such mappings is definitely non-standard behaviour, that could cause locking surprises for unaware system administrator. The issuw would be very hard to diagnose. I'm not sure it would be that obscure. It's the same as getting an error that a binary is busy. In that case you would resort to 'ps' to find the offending process. For this case (a shared library being busy), you could look at the output of 'procstat -af' to find which processes have executable mappings of the library. I much more worry about rtld reporting 'shared library is busy'. It is fine to not be able to write to mapped library. Rtld errors are usually quite worrying for users, and they just do not understand them. pgpjX3VimO77U.pgp Description: PGP signature
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Friday, September 07, 2012 12:42:18 pm Konstantin Belousov wrote: On Fri, Sep 07, 2012 at 12:21:37PM -0400, John Baldwin wrote: On Friday, September 07, 2012 12:02:08 pm Konstantin Belousov wrote: On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote: On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote: On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote: 2. I do not see what would prevent malicious local user from mmaping arbitrary file readonly with MAP_TEXT, thus blocking any modifications to the file. Note that this is not a problem for executables, because kernel only sets VV_TEXT on executables if +x permission is set and file is valid binary which kernel is able to execute. E.g. you might block log writes with VV_TEXT, or other user editing session or whatever, having just read access to corresponding files. Am I wrong ? Hmm, I do think 2) is a bit of a show-stopper. I do wonder why one needs MAP_TEXT at all or if you could key this off of mmap() with PROT_EXEC? Do we require +x permissions for PROT_EXEC? No, it seems we only require a file opened with FREAD. Hmm, perhaps rtld could open a separate fd for PROT_EXEC mappings that used O_EXEC and mmap()'ing an O_EXEC fd could enable VV_TEXT? That would require a file to have +x permisson for an mmap() to enable VV_TEXT. It would also make MAP_TEXT unneeded. It sounds good for me. I will try to patch it this way. However, do you think that will be acceptable to set +x permission to shared libraries in general? Shared libraries have +x by default already. You could make rtld fall back to using the O_RDONLY fd if it can't open an O_EXEC fd in which case you don't get the VV_TEXT protection, but rtld would be backwards compatible. Where is it ? On fresh stable/9 machine, I have in /lib: -r--r--r-- 1 root wheel 7216 Dec 3 2011 libipx.so.5 -r--r--r-- 1 root wheel19800 Jun 28 18:33 libjail.so.1 -r--r--r-- 1 root wheel13752 Jun 28 18:33 libkiconv.so.4 ... Hmm, I was looking in /usr/local/lib. It seems at least some ports do install libraries with +x: -rwxr-xr-x 1 root wheel210494 Apr 8 2011 /usr/local/lib/libxml++-2.6.so.2 -rwxr-xr-x 1 root wheel 1515416 Apr 8 2011 /usr/local/lib/libxml2.so.5 -rwxr-xr-x 1 root wheel 44389 Apr 8 2011 /usr/local/lib/libxmlparse.so.1 -rwxr-xr-x 1 root wheel100672 Apr 8 2011 /usr/local/lib/libxmltok.so.1 -rwxr-xr-x 1 root wheel266775 Apr 8 2011 /usr/local/lib/libxslt.so.2 -rw-r--r-- 1 root wheel764602 Apr 8 2011 /usr/local/lib/libxvidcore.so.4 -rwxr-xr-x 1 root wheel 53379 Apr 9 2011 /usr/local/lib/libzip.so.1 Setting +x on shared libraries can be done. But setting VV_TEXT for such mappings is definitely non-standard behaviour, that could cause locking surprises for unaware system administrator. The issuw would be very hard to diagnose. I'm not sure it would be that obscure. It's the same as getting an error that a binary is busy. In that case you would resort to 'ps' to find the offending process. For this case (a shared library being busy), you could look at the output of 'procstat -af' to find which processes have executable mappings of the library. I much more worry about rtld reporting 'shared library is busy'. It is fine to not be able to write to mapped library. Rtld errors are usually quite worrying for users, and they just do not understand them. I think these would be rare? There's no good reason for anything to write to a shared library that I can think of. install(1) does an atomic rename to swap in the new libraries already. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Fri, Sep 07, 2012 at 02:05:28PM -0400, John Baldwin wrote: On Friday, September 07, 2012 12:42:18 pm Konstantin Belousov wrote: On Fri, Sep 07, 2012 at 12:21:37PM -0400, John Baldwin wrote: On Friday, September 07, 2012 12:02:08 pm Konstantin Belousov wrote: On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote: On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote: On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote: 2. I do not see what would prevent malicious local user from mmaping arbitrary file readonly with MAP_TEXT, thus blocking any modifications to the file. Note that this is not a problem for executables, because kernel only sets VV_TEXT on executables if +x permission is set and file is valid binary which kernel is able to execute. E.g. you might block log writes with VV_TEXT, or other user editing session or whatever, having just read access to corresponding files. Am I wrong ? Hmm, I do think 2) is a bit of a show-stopper. I do wonder why one needs MAP_TEXT at all or if you could key this off of mmap() with PROT_EXEC? Do we require +x permissions for PROT_EXEC? No, it seems we only require a file opened with FREAD. Hmm, perhaps rtld could open a separate fd for PROT_EXEC mappings that used O_EXEC and mmap()'ing an O_EXEC fd could enable VV_TEXT? That would require a file to have +x permisson for an mmap() to enable VV_TEXT. It would also make MAP_TEXT unneeded. It sounds good for me. I will try to patch it this way. However, do you think that will be acceptable to set +x permission to shared libraries in general? Shared libraries have +x by default already. You could make rtld fall back to using the O_RDONLY fd if it can't open an O_EXEC fd in which case you don't get the VV_TEXT protection, but rtld would be backwards compatible. Where is it ? On fresh stable/9 machine, I have in /lib: -r--r--r-- 1 root wheel 7216 Dec 3 2011 libipx.so.5 -r--r--r-- 1 root wheel19800 Jun 28 18:33 libjail.so.1 -r--r--r-- 1 root wheel13752 Jun 28 18:33 libkiconv.so.4 ... Hmm, I was looking in /usr/local/lib. It seems at least some ports do install libraries with +x: -rwxr-xr-x 1 root wheel210494 Apr 8 2011 /usr/local/lib/libxml++-2.6.so.2 -rwxr-xr-x 1 root wheel 1515416 Apr 8 2011 /usr/local/lib/libxml2.so.5 -rwxr-xr-x 1 root wheel 44389 Apr 8 2011 /usr/local/lib/libxmlparse.so.1 -rwxr-xr-x 1 root wheel100672 Apr 8 2011 /usr/local/lib/libxmltok.so.1 -rwxr-xr-x 1 root wheel266775 Apr 8 2011 /usr/local/lib/libxslt.so.2 -rw-r--r-- 1 root wheel764602 Apr 8 2011 /usr/local/lib/libxvidcore.so.4 -rwxr-xr-x 1 root wheel 53379 Apr 9 2011 /usr/local/lib/libzip.so.1 Setting +x on shared libraries can be done. But setting VV_TEXT for such mappings is definitely non-standard behaviour, that could cause locking surprises for unaware system administrator. The issuw would be very hard to diagnose. I'm not sure it would be that obscure. It's the same as getting an error that a binary is busy. In that case you would resort to 'ps' to find the offending process. For this case (a shared library being busy), you could look at the output of 'procstat -af' to find which processes have executable mappings of the library. I much more worry about rtld reporting 'shared library is busy'. It is fine to not be able to write to mapped library. Rtld errors are usually quite worrying for users, and they just do not understand them. I think these would be rare? There's no good reason for anything to write to a shared library that I can think of. install(1) does an atomic rename to swap in the new libraries already. After a second thought, I do not like your proposal as well. +x is set for shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means that such scripts are subject for write denial. pgphor0hI2qC3.pgp Description: PGP signature
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Fri, 2012-09-07 at 21:41 +0300, Konstantin Belousov wrote: After a second thought, I do not like your proposal as well. +x is set for shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means that such scripts are subject for write denial. You say that like it's a bad thing. I hate it when I accidentally edit a script that's running and then the script fails because I did. I would be much happier if it acted just like any other executable and prevented modification while it's running. -- Ian ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Fri, Sep 07, 2012 at 12:48:19PM -0600, Ian Lepore wrote: On Fri, 2012-09-07 at 21:41 +0300, Konstantin Belousov wrote: After a second thought, I do not like your proposal as well. +x is set for shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means that such scripts are subject for write denial. You say that like it's a bad thing. I hate it when I accidentally edit a script that's running and then the script fails because I did. I would be much happier if it acted just like any other executable and prevented modification while it's running. For me, if other user can block my modifications of my script by the mere fact that script has o+rx rights, is indeed bad. I do use real machines sometime. pgp9lb5pP7Whf.pgp Description: PGP signature
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Fri, 2012-09-07 at 21:53 +0300, Konstantin Belousov wrote: On Fri, Sep 07, 2012 at 12:48:19PM -0600, Ian Lepore wrote: On Fri, 2012-09-07 at 21:41 +0300, Konstantin Belousov wrote: After a second thought, I do not like your proposal as well. +x is set for shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means that such scripts are subject for write denial. You say that like it's a bad thing. I hate it when I accidentally edit a script that's running and then the script fails because I did. I would be much happier if it acted just like any other executable and prevented modification while it's running. For me, if other user can block my modifications of my script by the mere fact that script has o+rx rights, is indeed bad. I do use real machines sometime. But you don't feel the same way about a compiled program? I see absolutely no difference between the two, conceptually. To me, changing an application while it's running is bad. It makes no difference what language the application is written in. -- Ian ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Friday, September 07, 2012 2:41:20 pm Konstantin Belousov wrote: I think these would be rare? There's no good reason for anything to write to a shared library that I can think of. install(1) does an atomic rename to swap in the new libraries already. After a second thought, I do not like your proposal as well. +x is set for shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means that such scripts are subject for write denial. Yeah, that's fair. Also, I hunted around to find the description of MAP_TEXT in Solaris 11. It seems from reading that that MAP_TEXT on Solaris isn't used to prevent writes ala VV_TEXT. Instead, it is used as a hint that is apparently used to use superpages for text. -- John Baldwin ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote: Hi, I found out that while the running excecutables and a dynamic linker are protected against writing (ETXTBSY), the loaded shared libraries are not protected. The libraries are mapped by mmap() in dynamic linker (rtld) and there is no way how to set VV_TEXT flag on the libraries vnodes in mmap() code. In linux compability code \compat\linux\linux_misc.c, linux_uselib() sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag exists which informs mmap() that the mapped region will be used primarily for executing instructions (for better MMU utilization). With these on mind, I propose to implement MAP_TEXT option in mmap() and in case that underlying object is a vnode, set VV_TEXT flag on it. I already have implemented it and with rtld map_object() patch it works fine for me (of course). The rtld patch looks easy, however I'm not sure about mmap patch. After some investigation, it looks that VV_TEXT once set on a vnode remains set until last reference on the vnode is left. So, I don't bother with VV_TEXT unset in munmap() to be consistent. The executables and dynamic linker are activated in kernel, so VV_TEXT is set before activation and cleared if something failed. Shared library activation is done in dynamic linker (i.e., in userland). It's done in steps and mmaping the library is one from them. So, I think that VV_TEXT can be set in mmap() just after everything is finished successfully. This is right, the object reference counter is also used as VV_TEXT counter. It is somewhat unaccurate, but in practice does not cause issues. The patch itself is implemented in vm_mmap_vnode(). If I want to set VV_TEXT flag on a vnode, I need an exclusive lock. In current code, the exclusive lock flag is (mis)used as a flag for vnode_pager_update_writecount() call. (I hope that I didn't miss something.) So, the patch is bigger slightly. I defined the MAP_TEXT flag in extented flags sections. However, I'm feeling the relation to MAP_STACK flag, but not sure if and when reserved flags (in other flags section) can be re-used. Svata Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -199,7 +199,8 @@ data_prot = convert_prot(segs[i]-p_flags); data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED; if (mmap(data_addr, data_vlimit - data_vaddr, data_prot, - data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) { + data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) == + (caddr_t) -1) { I am not sure that we shall mark all segments mappings with MAP_TEXT. I understand the logic of the change, since we do not want data segment to be changed under us. Still, having MAP_TEXT for non-text segments looks strange. I agree. However, only way how to recognize a text segment is an executable flag set. The new patch for map_object.c is following: Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -442,5 +442,10 @@ */ if (!(elfflags PF_W)) flags |= MAP_NOCORE; +/* + * Executable mappings are marked MAP_TEXT. + */ +if (elfflags PF_X) + flags |= MAP_TEXT; return flags; } _rtld_error(%s: mmap of data failed: %s, path, rtld_strerror(errno)); goto error1; Index: sys/vm/vm_mmap.c === --- sys/vm/vm_mmap.c (revision 239770) +++ sys/vm/vm_mmap.c (working copy) @@ -1258,10 +1258,13 @@ struct mount *mp; struct ucred *cred; int error, flags, locktype, vfslocked; + int writeable_shared; mp = vp-v_mount; cred = td-td_ucred; - if ((*maxprotp VM_PROT_WRITE) (*flagsp MAP_SHARED)) + flags = *flagsp; + writeable_shared = ((*maxprotp VM_PROT_WRITE) (flags MAP_SHARED)); + if (writeable_shared || ((flags MAP_TEXT) != 0)) locktype = LK_EXCLUSIVE; else locktype = LK_SHARED; @@ -1271,7 +1274,6 @@ return (error); } foff = *foffp; - flags = *flagsp; obj = vp-v_object; if (vp-v_type == VREG) { /* @@ -1294,7 +1296,7 @@ return (error); } } - if (locktype == LK_EXCLUSIVE) { + if (writeable_shared) { *writecounted = TRUE; vnode_pager_update_writecount(obj, 0, objsize); }
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote: On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote: Hi, I found out that while the running excecutables and a dynamic linker are protected against writing (ETXTBSY), the loaded shared libraries are not protected. The libraries are mapped by mmap() in dynamic linker (rtld) and there is no way how to set VV_TEXT flag on the libraries vnodes in mmap() code. In linux compability code \compat\linux\linux_misc.c, linux_uselib() sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag exists which informs mmap() that the mapped region will be used primarily for executing instructions (for better MMU utilization). With these on mind, I propose to implement MAP_TEXT option in mmap() and in case that underlying object is a vnode, set VV_TEXT flag on it. I already have implemented it and with rtld map_object() patch it works fine for me (of course). The rtld patch looks easy, however I'm not sure about mmap patch. After some investigation, it looks that VV_TEXT once set on a vnode remains set until last reference on the vnode is left. So, I don't bother with VV_TEXT unset in munmap() to be consistent. The executables and dynamic linker are activated in kernel, so VV_TEXT is set before activation and cleared if something failed. Shared library activation is done in dynamic linker (i.e., in userland). It's done in steps and mmaping the library is one from them. So, I think that VV_TEXT can be set in mmap() just after everything is finished successfully. This is right, the object reference counter is also used as VV_TEXT counter. It is somewhat unaccurate, but in practice does not cause issues. The patch itself is implemented in vm_mmap_vnode(). If I want to set VV_TEXT flag on a vnode, I need an exclusive lock. In current code, the exclusive lock flag is (mis)used as a flag for vnode_pager_update_writecount() call. (I hope that I didn't miss something.) So, the patch is bigger slightly. I defined the MAP_TEXT flag in extented flags sections. However, I'm feeling the relation to MAP_STACK flag, but not sure if and when reserved flags (in other flags section) can be re-used. Svata Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -199,7 +199,8 @@ data_prot = convert_prot(segs[i]-p_flags); data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED; if (mmap(data_addr, data_vlimit - data_vaddr, data_prot, - data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) { + data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) == + (caddr_t) -1) { I am not sure that we shall mark all segments mappings with MAP_TEXT. I understand the logic of the change, since we do not want data segment to be changed under us. Still, having MAP_TEXT for non-text segments looks strange. I agree. However, only way how to recognize a text segment is an executable flag set. The new patch for map_object.c is following: Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -442,5 +442,10 @@ */ if (!(elfflags PF_W)) flags |= MAP_NOCORE; +/* + * Executable mappings are marked MAP_TEXT. + */ +if (elfflags PF_X) + flags |= MAP_TEXT; return flags; } _rtld_error(%s: mmap of data failed: %s, path, rtld_strerror(errno)); goto error1; Index: sys/vm/vm_mmap.c === --- sys/vm/vm_mmap.c (revision 239770) +++ sys/vm/vm_mmap.c (working copy) @@ -1258,10 +1258,13 @@ struct mount *mp; struct ucred *cred; int error, flags, locktype, vfslocked; + int writeable_shared; mp = vp-v_mount; cred = td-td_ucred; - if ((*maxprotp VM_PROT_WRITE) (*flagsp MAP_SHARED)) + flags = *flagsp; + writeable_shared = ((*maxprotp VM_PROT_WRITE) (flags MAP_SHARED)); + if (writeable_shared || ((flags MAP_TEXT) != 0)) locktype = LK_EXCLUSIVE; else locktype = LK_SHARED; @@ -1271,7 +1274,6 @@ return (error); } foff = *foffp; - flags = *flagsp; obj = vp-v_object; if (vp-v_type == VREG) { /* @@ -1294,7 +1296,7 @@ return (error); } } - if (locktype ==
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote: On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote: On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com wrote: On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote: Hi, I found out that while the running excecutables and a dynamic linker are protected against writing (ETXTBSY), the loaded shared libraries are not protected. The libraries are mapped by mmap() in dynamic linker (rtld) and there is no way how to set VV_TEXT flag on the libraries vnodes in mmap() code. In linux compability code \compat\linux\linux_misc.c, linux_uselib() sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag exists which informs mmap() that the mapped region will be used primarily for executing instructions (for better MMU utilization). With these on mind, I propose to implement MAP_TEXT option in mmap() and in case that underlying object is a vnode, set VV_TEXT flag on it. I already have implemented it and with rtld map_object() patch it works fine for me (of course). The rtld patch looks easy, however I'm not sure about mmap patch. After some investigation, it looks that VV_TEXT once set on a vnode remains set until last reference on the vnode is left. So, I don't bother with VV_TEXT unset in munmap() to be consistent. The executables and dynamic linker are activated in kernel, so VV_TEXT is set before activation and cleared if something failed. Shared library activation is done in dynamic linker (i.e., in userland). It's done in steps and mmaping the library is one from them. So, I think that VV_TEXT can be set in mmap() just after everything is finished successfully. This is right, the object reference counter is also used as VV_TEXT counter. It is somewhat unaccurate, but in practice does not cause issues. The patch itself is implemented in vm_mmap_vnode(). If I want to set VV_TEXT flag on a vnode, I need an exclusive lock. In current code, the exclusive lock flag is (mis)used as a flag for vnode_pager_update_writecount() call. (I hope that I didn't miss something.) So, the patch is bigger slightly. I defined the MAP_TEXT flag in extented flags sections. However, I'm feeling the relation to MAP_STACK flag, but not sure if and when reserved flags (in other flags section) can be re-used. Svata Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -199,7 +199,8 @@ data_prot = convert_prot(segs[i]-p_flags); data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED; if (mmap(data_addr, data_vlimit - data_vaddr, data_prot, - data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) { + data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) == + (caddr_t) -1) { I am not sure that we shall mark all segments mappings with MAP_TEXT. I understand the logic of the change, since we do not want data segment to be changed under us. Still, having MAP_TEXT for non-text segments looks strange. I agree. However, only way how to recognize a text segment is an executable flag set. The new patch for map_object.c is following: Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -442,5 +442,10 @@ */ if (!(elfflags PF_W)) flags |= MAP_NOCORE; +/* + * Executable mappings are marked MAP_TEXT. + */ +if (elfflags PF_X) + flags |= MAP_TEXT; return flags; } _rtld_error(%s: mmap of data failed: %s, path, rtld_strerror(errno)); goto error1; Index: sys/vm/vm_mmap.c === --- sys/vm/vm_mmap.c (revision 239770) +++ sys/vm/vm_mmap.c (working copy) @@ -1258,10 +1258,13 @@ struct mount *mp; struct ucred *cred; int error, flags, locktype, vfslocked; + int writeable_shared; mp = vp-v_mount; cred = td-td_ucred; - if ((*maxprotp VM_PROT_WRITE) (*flagsp MAP_SHARED)) + flags = *flagsp; + writeable_shared = ((*maxprotp VM_PROT_WRITE) (flags MAP_SHARED)); + if (writeable_shared || ((flags MAP_TEXT) != 0)) locktype = LK_EXCLUSIVE; else locktype = LK_SHARED; @@ -1271,7 +1274,6 @@ return (error); } foff = *foffp; - flags = *flagsp; obj = vp-v_object; if (vp-v_type == VREG) {
[patch] mmap() MAP_TEXT implementation (to use for shared libraries)
Hi, I found out that while the running excecutables and a dynamic linker are protected against writing (ETXTBSY), the loaded shared libraries are not protected. The libraries are mapped by mmap() in dynamic linker (rtld) and there is no way how to set VV_TEXT flag on the libraries vnodes in mmap() code. In linux compability code \compat\linux\linux_misc.c, linux_uselib() sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag exists which informs mmap() that the mapped region will be used primarily for executing instructions (for better MMU utilization). With these on mind, I propose to implement MAP_TEXT option in mmap() and in case that underlying object is a vnode, set VV_TEXT flag on it. I already have implemented it and with rtld map_object() patch it works fine for me (of course). The rtld patch looks easy, however I'm not sure about mmap patch. After some investigation, it looks that VV_TEXT once set on a vnode remains set until last reference on the vnode is left. So, I don't bother with VV_TEXT unset in munmap() to be consistent. The executables and dynamic linker are activated in kernel, so VV_TEXT is set before activation and cleared if something failed. Shared library activation is done in dynamic linker (i.e., in userland). It's done in steps and mmaping the library is one from them. So, I think that VV_TEXT can be set in mmap() just after everything is finished successfully. The patch itself is implemented in vm_mmap_vnode(). If I want to set VV_TEXT flag on a vnode, I need an exclusive lock. In current code, the exclusive lock flag is (mis)used as a flag for vnode_pager_update_writecount() call. (I hope that I didn't miss something.) So, the patch is bigger slightly. I defined the MAP_TEXT flag in extented flags sections. However, I'm feeling the relation to MAP_STACK flag, but not sure if and when reserved flags (in other flags section) can be re-used. Svata Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -199,7 +199,8 @@ data_prot = convert_prot(segs[i]-p_flags); data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED; if (mmap(data_addr, data_vlimit - data_vaddr, data_prot, - data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) { + data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) == + (caddr_t) -1) { _rtld_error(%s: mmap of data failed: %s, path, rtld_strerror(errno)); goto error1; Index: sys/vm/vm_mmap.c === --- sys/vm/vm_mmap.c(revision 239770) +++ sys/vm/vm_mmap.c(working copy) @@ -1258,10 +1258,13 @@ struct mount *mp; struct ucred *cred; int error, flags, locktype, vfslocked; + int writeable_shared; mp = vp-v_mount; cred = td-td_ucred; - if ((*maxprotp VM_PROT_WRITE) (*flagsp MAP_SHARED)) + flags = *flagsp; + writeable_shared = ((*maxprotp VM_PROT_WRITE) (flags MAP_SHARED)); + if (writeable_shared || ((flags MAP_TEXT) != 0)) locktype = LK_EXCLUSIVE; else locktype = LK_SHARED; @@ -1271,7 +1274,6 @@ return (error); } foff = *foffp; - flags = *flagsp; obj = vp-v_object; if (vp-v_type == VREG) { /* @@ -1294,7 +1296,7 @@ return (error); } } - if (locktype == LK_EXCLUSIVE) { + if (writeable_shared) { *writecounted = TRUE; vnode_pager_update_writecount(obj, 0, objsize); } @@ -1337,6 +1339,14 @@ error = ENOMEM; goto done; } + /* +* If MAP_TEXT is announced, set VV_TEXT so no one can write +* to the executable. +*/ + if ((flags MAP_TEXT) != 0) { + ASSERT_VOP_ELOCKED(vp, vv_text); + vp-v_vflag |= VV_TEXT; + } *objp = obj; *flagsp = flags; Index: sys/sys/mman.h === --- sys/sys/mman.h (revision 239770) +++ sys/sys/mman.h (working copy) @@ -91,6 +91,7 @@ */ #defineMAP_NOCORE 0x0002 /* dont include these pages in a coredump */ #defineMAP_PREFAULT_READ 0x0004 /* prefault mapping for reading */ +#defineMAP_TEXT 0x0008 /* map code segment */ #endif /* __BSD_VISIBLE */ #if __POSIX_VISIBLE = 199309 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
on 03/09/2012 13:35 Svatopluk Kraus said the following: After some investigation, it looks that VV_TEXT once set on a vnode remains set until last reference on the vnode is left. There was an idea to turn VV_TEXT flag into a v_text counter. Maybe something like that would be useful indeed? -- Andriy Gapon ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)
On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote: Hi, I found out that while the running excecutables and a dynamic linker are protected against writing (ETXTBSY), the loaded shared libraries are not protected. The libraries are mapped by mmap() in dynamic linker (rtld) and there is no way how to set VV_TEXT flag on the libraries vnodes in mmap() code. In linux compability code \compat\linux\linux_misc.c, linux_uselib() sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag exists which informs mmap() that the mapped region will be used primarily for executing instructions (for better MMU utilization). With these on mind, I propose to implement MAP_TEXT option in mmap() and in case that underlying object is a vnode, set VV_TEXT flag on it. I already have implemented it and with rtld map_object() patch it works fine for me (of course). The rtld patch looks easy, however I'm not sure about mmap patch. After some investigation, it looks that VV_TEXT once set on a vnode remains set until last reference on the vnode is left. So, I don't bother with VV_TEXT unset in munmap() to be consistent. The executables and dynamic linker are activated in kernel, so VV_TEXT is set before activation and cleared if something failed. Shared library activation is done in dynamic linker (i.e., in userland). It's done in steps and mmaping the library is one from them. So, I think that VV_TEXT can be set in mmap() just after everything is finished successfully. This is right, the object reference counter is also used as VV_TEXT counter. It is somewhat unaccurate, but in practice does not cause issues. The patch itself is implemented in vm_mmap_vnode(). If I want to set VV_TEXT flag on a vnode, I need an exclusive lock. In current code, the exclusive lock flag is (mis)used as a flag for vnode_pager_update_writecount() call. (I hope that I didn't miss something.) So, the patch is bigger slightly. I defined the MAP_TEXT flag in extented flags sections. However, I'm feeling the relation to MAP_STACK flag, but not sure if and when reserved flags (in other flags section) can be re-used. Svata Index: libexec/rtld-elf/map_object.c === --- libexec/rtld-elf/map_object.c (revision 239770) +++ libexec/rtld-elf/map_object.c (working copy) @@ -199,7 +199,8 @@ data_prot = convert_prot(segs[i]-p_flags); data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED; if (mmap(data_addr, data_vlimit - data_vaddr, data_prot, - data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) { + data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) == + (caddr_t) -1) { I am not sure that we shall mark all segments mappings with MAP_TEXT. I understand the logic of the change, since we do not want data segment to be changed under us. Still, having MAP_TEXT for non-text segments looks strange. _rtld_error(%s: mmap of data failed: %s, path, rtld_strerror(errno)); goto error1; Index: sys/vm/vm_mmap.c === --- sys/vm/vm_mmap.c (revision 239770) +++ sys/vm/vm_mmap.c (working copy) @@ -1258,10 +1258,13 @@ struct mount *mp; struct ucred *cred; int error, flags, locktype, vfslocked; + int writeable_shared; mp = vp-v_mount; cred = td-td_ucred; - if ((*maxprotp VM_PROT_WRITE) (*flagsp MAP_SHARED)) + flags = *flagsp; + writeable_shared = ((*maxprotp VM_PROT_WRITE) (flags MAP_SHARED)); + if (writeable_shared || ((flags MAP_TEXT) != 0)) locktype = LK_EXCLUSIVE; else locktype = LK_SHARED; @@ -1271,7 +1274,6 @@ return (error); } foff = *foffp; - flags = *flagsp; obj = vp-v_object; if (vp-v_type == VREG) { /* @@ -1294,7 +1296,7 @@ return (error); } } - if (locktype == LK_EXCLUSIVE) { + if (writeable_shared) { *writecounted = TRUE; vnode_pager_update_writecount(obj, 0, objsize); } @@ -1337,6 +1339,14 @@ error = ENOMEM; goto done; } + /* + * If MAP_TEXT is announced, set VV_TEXT so no one can write + * to the executable. + */ + if ((flags MAP_TEXT) != 0) { + ASSERT_VOP_ELOCKED(vp, vv_text); + vp-v_vflag |= VV_TEXT; + } I do not think we want to set VV_TEXT for device vnodes. *objp = obj; *flagsp = flags; Index: sys/sys/mman.h === --- sys/sys/mman.h(revision 239770) +++ sys/sys/mman.h(working copy) @@ -91,6 +91,7 @@ */ #define