Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-10-01 Thread Svatopluk Kraus
On Fri, Sep 7, 2012 at 9:40 PM, John Baldwin j...@freebsd.org wrote:
 On Friday, September 07, 2012 2:41:20 pm Konstantin Belousov wrote:
  I think these would be rare?  There's no good reason for anything to write 
  to
  a shared library that I can think of.  install(1) does an atomic rename to 
  swap
  in the new libraries already.

 After a second thought, I do not like your proposal as well. +x is set for
 shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means
 that such scripts are subject for write denial.

 Yeah, that's fair.  Also, I hunted around to find the description of MAP_TEXT
 in Solaris 11.  It seems from reading that that MAP_TEXT on Solaris isn't used
 to prevent writes ala VV_TEXT.  Instead, it is used as a hint that is
 apparently used to use superpages for text.

 --
 John Baldwin

Hi,

  I'd like to finish this thread somehow. For security sake, it looks
that bounding VV_TEXT with MAP_TEXT is not good idea. Now, I see only
two possibilities how to solve the shared libraries issue in general.

  1. To have one more permission flag, first for files on which
VV_TEXT can be set and second for files on which VV_TEXT may not be
set.

  2. To activate shared libraries in kernel.

  The whole situation is following.

  There are two basic kinds of binaries in system. The first ones only
need to be activated, the second ones need to be interpreted by an
interpreter which is activated already. While activation is a concern
of kernel and should be done in kernel, an interpretation is a concern
of an interpreter and as such is done in userland. Unfortunately, even
so different in nature, both share x+ permission and can't be
distinguished by it.

  The shared libraries issue is that even they can be activated only,
they are interpreted by dynamic linker instead. As VV_TEXT is kernel
flag and can be set safely by kernel only, there is no way how to
protect them by the flag in this situation.

   Svata
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread Svatopluk Kraus
On Tue, Sep 4, 2012 at 3:00 PM, Konstantin Belousov kostik...@gmail.com wrote:
 On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote:
 On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com 
 wrote:
  On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote:
  Hi,
 
I found out that while the running excecutables and a dynamic linker
  are protected against writing (ETXTBSY), the loaded shared libraries
  are not protected. The libraries are mapped by mmap() in dynamic
  linker (rtld) and there is no way how to set VV_TEXT flag on the
  libraries vnodes in mmap() code.
 
In linux compability code \compat\linux\linux_misc.c, linux_uselib()
  sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag
  exists which informs mmap() that the mapped region will be used
  primarily for executing instructions (for better MMU utilization).
  With these on mind, I propose to implement MAP_TEXT option in mmap()
  and in case that underlying object is a vnode, set VV_TEXT flag on it.
 
I already have implemented it and with rtld map_object() patch it
  works fine for me (of course). The rtld patch looks easy, however I'm
  not sure about mmap patch.
 
After some investigation, it looks that VV_TEXT once set on a vnode
  remains set until last reference on the vnode is left. So, I don't
  bother with VV_TEXT unset in munmap() to be consistent. The
  executables and dynamic linker are activated in kernel, so VV_TEXT is
  set before activation and cleared if something failed. Shared library
  activation is done in dynamic linker (i.e., in userland). It's done in
  steps and mmaping the library is one from them. So, I think that
  VV_TEXT can be set in mmap() just after everything is finished
  successfully.
  This is right, the object reference counter is also used as
  VV_TEXT counter. It is somewhat unaccurate, but in practice does
  not cause issues.
 
 
The patch itself is implemented in vm_mmap_vnode(). If I want to set
  VV_TEXT flag on a vnode, I need an exclusive lock. In current code,
  the exclusive lock flag is (mis)used as a flag for
  vnode_pager_update_writecount() call. (I hope that I didn't miss
  something.) So, the patch is bigger slightly.
 
I defined the MAP_TEXT flag in extented flags sections. However, I'm
  feeling the relation to MAP_STACK flag, but not sure if and when
  reserved flags (in other flags section) can be re-used.
 
 Svata
 
 
Index: libexec/rtld-elf/map_object.c
  ===
  --- libexec/rtld-elf/map_object.c (revision 239770)
  +++ libexec/rtld-elf/map_object.c (working copy)
  @@ -199,7 +199,8 @@
data_prot = convert_prot(segs[i]-p_flags);
data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED;
if (mmap(data_addr, data_vlimit - data_vaddr, data_prot,
  -   data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) 
  {
  +   data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) ==
  + (caddr_t) -1) {
  I am not sure that we shall mark all segments mappings with MAP_TEXT.
  I understand the logic of the change, since we do not want data segment
  to be changed under us. Still, having MAP_TEXT for non-text segments looks
  strange.
 

 I agree. However, only way how to recognize a text segment is an
 executable flag set. The new patch for map_object.c is following:

 Index: libexec/rtld-elf/map_object.c
 ===
 --- libexec/rtld-elf/map_object.c (revision 239770)
 +++ libexec/rtld-elf/map_object.c (working copy)
 @@ -442,5 +442,10 @@
   */
  if (!(elfflags  PF_W))
   flags |= MAP_NOCORE;
 +/*
 + * Executable mappings are marked MAP_TEXT.
 + */
 +if (elfflags  PF_X)
 + flags |= MAP_TEXT;
  return flags;
  }


_rtld_error(%s: mmap of data failed: %s, path,
rtld_strerror(errno));
goto error1;
  Index: sys/vm/vm_mmap.c
  ===
  --- sys/vm/vm_mmap.c  (revision 239770)
  +++ sys/vm/vm_mmap.c  (working copy)
  @@ -1258,10 +1258,13 @@
struct mount *mp;
struct ucred *cred;
int error, flags, locktype, vfslocked;
  + int writeable_shared;
 
mp = vp-v_mount;
cred = td-td_ucred;
  - if ((*maxprotp  VM_PROT_WRITE)  (*flagsp  MAP_SHARED))
  + flags = *flagsp;
  + writeable_shared = ((*maxprotp  VM_PROT_WRITE)  (flags  
  MAP_SHARED));
  + if (writeable_shared || ((flags  MAP_TEXT) != 0))
locktype = LK_EXCLUSIVE;
else
locktype = LK_SHARED;
  @@ -1271,7 +1274,6 @@
return (error);
}
foff = *foffp;
  - flags = *flagsp;
obj = vp-v_object;
if (vp-v_type == VREG) {
/*
  @@ -1294,7 +1296,7 @@
return (error);
 

Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread Svatopluk Kraus
On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote:
 On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote:
 On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote:
  On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com
 wrote:
   On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote:
   Hi,
  
 I found out that while the running excecutables and a dynamic linker
   are protected against writing (ETXTBSY), the loaded shared libraries
   are not protected. The libraries are mapped by mmap() in dynamic
   linker (rtld) and there is no way how to set VV_TEXT flag on the
   libraries vnodes in mmap() code.
  
 In linux compability code \compat\linux\linux_misc.c, linux_uselib()
   sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag
   exists which informs mmap() that the mapped region will be used
   primarily for executing instructions (for better MMU utilization).
   With these on mind, I propose to implement MAP_TEXT option in mmap()
   and in case that underlying object is a vnode, set VV_TEXT flag on it.
  
 I already have implemented it and with rtld map_object() patch it
   works fine for me (of course). The rtld patch looks easy, however I'm
   not sure about mmap patch.
  
 After some investigation, it looks that VV_TEXT once set on a vnode
   remains set until last reference on the vnode is left. So, I don't
   bother with VV_TEXT unset in munmap() to be consistent. The
   executables and dynamic linker are activated in kernel, so VV_TEXT is
   set before activation and cleared if something failed. Shared library
   activation is done in dynamic linker (i.e., in userland). It's done in
   steps and mmaping the library is one from them. So, I think that
   VV_TEXT can be set in mmap() just after everything is finished
   successfully.
   This is right, the object reference counter is also used as
   VV_TEXT counter. It is somewhat unaccurate, but in practice does
   not cause issues.
  
  
 The patch itself is implemented in vm_mmap_vnode(). If I want to set
   VV_TEXT flag on a vnode, I need an exclusive lock. In current code,
   the exclusive lock flag is (mis)used as a flag for
   vnode_pager_update_writecount() call. (I hope that I didn't miss
   something.) So, the patch is bigger slightly.
  
 I defined the MAP_TEXT flag in extented flags sections. However, I'm
   feeling the relation to MAP_STACK flag, but not sure if and when
   reserved flags (in other flags section) can be re-used.
  
  Svata
  
  
 Index: libexec/rtld-elf/map_object.c
   ===
   --- libexec/rtld-elf/map_object.c (revision 239770)
   +++ libexec/rtld-elf/map_object.c (working copy)
   @@ -199,7 +199,8 @@
 data_prot = convert_prot(segs[i]-p_flags);
 data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED;
 if (mmap(data_addr, data_vlimit - data_vaddr, data_prot,
   -   data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t)
 -1) {
   +   data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) ==
   + (caddr_t) -1) {
   I am not sure that we shall mark all segments mappings with MAP_TEXT.
   I understand the logic of the change, since we do not want data segment
   to be changed under us. Still, having MAP_TEXT for non-text segments
 looks
   strange.
  
 
  I agree. However, only way how to recognize a text segment is an
  executable flag set. The new patch for map_object.c is following:
 
  Index: libexec/rtld-elf/map_object.c
  ===
  --- libexec/rtld-elf/map_object.c   (revision 239770)
  +++ libexec/rtld-elf/map_object.c   (working copy)
  @@ -442,5 +442,10 @@
*/
   if (!(elfflags  PF_W))
  flags |= MAP_NOCORE;
  +/*
  + * Executable mappings are marked MAP_TEXT.
  + */
  +if (elfflags  PF_X)
  +   flags |= MAP_TEXT;
   return flags;
   }
 
 
 _rtld_error(%s: mmap of data failed: %s, path,
 rtld_strerror(errno));
 goto error1;
   Index: sys/vm/vm_mmap.c
   ===
   --- sys/vm/vm_mmap.c  (revision 239770)
   +++ sys/vm/vm_mmap.c  (working copy)
   @@ -1258,10 +1258,13 @@
 struct mount *mp;
 struct ucred *cred;
 int error, flags, locktype, vfslocked;
   + int writeable_shared;
  
 mp = vp-v_mount;
 cred = td-td_ucred;
   - if ((*maxprotp  VM_PROT_WRITE)  (*flagsp  MAP_SHARED))
   + flags = *flagsp;
   + writeable_shared = ((*maxprotp  VM_PROT_WRITE)  (flags 
 MAP_SHARED));
   + if (writeable_shared || ((flags  MAP_TEXT) != 0))
 locktype = LK_EXCLUSIVE;
 else
 locktype = LK_SHARED;
   @@ -1271,7 +1274,6 @@
 return (error);
 }
 foff = *foffp;
   - flags = 

Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread Konstantin Belousov
On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote:
 On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote:
  On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote:
  On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote:
   On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com
  wrote:
On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote:
Hi,
   
  I found out that while the running excecutables and a dynamic linker
are protected against writing (ETXTBSY), the loaded shared libraries
are not protected. The libraries are mapped by mmap() in dynamic
linker (rtld) and there is no way how to set VV_TEXT flag on the
libraries vnodes in mmap() code.
   
  In linux compability code \compat\linux\linux_misc.c, linux_uselib()
sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag
exists which informs mmap() that the mapped region will be used
primarily for executing instructions (for better MMU utilization).
With these on mind, I propose to implement MAP_TEXT option in mmap()
and in case that underlying object is a vnode, set VV_TEXT flag on it.
   
  I already have implemented it and with rtld map_object() patch it
works fine for me (of course). The rtld patch looks easy, however I'm
not sure about mmap patch.
   
  After some investigation, it looks that VV_TEXT once set on a vnode
remains set until last reference on the vnode is left. So, I don't
bother with VV_TEXT unset in munmap() to be consistent. The
executables and dynamic linker are activated in kernel, so VV_TEXT is
set before activation and cleared if something failed. Shared library
activation is done in dynamic linker (i.e., in userland). It's done in
steps and mmaping the library is one from them. So, I think that
VV_TEXT can be set in mmap() just after everything is finished
successfully.
This is right, the object reference counter is also used as
VV_TEXT counter. It is somewhat unaccurate, but in practice does
not cause issues.
   
   
  The patch itself is implemented in vm_mmap_vnode(). If I want to set
VV_TEXT flag on a vnode, I need an exclusive lock. In current code,
the exclusive lock flag is (mis)used as a flag for
vnode_pager_update_writecount() call. (I hope that I didn't miss
something.) So, the patch is bigger slightly.
   
  I defined the MAP_TEXT flag in extented flags sections. However, I'm
feeling the relation to MAP_STACK flag, but not sure if and when
reserved flags (in other flags section) can be re-used.
   
   Svata
   
   
  Index: libexec/rtld-elf/map_object.c
===
--- libexec/rtld-elf/map_object.c (revision 239770)
+++ libexec/rtld-elf/map_object.c (working copy)
@@ -199,7 +199,8 @@
  data_prot = convert_prot(segs[i]-p_flags);
  data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED;
  if (mmap(data_addr, data_vlimit - data_vaddr, data_prot,
-   data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t)
  -1) {
+   data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) ==
+ (caddr_t) -1) {
I am not sure that we shall mark all segments mappings with MAP_TEXT.
I understand the logic of the change, since we do not want data segment
to be changed under us. Still, having MAP_TEXT for non-text segments
  looks
strange.
   
  
   I agree. However, only way how to recognize a text segment is an
   executable flag set. The new patch for map_object.c is following:
  
   Index: libexec/rtld-elf/map_object.c
   ===
   --- libexec/rtld-elf/map_object.c   (revision 239770)
   +++ libexec/rtld-elf/map_object.c   (working copy)
   @@ -442,5 +442,10 @@
 */
if (!(elfflags  PF_W))
   flags |= MAP_NOCORE;
   +/*
   + * Executable mappings are marked MAP_TEXT.
   + */
   +if (elfflags  PF_X)
   +   flags |= MAP_TEXT;
return flags;
}
  
  
  _rtld_error(%s: mmap of data failed: %s, path,
  rtld_strerror(errno));
  goto error1;
Index: sys/vm/vm_mmap.c
===
--- sys/vm/vm_mmap.c  (revision 239770)
+++ sys/vm/vm_mmap.c  (working copy)
@@ -1258,10 +1258,13 @@
  struct mount *mp;
  struct ucred *cred;
  int error, flags, locktype, vfslocked;
+ int writeable_shared;
   
  mp = vp-v_mount;
  cred = td-td_ucred;
- if ((*maxprotp  VM_PROT_WRITE)  (*flagsp  MAP_SHARED))
+ flags = *flagsp;
+ writeable_shared = ((*maxprotp  VM_PROT_WRITE)  (flags 
  MAP_SHARED));
+ if (writeable_shared || ((flags  MAP_TEXT) != 0))
  locktype = 

Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread John Baldwin
On Friday, September 07, 2012 12:02:08 pm Konstantin Belousov wrote:
 On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote:
  On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote:
   On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote:
   2. I do not see what would prevent malicious local user from mmaping
  arbitrary file readonly with MAP_TEXT, thus blocking any modifications
  to the file. Note that this is not a problem for executables, because
  kernel only sets VV_TEXT on executables if +x permission is set and
  file is valid binary which kernel is able to execute.
  
  E.g. you might block log writes with VV_TEXT, or other user editing
  session or whatever, having just read access to corresponding files.
  
   Am I wrong ?
  
   Hmm, I do think 2) is a bit of a show-stopper.  I do wonder why one needs
   MAP_TEXT at all or if you could key this off of mmap() with PROT_EXEC?
   Do we require +x permissions for PROT_EXEC?  No, it seems we only require
   a file opened with FREAD.  Hmm, perhaps rtld could open a separate fd for
   PROT_EXEC mappings that used O_EXEC and mmap()'ing an O_EXEC fd could 
   enable
   VV_TEXT?  That would require a file to have +x permisson for an mmap() to
   enable VV_TEXT.  It would also make MAP_TEXT unneeded.
  
  It sounds good for me. I will try to patch it this way. However, do
  you think that will be acceptable to set +x permission to shared
  libraries in general?

Shared libraries have +x by default already.  You could make rtld fall back
to using the O_RDONLY fd if it can't open an O_EXEC fd in which case you don't
get the VV_TEXT protection, but rtld would be backwards compatible.

 Setting +x on shared libraries can be done. But setting VV_TEXT for
 such mappings is definitely non-standard behaviour, that could cause
 locking surprises for unaware system administrator. The issuw would be
 very hard to diagnose.

I'm not sure it would be that obscure.  It's the same as getting an error that
a binary is busy.  In that case you would resort to 'ps' to find the offending
process.  For this case (a shared library being busy), you could look at the
output of 'procstat -af' to find which processes have executable mappings of
the library.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread Konstantin Belousov
On Fri, Sep 07, 2012 at 12:21:37PM -0400, John Baldwin wrote:
 On Friday, September 07, 2012 12:02:08 pm Konstantin Belousov wrote:
  On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote:
   On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote:
On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote:
2. I do not see what would prevent malicious local user from mmaping
   arbitrary file readonly with MAP_TEXT, thus blocking any 
modifications
   to the file. Note that this is not a problem for executables, 
because
   kernel only sets VV_TEXT on executables if +x permission is set and
   file is valid binary which kernel is able to execute.
   
   E.g. you might block log writes with VV_TEXT, or other user editing
   session or whatever, having just read access to corresponding files.
   
Am I wrong ?
   
Hmm, I do think 2) is a bit of a show-stopper.  I do wonder why one 
needs
MAP_TEXT at all or if you could key this off of mmap() with PROT_EXEC?
Do we require +x permissions for PROT_EXEC?  No, it seems we only 
require
a file opened with FREAD.  Hmm, perhaps rtld could open a separate fd 
for
PROT_EXEC mappings that used O_EXEC and mmap()'ing an O_EXEC fd could 
enable
VV_TEXT?  That would require a file to have +x permisson for an mmap() 
to
enable VV_TEXT.  It would also make MAP_TEXT unneeded.
   
   It sounds good for me. I will try to patch it this way. However, do
   you think that will be acceptable to set +x permission to shared
   libraries in general?
 
 Shared libraries have +x by default already.  You could make rtld fall back
 to using the O_RDONLY fd if it can't open an O_EXEC fd in which case you don't
 get the VV_TEXT protection, but rtld would be backwards compatible.
Where is it ? On fresh stable/9 machine, I have in /lib:
-r--r--r--  1 root  wheel 7216 Dec  3  2011 libipx.so.5
-r--r--r--  1 root  wheel19800 Jun 28 18:33 libjail.so.1
-r--r--r--  1 root  wheel13752 Jun 28 18:33 libkiconv.so.4
...

 
  Setting +x on shared libraries can be done. But setting VV_TEXT for
  such mappings is definitely non-standard behaviour, that could cause
  locking surprises for unaware system administrator. The issuw would be
  very hard to diagnose.
 
 I'm not sure it would be that obscure.  It's the same as getting an error that
 a binary is busy.  In that case you would resort to 'ps' to find the offending
 process.  For this case (a shared library being busy), you could look at the
 output of 'procstat -af' to find which processes have executable mappings of
 the library.

I much more worry about rtld reporting 'shared library is busy'. It is fine
to not be able to write to mapped library.

Rtld errors are usually quite worrying for users, and they just do not
understand them.


pgpjX3VimO77U.pgp
Description: PGP signature


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread John Baldwin
On Friday, September 07, 2012 12:42:18 pm Konstantin Belousov wrote:
 On Fri, Sep 07, 2012 at 12:21:37PM -0400, John Baldwin wrote:
  On Friday, September 07, 2012 12:02:08 pm Konstantin Belousov wrote:
   On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote:
On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote:
 On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote:
 2. I do not see what would prevent malicious local user from mmaping
arbitrary file readonly with MAP_TEXT, thus blocking any 
 modifications
to the file. Note that this is not a problem for executables, 
 because
kernel only sets VV_TEXT on executables if +x permission is set 
 and
file is valid binary which kernel is able to execute.

E.g. you might block log writes with VV_TEXT, or other user 
 editing
session or whatever, having just read access to corresponding 
 files.

 Am I wrong ?

 Hmm, I do think 2) is a bit of a show-stopper.  I do wonder why one 
 needs
 MAP_TEXT at all or if you could key this off of mmap() with PROT_EXEC?
 Do we require +x permissions for PROT_EXEC?  No, it seems we only 
 require
 a file opened with FREAD.  Hmm, perhaps rtld could open a separate fd 
 for
 PROT_EXEC mappings that used O_EXEC and mmap()'ing an O_EXEC fd could 
 enable
 VV_TEXT?  That would require a file to have +x permisson for an 
 mmap() to
 enable VV_TEXT.  It would also make MAP_TEXT unneeded.

It sounds good for me. I will try to patch it this way. However, do
you think that will be acceptable to set +x permission to shared
libraries in general?
  
  Shared libraries have +x by default already.  You could make rtld fall back
  to using the O_RDONLY fd if it can't open an O_EXEC fd in which case you 
  don't
  get the VV_TEXT protection, but rtld would be backwards compatible.
 Where is it ? On fresh stable/9 machine, I have in /lib:
 -r--r--r--  1 root  wheel 7216 Dec  3  2011 libipx.so.5
 -r--r--r--  1 root  wheel19800 Jun 28 18:33 libjail.so.1
 -r--r--r--  1 root  wheel13752 Jun 28 18:33 libkiconv.so.4
 ...

Hmm, I was looking in /usr/local/lib.  It seems at least some ports do
install libraries with +x:

-rwxr-xr-x  1 root  wheel210494 Apr  8  2011 
/usr/local/lib/libxml++-2.6.so.2
-rwxr-xr-x  1 root  wheel   1515416 Apr  8  2011 /usr/local/lib/libxml2.so.5
-rwxr-xr-x  1 root  wheel 44389 Apr  8  2011 /usr/local/lib/libxmlparse.so.1
-rwxr-xr-x  1 root  wheel100672 Apr  8  2011 /usr/local/lib/libxmltok.so.1
-rwxr-xr-x  1 root  wheel266775 Apr  8  2011 /usr/local/lib/libxslt.so.2
-rw-r--r--  1 root  wheel764602 Apr  8  2011 /usr/local/lib/libxvidcore.so.4
-rwxr-xr-x  1 root  wheel 53379 Apr  9  2011 /usr/local/lib/libzip.so.1

   Setting +x on shared libraries can be done. But setting VV_TEXT for
   such mappings is definitely non-standard behaviour, that could cause
   locking surprises for unaware system administrator. The issuw would be
   very hard to diagnose.
  
  I'm not sure it would be that obscure.  It's the same as getting an error 
  that
  a binary is busy.  In that case you would resort to 'ps' to find the 
  offending
  process.  For this case (a shared library being busy), you could look at the
  output of 'procstat -af' to find which processes have executable mappings of
  the library.
 
 I much more worry about rtld reporting 'shared library is busy'. It is fine
 to not be able to write to mapped library.
 
 Rtld errors are usually quite worrying for users, and they just do not
 understand them.

I think these would be rare?  There's no good reason for anything to write to
a shared library that I can think of.  install(1) does an atomic rename to swap
in the new libraries already.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread Konstantin Belousov
On Fri, Sep 07, 2012 at 02:05:28PM -0400, John Baldwin wrote:
 On Friday, September 07, 2012 12:42:18 pm Konstantin Belousov wrote:
  On Fri, Sep 07, 2012 at 12:21:37PM -0400, John Baldwin wrote:
   On Friday, September 07, 2012 12:02:08 pm Konstantin Belousov wrote:
On Fri, Sep 07, 2012 at 05:12:37PM +0200, Svatopluk Kraus wrote:
 On Tue, Sep 4, 2012 at 6:00 PM, John Baldwin j...@freebsd.org wrote:
  On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote:
  2. I do not see what would prevent malicious local user from 
  mmaping
 arbitrary file readonly with MAP_TEXT, thus blocking any 
  modifications
 to the file. Note that this is not a problem for executables, 
  because
 kernel only sets VV_TEXT on executables if +x permission is set 
  and
 file is valid binary which kernel is able to execute.
 
 E.g. you might block log writes with VV_TEXT, or other user 
  editing
 session or whatever, having just read access to corresponding 
  files.
 
  Am I wrong ?
 
  Hmm, I do think 2) is a bit of a show-stopper.  I do wonder why one 
  needs
  MAP_TEXT at all or if you could key this off of mmap() with 
  PROT_EXEC?
  Do we require +x permissions for PROT_EXEC?  No, it seems we only 
  require
  a file opened with FREAD.  Hmm, perhaps rtld could open a separate 
  fd for
  PROT_EXEC mappings that used O_EXEC and mmap()'ing an O_EXEC fd 
  could enable
  VV_TEXT?  That would require a file to have +x permisson for an 
  mmap() to
  enable VV_TEXT.  It would also make MAP_TEXT unneeded.
 
 It sounds good for me. I will try to patch it this way. However, do
 you think that will be acceptable to set +x permission to shared
 libraries in general?
   
   Shared libraries have +x by default already.  You could make rtld fall 
   back
   to using the O_RDONLY fd if it can't open an O_EXEC fd in which case you 
   don't
   get the VV_TEXT protection, but rtld would be backwards compatible.
  Where is it ? On fresh stable/9 machine, I have in /lib:
  -r--r--r--  1 root  wheel 7216 Dec  3  2011 libipx.so.5
  -r--r--r--  1 root  wheel19800 Jun 28 18:33 libjail.so.1
  -r--r--r--  1 root  wheel13752 Jun 28 18:33 libkiconv.so.4
  ...
 
 Hmm, I was looking in /usr/local/lib.  It seems at least some ports do
 install libraries with +x:
 
 -rwxr-xr-x  1 root  wheel210494 Apr  8  2011 
 /usr/local/lib/libxml++-2.6.so.2
 -rwxr-xr-x  1 root  wheel   1515416 Apr  8  2011 /usr/local/lib/libxml2.so.5
 -rwxr-xr-x  1 root  wheel 44389 Apr  8  2011 
 /usr/local/lib/libxmlparse.so.1
 -rwxr-xr-x  1 root  wheel100672 Apr  8  2011 /usr/local/lib/libxmltok.so.1
 -rwxr-xr-x  1 root  wheel266775 Apr  8  2011 /usr/local/lib/libxslt.so.2
 -rw-r--r--  1 root  wheel764602 Apr  8  2011 
 /usr/local/lib/libxvidcore.so.4
 -rwxr-xr-x  1 root  wheel 53379 Apr  9  2011 /usr/local/lib/libzip.so.1
 
Setting +x on shared libraries can be done. But setting VV_TEXT for
such mappings is definitely non-standard behaviour, that could cause
locking surprises for unaware system administrator. The issuw would be
very hard to diagnose.
   
   I'm not sure it would be that obscure.  It's the same as getting an error 
   that
   a binary is busy.  In that case you would resort to 'ps' to find the 
   offending
   process.  For this case (a shared library being busy), you could look at 
   the
   output of 'procstat -af' to find which processes have executable mappings 
   of
   the library.
  
  I much more worry about rtld reporting 'shared library is busy'. It is fine
  to not be able to write to mapped library.
  
  Rtld errors are usually quite worrying for users, and they just do not
  understand them.
 
 I think these would be rare?  There's no good reason for anything to write to
 a shared library that I can think of.  install(1) does an atomic rename to 
 swap
 in the new libraries already.

After a second thought, I do not like your proposal as well. +x is set for
shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means
that such scripts are subject for write denial.


pgphor0hI2qC3.pgp
Description: PGP signature


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread Ian Lepore
On Fri, 2012-09-07 at 21:41 +0300, Konstantin Belousov wrote:
 After a second thought, I do not like your proposal as well. +x is set for
 shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means
 that such scripts are subject for write denial.

You say that like it's a bad thing.  I hate it when I accidentally edit
a script that's running and then the script fails because I did.  I
would be much happier if it acted just like any other executable and
prevented modification while it's running.

-- Ian


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread Konstantin Belousov
On Fri, Sep 07, 2012 at 12:48:19PM -0600, Ian Lepore wrote:
 On Fri, 2012-09-07 at 21:41 +0300, Konstantin Belousov wrote:
  After a second thought, I do not like your proposal as well. +x is set for
  shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means
  that such scripts are subject for write denial.
 
 You say that like it's a bad thing.  I hate it when I accidentally edit
 a script that's running and then the script fails because I did.  I
 would be much happier if it acted just like any other executable and
 prevented modification while it's running.

For me, if other user can block my modifications of my script by the mere
fact that script has o+rx rights, is indeed bad. I do use real machines
sometime.


pgp9lb5pP7Whf.pgp
Description: PGP signature


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread Ian Lepore
On Fri, 2012-09-07 at 21:53 +0300, Konstantin Belousov wrote:
 On Fri, Sep 07, 2012 at 12:48:19PM -0600, Ian Lepore wrote:
  On Fri, 2012-09-07 at 21:41 +0300, Konstantin Belousov wrote:
   After a second thought, I do not like your proposal as well. +x is set for
   shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means
   that such scripts are subject for write denial.
  
  You say that like it's a bad thing.  I hate it when I accidentally edit
  a script that's running and then the script fails because I did.  I
  would be much happier if it acted just like any other executable and
  prevented modification while it's running.
 
 For me, if other user can block my modifications of my script by the mere
 fact that script has o+rx rights, is indeed bad. I do use real machines
 sometime.

But you don't feel the same way about a compiled program?

I see absolutely no difference between the two, conceptually.  To me,
changing an application while it's running is bad.  It makes no
difference what language the application is written in.

-- Ian


___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-07 Thread John Baldwin
On Friday, September 07, 2012 2:41:20 pm Konstantin Belousov wrote:
  I think these would be rare?  There's no good reason for anything to write 
  to
  a shared library that I can think of.  install(1) does an atomic rename to 
  swap
  in the new libraries already.
 
 After a second thought, I do not like your proposal as well. +x is set for
 shebang scripts, and allowing PROT_EXEC to set VV_TEXT for them means
 that such scripts are subject for write denial.

Yeah, that's fair.  Also, I hunted around to find the description of MAP_TEXT
in Solaris 11.  It seems from reading that that MAP_TEXT on Solaris isn't used
to prevent writes ala VV_TEXT.  Instead, it is used as a hint that is
apparently used to use superpages for text.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-04 Thread Svatopluk Kraus
On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com wrote:
 On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote:
 Hi,

   I found out that while the running excecutables and a dynamic linker
 are protected against writing (ETXTBSY), the loaded shared libraries
 are not protected. The libraries are mapped by mmap() in dynamic
 linker (rtld) and there is no way how to set VV_TEXT flag on the
 libraries vnodes in mmap() code.

   In linux compability code \compat\linux\linux_misc.c, linux_uselib()
 sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag
 exists which informs mmap() that the mapped region will be used
 primarily for executing instructions (for better MMU utilization).
 With these on mind, I propose to implement MAP_TEXT option in mmap()
 and in case that underlying object is a vnode, set VV_TEXT flag on it.

   I already have implemented it and with rtld map_object() patch it
 works fine for me (of course). The rtld patch looks easy, however I'm
 not sure about mmap patch.

   After some investigation, it looks that VV_TEXT once set on a vnode
 remains set until last reference on the vnode is left. So, I don't
 bother with VV_TEXT unset in munmap() to be consistent. The
 executables and dynamic linker are activated in kernel, so VV_TEXT is
 set before activation and cleared if something failed. Shared library
 activation is done in dynamic linker (i.e., in userland). It's done in
 steps and mmaping the library is one from them. So, I think that
 VV_TEXT can be set in mmap() just after everything is finished
 successfully.
 This is right, the object reference counter is also used as
 VV_TEXT counter. It is somewhat unaccurate, but in practice does
 not cause issues.


   The patch itself is implemented in vm_mmap_vnode(). If I want to set
 VV_TEXT flag on a vnode, I need an exclusive lock. In current code,
 the exclusive lock flag is (mis)used as a flag for
 vnode_pager_update_writecount() call. (I hope that I didn't miss
 something.) So, the patch is bigger slightly.

   I defined the MAP_TEXT flag in extented flags sections. However, I'm
 feeling the relation to MAP_STACK flag, but not sure if and when
 reserved flags (in other flags section) can be re-used.

Svata


   Index: libexec/rtld-elf/map_object.c
 ===
 --- libexec/rtld-elf/map_object.c (revision 239770)
 +++ libexec/rtld-elf/map_object.c (working copy)
 @@ -199,7 +199,8 @@
   data_prot = convert_prot(segs[i]-p_flags);
   data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED;
   if (mmap(data_addr, data_vlimit - data_vaddr, data_prot,
 -   data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) {
 +   data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) ==
 + (caddr_t) -1) {
 I am not sure that we shall mark all segments mappings with MAP_TEXT.
 I understand the logic of the change, since we do not want data segment
 to be changed under us. Still, having MAP_TEXT for non-text segments looks
 strange.


I agree. However, only way how to recognize a text segment is an
executable flag set. The new patch for map_object.c is following:

Index: libexec/rtld-elf/map_object.c
===
--- libexec/rtld-elf/map_object.c   (revision 239770)
+++ libexec/rtld-elf/map_object.c   (working copy)
@@ -442,5 +442,10 @@
  */
 if (!(elfflags  PF_W))
flags |= MAP_NOCORE;
+/*
+ * Executable mappings are marked MAP_TEXT.
+ */
+if (elfflags  PF_X)
+   flags |= MAP_TEXT;
 return flags;
 }


   _rtld_error(%s: mmap of data failed: %s, path,
   rtld_strerror(errno));
   goto error1;
 Index: sys/vm/vm_mmap.c
 ===
 --- sys/vm/vm_mmap.c  (revision 239770)
 +++ sys/vm/vm_mmap.c  (working copy)
 @@ -1258,10 +1258,13 @@
   struct mount *mp;
   struct ucred *cred;
   int error, flags, locktype, vfslocked;
 + int writeable_shared;

   mp = vp-v_mount;
   cred = td-td_ucred;
 - if ((*maxprotp  VM_PROT_WRITE)  (*flagsp  MAP_SHARED))
 + flags = *flagsp;
 + writeable_shared = ((*maxprotp  VM_PROT_WRITE)  (flags  
 MAP_SHARED));
 + if (writeable_shared || ((flags  MAP_TEXT) != 0))
   locktype = LK_EXCLUSIVE;
   else
   locktype = LK_SHARED;
 @@ -1271,7 +1274,6 @@
   return (error);
   }
   foff = *foffp;
 - flags = *flagsp;
   obj = vp-v_object;
   if (vp-v_type == VREG) {
   /*
 @@ -1294,7 +1296,7 @@
   return (error);
   }
   }
 - if (locktype == LK_EXCLUSIVE) {
 + if (writeable_shared) {
   *writecounted = TRUE;
   vnode_pager_update_writecount(obj, 0, objsize);
   }

Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-04 Thread Konstantin Belousov
On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote:
 On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com 
 wrote:
  On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote:
  Hi,
 
I found out that while the running excecutables and a dynamic linker
  are protected against writing (ETXTBSY), the loaded shared libraries
  are not protected. The libraries are mapped by mmap() in dynamic
  linker (rtld) and there is no way how to set VV_TEXT flag on the
  libraries vnodes in mmap() code.
 
In linux compability code \compat\linux\linux_misc.c, linux_uselib()
  sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag
  exists which informs mmap() that the mapped region will be used
  primarily for executing instructions (for better MMU utilization).
  With these on mind, I propose to implement MAP_TEXT option in mmap()
  and in case that underlying object is a vnode, set VV_TEXT flag on it.
 
I already have implemented it and with rtld map_object() patch it
  works fine for me (of course). The rtld patch looks easy, however I'm
  not sure about mmap patch.
 
After some investigation, it looks that VV_TEXT once set on a vnode
  remains set until last reference on the vnode is left. So, I don't
  bother with VV_TEXT unset in munmap() to be consistent. The
  executables and dynamic linker are activated in kernel, so VV_TEXT is
  set before activation and cleared if something failed. Shared library
  activation is done in dynamic linker (i.e., in userland). It's done in
  steps and mmaping the library is one from them. So, I think that
  VV_TEXT can be set in mmap() just after everything is finished
  successfully.
  This is right, the object reference counter is also used as
  VV_TEXT counter. It is somewhat unaccurate, but in practice does
  not cause issues.
 
 
The patch itself is implemented in vm_mmap_vnode(). If I want to set
  VV_TEXT flag on a vnode, I need an exclusive lock. In current code,
  the exclusive lock flag is (mis)used as a flag for
  vnode_pager_update_writecount() call. (I hope that I didn't miss
  something.) So, the patch is bigger slightly.
 
I defined the MAP_TEXT flag in extented flags sections. However, I'm
  feeling the relation to MAP_STACK flag, but not sure if and when
  reserved flags (in other flags section) can be re-used.
 
 Svata
 
 
Index: libexec/rtld-elf/map_object.c
  ===
  --- libexec/rtld-elf/map_object.c (revision 239770)
  +++ libexec/rtld-elf/map_object.c (working copy)
  @@ -199,7 +199,8 @@
data_prot = convert_prot(segs[i]-p_flags);
data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED;
if (mmap(data_addr, data_vlimit - data_vaddr, data_prot,
  -   data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) {
  +   data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) ==
  + (caddr_t) -1) {
  I am not sure that we shall mark all segments mappings with MAP_TEXT.
  I understand the logic of the change, since we do not want data segment
  to be changed under us. Still, having MAP_TEXT for non-text segments looks
  strange.
 
 
 I agree. However, only way how to recognize a text segment is an
 executable flag set. The new patch for map_object.c is following:
 
 Index: libexec/rtld-elf/map_object.c
 ===
 --- libexec/rtld-elf/map_object.c (revision 239770)
 +++ libexec/rtld-elf/map_object.c (working copy)
 @@ -442,5 +442,10 @@
   */
  if (!(elfflags  PF_W))
   flags |= MAP_NOCORE;
 +/*
 + * Executable mappings are marked MAP_TEXT.
 + */
 +if (elfflags  PF_X)
 + flags |= MAP_TEXT;
  return flags;
  }
 
 
_rtld_error(%s: mmap of data failed: %s, path,
rtld_strerror(errno));
goto error1;
  Index: sys/vm/vm_mmap.c
  ===
  --- sys/vm/vm_mmap.c  (revision 239770)
  +++ sys/vm/vm_mmap.c  (working copy)
  @@ -1258,10 +1258,13 @@
struct mount *mp;
struct ucred *cred;
int error, flags, locktype, vfslocked;
  + int writeable_shared;
 
mp = vp-v_mount;
cred = td-td_ucred;
  - if ((*maxprotp  VM_PROT_WRITE)  (*flagsp  MAP_SHARED))
  + flags = *flagsp;
  + writeable_shared = ((*maxprotp  VM_PROT_WRITE)  (flags  
  MAP_SHARED));
  + if (writeable_shared || ((flags  MAP_TEXT) != 0))
locktype = LK_EXCLUSIVE;
else
locktype = LK_SHARED;
  @@ -1271,7 +1274,6 @@
return (error);
}
foff = *foffp;
  - flags = *flagsp;
obj = vp-v_object;
if (vp-v_type == VREG) {
/*
  @@ -1294,7 +1296,7 @@
return (error);
}
}
  - if (locktype == 

Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-04 Thread John Baldwin
On Tuesday, September 04, 2012 9:00:39 am Konstantin Belousov wrote:
 On Tue, Sep 04, 2012 at 02:49:07PM +0200, Svatopluk Kraus wrote:
  On Mon, Sep 3, 2012 at 2:46 PM, Konstantin Belousov kostik...@gmail.com 
wrote:
   On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote:
   Hi,
  
 I found out that while the running excecutables and a dynamic linker
   are protected against writing (ETXTBSY), the loaded shared libraries
   are not protected. The libraries are mapped by mmap() in dynamic
   linker (rtld) and there is no way how to set VV_TEXT flag on the
   libraries vnodes in mmap() code.
  
 In linux compability code \compat\linux\linux_misc.c, linux_uselib()
   sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag
   exists which informs mmap() that the mapped region will be used
   primarily for executing instructions (for better MMU utilization).
   With these on mind, I propose to implement MAP_TEXT option in mmap()
   and in case that underlying object is a vnode, set VV_TEXT flag on it.
  
 I already have implemented it and with rtld map_object() patch it
   works fine for me (of course). The rtld patch looks easy, however I'm
   not sure about mmap patch.
  
 After some investigation, it looks that VV_TEXT once set on a vnode
   remains set until last reference on the vnode is left. So, I don't
   bother with VV_TEXT unset in munmap() to be consistent. The
   executables and dynamic linker are activated in kernel, so VV_TEXT is
   set before activation and cleared if something failed. Shared library
   activation is done in dynamic linker (i.e., in userland). It's done in
   steps and mmaping the library is one from them. So, I think that
   VV_TEXT can be set in mmap() just after everything is finished
   successfully.
   This is right, the object reference counter is also used as
   VV_TEXT counter. It is somewhat unaccurate, but in practice does
   not cause issues.
  
  
 The patch itself is implemented in vm_mmap_vnode(). If I want to set
   VV_TEXT flag on a vnode, I need an exclusive lock. In current code,
   the exclusive lock flag is (mis)used as a flag for
   vnode_pager_update_writecount() call. (I hope that I didn't miss
   something.) So, the patch is bigger slightly.
  
 I defined the MAP_TEXT flag in extented flags sections. However, I'm
   feeling the relation to MAP_STACK flag, but not sure if and when
   reserved flags (in other flags section) can be re-used.
  
  Svata
  
  
 Index: libexec/rtld-elf/map_object.c
   ===
   --- libexec/rtld-elf/map_object.c (revision 239770)
   +++ libexec/rtld-elf/map_object.c (working copy)
   @@ -199,7 +199,8 @@
 data_prot = convert_prot(segs[i]-p_flags);
 data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED;
 if (mmap(data_addr, data_vlimit - data_vaddr, data_prot,
   -   data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) 
-1) {
   +   data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) ==
   + (caddr_t) -1) {
   I am not sure that we shall mark all segments mappings with MAP_TEXT.
   I understand the logic of the change, since we do not want data segment
   to be changed under us. Still, having MAP_TEXT for non-text segments 
looks
   strange.
  
  
  I agree. However, only way how to recognize a text segment is an
  executable flag set. The new patch for map_object.c is following:
  
  Index: libexec/rtld-elf/map_object.c
  ===
  --- libexec/rtld-elf/map_object.c   (revision 239770)
  +++ libexec/rtld-elf/map_object.c   (working copy)
  @@ -442,5 +442,10 @@
*/
   if (!(elfflags  PF_W))
  flags |= MAP_NOCORE;
  +/*
  + * Executable mappings are marked MAP_TEXT.
  + */
  +if (elfflags  PF_X)
  +   flags |= MAP_TEXT;
   return flags;
   }
  
  
 _rtld_error(%s: mmap of data failed: %s, path,
 rtld_strerror(errno));
 goto error1;
   Index: sys/vm/vm_mmap.c
   ===
   --- sys/vm/vm_mmap.c  (revision 239770)
   +++ sys/vm/vm_mmap.c  (working copy)
   @@ -1258,10 +1258,13 @@
 struct mount *mp;
 struct ucred *cred;
 int error, flags, locktype, vfslocked;
   + int writeable_shared;
  
 mp = vp-v_mount;
 cred = td-td_ucred;
   - if ((*maxprotp  VM_PROT_WRITE)  (*flagsp  MAP_SHARED))
   + flags = *flagsp;
   + writeable_shared = ((*maxprotp  VM_PROT_WRITE)  (flags  
MAP_SHARED));
   + if (writeable_shared || ((flags  MAP_TEXT) != 0))
 locktype = LK_EXCLUSIVE;
 else
 locktype = LK_SHARED;
   @@ -1271,7 +1274,6 @@
 return (error);
 }
 foff = *foffp;
   - flags = *flagsp;
 obj = vp-v_object;
 if (vp-v_type == VREG) {

[patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-03 Thread Svatopluk Kraus
Hi,

  I found out that while the running excecutables and a dynamic linker
are protected against writing (ETXTBSY), the loaded shared libraries
are not protected. The libraries are mapped by mmap() in dynamic
linker (rtld) and there is no way how to set VV_TEXT flag on the
libraries vnodes in mmap() code.

  In linux compability code \compat\linux\linux_misc.c, linux_uselib()
sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag
exists which informs mmap() that the mapped region will be used
primarily for executing instructions (for better MMU utilization).
With these on mind, I propose to implement MAP_TEXT option in mmap()
and in case that underlying object is a vnode, set VV_TEXT flag on it.

  I already have implemented it and with rtld map_object() patch it
works fine for me (of course). The rtld patch looks easy, however I'm
not sure about mmap patch.

  After some investigation, it looks that VV_TEXT once set on a vnode
remains set until last reference on the vnode is left. So, I don't
bother with VV_TEXT unset in munmap() to be consistent. The
executables and dynamic linker are activated in kernel, so VV_TEXT is
set before activation and cleared if something failed. Shared library
activation is done in dynamic linker (i.e., in userland). It's done in
steps and mmaping the library is one from them. So, I think that
VV_TEXT can be set in mmap() just after everything is finished
successfully.

  The patch itself is implemented in vm_mmap_vnode(). If I want to set
VV_TEXT flag on a vnode, I need an exclusive lock. In current code,
the exclusive lock flag is (mis)used as a flag for
vnode_pager_update_writecount() call. (I hope that I didn't miss
something.) So, the patch is bigger slightly.

  I defined the MAP_TEXT flag in extented flags sections. However, I'm
feeling the relation to MAP_STACK flag, but not sure if and when
reserved flags (in other flags section) can be re-used.

   Svata


  Index: libexec/rtld-elf/map_object.c
===
--- libexec/rtld-elf/map_object.c   (revision 239770)
+++ libexec/rtld-elf/map_object.c   (working copy)
@@ -199,7 +199,8 @@
data_prot = convert_prot(segs[i]-p_flags);
data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED;
if (mmap(data_addr, data_vlimit - data_vaddr, data_prot,
- data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) {
+ data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) ==
+   (caddr_t) -1) {
_rtld_error(%s: mmap of data failed: %s, path,
rtld_strerror(errno));
goto error1;
Index: sys/vm/vm_mmap.c
===
--- sys/vm/vm_mmap.c(revision 239770)
+++ sys/vm/vm_mmap.c(working copy)
@@ -1258,10 +1258,13 @@
struct mount *mp;
struct ucred *cred;
int error, flags, locktype, vfslocked;
+   int writeable_shared;

mp = vp-v_mount;
cred = td-td_ucred;
-   if ((*maxprotp  VM_PROT_WRITE)  (*flagsp  MAP_SHARED))
+   flags = *flagsp;
+   writeable_shared = ((*maxprotp  VM_PROT_WRITE)  (flags  
MAP_SHARED));
+   if (writeable_shared || ((flags  MAP_TEXT) != 0))
locktype = LK_EXCLUSIVE;
else
locktype = LK_SHARED;
@@ -1271,7 +1274,6 @@
return (error);
}
foff = *foffp;
-   flags = *flagsp;
obj = vp-v_object;
if (vp-v_type == VREG) {
/*
@@ -1294,7 +1296,7 @@
return (error);
}
}
-   if (locktype == LK_EXCLUSIVE) {
+   if (writeable_shared) {
*writecounted = TRUE;
vnode_pager_update_writecount(obj, 0, objsize);
}
@@ -1337,6 +1339,14 @@
error = ENOMEM;
goto done;
}
+   /*
+* If MAP_TEXT is announced, set VV_TEXT so no one can write
+* to the executable.
+*/
+   if ((flags  MAP_TEXT) != 0) {
+   ASSERT_VOP_ELOCKED(vp, vv_text);
+   vp-v_vflag |= VV_TEXT;
+   }
*objp = obj;
*flagsp = flags;

Index: sys/sys/mman.h
===
--- sys/sys/mman.h  (revision 239770)
+++ sys/sys/mman.h  (working copy)
@@ -91,6 +91,7 @@
  */
 #defineMAP_NOCORE   0x0002 /* dont include these pages in a 
coredump */
 #defineMAP_PREFAULT_READ 0x0004 /* prefault mapping for reading */
+#defineMAP_TEXT 0x0008 /* map code segment */
 #endif /* __BSD_VISIBLE */

 #if __POSIX_VISIBLE = 199309
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to 

Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-03 Thread Andriy Gapon
on 03/09/2012 13:35 Svatopluk Kraus said the following:
 After some investigation, it looks that VV_TEXT once set on a vnode
 remains set until last reference on the vnode is left.

There was an idea to turn VV_TEXT flag into a v_text counter.
Maybe something like that would be useful indeed?

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: [patch] mmap() MAP_TEXT implementation (to use for shared libraries)

2012-09-03 Thread Konstantin Belousov
On Mon, Sep 03, 2012 at 12:35:08PM +0200, Svatopluk Kraus wrote:
 Hi,
 
   I found out that while the running excecutables and a dynamic linker
 are protected against writing (ETXTBSY), the loaded shared libraries
 are not protected. The libraries are mapped by mmap() in dynamic
 linker (rtld) and there is no way how to set VV_TEXT flag on the
 libraries vnodes in mmap() code.
 
   In linux compability code \compat\linux\linux_misc.c, linux_uselib()
 sets VV_TEXT flags on a library vnode. In Solaris, MAP_TEXT flag
 exists which informs mmap() that the mapped region will be used
 primarily for executing instructions (for better MMU utilization).
 With these on mind, I propose to implement MAP_TEXT option in mmap()
 and in case that underlying object is a vnode, set VV_TEXT flag on it.
 
   I already have implemented it and with rtld map_object() patch it
 works fine for me (of course). The rtld patch looks easy, however I'm
 not sure about mmap patch.
 
   After some investigation, it looks that VV_TEXT once set on a vnode
 remains set until last reference on the vnode is left. So, I don't
 bother with VV_TEXT unset in munmap() to be consistent. The
 executables and dynamic linker are activated in kernel, so VV_TEXT is
 set before activation and cleared if something failed. Shared library
 activation is done in dynamic linker (i.e., in userland). It's done in
 steps and mmaping the library is one from them. So, I think that
 VV_TEXT can be set in mmap() just after everything is finished
 successfully.
This is right, the object reference counter is also used as
VV_TEXT counter. It is somewhat unaccurate, but in practice does
not cause issues.

 
   The patch itself is implemented in vm_mmap_vnode(). If I want to set
 VV_TEXT flag on a vnode, I need an exclusive lock. In current code,
 the exclusive lock flag is (mis)used as a flag for
 vnode_pager_update_writecount() call. (I hope that I didn't miss
 something.) So, the patch is bigger slightly.
 
   I defined the MAP_TEXT flag in extented flags sections. However, I'm
 feeling the relation to MAP_STACK flag, but not sure if and when
 reserved flags (in other flags section) can be re-used.
 
Svata
 
 
   Index: libexec/rtld-elf/map_object.c
 ===
 --- libexec/rtld-elf/map_object.c (revision 239770)
 +++ libexec/rtld-elf/map_object.c (working copy)
 @@ -199,7 +199,8 @@
   data_prot = convert_prot(segs[i]-p_flags);
   data_flags = convert_flags(segs[i]-p_flags) | MAP_FIXED;
   if (mmap(data_addr, data_vlimit - data_vaddr, data_prot,
 -   data_flags | MAP_PREFAULT_READ, fd, data_offset) == (caddr_t) -1) {
 +   data_flags | MAP_PREFAULT_READ | MAP_TEXT, fd, data_offset) ==
 + (caddr_t) -1) {
I am not sure that we shall mark all segments mappings with MAP_TEXT.
I understand the logic of the change, since we do not want data segment
to be changed under us. Still, having MAP_TEXT for non-text segments looks
strange.

   _rtld_error(%s: mmap of data failed: %s, path,
   rtld_strerror(errno));
   goto error1;
 Index: sys/vm/vm_mmap.c
 ===
 --- sys/vm/vm_mmap.c  (revision 239770)
 +++ sys/vm/vm_mmap.c  (working copy)
 @@ -1258,10 +1258,13 @@
   struct mount *mp;
   struct ucred *cred;
   int error, flags, locktype, vfslocked;
 + int writeable_shared;
 
   mp = vp-v_mount;
   cred = td-td_ucred;
 - if ((*maxprotp  VM_PROT_WRITE)  (*flagsp  MAP_SHARED))
 + flags = *flagsp;
 + writeable_shared = ((*maxprotp  VM_PROT_WRITE)  (flags  
 MAP_SHARED));
 + if (writeable_shared || ((flags  MAP_TEXT) != 0))
   locktype = LK_EXCLUSIVE;
   else
   locktype = LK_SHARED;
 @@ -1271,7 +1274,6 @@
   return (error);
   }
   foff = *foffp;
 - flags = *flagsp;
   obj = vp-v_object;
   if (vp-v_type == VREG) {
   /*
 @@ -1294,7 +1296,7 @@
   return (error);
   }
   }
 - if (locktype == LK_EXCLUSIVE) {
 + if (writeable_shared) {
   *writecounted = TRUE;
   vnode_pager_update_writecount(obj, 0, objsize);
   }
 @@ -1337,6 +1339,14 @@
   error = ENOMEM;
   goto done;
   }
 + /*
 +  * If MAP_TEXT is announced, set VV_TEXT so no one can write
 +  * to the executable.
 +  */
 + if ((flags  MAP_TEXT) != 0) {
 + ASSERT_VOP_ELOCKED(vp, vv_text);
 + vp-v_vflag |= VV_TEXT;
 + }
I do not think we want to set VV_TEXT for device vnodes.

   *objp = obj;
   *flagsp = flags;
 
 Index: sys/sys/mman.h
 ===
 --- sys/sys/mman.h(revision 239770)
 +++ sys/sys/mman.h(working copy)
 @@ -91,6 +91,7 @@
   */
  #define