Re: [PATCH] open: introduce O_NOSTD

2009-08-28 Thread Eric Blake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

According to Ulrich Drepper on 8/27/2009 8:22 AM:
 I hope that my example shows why doing it in the kernel is desirable -
 there is no safe way to keep the pre-O_CLOEXEC efficiency using just the
 library, but there IS a way to do it with kernel support:
 
 You're describing a very special case where the performance implications
 are really minimal and try to argue that is a good enough reason?  I
 don't think so.
 
 If a program really has to do thousands of these safe open calls then it
 can invest time into opening /dev/null for any of the unallocated
 descriptors  3.  You can even embed this logic in the safer_open function.

But that's what I've been trying to explain.  There are cases where
opening a dummy /dev/null descriptor interferes with behavior, and where
you NEED to leave the std descriptors closed.

Again, let's look at coreutils, which already is using a version of
open_safer in many of its utilities.  Consider 'cp -i a b -'.  If b does
not exist, then there is no reason for cp to prompt, so it never reads
from stdin, and the fact that stdin was closed must not interfere with
execution.  Which means cp cannot just blindly warn if it detects that
stdin was closed when it started; it has to wait until conditions warrant
reading from stdin.  On the other hand, if b exists, coreutils' cp
correctly warns the user about the inability to read from stdin:

$ cp -i a b -
cp: overwrite `b'? cp: error closing file: Bad file descriptor
$ echo $?
1

If coreutils were to tie fd 0 to /dev/null, then it can no longer
distinguish a read failure when reading the prompt from stdin, and would
no longer be able to alert the user to the failed interactive copy.

Next, consider 'cp -uv a b -'.  Coreutils uses printf() to print status
messages, but only if an update warranted a message.  Again, that means
coreutils wants to know whether stdout is a valid file, and tying fd 1 to
/dev/null can get in the way:

$ cp -uv a b -
$ rm b
$ cp -uv a b -
cp: write error: Bad file descriptor

Now, couple this with the fact that 'cp -r' can visit thousands of files
during its course of operation.  There is no sane way to tie dummy fds to
the standard descriptors, and still be able to distinguish failure to use
a standard descriptor but only on the execution paths that actually used
that standard descriptor.  Hence, there is a definite use case for keeping
the std fds closed, rather than putting dummies in their place; but to
preserve this setup without kernel support, every single open() (and
dup(), pipe(), socket(), ...) must be wrapped to use fcntl/close to move
new fds back out of this range.  And cp is an example of an existing
program that really DOES end up paying a penalty of thousands of
fcntl/close calls if started with a standard fd closed, in order to
preserve its semantics.

Furthermore, while the coreutils' implementation of open_safer operates
correctly in a single-threaded environment, it cannot close the race in a
multi-threaded context if the application started life with fd 2 closed,
since there is a window of time during open_safer where another thread can
attempt to use stderr and see thread 1's newly opened file before it can
be moved safely out of the way:

thread 1  thread 2
open_safer...
 open - 2
 fcntl(2,F_DUPFD,3)
  fputc(stderr)
  fflush(stderr) - clobbers thread 1's file
 close(2)
open_safer - 3

- --
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkqXzXgACgkQ84KuGfSFAYA+ZwCgvuQ3PpDDLhLozqCUm3qyhIVj
aqQAnR+MgdLg5apNEdHdIn9nlr4yRISI
=0tcQ
-END PGP SIGNATURE-




Re: [PATCH] open: introduce O_NOSTD

2009-08-28 Thread Eric Blake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

According to Florian Weimer on 8/27/2009 8:35 AM:
 * Eric Blake:
 
 int open_safer (const char *name, int flags, int mode)
 {
   int fd = open (name, flags | O_CLOEXEC, mode);
   if (0 = fd  fd = 2)
 {
   int dup = fcntl (fd, ((flags  O_CLOEXEC)
 ? F_DUPFD_CLOEXEC : F_DUPFD), 3);
   int saved_errno = errno;
   close (fd);
   errno = saved_errno;
   fd = dup;
 }
   else if (!(flags  O_CLOEXEC))
 {
   if ((flags = fcntl (fd, F_GETFD))  0
   || fcntl (fd, F_SETFD, flags  ~FD_CLOEXEC) == -1)
 {
   int saved_errno = errno;
   close (fd);
   fd = -1;
   errno = saved_errno;
 }
 }
   return fd;
 }
 
 This solves the fd leak,
 
 It's still buggy.

In what manner?  Are you stating that F_DUPFD_CLOEXEC is not atomic?

  You need something like this:
 
 int open_safer(const char *name, int flags, int mode)
 {
   int opened_fd[3] = {0, 0, 0};
   int fd, i, errno_saved;
   while (1) {
 fd = open(name, flags | O_CLOEXEC, mode);
 if (fd  0 || fd  2) {
   break;
 }
 opened_fd[fd] = 1;
   }
   for (int i = 0; i = 2; ++i) {
 if (opened_fd[i]) {
   errno_saved = errno;
   close(i);
   errno = errno_saved;
 }
   }
   return fd;
 }
 
 It's untested, so it's probably still buggy.

Your version fails to clear the cloexec bit of the final fd if the
original caller didn't request O_CLOEXEC.  If the caller requested
O_CLOEXEC, then your version takes 3, 5, or 7 syscalls depending on how
many std fds were closed, while my version takes 3 syscalls regardless of
how many std fds were closed.  Also, your suggestion has a definite race
in that you are calling open() multiple times rather than cloning an
existing fd after the first open(), such that another process could alter
which file is visited between your first and last open().

- --
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkqX0W0ACgkQ84KuGfSFAYDYdwCeOB8dt0Rx2QYJkfIsfP452AYj
V7QAoL1FACwnRPhhQ2aTh2EB38i+d34o
=ouPs
-END PGP SIGNATURE-




Re: [PATCH] open: introduce O_NOSTD

2009-08-28 Thread Eric Blake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

According to Florian Weimer on 8/28/2009 6:52 AM:
 If the caller requested O_CLOEXEC, then your version takes 3, 5, or
 7 syscalls depending on how many std fds were closed, while my
 version takes 3 syscalls regardless of how many std fds were closed.
 
 I really don't see a way around that.  You can't pick a descriptor and
 hope that it's unused.

fcntl(,F_DUPFD,3) is NOT like dup2.  It picks the next available
descriptor that is at least 3.

- --
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkqX1HEACgkQ84KuGfSFAYAzYQCgnwad5k39bZJqw3Wg1vMED/Fd
zjYAmwenJlImBUE33geRdHC6aoVSx7JG
=omNI
-END PGP SIGNATURE-




Re: [PATCH] open: introduce O_NOSTD

2009-08-28 Thread Eric Blake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

According to Florian Weimer on 8/28/2009 6:52 AM:
 * Eric Blake:
 
 Your version fails to clear the cloexec bit of the final fd if the
 original caller didn't request O_CLOEXEC.
 
 Okay, but you can fix that in a race-free manner (but I thought that
 this was implied by open_safer).

The current semantics of gnulib's open_safer is that the result is
guaranteed to be 3 or larger.  It would require an audit of all gnulib
clients of the open_safer method to see whether it also makes sense to
change the semantics of open_safer to also guarantee that fds start life
with the cloexec bit set.  But maybe that is a change worth making in
gnulib, with applications intending to give an fd to a child process being
required to explicitly clear the cloexec bit.

 Also, your suggestion has a definite race in that you are calling
 open() multiple times rather than cloning an existing fd after the
 first open(), such that another process could alter which file is
 visited between your first and last open().
 
 Sure, but this is an unobservable differen.ce

It is absolutely observable - if the user passed O_CREAT|O_EXCL as part of
their flags, then the second open() will inappropriately fail.

- --
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkqX1egACgkQ84KuGfSFAYDKWACeMM4spqCsmgVVwME9+C/1tdpU
g7wAnR9FetGPGr7acXLfLIVvzYZ7tpz3
=VjUY
-END PGP SIGNATURE-




Re: [PATCH] open: introduce O_NOSTD

2009-08-27 Thread Eric Blake
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

According to Davide Libenzi on 8/25/2009 3:53 PM:
 Another solution is for the application to sanitize all newly-created
 fds: GNU coreutils provides a wrapper open_safer, which does nothing
 extra in the common case that open() returned 3 or larger, but calls
 fcntl(n,F_DUPFD,3)/close(n) before returning if n was less than 3.
 However, this leads to triple the syscall cost for every open() call
 if the process starts life with a std fd closed; and if O_CLOEXEC is
 not used, still leaves a window of time where the fd can be leaked
 through another thread's use of fork/exec.
 
 I think we can say that the vast majority of the software is not going to 
 notice the proposed open_safer(), performance-wise, since the first three 
 fds are always filled. So IMO the performance impact argument is a weak one.
 If CLOEXEC semantics are needed in the open operation, F_DUPFD_CLOEXEC can 
 be used to match it.

The current gnulib implementation of open_safer (pre-O_CLOEXEC support) is
(roughly):

/* Wrap open, to guarantee that a successful return value is = 3.  */
int open_safer (const char *name, int flags, int mode)
{
  int fd = open (name, flags, mode);
  if (0 = fd  fd = 2)
{
  int dup = fcntl (fd, F_DUPFD, 3);
  int saved_errno = errno;
  close (fd);
  errno = saved_errno;
  fd = dup;
}
  return fd;
}

which has the desired property of no overhead in the common case of all
standard fds open.  But it obviously mishandles the O_CLOEXEC flag.
Here's a first cut at supporting it:

int open_safer (const char *name, int flags, int mode)
{
  int fd = open (name, flags, mode);
  if (0 = fd  fd = 2)
{
  int dup = fcntl (fd, ((flags  O_CLOEXEC)
? F_DUPFD_CLOEXEC : F_DUPFD), 3);
  int saved_errno = errno;
  close (fd);
  errno = saved_errno;
  fd = dup;
}
  return fd;
}

If the user requested open_safer(O_CLOEXEC), then we still have the
desired property of no syscall overhead and no fd leak.  But if the user
intentionally does not pass O_CLOEXEC because they wanted to create an
inheritable fd, but without stomping on standard fds, then this version
still has a window for an fd leak.  So let's try this version, which
guarantees no fd leak, while still keeping the semantics of giving the
user an inheritable fd outside the std fd range:

int open_safer (const char *name, int flags, int mode)
{
  int fd = open (name, flags | O_CLOEXEC, mode);
  if (0 = fd  fd = 2)
{
  int dup = fcntl (fd, ((flags  O_CLOEXEC)
? F_DUPFD_CLOEXEC : F_DUPFD), 3);
  int saved_errno = errno;
  close (fd);
  errno = saved_errno;
  fd = dup;
}
  else if (!(flags  O_CLOEXEC))
{
  if ((flags = fcntl (fd, F_GETFD))  0
  || fcntl (fd, F_SETFD, flags  ~FD_CLOEXEC) == -1)
{
  int saved_errno = errno;
  close (fd);
  fd = -1;
  errno = saved_errno;
}
}
  return fd;
}

This solves the fd leak, and open_safer(O_CLOEXEC) is still cheap in the
common case.  But now the case of open_safer without O_CLOEXEC costs 3
syscalls, regardless of whether the standard fds were already open (if we
assumed there were no possibility of new FD_* flags, we could cut the
common-case penalty from three to two by skipping the fcntl(fd,F_GETFD)
and just using fcntl(fd,F_SETFD,0), but that's asking for problems).

 While the patch is simple, IMO this is something that can be easily taken 
 care in glibc layers w/out huge drawbacks.

I hope that my example shows why doing it in the kernel is desirable -
there is no safe way to keep the pre-O_CLOEXEC efficiency using just the
library, but there IS a way to do it with kernel support:

int open_safer (const char *name, int flags, int mode)
{
  return open (name, flags | O_NOSTD, mode);
}

Also, remember this statement from Ulrich in the series that first
introduced O_CLOEXEC/O_NONBLOCK support across all the fd creation APIs:

In future there will be other uses.  Like a O_* flag to enable the use
of non-sequential descriptors.
http://marc.info/?l=linux-kernelm=121010907127190w=2

- --
Don't work too hard, make some time for fun as well!

Eric Blake e...@byu.net
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkqWj/gACgkQ84KuGfSFAYClIgCdEw6/7+PiFhR7aKGyuLUd5RdR
lbAAmgKLqCC5zlhkOm/x4K+Om7nqeD0i
=Ibq4
-END PGP SIGNATURE-




Re: [PATCH] open: introduce O_NOSTD

2009-08-27 Thread Ulrich Drepper

On 08/27/2009 06:54 AM, Eric Blake wrote:

I hope that my example shows why doing it in the kernel is desirable -
there is no safe way to keep the pre-O_CLOEXEC efficiency using just the
library, but there IS a way to do it with kernel support:


You're describing a very special case where the performance implications 
are really minimal and try to argue that is a good enough reason?  I 
don't think so.


If a program really has to do thousands of these safe open calls then it 
can invest time into opening /dev/null for any of the unallocated 
descriptors  3.  You can even embed this logic in the safer_open function.


--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖




Re: [PATCH] open: introduce O_NOSTD

2009-08-27 Thread Florian Weimer
* Eric Blake:

 int open_safer (const char *name, int flags, int mode)
 {
   int fd = open (name, flags | O_CLOEXEC, mode);
   if (0 = fd  fd = 2)
 {
   int dup = fcntl (fd, ((flags  O_CLOEXEC)
 ? F_DUPFD_CLOEXEC : F_DUPFD), 3);
   int saved_errno = errno;
   close (fd);
   errno = saved_errno;
   fd = dup;
 }
   else if (!(flags  O_CLOEXEC))
 {
   if ((flags = fcntl (fd, F_GETFD))  0
   || fcntl (fd, F_SETFD, flags  ~FD_CLOEXEC) == -1)
 {
   int saved_errno = errno;
   close (fd);
   fd = -1;
   errno = saved_errno;
 }
 }
   return fd;
 }

 This solves the fd leak,

It's still buggy.  You need something like this:

int open_safer(const char *name, int flags, int mode)
{
  int opened_fd[3] = {0, 0, 0};
  int fd, i, errno_saved;
  while (1) {
fd = open(name, flags | O_CLOEXEC, mode);
if (fd  0 || fd  2) {
  break;
}
opened_fd[fd] = 1;
  }
  for (int i = 0; i = 2; ++i) {
if (opened_fd[i]) {
  errno_saved = errno;
  close(i);
  errno = errno_saved;
}
  }
  return fd;
}

It's untested, so it's probably still buggy.

(O_CLOEXEC should have been a thread attribute, like the base path in
the *_at functions. *sigh*)

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99




Re: [PATCH] open: introduce O_NOSTD

2009-08-27 Thread Davide Libenzi
On Thu, 27 Aug 2009, Eric Blake wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 According to Davide Libenzi on 8/25/2009 3:53 PM:
  Another solution is for the application to sanitize all newly-created
  fds: GNU coreutils provides a wrapper open_safer, which does nothing
  extra in the common case that open() returned 3 or larger, but calls
  fcntl(n,F_DUPFD,3)/close(n) before returning if n was less than 3.
  However, this leads to triple the syscall cost for every open() call
  if the process starts life with a std fd closed; and if O_CLOEXEC is
  not used, still leaves a window of time where the fd can be leaked
  through another thread's use of fork/exec.
  
  I think we can say that the vast majority of the software is not going to 
  notice the proposed open_safer(), performance-wise, since the first three 
  fds are always filled. So IMO the performance impact argument is a weak one.
  If CLOEXEC semantics are needed in the open operation, F_DUPFD_CLOEXEC can 
  be used to match it.
 
 The current gnulib implementation of open_safer (pre-O_CLOEXEC support) is
 (roughly):
 
 /* Wrap open, to guarantee that a successful return value is = 3.  */
 int open_safer (const char *name, int flags, int mode)
 {
   int fd = open (name, flags, mode);
   if (0 = fd  fd = 2)
 {
   int dup = fcntl (fd, F_DUPFD, 3);
   int saved_errno = errno;
   close (fd);
   errno = saved_errno;
   fd = dup;
 }
   return fd;
 }
 
 which has the desired property of no overhead in the common case of all
 standard fds open.  But it obviously mishandles the O_CLOEXEC flag.
 Here's a first cut at supporting it:
 
 int open_safer (const char *name, int flags, int mode)
 {
   int fd = open (name, flags, mode);
   if (0 = fd  fd = 2)
 {
   int dup = fcntl (fd, ((flags  O_CLOEXEC)
 ? F_DUPFD_CLOEXEC : F_DUPFD), 3);
   int saved_errno = errno;
   close (fd);
   errno = saved_errno;
   fd = dup;
 }
   return fd;
 }
 
 If the user requested open_safer(O_CLOEXEC), then we still have the
 desired property of no syscall overhead and no fd leak.  But if the user
 intentionally does not pass O_CLOEXEC because they wanted to create an
 inheritable fd, but without stomping on standard fds, then this version
 still has a window for an fd leak.  So let's try this version, which
 guarantees no fd leak, while still keeping the semantics of giving the
 user an inheritable fd outside the std fd range:
 
 int open_safer (const char *name, int flags, int mode)
 {
   int fd = open (name, flags | O_CLOEXEC, mode);
   if (0 = fd  fd = 2)
 {
   int dup = fcntl (fd, ((flags  O_CLOEXEC)
 ? F_DUPFD_CLOEXEC : F_DUPFD), 3);
   int saved_errno = errno;
   close (fd);
   errno = saved_errno;
   fd = dup;
 }
   else if (!(flags  O_CLOEXEC))
 {
   if ((flags = fcntl (fd, F_GETFD))  0
   || fcntl (fd, F_SETFD, flags  ~FD_CLOEXEC) == -1)
 {
   int saved_errno = errno;
   close (fd);
   fd = -1;
   errno = saved_errno;
 }
 }
   return fd;
 }
 
 This solves the fd leak, and open_safer(O_CLOEXEC) is still cheap in the
 common case.  But now the case of open_safer without O_CLOEXEC costs 3
 syscalls, regardless of whether the standard fds were already open (if we
 assumed there were no possibility of new FD_* flags, we could cut the
 common-case penalty from three to two by skipping the fcntl(fd,F_GETFD)
 and just using fcntl(fd,F_SETFD,0), but that's asking for problems).
 
  While the patch is simple, IMO this is something that can be easily taken 
  care in glibc layers w/out huge drawbacks.
 
 I hope that my example shows why doing it in the kernel is desirable -
 there is no safe way to keep the pre-O_CLOEXEC efficiency using just the
 library, but there IS a way to do it with kernel support:
 
 int open_safer (const char *name, int flags, int mode)
 {
   return open (name, flags | O_NOSTD, mode);
 }

Can't the handling be done on close(), like (modulo some errno save/restore):

int safer_close(int fd) {
int error = close(fd);

if (fd  3  fd = 0) {
if ((fd = open(/dev/null, O_RDWR))  2)
close(fd);
}

return error;
}



- Davide






Re: [PATCH] open: introduce O_NOSTD

2009-08-25 Thread Davide Libenzi
On Tue, 25 Aug 2009, Eric Blake wrote:

 Another solution is for the application to sanitize all newly-created
 fds: GNU coreutils provides a wrapper open_safer, which does nothing
 extra in the common case that open() returned 3 or larger, but calls
 fcntl(n,F_DUPFD,3)/close(n) before returning if n was less than 3.
 However, this leads to triple the syscall cost for every open() call
 if the process starts life with a std fd closed; and if O_CLOEXEC is
 not used, still leaves a window of time where the fd can be leaked
 through another thread's use of fork/exec.

I think we can say that the vast majority of the software is not going to 
notice the proposed open_safer(), performance-wise, since the first three 
fds are always filled. So IMO the performance impact argument is a weak one.
If CLOEXEC semantics are needed in the open operation, F_DUPFD_CLOEXEC can 
be used to match it.
While the patch is simple, IMO this is something that can be easily taken 
care in glibc layers w/out huge drawbacks.


- Davide