Re: [PATCH] open: introduce O_NOSTD
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 According to Ulrich Drepper on 8/27/2009 8:22 AM: I hope that my example shows why doing it in the kernel is desirable - there is no safe way to keep the pre-O_CLOEXEC efficiency using just the library, but there IS a way to do it with kernel support: You're describing a very special case where the performance implications are really minimal and try to argue that is a good enough reason? I don't think so. If a program really has to do thousands of these safe open calls then it can invest time into opening /dev/null for any of the unallocated descriptors 3. You can even embed this logic in the safer_open function. But that's what I've been trying to explain. There are cases where opening a dummy /dev/null descriptor interferes with behavior, and where you NEED to leave the std descriptors closed. Again, let's look at coreutils, which already is using a version of open_safer in many of its utilities. Consider 'cp -i a b -'. If b does not exist, then there is no reason for cp to prompt, so it never reads from stdin, and the fact that stdin was closed must not interfere with execution. Which means cp cannot just blindly warn if it detects that stdin was closed when it started; it has to wait until conditions warrant reading from stdin. On the other hand, if b exists, coreutils' cp correctly warns the user about the inability to read from stdin: $ cp -i a b - cp: overwrite `b'? cp: error closing file: Bad file descriptor $ echo $? 1 If coreutils were to tie fd 0 to /dev/null, then it can no longer distinguish a read failure when reading the prompt from stdin, and would no longer be able to alert the user to the failed interactive copy. Next, consider 'cp -uv a b -'. Coreutils uses printf() to print status messages, but only if an update warranted a message. Again, that means coreutils wants to know whether stdout is a valid file, and tying fd 1 to /dev/null can get in the way: $ cp -uv a b - $ rm b $ cp -uv a b - cp: write error: Bad file descriptor Now, couple this with the fact that 'cp -r' can visit thousands of files during its course of operation. There is no sane way to tie dummy fds to the standard descriptors, and still be able to distinguish failure to use a standard descriptor but only on the execution paths that actually used that standard descriptor. Hence, there is a definite use case for keeping the std fds closed, rather than putting dummies in their place; but to preserve this setup without kernel support, every single open() (and dup(), pipe(), socket(), ...) must be wrapped to use fcntl/close to move new fds back out of this range. And cp is an example of an existing program that really DOES end up paying a penalty of thousands of fcntl/close calls if started with a standard fd closed, in order to preserve its semantics. Furthermore, while the coreutils' implementation of open_safer operates correctly in a single-threaded environment, it cannot close the race in a multi-threaded context if the application started life with fd 2 closed, since there is a window of time during open_safer where another thread can attempt to use stderr and see thread 1's newly opened file before it can be moved safely out of the way: thread 1 thread 2 open_safer... open - 2 fcntl(2,F_DUPFD,3) fputc(stderr) fflush(stderr) - clobbers thread 1's file close(2) open_safer - 3 - -- Don't work too hard, make some time for fun as well! Eric Blake e...@byu.net -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (Cygwin) Comment: Public key at home.comcast.net/~ericblake/eblake.gpg Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkqXzXgACgkQ84KuGfSFAYA+ZwCgvuQ3PpDDLhLozqCUm3qyhIVj aqQAnR+MgdLg5apNEdHdIn9nlr4yRISI =0tcQ -END PGP SIGNATURE-
Re: [PATCH] open: introduce O_NOSTD
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 According to Florian Weimer on 8/27/2009 8:35 AM: * Eric Blake: int open_safer (const char *name, int flags, int mode) { int fd = open (name, flags | O_CLOEXEC, mode); if (0 = fd fd = 2) { int dup = fcntl (fd, ((flags O_CLOEXEC) ? F_DUPFD_CLOEXEC : F_DUPFD), 3); int saved_errno = errno; close (fd); errno = saved_errno; fd = dup; } else if (!(flags O_CLOEXEC)) { if ((flags = fcntl (fd, F_GETFD)) 0 || fcntl (fd, F_SETFD, flags ~FD_CLOEXEC) == -1) { int saved_errno = errno; close (fd); fd = -1; errno = saved_errno; } } return fd; } This solves the fd leak, It's still buggy. In what manner? Are you stating that F_DUPFD_CLOEXEC is not atomic? You need something like this: int open_safer(const char *name, int flags, int mode) { int opened_fd[3] = {0, 0, 0}; int fd, i, errno_saved; while (1) { fd = open(name, flags | O_CLOEXEC, mode); if (fd 0 || fd 2) { break; } opened_fd[fd] = 1; } for (int i = 0; i = 2; ++i) { if (opened_fd[i]) { errno_saved = errno; close(i); errno = errno_saved; } } return fd; } It's untested, so it's probably still buggy. Your version fails to clear the cloexec bit of the final fd if the original caller didn't request O_CLOEXEC. If the caller requested O_CLOEXEC, then your version takes 3, 5, or 7 syscalls depending on how many std fds were closed, while my version takes 3 syscalls regardless of how many std fds were closed. Also, your suggestion has a definite race in that you are calling open() multiple times rather than cloning an existing fd after the first open(), such that another process could alter which file is visited between your first and last open(). - -- Don't work too hard, make some time for fun as well! Eric Blake e...@byu.net -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (Cygwin) Comment: Public key at home.comcast.net/~ericblake/eblake.gpg Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkqX0W0ACgkQ84KuGfSFAYDYdwCeOB8dt0Rx2QYJkfIsfP452AYj V7QAoL1FACwnRPhhQ2aTh2EB38i+d34o =ouPs -END PGP SIGNATURE-
Re: [PATCH] open: introduce O_NOSTD
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 According to Florian Weimer on 8/28/2009 6:52 AM: If the caller requested O_CLOEXEC, then your version takes 3, 5, or 7 syscalls depending on how many std fds were closed, while my version takes 3 syscalls regardless of how many std fds were closed. I really don't see a way around that. You can't pick a descriptor and hope that it's unused. fcntl(,F_DUPFD,3) is NOT like dup2. It picks the next available descriptor that is at least 3. - -- Don't work too hard, make some time for fun as well! Eric Blake e...@byu.net -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (Cygwin) Comment: Public key at home.comcast.net/~ericblake/eblake.gpg Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkqX1HEACgkQ84KuGfSFAYAzYQCgnwad5k39bZJqw3Wg1vMED/Fd zjYAmwenJlImBUE33geRdHC6aoVSx7JG =omNI -END PGP SIGNATURE-
Re: [PATCH] open: introduce O_NOSTD
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 According to Florian Weimer on 8/28/2009 6:52 AM: * Eric Blake: Your version fails to clear the cloexec bit of the final fd if the original caller didn't request O_CLOEXEC. Okay, but you can fix that in a race-free manner (but I thought that this was implied by open_safer). The current semantics of gnulib's open_safer is that the result is guaranteed to be 3 or larger. It would require an audit of all gnulib clients of the open_safer method to see whether it also makes sense to change the semantics of open_safer to also guarantee that fds start life with the cloexec bit set. But maybe that is a change worth making in gnulib, with applications intending to give an fd to a child process being required to explicitly clear the cloexec bit. Also, your suggestion has a definite race in that you are calling open() multiple times rather than cloning an existing fd after the first open(), such that another process could alter which file is visited between your first and last open(). Sure, but this is an unobservable differen.ce It is absolutely observable - if the user passed O_CREAT|O_EXCL as part of their flags, then the second open() will inappropriately fail. - -- Don't work too hard, make some time for fun as well! Eric Blake e...@byu.net -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (Cygwin) Comment: Public key at home.comcast.net/~ericblake/eblake.gpg Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkqX1egACgkQ84KuGfSFAYDKWACeMM4spqCsmgVVwME9+C/1tdpU g7wAnR9FetGPGr7acXLfLIVvzYZ7tpz3 =VjUY -END PGP SIGNATURE-
Re: [PATCH] open: introduce O_NOSTD
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 According to Davide Libenzi on 8/25/2009 3:53 PM: Another solution is for the application to sanitize all newly-created fds: GNU coreutils provides a wrapper open_safer, which does nothing extra in the common case that open() returned 3 or larger, but calls fcntl(n,F_DUPFD,3)/close(n) before returning if n was less than 3. However, this leads to triple the syscall cost for every open() call if the process starts life with a std fd closed; and if O_CLOEXEC is not used, still leaves a window of time where the fd can be leaked through another thread's use of fork/exec. I think we can say that the vast majority of the software is not going to notice the proposed open_safer(), performance-wise, since the first three fds are always filled. So IMO the performance impact argument is a weak one. If CLOEXEC semantics are needed in the open operation, F_DUPFD_CLOEXEC can be used to match it. The current gnulib implementation of open_safer (pre-O_CLOEXEC support) is (roughly): /* Wrap open, to guarantee that a successful return value is = 3. */ int open_safer (const char *name, int flags, int mode) { int fd = open (name, flags, mode); if (0 = fd fd = 2) { int dup = fcntl (fd, F_DUPFD, 3); int saved_errno = errno; close (fd); errno = saved_errno; fd = dup; } return fd; } which has the desired property of no overhead in the common case of all standard fds open. But it obviously mishandles the O_CLOEXEC flag. Here's a first cut at supporting it: int open_safer (const char *name, int flags, int mode) { int fd = open (name, flags, mode); if (0 = fd fd = 2) { int dup = fcntl (fd, ((flags O_CLOEXEC) ? F_DUPFD_CLOEXEC : F_DUPFD), 3); int saved_errno = errno; close (fd); errno = saved_errno; fd = dup; } return fd; } If the user requested open_safer(O_CLOEXEC), then we still have the desired property of no syscall overhead and no fd leak. But if the user intentionally does not pass O_CLOEXEC because they wanted to create an inheritable fd, but without stomping on standard fds, then this version still has a window for an fd leak. So let's try this version, which guarantees no fd leak, while still keeping the semantics of giving the user an inheritable fd outside the std fd range: int open_safer (const char *name, int flags, int mode) { int fd = open (name, flags | O_CLOEXEC, mode); if (0 = fd fd = 2) { int dup = fcntl (fd, ((flags O_CLOEXEC) ? F_DUPFD_CLOEXEC : F_DUPFD), 3); int saved_errno = errno; close (fd); errno = saved_errno; fd = dup; } else if (!(flags O_CLOEXEC)) { if ((flags = fcntl (fd, F_GETFD)) 0 || fcntl (fd, F_SETFD, flags ~FD_CLOEXEC) == -1) { int saved_errno = errno; close (fd); fd = -1; errno = saved_errno; } } return fd; } This solves the fd leak, and open_safer(O_CLOEXEC) is still cheap in the common case. But now the case of open_safer without O_CLOEXEC costs 3 syscalls, regardless of whether the standard fds were already open (if we assumed there were no possibility of new FD_* flags, we could cut the common-case penalty from three to two by skipping the fcntl(fd,F_GETFD) and just using fcntl(fd,F_SETFD,0), but that's asking for problems). While the patch is simple, IMO this is something that can be easily taken care in glibc layers w/out huge drawbacks. I hope that my example shows why doing it in the kernel is desirable - there is no safe way to keep the pre-O_CLOEXEC efficiency using just the library, but there IS a way to do it with kernel support: int open_safer (const char *name, int flags, int mode) { return open (name, flags | O_NOSTD, mode); } Also, remember this statement from Ulrich in the series that first introduced O_CLOEXEC/O_NONBLOCK support across all the fd creation APIs: In future there will be other uses. Like a O_* flag to enable the use of non-sequential descriptors. http://marc.info/?l=linux-kernelm=121010907127190w=2 - -- Don't work too hard, make some time for fun as well! Eric Blake e...@byu.net -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (Cygwin) Comment: Public key at home.comcast.net/~ericblake/eblake.gpg Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkqWj/gACgkQ84KuGfSFAYClIgCdEw6/7+PiFhR7aKGyuLUd5RdR lbAAmgKLqCC5zlhkOm/x4K+Om7nqeD0i =Ibq4 -END PGP SIGNATURE-
Re: [PATCH] open: introduce O_NOSTD
On 08/27/2009 06:54 AM, Eric Blake wrote: I hope that my example shows why doing it in the kernel is desirable - there is no safe way to keep the pre-O_CLOEXEC efficiency using just the library, but there IS a way to do it with kernel support: You're describing a very special case where the performance implications are really minimal and try to argue that is a good enough reason? I don't think so. If a program really has to do thousands of these safe open calls then it can invest time into opening /dev/null for any of the unallocated descriptors 3. You can even embed this logic in the safer_open function. -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
Re: [PATCH] open: introduce O_NOSTD
* Eric Blake: int open_safer (const char *name, int flags, int mode) { int fd = open (name, flags | O_CLOEXEC, mode); if (0 = fd fd = 2) { int dup = fcntl (fd, ((flags O_CLOEXEC) ? F_DUPFD_CLOEXEC : F_DUPFD), 3); int saved_errno = errno; close (fd); errno = saved_errno; fd = dup; } else if (!(flags O_CLOEXEC)) { if ((flags = fcntl (fd, F_GETFD)) 0 || fcntl (fd, F_SETFD, flags ~FD_CLOEXEC) == -1) { int saved_errno = errno; close (fd); fd = -1; errno = saved_errno; } } return fd; } This solves the fd leak, It's still buggy. You need something like this: int open_safer(const char *name, int flags, int mode) { int opened_fd[3] = {0, 0, 0}; int fd, i, errno_saved; while (1) { fd = open(name, flags | O_CLOEXEC, mode); if (fd 0 || fd 2) { break; } opened_fd[fd] = 1; } for (int i = 0; i = 2; ++i) { if (opened_fd[i]) { errno_saved = errno; close(i); errno = errno_saved; } } return fd; } It's untested, so it's probably still buggy. (O_CLOEXEC should have been a thread attribute, like the base path in the *_at functions. *sigh*) -- Florian Weimerfwei...@bfk.de BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99
Re: [PATCH] open: introduce O_NOSTD
On Thu, 27 Aug 2009, Eric Blake wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 According to Davide Libenzi on 8/25/2009 3:53 PM: Another solution is for the application to sanitize all newly-created fds: GNU coreutils provides a wrapper open_safer, which does nothing extra in the common case that open() returned 3 or larger, but calls fcntl(n,F_DUPFD,3)/close(n) before returning if n was less than 3. However, this leads to triple the syscall cost for every open() call if the process starts life with a std fd closed; and if O_CLOEXEC is not used, still leaves a window of time where the fd can be leaked through another thread's use of fork/exec. I think we can say that the vast majority of the software is not going to notice the proposed open_safer(), performance-wise, since the first three fds are always filled. So IMO the performance impact argument is a weak one. If CLOEXEC semantics are needed in the open operation, F_DUPFD_CLOEXEC can be used to match it. The current gnulib implementation of open_safer (pre-O_CLOEXEC support) is (roughly): /* Wrap open, to guarantee that a successful return value is = 3. */ int open_safer (const char *name, int flags, int mode) { int fd = open (name, flags, mode); if (0 = fd fd = 2) { int dup = fcntl (fd, F_DUPFD, 3); int saved_errno = errno; close (fd); errno = saved_errno; fd = dup; } return fd; } which has the desired property of no overhead in the common case of all standard fds open. But it obviously mishandles the O_CLOEXEC flag. Here's a first cut at supporting it: int open_safer (const char *name, int flags, int mode) { int fd = open (name, flags, mode); if (0 = fd fd = 2) { int dup = fcntl (fd, ((flags O_CLOEXEC) ? F_DUPFD_CLOEXEC : F_DUPFD), 3); int saved_errno = errno; close (fd); errno = saved_errno; fd = dup; } return fd; } If the user requested open_safer(O_CLOEXEC), then we still have the desired property of no syscall overhead and no fd leak. But if the user intentionally does not pass O_CLOEXEC because they wanted to create an inheritable fd, but without stomping on standard fds, then this version still has a window for an fd leak. So let's try this version, which guarantees no fd leak, while still keeping the semantics of giving the user an inheritable fd outside the std fd range: int open_safer (const char *name, int flags, int mode) { int fd = open (name, flags | O_CLOEXEC, mode); if (0 = fd fd = 2) { int dup = fcntl (fd, ((flags O_CLOEXEC) ? F_DUPFD_CLOEXEC : F_DUPFD), 3); int saved_errno = errno; close (fd); errno = saved_errno; fd = dup; } else if (!(flags O_CLOEXEC)) { if ((flags = fcntl (fd, F_GETFD)) 0 || fcntl (fd, F_SETFD, flags ~FD_CLOEXEC) == -1) { int saved_errno = errno; close (fd); fd = -1; errno = saved_errno; } } return fd; } This solves the fd leak, and open_safer(O_CLOEXEC) is still cheap in the common case. But now the case of open_safer without O_CLOEXEC costs 3 syscalls, regardless of whether the standard fds were already open (if we assumed there were no possibility of new FD_* flags, we could cut the common-case penalty from three to two by skipping the fcntl(fd,F_GETFD) and just using fcntl(fd,F_SETFD,0), but that's asking for problems). While the patch is simple, IMO this is something that can be easily taken care in glibc layers w/out huge drawbacks. I hope that my example shows why doing it in the kernel is desirable - there is no safe way to keep the pre-O_CLOEXEC efficiency using just the library, but there IS a way to do it with kernel support: int open_safer (const char *name, int flags, int mode) { return open (name, flags | O_NOSTD, mode); } Can't the handling be done on close(), like (modulo some errno save/restore): int safer_close(int fd) { int error = close(fd); if (fd 3 fd = 0) { if ((fd = open(/dev/null, O_RDWR)) 2) close(fd); } return error; } - Davide
Re: [PATCH] open: introduce O_NOSTD
On Tue, 25 Aug 2009, Eric Blake wrote: Another solution is for the application to sanitize all newly-created fds: GNU coreutils provides a wrapper open_safer, which does nothing extra in the common case that open() returned 3 or larger, but calls fcntl(n,F_DUPFD,3)/close(n) before returning if n was less than 3. However, this leads to triple the syscall cost for every open() call if the process starts life with a std fd closed; and if O_CLOEXEC is not used, still leaves a window of time where the fd can be leaked through another thread's use of fork/exec. I think we can say that the vast majority of the software is not going to notice the proposed open_safer(), performance-wise, since the first three fds are always filled. So IMO the performance impact argument is a weak one. If CLOEXEC semantics are needed in the open operation, F_DUPFD_CLOEXEC can be used to match it. While the patch is simple, IMO this is something that can be easily taken care in glibc layers w/out huge drawbacks. - Davide