The reference from_fd + 2 needs updating since it assumes two file descriptors have already been closed?
On Tue, Sep 11, 2018 at 10:27 AM, Thomas Stüfe <thomas.stu...@gmail.com> wrote: > Hi all, > > May I please have reviews for this small fix: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8210549 > > patch (full): > http://cr.openjdk.java.net/~stuefe/webrevs/8210549-Use-FD_CLOEXEC-instead-of-close/webrev.00/webrev/ > patch (minimal alternative): > http://cr.openjdk.java.net/~stuefe/webrevs/8210549-Use-FD_CLOEXEC-instead-of-close/webrev.minimal/webrev/ > > See also prior discussion: > http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-September/055173.html > > Submit tests ran with full patch successfully through. > > --- > > Background: > > After fork()/vfork() and before exec(), the child process needs to > close all inherited file descriptors apart from the stdin/out/err pipe > ends. We do this by iterating thru all file descriptors in > /proc/<pid>/fd or whatever the equivalent is on that platform. This is > done using opendir(), readdir(). > > We then close all these file descriptors using close(). > > The problem with that technique is that we may accidentally close the > file descriptor opendir() is using internally to read the directory > content of /proc/<pid>/fd, thereby sawing off the branch we are > sitting on. The coding does some magic to attempt to prevent this: > > <quote> > 86 /* We're trying to close all file descriptors, but opendir() might > 87 * itself be implemented using a file descriptor, and we certainly > 88 * don't want to close that while it's in use. We assume that if > 89 * opendir() is implemented using a file descriptor, then it uses > 90 * the lowest numbered file descriptor, just like open(). So we > 91 * close a couple explicitly. */ > 92 > 93 close(from_fd); /* for possible use by opendir() */ > 94 close(from_fd + 1); /* another one for good luck */ > > ... > 108 while ((dirp = readdir64(dp)) != NULL) { > 109 int fd; > 110 if (isAsciiDigit(dirp->d_name[0]) && > 111 (fd = strtol(dirp->d_name, NULL, 10)) >= from_fd + > 2)Improve "close all filedescriptors" coding. > 112 close(fd); > 113 } > > </quote> > > This workaround can be removed if, instead of outright closing the > file descriptor in the loop, we were to set the file descriptor to > FD_CLOEXEC. Closing all descriptors would be defered to the actual > exec() call some milliseconds later. > > ---- > > The patch changes the close() call to fcntl(FD_CLOEXEC). This removes > the need for the workaround as described. > > In addition to that, the full version of the patch also adds error > handling to the readdir() loop. > > But note that this exposes us to a behavioral change should we run > into a readdir() error: > > - currently, we would leave the remaining file descriptors quietly > unclosed. This usually would have no effect, but in rare cases may > lead to difficult-to-analyze errors when child processes accidentally > disturb parent process IO. > > - With this patch, we will fail if readdir fails. However, we do not > just yet fail the Runtime.exec() call, but enter a fallback code > section here: > > 364 /* close everything */ > 365 if (closeDescriptors() == 0) { /* failed, close the old way */ > 366 int max_fd = (int)sysconf(_SC_OPEN_MAX); > 367 int fd; > 368 for (fd = FAIL_FILENO + 1; fd < max_fd; fd++) > 369 if (close(fd) == -1 && errno != EBADF) > 370 goto WhyCantJohnnyExec; > 371 } > > I am not convinced this is a good fallback. _SC_OPEN_MAX depends on > ulimit -Hn and may be high. On my Ubuntu 16.4 box, default value after > installation is 1048576. When we hit this fallback code, it takes > almost a full second to spawn a single child process. In fork > intensive scenarios this can get pretty bad. > > What do you think? > > Thanks & Best Regards, > > Thomas