On Wed, Sep 5, 2018 at 11:43 AM, Thomas Stüfe <thomas.stu...@gmail.com> wrote:
> On Wed, Sep 5, 2018 at 6:43 PM, Martin Buchholz <marti...@google.com> > wrote: > > > So before the readdir loop we should set errno to 0. Then when readdir > > returns NULL, we should check whether errno is non-zero. > > > > Note that if we fix this, we may run into errors which were hidden > before, namely if we really accidentally did close the file descriptor > used by opendir(). Currently, even if some file descriptors remain > unclosed, that may not necessarily lead to an error. But with your > suggested error handling we would straight away fail. > > Therefore I would combine both fixes - add readdir error handling and > also change to fcntl(FD_CLOEXEC). > > I support this plan. We expect most file descriptors to already have the FD_CLOEXEC flag set, so a useful optimization is to set the flag only when not already set. Probably the code should be conditional on defined(FD_CLOEXEC). > > --- > > > > We do have a way of reporting errno to the parent. See > WhyCantJohnnyExec. > > Ah I see. > > Btw, do you know what happens when we modify errno in a vforked child > process? Would that not modify errno in the parent process too? (errno > is probably implemented as thread local variable, and since the parent > thread sleeps until we exec this may still be okay, or?). > > It's a good question. vfork is typically underspecified, so it is plausible that changing errno in the child affects the parent. I think we should simply make sure that the parent code is robust against this (as it typically is - you check errno when a system API has already returned an indication of an error) > --- > > A bit more background, if you are interested: > > A long time ago, at SAP we swapped Oracle's implementation of > Runtime.exec on Unix against a completely homegrown one. The reason > was that we needed a really robust implementation for our customers, > and with the then current implementation we kept running into > limitations and pathological corner cases. Especially on outlier > platforms like HPUX or AS/400. > > This was long before OpenJDK existed, so there was no way for us to > just fix the coding upstream. In 2005 I was the primary maintainer of the subprocess code at Sun, and there is a good chance you could have gotten changes in if you had asked me! > But now there is, so at regular > intervals I keep eyeing our implementation to check if we could merge. > > The main difference between our version and the upstream one is: we > switched to vfork() for the obvious performance reasons. But since > vfork() is scary we implemented the exec-twice-trick with a little > in-between binary. The main purpose was to minimize the window between > vfork() and the first exec(), where we operate inside the parent > process memory. Everything "dangerous" - calling libc functions, > scanning the proc file system, error handling, tracing etc - is > deferred to after the first exec. And since that window is minimal, > that also reduces the chance of signals in the child process harming > the parent. > > See discussion in src/java.base/unix/native/libjava/ProcessImpl_md.c > Later Oracle introduced something very similar with the jspawnhelper. > But I see that jspawnhelper is not used at all for the vfork() case, > just for posix_spawn(), yes? > Implemented by other folks. I was mostly happy with vfork. I forget the motivation to introduce jspawnhelper, but I'm sure it's in the fossil record somewhere!