> On Sept. 7, 2018, 2:18 a.m., James Peach wrote: > > 3rdparty/libprocess/src/posix/subprocess.hpp > > Lines 195 (patched) > > <https://reviews.apache.org/r/68644/diff/1/?file=2082817#file2082817line195> > > > > We need to be careful here. We are in an async-signal-safe context but > > hashmap and list allocate memory. > > > > Actually iterating over the directory is difficult to do in an > > async-signal-safe context. You can open the directory in the parent and do > > the iteration in the child, but I don't think there's any guarantee that > > the readdir in the child is safe (though AFAIK in glibc it would work). > > Qian Zhang wrote: > Good catch! > > Yeah, we could do `os::lsof()` in the parent (e.g., right before we fork > the child: > https://github.com/apache/mesos/blob/1.7.0/3rdparty/libprocess/src/posix/subprocess.hpp#L261) > , but what if the parent opens a new fd after `os::lsof()` is called but > before the child is forked? In such case, the new fd will be leaked to the > child. > > And it seems `opendir`, `readdir` and `closedir` are not > async-signal-safe according to > http://man7.org/linux/man-pages/man7/signal-safety.7.html > > Maybe we should do something like this: > https://github.com/python/cpython/blob/master/Modules/_posixsubprocess.c#L257:L303 > , but I see they call `getdents64` which seems not async-signal-safe too. > > James Peach wrote: > > Yeah, we could do os::lsof() in the parent (e.g., right before we fork > the child: > https://github.com/apache/mesos/blob/1.7.0/3rdparty/libprocess/src/posix/subprocess.hpp#L261) > , but what if the parent opens a new fd after os::lsof() is called but > before the child is forked? In such case, the new fd will be leaked to the > child. > > Yes, I agree that we should try to avoid this race. > > > And it seems opendir, readdir and closedir are not async-signal-safe > according to > > http://man7.org/linux/man-pages/man7/signal-safety.7.html > > Yup. I believe that this is OK in practice on Linux; at least I see this > approach being used in real code. > > > Maybe we should do something like this: > https://github.com/python/cpython/blob/master/Modules/_posixsubprocess.c#L257:L303 > , but I see they call getdents64 which seems not async-signal-safe too. > > The two approaches that I can find in common use for Linux are to scan > the `fd` directory or to just close everything up to `getdtablesize`. The > former is a bit less portable, and the latter probably has a small > performance cost. The cpython code is probably the best implementation of the > scanning approach that I've seen. I don't know of any way to implement the > scanning approach that is *strictly* async-signal-safe. > > A slightly different approach would be to exec a helper tool that does > the closing. This would have a performance drawback (cost of linking libmesos > at startup), and possibly compatibility issues.
I updated the patch by following what the cpython code did. But I only did it for Linux not for macOS and FreeBSD yet, Apple has decided to deprecate all syscall functions, so we have no async-signal-safe way to get entries from the `/dev/fd` dir on macOS. - Qian ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68644/#review208417 ----------------------------------------------------------- On Oct. 11, 2018, 10:03 p.m., Qian Zhang wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68644/ > ----------------------------------------------------------- > > (Updated Oct. 11, 2018, 10:03 p.m.) > > > Review request for mesos, Gilbert Song and James Peach. > > > Bugs: MESOS-9152 > https://issues.apache.org/jira/browse/MESOS-9152 > > > Repository: mesos > > > Description > ------- > > Closed all file descriptors except `whitelist_fds` in posix/subprocess. > > > Diffs > ----- > > 3rdparty/libprocess/src/posix/subprocess.hpp > 007058b61fdcd4716aa793516c842c3cef8c0a29 > 3rdparty/libprocess/src/subprocess.cpp > c0640de2dc4278b884282dfaad98c49c3b067a5b > > > Diff: https://reviews.apache.org/r/68644/diff/2/ > > > Testing > ------- > > > Thanks, > > Qian Zhang > >
