[
https://issues.apache.org/jira/browse/MESOS-9212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
James Peach reassigned MESOS-9212:
----------------------------------
Assignee: James Peach
| [r/68660|https://reviews.apache.org/r/68660] | Disabled SIGCHLD handling in
the libev event loop. |
> Disable SIGCHLD handling in libev.
> ----------------------------------
>
> Key: MESOS-9212
> URL: https://issues.apache.org/jira/browse/MESOS-9212
> Project: Mesos
> Issue Type: Bug
> Reporter: James Peach
> Assignee: James Peach
> Priority: Major
>
> On Fedora 28, building against the system version of libev (version 4.24)
> causes the following tests to fail:
> {noformat}
> [ FAILED ] ReapTest.NonChildProcess
> [ FAILED ] ReapTest.ChildProcess
> [ FAILED ] ReapTest.TerminatedChildProcess
> [ FAILED ] SubprocessTest.PipeOutputToFileDescriptor
> [ FAILED ] SubprocessTest.PipeOutputToPath
> [ FAILED ] SubprocessTest.EnvironmentEcho
> [ FAILED ] SubprocessTest.Status
> [ FAILED ] SubprocessTest.PipeOutput
> [ FAILED ] SubprocessTest.PipeLargeOutput
> [ FAILED ] SubprocessTest.PipeInput
> [ FAILED ] SubprocessTest.PipeRedirect
> [ FAILED ] SubprocessTest.PathOutput
> [ FAILED ] SubprocessTest.PathInput
> [ FAILED ] SubprocessTest.FdOutput
> [ FAILED ] SubprocessTest.FdInput
> [ FAILED ] SubprocessTest.Default
> [ FAILED ] SubprocessTest.Flags
> [ FAILED ] SubprocessTest.Environment
> [ FAILED ] SubprocessTest.EnvironmentWithSpaces
> [ FAILED ] SubprocessTest.EnvironmentWithSpacesAndQuotes
> [ FAILED ] SubprocessTest.EnvironmentOverride
> {noformat}
> This build configuration succeeds:
> {noformat}
> $ ../configure --disable-java --disable-python --enable-silent-rules
> --disable-hardening --disable-werror --disable-libtool-wrappers
> --enable-xfs-disk-isolator --enable-install-module-dependencies
> --enable-port-mapping-isolator --enable-network-ports-isolator
> --with-protobuf=/usr --with-curl=/usr --with-libarchive=/usr
> --with-zookeeper=/usr --prefix=/opt/mesos "CXXFLAGS=-O0 -ggdb3
> -fno-omit-frame-pointer -fvisibility-inlines-hidden
> -Wno-unused-local-typedefs -Wno-deprecated" "CFLAGS=-O0 -ggdb3
> -fno-omit-frame-pointer -Wno-unused-local-typedefs -Wno-deprecated" LDFLAGS=
> CXX=/home/jpeach/src/asf-mesos/build/c++
> CC=/home/jpeach/src/asf-mesos/build/cc LD=/home/jpeach/src/asf-mesos/build/ld
> {noformat}
> This build configuration, which differs only by adding --with-libev=/usr,
> fails:
> {noformat}
> $ ../configure --disable-java --disable-python --enable-silent-rules
> --disable-hardening --disable-werror --disable-libtool-wrappers
> --enable-xfs-disk-isolator --enable-install-module-dependencies
> --enable-port-mapping-isolator --enable-network-ports-isolator
> --with-protobuf=/usr --with-curl=/usr --with-libarchive=/usr
> --with-zookeeper=/usr --prefix=/opt/mesos "CXXFLAGS=-O0 -ggdb3
> -fno-omit-frame-pointer -fvisibility-inlines-hidden
> -Wno-unused-local-typedefs -Wno-deprecated" "CFLAGS=-O0 -ggdb3
> -fno-omit-frame-pointer -Wno-unused-local-typedefs -Wno-deprecated" LDFLAGS=
> CXX=/home/jpeach/src/asf-mesos/build/c++
> CC=/home/jpeach/src/asf-mesos/build/cc LD=/home/jpeach/src/asf-mesos/build/ld
> --with-libev=/usr
> {noformat}
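> I think libev's SIGCHLD handling reaps the child before libprocess can. In
> the trace below, thread 25922 collects child 25923 with wait4(-1, ...), so
> the later wait4(25923, ...) from thread 25919 fails with ECHILD: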
> {noformat}
> [==========] Running 1 test from 1 test case.
> [----------] Global test environment set-up.
> [----------] 1 test from SubprocessTest
> [ RUN ] SubprocessTest.EnvironmentWithSpaces
> [pid 25909] clone(child_stack=NULL,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7fa11881fcd0) = 25923
> strace: Process 25923 attached
> [pid 25923] execve("/usr/bin/sh", ["sh", "-c", "echo $MESSAGE"], 0x1ff3950 /*
> 1 var */) = 0
> [pid 25923] arch_prctl(ARCH_SET_FS, 0x7f24561c5740) = 0
> [pid 25923] exit_group(0) = ?
> [pid 25923] +++ exited with 0 +++
> [pid 25909] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=25923,
> si_uid=9306, si_status=0, si_utime=0, si_stime=0} ---
> [pid 25922] wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}],
> WNOHANG|WSTOPPED|WCONTINUED, NULL) = 25923
> [pid 25922] wait4(-1, 0x7fa10a74da44, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1
> ECHILD (No child processes)
> [pid 25919] wait4(25923, 0x7fa10bf50548, WNOHANG, NULL) = -1 ECHILD (No child
> processes)
> ../../../3rdparty/libprocess/src/tests/subprocess_tests.cpp:977: Failure
> (s->status()).get() is NONE
> [ FAILED ] SubprocessTest.EnvironmentWithSpaces (12 ms)
> [----------] 1 test from SubprocessTest (12 ms total)
> [----------] Global test environment tear-down
> [==========] 1 test from 1 test case ran. (12 ms total)
> [ PASSED ] 0 tests.
> [ FAILED ] 1 test, listed below:
> [ FAILED ] SubprocessTest.EnvironmentWithSpaces
> {noformat}
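
The wait4 sequence in the trace can be reproduced in isolation. The following is a minimal sketch, not Mesos or libev code: it assumes a POSIX system, uses a hypothetical helper name (reap_twice), and runs both waits on one thread, whereas the trace shows two separate threads. The first wait is blocking rather than WNOHANG so the result is deterministic. It only illustrates that once a wildcard wait has collected a child, a later targeted wait on the same pid fails with ECHILD.

```python
import os


def reap_twice():
    """Fork a child, reap it with a wildcard wait, then try a targeted wait.

    Hypothetical helper for illustration only; assumes the calling process
    has no other children.
    """
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately with status 0.
        os._exit(0)

    # First waiter: wildcard wait, standing in for wait4(-1, ...) in the trace.
    # Blocking (options=0) so the child is certain to have exited.
    reaped_pid, status = os.waitpid(-1, 0)
    exited_cleanly = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0

    # Second waiter: targeted wait, standing in for wait4(25923, ..., WNOHANG).
    # The child has already been collected, so the kernel reports ECHILD,
    # which Python surfaces as ChildProcessError.
    try:
        os.waitpid(pid, os.WNOHANG)
        second_wait = "ok"
    except ChildProcessError:
        second_wait = "ECHILD"

    return pid, reaped_pid, exited_cleanly, second_wait


if __name__ == "__main__":
    print(reap_twice())
```

The second waiter never sees an exit status, which is consistent with the test failure above, where s->status() is NONE.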