[ 
https://issues.apache.org/jira/browse/MESOS-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13912199#comment-13912199
 ] 

Benjamin Mahler commented on MESOS-1037:
----------------------------------------

Hm.. we have Fedora 20 CI running at Twitter with no issues.

It would be interesting to capture the full process table right after the
Fork() call in Reap.ChildProcess to see what's going on.
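
One possible way to capture such a snapshot on Linux is to walk /proc and read each /proc/<pid>/stat file. The following is only a minimal sketch, not code from libprocess or stout: the names `ProcEntry`, `parse_stat`, and `snapshot_process_table` are hypothetical, and it assumes /proc is mounted.

```cpp
#include <dirent.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <cctype>
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record holding the first fields of /proc/<pid>/stat.
struct ProcEntry {
  pid_t pid;
  std::string comm;  // Command name, without the surrounding parentheses.
  char state;        // For example 'R', 'S', or 'Z' (zombie).
  pid_t ppid;
};

// Parses one line in the /proc/<pid>/stat layout: "pid (comm) state ppid ...".
// The command name may itself contain spaces or parentheses, so the name is
// taken to span from the first '(' to the last ')'.
bool parse_stat(const std::string& line, ProcEntry* entry) {
  std::string::size_type open = line.find('(');
  std::string::size_type close = line.rfind(')');
  if (open == std::string::npos || close == std::string::npos ||
      close < open) {
    return false;
  }
  entry->pid = static_cast<pid_t>(std::atol(line.substr(0, open).c_str()));
  entry->comm = line.substr(open + 1, close - open - 1);

  std::istringstream rest(line.substr(close + 1));
  long ppid = 0;
  if (!(rest >> entry->state >> ppid)) {
    return false;
  }
  entry->ppid = static_cast<pid_t>(ppid);
  return true;
}

// Returns one entry per numeric directory in /proc that could be read.
// Processes that vanish while the directory is being walked are skipped.
std::vector<ProcEntry> snapshot_process_table() {
  std::vector<ProcEntry> table;
  DIR* dir = opendir("/proc");
  if (dir == nullptr) {
    return table;
  }
  while (dirent* d = readdir(dir)) {
    std::string name = d->d_name;
    if (name.empty() || !std::isdigit(static_cast<unsigned char>(name[0]))) {
      continue;
    }
    std::ifstream stat("/proc/" + name + "/stat");
    std::string line;
    ProcEntry entry;
    if (std::getline(stat, line) && parse_stat(line, &entry)) {
      table.push_back(entry);
    }
  }
  closedir(dir);
  return table;
}
```

Calling `snapshot_process_table()` in the parent right after the fork and printing each entry's pid, ppid, and state would show whether the child is still listed, and whether it is running or already a zombie (state `Z`).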

> Exited child process status
> ---------------------------
>
>                 Key: MESOS-1037
>                 URL: https://issues.apache.org/jira/browse/MESOS-1037
>             Project: Mesos
>          Issue Type: Bug
>          Components: libprocess
>    Affects Versions: 0.17.0
>         Environment: Fedora 20 - Linux 3.12.10-300.fc20.x86_64 #1 SMP Thu Feb 
> 6 22:11:48 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Timothy St. Clair
>
> During initial packaging I had turned a blind eye to some failing tests.
> Upon further investigation, there appears to be an issue with how processes
> are reaped on Fedora 20 (F20).
> It appears that a status call is made on the /proc entry of a child process
> before waitpid is called. In my environment the entry is already gone, and
> the tests yield the following:
> [----------] 3 tests from Reap
> [ RUN      ] Reap.NonChildProcess
> libprocess/tests/reap_tests.cpp:84: Failure
> status.get() is NONE
> [  FAILED  ] Reap.NonChildProcess (42 ms)
> [ RUN      ] Reap.ChildProcess
> libprocess/tests/reap_tests.cpp:123: Failure
> status.get() is NONE
> [  FAILED  ] Reap.ChildProcess (31 ms)
> [ RUN      ] Reap.TerminatedChildProcess
> libprocess/tests/reap_tests.cpp:150: Failure
> process is NONE
> Process 18856 reaped unexpectedly
> [  FAILED  ] Reap.TerminatedChildProcess (3 ms)
> [----------] 3 tests from Reap (76 ms total)
> [----------] 4 tests from Subprocess
> [ RUN      ] Subprocess.status
> libprocess/tests/subprocess_tests.cpp:40: Failure
> s.get().status().get() is NONE
> [  FAILED  ] Subprocess.status (30 ms)
> [ RUN      ] Subprocess.output
> libprocess/tests/subprocess_tests.cpp:129: Failure
> s.get().status().get() is NONE
> [  FAILED  ] Subprocess.output (31 ms)
> [ RUN      ] Subprocess.input
> libprocess/tests/subprocess_tests.cpp:182: Failure
> s.get().status().get() is NONE
> [  FAILED  ] Subprocess.input (31 ms)
> [ RUN      ] Subprocess.splice
> libprocess/tests/subprocess_tests.cpp:221: Failure
> s.get().status().get() is NONE
> [  FAILED  ] Subprocess.splice (32 ms)
> [----------] 4 tests from Subprocess (124 ms total)
> Trace of failure:
> process::ReaperProcess::reap () at reap.cpp:36
> os::process () at linux.hpp:54
> proc::status () at proc.hpp:174
> Branch for 0.18.0-rc2 build can be found here: 
> https://github.com/timothysc/mesos/tree/0.18.0-post-shuffle
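
To check the ordering described in the report (a /proc lookup on a child before waitpid is called), a small standalone probe can record whether /proc/<pid> exists just before and just after the child is reaped. This is a sketch under assumptions and does not reproduce the libprocess reaper: `proc_entry_exists`, `ReapObservation`, and `observe_child` are hypothetical names. On Linux an exited but unreaped child normally stays visible in /proc as a zombie, so a missing entry before waitpid would suggest that something else already reaped the child.

```cpp
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <string>

// Returns true if /proc/<pid> currently exists.
bool proc_entry_exists(pid_t pid) {
  struct stat s;
  return ::stat(("/proc/" + std::to_string(pid)).c_str(), &s) == 0;
}

// Hypothetical result type for the probe below.
struct ReapObservation {
  bool before_wait;  // /proc/<pid> present after exit, before waitpid.
  bool after_wait;   // /proc/<pid> present after waitpid returned.
  int exit_code;     // Child's exit code, or -1 if it could not be read.
};

ReapObservation observe_child() {
  ReapObservation obs = {false, false, -1};

  pid_t pid = fork();
  if (pid < 0) {
    return obs;  // fork failed; nothing to observe.
  }
  if (pid == 0) {
    _exit(7);  // Child: exit immediately with a recognizable code.
  }

  // Block until the child has exited, but leave it unreaped (WNOWAIT) so
  // that the /proc entry can be inspected while the child is a zombie.
  siginfo_t info;
  if (waitid(P_PID, static_cast<id_t>(pid), &info, WEXITED | WNOWAIT) != 0) {
    return obs;
  }
  obs.before_wait = proc_entry_exists(pid);

  // Now actually reap the child and look again.
  int status = 0;
  if (waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
    obs.exit_code = WEXITSTATUS(status);
  }
  obs.after_wait = proc_entry_exists(pid);
  return obs;
}
```

If `before_wait` comes back false on an affected machine, the child was probably reaped elsewhere (for example by another waiter, or because SIGCHLD is ignored) before the status lookup ran.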



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
