[
https://issues.apache.org/jira/browse/MESOS-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367823#comment-14367823
]
Ian Downes commented on MESOS-2402:
-----------------------------------
It's trying to run using a locally installed helper, i.e.,
**/usr/local/libexec/mesos/mesos-containerizer**. Here's the start of the
strace for the forked child:
{noformat}
set_robust_list(0x7f16a9fe39e0, 0x18) = 0
dup2(8, 0) = 0
dup2(9, 1) = 1
dup2(10, 2) = 2
close(8) = 0
close(9) = 0
close(10) = 0
setsid() = 54676
execve("/usr/local/libexec/mesos/mesos-containerizer", ["mesos-containerizer",
"launch", "--command={\"shell\":true,\"value\":"...,
"--commands={\"commands\":[]}", "--directory=/tmp/MesosContaineri"...,
"--pipe_read=6", "--pipe_write=7"], [/* 35 vars */]) = -1 ENOENT (No such file
or directory)
write(2, "ABORT: (../../../3rdparty/libpro"..., 62) = 62
write(2, "Failed to os::execvpe in childMa"..., 61) = 61
...
{noformat}
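To make the failure mode in that trace concrete, here is a minimal standalone sketch (not Mesos code) of a forked child calling execve on a path that may not exist and reporting the resulting errno back to the parent. The function name execErrno and the pipe-based reporting are assumptions made for illustration only; the real child simply aborts, as the trace shows.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of Mesos): forks a child that tries to
// execve `path` and returns the errno the child saw, 0 if the exec
// succeeded, or -1 if the pipe or fork could not be set up.
int execErrno(const char* path) {
  int fds[2];
  if (pipe(fds) != 0) {
    return -1;
  }
  // Close-on-exec: a successful exec closes the write end, so the
  // parent reads EOF instead of an errno value.
  fcntl(fds[1], F_SETFD, FD_CLOEXEC);

  pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }

  if (pid == 0) {
    // Child: attempt the exec, as in the strace above.
    close(fds[0]);
    char* const argv[] = {const_cast<char*>("mesos-containerizer"),
                          const_cast<char*>("launch"),
                          nullptr};
    char* const envp[] = {nullptr};
    execve(path, argv, envp);
    // Only reached if execve failed; report errno to the parent.
    int err = errno;
    ssize_t ignored = write(fds[1], &err, sizeof(err));
    (void)ignored;
    _exit(127);
  }

  // Parent: read the child's errno, if it sent one.
  close(fds[1]);
  int err = 0;
  ssize_t n = read(fds[0], &err, sizeof(err));
  close(fds[0]);
  int status = 0;
  waitpid(pid, &status, 0);
  return n == static_cast<ssize_t>(sizeof(err)) ? err : 0;
}
```

Calling execErrno with a path that does not exist on the machine should return ENOENT, which is the same error the forked child hits above.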
The launcher needs flags.launcher_dir to be set correctly if Mesos is not
installed in the default location. For tests, this is done in
MesosTest::CreateSlaveFlags(), so that should be used to generate a correct
slave::Flags rather than the default constructor, which leaves launcher_dir set
to the default of PKGLIBEXECDIR.
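A rough sketch of why the default-constructed flags pick up the wrong binary. Everything here is a hypothetical stand-in, not the real Mesos types: Flags mimics only the launcher_dir field of slave::Flags, createTestFlags plays the role of MesosTest::CreateSlaveFlags(), and the default value is assumed from the path in the strace rather than taken from the build configuration.

```cpp
#include <string>
#include <unistd.h>

// Hypothetical stand-in for slave::Flags, reduced to the one field
// discussed here. The default is an assumption taken from the strace
// path; in Mesos it comes from PKGLIBEXECDIR.
struct Flags {
  std::string launcher_dir = "/usr/local/libexec/mesos";
};

// Hypothetical analogue of MesosTest::CreateSlaveFlags(): start from
// the defaults and point launcher_dir at the build tree instead.
Flags createTestFlags(const std::string& buildDir) {
  Flags flags;
  flags.launcher_dir = buildDir;
  return flags;
}

// Path of the helper binary the launcher would try to exec.
std::string helperPath(const Flags& flags) {
  return flags.launcher_dir + "/mesos-containerizer";
}

// True if the helper exists and is executable by the current user.
bool helperExists(const Flags& flags) {
  return access(helperPath(flags).c_str(), X_OK) == 0;
}
```

With a default-constructed Flags, helperPath yields the installed location seen in the strace, so the test only passes on machines that happen to have a helper installed there. With createTestFlags it resolves inside whatever build directory is passed in.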
Given this, I expect this test to fail for everyone, everywhere, _unless they
happen to have a locally installed Mesos, in which case the test incorrectly
uses that binary_. It fails for me on my development box.
> MesosContainerizerDestroyTest.LauncherDestroyFailure is flaky
> -------------------------------------------------------------
>
> Key: MESOS-2402
> URL: https://issues.apache.org/jira/browse/MESOS-2402
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.23.0
> Reporter: Vinod Kone
> Assignee: Vinod Kone
>
> "Failed to os::execvpe in childMain". Never seen this one before.
> {code}
> [ RUN ] MesosContainerizerDestroyTest.LauncherDestroyFailure
> Using temporary directory
> '/tmp/MesosContainerizerDestroyTest_LauncherDestroyFailure_QpjQEn'
> I0224 18:55:49.326912 21391 containerizer.cpp:461] Starting container
> 'test_container' for executor 'executor' of framework ''
> I0224 18:55:49.332252 21391 launcher.cpp:130] Forked child with pid '23496'
> for container 'test_container'
> ABORT: (src/subprocess.cpp:165): Failed to os::execvpe in childMain
> *** Aborted at 1424832949 (unix time) try "date -d @1424832949" if you are
> using GNU date ***
> PC: @ 0x2b178c5db0d5 (unknown)
> I0224 18:55:49.340955 21392 process.cpp:2117] Dropped / Lost event for PID:
> [email protected]:39647
> I0224 18:55:49.342300 21386 containerizer.cpp:911] Destroying container
> 'test_container'
> *** SIGABRT (@0x3e800005bc8) received by PID 23496 (TID 0x2b178f9f0700) from
> PID 23496; stack trace: ***
> @ 0x2b178c397cb0 (unknown)
> @ 0x2b178c5db0d5 (unknown)
> @ 0x2b178c5de83b (unknown)
> @ 0x87a945 _Abort()
> @ 0x2b1789f610b9 process::childMain()
> I0224 18:55:49.391793 21386 containerizer.cpp:1120] Executor for container
> 'test_container' has exited
> I0224 18:55:49.400478 21391 process.cpp:2770] Handling HTTP event for process
> 'metrics' with path: '/metrics/snapshot'
> tests/containerizer_tests.cpp:485: Failure
> Value of: metrics.values["containerizer/mesos/container_destroy_errors"]
> Actual: 16-byte object <02-00 00-00 17-2B 00-00 E0-86 0E-04 00-00 00-00>
> Expected: 1u
> Which is: 1
> [ FAILED ] MesosContainerizerDestroyTest.LauncherDestroyFailure (89 ms)
> {code}