Hi Haosdent and Mesos friends,

I've rebuilt the cluster from scratch and installed mesos 0.24.1 from the
mesosphere apt repo:

$ dpkg -l | grep mesos
ii  mesos                               0.24.1-0.2.35.ubuntu1404
 amd64        Cluster resource manager with efficient resource isolation

Then I added the `launcher_dir` flag via /etc/mesos-slave/launcher_dir on the
slaves:

mesos-worker1a:~$ cat /etc/mesos-slave/launcher_dir
/usr/libexec/mesos
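
As I understand the mesosphere packaging (this mapping is my assumption, not
something I've confirmed in the docs), the init wrapper turns each file under
/etc/mesos-slave/ into a --<filename>=<contents> flag on the mesos-slave
command line, so the file above should surface as
--launcher_dir=/usr/libexec/mesos. A sketch of that translation, simulated in
a temp dir:

```shell
# Simulate the assumed /etc/mesos-slave/<flag-file> -> --<flag>=<value>
# translation done by the packaging's init wrapper (a temp dir stands in
# for /etc/mesos-slave here).
confdir=$(mktemp -d)
echo "/usr/libexec/mesos" > "$confdir/launcher_dir"
for f in "$confdir"/*; do
  echo "--$(basename "$f")=$(cat "$f")"
done
# prints: --launcher_dir=/usr/libexec/mesos
```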

And yet the task health-checks are still being launched from the sandbox
directory like before!

I've also tested setting the MESOS_LAUNCHER_DIR env var and got the
identical result (just as before on the cluster where many versions of
mesos had been installed):
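
For reference, the slave logs every flag it actually parsed in a single
"Flags at startup" line, so grepping that line confirms whether launcher_dir
was picked up at all. A small sketch against a sample of that line (point the
grep at your real slave log; its location depends on --log_dir):

```shell
# Extract the parsed launcher_dir from a "Flags at startup" slave log line
# (a sample line is written to a temp file here; substitute your actual
# mesos-slave log).
log=$(mktemp)
echo 'I1009 10:27:26.594599  1416 slave.cpp:203] Flags at startup: --launcher_dir="/usr/libexec/mesos" --port="5051"' > "$log"
grep -o -- '--launcher_dir="[^"]*"' "$log"
# prints: --launcher_dir="/usr/libexec/mesos"
```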

STDOUT:

--container="mesos-20151012-184440-1625401536-5050-23953-S0.62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb"
> --docker="docker" --help="false" --initialize_driver_logging="true"
> --logbufsecs="0" --logging_level="INFO"
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
> --sandbox_directory="/tmp/mesos/slaves/20151012-184440-1625401536-5050-23953-S0/frameworks/20151012-184440-1625401536-5050-23953-0000/executors/hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91/runs/62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb"
> --stop_timeout="0ns"
> --container="mesos-20151012-184440-1625401536-5050-23953-S0.62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb"
> --docker="docker" --help="false" --initialize_driver_logging="true"
> --logbufsecs="0" --logging_level="INFO"
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
> --sandbox_directory="/tmp/mesos/slaves/20151012-184440-1625401536-5050-23953-S0/frameworks/20151012-184440-1625401536-5050-23953-0000/executors/hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91/runs/62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb"
> --stop_timeout="0ns"
> Registered docker executor on mesos-worker1a
> Starting task hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91
> Launching health check process:
> /tmp/mesos/slaves/20151012-184440-1625401536-5050-23953-S0/frameworks/20151012-184440-1625401536-5050-23953-0000/executors/hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91/runs/62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb/mesos-health-check
> --executor=(1)@192.168.225.58:48912
> --health_check_json={"command":{"shell":true,"value":"docker exec
> mesos-20151012-184440-1625401536-5050-23953-S0.62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb
> sh -c \" curl --silent --show-error --fail --tcp-nodelay --head -X GET
> --user-agent flux-capacitor-health-checker --max-time 1 http:\/\/
> 127.0.0.1:8000
> \""},"consecutive_failures":6,"delay_seconds":15,"grace_period_seconds":10,"interval_seconds":1,"timeout_seconds":1}
> --task_id=hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91
> Health check process launched at pid: 11253



STDERR:

--container="mesos-20151012-184440-1625401536-5050-23953-S0.62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb"
> --docker="docker" --help="false" --initialize_driver_logging="true"
> --logbufsecs="0" --logging_level="INFO"
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
> --sandbox_directory="/tmp/mesos/slaves/20151012-184440-1625401536-5050-23953-S0/frameworks/20151012-184440-1625401536-5050-23953-0000/executors/hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91/runs/62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb"
> --stop_timeout="0ns"
> --container="mesos-20151012-184440-1625401536-5050-23953-S0.62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb"
> --docker="docker" --help="false" --initialize_driver_logging="true"
> --logbufsecs="0" --logging_level="INFO"
> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
> --sandbox_directory="/tmp/mesos/slaves/20151012-184440-1625401536-5050-23953-S0/frameworks/20151012-184440-1625401536-5050-23953-0000/executors/hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91/runs/62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb"
> --stop_timeout="0ns"
> Registered docker executor on mesos-worker1a
> Starting task hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91
> *Launching health check process:
> /tmp/mesos/slaves/20151012-184440-1625401536-5050-23953-S0/frameworks/20151012-184440-1625401536-5050-23953-0000/executors/hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91/runs/62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb/mesos-health-check*
> --executor=(1)@192.168.225.58:48912
> --health_check_json={"command":{"shell":true,"value":"docker exec
> mesos-20151012-184440-1625401536-5050-23953-S0.62d43b8f-6cd1-4c53-9ac8-84dbfc45bbcb
> sh -c \" curl --silent --show-error --fail --tcp-nodelay --head -X GET
> --user-agent flux-capacitor-health-checker --max-time 1 http:\/\/
> 127.0.0.1:8000
> \""},"consecutive_failures":6,"delay_seconds":15,"grace_period_seconds":10,"interval_seconds":1,"timeout_seconds":1}
> --task_id=hello-app_web-v3.33597b73-1943-41b4-a308-76132eebcc91
> Health check process launched at pid: 11253


Any ideas on where to go from here?  Is there any additional information I
can provide?

Thanks as always,
Jay


On Thu, Oct 8, 2015 at 9:23 PM, haosdent <[email protected]> wrote:

> Flags sent to the executor from the containerizer are stringified and
> become command-line parameters when the executor is launched.
>
> You can see this in
> https://github.com/apache/mesos/blob/master/src/slave/containerizer/docker.cpp#L279-L288
>
> But for launcher_dir, the executor gets it from `argv[0]`, as you mentioned
> above.
> ```
>   string path =
>     envPath.isSome() ? envPath.get()
>                      : os::realpath(Path(argv[0]).dirname()).get();
>
> ```
> So I want to figure out why your argv[0] became the sandbox dir instead of
> "/usr/libexec/mesos".
>
> On Fri, Oct 9, 2015 at 12:03 PM, Jay Taylor <[email protected]> wrote:
>
>> I see.  And then how are the flags sent to the executor?
>>
>>
>>
>> On Oct 8, 2015, at 8:56 PM, haosdent <[email protected]> wrote:
>>
>> Yes. The related code is located in
>> https://github.com/apache/mesos/blob/master/src/slave/main.cpp#L123
>>
>> In fact, environment variables starting with MESOS_ are loaded as flag
>> values.
>>
>> https://github.com/apache/mesos/blob/master/3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp#L52
>>
>> On Fri, Oct 9, 2015 at 11:33 AM, Jay Taylor <[email protected]> wrote:
>>
>>> One question for you haosdent-
>>>
>>> You mentioned that the flags.launcher_dir should propagate to the docker
>>> executor all the way up the chain.  Can you show me where this logic is in
>>> the codebase?  I didn't see where that was happening and would like to
>>> understand the mechanism.
>>>
>>> Thanks!
>>> Jay
>>>
>>>
>>>
>>> On Oct 8, 2015, at 8:29 PM, Jay Taylor <[email protected]> wrote:
>>>
>>> Maybe tomorrow I will build a fresh cluster from scratch to see if the
>>> broken behavior experienced today still persists.
>>>
>>> On Oct 8, 2015, at 7:52 PM, haosdent <[email protected]> wrote:
>>>
>>> As far as I know, MESOS_LAUNCHER_DIR works by setting flags.launcher_dir,
>>> which is the directory where mesos-docker-executor and mesos-health-check
>>> are looked up. Although the env var itself is not propagated,
>>> MESOS_LAUNCHER_DIR still works because flags.launcher_dir is derived from it.
>>>
>>> For example, I ran
>>> ```
>>> export MESOS_LAUNCHER_DIR=/tmp
>>> ```
>>> before starting mesos-slave. So when I launched the slave, I could find
>>> this line in the slave log:
>>> ```
>>> I1009 10:27:26.594599  1416 slave.cpp:203] Flags at startup:
>>> xxxxx  --launcher_dir="/tmp"
>>> ```
>>>
>>> And from your log, I'm not sure why your MESOS_LAUNCHER_DIR became the
>>> sandbox dir. Is MESOS_LAUNCHER_DIR overridden in one of your other scripts?
>>>
>>>
>>> On Fri, Oct 9, 2015 at 1:56 AM, Jay Taylor <[email protected]> wrote:
>>>
>>>> I haven't ever changed MESOS_LAUNCHER_DIR/--launcher_dir before.
>>>>
>>>> I just tried setting both the env var and the flag on the slaves, and
>>>> have determined that the env var is not present when it is checked in
>>>> src/docker/executor.cpp @ line 573:
>>>>
>>>>  const Option<string> envPath = os::getenv("MESOS_LAUNCHER_DIR");
>>>>>   string path =
>>>>>     envPath.isSome() ? envPath.get()
>>>>>                      : os::realpath(Path(argv[0]).dirname()).get();
>>>>>   cout << "MESOS_LAUNCHER_DIR: envpath.isSome()->" <<
>>>>> (envPath.isSome() ? "yes" : "no") << endl;
>>>>>   cout << "MESOS_LAUNCHER_DIR: path='" << path << "'" << endl;
>>>>
>>>>
>>>> Exported MESOS_LAUNCHER_DIR env var (and verified it is correctly
>>>> propagated along up to the point of mesos-slave launch):
>>>>
>>>> $ cat /etc/default/mesos-slave
>>>>> export
>>>>> MESOS_MASTER="zk://mesos-primary1a:2181,mesos-primary2a:2181,mesos-primary3a:2181/mesos"
>>>>> export MESOS_CONTAINERIZERS="mesos,docker"
>>>>> export MESOS_EXECUTOR_REGISTRATION_TIMEOUT="5mins"
>>>>> export MESOS_PORT="5050"
>>>>> export MESOS_LAUNCHER_DIR="/usr/libexec/mesos"
>>>>
>>>>
>>>> TASK OUTPUT:
>>>>
>>>>
>>>>> *MESOS_LAUNCHER_DIR: envpath.isSome()->no*
>>>>> *MESOS_LAUNCHER_DIR:
>>>>> path='/tmp/mesos/slaves/61373c0e-7349-4173-ab8d-9d7b260e8a30-S1/frameworks/20150924-210922-1608624320-5050-1792-0020/executors/hello-app_web-v3.22f9c7e4-2109-48a9-998e-e116141ec253/runs/41f8eed6-ec6c-4e6f-b1aa-0a2817a600ad'*
>>>>> Registered docker executor on mesos-worker2a
>>>>> Starting task hello-app_web-v3.22f9c7e4-2109-48a9-998e-e116141ec253
>>>>> Launching health check process:
>>>>> /tmp/mesos/slaves/61373c0e-7349-4173-ab8d-9d7b260e8a30-S1/frameworks/20150924-210922-1608624320-5050-1792-0020/executors/hello-app_web-v3.22f9c7e4-2109-48a9-998e-e116141ec253/runs/41f8eed6-ec6c-4e6f-b1aa-0a2817a600ad/mesos-health-check
>>>>> --executor=(1)@192.168.225.59:44523
>>>>> --health_check_json={"command":{"shell":true,"value":"docker exec
>>>>> mesos-61373c0e-7349-4173-ab8d-9d7b260e8a30-S1.41f8eed6-ec6c-4e6f-b1aa-0a2817a600ad
>>>>> sh -c \" \/bin\/bash
>>>>> \""},"consecutive_failures":3,"delay_seconds":5.0,"grace_period_seconds":10.0,"interval_seconds":10.0,"timeout_seconds":10.0}
>>>>> --task_id=hello-app_web-v3.22f9c7e4-2109-48a9-998e-e116141ec253
>>>>> Health check process launched at pid: 2519
>>>>
>>>>
>>>> The env var is not propagated when the docker executor is launched
>>>> in src/slave/containerizer/docker.cpp around line 903:
>>>>
>>>>   vector<string> argv;
>>>>>   argv.push_back("mesos-docker-executor");
>>>>>   // Construct the mesos-docker-executor using the "name" we gave the
>>>>>   // container (to distinguish it from Docker containers not created
>>>>>   // by Mesos).
>>>>>   Try<Subprocess> s = subprocess(
>>>>>       path::join(flags.launcher_dir, "mesos-docker-executor"),
>>>>>       argv,
>>>>>       Subprocess::PIPE(),
>>>>>       Subprocess::PATH(path::join(container->directory, "stdout")),
>>>>>       Subprocess::PATH(path::join(container->directory, "stderr")),
>>>>>       dockerFlags(flags, container->name(), container->directory),
>>>>>       environment,
>>>>>       lambda::bind(&setup, container->directory));
>>>>
>>>>
>>>> A little way above we can see the environment is set up with the
>>>> container task's defined env vars.
>>>>
>>>> See src/slave/containerizer/docker.cpp around line 871:
>>>>
>>>>   // Include any environment variables from ExecutorInfo.
>>>>>   foreach (const Environment::Variable& variable,
>>>>>            container->executor.command().environment().variables()) {
>>>>>     environment[variable.name()] = variable.value();
>>>>>   }
>>>>
>>>>
>>>> Should I file a JIRA for this?  Have I overlooked anything?
>>>>
>>>>
>>>> On Wed, Oct 7, 2015 at 8:11 PM, haosdent <[email protected]> wrote:
>>>>
>>>>> >Not sure what was going on with health-checks in 0.24.0.
>>>>> 0.24.1 should work.
>>>>>
>>>>> >Do any of you know which host the path
>>>>> "/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check"
>>>>> should exist on? It definitely doesn't exist on the slave, hence execution
>>>>> failing.
>>>>>
>>>>> Did you set MESOS_LAUNCHER_DIR/--launcher_dir incorrectly before? We
>>>>> get mesos-health-check from MESOS_LAUNCHER_DIR/--launcher_dir, or use
>>>>> the same dir as mesos-docker-executor.
>>>>>
>>>>> On Thu, Oct 8, 2015 at 10:46 AM, Jay Taylor <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Maybe I spoke too soon.
>>>>>>
>>>>>> Now the checks are attempting to run, however the STDERR is not
>>>>>> looking good.  I've added some debugging to the error message output to
>>>>>> show the path, argv, and envp variables:
>>>>>>
>>>>>> STDOUT:
>>>>>>
>>>>>> --container="mesos-16b49e90-6852-4c91-8e70-d89c54f25668-S1.73dbfe88-1dbb-4f61-9a52-c365558cdbfc"
>>>>>>> --docker="docker" --docker_socket="/var/run/docker.sock" --help="false"
>>>>>>> --initialize_driver_logging="true" --logbufsecs="0" 
>>>>>>> --logging_level="INFO"
>>>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
>>>>>>> --sandbox_directory="/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc"
>>>>>>> --stop_timeout="0ns"
>>>>>>> --container="mesos-16b49e90-6852-4c91-8e70-d89c54f25668-S1.73dbfe88-1dbb-4f61-9a52-c365558cdbfc"
>>>>>>> --docker="docker" --docker_socket="/var/run/docker.sock" --help="false"
>>>>>>> --initialize_driver_logging="true" --logbufsecs="0" 
>>>>>>> --logging_level="INFO"
>>>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
>>>>>>> --sandbox_directory="/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc"
>>>>>>> --stop_timeout="0ns"
>>>>>>> Registered docker executor on mesos-worker2a
>>>>>>> Starting task
>>>>>>> app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0
>>>>>>> Launching health check process:
>>>>>>> /tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check
>>>>>>> --executor=(1)@192.168.225.59:43917
>>>>>>> --health_check_json={"command":{"shell":true,"value":"docker exec
>>>>>>> mesos-16b49e90-6852-4c91-8e70-d89c54f25668-S1.73dbfe88-1dbb-4f61-9a52-c365558cdbfc
>>>>>>> sh -c \" exit 1
>>>>>>> \""},"consecutive_failures":3,"delay_seconds":0.0,"grace_period_seconds":10.0,"interval_seconds":10.0,"timeout_seconds":10.0}
>>>>>>> --task_id=app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0
>>>>>>> Health check process launched at pid: 3012
>>>>>>
>>>>>>
>>>>>> STDERR:
>>>>>>
>>>>>> I1008 02:17:28.870434 2770 exec.cpp:134] Version: 0.26.0
>>>>>>> I1008 02:17:28.871860 2778 exec.cpp:208] Executor registered on
>>>>>>> slave 16b49e90-6852-4c91-8e70-d89c54f25668-S1
>>>>>>> WARNING: Your kernel does not support swap limit capabilities,
>>>>>>> memory limited without swap.
>>>>>>> ABORT: (src/subprocess.cpp:180): Failed to os::execvpe in childMain
>>>>>>> (path.c_str()='/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check',
>>>>>>> argv='/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check',
>>>>>>> envp=''): No such file or directory*** Aborted at 1444270649 (unix time)
>>>>>>> try "date -d @1444270649" if you are using GNU date ***
>>>>>>> PC: @ 0x7f4a37ec6cc9 (unknown)
>>>>>>> *** SIGABRT (@0xbc4) received by PID 3012 (TID 0x7f4a2f9f6700) from
>>>>>>> PID 3012; stack trace: ***
>>>>>>> @ 0x7f4a38265340 (unknown)
>>>>>>> @ 0x7f4a37ec6cc9 (unknown)
>>>>>>> @ 0x7f4a37eca0d8 (unknown)
>>>>>>> @ 0x4191e2 _Abort()
>>>>>>> @ 0x41921c _Abort()
>>>>>>> @ 0x7f4a39dc2768 process::childMain()
>>>>>>> @ 0x7f4a39dc4f59 std::_Function_handler<>::_M_invoke()
>>>>>>> @ 0x7f4a39dc24fc process::defaultClone()
>>>>>>> @ 0x7f4a39dc34fb process::subprocess()
>>>>>>> @ 0x43cc9c
>>>>>>> mesos::internal::docker::DockerExecutorProcess::launchHealthCheck()
>>>>>>> @ 0x7f4a39d924f4 process::ProcessManager::resume()
>>>>>>> @ 0x7f4a39d92827
>>>>>>> _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv
>>>>>>> @ 0x7f4a38a47e40 (unknown)
>>>>>>> @ 0x7f4a3825d182 start_thread
>>>>>>> @ 0x7f4a37f8a47d (unknown)
>>>>>>
>>>>>>
>>>>>> Do any of you know which host the path 
>>>>>> "/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check"
>>>>>> should exist on? It definitely doesn't exist on the slave, hence
>>>>>> execution failing.
>>>>>>
>>>>>> This is with current master, git hash
>>>>>> 5058fac1083dc91bca54d33c26c810c17ad95dd1.
>>>>>>
>>>>>> commit 5058fac1083dc91bca54d33c26c810c17ad95dd1
>>>>>>> Author: Anand Mazumdar <[email protected]>
>>>>>>> Date:   Tue Oct 6 17:37:41 2015 -0700
>>>>>>
>>>>>>
>>>>>> -Jay
>>>>>>
>>>>>> On Wed, Oct 7, 2015 at 5:23 PM, Jay Taylor <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Update:
>>>>>>>
>>>>>>> I used https://github.com/deric/mesos-deb-packaging to compile and
>>>>>>> package the latest master (0.26.x) and deployed it to the cluster, and 
>>>>>>> now
>>>>>>> health checks are working as advertised in both Marathon and my own
>>>>>>> framework!  Not sure what was going on with health-checks in 0.24.0.
>>>>>>>
>>>>>>> Anyways, thanks again for your help Haosdent!
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Jay
>>>>>>>
>>>>>>> On Wed, Oct 7, 2015 at 12:53 PM, Jay Taylor <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Haosdent,
>>>>>>>>
>>>>>>>> Can you share your Marathon POST request that results in Mesos
>>>>>>>> executing the health checks?
>>>>>>>>
>>>>>>>> Since we can reference the Marathon framework, I've been doing some
>>>>>>>> digging around.
>>>>>>>>
>>>>>>>> Here are the details of my setup and findings:
>>>>>>>>
>>>>>>>> I put a few small hacks in Marathon:
>>>>>>>>
>>>>>>>> (1) Added com.googlecode.protobuf.format to Marathon's dependencies
>>>>>>>>
>>>>>>>> (2) Edited the following files so TaskInfo is dumped as JSON to
>>>>>>>> /tmp/X in both the TaskFactory as well as right before the task is
>>>>>>>> sent to
>>>>>>>> Mesos via driver.launchTasks:
>>>>>>>>
>>>>>>>> src/main/scala/mesosphere/marathon/tasks/DefaultTaskFactory.scala:
>>>>>>>>
>>>>>>>> $ git diff
>>>>>>>>> src/main/scala/mesosphere/marathon/tasks/DefaultTaskFactory.scala
>>>>>>>>> @@ -25,6 +25,12 @@ class DefaultTaskFactory @Inject() (
>>>>>>>>>
>>>>>>>>>      new TaskBuilder(app, taskIdUtil.newTaskId,
>>>>>>>>> config).buildIfMatches(offer, runningTasks).map {
>>>>>>>>>        case (taskInfo, ports) =>
>>>>>>>>> +        import com.googlecode.protobuf.format.JsonFormat
>>>>>>>>> +        import java.io._
>>>>>>>>> +        val bw = new BufferedWriter(new FileWriter(new
>>>>>>>>> File("/tmp/taskjson1-" + taskInfo.getTaskId.getValue)))
>>>>>>>>> +        bw.write(JsonFormat.printToString(taskInfo))
>>>>>>>>> +        bw.write("\n")
>>>>>>>>> +        bw.close()
>>>>>>>>>          CreatedTask(
>>>>>>>>>            taskInfo,
>>>>>>>>>            MarathonTasks.makeTask(
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> src/main/scala/mesosphere/marathon/core/launcher/impl/TaskLauncherImpl.scala:
>>>>>>>>
>>>>>>>> $ git diff
>>>>>>>>> src/main/scala/mesosphere/marathon/core/launcher/impl/TaskLauncherImpl.scala
>>>>>>>>> @@ -24,6 +24,16 @@ private[launcher] class TaskLauncherImpl(
>>>>>>>>>    override def launchTasks(offerID: OfferID, taskInfos:
>>>>>>>>> Seq[TaskInfo]): Boolean = {
>>>>>>>>>      val launched = withDriver(s"launchTasks($offerID)") { driver
>>>>>>>>> =>
>>>>>>>>>        import scala.collection.JavaConverters._
>>>>>>>>> +      var i = 0
>>>>>>>>> +      for (i <- 0 to taskInfos.length - 1) {
>>>>>>>>> +        import com.googlecode.protobuf.format.JsonFormat
>>>>>>>>> +        import java.io._
>>>>>>>>> +        val file = new File("/tmp/taskJson2-" + i.toString() +
>>>>>>>>> "-" + taskInfos(i).getTaskId.getValue)
>>>>>>>>> +        val bw = new BufferedWriter(new FileWriter(file))
>>>>>>>>> +        bw.write(JsonFormat.printToString(taskInfos(i)))
>>>>>>>>> +        bw.write("\n")
>>>>>>>>> +        bw.close()
>>>>>>>>> +      }
>>>>>>>>>        driver.launchTasks(Collections.singleton(offerID),
>>>>>>>>> taskInfos.asJava)
>>>>>>>>>      }
>>>>>>>>
>>>>>>>>
>>>>>>>> Then I built and deployed the hacked Marathon and restarted the
>>>>>>>> marathon service.
>>>>>>>>
>>>>>>>> Next I created the app via the Marathon API ("hello app" is a
>>>>>>>> container with a simple hello-world ruby app running on
>>>>>>>> 0.0.0.0:8000):
>>>>>>>>
>>>>>>>> curl http://mesos-primary1a:8080/v2/groups -XPOST -H'Content-Type:
>>>>>>>>> application/json' -d'
>>>>>>>>> {
>>>>>>>>>   "id": "/app-81-1-hello-app",
>>>>>>>>>   "apps": [
>>>>>>>>>     {
>>>>>>>>>       "id": "/app-81-1-hello-app/web-v11",
>>>>>>>>>       "container": {
>>>>>>>>>         "type": "DOCKER",
>>>>>>>>>         "docker": {
>>>>>>>>>           "image":
>>>>>>>>> "docker-services1a:5000/gig1/app-81-1-hello-app-1444240966",
>>>>>>>>>           "network": "BRIDGE",
>>>>>>>>>           "portMappings": [
>>>>>>>>>             {
>>>>>>>>>               "containerPort": 8000,
>>>>>>>>>               "hostPort": 0,
>>>>>>>>>               "protocol": "tcp"
>>>>>>>>>             }
>>>>>>>>>           ]
>>>>>>>>>         }
>>>>>>>>>       },
>>>>>>>>>       "env": {
>>>>>>>>>
>>>>>>>>>       },
>>>>>>>>>       "healthChecks": [
>>>>>>>>>         {
>>>>>>>>>           "protocol": "COMMAND",
>>>>>>>>>           "command": {"value": "exit 1"},
>>>>>>>>>           "gracePeriodSeconds": 10,
>>>>>>>>>           "intervalSeconds": 10,
>>>>>>>>>           "timeoutSeconds": 10,
>>>>>>>>>           "maxConsecutiveFailures": 3
>>>>>>>>>         }
>>>>>>>>>       ],
>>>>>>>>>       "instances": 1,
>>>>>>>>>       "cpus": 1,
>>>>>>>>>       "mem": 512
>>>>>>>>>     }
>>>>>>>>>   ]
>>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> $ ls /tmp/
>>>>>>>>>
>>>>>>>>> taskjson1-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>>>>>
>>>>>>>>> taskJson2-0-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>>>>
>>>>>>>>
>>>>>>>> Do they match?
>>>>>>>>
>>>>>>>> $ md5sum /tmp/task*
>>>>>>>>> 1b5115997e78e2611654059249d99578
>>>>>>>>>  
>>>>>>>>> /tmp/taskjson1-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>>>>> 1b5115997e78e2611654059249d99578
>>>>>>>>>  
>>>>>>>>> /tmp/taskJson2-0-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, so I am confident this is the information being sent across
>>>>>>>> the wire to Mesos.
>>>>>>>>
>>>>>>>> Do they contain any health-check information?
>>>>>>>>
>>>>>>>> $ cat
>>>>>>>>> /tmp/taskjson1-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>>>>> {
>>>>>>>>>   "name":"web-v11.app-81-1-hello-app",
>>>>>>>>>   "task_id":{
>>>>>>>>>
>>>>>>>>> "value":"app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0"
>>>>>>>>>   },
>>>>>>>>>   "slave_id":{
>>>>>>>>>     "value":"20150924-210922-1608624320-5050-1792-S1"
>>>>>>>>>   },
>>>>>>>>>   "resources":[
>>>>>>>>>     {
>>>>>>>>>       "name":"cpus",
>>>>>>>>>       "type":"SCALAR",
>>>>>>>>>       "scalar":{
>>>>>>>>>         "value":1.0
>>>>>>>>>       },
>>>>>>>>>       "role":"*"
>>>>>>>>>     },
>>>>>>>>>     {
>>>>>>>>>       "name":"mem",
>>>>>>>>>       "type":"SCALAR",
>>>>>>>>>       "scalar":{
>>>>>>>>>         "value":512.0
>>>>>>>>>       },
>>>>>>>>>       "role":"*"
>>>>>>>>>     },
>>>>>>>>>     {
>>>>>>>>>       "name":"ports",
>>>>>>>>>       "type":"RANGES",
>>>>>>>>>       "ranges":{
>>>>>>>>>         "range":[
>>>>>>>>>           {
>>>>>>>>>             "begin":31641,
>>>>>>>>>             "end":31641
>>>>>>>>>           }
>>>>>>>>>         ]
>>>>>>>>>       },
>>>>>>>>>       "role":"*"
>>>>>>>>>     }
>>>>>>>>>   ],
>>>>>>>>>   "command":{
>>>>>>>>>     "environment":{
>>>>>>>>>       "variables":[
>>>>>>>>>         {
>>>>>>>>>           "name":"PORT_8000",
>>>>>>>>>           "value":"31641"
>>>>>>>>>         },
>>>>>>>>>         {
>>>>>>>>>           "name":"MARATHON_APP_VERSION",
>>>>>>>>>           "value":"2015-10-07T19:35:08.386Z"
>>>>>>>>>         },
>>>>>>>>>         {
>>>>>>>>>           "name":"HOST",
>>>>>>>>>           "value":"mesos-worker1a"
>>>>>>>>>         },
>>>>>>>>>         {
>>>>>>>>>           "name":"MARATHON_APP_DOCKER_IMAGE",
>>>>>>>>>
>>>>>>>>> "value":"docker-services1a:5000/gig1/app-81-1-hello-app-1444240966"
>>>>>>>>>         },
>>>>>>>>>         {
>>>>>>>>>           "name":"MESOS_TASK_ID",
>>>>>>>>>
>>>>>>>>> "value":"app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0"
>>>>>>>>>         },
>>>>>>>>>         {
>>>>>>>>>           "name":"PORT",
>>>>>>>>>           "value":"31641"
>>>>>>>>>         },
>>>>>>>>>         {
>>>>>>>>>           "name":"PORTS",
>>>>>>>>>           "value":"31641"
>>>>>>>>>         },
>>>>>>>>>         {
>>>>>>>>>           "name":"MARATHON_APP_ID",
>>>>>>>>>           "value":"/app-81-1-hello-app/web-v11"
>>>>>>>>>         },
>>>>>>>>>         {
>>>>>>>>>           "name":"PORT0",
>>>>>>>>>           "value":"31641"
>>>>>>>>>         }
>>>>>>>>>       ]
>>>>>>>>>     },
>>>>>>>>>     "shell":false
>>>>>>>>>   },
>>>>>>>>>   "container":{
>>>>>>>>>     "type":"DOCKER",
>>>>>>>>>     "docker":{
>>>>>>>>>
>>>>>>>>> "image":"docker-services1a:5000/gig1/app-81-1-hello-app-1444240966",
>>>>>>>>>       "network":"BRIDGE",
>>>>>>>>>       "port_mappings":[
>>>>>>>>>         {
>>>>>>>>>           "host_port":31641,
>>>>>>>>>           "container_port":8000,
>>>>>>>>>           "protocol":"tcp"
>>>>>>>>>         }
>>>>>>>>>       ],
>>>>>>>>>       "privileged":false,
>>>>>>>>>       "force_pull_image":false
>>>>>>>>>     }
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> No, I don't see anything about any health check.
>>>>>>>>
>>>>>>>> Mesos STDOUT for the launched task:
>>>>>>>>
>>>>>>>> --container="mesos-20150924-210922-1608624320-5050-1792-S1.14335f1f-3774-4862-a55b-e9c76cd0f2da"
>>>>>>>>> --docker="docker" --help="false" --initialize_driver_logging="true"
>>>>>>>>> --logbufsecs="0" --logging_level="INFO"
>>>>>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
>>>>>>>>> --sandbox_directory="/tmp/mesos/slaves/20150924-210922-1608624320-5050-1792-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0/runs/14335f1f-3774-4862-a55b-e9c76cd0f2da"
>>>>>>>>> --stop_timeout="0ns"
>>>>>>>>> --container="mesos-20150924-210922-1608624320-5050-1792-S1.14335f1f-3774-4862-a55b-e9c76cd0f2da"
>>>>>>>>> --docker="docker" --help="false" --initialize_driver_logging="true"
>>>>>>>>> --logbufsecs="0" --logging_level="INFO"
>>>>>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false"
>>>>>>>>> --sandbox_directory="/tmp/mesos/slaves/20150924-210922-1608624320-5050-1792-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0/runs/14335f1f-3774-4862-a55b-e9c76cd0f2da"
>>>>>>>>> --stop_timeout="0ns"
>>>>>>>>> Registered docker executor on mesos-worker1a
>>>>>>>>> Starting task
>>>>>>>>> app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>>>>
>>>>>>>>
>>>>>>>> And STDERR:
>>>>>>>>
>>>>>>>> I1007 19:35:08.790743  4612 exec.cpp:134] Version: 0.24.0
>>>>>>>>> I1007 19:35:08.793416  4619 exec.cpp:208] Executor registered on
>>>>>>>>> slave 20150924-210922-1608624320-5050-1792-S1
>>>>>>>>> WARNING: Your kernel does not support swap limit capabilities,
>>>>>>>>> memory limited without swap.
>>>>>>>>
>>>>>>>>
>>>>>>>> Again, nothing about any health checks.
>>>>>>>>
>>>>>>>> Any ideas for other things to try, or what I could be missing?  Can't
>>>>>>>> say either way about the Mesos health-check system working or not if
>>>>>>>> Marathon won't put the health-check into the task it sends to Mesos.
>>>>>>>>
>>>>>>>> Thanks for all your help!
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Jay
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 6, 2015 at 11:24 PM, haosdent <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Maybe you could post your executor stdout/stderr so that we can
>>>>>>>>> tell whether the health check is running or not.
>>>>>>>>>
>>>>>>>>> On Wed, Oct 7, 2015 at 2:15 PM, haosdent <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Marathon also uses Mesos health checks. When I use a health check,
>>>>>>>>>> I can see a log like this in the executor stdout.
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> Registered docker executor on xxxxx
>>>>>>>>>> Starting task
>>>>>>>>>> test-health-check.822a5fd2-6cba-11e5-b5ce-0a0027000000
>>>>>>>>>> Launching health check process:
>>>>>>>>>> /home/haosdent/mesos/build/src/.libs/mesos-health-check 
>>>>>>>>>> --executor=xxxx
>>>>>>>>>> Health check process launched at pid: 9895
>>>>>>>>>> Received task health update, healthy: true
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> On Wed, Oct 7, 2015 at 12:51 PM, Jay Taylor <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I am using my own framework, and the full task info I'm using is
>>>>>>>>>>> posted earlier in this thread.  Do you happen to know if Marathon 
>>>>>>>>>>> uses
>>>>>>>>>>> Mesos's health checks for its health check system?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Oct 6, 2015, at 9:01 PM, haosdent <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Yes, the health check is launched through its definition in
>>>>>>>>>>> TaskInfo. Do you launch your task through Marathon? I could test it
>>>>>>>>>>> on my side.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 7, 2015 at 11:56 AM, Jay Taylor <[email protected]
>>>>>>>>>>> > wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Precisely, and there are none of those statements.  Are you or
>>>>>>>>>>>> others confident health-checks are part of the code path when 
>>>>>>>>>>>> defined via
>>>>>>>>>>>> task info for docker container tasks?  Going through the code, I 
>>>>>>>>>>>> wasn't
>>>>>>>>>>>> able to find the linkage for anything other than health-checks 
>>>>>>>>>>>> triggered
>>>>>>>>>>>> through a custom executor.
>>>>>>>>>>>>
>>>>>>>>>>>> With that being said it is a pretty good sized code base and
>>>>>>>>>>>> I'm not very familiar with it, so my analysis this far has by no 
>>>>>>>>>>>> means been
>>>>>>>>>>>> exhaustive.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Oct 6, 2015, at 8:41 PM, haosdent <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> When the health check launches, it produces a log line like this
>>>>>>>>>>>> in your executor stdout:
>>>>>>>>>>>> ```
>>>>>>>>>>>> Health check process launched at pid xxx
>>>>>>>>>>>> ```
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Oct 7, 2015 at 11:37 AM, Jay Taylor <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm happy to try this; however, wouldn't there be output in the
>>>>>>>>>>>>> logs with the string "health" or "Health" if the health-check were active?
>>>>>>>>>>>>> None of my master or slave logs contain the string.
>>>>>>>>>>>>> None of my master or slave logs contain the string..
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Oct 6, 2015, at 7:45 PM, haosdent <[email protected]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Could you use "exit 1" instead of "sleep 5" to see whether you
>>>>>>>>>>>>> can see an unhealthy status in your task stdout/stderr?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Oct 7, 2015 at 10:38 AM, Jay Taylor <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> My current version is 0.24.1.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Oct 6, 2015 at 7:30 PM, haosdent <[email protected]>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, Adam also helped backport it to 0.23.1 and 0.24.1:
>>>>>>>>>>>>>>> https://github.com/apache/mesos/commit/8c0ed92de3925d4312429bfba01b9b1ccbcbbef0
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://github.com/apache/mesos/commit/09e367cd69aa39c156c9326d44f4a7b829ba3db7
>>>>>>>>>>>>>>> Are you using one of these versions?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Oct 7, 2015 at 10:26 AM, haosdent <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I remember 0.23.1 and 0.24.1 contain this backport; let me
>>>>>>>>>>>>>>>> double-check.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Oct 7, 2015 at 10:01 AM, Jay Taylor <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Oops, now I see you already said it's in master.  I'll
>>>>>>>>>>>>>>>>> look there :)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks again!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Oct 6, 2015 at 6:59 PM, Jay Taylor <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Great, thanks for the quick reply Tim!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Do you know if there is a branch I can checkout to test
>>>>>>>>>>>>>>>>>> it out?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Oct 6, 2015 at 6:54 PM, Timothy Chen <
>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi Jay,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> We just added health check support for docker tasks;
>>>>>>>>>>>>>>>>>>> it's in master but not yet released. It will run docker
>>>>>>>>>>>>>>>>>>> exec with the command you provided as the health check.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It should be in the next release.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Tim
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
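[Editor's note: to make Tim's description concrete, for a docker task the health command ends up being executed inside the container via `docker exec`. The snippet below only prints the shape of that invocation; the container name and health command are made-up placeholders for illustration, not output from this cluster.]

```shell
# Hypothetical values for illustration; a real container name looks
# roughly like mesos-<slave-id>.<container-uuid>.
container="mesos-20151012-example-S0.abcd1234"
health_cmd="exit 0"

# Rough shape of the invocation the docker executor performs. A zero
# exit status counts as healthy; non-zero counts toward
# consecutive_failures.
echo "docker exec $container sh -c '$health_cmd'"
```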
>>>>>>>>>>>>>>>>>>> On Oct 6, 2015, at 6:49 PM, Jay Taylor <
>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Does Mesos support health checks for docker image
>>>>>>>>>>>>>>>>>>> tasks?  Mesos seems to be ignoring the TaskInfo.HealthCheck 
>>>>>>>>>>>>>>>>>>> field for me.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Example TaskInfo JSON received back from Mesos:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>   "name":"hello-app.web.v3",
>>>>>>>>>>>>>>>>>>>   "task_id":{
>>>>>>>>>>>>>>>>>>>     "value":"hello-app_web-v3.fc05a1a5-1e06-4e61-9879-be0d97cd3eec"
>>>>>>>>>>>>>>>>>>>   },
>>>>>>>>>>>>>>>>>>>   "slave_id":{
>>>>>>>>>>>>>>>>>>>     "value":"20150924-210922-1608624320-5050-1792-S1"
>>>>>>>>>>>>>>>>>>>   },
>>>>>>>>>>>>>>>>>>>   "resources":[
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>       "name":"cpus",
>>>>>>>>>>>>>>>>>>>       "type":0,
>>>>>>>>>>>>>>>>>>>       "scalar":{
>>>>>>>>>>>>>>>>>>>         "value":0.1
>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>     },
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>       "name":"mem",
>>>>>>>>>>>>>>>>>>>       "type":0,
>>>>>>>>>>>>>>>>>>>       "scalar":{
>>>>>>>>>>>>>>>>>>>         "value":256
>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>     },
>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>       "name":"ports",
>>>>>>>>>>>>>>>>>>>       "type":1,
>>>>>>>>>>>>>>>>>>>       "ranges":{
>>>>>>>>>>>>>>>>>>>         "range":[
>>>>>>>>>>>>>>>>>>>           {
>>>>>>>>>>>>>>>>>>>             "begin":31002,
>>>>>>>>>>>>>>>>>>>             "end":31002
>>>>>>>>>>>>>>>>>>>           }
>>>>>>>>>>>>>>>>>>>         ]
>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>   ],
>>>>>>>>>>>>>>>>>>>   "command":{
>>>>>>>>>>>>>>>>>>>     "container":{
>>>>>>>>>>>>>>>>>>>       "image":"docker-services1a:5000/test/app-81-1-hello-app-103"
>>>>>>>>>>>>>>>>>>>     },
>>>>>>>>>>>>>>>>>>>     "shell":false
>>>>>>>>>>>>>>>>>>>   },
>>>>>>>>>>>>>>>>>>>   "container":{
>>>>>>>>>>>>>>>>>>>     "type":1,
>>>>>>>>>>>>>>>>>>>     "docker":{
>>>>>>>>>>>>>>>>>>>       "image":"docker-services1a:5000/gig1/app-81-1-hello-app-103",
>>>>>>>>>>>>>>>>>>>       "network":2,
>>>>>>>>>>>>>>>>>>>       "port_mappings":[
>>>>>>>>>>>>>>>>>>>         {
>>>>>>>>>>>>>>>>>>>           "host_port":31002,
>>>>>>>>>>>>>>>>>>>           "container_port":8000,
>>>>>>>>>>>>>>>>>>>           "protocol":"tcp"
>>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>>>       ],
>>>>>>>>>>>>>>>>>>>       "privileged":false,
>>>>>>>>>>>>>>>>>>>       "parameters":[],
>>>>>>>>>>>>>>>>>>>       "force_pull_image":false
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>   },
>>>>>>>>>>>>>>>>>>>   "health_check":{
>>>>>>>>>>>>>>>>>>>     "delay_seconds":5,
>>>>>>>>>>>>>>>>>>>     "interval_seconds":10,
>>>>>>>>>>>>>>>>>>>     "timeout_seconds":10,
>>>>>>>>>>>>>>>>>>>     "consecutive_failures":3,
>>>>>>>>>>>>>>>>>>>     "grace_period_seconds":0,
>>>>>>>>>>>>>>>>>>>     "command":{
>>>>>>>>>>>>>>>>>>>       "shell":true,
>>>>>>>>>>>>>>>>>>>       "value":"sleep 5",
>>>>>>>>>>>>>>>>>>>       "user":"root"
>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have searched all machines and containers to see if
>>>>>>>>>>>>>>>>>>> they ever run the command (in this case `sleep 5`), but 
>>>>>>>>>>>>>>>>>>> have not found any
>>>>>>>>>>>>>>>>>>> indication that it is being executed.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In the mesos src code the health-checks are invoked from
>>>>>>>>>>>>>>>>>>> src/launcher/executor.cpp 
>>>>>>>>>>>>>>>>>>> CommandExecutorProcess::launchTask.  Does this
>>>>>>>>>>>>>>>>>>> mean that health-checks are only supported for custom 
>>>>>>>>>>>>>>>>>>> executors and not for
>>>>>>>>>>>>>>>>>>> docker tasks?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> What I am trying to accomplish is to have the 0/non-zero
>>>>>>>>>>>>>>>>>>> exit-status of a health-check command translate to task 
>>>>>>>>>>>>>>>>>>> health.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>> Jay
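[Editor's note: the exit-status convention Jay is after can be sketched in a few lines of shell. The `check` helper below is purely illustrative, not Mesos code: the health command runs, and only its exit status decides healthy vs. unhealthy.]

```shell
# Illustrative helper, not Mesos code: run a health command and map its
# exit status to a health verdict. Mesos applies the same 0/non-zero
# rule, with consecutive_failures governing when the task is killed.
check() {
    if sh -c "$1" > /dev/null 2>&1; then
        echo HEALTHY
    else
        echo UNHEALTHY
    fi
}

check "exit 0"    # prints HEALTHY
check "exit 1"    # prints UNHEALTHY
```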
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>> Haosdent Huang