It's definitely not overridden in any of my other scripts.  Like I said 
earlier, I've never touched it except for the first time today.



> On Oct 8, 2015, at 7:52 PM, haosdent <[email protected]> wrote:
> 
> As far as I know, MESOS_LAUNCHER_DIR works by setting flags.launcher_dir, which 
> is the directory searched for mesos-docker-executor and mesos-health-check. 
> Although the env var is not propagated, MESOS_LAUNCHER_DIR still works 
> because flags.launcher_dir is read from it.
> 
> For example, I ran
> ```
> export MESOS_LAUNCHER_DIR=/tmp
> ```
> before starting mesos-slave. When I then launched the slave, I found this line 
> in the slave log:
> ```
> I1009 10:27:26.594599  1416 slave.cpp:203] Flags at startup: xxxxx  
> --launcher_dir="/tmp"
> ```
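
To illustrate the naming convention described above (a flag such as launcher_dir being settable through a MESOS_-prefixed environment variable), here is a minimal sketch. The helper name envNameForFlag is hypothetical and is not part of Mesos; it only shows the assumed "MESOS_" + upper-cased flag name mapping.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not a Mesos API): derive the environment-variable
// name for a flag, assuming the "MESOS_" prefix plus upper-cased flag name.
std::string envNameForFlag(const std::string& flag) {
  assert(!flag.empty());
  std::string name = "MESOS_";
  for (unsigned char c : flag) {
    // Letters are upper-cased; characters such as '_' pass through unchanged.
    name += static_cast<char>(std::toupper(c));
  }
  return name;
}
```

Under that assumption, envNameForFlag("launcher_dir") yields the variable name used in the export line above.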
> 
> From your log, I'm not sure why your MESOS_LAUNCHER_DIR became the sandbox dir. 
> Is it because MESOS_LAUNCHER_DIR is overridden in your other scripts?
> 
> 
>> On Fri, Oct 9, 2015 at 1:56 AM, Jay Taylor <[email protected]> wrote:
>> I haven't ever changed MESOS_LAUNCHER_DIR/--launcher_dir before.
>> 
>> I just tried setting both the env var and the flag on the slaves, and have 
>> determined that the env var is not present when it is checked in 
>> src/docker/executor.cpp @ line 573:
>> 
>>>  const Option<string> envPath = os::getenv("MESOS_LAUNCHER_DIR");
>>>   string path =
>>>     envPath.isSome() ? envPath.get()
>>>                      : os::realpath(Path(argv[0]).dirname()).get();
>>>   cout << "MESOS_LAUNCHER_DIR: envpath.isSome()->" << (envPath.isSome() ? 
>>> "yes" : "no") << endl;
>>>   cout << "MESOS_LAUNCHER_DIR: path='" << path << "'" << endl;
>> 
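
The fallback in the snippet above can be reproduced in isolation. This is a simplified sketch, not the Mesos implementation: resolveLauncherDir and dirnameOf are hypothetical names, POSIX dirname() stands in for Path::dirname(), and the os::realpath step is left out.

```cpp
#include <cassert>
#include <libgen.h>
#include <string>
#include <vector>

// POSIX dirname() may modify its argument, so operate on a private copy.
std::string dirnameOf(const std::string& path) {
  std::vector<char> buf(path.begin(), path.end());
  buf.push_back('\0');
  return std::string(::dirname(buf.data()));
}

// Hypothetical stand-in for the lookup quoted above: prefer the env value,
// otherwise fall back to the directory component of argv[0].
std::string resolveLauncherDir(const char* envValue, const std::string& argv0) {
  assert(!argv0.empty());
  if (envValue != nullptr) {
    return envValue;
  }
  return dirnameOf(argv0);
}
```

With no env value and argv[0] equal to the bare name "mesos-docker-executor", the directory component is ".", which a realpath call would then resolve against the current working directory. If the executor's working directory is the sandbox (an assumption, not something confirmed in this thread), that would be consistent with a sandbox path being reported.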
>> 
>> Exported MESOS_LAUNCHER_DIR env var (and verified it is correctly propagated 
>> along up to the point of mesos-slave launch):
>> 
>>> $ cat /etc/default/mesos-slave
>>> export MESOS_MASTER="zk://mesos-primary1a:2181,mesos-primary2a:2181,mesos-primary3a:2181/mesos"
>>> export MESOS_CONTAINERIZERS="mesos,docker"
>>> export MESOS_EXECUTOR_REGISTRATION_TIMEOUT="5mins"
>>> export MESOS_PORT="5050"
>>> export MESOS_LAUNCHER_DIR="/usr/libexec/mesos"
>> 
>> TASK OUTPUT:
>> 
>>> MESOS_LAUNCHER_DIR: envpath.isSome()->no
>>> MESOS_LAUNCHER_DIR: 
>>> path='/tmp/mesos/slaves/61373c0e-7349-4173-ab8d-9d7b260e8a30-S1/frameworks/20150924-210922-1608624320-5050-1792-0020/executors/hello-app_web-v3.22f9c7e4-2109-48a9-998e-e116141ec253/runs/41f8eed6-ec6c-4e6f-b1aa-0a2817a600ad'
>>> Registered docker executor on mesos-worker2a
>>> Starting task hello-app_web-v3.22f9c7e4-2109-48a9-998e-e116141ec253
>>> Launching health check process: 
>>> /tmp/mesos/slaves/61373c0e-7349-4173-ab8d-9d7b260e8a30-S1/frameworks/20150924-210922-1608624320-5050-1792-0020/executors/hello-app_web-v3.22f9c7e4-2109-48a9-998e-e116141ec253/runs/41f8eed6-ec6c-4e6f-b1aa-0a2817a600ad/mesos-health-check
>>>  --executor=(1)@192.168.225.59:44523 
>>> --health_check_json={"command":{"shell":true,"value":"docker exec 
>>> mesos-61373c0e-7349-4173-ab8d-9d7b260e8a30-S1.41f8eed6-ec6c-4e6f-b1aa-0a2817a600ad
>>>  sh -c \" \/bin\/bash 
>>> \""},"consecutive_failures":3,"delay_seconds":5.0,"grace_period_seconds":10.0,"interval_seconds":10.0,"timeout_seconds":10.0}
>>>  --task_id=hello-app_web-v3.22f9c7e4-2109-48a9-998e-e116141ec253
>>> Health check process launched at pid: 2519
>> 
>> 
>> The env var is not propagated when the docker executor is launched in 
>> src/slave/containerizer/docker.cpp around line 903:
>> 
>>>   vector<string> argv;
>>>   argv.push_back("mesos-docker-executor");
>>>   // Construct the mesos-docker-executor using the "name" we gave the
>>>   // container (to distinguish it from Docker containers not created
>>>   // by Mesos).
>>>   Try<Subprocess> s = subprocess(
>>>       path::join(flags.launcher_dir, "mesos-docker-executor"),
>>>       argv,
>>>       Subprocess::PIPE(),
>>>       Subprocess::PATH(path::join(container->directory, "stdout")),
>>>       Subprocess::PATH(path::join(container->directory, "stderr")),
>>>       dockerFlags(flags, container->name(), container->directory),
>>>       environment,
>>>       lambda::bind(&setup, container->directory));
>> 
>> 
>> A little way above, we can see that the environment is set up with the env 
>> vars defined by the container task.
>> 
>> See src/slave/containerizer/docker.cpp around line 871:
>> 
>>>   // Include any environment variables from ExecutorInfo.
>>>   foreach (const Environment::Variable& variable,
>>>            container->executor.command().environment().variables()) {
>>>     environment[variable.name()] = variable.value();
>>>   }
>> 
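
A self-contained sketch of the behaviour described above: when the child environment is built only from the task-defined variables, a variable that exists only in the parent process never reaches the child. The names Env, buildChildEnv and withLauncherDir are hypothetical, and the second function is just one conceivable workaround, not the actual Mesos fix.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Env = std::map<std::string, std::string>;

// Build a child environment from task-defined variables only, in the
// spirit of the loop quoted above. Nothing from the parent is copied.
Env buildChildEnv(
    const std::vector<std::pair<std::string, std::string>>& taskVars) {
  Env env;
  for (const auto& variable : taskVars) {
    env[variable.first] = variable.second;
  }
  return env;
}

// Hypothetical workaround: forward the launcher dir explicitly.
Env withLauncherDir(Env env, const std::string& launcherDir) {
  assert(!launcherDir.empty());
  env["MESOS_LAUNCHER_DIR"] = launcherDir;
  return env;
}
```

In this sketch the map returned by buildChildEnv has no MESOS_LAUNCHER_DIR entry unless the task itself defines one, which matches the "envpath.isSome()->no" line in the task output.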
>> 
>> Should I file a JIRA for this?  Have I overlooked anything?
>> 
>> 
>>> On Wed, Oct 7, 2015 at 8:11 PM, haosdent <[email protected]> wrote:
>>> >Not sure what was going on with health-checks in 0.24.0.
>>> 0.24.1 should work.
>>> 
>>> >Do any of you know which host the path 
>>> >"/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check"
>>> > should exist on? It definitely doesn't exist on the slave, hence 
>>> >execution failing.
>>> 
>>> Did you set MESOS_LAUNCHER_DIR/--launcher_dir incorrectly before? We get 
>>> mesos-health-check from MESOS_LAUNCHER_DIR/--launcher_dir, or otherwise use 
>>> the same dir as mesos-docker-executor. 
>>> 
>>>> On Thu, Oct 8, 2015 at 10:46 AM, Jay Taylor <[email protected]> wrote:
>>>> Maybe I spoke too soon.
>>>> 
>>>> Now the checks are attempting to run, however the STDERR is not looking 
>>>> good.  I've added some debugging to the error message output to show the 
>>>> path, argv, and envp variables:
>>>> 
>>>> STDOUT:
>>>> 
>>>>> --container="mesos-16b49e90-6852-4c91-8e70-d89c54f25668-S1.73dbfe88-1dbb-4f61-9a52-c365558cdbfc"
>>>>>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
>>>>> --initialize_driver_logging="true" --logbufsecs="0" 
>>>>> --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" 
>>>>> --quiet="false" 
>>>>> --sandbox_directory="/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc"
>>>>>  --stop_timeout="0ns"
>>>>> --container="mesos-16b49e90-6852-4c91-8e70-d89c54f25668-S1.73dbfe88-1dbb-4f61-9a52-c365558cdbfc"
>>>>>  --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" 
>>>>> --initialize_driver_logging="true" --logbufsecs="0" 
>>>>> --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" 
>>>>> --quiet="false" 
>>>>> --sandbox_directory="/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc"
>>>>>  --stop_timeout="0ns"
>>>>> Registered docker executor on mesos-worker2a
>>>>> Starting task 
>>>>> app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0
>>>>> Launching health check process: 
>>>>> /tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check
>>>>>  --executor=(1)@192.168.225.59:43917 
>>>>> --health_check_json={"command":{"shell":true,"value":"docker exec 
>>>>> mesos-16b49e90-6852-4c91-8e70-d89c54f25668-S1.73dbfe88-1dbb-4f61-9a52-c365558cdbfc
>>>>>  sh -c \" exit 1 
>>>>> \""},"consecutive_failures":3,"delay_seconds":0.0,"grace_period_seconds":10.0,"interval_seconds":10.0,"timeout_seconds":10.0}
>>>>>  --task_id=app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0
>>>>> Health check process launched at pid: 3012
>>>> 
>>>> 
>>>> STDERR:
>>>> 
>>>>> I1008 02:17:28.870434 2770 exec.cpp:134] Version: 0.26.0
>>>>> I1008 02:17:28.871860 2778 exec.cpp:208] Executor registered on slave 
>>>>> 16b49e90-6852-4c91-8e70-d89c54f25668-S1
>>>>> WARNING: Your kernel does not support swap limit capabilities, memory 
>>>>> limited without swap.
>>>>> ABORT: (src/subprocess.cpp:180): Failed to os::execvpe in childMain 
>>>>> (path.c_str()='/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check',
>>>>>  
>>>>> argv='/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check',
>>>>>  envp=''): No such file or directory*** Aborted at 1444270649 (unix time) 
>>>>> try "date -d @1444270649" if you are using GNU date ***
>>>>> PC: @ 0x7f4a37ec6cc9 (unknown)
>>>>> *** SIGABRT (@0xbc4) received by PID 3012 (TID 0x7f4a2f9f6700) from PID 
>>>>> 3012; stack trace: ***
>>>>> @ 0x7f4a38265340 (unknown)
>>>>> @ 0x7f4a37ec6cc9 (unknown)
>>>>> @ 0x7f4a37eca0d8 (unknown)
>>>>> @ 0x4191e2 _Abort()
>>>>> @ 0x41921c _Abort()
>>>>> @ 0x7f4a39dc2768 process::childMain()
>>>>> @ 0x7f4a39dc4f59 std::_Function_handler<>::_M_invoke()
>>>>> @ 0x7f4a39dc24fc process::defaultClone()
>>>>> @ 0x7f4a39dc34fb process::subprocess()
>>>>> @ 0x43cc9c 
>>>>> mesos::internal::docker::DockerExecutorProcess::launchHealthCheck()
>>>>> @ 0x7f4a39d924f4 process::ProcessManager::resume()
>>>>> @ 0x7f4a39d92827 
>>>>> _ZNSt6thread5_ImplISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt11atomic_boolE_St17reference_wrapperIS6_EEEvEEE6_M_runEv
>>>>> @ 0x7f4a38a47e40 (unknown)
>>>>> @ 0x7f4a3825d182 start_thread
>>>>> @ 0x7f4a37f8a47d (unknown)
>>>> 
>>>> Do any of you know which host the path 
>>>> "/tmp/mesos/slaves/16b49e90-6852-4c91-8e70-d89c54f25668-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.b89f2ceb-6d62-11e5-9827-080027477de0/runs/73dbfe88-1dbb-4f61-9a52-c365558cdbfc/mesos-health-check"
>>>>  should exist on? It definitely doesn't exist on the slave, hence 
>>>> execution failing.
>>>> 
>>>> This is with current master, git hash 
>>>> 5058fac1083dc91bca54d33c26c810c17ad95dd1.
>>>> 
>>>>> commit 5058fac1083dc91bca54d33c26c810c17ad95dd1
>>>>> Author: Anand Mazumdar <[email protected]>
>>>>> Date:   Tue Oct 6 17:37:41 2015 -0700
>>>> 
>>>> 
>>>> -Jay
>>>> 
>>>>> On Wed, Oct 7, 2015 at 5:23 PM, Jay Taylor <[email protected]> wrote:
>>>>> Update:
>>>>> 
>>>>> I used https://github.com/deric/mesos-deb-packaging to compile and 
>>>>> package the latest master (0.26.x) and deployed it to the cluster, and 
>>>>> now health checks are working as advertised in both Marathon and my own 
>>>>> framework!  Not sure what was going on with health-checks in 0.24.0.
>>>>> 
>>>>> Anyways, thanks again for your help Haosdent!
>>>>> 
>>>>> Cheers,
>>>>> Jay
>>>>> 
>>>>>> On Wed, Oct 7, 2015 at 12:53 PM, Jay Taylor <[email protected]> wrote:
>>>>>> Hi Haosdent,
>>>>>> 
>>>>>> Can you share your Marathon POST request that results in Mesos executing 
>>>>>> the health checks?
>>>>>> 
>>>>>> Since we can reference the Marathon framework, I've been doing some 
>>>>>> digging around.
>>>>>> 
>>>>>> Here are the details of my setup and findings:
>>>>>> 
>>>>>> I put a few small hacks in Marathon:
>>>>>> 
>>>>>> (1) Added com.googlecode.protobuf.format to Marathon's dependencies
>>>>>> 
>>>>>> (2) Edited the following files so TaskInfo is dumped as JSON to /tmp/X, 
>>>>>> both in the TaskFactory and right before the task is sent to 
>>>>>> Mesos via driver.launchTasks:
>>>>>> 
>>>>>> src/main/scala/mesosphere/marathon/tasks/DefaultTaskFactory.scala:
>>>>>> 
>>>>>>> $ git diff 
>>>>>>> src/main/scala/mesosphere/marathon/tasks/DefaultTaskFactory.scala
>>>>>>> @@ -25,6 +25,12 @@ class DefaultTaskFactory @Inject() (
>>>>>>> 
>>>>>>>      new TaskBuilder(app, taskIdUtil.newTaskId, 
>>>>>>> config).buildIfMatches(offer, runningTasks).map {
>>>>>>>        case (taskInfo, ports) =>
>>>>>>> +        import com.googlecode.protobuf.format.JsonFormat
>>>>>>> +        import java.io._
>>>>>>> +        val bw = new BufferedWriter(new FileWriter(new 
>>>>>>> File("/tmp/taskjson1-" + taskInfo.getTaskId.getValue)))
>>>>>>> +        bw.write(JsonFormat.printToString(taskInfo))
>>>>>>> +        bw.write("\n")
>>>>>>> +        bw.close()
>>>>>>>          CreatedTask(
>>>>>>>            taskInfo,
>>>>>>>            MarathonTasks.makeTask(
>>>>>> 
>>>>>> src/main/scala/mesosphere/marathon/core/launcher/impl/TaskLauncherImpl.scala:
>>>>>> 
>>>>>>> $ git diff 
>>>>>>> src/main/scala/mesosphere/marathon/core/launcher/impl/TaskLauncherImpl.scala
>>>>>>> @@ -24,6 +24,16 @@ private[launcher] class TaskLauncherImpl(
>>>>>>>    override def launchTasks(offerID: OfferID, taskInfos: 
>>>>>>> Seq[TaskInfo]): Boolean = {
>>>>>>>      val launched = withDriver(s"launchTasks($offerID)") { driver =>
>>>>>>>        import scala.collection.JavaConverters._
>>>>>>> +      var i = 0
>>>>>>> +      for (i <- 0 to taskInfos.length - 1) {
>>>>>>> +        import com.googlecode.protobuf.format.JsonFormat
>>>>>>> +        import java.io._
>>>>>>> +        val file = new File("/tmp/taskJson2-" + i.toString() + "-" + 
>>>>>>> taskInfos(i).getTaskId.getValue)
>>>>>>> +        val bw = new BufferedWriter(new FileWriter(file))
>>>>>>> +        bw.write(JsonFormat.printToString(taskInfos(i)))
>>>>>>> +        bw.write("\n")
>>>>>>> +        bw.close()
>>>>>>> +      }
>>>>>>>        driver.launchTasks(Collections.singleton(offerID), 
>>>>>>> taskInfos.asJava)
>>>>>>>      }
>>>>>> 
>>>>>> 
>>>>>> Then I built and deployed the hacked Marathon and restarted the marathon 
>>>>>> service.
>>>>>> 
>>>>>> Next I created the app via the Marathon API ("hello app" is a container 
>>>>>> with a simple hello-world ruby app running on 0.0.0.0:8000)
>>>>>> 
>>>>>>> curl http://mesos-primary1a:8080/v2/groups -XPOST -H'Content-Type: application/json' -d'
>>>>>>> {
>>>>>>>   "id": "/app-81-1-hello-app",
>>>>>>>   "apps": [
>>>>>>>     {
>>>>>>>       "id": "/app-81-1-hello-app/web-v11",
>>>>>>>       "container": {
>>>>>>>         "type": "DOCKER",
>>>>>>>         "docker": {
>>>>>>>           "image": 
>>>>>>> "docker-services1a:5000/gig1/app-81-1-hello-app-1444240966",
>>>>>>>           "network": "BRIDGE",
>>>>>>>           "portMappings": [
>>>>>>>             {
>>>>>>>               "containerPort": 8000,
>>>>>>>               "hostPort": 0,
>>>>>>>               "protocol": "tcp"
>>>>>>>             }
>>>>>>>           ]
>>>>>>>         }
>>>>>>>       },
>>>>>>>       "env": {
>>>>>>>         
>>>>>>>       },
>>>>>>>       "healthChecks": [
>>>>>>>         {
>>>>>>>           "protocol": "COMMAND",
>>>>>>>           "command": {"value": "exit 1"},
>>>>>>>           "gracePeriodSeconds": 10,
>>>>>>>           "intervalSeconds": 10,
>>>>>>>           "timeoutSeconds": 10,
>>>>>>>           "maxConsecutiveFailures": 3
>>>>>>>         }
>>>>>>>       ],
>>>>>>>       "instances": 1,
>>>>>>>       "cpus": 1,
>>>>>>>       "mem": 512
>>>>>>>     }
>>>>>>>   ]
>>>>>>> }'
>>>>>> 
>>>>>> 
>>>>>>> $ ls /tmp/
>>>>>>> taskjson1-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>>> taskJson2-0-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>> 
>>>>>> Do they match?
>>>>>> 
>>>>>>> $ md5sum /tmp/task*
>>>>>>> 1b5115997e78e2611654059249d99578  
>>>>>>> /tmp/taskjson1-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>>> 1b5115997e78e2611654059249d99578  
>>>>>>> /tmp/taskJson2-0-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>> 
>>>>>> Yes, so I am confident this is the information being sent across the 
>>>>>> wire to Mesos.
>>>>>> 
>>>>>> Do they contain any health-check information?
>>>>>> 
>>>>>>> $ cat 
>>>>>>> /tmp/taskjson1-app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>>> {
>>>>>>>   "name":"web-v11.app-81-1-hello-app",
>>>>>>>   "task_id":{
>>>>>>>     
>>>>>>> "value":"app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0"
>>>>>>>   },
>>>>>>>   "slave_id":{
>>>>>>>     "value":"20150924-210922-1608624320-5050-1792-S1"
>>>>>>>   },
>>>>>>>   "resources":[
>>>>>>>     {
>>>>>>>       "name":"cpus",
>>>>>>>       "type":"SCALAR",
>>>>>>>       "scalar":{
>>>>>>>         "value":1.0
>>>>>>>       },
>>>>>>>       "role":"*"
>>>>>>>     },
>>>>>>>     {
>>>>>>>       "name":"mem",
>>>>>>>       "type":"SCALAR",
>>>>>>>       "scalar":{
>>>>>>>         "value":512.0
>>>>>>>       },
>>>>>>>       "role":"*"
>>>>>>>     },
>>>>>>>     {
>>>>>>>       "name":"ports",
>>>>>>>       "type":"RANGES",
>>>>>>>       "ranges":{
>>>>>>>         "range":[
>>>>>>>           {
>>>>>>>             "begin":31641,
>>>>>>>             "end":31641
>>>>>>>           }
>>>>>>>         ]
>>>>>>>       },
>>>>>>>       "role":"*"
>>>>>>>     }
>>>>>>>   ],
>>>>>>>   "command":{
>>>>>>>     "environment":{
>>>>>>>       "variables":[
>>>>>>>         {
>>>>>>>           "name":"PORT_8000",
>>>>>>>           "value":"31641"
>>>>>>>         },
>>>>>>>         {
>>>>>>>           "name":"MARATHON_APP_VERSION",
>>>>>>>           "value":"2015-10-07T19:35:08.386Z"
>>>>>>>         },
>>>>>>>         {
>>>>>>>           "name":"HOST",
>>>>>>>           "value":"mesos-worker1a"
>>>>>>>         },
>>>>>>>         {
>>>>>>>           "name":"MARATHON_APP_DOCKER_IMAGE",
>>>>>>>           
>>>>>>> "value":"docker-services1a:5000/gig1/app-81-1-hello-app-1444240966"
>>>>>>>         },
>>>>>>>         {
>>>>>>>           "name":"MESOS_TASK_ID",
>>>>>>>           
>>>>>>> "value":"app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0"
>>>>>>>         },
>>>>>>>         {
>>>>>>>           "name":"PORT",
>>>>>>>           "value":"31641"
>>>>>>>         },
>>>>>>>         {
>>>>>>>           "name":"PORTS",
>>>>>>>           "value":"31641"
>>>>>>>         },
>>>>>>>         {
>>>>>>>           "name":"MARATHON_APP_ID",
>>>>>>>           "value":"/app-81-1-hello-app/web-v11"
>>>>>>>         },
>>>>>>>         {
>>>>>>>           "name":"PORT0",
>>>>>>>           "value":"31641"
>>>>>>>         }
>>>>>>>       ]
>>>>>>>     },
>>>>>>>     "shell":false
>>>>>>>   },
>>>>>>>   "container":{
>>>>>>>     "type":"DOCKER",
>>>>>>>     "docker":{
>>>>>>>       
>>>>>>> "image":"docker-services1a:5000/gig1/app-81-1-hello-app-1444240966",
>>>>>>>       "network":"BRIDGE",
>>>>>>>       "port_mappings":[
>>>>>>>         {
>>>>>>>           "host_port":31641,
>>>>>>>           "container_port":8000,
>>>>>>>           "protocol":"tcp"
>>>>>>>         }
>>>>>>>       ],
>>>>>>>       "privileged":false,
>>>>>>>       "force_pull_image":false
>>>>>>>     }
>>>>>>>   }
>>>>>>> }
>>>>>> 
>>>>>> No, I don't see anything about any health check.
>>>>>> 
>>>>>> Mesos STDOUT for the launched task:
>>>>>> 
>>>>>>> --container="mesos-20150924-210922-1608624320-5050-1792-S1.14335f1f-3774-4862-a55b-e9c76cd0f2da"
>>>>>>>  --docker="docker" --help="false" --initialize_driver_logging="true" 
>>>>>>> --logbufsecs="0" --logging_level="INFO" 
>>>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
>>>>>>> --sandbox_directory="/tmp/mesos/slaves/20150924-210922-1608624320-5050-1792-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0/runs/14335f1f-3774-4862-a55b-e9c76cd0f2da"
>>>>>>>  --stop_timeout="0ns"
>>>>>>> --container="mesos-20150924-210922-1608624320-5050-1792-S1.14335f1f-3774-4862-a55b-e9c76cd0f2da"
>>>>>>>  --docker="docker" --help="false" --initialize_driver_logging="true" 
>>>>>>> --logbufsecs="0" --logging_level="INFO" 
>>>>>>> --mapped_directory="/mnt/mesos/sandbox" --quiet="false" 
>>>>>>> --sandbox_directory="/tmp/mesos/slaves/20150924-210922-1608624320-5050-1792-S1/frameworks/20150821-214332-1407297728-5050-18973-0000/executors/app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0/runs/14335f1f-3774-4862-a55b-e9c76cd0f2da"
>>>>>>>  --stop_timeout="0ns"
>>>>>>> Registered docker executor on mesos-worker1a
>>>>>>> Starting task 
>>>>>>> app-81-1-hello-app_web-v11.84c0f441-6d2a-11e5-98ba-080027477de0
>>>>>> 
>>>>>> 
>>>>>> And STDERR:
>>>>>> 
>>>>>>> I1007 19:35:08.790743  4612 exec.cpp:134] Version: 0.24.0
>>>>>>> I1007 19:35:08.793416  4619 exec.cpp:208] Executor registered on slave 
>>>>>>> 20150924-210922-1608624320-5050-1792-S1
>>>>>>> WARNING: Your kernel does not support swap limit capabilities, memory 
>>>>>>> limited without swap.
>>>>>> 
>>>>>> 
>>>>>> Again, nothing about any health checks.
>>>>>> 
>>>>>> Any ideas of other things to try or what I could be missing?  Can't say 
>>>>>> either way about the Mesos health-check system working or not if 
>>>>>> Marathon won't put the health-check into the task it sends to Mesos.
>>>>>> 
>>>>>> Thanks for all your help!
>>>>>> 
>>>>>> Best,
>>>>>> Jay
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Tue, Oct 6, 2015 at 11:24 PM, haosdent <[email protected]> wrote:
>>>>>>> Maybe you could post your executor stdout/stderr so that we can tell 
>>>>>>> whether the health check is running or not.
>>>>>>> 
>>>>>>>> On Wed, Oct 7, 2015 at 2:15 PM, haosdent <[email protected]> wrote:
>>>>>>>> Marathon also uses the Mesos health check. When I use a health check, I can 
>>>>>>>> see log lines like this in the executor stdout.
>>>>>>>> 
>>>>>>>> ```
>>>>>>>> Registered docker executor on xxxxx
>>>>>>>> Starting task test-health-check.822a5fd2-6cba-11e5-b5ce-0a0027000000
>>>>>>>> Launching health check process: 
>>>>>>>> /home/haosdent/mesos/build/src/.libs/mesos-health-check --executor=xxxx
>>>>>>>> Health check process launched at pid: 9895
>>>>>>>> Received task health update, healthy: true
>>>>>>>> ```
>>>>>>>> 
>>>>>>>>> On Wed, Oct 7, 2015 at 12:51 PM, Jay Taylor <[email protected]> 
>>>>>>>>> wrote:
>>>>>>>>> I am using my own framework, and the full task info I'm using is 
>>>>>>>>> posted earlier in this thread.  Do you happen to know if Marathon 
>>>>>>>>> uses Mesos's health checks for its health check system?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Oct 6, 2015, at 9:01 PM, haosdent <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>> Yes, the health check is launched through its definition in TaskInfo. Do 
>>>>>>>>>> you launch your task through Marathon? I could test it on my side.
>>>>>>>>>> 
>>>>>>>>>>> On Wed, Oct 7, 2015 at 11:56 AM, Jay Taylor <[email protected]> 
>>>>>>>>>>> wrote:
>>>>>>>>>>> Precisely, and there are none of those statements.  Are you or 
>>>>>>>>>>> others confident health-checks are part of the code path when 
>>>>>>>>>>> defined via task info for docker container tasks?  Going through 
>>>>>>>>>>> the code, I wasn't able to find the linkage for anything other than 
>>>>>>>>>>> health-checks triggered through a custom executor.
>>>>>>>>>>> 
>>>>>>>>>>> With that being said it is a pretty good sized code base and I'm 
>>>>>>>>>>> not very familiar with it, so my analysis this far has by no means 
>>>>>>>>>>> been exhaustive.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On Oct 6, 2015, at 8:41 PM, haosdent <[email protected]> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> When the health check launches, there will be a log line like this in your 
>>>>>>>>>>>> executor stdout:
>>>>>>>>>>>> ```
>>>>>>>>>>>> Health check process launched at pid xxx
>>>>>>>>>>>> ```
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Wed, Oct 7, 2015 at 11:37 AM, Jay Taylor <[email protected]> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> I'm happy to try this, however wouldn't there be output in the 
>>>>>>>>>>>>> logs with the string "health" or "Health" if the health-check 
>>>>>>>>>>>>> were active?  None of my master or slave logs contain the string..
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Oct 6, 2015, at 7:45 PM, haosdent <[email protected]> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Could you use "exit 1" instead of "sleep 5" to see whether an 
>>>>>>>>>>>>>> unhealthy status shows up in your task stdout/stderr?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Wed, Oct 7, 2015 at 10:38 AM, Jay Taylor 
>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>> My current version is 0.24.1.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Tue, Oct 6, 2015 at 7:30 PM, haosdent <[email protected]> 
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> Yes, Adam also helped commit it to 0.23.1 and 0.24.1: 
>>>>>>>>>>>>>>>> https://github.com/apache/mesos/commit/8c0ed92de3925d4312429bfba01b9b1ccbcbbef0
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> https://github.com/apache/mesos/commit/09e367cd69aa39c156c9326d44f4a7b829ba3db7
>>>>>>>>>>>>>>>> Are you using one of these versions?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Wed, Oct 7, 2015 at 10:26 AM, haosdent 
>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>> I remember 0.23.1 and 0.24.1 contain this backport; let me 
>>>>>>>>>>>>>>>>> double check.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Wed, Oct 7, 2015 at 10:01 AM, Jay Taylor 
>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>> Oops- Now I see you already said it's in master.  I'll look 
>>>>>>>>>>>>>>>>>> there :)
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks again!
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Tue, Oct 6, 2015 at 6:59 PM, Jay Taylor 
>>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>> Great, thanks for the quick reply Tim!
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Do you know if there is a branch I can checkout to test it 
>>>>>>>>>>>>>>>>>>> out?
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Tue, Oct 6, 2015 at 6:54 PM, Timothy Chen 
>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>> Hi Jay, 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> We just added health check support for docker tasks that's 
>>>>>>>>>>>>>>>>>>>> in master but not yet released. It will run docker exec 
>>>>>>>>>>>>>>>>>>>> with the command you provided as health checks.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> It should be in the next release.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Tim
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> On Oct 6, 2015, at 6:49 PM, Jay Taylor 
>>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Does Mesos support health checks for docker image tasks?  
>>>>>>>>>>>>>>>>>>>>> Mesos seems to be ignoring the TaskInfo.HealthCheck field 
>>>>>>>>>>>>>>>>>>>>> for me.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Example TaskInfo JSON received back from Mesos:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> {
>>>>>>>>>>>>>>>>>>>>>>>   "name":"hello-app.web.v3",
>>>>>>>>>>>>>>>>>>>>>>>   "task_id":{
>>>>>>>>>>>>>>>>>>>>>>>     
>>>>>>>>>>>>>>>>>>>>>>> "value":"hello-app_web-v3.fc05a1a5-1e06-4e61-9879-be0d97cd3eec"
>>>>>>>>>>>>>>>>>>>>>>>   },
>>>>>>>>>>>>>>>>>>>>>>>   "slave_id":{
>>>>>>>>>>>>>>>>>>>>>>>     "value":"20150924-210922-1608624320-5050-1792-S1"
>>>>>>>>>>>>>>>>>>>>>>>   },
>>>>>>>>>>>>>>>>>>>>>>>   "resources":[
>>>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>>>       "name":"cpus",
>>>>>>>>>>>>>>>>>>>>>>>       "type":0,
>>>>>>>>>>>>>>>>>>>>>>>       "scalar":{
>>>>>>>>>>>>>>>>>>>>>>>         "value":0.1
>>>>>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>>>>>     },
>>>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>>>       "name":"mem",
>>>>>>>>>>>>>>>>>>>>>>>       "type":0,
>>>>>>>>>>>>>>>>>>>>>>>       "scalar":{
>>>>>>>>>>>>>>>>>>>>>>>         "value":256
>>>>>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>>>>>     },
>>>>>>>>>>>>>>>>>>>>>>>     {
>>>>>>>>>>>>>>>>>>>>>>>       "name":"ports",
>>>>>>>>>>>>>>>>>>>>>>>       "type":1,
>>>>>>>>>>>>>>>>>>>>>>>       "ranges":{
>>>>>>>>>>>>>>>>>>>>>>>         "range":[
>>>>>>>>>>>>>>>>>>>>>>>           {
>>>>>>>>>>>>>>>>>>>>>>>             "begin":31002,
>>>>>>>>>>>>>>>>>>>>>>>             "end":31002
>>>>>>>>>>>>>>>>>>>>>>>           }
>>>>>>>>>>>>>>>>>>>>>>>         ]
>>>>>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>>>   ],
>>>>>>>>>>>>>>>>>>>>>>>   "command":{
>>>>>>>>>>>>>>>>>>>>>>>     "container":{
>>>>>>>>>>>>>>>>>>>>>>>       
>>>>>>>>>>>>>>>>>>>>>>> "image":"docker-services1a:5000/test/app-81-1-hello-app-103"
>>>>>>>>>>>>>>>>>>>>>>>     },
>>>>>>>>>>>>>>>>>>>>>>>     "shell":false
>>>>>>>>>>>>>>>>>>>>>>>   },
>>>>>>>>>>>>>>>>>>>>>>>   "container":{
>>>>>>>>>>>>>>>>>>>>>>>     "type":1,
>>>>>>>>>>>>>>>>>>>>>>>     "docker":{
>>>>>>>>>>>>>>>>>>>>>>>       
>>>>>>>>>>>>>>>>>>>>>>> "image":"docker-services1a:5000/gig1/app-81-1-hello-app-103",
>>>>>>>>>>>>>>>>>>>>>>>       "network":2,
>>>>>>>>>>>>>>>>>>>>>>>       "port_mappings":[
>>>>>>>>>>>>>>>>>>>>>>>         {
>>>>>>>>>>>>>>>>>>>>>>>           "host_port":31002,
>>>>>>>>>>>>>>>>>>>>>>>           "container_port":8000,
>>>>>>>>>>>>>>>>>>>>>>>           "protocol":"tcp"
>>>>>>>>>>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>>>>>>>>>>       ],
>>>>>>>>>>>>>>>>>>>>>>>       "privileged":false,
>>>>>>>>>>>>>>>>>>>>>>>       "parameters":[],
>>>>>>>>>>>>>>>>>>>>>>>       "force_pull_image":false
>>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>>>   },
>>>>>>>>>>>>>>>>>>>>>>>   "health_check":{
>>>>>>>>>>>>>>>>>>>>>>>     "delay_seconds":5,
>>>>>>>>>>>>>>>>>>>>>>>     "interval_seconds":10,
>>>>>>>>>>>>>>>>>>>>>>>     "timeout_seconds":10,
>>>>>>>>>>>>>>>>>>>>>>>     "consecutive_failures":3,
>>>>>>>>>>>>>>>>>>>>>>>     "grace_period_seconds":0,
>>>>>>>>>>>>>>>>>>>>>>>     "command":{
>>>>>>>>>>>>>>>>>>>>>>>       "shell":true,
>>>>>>>>>>>>>>>>>>>>>>>       "value":"sleep 5",
>>>>>>>>>>>>>>>>>>>>>>>       "user":"root"
>>>>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>>>>>   }
>>>>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I have searched all machines and containers to see if 
>>>>>>>>>>>>>>>>>>>>> they ever run the command (in this case `sleep 5`), but 
>>>>>>>>>>>>>>>>>>>>> have not found any indication that it is being executed.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> In the mesos src code the health-checks are invoked from 
>>>>>>>>>>>>>>>>>>>>> src/launcher/executor.cpp 
>>>>>>>>>>>>>>>>>>>>> CommandExecutorProcess::launchTask.  Does this mean that 
>>>>>>>>>>>>>>>>>>>>> health-checks are only supported for custom executors and 
>>>>>>>>>>>>>>>>>>>>> not for docker tasks?
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> What I am trying to accomplish is to have the 0/non-zero 
>>>>>>>>>>>>>>>>>>>>> exit-status of a health-check command translate to task 
>>>>>>>>>>>>>>>>>>>>> health.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>>> Jay
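
The goal stated above (exit status 0 means healthy, non-zero means unhealthy) together with the consecutive_failures field from the TaskInfo can be sketched as follows. This is an illustration under assumptions, not the Mesos health checker: isHealthy and firstKillIndex are hypothetical names, and the sketch assumes that a run of consecutive_failures non-zero exits is the point at which the task would be signalled for killing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A health-check command counts as healthy only when it exits with 0.
bool isHealthy(int exitStatus) { return exitStatus == 0; }

// Returns the index of the check at which the failure streak first reaches
// the threshold, or -1 if the threshold is never reached.
int firstKillIndex(const std::vector<int>& exitStatuses,
                   int consecutiveFailures) {
  assert(consecutiveFailures > 0);
  int streak = 0;
  for (std::size_t i = 0; i < exitStatuses.size(); ++i) {
    // A healthy result resets the streak; an unhealthy one extends it.
    streak = isHealthy(exitStatuses[i]) ? 0 : streak + 1;
    if (streak >= consecutiveFailures) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

With a threshold of 3, as in the health_check block above, a single passing check in between failures resets the count, so only three failures in a row reach the threshold.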
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>> Haosdent Huang
