Hi Vinod,

The documentation already mentions this: if an ExecutorInfo is set in the
TaskInfo, it is expected to be a Mesos executor and it is expected to
register with the slave.
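
As a rough illustration of that rule, here is a minimal sketch in plain Java.
It deliberately does not use the Mesos API: the class TaskShapeSketch, the
LaunchPath enum and the classify method are hypothetical names made up for
this example, and the two booleans only stand in for whether ExecutorInfo or
CommandInfo is set on the TaskInfo. It also assumes that a task is expected to
set exactly one of the two.

```java
/**
 * Plain-Java model of the rule discussed in this thread.
 * Nothing here is the Mesos API; all names are hypothetical stand-ins.
 */
public class TaskShapeSketch {

    /** How the slave is expected to treat a task, under the assumptions above. */
    public enum LaunchPath {
        /** ExecutorInfo is set: the launched process must register with the slave. */
        CUSTOM_EXECUTOR,
        /** CommandInfo is set on the task: the framework supplies no executor. */
        COMMAND_TASK,
        /** Neither or both are set: assumed not to be a valid task description. */
        INVALID
    }

    /**
     * Classifies a task by which of the two fields is set.
     *
     * @param hasExecutorInfo stand-in for "ExecutorInfo is set on the TaskInfo"
     * @param hasCommandInfo  stand-in for "CommandInfo is set on the TaskInfo"
     */
    public static LaunchPath classify(boolean hasExecutorInfo, boolean hasCommandInfo) {
        // Exactly one of the two is assumed to be required.
        if (hasExecutorInfo == hasCommandInfo) {
            return LaunchPath.INVALID;
        }
        return hasExecutorInfo ? LaunchPath.CUSTOM_EXECUTOR : LaunchPath.COMMAND_TASK;
    }

    public static void main(String[] args) {
        // Shape of the original scheduler code: executor set on the task.
        System.out.println(classify(true, false));
        // Shape of the fix: container plus command set directly on the task.
        System.out.println(classify(false, true));
    }
}
```

The first call in main corresponds to the original scheduler code, where a
plain Docker image was wrapped in an ExecutorInfo and so never registered. The
second corresponds to the fix, where the container and command are set
directly on the task.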

Tim

On Tue, Sep 30, 2014 at 11:42 AM, Vinod Kone <[email protected]> wrote:

> Tim, mind updating the documentation
> <http://mesos.apache.org/documentation/latest/docker-containerizer/> to
> make sure others don't fall into the same trap?
>
> On Tue, Sep 30, 2014 at 11:38 AM, Tim Chen <[email protected]> wrote:
>
>> Hi Andy,
>>
>> Good catch, I also missed that as I was just looking at the Docker
>> configurations.
>>
>> You'll set the Executor when you have a custom executor.
>>
>> Let us know if you have any other problems.
>>
>> Tim
>>
>> On Tue, Sep 30, 2014 at 11:02 AM, Andy Grove <[email protected]>
>> wrote:
>>
>>> OK. So I figured out the issue with this and it was my misunderstanding
>>> of executors and tasks.
>>>
>>> My task info had:
>>>
>>> .setExecutor(Protos.ExecutorInfo.newBuilder(executor))
>>>
>>> I should have had this:
>>>
>>>             .setContainer(containerInfoBuilder)
>>>             .setCommand(Protos.CommandInfo.newBuilder().setShell(false))
>>>
>>> I didn't have a Mesos executor deployed inside my container, which
>>> explains the timeout issue.
>>>
>>> Thanks again for the support.
>>>
>>>
>>> Thanks,
>>>
>>> Andy.
>>>
>>> --
>>> Andy Grove
>>> VP Engineering
>>> CodeFutures Corporation
>>>
>>>
>>>
>>> On Tue, Sep 30, 2014 at 10:20 AM, Andy Grove <[email protected]
>>> > wrote:
>>>
>>>> Hi Tim,
>>>>
>>>> Thanks for helping with this. I am running mesos-master and mesos-slave
>>>> natively on the same host (my desktop). The only container in use is the
>>>> one being launched by the mesos-slave.
>>>>
>>>> I will try your suggestion of running a simple command next.
>>>>
>>>> Here is the output from the slave from this issue though:
>>>>
>>>> I0930 10:13:52.053177 30722 main.cpp:126] Build: 2014-09-29 15:35:37 by
>>>> andy
>>>> I0930 10:13:52.053228 30722 main.cpp:128] Version: 0.20.1
>>>> I0930 10:13:53.055480 30722 containerizer.cpp:89] Using isolation:
>>>> posix/cpu,posix/mem
>>>> I0930 10:13:53.058353 30722 main.cpp:149] Starting Mesos slave
>>>> I0930 10:13:53.059651 30722 slave.cpp:167] Slave started on 1)@
>>>> 127.0.1.1:5051
>>>> I0930 10:13:53.060072 30722 slave.cpp:278] Slave resources: cpus(*):8;
>>>> mem(*):14963; disk(*):1.85648e+06; ports(*):[31000-32000]
>>>> I0930 10:13:53.060226 30722 slave.cpp:306] Slave hostname: davros
>>>> I0930 10:13:53.060253 30722 slave.cpp:307] Slave checkpoint: true
>>>> I0930 10:13:53.064975 30729 state.cpp:33] Recovering state from
>>>> '/tmp/mesos/meta'
>>>> I0930 10:13:53.065352 30725 status_update_manager.cpp:193] Recovering
>>>> status update manager
>>>> I0930 10:13:53.065626 30729 docker.cpp:577] Recovering Docker containers
>>>> I0930 10:13:53.065690 30724 containerizer.cpp:252] Recovering
>>>> containerizer
>>>> I0930 10:13:54.055233 30723 slave.cpp:3198] Finished recovery
>>>> I0930 10:13:54.055448 30723 slave.cpp:589] New master detected at
>>>> [email protected]:5050
>>>> I0930 10:13:54.055532 30723 slave.cpp:625] No credentials provided.
>>>> Attempting to register without authentication
>>>> I0930 10:13:54.055537 30730 status_update_manager.cpp:167] New master
>>>> detected at [email protected]:5050
>>>> I0930 10:13:54.055552 30723 slave.cpp:636] Detecting new master
>>>> I0930 10:13:54.928225 30724 slave.cpp:754] Registered with master
>>>> [email protected]:5050; given slave ID
>>>> 20140930-101303-16777343-5050-30690-0
>>>> I0930 10:13:54.928598 30724 slave.cpp:767] Checkpointing SlaveInfo to
>>>> '/tmp/mesos/meta/slaves/20140930-101303-16777343-5050-30690-0/
>>>> slave.info'
>>>> I0930 10:14:17.330390 30725 slave.cpp:1002] Got assigned task 0 for
>>>> framework 20140930-101303-16777343-5050-30690-0000
>>>> I0930 10:14:17.330557 30725 slave.cpp:1112] Launching task 0 for
>>>> framework 20140930-101303-16777343-5050-30690-0000
>>>> I0930 10:14:17.331296 30725 slave.cpp:1222] Queuing task '0' for
>>>> executor default of framework '20140930-101303-16777343-5050-30690-0000
>>>> *I0930 10:14:17.333109 30730 docker.cpp:984] Starting container
>>>> 'ebb1dca6-cc9d-427f-8faa-f3f723f6ab81' for executor 'default' and framework
>>>> '20140930-101303-16777343-5050-30690-0000'*
>>>> I0930 10:14:20.062705 30730 slave.cpp:2538] Monitoring executor
>>>> 'default' of framework '20140930-101303-16777343-5050-30690-0000' in
>>>> container 'ebb1dca6-cc9d-427f-8faa-f3f723f6ab81'
>>>>
>>>> The container is running quite happily at this point.
>>>>
>>>> I0930 10:14:53.061337 30724 slave.cpp:3053] Current usage 0.76%. Max
>>>> allowed age: 6.247043850997720days
>>>> *I0930 10:15:17.331712 30730 slave.cpp:3010] Terminating executor
>>>> default of framework 20140930-101303-16777343-5050-30690-0000 because it
>>>> did not register within 1mins*
>>>> I0930 10:15:17.332221 30728 docker.cpp:1473] Destroying container
>>>> 'ebb1dca6-cc9d-427f-8faa-f3f723f6ab81'
>>>> I0930 10:15:17.332308 30728 docker.cpp:1568] Running docker kill on
>>>> container 'ebb1dca6-cc9d-427f-8faa-f3f723f6ab81'
>>>> I0930 10:15:18.109361 30730 docker.cpp:1646] Executor for container
>>>> 'ebb1dca6-cc9d-427f-8faa-f3f723f6ab81' has exited
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Andy.
>>>>
>>>> --
>>>> Andy Grove
>>>> VP Engineering
>>>> CodeFutures Corporation
>>>>
>>>>
>>>>
>>>> On Mon, Sep 29, 2014 at 6:25 PM, Tim Chen <[email protected]> wrote:
>>>>
>>>>> Hi Andy,
>>>>>
>>>>> You don't need to specify -d, as the docker containerizer will set it
>>>>> for you since we run all docker images detached.
>>>>>
>>>>> It seems like the executor just simply can't register with the slave.
>>>>> Can you try just running a simple command without Docker that takes longer
>>>>> than the executor registration timeout to see if you see the same error?
>>>>>
>>>>> Also do you run the mesos slave in a docker container as well?
>>>>>
>>>>> It would be great if you could share the slave log as Vinod suggested, too.
>>>>>
>>>>> Tim
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Sep 29, 2014 at 5:15 PM, Vinod Kone <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I'll let Tim Chen help you out here since he has more context. Some
>>>>>> slave logs around the failed container launch would be helpful.
>>>>>>
>>>>>>
>>>>>> On Mon, Sep 29, 2014 at 5:03 PM, Andy Grove <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Ignore my comment about docker run not returning. That is incorrect.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Andy.
>>>>>>>
>>>>>>> --
>>>>>>> Andy Grove
>>>>>>> VP Engineering
>>>>>>> CodeFutures Corporation
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 29, 2014 at 5:59 PM, Andy Grove <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Vinod,
>>>>>>>>
>>>>>>>> Thanks for the quick response but the image is already on the slave
>>>>>>>> and I see the container being launched almost immediately when my
>>>>>>>> framework starts (within 1-2 seconds). If I keep running docker ps, this
>>>>>>>> is the last output I see before the container is killed:
>>>>>>>>
>>>>>>>> $ docker ps
>>>>>>>> CONTAINER ID   IMAGE                                   COMMAND                CREATED          STATUS          PORTS   NAMES
>>>>>>>> 45f992c2781f   codefutures/dbshards_zookeeper:latest   "/bin/sh -c '/opt/zo   59 seconds ago   Up 58 seconds
>>>>>>>>
>>>>>>>> I am using mesos 0.20.1 and docker 1.2.0 on Ubuntu 14.04.
>>>>>>>>
>>>>>>>> So the container is running fine. It is a long-running service, i.e.
>>>>>>>> the docker run command will never return. Should I be providing some
>>>>>>>> option so that the docker executor passes the -d flag to the docker run
>>>>>>>> command? I guess I should start looking through the Mesos source so I
>>>>>>>> can see how this works.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Andy.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Andy Grove
>>>>>>>> VP Engineering
>>>>>>>> CodeFutures Corporation
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Sep 29, 2014 at 5:49 PM, Vinod Kone <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Try increasing the executor registration timeout on the slave
>>>>>>>>> (--executor_registration_timeout) to give docker more time to do a
>>>>>>>>> pull of the image.
>>>>>>>>>
>>>>>>>>> On Mon, Sep 29, 2014 at 4:41 PM, Andy Grove <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm working on a prototype Mesos framework to launch docker
>>>>>>>>>> containers. I'm getting as far as seeing my container start up, but
>>>>>>>>>> after one minute it gets killed due to:
>>>>>>>>>>
>>>>>>>>>> Terminating executor default of framework
>>>>>>>>>> 20140929-155916-16777343-5050-2708-0004 because it did not register
>>>>>>>>>> within 1mins
>>>>>>>>>>
>>>>>>>>>> Here is the code I am using in my scheduler, which was based on
>>>>>>>>>> one of the examples:
>>>>>>>>>>
>>>>>>>>>>   @Override
>>>>>>>>>>   public void resourceOffers(SchedulerDriver schedulerDriver,
>>>>>>>>>>                              List<Protos.Offer> offers) {
>>>>>>>>>>     logger.info("resourceOffers() with {} offers", offers.size());
>>>>>>>>>>
>>>>>>>>>>     for (Protos.Offer offer : offers) {
>>>>>>>>>>
>>>>>>>>>>       List<Protos.TaskInfo> tasks = new ArrayList<Protos.TaskInfo>();
>>>>>>>>>>       if (launchedTasks < totalTasks) {
>>>>>>>>>>         Protos.TaskID taskId = Protos.TaskID.newBuilder()
>>>>>>>>>>             .setValue(Integer.toString(launchedTasks++)).build();
>>>>>>>>>>
>>>>>>>>>>         logger.info("Launching task " + taskId.getValue());
>>>>>>>>>>
>>>>>>>>>>         // docker image info
>>>>>>>>>>         Protos.ContainerInfo.DockerInfo.Builder dockerInfoBuilder =
>>>>>>>>>>             Protos.ContainerInfo.DockerInfo.newBuilder();
>>>>>>>>>>         dockerInfoBuilder.setImage("codefutures/dbshards_zookeeper");
>>>>>>>>>>
>>>>>>>>>>         // container info
>>>>>>>>>>         Protos.ContainerInfo.Builder containerInfoBuilder =
>>>>>>>>>>             Protos.ContainerInfo.newBuilder();
>>>>>>>>>>         containerInfoBuilder.setType(Protos.ContainerInfo.Type.DOCKER);
>>>>>>>>>>         containerInfoBuilder.setDocker(dockerInfoBuilder.build());
>>>>>>>>>>
>>>>>>>>>>         // create executor for the container
>>>>>>>>>>         Protos.ExecutorInfo executor = Protos.ExecutorInfo.newBuilder()
>>>>>>>>>>             .setExecutorId(Protos.ExecutorID.newBuilder().setValue("default"))
>>>>>>>>>>             .setCommand(Protos.CommandInfo.newBuilder().setShell(false))
>>>>>>>>>>             .setContainer(containerInfoBuilder)
>>>>>>>>>>             .setName("Test Executor (Docker)")
>>>>>>>>>>             .setSource("docker_test")
>>>>>>>>>>             .build();
>>>>>>>>>>
>>>>>>>>>>         // create task to run
>>>>>>>>>>         Protos.TaskInfo task = Protos.TaskInfo.newBuilder()
>>>>>>>>>>             .setName("task " + taskId.getValue())
>>>>>>>>>>             .setTaskId(taskId)
>>>>>>>>>>             .setSlaveId(offer.getSlaveId())
>>>>>>>>>>             .addResources(Protos.Resource.newBuilder()
>>>>>>>>>>                 .setName("cpus")
>>>>>>>>>>                 .setType(Protos.Value.Type.SCALAR)
>>>>>>>>>>                 .setScalar(Protos.Value.Scalar.newBuilder().setValue(1)))
>>>>>>>>>>             .addResources(Protos.Resource.newBuilder()
>>>>>>>>>>                 .setName("mem")
>>>>>>>>>>                 .setType(Protos.Value.Type.SCALAR)
>>>>>>>>>>                 .setScalar(Protos.Value.Scalar.newBuilder().setValue(128)))
>>>>>>>>>>             .setExecutor(Protos.ExecutorInfo.newBuilder(executor))
>>>>>>>>>>             .build();
>>>>>>>>>>
>>>>>>>>>>         tasks.add(task);
>>>>>>>>>>       }
>>>>>>>>>>       Protos.Filters filters =
>>>>>>>>>>           Protos.Filters.newBuilder().setRefuseSeconds(1).build();
>>>>>>>>>>
>>>>>>>>>>       schedulerDriver.launchTasks(offer.getId(), tasks, filters);
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>> Am I missing some steps with this approach?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Andy.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Andy Grove
>>>>>>>>>> VP Engineering
>>>>>>>>>> CodeFutures Corporation
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
