It looks like the Marathon framework is continually failing over. Have you
sought help from the Marathon developers?
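
To see how often the master logs a re-registration, and from which
scheduler address, something like the following could help. It is only a
rough sketch: it assumes the master log is available as unwrapped lines in
the same format as the quoted output below, and the function name
`count_reregistrations` and the log path in the usage comment are made up
for illustration.

```python
import re
from collections import Counter

# Matches master log lines of the form seen in the quoted output, e.g.
#   ... master.cpp:734] Re-registering framework marathon-0.0.6 at scheduler(1)@192.168.3.224:58107
# Assumption: each log entry is on a single line (not wrapped as in the email).
REREGISTER = re.compile(
    r"Re-registering framework (?P<framework>\S+) at (?P<pid>\S+)"
)


def count_reregistrations(lines):
    """Count re-registration events per (framework, scheduler address)."""
    counts = Counter()
    for line in lines:
        match = REREGISTER.search(line)
        if match:
            counts[(match.group("framework"), match.group("pid"))] += 1
    return counts


# Hypothetical usage; the log path is an assumption, adjust it to your setup:
#   with open("mesos-master.INFO") as log:
#       for (framework, pid), n in count_reregistrations(log).most_common():
#           print(n, framework, pid)
```

The scheduler address it reports can then be compared with the host where
Marathon is expected to be running.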


On Mon, Sep 23, 2013 at 2:52 AM, Damien Hardy <[email protected]> wrote:

> Hello there,
>
> I might be missing something about framework deployment on Mesos.
>
> I am trying to get the Chronos or Marathon frameworks working with HEAD of
> Mesos running distributed.
>
> The Mesos topology seems OK: the slaves report to the master and I can see
> resource offers (total available) on the Mesos HTTP interface.
>
> 192.168.255.1 : marathon or chronos
> 192.168.255.2 : zookeeper + mesos master
> 192.168.255.3 : mesos slave
>
> Then I start Marathon or Chronos (HEAD version for both, with pom.xml using
> "<mesos.version>0.15.0-20130910-2</mesos.version>", for example).
>
> It seems to succeed in finding the master; I can see the frameworks listed.
> But the Mesos services seem to complain permanently, flooding the logs on
> the slave with:
>
> ```
> 2013-09-23 11:35:37,405:2264(0x7faf54a73700):ZOO_DEBUG@zookeeper_process@1983:
> Got ping response in 0 ms
> W0923 11:35:38.002933  2267 slave.cpp:1322] Ignoring updating pid for
> framework marathon-0.0.6 because it does not exist
> W0923 11:35:38.359627  2269 slave.cpp:1322] Ignoring updating pid for
> framework marathon-0.0.6 because it does not exist
> W0923 11:35:39.003171  2266 slave.cpp:1322] Ignoring updating pid for
> framework marathon-0.0.6 because it does not exist
> ```
>
> and on the master also with:
>
> ```
> I0923 11:35:33.420017  3685 master.cpp:734] Re-registering framework
> marathon-0.0.6 at scheduler(1)@192.168.3.224:58107
> I0923 11:35:33.420178  3685 master.cpp:753] Framework marathon-0.0.6
> failed over
> I0923 11:35:33.668504  3683 master.cpp:1445] Sending 1 offers to framework
> marathon-0.0.6
> W0923 11:35:33.708227  3686 master.cpp:80] No whitelist given. Advertising
> offers for all slaves
> I0923 11:35:33.776002  3686 master.cpp:734] Re-registering framework
> marathon-0.0.6 at scheduler(1)@192.168.3.224:58107
> I0923 11:35:33.776146  3686 master.cpp:753] Framework marathon-0.0.6
> failed over
> I0923 11:35:33.776432  3684 hierarchical_allocator_process.hpp:598]
> Recovered cpus(*):2; mem(*):2942; disk(*):35195; ports(*):[31000-32000]
> (total allocatable: cpus(*):2; mem(*):2942; disk(*):35195;
> ports(*):[31000-32000]) on slave 201309231034-50309312-5050-1111-2 from
> framework marathon-0.0.6
> I0923 11:35:34.419661  3686 master.cpp:734] Re-registering framework
> marathon-0.0.6 at scheduler(1)@192.168.3.224:58107
> I0923 11:35:34.419801  3686 master.cpp:753] Framework marathon-0.0.6
> failed over
> I0923 11:35:34.669680  3684 master.cpp:1445] Sending 1 offers to framework
> marathon-0.0.6
> I0923 11:35:34.776325  3684 master.cpp:734] Re-registering framework
> marathon-0.0.6 at scheduler(1)@192.168.3.224:58107
> I0923 11:35:34.776445  3684 master.cpp:753] Framework marathon-0.0.6
> failed over
> I0923 11:35:34.776748  3684 hierarchical_allocator_process.hpp:598]
> Recovered cpus(*):2; mem(*):2942; disk(*):35195; ports(*):[31000-32000]
> (total allocatable: cpus(*):2; mem(*):2942; disk(*):35195;
> ports(*):[31000-32000]) on slave 201309231034-50309312-5050-1111-2 from
> framework marathon-0.0.6
> ```
>
> When I try to start a service with Marathon, based on the example given:
>
> marathon -H http://192.168.255.1:8080 start -i chronos -u
> https://s3.amazonaws.com/mesosphere-binaries-public/chronos/chronos.tgz -C
> "./chronos/bin/demo ./chronos/config/nomail.yml
> ./chronos/target/chronos-1.0-SNAPSHOT.jar"
> Starting app 'chronos'
> ERROR:
>
> The app seems to be there anyway:
>
> marathon -H http://192.168.255.1:8080 list
> App ID:    chronos
> Command:   ./chronos/bin/demo ./chronos/config/nomail.yml
> ./chronos/target/chronos-1.0-SNAPSHOT.jar
> Instances: 1
> CPUs:      1.0
> Memory:    10.0 MB
> URI:
> https://s3.amazonaws.com/mesosphere-binaries-public/chronos/chronos.tgz
>
> Chronos has the same problem with the nonexistent framework ID on the
> slave: I can create a scheduled command, but it is never executed.
>
> Thank you for any help understanding this.
>
> --
> Damien HARDY
>
