Hello,

I gave mesos-0.14.0-rc4 a try, and the 2 frameworks can now register on mesos and execute tasks (batch and long-running).

2013/9/24 Damien Hardy <[email protected]>

> Not yet,
>
> I suspected some misconfiguration on the mesos side because chronos has
> the same behaviour.
>
>
>
> 2013/9/23 Benjamin Mahler <[email protected]>
>
>> It looks like the Marathon framework is continually failing over, have
>> you sought help from the Marathon developers?
>>
>>
>> On Mon, Sep 23, 2013 at 2:52 AM, Damien Hardy <[email protected]> wrote:
>>
>>> Hello there,
>>>
>>> I might be missing something about framework deployment on mesos.
>>>
>>> I am trying to get the chronos or marathon frameworks working with HEAD
>>> of mesos running distributed.
>>>
>>> The mesos topology seems OK: slaves report to the master and I can see
>>> offers of resources (total available) on the mesos HTTP interface.
>>>
>>> 192.168.255.1 : marathon or chronos
>>> 192.168.255.2 : zookeeper + mesos master
>>> 192.168.255.3 : mesos slave
>>>
>>> Then I start marathon or chronos (HEAD version for both, with pom.xml
>>> using "<mesos.version>0.15.0-20130910-2</mesos.version>", for example).
>>>
>>> It seems to succeed in finding the master; I can see the frameworks
>>> listed.
>>> But the mesos services seem to complain permanently, flooding the logs
>>> on the slave with:
>>>
>>> ```
>>> 2013-09-23
>>> 11:35:37,405:2264(0x7faf54a73700):ZOO_DEBUG@zookeeper_process@1983:
>>> Got ping response in 0 ms
>>> W0923 11:35:38.002933 2267 slave.cpp:1322] Ignoring updating pid for
>>> framework marathon-0.0.6 because it does not exist
>>> W0923 11:35:38.359627 2269 slave.cpp:1322] Ignoring updating pid for
>>> framework marathon-0.0.6 because it does not exist
>>> W0923 11:35:39.003171 2266 slave.cpp:1322] Ignoring updating pid for
>>> framework marathon-0.0.6 because it does not exist
>>> ```
>>>
>>> and on the master also with:
>>>
>>> I0923 11:35:33.420017 3685 master.cpp:734] Re-registering framework
>>> marathon-0.0.6 at scheduler(1)@192.168.3.224:58107
>>> I0923 11:35:33.420178 3685 master.cpp:753] Framework marathon-0.0.6
>>> failed over
>>> I0923 11:35:33.668504 3683 master.cpp:1445] Sending 1 offers to
>>> framework marathon-0.0.6
>>> W0923 11:35:33.708227 3686 master.cpp:80] No whitelist given.
>>> Advertising offers for all slaves
>>> I0923 11:35:33.776002 3686 master.cpp:734] Re-registering framework
>>> marathon-0.0.6 at scheduler(1)@192.168.3.224:58107
>>> I0923 11:35:33.776146 3686 master.cpp:753] Framework marathon-0.0.6
>>> failed over
>>> I0923 11:35:33.776432 3684 hierarchical_allocator_process.hpp:598]
>>> Recovered cpus(*):2; mem(*):2942; disk(*):35195; ports(*):[31000-32000]
>>> (total allocatable: cpus(*):2; mem(*):2942; disk(*):35195;
>>> ports(*):[31000-32000]) on slave 201309231034-50309312-5050-1111-2 from
>>> framework marathon-0.0.6
>>> I0923 11:35:34.419661 3686 master.cpp:734] Re-registering framework
>>> marathon-0.0.6 at scheduler(1)@192.168.3.224:58107
>>> I0923 11:35:34.419801 3686 master.cpp:753] Framework marathon-0.0.6
>>> failed over
>>> I0923 11:35:34.669680 3684 master.cpp:1445] Sending 1 offers to
>>> framework marathon-0.0.6
>>> I0923 11:35:34.776325 3684 master.cpp:734] Re-registering framework
>>> marathon-0.0.6 at scheduler(1)@192.168.3.224:58107
>>> I0923 11:35:34.776445 3684 master.cpp:753] Framework marathon-0.0.6
>>> failed over
>>> I0923 11:35:34.776748 3684 hierarchical_allocator_process.hpp:598]
>>> Recovered cpus(*):2; mem(*):2942; disk(*):35195; ports(*):[31000-32000]
>>> (total allocatable: cpus(*):2; mem(*):2942; disk(*):35195;
>>> ports(*):[31000-32000]) on slave 201309231034-50309312-5050-1111-2 from
>>> framework marathon-0.0.6
>>>
>>> When I try to start a service with marathon, based on the example given:
>>>
>>> marathon -H http://192.168.255.1:8080 start -i chronos -u
>>> https://s3.amazonaws.com/mesosphere-binaries-public/chronos/chronos.tgz -C
>>> "./chronos/bin/demo ./chronos/config/nomail.yml
>>> ./chronos/target/chronos-1.0-SNAPSHOT.jar"
>>> Starting app 'chronos'
>>> ERROR:
>>>
>>> It seems to be there:
>>>
>>> marathon -H http://192.168.255.1:8080 list
>>> App ID: chronos
>>> Command: ./chronos/bin/demo ./chronos/config/nomail.yml
>>> ./chronos/target/chronos-1.0-SNAPSHOT.jar
>>> Instances: 1
>>> CPUs: 1.0
>>> Memory: 10.0 MB
>>> URI:
>>> https://s3.amazonaws.com/mesosphere-binaries-public/chronos/chronos.tgz
>>>
>>> chronos has the same problem with a non-existent ID on the slave: I can
>>> create a scheduled command but it is never executed.
>>>
>>> Thank you for any help understanding this.
>>>
>>> --
>>> Damien HARDY
>>>
>>
>>
>
>
> --
> Damien HARDY
> IT Infrastructure Architect
> Viadeo - 30 rue de la Victoire - 75009 Paris - France
>

--
Damien HARDY
IT Infrastructure Architect
Viadeo - 30 rue de la Victoire - 75009 Paris - France
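
For anyone trying to confirm the same "continually failing over" symptom from a master log, here is a minimal Python sketch. It assumes the log format seen in the excerpt quoted in this thread, with each entry on a single line (the mail client wrapped them). The names `count_failovers`, `FAILOVER_RE`, and `sample` are illustrative only and are not part of Mesos, Marathon, or Chronos.

```python
import re
from collections import Counter

# Assumed line shape, taken from the master log quoted in this thread:
#   I0923 11:35:33.420178 3685 master.cpp:753] Framework marathon-0.0.6 failed over
FAILOVER_RE = re.compile(r"\] Framework (\S+) failed over")


def count_failovers(log_lines):
    """Count 'failed over' entries per framework ID in an iterable of log lines."""
    counts = Counter()
    for line in log_lines:
        match = FAILOVER_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Unwrapped copies of a few entries from the quoted master log.
sample = [
    "I0923 11:35:33.420017 3685 master.cpp:734] Re-registering framework "
    "marathon-0.0.6 at scheduler(1)@192.168.3.224:58107",
    "I0923 11:35:33.420178 3685 master.cpp:753] Framework marathon-0.0.6 failed over",
    "I0923 11:35:33.668504 3683 master.cpp:1445] Sending 1 offers to "
    "framework marathon-0.0.6",
    "I0923 11:35:33.776146 3686 master.cpp:753] Framework marathon-0.0.6 failed over",
]

failovers = count_failovers(sample)
for framework_id, n in failovers.items():
    print(f"{framework_id}: {n} failover entries")
```

A count that keeps growing by several entries per second for the same framework ID, as in the excerpt, matches the re-registration loop described in this thread. To scan a real file, pass an open file object instead of `sample`.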

