slave_ping_timeout 1secs
Hi,

I am running Mesos 0.23.0 and noticed that I cannot start mesos-master with slave_ping_timeout set to less than 1 second; I tried 0.5secs, 500ms, 50us, etc. Is this by design, or am I missing something?

Cheers,
Nastooh Avessta
ENGINEER.SOFTWARE ENGINEERING
nave...@cisco.com
Phone: +1 604 647 1527
Cisco Systems Limited
595 Burrard Street, Suite 2123 Three Bentall Centre, PO Box 49121
VANCOUVER BRITISH COLUMBIA V7X 1J1 CA
Cisco.com http://www.cisco.com/

Think before you print.

This email may contain confidential and privileged material for the sole use of the intended recipient. Any review, use, distribution or disclosure by others is strictly prohibited. If you are not the intended recipient (or authorized to receive for the recipient), please contact the sender by reply email and delete all copies of this message. For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/index.html

Cisco Systems Canada Co, 181 Bay St., Suite 3400, Toronto, ON, Canada, M5J 2T3. Phone: 416-306-7000; Fax: 416-306-7099.
Preferences http://www.cisco.com/offer/subscribe/?sid=000478326 - Unsubscribe http://www.cisco.com/offer/unsubscribe/?sid=000478327 - Privacy http://www.cisco.com/web/siteassets/legal/privacy.html
RE: slave_ping_timeout 1secs
I see. Thank you for the clarification. Can I just change the bounds in the source code to suit my needs, or is there more to it?

Cheers,
Nastooh Avessta

From: Yan Xu [mailto:y...@jxu.me]
Sent: Tuesday, August 25, 2015 5:49 PM
To: user@mesos.apache.org
Subject: Re: slave_ping_timeout 1secs

Yes: https://github.com/apache/mesos/blob/5de7ea455ec577e19c67a75b1cf98493b40c53fb/src/master/flags.cpp#L383

Was the error message not shown in stderr?

--
Jiang Yan Xu y...@jxu.me @xujyan http://twitter.com/xujyan

On Tue, Aug 25, 2015 at 5:41 PM, Nastooh Avessta (navesta) nave...@cisco.com wrote:

Hi Running Mesos 0.23.0 and noted that cannot start mesos-master with slave_ping_timeout less than 1 second, tried 0.5secs, 500ms and 50us, etc. Is this by design or am I missing something?
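The lower bound discussed in this thread can be illustrated with a small check. This is only a sketch in Python, not the Mesos implementation (the real validation is the C++ code linked above). The 1-second floor comes from the thread; the helper names and the table of unit suffixes are assumptions made for illustration.

```python
import re

# Unit suffixes in seconds. Only secs, ms and us appear in the thread;
# the other entries are assumptions added for illustration.
_UNITS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "secs": 1.0, "mins": 60.0, "hrs": 3600.0}

# Lower bound reported in the thread for --slave_ping_timeout.
MIN_SLAVE_PING_TIMEOUT_SECS = 1.0


def parse_duration(text):
    """Convert a string such as '500ms' or '0.5secs' to seconds (hypothetical helper)."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([a-z]+)\s*", text)
    if match is None or match.group(2) not in _UNITS:
        raise ValueError("unrecognised duration: %r" % text)
    return float(match.group(1)) * _UNITS[match.group(2)]


def ping_timeout_is_accepted(text):
    """Return True if the value is at or above the 1-second floor."""
    return parse_duration(text) >= MIN_SLAVE_PING_TIMEOUT_SECS


if __name__ == "__main__":
    for value in ("1secs", "0.5secs", "500ms", "50us"):
        print(value, ping_timeout_is_accepted(value))
```

All three values tried in the original message fall below the floor, which matches the reported failure to start.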
RE: Mesos Modifying User Group
0.23, here I come. Thanks John, I will install 0.23 and retest.

Cheers,
Nastooh Avessta

From: John Omernik [mailto:j...@omernik.com]
Sent: Thursday, August 13, 2015 5:02 AM
To: user@mesos.apache.org
Subject: Re: Mesos Modifying User Group

I ran into this same issue. For me it manifested as odd "permission denied" errors in MapR's NFS implementation: running in bash, etc. was fine, but running on Mesos did not work (permission denied). (Also, thank you to MapR for helping me troubleshoot.)

Good news, there is a patch: https://issues.apache.org/jira/browse/MESOS-719

And it's fixed in Mesos 0.23. I applied the patch and recompiled and it worked great, and when I installed 0.23, it also worked great.

Good luck.
John

On Wed, Aug 12, 2015 at 5:28 PM, Nastooh Avessta (navesta) nave...@cisco.com wrote:

I am having a bit of a strange problem with Mesos 0.22, running Spark 1.4.0, on Docker 1.6 slaves. Part of my Spark program calls a script that accesses a GPU. I am able to run this script:

1. As Bash
2. Via Marathon
3. As part of a Spark program running as a standalone master

However, when I try to run the same Spark program with Mesos as master, i.e., spark-submit --master mesos://`cat /etc/mesos/zk` --deploy-mode client…, I am not able to access dri devices, e.g., mfx init: /dev/dri/renderD128 fd open failed.

What seems to be happening is that the group membership of the default user, in this case "ubuntu", is modified by Mesos. Under cases 1-3 above, I get:

$ id
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),20(dialout),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),102(netdev),999(docker)

In the case of Mesos, I get:

uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),0(root)

I am wondering if there are configuration parameters that can be passed to Mesos to prevent it from modifying user groups?

Cheers,
Nastooh Avessta
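To make the difference between the two `id` outputs above easier to see, here is a small Python sketch that parses the `groups=` field and lists what is missing under Mesos. The function names are hypothetical; the two input strings are copied from the message. That the missing `video` group explains the /dev/dri failure is an assumption about this host, not something the thread confirms.

```python
import re


def parse_groups(id_output):
    """Return {gid: name} parsed from the groups= field of `id` output."""
    match = re.search(r"groups=(\S+)", id_output)
    if match is None:
        return {}
    groups = {}
    for entry in match.group(1).split(","):
        item = re.fullmatch(r"(\d+)\((.*)\)", entry)
        if item:
            groups[int(item.group(1))] = item.group(2)
    return groups


def missing_groups(expected_output, actual_output):
    """Names of groups present in the first `id` output but absent from the second."""
    expected = parse_groups(expected_output)
    actual = parse_groups(actual_output)
    return sorted(name for gid, name in expected.items() if gid not in actual)


# Outputs copied from the message above.
SHELL_ID = ("uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),4(adm),20(dialout),"
            "24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),"
            "102(netdev),999(docker)")
MESOS_ID = "uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu),0(root)"

if __name__ == "__main__":
    print(missing_groups(SHELL_ID, MESOS_ID))
```

Running the same comparison after upgrading to 0.23 would be one way to confirm that the supplementary groups are preserved.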
RE: Mesos and Docker Slave Problem
Hi,

I have tried Docker 1.7.1 and it works on its own. I had to tear down my 1.7.1 setup; however, time allowing, I will try to recapture the logs.

Cheers,
Nastooh Avessta

From: Tim Chen [mailto:t...@mesosphere.io]
Sent: Friday, August 07, 2015 2:20 PM
To: user@mesos.apache.org
Subject: Re: Mesos and Docker Slave Problem

Hi Nastooh,

Can you turn on verbose logging (GLOG_v=1) and try again to see if you get more information in the sandbox log? Also, can you manually test the docker run command/image on that machine with 1.7.1 to see if it works?

Tim

On Fri, Aug 7, 2015 at 1:04 PM, Nastooh Avessta (navesta) nave...@cisco.com wrote:

Hi Vinod,

So the culprit seems to be Docker 1.7.1. I have 2 identical machines of the 2nd slave type; on one I downgraded to Docker 1.6.2, and I am now able to deploy tasks via Marathon.
On the other machine, which runs 1.7.1, the same problem is observed. On this machine, here is what I see:

root@instance-03ef:/# more /tmp/mesos/slaves/20150807-194745-2150770698-5050-1933-S1/frameworks/20150624-232916-16777343-5050-1628-/executors/hello-gpu1-sleep.d19f…/stderr
I0807 19:52:09.067476 588 exec.cpp:132] Version: 0.22.1
I0807 19:54:16.347481 594 exec.cpp:459] Slave exited ... shutting down
root@instance-03ef:/# more /tmp/mesos/slaves/20150807-194745-2150770698-5050-1933-S1/frameworks/20150624-232916-16777343-5050-1628-/executors/hello-gpu1-sleep.d19f…/stdout
Shutting down
root@instance-03ef:/#

(These are identical to those observed through the Mesos GUI.)

Cheers,
Nastooh Avessta

From: Vinod Kone [mailto:vinodk...@gmail.com]
Sent: Friday, August 07, 2015 12:25 PM
To: user@mesos.apache.org
Subject: Re: Mesos and Docker Slave Problem

On Fri, Aug 7, 2015 at 11:50 AM, Nastooh Avessta (navesta) nave...@cisco.com wrote:

I0807 17:56:11.772568 577 slave.cpp:4208] Launching executor hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework 20150624-232916-16777343-5050-1628- in work directory '/tmp/mesos/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-/executors/hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0/runs/260e0c23-835a-48a8-ab40-c9566077373f'

Can you see the stdout/stderr files in this directory for errors?
Mesos and Docker Slave Problem
Hi,

I am running Mesos 0.22.1 on a setup with 2 Docker slaves: one is running on kernel 3.13.0 with Docker 1.6.2, and the other on kernel 3.14.5 with Docker 1.7.1. I am able to run Marathon tasks on the first one, e.g.,

{
  id: hello -sleep,
  cmd: while [ true ] ; do echo 'Hello Marathon' ; sleep 5 ; done,
  cpus: 0.1,
  mem: 10.0,
  instances: 1,
}

However, trying to run the same task on the 2nd slave leads to an indefinite wait in deployment. All I can gather, in terms of logs, is the following on the 2nd slave:

Log file created at: 2015/08/07 17:49:05
Running on machine: instance-03e2
Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
I0807 17:49:05.882333 565 logging.cpp:172] INFO level logging started!
I0807 17:49:05.883029 565 main.cpp:156] Build: 2015-05-05 06:15:50 by root
I0807 17:49:05.883050 565 main.cpp:158] Version: 0.22.1
I0807 17:49:05.883061 565 main.cpp:161] Git tag: 0.22.1
I0807 17:49:05.883071 565 main.cpp:165] Git SHA: d6309f92a7f9af3ab61a878403e3d9c284ea87e0
I0807 17:49:05.883919 565 containerizer.cpp:110] Using isolation: posix/cpu,posix/mem
I0807 17:49:05.884652 565 main.cpp:200] Starting Mesos slave
I0807 17:49:05.886327 576 slave.cpp:174] Slave started on 1)@10.40.50.117:5051
I0807 17:49:05.887329 576 slave.cpp:322] Slave resources: cpus(*):8; mem(*):14928; disk(*):4975; ports(*):[31000-32000]
I0807 17:49:05.887958 576 slave.cpp:351] Slave hostname: 10.40.50.117
I0807 17:49:05.887979 576 slave.cpp:352] Slave checkpoint: true
I0807 17:49:05.891291 571 state.cpp:35] Recovering state from '/tmp/mesos/meta'
I0807 17:49:05.891484 577 status_update_manager.cpp:197] Recovering status update manager
I0807 17:49:05.891784 570 containerizer.cpp:307] Recovering containerizer
I0807 17:49:05.892216 577 slave.cpp:3808] Finished recovery
I0807 17:49:05.915279 574 group.cpp:313] Group process (group(1)@10.40.50.117:5051) connected to ZooKeeper
I0807 17:49:05.915360 574 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0807 17:49:05.915385 574 group.cpp:385] Trying to create path '/mesos' in ZooKeeper
I0807 17:49:05.919221 574 detector.cpp:138] Detected a new leader: (id='102')
I0807 17:49:05.919374 571 group.cpp:659] Trying to get '/mesos/info_000102' in ZooKeeper
I0807 17:49:05.921257 571 detector.cpp:452] A new leading master (UPID=master@10.40.50.118:5050) is detected
I0807 17:49:05.921408 571 slave.cpp:647] New master detected at master@10.40.50.118:5050
I0807 17:49:05.921423 573 status_update_manager.cpp:171] Pausing sending status updates
I0807 17:49:05.921733 571 slave.cpp:672] No credentials provided. Attempting to register without authentication
I0807 17:49:05.922029 571 slave.cpp:683] Detecting new master
I0807 17:49:06.870721 571 slave.cpp:815] Registered with master master@10.40.50.118:5050; given slave ID 20150807-174737-1982998538-5050-1871-S1
I0807 17:49:06.870865 573 status_update_manager.cpp:178] Resuming sending status updates
I0807 17:50:05.901607 577 slave.cpp:3648] Current disk usage 67.68%. Max allowed age: 1.562342327913970days
...
I0807 17:55:05.965945 575 slave.cpp:3648] Current disk usage 67.68%. Max allowed age: 1.562339580158692days
I0807 17:55:37.012629 575 http.cpp:331] HTTP request for '/slave(1)/state.json'
I0807 17:56:05.966291 575 slave.cpp:3648] Current disk usage 67.68%. Max allowed age: 1.562339580158692days
I0807 17:56:11.760229 577 slave.cpp:1144] Got assigned task hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 for framework 20150624-232916-16777343-5050-1628-
I0807 17:56:11.762622 577 slave.cpp:1254] Launching task hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 for framework 20150624-232916-16777343-5050-1628-
I0807 17:56:11.772568 577 slave.cpp:4208] Launching executor hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework 20150624-232916-16777343-5050-1628- in work directory '/tmp/mesos/slaves/20150807-174737-1982998538-5050-1871-S1/frameworks/20150624-232916-16777343-5050-1628-/executors/hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0/runs/260e0c23-835a-48a8-ab40-c9566077373f'
I0807 17:56:11.773077 574 containerizer.cpp:484] Starting container '260e0c23-835a-48a8-ab40-c9566077373f' for executor 'hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0' of framework '20150624-232916-16777343-5050-1628-'
I0807 17:56:11.773815 577 slave.cpp:1401] Queuing task 'hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0' for executor hello-gpu-sleep.9aef12e0-3d2d-11e5-935a-fe1614f46ae0 of framework '20150624-232916-16777343-5050-1628-
I0807 17:56:11.776347 574 launcher.cpp:130] Forked child with pid '599' for container '260e0c23-835a-48a8-ab40-c9566077373f'
I0807 17:56:11.776995 574 containerizer.cpp:694] Checkpointing executor's forked pid 599 to
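The slave log above states its own line format ([IWEF]mmdd hh:mm:ss.uu threadid file:line] msg). A minimal Python sketch of a parser for that format follows; it can help when filtering a long log for warnings and errors. The function name and the regular expression are illustrative assumptions, written against the lines shown in this message.

```python
import re

# Pattern for the format announced in the log header:
#   [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
_GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>\d+) "
    r"(?P<file>[^\s:]+):(?P<line>\d+)\] "
    r"(?P<message>.*)$"
)


def parse_glog_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = _GLOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    for key in ("month", "day", "thread", "line"):
        record[key] = int(record[key])
    return record


def warnings_and_errors(lines):
    """Keep only parsed records whose severity is W, E or F."""
    records = (parse_glog_line(line) for line in lines)
    return [r for r in records if r is not None and r["severity"] in "WEF"]


if __name__ == "__main__":
    sample = ("I0807 17:56:11.776347 574 launcher.cpp:130] "
              "Forked child with pid '599' for container "
              "'260e0c23-835a-48a8-ab40-c9566077373f'")
    print(parse_glog_line(sample))
```

Applied to the log in this message, `warnings_and_errors` would return an empty list, since every line shown is at INFO level.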
RE: Get List of Active Slaves
I see. Nope, and pointing to the leading master shows the proper result ☺ Thanks. Is there a REST equivalent to mesos-resolve, so that one can ascertain who the leader is without having to point to the leader?

Cheers,
Nastooh Avessta

From: Vinod Kone [mailto:vinodk...@gmail.com]
Sent: Tuesday, August 04, 2015 3:19 PM
To: user@mesos.apache.org
Subject: Re: Get List of Active Slaves

Is that the leading master?

On Tue, Aug 4, 2015 at 3:09 PM, Nastooh Avessta (navesta) nave...@cisco.com wrote:

Hi Trying to get the list of active slaves, via cli, e.g. curl http://10.4.50.80:5050/master/slaves | python -m json.tool and am not getting the expected results. The returned value is empty: { slaves: [] } , whereas, looking at web gui I can see that there are deployed slaves. Am I missing something?
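The question about finding the leader over REST is not answered in this thread. As a hedged sketch only: assuming that a master's state.json document carries a `leader` field holding a UPID of the form master@host:port (the same form that appears in the slave logs elsewhere in this archive), the leader's address could be extracted in Python as below. The function name is hypothetical, and the presence of the `leader` field should be verified against the Mesos version in use.

```python
import json


def leader_address(state_json_text):
    """Return (host, port) of the leading master, or None if not available.

    Assumes the document has a 'leader' field such as
    'master@10.40.50.118:5050'; this is an assumption, not confirmed
    by the thread.
    """
    state = json.loads(state_json_text)
    leader = state.get("leader")
    if not leader or "@" not in leader:
        return None
    _, _, address = leader.partition("@")
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        return None
    return host, int(port)


if __name__ == "__main__":
    # Hand-written sample document; not output captured from a real master.
    sample = '{"leader": "master@10.40.50.118:5050"}'
    print(leader_address(sample))
```

The text passed in would come from querying any master, leading or not, so a client could then redirect its /master/slaves request to the returned address.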
Get List of Active Slaves
Hi,

I am trying to get the list of active slaves via the CLI, e.g.

curl http://10.4.50.80:5050/master/slaves | python -m json.tool

and I am not getting the expected results. The returned value is empty:

{ slaves: [] }

whereas, looking at the web GUI, I can see that there are deployed slaves. Am I missing something?

Cheers,
Nastooh Avessta
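Once the request is pointed at the leading master, the response can be filtered in a few lines of Python rather than read by eye. The sketch below works on the response text, so it needs no network access. The top-level `slaves` key comes from the output quoted above; the per-slave field names `active`, `hostname` and `id` are assumptions about the response shape, and the function name is hypothetical.

```python
import json


def active_slaves(slaves_json_text):
    """Return hostnames (or ids) of slaves marked active in a /master/slaves response."""
    document = json.loads(slaves_json_text)
    result = []
    for slave in document.get("slaves", []):
        # 'active', 'hostname' and 'id' are assumed field names.
        if slave.get("active") is True:
            result.append(slave.get("hostname", slave.get("id")))
    return result


if __name__ == "__main__":
    # The empty response reported in the message above.
    print(active_slaves('{"slaves": []}'))
    # Hand-written sample; not captured from a real cluster.
    sample = json.dumps({"slaves": [
        {"id": "S0", "hostname": "10.40.50.117", "active": True},
        {"id": "S1", "hostname": "10.40.50.119", "active": False},
    ]})
    print(active_slaves(sample))
```

An empty list from a master that is known to have registered slaves is a hint that the request went to a non-leading master.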
Mesos Slave Failover time
Hi,

I am trying to bring the current failover time below 10 seconds and can't seem to find the right set of parameters. Currently, it takes around a minute and a half for the master to detect that a slave has gone offline, which seems to correspond to slave_ping_timeout=15 * max_slave_ping_timeouts=5. However, I can't find these parameters in mesos-master:

# mesos-master --version
mesos 0.22.1
# mesos-master --help
Usage: mesos-master [...]

Supported options:
  --acls=VALUE
      The value could be a JSON formatted string of ACLs or a file path containing the JSON formatted ACLs used for authorization. Path could be of the form 'file:///path/to/file' or '/path/to/file'. See the ACLs protobuf in mesos.proto for the expected format.
      Example:
      { register_frameworks: [ { principals: { type: ANY }, roles: { values: [a] } } ], run_tasks: [ { principals: { values: [a, b] }, users: { values: [c] } } ], shutdown_frameworks: [ { principals: { values: [a, b] }, framework_principals: { values: [c] } } ] }
  --allocation_interval=VALUE
      Amount of time to wait between performing (batch) allocations (e.g., 500ms, 1sec, etc). (default: 1secs)
  --[no-]authenticate
      If authenticate is 'true' only authenticated frameworks are allowed to register. If 'false' unauthenticated frameworks are also allowed to register. (default: false)
  --[no-]authenticate_slaves
      If 'true' only authenticated slaves are allowed to register. If 'false' unauthenticated slaves are also allowed to register. (default: false)
  --authenticators=VALUE
      Authenticator implementation to use when authenticating frameworks and/or slaves. Use the default 'crammd5', or load an alternate authenticator module using --modules. (default: crammd5)
  --cluster=VALUE
      Human readable name for the cluster, displayed in the webui.
  --credentials=VALUE
      Either a path to a text file with a list of credentials, each line containing 'principal' and 'secret' separated by whitespace, or, a path to a JSON-formatted file containing credentials. Path could be of the form 'file:///path/to/file' or '/path/to/file'.
      JSON file Example:
      { credentials: [ { principal: sherman, secret: kitesurf, } ] }
      Text file Example:
      username secret
  --external_log_file=VALUE
      Specified the externally managed log file. This file will be exposed in the webui and HTTP api. This is useful when using