Interesting case. Please follow Shuai Lin's suggestion and provide some logs for us to check.
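While the logs are being collected, one thing that can be checked by hand is whether an offer from a "rack:ms" node could satisfy the quoted app definition at all (resources, fixed ports because of "requirePorts": true, and the two constraints). Below is a minimal Python sketch of that check. It is not Marathon's actual matching code: the offer dict layout, the function name `rejection_reasons`, and the hostname "server5" are made up for illustration, LIKE is assumed to be a full regular-expression match, and UNIQUE is only tracked for hostname.

```python
import re

# The parts of the app definition that affect offer matching,
# copied from the sample job definition quoted in this thread.
APP = {
    "cpus": 1.0,
    "mem": 4096,
    "disk": 1000,
    "ports": [31703, 31803, 31903],
    "requirePorts": True,
    "constraints": [["hostname", "UNIQUE"], ["rack", "LIKE", "ms"]],
}


def rejection_reasons(app, offer, used_hostnames):
    """Return why `offer` cannot host one more task of `app` (empty list = usable).

    `offer` is a hypothetical plain dict, not the Mesos protobuf:
    {"hostname": str, "attributes": {name: value}, "cpus": float,
     "mem": float, "disk": float, "ports": [(low, high), ...]}
    """
    reasons = []
    # Scalar resources must cover what one instance asks for.
    for name in ("cpus", "mem", "disk"):
        if offer[name] < app[name]:
            reasons.append(f"not enough {name}")
    # With requirePorts, every listed port must be inside an offered range.
    if app.get("requirePorts"):
        for port in app["ports"]:
            if not any(low <= port <= high for low, high in offer["ports"]):
                reasons.append(f"port {port} not offered")
    for constraint in app["constraints"]:
        field, operator = constraint[0], constraint[1]
        if field == "hostname":
            actual = offer["hostname"]
        else:
            actual = offer["attributes"].get(field)
        if operator == "UNIQUE":
            # Simplification: only hostname uniqueness is tracked here.
            if field == "hostname" and actual in used_hostnames:
                reasons.append("hostname already runs an instance")
        elif operator == "LIKE":
            # Assumption: LIKE is treated as a full regex match on the value.
            if actual is None or re.fullmatch(constraint[2], actual) is None:
                reasons.append(f"{field} does not match {constraint[2]}")
    return reasons


if __name__ == "__main__":
    # Hypothetical offer: enough CPU and memory, but port 31703 is still held.
    offer = {
        "hostname": "server5",
        "attributes": {"rack": "ms"},
        "cpus": 4.0,
        "mem": 8192,
        "disk": 5000,
        "ports": [(31000, 31702), (31704, 32000)],
    }
    print(rejection_reasons(APP, offer, used_hostnames={"server6"}))
```

The point of the sketch is that plenty of idle CPU and memory does not guarantee a usable offer: a node can still be rejected because one of the three fixed ports is taken or because it already runs an instance. Whether that is what happens here is something only the Marathon and Mesos slave logs can confirm.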

2016-02-02 23:05 GMT+08:00 Shuai Lin <[email protected]>:

> Is there any warning/error message in marathon logs when it takes a long
> time to deploy/redeploy your micro service? Also worth take a look of the
> mesos slave logs.
>
> On Tue, Feb 2, 2016 at 6:55 AM, Rodrick Brown <[email protected]>
> wrote:
>
>> My cluster consist of 9 slaves server split in 1/2 for two primary
>> applications (Spark | Scala Microservices)
>>
>> - Spark - (server 1,2,3,4,8) attributes: "rack:spark"
>> - Long running Microservices (server 5,6,7,9) attributes "rack:ms"
>>
>>
>> The spark jobs run in coarse mode and the majority of them are short
>> lived they run for about ~10-15 minutes via Chronos and shutdown. They
>> start every 15 minutes about ~45 jobs.
>>
>> We do lots of deploys daily mostly to the "rack:ms" nodes where these
>> jobs are started via Marathon and run until we need to deploy a new release
>> of code.
>>
>> Recently I started noticing jobs are taking forever to restart or startup
>> like they're not receiving valid offers.
>> The cluster resources consists of the following resources I always have
>> more than enough idle resources available to bring up/down new services yet
>> I've seen one scenario where a service took almost 10 minutes to restart.
>>
>>
>>            CPUs    Mem
>> Total      120     456.8 GB
>> Used       53.6    140.5 GB
>> Offered    0       0 B
>> Idle       66.4    316.3 GB
>>
>> How can I combat this delay? I'm not using roles could this be the
>> problem?
>> Chronos jobs always seem to run fine but they require much less resource
>> than my long running Scala services.
>> Here is a sample job definition for in Marathon.
>>
>> {
>>   "id": "production/index-service",
>>   "cmd": "env && /opt/orchard/production/index-server/bin/run_jar.sh",
>>   "cpus": 1.0,
>>   "mem": 4096,
>>   "disk": 1000,
>>   "user": "orchard",
>>   "instances": 2,
>>   "constraints": [
>>     [
>>       "hostname", "UNIQUE"
>>     ],
>>     [
>>       "rack", "LIKE", "ms"
>>     ]
>>   ],
>>   "requirePorts": true,
>>   "labels": {
>>     "ENV": "production",
>>     "HAPROXY_GROUP": "microservice"
>>   },
>>   "ports": [
>>     31703,
>>     31803,
>>     31903
>>   ],
>>   "maxLaunchDelaySeconds": 3,
>>   "backoffFactor": 1.20,
>>   "healthChecks": [
>>     {
>>       "gracePeriodSeconds": 3,
>>       "intervalSeconds": 5,
>>       "maxConsecutiveFailures": 3,
>>       "protocol": "TCP",
>>       "portIndex": 1,
>>       "timeoutSeconds": 5
>>     }
>>   ],
>>   "upgradeStrategy": {
>>     "minimumHealthCapacity": 0.5,
>>     "maximumOverCapacity": 0.2
>>   }
>> }
>>
>> Any advice appreciated thanks.
>>
>
>

--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com

