For our multi-framework setup with > 1000 nodes, we do these things:

 1.     Tune Aurora to play nice with multiple frameworks and not cache resources for more than 2 mins.

 -min_offer_hold_time (default: 5 mins)

Change this to 1 min

 With this, resources can be cached for up to min_offer_hold_time plus a random value between 0 and 60 secs, as controlled by -offer_hold_jitter_window (default: 1 min).

 -offer_reservation_duration

Change this to 1 min

 -offer_filter_duration (default: 5 secs)

Change this to a larger value. Basically, an offer rejected by a framework will not be re-offered to it for this duration. 30 seconds may be a reasonable value. See the combined sketch below.
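A minimal sketch of the combined invocation (the binary name is illustrative, and all other required scheduler flags are omitted):

  # Hold offers for at most ~1-2 mins; re-offer rejected offers after 30 secs.
  aurora-scheduler \
    -min_offer_hold_time=1mins \
    -offer_reservation_duration=1mins \
    -offer_filter_duration=30secs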

 
http://aurora.apache.org/documentation/latest/reference/scheduler-configuration/

Use this page for reference about all other configs. You can also tune the handling of flapping tasks, etc.

  2.     Mesos master configs to counter unfair frameworks

 
 

  --offer_timeout=VALUE
      Duration of time before an offer is rescinded from a framework.
      This helps fairness when running frameworks that hold on to
      offers, or frameworks that accidentally drop offers. If not set,
      offers do not time out.


 
Again, set this as appropriate if you need to.
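For example (a sketch; the 5 mins value is purely illustrative, and all other required master flags are omitted):

  # Rescind any offer a framework has held for more than 5 minutes.
  mesos-master --offer_timeout=5mins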


Also, a general note: I would highly recommend that each framework have a specific role, but all the slave/agent resources can be left unreserved in the default role * (unless you want specific reservations). DRF will ensure a level of fairness: when a new resource comes up, which framework should be offered it first is decided based on the current DRF values for the role. Remember that DRF does not by itself guarantee balanced, fair resourcing overall, because resources that one framework rejects can be gobbled up by other frameworks. Currently DRF operates at 2 levels: first among roles, and then among the multiple frameworks that share a role.
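For Aurora, registering under a dedicated role would look roughly like this (a sketch; it assumes the -mesos_role scheduler flag, and agents keep their resources unreserved by default):

  # Register the Aurora framework under its own role; agent resources
  # stay unreserved (role *) unless explicitly reserved.
  aurora-scheduler -mesos_role=aurora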



Thx
      From: Mauricio Garavaglia <[email protected]>
 To: [email protected] 
 Sent: Tuesday, December 27, 2016 9:28 AM
 Subject: Re: Aurora Operations
   
Hello!

Off the top of my head: we increased the history_prune_threshold value from 2 days to a week, in case some debugging is required. We increased transient_task_state_timeout, because some tasks need more time to be in the "killing" state, and we count on that. Also, depending on your needs, the max_schedule_penalty default of 1 min could be too much. We reduced that to 20 seconds.
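As flags, that would look roughly like this (a sketch; the transient_task_state_timeout value is not stated above and is purely illustrative):

  # 10mins for transient_task_state_timeout is a placeholder; the
  # message above only says it was increased.
  aurora-scheduler \
    -history_prune_threshold=7days \
    -transient_task_state_timeout=10mins \
    -max_schedule_penalty=20secs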
On Tue, Dec 27, 2016 at 2:06 PM, Erb, Stephan <[email protected]> 
wrote:

Does anyone else have input here? Any Aurora configuration option with a non-default value in your setup is worth sharing here. Questions are welcome as well, so that we can hopefully try to answer those in the upcoming operations guide.

Best regards,
Stephan

From: Zameer Manji [email protected]
Reply-To: "[email protected]" <[email protected]>
Date: Saturday, 17 December 2016 at 00:39
To: "[email protected]" <[email protected]>
Subject: Re: Aurora Operations

For larger clusters it might be necessary to increase `-db_max_active_connection_count` to increase API throughput. You will also need to ensure Xmx=Xms when starting the JVM. Setting those flags to be the same prevents heap resizing.
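For example (a hypothetical launcher snippet; the heap and pool sizes are illustrative, and the JAVA_OPTS mechanism depends on how you start the scheduler):

  # Pin the heap so the JVM never resizes it, and widen the DB pool.
  export JAVA_OPTS="-Xms8g -Xmx8g"
  aurora-scheduler -db_max_active_connection_count=100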

We should consider changing the defaults to match your research. Increasing the session timeout and lowering the thermos/executor resources to 0 should be done.

To make Aurora play nicely with other frameworks, I suggest lowering `min_offer_hold_time` to 1 min or 30 secs. We can also lower the default here. A lower value means Aurora will have more latency when scheduling tasks (since it will have to wait for offers), but it will enable other frameworks to get resources. I also suggest using the Mesos operator tooling to dynamically reserve some minimum amount of resources for Aurora and other frameworks, to ensure that they are not starved entirely.
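A dynamic reservation through the Mesos /master/reserve operator endpoint looks roughly like this (a sketch; the agent ID, role, principal, and amounts are placeholders):

  # Reserve 8 CPUs on one agent for the 'aurora' role.
  curl -i -X POST http://<master>:5050/master/reserve \
    -d slaveId=<agent-id> \
    -d resources='[
      {
        "name": "cpus",
        "type": "SCALAR",
        "scalar": { "value": 8 },
        "role": "aurora",
        "reservation": { "principal": "<operator-principal>" }
      }
    ]'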

I'm surprised that setting offer_filter_duration to 0s improves performance, but that should be something to note.

On Fri, Dec 16, 2016 at 8:33 AM, Erb, Stephan <[email protected]> wrote:
Hi Aurorans,

I would like to start a discussion about Aurora operations and gather feedback on how Aurora is configured and operated at your site. The main goal is to come up with a set of guidelines that help new users get up to speed, and to find ways to improve our default configuration and documentation. Of course, this is kind of a difficult endeavor. Still, I believe this can significantly help Aurora's positioning as one of the most scalable and battle-tested Mesos frameworks. I will start with a small collection to get the discussion going:

# General Advice

* Aurora requires a ZK ensemble for leader election. This ensemble should not also be used for service discovery. Otherwise, a service discovery error/outage can take down the entire cluster. The same applies to the Mesos ZK.
* For fast and consistent performance, transaction logs should be on distinct disks not used by anything else (e.g. not even logging). SSDs help as well. This applies to the ZK transaction log and the native/replicated log used by Aurora.
* If you have made an operator error in your cluster, stopping the Mesos masters is a safe step to limit the error propagation (e.g. agents do not come up anymore after a configuration change).

(Disclaimer: these are from this excellent talk: https://www.youtube.com/watch?v=nNrh-gdu9m4)

# Aurora Configuration

Just a small collection from what we are using internally or what I have seen elsewhere (pulled together in the sketch after this list):

* Thermos resources: The current defaults of CPU and RAM usage are invasive. `-thermos_executor_cpu=0` and `-thermos_executor_ram=128MB` seem to work just as well, in particular since the Mesos egg got slimmer in recent releases.
* Session timeouts: The default timeout is pretty small (4 secs) and can lead to unexpected failovers during long GC pauses. A default of 10-15 secs seems to be more appropriate.
* JVM settings: Either `-XX:+UseG1GC -XX:+UseStringDeduplication` or `-XX:+UseConcMarkSweepGC` seem to be sane defaults. The option `-Djava.net.preferIPv4Stack=true` seems to make sense in most cases as well.
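A combined sketch (the launcher mechanism and the -zk_session_timeout flag name are assumptions; values as discussed above):

  # JVM side, via a hypothetical launcher environment variable:
  export JAVA_OPTS="-XX:+UseG1GC -XX:+UseStringDeduplication -Djava.net.preferIPv4Stack=true"

  # Scheduler side (-zk_session_timeout is an assumed flag name):
  aurora-scheduler \
    -thermos_executor_cpu=0 \
    -thermos_executor_ram=128MB \
    -zk_session_timeout=15secs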
# Open Questions

* What is the best way to configure and use Aurora in a multi-framework setup?
* Are there options we recommend for smaller clusters (<100 nodes or <5000 tasks)? For example, `-offer_filter_duration=0secs` improves scheduling performance on small clusters.
* Are there options we recommend for larger clusters (>1000 nodes)?

I am looking forward to your contributions.

Thanks,
Stephan


 -- Zameer Manji



   
