Recommended resources for master / scheduler machines

2015-01-06 Thread Itamar Ostricher
Are there recommendations regarding master / scheduler machine resources as a function of cluster size? Say I have a cluster with hundreds of slave machines and thousands of CPUs, with a single framework that will schedule millions of tasks. How does the strength of the master scheduler machines

Running Spark on Mesos

2015-01-06 Thread John Omernik
I have Spark 1.2 running nicely both with the Spark SQL Thrift server and in IPython. My question is this: I am running on Mesos in fine-grained mode, so what is the appropriate way to manage the two instances? Should I run coarse-grained mode for the Spark SQL Thrift Server so that RDDs

Re: Running services on all slaves

2015-01-06 Thread Tom Arnfeld
I completely agree with Charles, though I think I can appreciate what you're trying to do here. Take the log aggregation service as an example: you want that on every slave to aggregate logs, but want to avoid using yet another layer of configuration management to deploy it. I'm of the

Re: Resize Mesos master quorum

2015-01-06 Thread Benjamin Mahler
Sorry for the delay. Operational documentation for the replicated log has been badly needed; I'll get some basic stuff up on the website by next week. In the interim, if you're using *--registry_strict=false (the default)*, you can simply stop the original N masters, rm -rf all the data in

Re: Running services on all slaves

2015-01-06 Thread David Greenberg
We've been experimenting with using Marathon to do this. We've found a couple of issues in making it smooth, but those bugs are all in the process of being fixed. Within the next few weeks, I think that Marathon will be a solid choice for this use case. On Tue, Jan 6, 2015 at 4:05 AM, Itamar