Are there recommendations regarding master/scheduler machine resources
as a function of cluster size?
Say I have a cluster with hundreds of slave machines and thousands of CPUs,
with a single framework that will schedule millions of tasks.
How does the strength of the master scheduler machines
I have Spark 1.2 running nicely, both with the Spark SQL Thrift Server
and in IPython.
My question is this: I am running on Mesos in fine-grained mode. What
is the appropriate way to manage the two instances? Should I run
coarse-grained mode for the Spark SQL Thrift Server so that RDDs
I completely agree with Charles, though I think I can appreciate what you're
trying to do here. Take the log aggregation service as an example: you want
that on every slave to aggregate logs, but want to avoid using yet another
layer of configuration management to deploy it.
I'm of the
Sorry for the delay. Operational documentation for the replicated log has
been badly needed; I'll get some basic stuff up on the website by next week.
In the interim, if you're using *--registry_strict=false (the default)*,
you can simply stop the original N masters, rm -rf all the data in
We've been experimenting with using Marathon to do this. We've found a
couple of issues in making it smooth, but those bugs are all in the process of
being fixed. Within the next few weeks, I think that Marathon will be a
solid choice for this use case.
On Tue, Jan 6, 2015 at 4:05 AM, Itamar