From my code reading investigation:

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L194
- where Spark only considers offers that carry at least
spark.executor.memory (called sc.executorMemory there)

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L114
- where Spark takes exactly sc.executorMemory from a resource offer,
regardless of how much more memory the offer had available
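In other words (a minimal, self-contained sketch of the behaviour those
two lines imply -- not the actual MesosSchedulerBackend code; the Offer
case class and its field names are invented for illustration):

object MesosOfferSketch {
  // Stand-in for a Mesos resource offer; the real API uses protobuf
  // Offer objects, this simplified case class is only for illustration.
  case class Offer(hostname: String, memoryMB: Int)

  // Only offers carrying at least spark.executor.memory are considered.
  def usableOffers(offers: Seq[Offer], executorMemoryMB: Int): Seq[Offer] =
    offers.filter(_.memoryMB >= executorMemoryMB)

  // From a usable offer, exactly spark.executor.memory is taken,
  // no matter how much more the offer contained.
  def memoryToTake(offer: Offer, executorMemoryMB: Int): Int =
    executorMemoryMB

  def main(args: Array[String]): Unit = {
    val executorMemoryMB = 8 * 1024 // e.g. spark.executor.memory=8g
    val offers = Seq(Offer("big-node", 64 * 1024), Offer("small-node", 4 * 1024))
    usableOffers(offers, executorMemoryMB).foreach { o =>
      println(s"${o.hostname}: take ${memoryToTake(o, executorMemoryMB)} MB " +
        s"of ${o.memoryMB} MB offered")
    }
    // small-node (4 GB) never shows up: its offers are filtered out entirely,
    // and big-node contributes only 8 GB despite offering 64 GB.
  }
}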
On Thu, Aug 21, 2014 at 2:12 PM, Andrew Ash <and...@andrewash.com> wrote:

> I'm actually not sure the Spark+Mesos integration supports dynamically
> allocating memory (it does support dynamically allocating cores,
> though). Has anyone here actually used Spark+Mesos on heterogeneous
> hardware and done dynamic memory allocation?
>
> My understanding is that each Spark executor started by Mesos uses
> spark.executor.memory on every node across the cluster, regardless of
> the memory that Mesos says is available.
>
>
> On Thu, Aug 21, 2014 at 2:05 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> Hi,
>>
>> No worries ;-) I think this scenario might still be supported by Spark
>> running on Mesos or YARN 2. Even your GPU scenario could be supported.
>> Check out the following resources:
>>
>> * https://spark.apache.org/docs/latest/running-on-mesos.html
>>
>> * http://mesos.berkeley.edu/mesos_tech_report.pdf
>>
>> Best regards,
>>
>> Jörn
>>
>>
>> On Thu, Aug 21, 2014 at 5:42 PM, anthonyjschu...@gmail.com <
>> anthonyjschu...@gmail.com> wrote:
>>
>>> Jörn, thanks for the post...
>>>
>>> Unfortunately, I am stuck with the hardware I have and might not be
>>> able to get budget allocated for a new stack of servers when I've
>>> already got so many "ok" servers on hand... And even more
>>> unfortunately, a large subset of these machines are... shall we
>>> say... extremely humble in their CPUs and RAM. My group has exclusive
>>> access to the machines, and we rarely need to run concurrent jobs --
>>> what I really want is maximum capacity per job. The applications are
>>> massive machine-learning experiments, so I'm not sure it is feasible
>>> to break them up into concurrent jobs. At this point, I am seriously
>>> considering dropping down to Akka-level programming. Why, oh why,
>>> doesn't Spark allow allocating a variable number of worker threads
>>> per host? That would seem to be the right point of abstraction for
>>> building large clusters out of "on-hand" hardware (the scheduler
>>> probably wouldn't have to change at all).
>>>
>>> On Thu, Aug 21, 2014 at 9:25 AM, Jörn Franke [via Apache Spark User
>>> List] <[hidden email]> wrote:
>>>
>>> > Hi,
>>> >
>>> > Well, you could use Mesos or YARN 2 to define resources per job --
>>> > you can give each job only as much in resources (cores, memory,
>>> > etc.) per machine as your "worst" machine has. The rest is handled
>>> > by Mesos or YARN. This way you avoid per-machine resource
>>> > assignment without any disadvantages: you can run other jobs in
>>> > parallel without problems, and the older machines won't get
>>> > overloaded.
>>> >
>>> > However, you should take care that your cluster does not get too
>>> > heterogeneous.
>>> >
>>> > Best regards,
>>> > Jörn
>>> >
>>> > On 21 August 2014 at 16:55, "[hidden email]" <[hidden email]> wrote:
>>> >>
>>> >> I've got a stack of Dell commodity servers -- 8 to 32 GB of RAM
>>> >> and a single or dual quad-core processor per machine. I think I
>>> >> will have them loaded with CentOS. Eventually, I may want to add
>>> >> GPUs on the nodes to handle linear algebra operations...
>>> >>
>>> >> My idea has been:
>>> >>
>>> >> 1) To find a way to configure Spark to allocate different
>>> >> resources per machine, per job -- at least have a "standard
>>> >> executor"... and allow different machines to have different
>>> >> numbers of executors.
>>> >>
>>> >> 2) To make (using vanilla Spark) a pre-run optimization phase
>>> >> which benchmarks the throughput of each node (per its hardware)
>>> >> and repartitions the dataset to use the hardware more efficiently,
>>> >> rather than relying on Spark speculation -- which has always
>>> >> seemed a suboptimal way to balance the load across several
>>> >> differing machines.
>>>
>>> --
>>> A N T H O N Y Ⓙ S C H U L T E
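As a concrete example of the workaround discussed above -- sizing every
job for the weakest machine -- here is a minimal sketch, assuming Spark
on Mesos; the master URL, app name, and the 4g / 16-core figures are
placeholders, not values from the thread:

import org.apache.spark.{SparkConf, SparkContext}

object LowestCommonDenominatorJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("heterogeneous-cluster-sketch")
      .setMaster("mesos://mesos-master.example.com:5050") // placeholder URL
      // Keep per-executor memory no larger than the smallest node can
      // offer; otherwise that node's offers are rejected outright (see
      // the MesosSchedulerBackend lines referenced at the top of this
      // message).
      .set("spark.executor.memory", "4g")
      // Optional cluster-wide cap on the cores the job may claim from Mesos.
      .set("spark.cores.max", "16")

    val sc = new SparkContext(conf)
    try {
      // Trivial job, just to show the configuration in use.
      val histogram = sc.parallelize(1 to 1000000).map(_ % 10).countByValue()
      println(histogram)
    } finally {
      sc.stop()
    }
  }
}

The obvious trade-off is that the larger nodes run well below capacity,
which is exactly the tension this thread is about.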