From my code reading investigation:

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L194
- where Spark only considers offers that carry at least
spark.executor.memory (called sc.executorMemory there)

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L114
- where Spark takes exactly sc.executorMemory from a resource offer,
regardless of how much more memory the offer had available
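In other words (a minimal, self-contained sketch of the behaviour those
two lines imply -- not the actual MesosSchedulerBackend code; the Offer
case class and its field names are invented for illustration):

object MesosOfferSketch {
  // Stand-in for a Mesos resource offer; the real API uses protobuf
  // Offer objects, this simplified case class is only for illustration.
  case class Offer(hostname: String, memoryMB: Int)

  // Only offers carrying at least spark.executor.memory are considered.
  def usableOffers(offers: Seq[Offer], executorMemoryMB: Int): Seq[Offer] =
    offers.filter(_.memoryMB >= executorMemoryMB)

  // From a usable offer, exactly spark.executor.memory is taken,
  // no matter how much more the offer contained.
  def memoryToTake(offer: Offer, executorMemoryMB: Int): Int =
    executorMemoryMB

  def main(args: Array[String]): Unit = {
    val executorMemoryMB = 8 * 1024 // e.g. spark.executor.memory=8g
    val offers = Seq(Offer("big-node", 64 * 1024), Offer("small-node", 4 * 1024))
    usableOffers(offers, executorMemoryMB).foreach { o =>
      println(s"${o.hostname}: take ${memoryToTake(o, executorMemoryMB)} MB " +
        s"of ${o.memoryMB} MB offered")
    }
    // small-node (4 GB) never shows up: its offers are filtered out entirely,
    // and big-node contributes only 8 GB despite offering 64 GB.
  }
}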
On Thu, Aug 21, 2014 at 2:12 PM, Andrew Ash <and...@andrewash.com> wrote:

> I'm actually not sure the Spark+Mesos integration supports dynamically
> allocating memory (it does support dynamically allocating cores,
> though). Has anyone here actually used Spark+Mesos on heterogeneous
> hardware and done dynamic memory allocation?
>
> My understanding is that each Spark executor started by Mesos uses
> spark.executor.memory on every node across the cluster, regardless of
> the memory that Mesos says is available.
>
>
> On Thu, Aug 21, 2014 at 2:05 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>
>> Hi,
>>
>> No worries ;-) I think this scenario might still be supported by Spark
>> running on Mesos or YARN 2. Even your GPU scenario could be supported.
>> Check out the following resources:
>>
>> * https://spark.apache.org/docs/latest/running-on-mesos.html
>>
>> * http://mesos.berkeley.edu/mesos_tech_report.pdf
>>
>> Best regards,
>>
>> Jörn
>>
>>
>> On Thu, Aug 21, 2014 at 5:42 PM, anthonyjschu...@gmail.com <
>> anthonyjschu...@gmail.com> wrote:
>>
>>> Jörn, thanks for the post...
>>>
>>> Unfortunately, I am stuck with the hardware I have and might not be
>>> able to get budget allocated for a new stack of servers when I've
>>> already got so many "ok" servers on hand... And even more
>>> unfortunately, a large subset of these machines are... shall we
>>> say... extremely humble in their CPUs and RAM. My group has exclusive
>>> access to the machines, and we rarely need to run concurrent jobs --
>>> what I really want is maximum capacity per job. The applications are
>>> massive machine-learning experiments, so I'm not sure it is feasible
>>> to break them up into concurrent jobs. At this point, I am seriously
>>> considering dropping down to Akka-level programming. Why, oh why,
>>> doesn't Spark allow allocating a variable number of worker threads
>>> per host? That would seem to be the right point of abstraction for
>>> building large clusters out of "on-hand" hardware (the scheduler
>>> probably wouldn't have to change at all).
>>>
>>> On Thu, Aug 21, 2014 at 9:25 AM, Jörn Franke [via Apache Spark User
>>> List] <[hidden email]> wrote:
>>>
>>> > Hi,
>>> >
>>> > Well, you could use Mesos or YARN 2 to define resources per job --
>>> > you can give each job only as much in resources (cores, memory,
>>> > etc.) per machine as your "worst" machine has. The rest is handled
>>> > by Mesos or YARN. This way you avoid per-machine resource
>>> > assignment without any disadvantages: you can run other jobs in
>>> > parallel without problems, and the older machines won't get
>>> > overloaded.
>>> >
>>> > However, you should take care that your cluster does not get too
>>> > heterogeneous.
>>> >
>>> > Best regards,
>>> > Jörn
>>> >
>>> > On 21 August 2014 at 16:55, "[hidden email]" <[hidden email]> wrote:
>>> >>
>>> >> I've got a stack of Dell commodity servers -- 8 to 32 GB of RAM
>>> >> and a single or dual quad-core processor per machine. I think I
>>> >> will have them loaded with CentOS. Eventually, I may want to add
>>> >> GPUs on the nodes to handle linear algebra operations...
>>> >>
>>> >> My idea has been:
>>> >>
>>> >> 1) To find a way to configure Spark to allocate different
>>> >> resources per machine, per job -- at least have a "standard
>>> >> executor"... and allow different machines to have different
>>> >> numbers of executors.
>>> >>
>>> >> 2) To make (using vanilla Spark) a pre-run optimization phase
>>> >> which benchmarks the throughput of each node (per its hardware)
>>> >> and repartitions the dataset to use the hardware more efficiently,
>>> >> rather than relying on Spark speculation -- which has always
>>> >> seemed a suboptimal way to balance the load across several
>>> >> differing machines.
>>>
>>> --
>>> A N T H O N Y Ⓙ S C H U L T E
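As a concrete example of the workaround discussed above -- sizing every
job for the weakest machine -- here is a minimal sketch, assuming Spark
on Mesos; the master URL, app name, and the 4g / 16-core figures are
placeholders, not values from the thread:

import org.apache.spark.{SparkConf, SparkContext}

object LowestCommonDenominatorJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("heterogeneous-cluster-sketch")
      .setMaster("mesos://mesos-master.example.com:5050") // placeholder URL
      // Keep per-executor memory no larger than the smallest node can
      // offer; otherwise that node's offers are rejected outright (see
      // the MesosSchedulerBackend lines referenced at the top of this
      // message).
      .set("spark.executor.memory", "4g")
      // Optional cluster-wide cap on the cores the job may claim from Mesos.
      .set("spark.cores.max", "16")

    val sc = new SparkContext(conf)
    try {
      // Trivial job, just to show the configuration in use.
      val histogram = sc.parallelize(1 to 1000000).map(_ % 10).countByValue()
      println(histogram)
    } finally {
      sc.stop()
    }
  }
}

The obvious trade-off is that the larger nodes run well below capacity,
which is exactly the tension this thread is about.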