Hi,

No worries ;-) I think this scenario might still be supported by Spark running on Mesos or YARN (Hadoop 2). Even your GPU scenario could be supported.
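For example, a job can be capped at whatever your weakest machine can offer at submission time. A minimal sketch, assuming a coarse-grained Mesos deployment (the app name, master URL, and resource values are placeholders, not recommendations):

    import org.apache.spark.{SparkConf, SparkContext}

    // Minimal sketch: cap this job's per-executor memory and total core
    // count at what the weakest machine provides. Values are placeholders.
    val conf = new SparkConf()
      .setAppName("capped-job")                 // hypothetical name
      .setMaster("mesos://master-host:5050")    // placeholder master URL
      .set("spark.mesos.coarse", "true")        // coarse-grained Mesos mode
      .set("spark.executor.memory", "4g")       // <= the smallest node's RAM
      .set("spark.cores.max", "16")             // total cores this job may claim
    val sc = new SparkContext(conf)

With this in place, the same configuration is safe on every machine, and Mesos handles the actual placement.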
Check out the following resources:

* https://spark.apache.org/docs/latest/running-on-mesos.html
* http://mesos.berkeley.edu/mesos_tech_report.pdf

Best regards,
Jörn

On Thu, Aug 21, 2014 at 5:42 PM, anthonyjschu...@gmail.com
<anthonyjschu...@gmail.com> wrote:

> Jörn, thanks for the post...
>
> Unfortunately, I am stuck with the hardware I have and might not be
> able to get budget allocated for a new stack of servers when I've
> already got so many "ok" servers on hand... And even more
> unfortunately, a large subset of these machines are... shall we say...
> extremely humble in their CPUs and RAM. My group has exclusive access
> to the machines, and rarely do we need to run concurrent jobs -- what
> I really want is maximum capacity per job. The applications are
> massive machine-learning experiments, so I'm not sure about the
> feasibility of breaking them up into concurrent jobs. At this point,
> I am seriously considering dropping down to Akka-level programming.
> Why, oh why, doesn't Spark allow for allocating a variable number of
> worker threads per host? That would seem to be the correct point of
> abstraction for building massive clusters out of "on-hand" hardware
> (the scheduler probably wouldn't have to change at all).
>
> On Thu, Aug 21, 2014 at 9:25 AM, Jörn Franke [via Apache Spark User
> List] <[hidden email]> wrote:
>
> > Hi,
> >
> > Well, you could use Mesos or YARN to define resources per job: you
> > can give each job only as many resources (cores, memory, etc.) per
> > machine as your "worst" machine has, and the rest is done by Mesos
> > or YARN. By doing this you avoid per-machine resource assignment
> > without any disadvantages -- you can run other jobs in parallel
> > without any problems, and older machines won't get overloaded.
> >
> > However, you should take care that your cluster does not get too
> > heterogeneous.
> >
> > Best regards,
> > Jörn
> >
> > On 21 Aug 2014 at 16:55, "[hidden email]" <[hidden email]> wrote:
> >
> >> I've got a stack of Dell commodity servers -- 8 to 32 GB of RAM
> >> and a single or dual quad-core processor per machine. I think I
> >> will have them loaded with CentOS. Eventually, I may want to add
> >> GPUs on the nodes to handle linear algebra operations...
> >>
> >> My idea has been:
> >>
> >> 1) to find a way to configure Spark to allocate different
> >> resources per machine, per job -- or at least to have a "standard
> >> executor" and allow different machines to host different numbers
> >> of executors.
> >>
> >> 2) to make (using vanilla Spark) a pre-run optimization phase that
> >> benchmarks the throughput of each node (per its hardware) and
> >> repartitions the dataset to use the hardware more efficiently,
> >> rather than relying on Spark speculation -- which has always
> >> seemed a suboptimal way to balance the load across several
> >> differing machines.
>
> --
> A N T H O N Y Ⓙ S C H U L T E
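For reference, idea (1) in the quoted original post roughly amounts to fixing one small "standard executor" unit and letting the cluster manager pack more units onto the larger machines. A minimal sketch, assuming YARN (the app name and resource values are placeholders, not recommendations):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch of idea (1): one small, fixed "standard executor" unit, so
    // a 32 GB machine can host several units while an 8 GB machine
    // hosts one. All values are placeholders.
    val conf = new SparkConf()
      .setAppName("standard-executor-job")     // hypothetical name
      .set("spark.executor.memory", "2g")      // size of one executor unit
      .set("spark.executor.instances", "20")   // total units; YARN-only setting
    val sc = new SparkContext(conf)

Each NodeManager's configured capacity then determines how many of these units a given box actually hosts, so the heterogeneity is absorbed by placement rather than by per-machine Spark settings.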
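Idea (2) can also be approximated without changing Spark itself: run a short CPU-bound probe on each node, derive a speed ratio, and oversize the partition count so faster machines pull proportionally more work. A rough sketch, assuming the probe tasks land on every host (which the scheduler does not guarantee); benchmarkWork, basePartitions, and the input path are hypothetical placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("prerun-benchmark"))

    // A fixed CPU-bound probe; returns the seconds it took on this node.
    def benchmarkWork(): Double = {
      val start = System.nanoTime
      var acc = 0.0
      var i = 0
      while (i < 10000000) { acc += math.sqrt(i.toDouble); i += 1 }
      (System.nanoTime - start) / 1e9
    }

    // Run one probe per partition and record (hostname, seconds).
    val timings = sc.parallelize(1 to 16, 16).mapPartitions { _ =>
      Iterator((java.net.InetAddress.getLocalHost.getHostName, benchmarkWork()))
    }.collect()

    // Crude speed ratio between the fastest and slowest hosts seen.
    val bestPerHost = timings.groupBy(_._1).mapValues(ts => ts.map(_._2).min)
    val ratio = bestPerHost.values.max / bestPerHost.values.min

    // Oversize the partition count by the ratio, so fast nodes, which
    // finish partitions sooner, end up processing more of them.
    val basePartitions = 64                       // placeholder
    val tuned = math.max(basePartitions, (basePartitions * ratio).toInt)
    val data = sc.textFile("hdfs:///data/input")  // placeholder path
      .repartition(tuned)

Note this does not pin specific partitions to specific machines (short of a custom RDD with preferred locations, stock Spark offers no direct way to do that); it only exploits the fact that smaller partitions let faster nodes take on more of them.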