Hi all,

Thanks for your feedback! I uploaded a SPIP doc for the barrier scheduling feature at https://issues.apache.org/jira/browse/SPARK-24374. Please take a look and leave your comments there. I had some offline discussions with Xingbo Jiang <xingbo.ji...@databricks.com>, who helped me design the APIs. He is quite familiar with the Spark job scheduler and will share some design ideas on the JIRA.
I will work on SPIPs for the other two proposals: 1) fast data exchange, 2) accelerator-aware scheduling. I definitely need some help with the second one because I'm not familiar with YARN/Mesos/k8s.

Best,
Xiangrui

On Sun, May 20, 2018 at 8:19 PM Felix Cheung <felixcheun...@hotmail.com> wrote:

> Very cool. We would be very interested in this.
>
> What is the plan forward to make progress in each of the three areas?
>
> ------------------------------
> *From:* Bryan Cutler <cutl...@gmail.com>
> *Sent:* Monday, May 14, 2018 11:37:20 PM
> *To:* Xiangrui Meng
> *Cc:* Reynold Xin; dev
> *Subject:* Re: Integrating ML/DL frameworks with Spark
>
> Thanks for starting this discussion. I'd also like to see some improvements in this area, and I'm glad to hear that the Pandas UDF / Arrow functionality might be useful. I'm wondering whether your initial investigations found anything lacking in the Arrow format, or possible improvements that would simplify the data representation? Also, while data could be handed off in a UDF, would it make sense to also discuss a more formal way to externalize the data that would also work for the Scala API?
>
> Thanks,
> Bryan
>
> On Wed, May 9, 2018 at 4:31 PM, Xiangrui Meng <m...@databricks.com> wrote:
>
>> Shivaram: Yes, we can call it "gang scheduling" or "barrier synchronization". Spark doesn't support it today. The proposal is to add proper support to Spark's job scheduler so that we can integrate well with MPI-like frameworks.
>>
>> On Tue, May 8, 2018 at 11:17 AM Nan Zhu <zhunanmcg...@gmail.com> wrote:
>>
>>> .....how I skipped the last part........
>>>
>>> On Tue, May 8, 2018 at 11:16 AM, Reynold Xin <r...@databricks.com> wrote:
>>>
>>>> Yes, Nan, totally agree. To be on the same page, that's exactly what I wrote, wasn't it?
>>>> On Tue, May 8, 2018 at 11:14 AM Nan Zhu <zhunanmcg...@gmail.com> wrote:
>>>>
>>>>> Besides that, one capability that multiple frameworks need is scheduling tasks in a single wave.
>>>>>
>>>>> I.e., if a framework like xgboost/mxnet requires 50 parallel workers, Spark should guarantee that either all 50 tasks run at once, or the whole application/job is aborted after some timeout period.
>>>>>
>>>>> Best,
>>>>>
>>>>> Nan
>>>>>
>>>>> On Tue, May 8, 2018 at 11:10 AM, Reynold Xin <r...@databricks.com> wrote:
>>>>>
>>>>>> I think that's what Xiangrui was referring to. Instead of retrying a single task, retry the entire stage, and all of the stage's tasks need to be scheduled at once.
>>>>>>
>>>>>> On Tue, May 8, 2018 at 8:53 AM Shivaram Venkataraman <shiva...@eecs.berkeley.edu> wrote:
>>>>>>
>>>>>>>>> - Fault tolerance and execution model: Spark assumes fine-grained task recovery, i.e. if something fails, only that task is rerun. This doesn't match the execution model of distributed ML/DL frameworks, which are typically MPI-based; rerunning a single task would leave the entire system hanging. A whole stage needs to be re-run.
>>>>>>>>
>>>>>>>> This is not only useful for integrating with 3rd-party frameworks, but also for scaling MLlib algorithms. One of my earliest attempts in Spark MLlib was to implement an All-Reduce primitive (SPARK-1485 <https://issues.apache.org/jira/browse/SPARK-1485>). But we ended up with some compromised solutions. With the new execution model, we can set up a hybrid cluster and do all-reduce properly.
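[Editor's aside: the all-reduce primitive discussed above can be sketched in a few lines of plain Python. This is an illustration of the semantics only, not Spark or MLlib code; the function name `all_reduce` and the data shapes are made up for the example. Every worker ends up holding the same reduced value, which is the property an MPI-style integration relies on.]

```python
# Illustrative only: a plain-Python sketch of all-reduce semantics,
# not actual Spark/MLlib code. Each "worker" holds a local gradient
# vector; after all-reduce, every worker holds the element-wise sum
# of all vectors.

def all_reduce(worker_vectors):
    """Naive all-reduce: reduce all local vectors, then broadcast the
    result back so every worker holds an identical copy."""
    n = len(worker_vectors[0])
    total = [0.0] * n
    for vec in worker_vectors:             # reduce phase
        for i, v in enumerate(vec):
            total[i] += v
    return [list(total) for _ in worker_vectors]   # broadcast phase

# Four workers, each with a local gradient.
gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
result = all_reduce(gradients)
# Every worker now sees the same summed gradient, [16.0, 20.0].
```

A real implementation (e.g. a ring all-reduce) would exchange chunks between neighbors to avoid a central bottleneck, but the contract is the same: all workers participate, and all receive the identical reduced result.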
>>>>>>>
>>>>>>> Is there a particular new execution model you are referring to, or do we plan to investigate a new execution model? For the MPI-like model, we also need gang scheduling (i.e. schedule all tasks at once or none of them), and I don't think we have support for that in the scheduler right now.

--
Xiangrui Meng
Software Engineer
Databricks Inc. <http://databricks.com/>
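[Editor's aside: the gang-scheduling / "single wave" semantics Nan and Shivaram describe can be modeled in a small Python sketch. This is a toy simulation of the policy only, not Spark scheduler code; the names `schedule_gang` and `free_slots` are invented for the example. The key property: a job either gets all of its slots at once, or is aborted after a timeout rather than run partially.]

```python
# Illustrative only: a toy model of gang scheduling ("single wave"),
# not Spark scheduler code. A gang job either acquires all of its
# task slots together, or gives up after a timeout instead of
# running a partial wave.

import time

def schedule_gang(num_tasks, free_slots, timeout_s, poll_s=0.01):
    """Return True if free_slots() reports at least num_tasks slots
    before the timeout expires; otherwise return False, signaling
    that the job should be aborted rather than run partially."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if free_slots() >= num_tasks:
            return True          # launch all tasks in one wave
        time.sleep(poll_s)       # wait for more slots to free up
    return False                 # never launch a partial wave

# A cluster with only 32 slots cannot gang-schedule 50 workers:
assert schedule_gang(50, lambda: 32, timeout_s=0.05) is False
# ...but it can run 32 workers in a single wave:
assert schedule_gang(32, lambda: 32, timeout_s=0.05) is True
```

This is also why, as Reynold notes above, failure recovery has to retry the whole stage: replacing a single task of an MPI-style job is meaningless when the surviving tasks are blocked waiting on it.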