Got it. The explain makes sense. Thank you.
On Thu, Jan 8, 2015 at 1:06 PM, Patrick Wendell [via Apache Spark Developers List] <[email protected]> wrote: > This question is conflating a few different concepts. I think the main > question is whether Spark will have a shuffle implementation that > streams data rather than persisting it to disk/cache as a buffer. > Spark currently decouples the shuffle write from the read using > disk/OS cache as a buffer. The two benefits of this approach this are > that it allows intra-query fault tolerance and it makes it easier to > elastically scale and reschedule work within a job. We consider these > to be design requirements (think about jobs that run for several hours > on hundreds of machines). Impala, and similar systems like dremel and > f1, not offer fault tolerance within a query at present. They also > require gang scheduling the entire set of resources that will exist > for the duration of a query. > > A secondary question is whether our shuffle should have a barrier or > not. Spark's shuffle currently has a hard barrier between map and > reduce stages. We haven't seen really strong evidence that removing > the barrier is a net win. It can help the performance of a single job > (modestly), but in the a multi-tenant workload, it leads to poor > utilization since you have a lot of reduce tasks that are taking up > slots waiting for mappers to finish. Many large scale users of > Map/Reduce disable this feature in production clusters for that > reason. Thus, we haven't seen compelling evidence for removing the > barrier at this point, given the complexity of doing so. > > It is possible that future versions of Spark will support push-based > shuffles, potentially in a mode that remove some of Spark's fault > tolerance properties. But there are many other things we can still > optimize about the shuffle that would likely come before this. > > - Patrick > > On Wed, Jan 7, 2015 at 6:01 PM, 曹雪林 <[hidden email] > <http:///user/SendEmail.jtp?type=node&node=10029&i=0>> wrote: > > > Hi, > > > > I've heard a lot of complain about spark's "pull" style shuffle. > Is > > there any plan to support "push" style shuffle in the near future? > > > > Currently, the shuffle phase must be completed before the next > stage > > starts. While, it is said, in Impala, the shuffled data is "streamed" to > > the next stage handler, which greatly saves time. Will spark support > this > > mechanism one day? > > > > Thanks > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [hidden email] > <http:///user/SendEmail.jtp?type=node&node=10029&i=1> > For additional commands, e-mail: [hidden email] > <http:///user/SendEmail.jtp?type=node&node=10029&i=2> > > > > ------------------------------ > If you reply to this email, your message will be added to the discussion > below: > > http://apache-spark-developers-list.1001551.n3.nabble.com/Fwd-When-will-spark-support-push-style-shuffle-tp10028p10029.html > To start a new topic under Apache Spark Developers List, email > [email protected] > To unsubscribe from Apache Spark Developers List, click here > <http://apache-spark-developers-list.1001551.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=1&code=eHVlbGluY2FvMjAxNEBnbWFpbC5jb218MXwtOTc3NDY2MzAy> > . > NAML > <http://apache-spark-developers-list.1001551.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> > -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Fwd-When-will-spark-support-push-style-shuffle-tp10028p10031.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
