+1 (binding)

Thanks
Saisai

On Sat, Jun 15, 2019 at 3:46 AM Imran Rashid <im...@therashids.com> wrote:

> +1 (binding)
>
> I think this is a really important feature for Spark.
>
> First, there is already a lot of interest in alternative shuffle storage
> in the community, from dynamic allocation in Kubernetes, to even just
> improving stability in standard on-premise use of Spark. However, people
> are often stuck doing this in forks of Spark, and in ways that are not
> maintainable (because they copy-paste many Spark internals) or are
> incorrect (for not correctly handling speculative execution & stage
> retries).
>
> Second, I think the specific proposal strikes the right balance between
> flexibility and too much complexity, to allow incremental improvements.
> A lot of work has been put into this already to try to figure out which
> pieces are essential to make alternative shuffle storage implementations
> feasible.
>
> Of course, that means it doesn't include everything imaginable; some
> things still aren't supported, and some will still choose to use the older
> ShuffleManager API to give total control over all of shuffle. But we know
> there is a reasonable set of things which can be implemented behind the
> API as the first step, and it can continue to evolve.
>
> On Fri, Jun 14, 2019 at 12:13 PM Ilan Filonenko <i...@cornell.edu> wrote:
>
>> +1 (non-binding). This API is versatile and flexible enough to handle
>> Bloomberg's internal use-cases. The ability for us to vary implementation
>> strategies is quite appealing. It is also worth noting the minimal changes
>> to Spark core needed to make it work. This is a much-needed addition
>> to the Spark shuffle story.
>>
>> On Fri, Jun 14, 2019 at 9:59 AM bo yang <bobyan...@gmail.com> wrote:
>>
>>> +1 This is great work, allowing different sort shuffle write/read
>>> implementations to be plugged in! Also great to see it retain the
>>> current Spark configuration
>>> (spark.shuffle.manager=org.apache.spark.shuffle.YourShuffleManagerImpl).
>>>
>>> On Thu, Jun 13, 2019 at 2:58 PM Matt Cheah <mch...@palantir.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I would like to call a vote for the SPIP for SPARK-25299
>>>> <https://issues.apache.org/jira/browse/SPARK-25299>, which proposes to
>>>> introduce a pluggable storage API for temporary shuffle data.
>>>>
>>>> You may find the SPIP document here
>>>> <https://docs.google.com/document/d/1d6egnL6WHOwWZe8MWv3m8n4PToNacdx7n_0iMSWwhCQ/edit>.
>>>>
>>>> The discussion thread for the SPIP was conducted here
>>>> <https://lists.apache.org/thread.html/2fe82b6b86daadb1d2edaef66a2d1c4dd2f45449656098ee38c50079@%3Cdev.spark.apache.org%3E>.
>>>>
>>>> Please vote on whether or not this proposal is agreeable to you.
>>>>
>>>> Thanks!
>>>>
>>>> -Matt Cheah