+1 (non-binding)

Great work!

On Tue, Jun 18, 2019 at 6:22 AM Vinoo Ganesh <vgan...@palantir.com> wrote:
> +1 (non-binding).
>
> Thanks for pushing this forward, Matt and Yifei.
>
> *From: *Felix Cheung <felixcheun...@hotmail.com>
> *Date: *Tuesday, June 18, 2019 at 00:01
> *To: *Yinan Li <liyinan...@gmail.com>, "rb...@netflix.com" <rb...@netflix.com>
> *Cc: *Dongjoon Hyun <dongjoon.h...@gmail.com>, Saisai Shao <sai.sai.s...@gmail.com>, Imran Rashid <im...@therashids.com>, Ilan Filonenko <i...@cornell.edu>, bo yang <bobyan...@gmail.com>, Matt Cheah <mch...@palantir.com>, Spark Dev List <dev@spark.apache.org>, "Yifei Huang (PD)" <yif...@palantir.com>, Vinoo Ganesh <vgan...@palantir.com>, Imran Rashid <iras...@cloudera.com>
> *Subject: *Re: [VOTE][SPARK-25299] SPIP: Shuffle Storage API
>
> +1
>
> Glad to see the progress in this space - it’s been more than a year since the original discussion and effort started.
>
> ------------------------------
> *From:* Yinan Li <liyinan...@gmail.com>
> *Sent:* Monday, June 17, 2019 7:14:42 PM
> *To:* rb...@netflix.com
> *Cc:* Dongjoon Hyun; Saisai Shao; Imran Rashid; Ilan Filonenko; bo yang; Matt Cheah; Spark Dev List; Yifei Huang (PD); Vinoo Ganesh; Imran Rashid
> *Subject:* Re: [VOTE][SPARK-25299] SPIP: Shuffle Storage API
>
> +1 (non-binding)
>
> On Mon, Jun 17, 2019 at 1:58 PM Ryan Blue <rb...@netflix.com.invalid> wrote:
>
> +1 (non-binding)
>
> On Sun, Jun 16, 2019 at 11:11 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>
> +1
>
> Bests,
> Dongjoon.
>
> On Sun, Jun 16, 2019 at 9:41 PM Saisai Shao <sai.sai.s...@gmail.com> wrote:
>
> +1 (binding)
>
> Thanks
> Saisai
>
> On Sat, Jun 15, 2019 at 3:46 AM Imran Rashid <im...@therashids.com> wrote:
>
> +1 (binding)
>
> I think this is a really important feature for Spark.
>
> First, there is already a lot of interest in alternative shuffle storage in the community.
> This interest ranges from dynamic allocation in Kubernetes to simply improving stability in standard on-premise use of Spark. However, people are often stuck doing this in forks of Spark, and in ways that are either not maintainable (because they copy-paste many Spark internals) or incorrect (because they do not correctly handle speculative execution & stage retries).
>
> Second, I think the specific proposal strikes the right balance between flexibility and complexity, to allow incremental improvements. A lot of work has been put into this already to try to figure out which pieces are essential to make alternative shuffle storage implementations feasible.
>
> Of course, that means it doesn't include everything imaginable; some things still aren't supported, and some will still choose to use the older ShuffleManager API to retain total control over all of shuffle. But we know there is a reasonable set of things which can be implemented behind the API as the first step, and it can continue to evolve.
>
> On Fri, Jun 14, 2019 at 12:13 PM Ilan Filonenko <i...@cornell.edu> wrote:
>
> +1 (non-binding). This API is versatile and flexible enough to handle Bloomberg's internal use-cases. The ability for us to vary implementation strategies is quite appealing. It is also worth noting the minimal changes to Spark core needed to make it work. This is a much-needed addition to the Spark shuffle story.
>
> On Fri, Jun 14, 2019 at 9:59 AM bo yang <bobyan...@gmail.com> wrote:
>
> +1 This is great work, allowing different sort shuffle write/read implementations to be plugged in! Also great to see it retain the current Spark configuration (spark.shuffle.manager=org.apache.spark.shuffle.YourShuffleManagerImpl).
>
> On Thu, Jun 13, 2019 at 2:58 PM Matt Cheah <mch...@palantir.com> wrote:
>
> Hi everyone,
>
> I would like to call a vote for the SPIP for SPARK-25299 [issues.apache.org] <https://urldefense.proofpoint.com/v2/url?u=https-3A__issues.apache.org_jira_browse_SPARK-2D25299&d=DwMFJg&c=izlc9mHr637UR4lpLEZLFFS3Vn2UXBrZ4tFb6oOnmz8&r=7WzLIMu3WvZwd6AMPatqn1KZW39eI6c_oflAHIy1NUc&m=UG2t14gfU8QHfoj4tUD__9bIVg1xxTM3R8GHmvMUXTU&s=LS6AKX38P5DW6ffk9u5MUvRBEAlAHiA3Ud2KODpWkQU&e=>, which proposes to introduce a pluggable storage API for temporary shuffle data.
>
> You may find the SPIP document here [docs.google.com] <https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.google.com_document_d_1d6egnL6WHOwWZe8MWv3m8n4PToNacdx7n-5F0iMSWwhCQ_edit&d=DwMFJg&c=izlc9mHr637UR4lpLEZLFFS3Vn2UXBrZ4tFb6oOnmz8&r=7WzLIMu3WvZwd6AMPatqn1KZW39eI6c_oflAHIy1NUc&m=UG2t14gfU8QHfoj4tUD__9bIVg1xxTM3R8GHmvMUXTU&s=rCSgQGD6L4of4oa0QxiTJ8IPaVdGlZVarhA4-QvO80Q&e=>.
>
> The discussion thread for the SPIP was conducted here [lists.apache.org] <https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_2fe82b6b86daadb1d2edaef66a2d1c4dd2f45449656098ee38c50079-40-253Cdev.spark.apache.org-253E&d=DwMFJg&c=izlc9mHr637UR4lpLEZLFFS3Vn2UXBrZ4tFb6oOnmz8&r=7WzLIMu3WvZwd6AMPatqn1KZW39eI6c_oflAHIy1NUc&m=UG2t14gfU8QHfoj4tUD__9bIVg1xxTM3R8GHmvMUXTU&s=kSJizQH7v4OHG6D7aVsLA-m0ApZxOa24CzHZv1EzLxg&e=>.
>
> Please vote on whether or not this proposal is agreeable to you.
>
> Thanks!
>
> -Matt Cheah
>
> --
> Ryan Blue
> Software Engineer
> Netflix

--
John