Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-28 Thread Ryan Blue
-Matt Cheah From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Tuesday, February 26, 2019 at 4:53 PM To: Matt Cheah

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-27 Thread Wenchen Fan
From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Tuesday, February 26, 2019 at 4:53 PM To: Matt Cheah Cc: Sean Owen, Wenchen Fan, Xiao Li, Matei Zaharia, Spark Dev List

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-27 Thread Ryan Blue
-Matt Cheah From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Tuesday, February 26, 2019 at 4:53 PM To: Matt Cheah Cc: Sean Owen, Wenchen Fan, Xiao

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Matei Zaharia
e that for Spark 4? -Matt Cheah From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Tuesday, February 26, 2019 at 4:53 PM To: Matt Cheah Cc: Sean Owen, Wenchen Fan, Xiao Li, Matei Zaha

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Reynold Xin
-Matt Cheah *From: *Ryan Blue *Reply-To: *"rb...@netflix.com" *Date: *Tuesday, February 26, 2019 at 4:53 PM *To: *Matt Cheah *Cc: *Sean Owen, Wenchen Fan, Xiao Li, Matei Zaharia, Spark Dev List *Subject: *Re: [DISCU

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Matt Cheah
ary 26, 2019 at 4:40 PM To: Matt Cheah Cc: Sean Owen, Wenchen Fan, Xiao Li, Matei Zaharia, Spark Dev List Subject: Re: [DISCUSS] Spark 3.0 and DataSourceV2 Thanks for bumping this, Matt. I think we can have the discussion here to clarify exactly what we’re committing to and then hav

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Ryan Blue
ebruary 26, 2019 at 4:40 PM *To: *Matt Cheah *Cc: *Sean Owen, Wenchen Fan, Xiao Li, Matei Zaharia, Spark Dev List *Subject: *Re: [DISCUSS] Spark 3.0 and DataSourceV2 Thanks for bumping this, Matt. I think we can have the discussion here to clari

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Ryan Blue
Thanks for bumping this, Matt. I think we can have the discussion here to clarify exactly what we’re committing to and then have a vote thread once we’re agreed. Getting back to the DSv2 discussion, I think we have a good handle on what would be added: - Plugin system for catalogs -

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Matt Cheah
ia, Spark Dev List Subject: Re: [DISCUSS] Spark 3.0 and DataSourceV2 Thanks for bumping this, Matt. I think we can have the discussion here to clarify exactly what we’re committing to and then have a vote thread once we’re agreed. Getting back to the DSv2 discussion, I think we have a g

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Matt Cheah
What would then be the next steps we'd take to collectively decide on plans and timelines moving forward? Might I suggest scheduling a conference call with appropriate PMCs to put our ideas together? Maybe such a discussion can take place at next week's meeting? Or do we need to have a separate

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-24 Thread Sean Owen
Sure, I don't read anyone making these statements though? Let's assume good intent, that "foo should happen" as "my opinion as a member of the community, which is not solely up to me, is that foo should happen". I understand it's possible for a person to make their opinion over-weighted; this

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-24 Thread Mark Hamstra
> I’m not quite sure what you mean here.
I'll try to explain once more, then I'll drop it since continuing the rest of the discussion in this thread is more important than getting side-tracked. There is nothing wrong with individuals advocating for what they think should or should not be in

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-24 Thread Ryan Blue
Thanks to Matt for his philosophical take. I agree. The intent is to set a common goal, so that we work toward getting v2 in a usable state as a community. Part of that is making choices to get it done on time, which we have already seen on this thread: setting out more clearly what we mean by

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-22 Thread Mark Hamstra
> To your other message: I already see a number of PMC members here. Who's the other entity?
I'll answer indirectly since pointing fingers isn't really my intent. In the absence of a PMC vote, I react negatively to individuals making new declarative policy statements or statements to the

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-22 Thread Sean Owen
To your other message: I already see a number of PMC members here. Who's the other entity? The PMC is the thing that says a thing is a release, sure, but this discussion is properly a community one. And here we are, this is lovely to see. (May I remind everyone to casually, sometime, browse the

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-21 Thread Ryan Blue
also the features that have remained open for the longest time and we really need to move forward on these. Putting a target release for 3.0 will help in that regard. -Matt Cheah *From: *Ryan Blue *Reply-To: *"rb...@netflix.com" *D

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-21 Thread Matt Cheah
ebruary 21, 2019 at 2:22 PM To: Matei Zaharia Cc: Spark Dev List Subject: Re: [DISCUSS] Spark 3.0 and DataSourceV2 I'm all for making releases more often if we want. But this work could really use a target release to motivate getting it done. If we agree that it will block a release, the

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-21 Thread Ryan Blue
I'm all for making releases more often if we want. But this work could really use a target release to motivate getting it done. If we agree that it will block a release, then everyone is motivated to review and get the PRs in. If this work doesn't make it in the 3.0 release, I'm not confident

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-21 Thread Matei Zaharia
How large would the delay be? My 2 cents are that there’s nothing stopping us from making feature releases more often if we want to, so we shouldn’t see this as an “either delay 3.0 or release in >6 months” decision. If the work is likely to get in with a small delay and simplifies our work