+1 to votes to approve proposals. I agree that proposals should have an official mechanism to be accepted, and a vote is an established means of doing that well. I like that it includes a period to review the proposal and I think proposals should have been discussed enough ahead of a vote to survive the possibility of a veto.
I also like the names that are short and (mostly) unique, like SEP.

Where I disagree is with the requirement that a committer must formally propose an enhancement. I don't see the value of restricting this: if someone has the will to write up a proposal then they should be encouraged to do so and start a discussion about it. Even if there is a political reality as Cody says, what is the value of codifying that in our process? I think restricting who can submit proposals would only undermine them by pushing contributors out.

Maybe I'm missing something here?

rb

On Mon, Oct 10, 2016 at 7:41 AM, Cody Koeninger <c...@koeninger.org> wrote:

> Yes, users suggesting SIPs is a good thing and is explicitly called out in the linked document under the Who? section. Formally proposing them, not so much, because of the political realities.
>
> Yes, implementation strategy definitely affects goals. There are all kinds of examples of this, I'll pick one that's my fault so as to avoid sounding like I'm blaming:
>
> When I implemented the Kafka DStream, one of my (not explicitly agreed upon by the community) goals was to make sure people could use the DStream with however they were already using Kafka at work. The lack of explicit agreement on that goal led to all kinds of fighting with committers, that could have been avoided. The lack of explicit up-front strategy discussion led to the DStream not really working with compacted topics. I knew about compacted topics, but don't have a use for them, so had a blind spot there. If there was explicit up-front discussion that my strategy was "assume that batches can be defined on the driver solely by beginning and ending offsets", there's a greater chance that a user would have seen that and said, "hey, what about non-contiguous offsets in a compacted topic".
>
> This kind of thing is only going to happen smoothly if we have a lightweight user-visible process with clear outcomes.
>
> On Mon, Oct 10, 2016 at 1:34 AM, assaf.mendelson <assaf.mendel...@rsa.com> wrote:
> > I agree with most of what Cody said.
> >
> > Two things:
> >
> > First we can always have other people suggest SIPs but mark them as “unreviewed” and have committers basically move them forward. The problem is that writing a good document takes time. This way we can leverage non-committers to do some of this work (it is just another way to contribute).
> >
> > As for strategy, in many cases implementation strategy can affect the goals. I will give a small example: In the current structured streaming strategy, we group by the time to achieve a sliding window. This is definitely an implementation decision and not a goal. However, I can think of several aggregation functions which have the time inside their calculation buffer. For example, let’s say we want to return a set of all distinct values. One way to implement this would be to make the set into a map and have the value contain the last time seen. Multiplying it across the groupby would cost a lot in performance. So adding such a strategy would have a great effect on the type of aggregations and their performance which does affect the goal. Without adding the strategy, it is easy for whoever goes to the design document to not think about these cases. Furthermore, it might be decided that these cases are rare enough so that the strategy is still good enough but how would we know it without user feedback?
> >
> > I believe this example is exactly what Cody was talking about. Since many times implementation strategies have a large effect on the goal, we should have it discussed when discussing the goals. In addition, while it is often easy to throw out completely infeasible goals, it is often much harder to figure out that the goals are infeasible without fine tuning.
> >
> > Assaf.
> >
> > From: Cody Koeninger-2 [via Apache Spark Developers List] [mailto:ml-node+[hidden email]]
> > Sent: Monday, October 10, 2016 2:25 AM
> > To: Mendelson, Assaf
> > Subject: Re: Spark Improvement Proposals
> >
> > Only committers should formally submit SIPs because in an Apache project only committers have explicit political power. If a user can't find a committer willing to sponsor an SIP idea, they have no way to get the idea passed in any case. If I can't find a committer to sponsor this meta-SIP idea, I'm out of luck.
> >
> > I do not believe unrealistic goals can be found solely by inspection. We've managed to ignore unrealistic goals even after implementation! Focusing on APIs can allow people to think they've solved something, when there's really no way of implementing that API while meeting the goals. Rapid iteration is clearly the best way to address this, but we've already talked about why that hasn't really worked. If adding a non-binding API section to the template is important to you, I'm not against it, but I don't think it's sufficient.
> >
> > On your PRD vs design doc spectrum, I'm saying this is closer to a PRD. Clear agreement on goals is the most important thing and that's why it's the thing I want binding agreement on. But I cannot agree to goals unless I have enough minimal technical info to judge whether the goals are likely to actually be accomplished.
> >
> > On Sun, Oct 9, 2016 at 5:35 PM, Matei Zaharia <[hidden email]> wrote:
> >
> >> Well, I think there are a few things here that don't make sense. First, why should only committers submit SIPs? Development in the project should be open to all contributors, whether they're committers or not.
> >> Second, I think unrealistic goals can be found just by inspecting the goals, and I'm not super worried that we'll accept a lot of SIPs that are then infeasible -- we can then submit new ones. But this depends on whether you want this process to be a "design doc lite", where people also agree on implementation strategy, or just a way to agree on goals. This is what I asked earlier about PRDs vs design docs (and I'm open to either one but I'd just like clarity). Finally, both as a user and designer of software, I always want to give feedback on APIs, so I'd really like a culture of having those early. People don't argue about prettiness when they discuss APIs, they argue about the core concepts to expose in order to meet various goals, and then they're stuck maintaining those for a long time.
> >>
> >> Matei
> >>
> >> On Oct 9, 2016, at 3:10 PM, Cody Koeninger <[hidden email]> wrote:
> >>
> >> Users instead of people, sure. Committers and contributors are (or at least should be) a subset of users.
> >>
> >> Non goals, sure. I don't care what the name is, but we need to clearly say e.g. 'no we are not maintaining compatibility with XYZ right now'.
> >>
> >> API, what I care most about is whether it allows me to accomplish the goals. Arguing about how ugly or pretty it is can be saved for design/implementation imho.
> >>
> >> Strategy, this is necessary because otherwise goals can be out of line with reality. Don't propose goals you don't have at least some idea of how to implement.
> >>
> >> Rejected strategies, given that committers are the only ones I'm saying should formally submit SPARKLIs or SIPs, if they put junk in a required section then slap them down for it and tell them to fix it.
> >>
> >>
> >> On Oct 9, 2016 4:36 PM, "Matei Zaharia" <[hidden email]> wrote:
> >>>
> >>> Yup, this is the stuff that I found unclear. Thanks for clarifying here, but we should also clarify it in the writeup. In particular:
> >>>
> >>> - Goals needs to be about user-facing behavior ("people" is broad)
> >>>
> >>> - I'd rename Rejected Goals to Non-Goals. Otherwise someone will dig up one of these and say "Spark's developers have officially rejected X, which our awesome system has".
> >>>
> >>> - For user-facing stuff, I think you need a section on API. Virtually all other *IPs I've seen have that.
> >>>
> >>> - I'm still not sure why the strategy section is needed if the purpose is to define user-facing behavior -- unless this is the strategy for setting the goals or for defining the API. That sounds squarely like a design doc issue. In some sense, who cares whether the proposal is technically feasible right now? If it's infeasible, that will be discovered later during design and implementation. Same thing with rejected strategies -- listing some of those is definitely useful sometimes, but if you make this a *required* section, people are just going to fill it in with bogus stuff (I've seen this happen before).
> >>>
> >>> Matei
> >>>
> >>> > On Oct 9, 2016, at 2:14 PM, Cody Koeninger <[hidden email]> wrote:
> >>> >
> >>> > So to focus the discussion on the specific strategy I'm suggesting, documented at
> >>> >
> >>> > https://github.com/koeninger/spark-1/blob/SIP-0/docs/spark-improvement-proposals.md
> >>> >
> >>> > "Goals: What must this allow people to do, that they can't currently?"
> >>> >
> >>> > Is it unclear that this is focusing specifically on people-visible behavior?
> >>> >
> >>> > Rejected goals - are important because otherwise people keep trying to argue about scope.
> >>> > Of course you can change things later with a different SIP and different vote, the point is to focus.
> >>> >
> >>> > Use cases - are something that people are going to bring up in discussion. If they aren't clearly documented as a goal ("This must allow me to connect using SSL"), they should be added.
> >>> >
> >>> > Internal architecture - if the people who need specific behavior are implementers of other parts of the system, that's fine.
> >>> >
> >>> > Rejected strategies - If you have none of these, you have no evidence that the proponent didn't just go with the first thing they had in mind (or have already implemented), which is a big problem currently. Approval isn't binding as to specifics of implementation, so these aren't handcuffs. The goals are the contract, the strategy is evidence that contract can actually be met.
> >>> >
> >>> > Design docs - I'm not touching design docs. The markdown file I linked specifically says of the strategy section "This is not a full design document." Is this unclear? Design docs can be worked on obviously, but that's not what I'm concerned with here.
> >>> >
> >>> > On Sun, Oct 9, 2016 at 2:34 PM, Matei Zaharia <[hidden email]> wrote:
> >>> >> Hi Cody,
> >>> >>
> >>> >> I think this would be a lot more concrete if we had a more detailed template for SIPs. Right now, it's not super clear what's in scope -- e.g. are they a way to solicit feedback on the user-facing behavior or on the internals? "Goals" can cover both things. I've been thinking of SIPs more as Product Requirements Docs (PRDs), which focus on *what* a code change should do as opposed to how.
> >>> >>
> >>> >> In particular, here are some things that you may or may not consider in scope for SIPs:
> >>> >>
> >>> >> - Goals and non-goals: This is definitely in scope, and IMO should focus on user-visible behavior (e.g. "system supports SQL window functions" or "system continues working if one node fails"). BTW I wouldn't say "rejected goals" because some of them might become goals later, so we're not definitively rejecting them.
> >>> >>
> >>> >> - Public API: Probably should be included in most SIPs unless it's too large to fully specify then (e.g. "let's add an ML library").
> >>> >>
> >>> >> - Use cases: I usually find this very useful in PRDs to better communicate the goals.
> >>> >>
> >>> >> - Internal architecture: This is usually *not* a thing users can easily comment on and it sounds more like a design doc item. Of course it's important to show that the SIP is feasible to implement. One exception, however, is that I think we'll have some SIPs primarily on internals (e.g. if somebody wants to refactor Spark's query optimizer or something).
> >>> >>
> >>> >> - Rejected strategies: I personally wouldn't put this, because what's the point of voting to reject a strategy before you've really begun designing and implementing something? What if you discover that the strategy is actually better when you start doing stuff?
> >>> >>
> >>> >> At a super high level, it depends on whether you want the SIPs to be PRDs for getting some quick feedback on the goals of a feature before it is designed, or something more like full-fledged design docs (just a more visible design doc for bigger changes). I looked at Kafka's KIPs, and they actually seem to be more like design docs.
> >>> >> This can work too but it does require more work from the proposer and it can lead to the same problems you mentioned with people already having a design and implementation in mind.
> >>> >>
> >>> >> Basically, the question is, are you trying to iterate faster on design by adding a step for user feedback earlier? Or are you just trying to make design docs for key features more visible (and their approval more formal)?
> >>> >>
> >>> >> BTW note that in either case, I'd like to have a template for design docs too, which should also include goals. I think that would've avoided some of the issues you brought up.
> >>> >>
> >>> >> Matei
> >>> >>
> >>> >> On Oct 9, 2016, at 10:40 AM, Cody Koeninger <[hidden email]> wrote:
> >>> >>
> >>> >> Here's my specific proposal (meta-proposal?)
> >>> >>
> >>> >> Spark Improvement Proposals (SIP)
> >>> >>
> >>> >>
> >>> >> Background:
> >>> >>
> >>> >> The current problem is that design and implementation of large features are often done in private, before soliciting user feedback.
> >>> >>
> >>> >> When feedback is solicited, it is often as to detailed design specifics, not focused on goals.
> >>> >>
> >>> >> When implementation does take place after design, there is often disagreement as to what goals are or are not in scope.
> >>> >>
> >>> >> This results in commits that don't fully meet user needs.
> >>> >>
> >>> >>
> >>> >> Goals:
> >>> >>
> >>> >> - Ensure user, contributor, and committer goals are clearly identified and agreed upon, before implementation takes place.
> >>> >>
> >>> >> - Ensure that a technically feasible strategy is chosen that is likely to meet the goals.
> >>> >>
> >>> >>
> >>> >> Rejected Goals:
> >>> >>
> >>> >> - SIPs are not for detailed design. Design by committee doesn't work.
> >>> >>
> >>> >> - SIPs are not for every change. We don't need that much process.
> >>> >>
> >>> >>
> >>> >> Strategy:
> >>> >>
> >>> >> My suggestion is outlined as a Spark Improvement Proposal process documented at
> >>> >>
> >>> >> https://github.com/koeninger/spark-1/blob/SIP-0/docs/spark-improvement-proposals.md
> >>> >>
> >>> >> Specifics of Jira manipulation are an implementation detail we can figure out.
> >>> >>
> >>> >> I'm suggesting voting; the need here is for a _clear_ outcome.
> >>> >>
> >>> >>
> >>> >> Rejected Strategies:
> >>> >>
> >>> >> Having someone who understands the problem implement it first works, but only if significant iteration after user feedback is allowed.
> >>> >>
> >>> >> Historically this has been problematic due to pressure to limit public API changes.
> >>> >>
> >>> >>
> >>> >> On Fri, Oct 7, 2016 at 5:16 PM, Reynold Xin <[hidden email]> wrote:
> >>> >>>
> >>> >>> Alright, looks like there is quite a bit of support. We should wait to hear from more people too.
> >>> >>>
> >>> >>> To push this forward, Cody and I will be working together in the next couple of weeks to come up with a concrete, detailed proposal on what this entails, and then we can discuss the specific proposal as well.
> >>> >>>
> >>> >>>
> >>> >>> On Fri, Oct 7, 2016 at 2:29 PM, Cody Koeninger <[hidden email]> wrote:
> >>> >>>>
> >>> >>>> Yeah, in case it wasn't clear, I was talking about SIPs for major user-facing or cross-cutting changes, not minor feature adds.
> >>> >>>>
> >>> >>>> On Fri, Oct 7, 2016 at 3:58 PM, Stavros Kontopoulos <[hidden email]> wrote:
> >>> >>>>>
> >>> >>>>> +1 to the SIP label as long as it does not slow down things and it targets optimizing efforts, coordination etc.
> >>> >>>>> For example, really small features (assuming they don't touch public interfaces) or refactorings should not need to go through this process, and I hope it will be kept this way. So a guideline doc should be provided, like in the KIP case.
> >>> >>>>>
> >>> >>>>> IMHO, aside from tagging things and linking them elsewhere, simply having design docs and prototype implementations in PRs is not something that has not worked so far. What is really a pain in many projects out there is discontinuity in progress of PRs, missing features, slow reviews, which is understandable to some extent... it is not only about Spark, but things can be improved for sure for this project in particular, as already stated.
> >>> >>>>>
> >>> >>>>> On Fri, Oct 7, 2016 at 11:14 PM, Cody Koeninger <[hidden email]> wrote:
> >>> >>>>>>
> >>> >>>>>> +1 to adding an SIP label and linking it from the website. I think it needs
> >>> >>>>>>
> >>> >>>>>> - template that focuses it towards soliciting user goals / non goals
> >>> >>>>>> - clear resolution as to which strategy was chosen to pursue. I'd recommend a vote.
> >>> >>>>>>
> >>> >>>>>> Matei asked me to clarify what I meant by changing interfaces, I think it's directly relevant to the SIP idea so I'll clarify here, and split a thread for the other discussion per Nicholas' request.
> >>> >>>>>>
> >>> >>>>>> I meant changing public user interfaces. I think the first design is unlikely to be right, because it's done at a time when you have the least information.
> >>> >>>>>> As a user, I find it considerably more frustrating to be unable to use a tool to get my job done, than I do having to make minor changes to my code in order to take advantage of features. I've seen committers be seriously reluctant to allow changes to @experimental code that are needed in order for it to really work right. You need to be able to iterate, and if people on both sides of the fence aren't going to respect that some newer APIs are subject to change, then why even mark them as such?
> >>> >>>>>>
> >>> >>>>>> Ideally a finished SIP should give me a checklist of things that an implementation must do, and things that it doesn't need to do. Contributors/committers should be seriously discouraged from putting out a version 0.1 that doesn't have at least a prototype implementation of all those things, especially if they're then going to argue against interface changes necessary to get the rest of the things done in the 0.2 version.
> >>> >>>>>>
> >>> >>>>>>
> >>> >>>>>> On Fri, Oct 7, 2016 at 2:18 PM, Reynold Xin <[hidden email]> wrote:
> >>> >>>>>>> I like the lightweight proposal to add a SIP label.
> >>> >>>>>>>
> >>> >>>>>>> During Spark 2.0 development, Tom (Graves) and I suggested using wiki to track the list of major changes, but that never really materialized due to the overhead. Adding a SIP label on major JIRAs and then linking to them prominently on the Spark website makes a lot of sense.
> >>> >>>>>>>
> >>> >>>>>>>
> >>> >>>>>>> On Fri, Oct 7, 2016 at 10:50 AM, Matei Zaharia <[hidden email]> wrote:
> >>> >>>>>>>>
> >>> >>>>>>>> For the improvement proposals, I think one major point was to make them really visible to users who are not contributors, so we should do more than sending stuff to dev@. One very lightweight idea is to have a new type of JIRA called a SIP and have a link to a filter that shows all such JIRAs from http://spark.apache.org. I also like the idea of SIP and design doc templates (in fact many projects have them).
> >>> >>>>>>>>
> >>> >>>>>>>> Matei
> >>> >>>>>>>>
> >>> >>>>>>>> On Oct 7, 2016, at 10:38 AM, Reynold Xin <[hidden email]> wrote:
> >>> >>>>>>>>
> >>> >>>>>>>> I called Cody last night and talked about some of the topics in his email. It became clear to me Cody genuinely cares about the project.
> >>> >>>>>>>>
> >>> >>>>>>>> Some of the frustrations come from the success of the project itself becoming very "hot", and it is difficult to get clarity from people who don't dedicate all their time to Spark. In fact, it is in some ways similar to scaling an engineering team in a successful startup: old processes that worked well might not work so well when it gets to a certain size, cultures can get diluted, building culture vs building process, etc.
> >>> >>>>>>>>
> >>> >>>>>>>> I also really like to have a more visible process for larger changes, especially major user facing API changes.
> >>> >>>>>>>> Historically we upload design docs for major changes, but it is not always consistent and it is difficult to ensure the quality of the docs, due to the volunteering nature of the organization.
> >>> >>>>>>>>
> >>> >>>>>>>> Some of the more concrete ideas we discussed focus on building a culture to improve clarity:
> >>> >>>>>>>>
> >>> >>>>>>>> - Process: Large changes should have design docs posted on JIRA. One thing Cody and I didn't discuss but an idea that just came to me is we should create a design doc template for the project and ask everybody to follow it. The design doc template should also explicitly list goals and non-goals, to make design docs more consistent.
> >>> >>>>>>>>
> >>> >>>>>>>> - Process: Email dev@ to solicit feedback. We have done this with some changes, but again very inconsistently. Just posting something on JIRA isn't sufficient, because there are simply too many JIRAs and the signal gets lost in the noise. While this is generally impossible to enforce because we can't force all volunteers to conform to a process (or they might not even be aware of this), those who are more familiar with the project can help by emailing dev@ when they see something that hasn't been.
> >>> >>>>>>>>
> >>> >>>>>>>> - Culture: The design doc author(s) should be open to feedback.
> >>> >>>>>>>> A design doc should serve as the base for discussion and is by no means the final design. Of course, this does not mean the author has to accept every piece of feedback. They should also be comfortable accepting / rejecting ideas on technical grounds.
> >>> >>>>>>>>
> >>> >>>>>>>> - Process / Culture: For major ongoing projects, it can be useful to have some monthly Google hangouts that are open to the world. I am actually not sure how well this will work, because of the volunteering nature and we need to adjust for timezones for people across the globe, but it seems worth trying.
> >>> >>>>>>>>
> >>> >>>>>>>> - Culture: Contributors (including committers) should be more direct in setting expectations, including whether they are working on a specific issue, whether they will be working on a specific issue, and whether an issue or PR or JIRA should be rejected. Most people I know in this community are nice and don't enjoy telling other people no, but it is often more annoying to a contributor to not know anything than getting a no.
> >>> >>>>>>>>
> >>> >>>>>>>>
> >>> >>>>>>>> On Fri, Oct 7, 2016 at 10:03 AM, Matei Zaharia <[hidden email]> wrote:
> >>> >>>>>>>>>
> >>> >>>>>>>>>
> >>> >>>>>>>>> Love the idea of a more visible "Spark Improvement Proposal" process that solicits user input on new APIs.
> >>> >>>>>>>>> For what it's worth, I don't think committers are trying to minimize their own work -- every committer cares about making the software useful for users. However, it is always hard to get user input and so it helps to have this kind of process. I've certainly looked at the *IPs a lot in other software I use just to see the biggest things on the roadmap.
> >>> >>>>>>>>>
> >>> >>>>>>>>> When you're talking about "changing interfaces", are you talking about public or internal APIs? I do think many people hate changing public APIs and I actually think that's for the best of the project. That's a technical debate, but basically, the worst thing when you're using a piece of software is that the developers constantly ask you to rewrite your app to update to a new version (and thus benefit from bug fixes, etc). Cue anyone who's used Protobuf, or Guava. The "let's get everyone to change their code this release" model works well within a single large company, but doesn't work well for a community, which is why nearly all *very* widely used programming interfaces (I'm talking things like Java standard library, Windows API, etc) almost *never* break backwards compatibility. All this is done within reason though, e.g.
> >>> >>>>>>>>> we do change things in major releases (2.x, 3.x, etc).

-- 
Ryan Blue
Software Engineer
Netflix