You did well ! It's an interesting discussion we have and it's great to have it on the mailing list (better than in Jira or PR comments IMHO).
Thanks ! Regards JB On Oct 27, 2016, 20:39, at 20:39, Robert Bradshaw <[email protected]> wrote: >+1 to all Dan says. > >I only brought this up because it seemed new contributors (yay) >jumping in and renaming a core transform based on "Something to >consider" deserved a couple more more eyeballs, but didn't intend for >it to become a big deal. > >On Thu, Oct 27, 2016 at 11:03 AM, Dan Halperin ><[email protected]> wrote: >> Folks, I don't think this needs to be a "vote". This is just not that >big a >> deal :). It is important to be transparent and have these discussions >on >> the list, which is why we brought it here from GitHub/JIRA, but at >the end >> of the day I hope that a small group of committers and developers can >> assess "good enough" consensus for these minor issues. >> >> Here's my assessment: >> * We don't really have any rules about naming transforms. "Should be >a >> verb" is a sort of guiding principle inherited from the Google Flume >> project from which Dataflow evolved, but honestly we violate this >rule for >> clarity all over the place. ("Values", for example). >> * The "Big Data" community is significantly more familiar with the >concept >> of Distinct -- Jesse, who filed the original JIRA, is a good example >here. >> * Finally, nobody feels very strongly. We could argue minor points of >each >> solution, but at the end of the day I don't think anyone wants to >block a >> change. >> >> Let's go with Distinct. It's important to align Beam with the open >source >> big data community. (And thanks Jesse, our newest (*tied) committer, >for >> pushing us in the right direction!) >> >> Jesse, can you please take charge of wrapping up the PR and merging >it? >> >> Thanks! >> Dan >> >> On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré ><[email protected]> >> wrote: >> >>> Just to clarify. Davor is right for a code modification change: -1 >means a >>> veto. >>> I meant that -1 is not a veto for a release vote. >>> >>> Anyway, even if it's not a formal code, we can have a discussion >with >>> "options" a,b and c. >>> >>> Regards >>> JB >>> >>> >>> >>> On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci ><[email protected]> >>> wrote: >>> >In terms of reaching a decision on any code or design changes, >>> >including >>> >this one, I'd suggest going without formal votes. Voting process >for >>> >code >>> >modifications between choices A and B doesn't necessarily end with >a >>> >decision A or B -- a single (qualified) -1 vote is a veto and >cannot be >>> >overridden [1]. Said differently, the guideline is that code >changes >>> >should >>> >be made by consensus; not by one group outvoting another. I'd like >to >>> >avoid >>> >setting such precedent; we should try to drive consensus, as >opposed to >>> >attempting to outvote another part of the community. >>> > >>> >In this particular case, we have had a great discussion. Many >>> >contributors >>> >brought different perspectives. Consequently, some opinions have >been >>> >likely changed. At this point, someone should summarize the >arguments, >>> >try >>> >to critique them from a neutral standpoint, and suggest a refined >>> >proposal >>> >that takes these perspectives into account. If nobody objects in a >>> >short >>> >time, we should consider this decided. [ I can certainly help here, >but >>> >I'd >>> >love to see somebody else do it! ] >>> > >>> >[1] http://www.apache.org/foundation/voting.html >>> > >>> >On Wed, Oct 26, 2016 at 7:35 AM, Ben Chambers >>> ><[email protected]> >>> >wrote: >>> > >>> >> I also like Distinct since it doesn't make it sound like it >modifies >>> >any >>> >> underlying collection. RemoveDuplicates makes it sound like the >>> >duplicates >>> >> are removed, rather than a new PCollection without duplicates >being >>> >> returned. >>> >> >>> >> On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré ><[email protected]> >>> >> wrote: >>> >> >>> >> > Agree. It was more a transition proposal. >>> >> > >>> >> > Regards >>> >> > JB >>> >> > >>> >> > >>> >> > >>> >> > On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw >>> >> > <[email protected]> wrote: >>> >> > >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré >>> >> > ><[email protected]> wrote: >>> >> > >> And what about use RemoveDuplicates and create an alias >Distinct >>> >? >>> >> > > >>> >> > >I'd really like to avoid (long term) aliases--you end up >having to >>> >> > >document (and maintain) them both, and it adds confusion as to >>> >which >>> >> > >one to use (especially if they every diverge), and means >searching >>> >for >>> >> > >one or the other yields half the results. >>> >> > > >>> >> > >> It doesn't break the API and would address both SQL users >and >>> >more >>> >> > >"big data" users. >>> >> > >> >>> >> > >> My $0.01 ;) >>> >> > >> >>> >> > >> Regards >>> >> > >> JB >>> >> > >> >>> >> > >> >>> >> > >> >>> >> > >> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin >>> >> > ><[email protected]> wrote: >>> >> > >>>I find "MakeDistinct" more confusing. My votes in decreasing >>> >> > >>>preference: >>> >> > >>> >>> >> > >>>1. Keep `RemoveDuplicates` name, ensure that important >keywords >>> >are >>> >> > >in >>> >> > >>>the >>> >> > >>>Javadoc. This reduces churn on our users and is honestly >pretty >>> >dang >>> >> > >>> descriptive. >>> >> > >>>2. Rename to `Distinct`, which is clear if you're a SQL user >and >>> >> > >likely >>> >> > >>>less clear otherwise. This is a backwards-incompatible API >>> >change, so >>> >> > >>>we >>> >> > >>>should do it before we go stable. >>> >> > >>> >>> >> > >>>I am not super strong that 1 > 2, but I am very strong that >>> >> > >"Distinct" >>> >> > >>>>>> >>> >> > >>>"MakeDistinct" or and "RemoveDuplicates" >>> >"AvoidDuplicate". >>> >> > >>> >>> >> > >>>Dan >>> >> > >>> >>> >> > >>>On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles >>> >> > >>><[email protected]> >>> >> > >>>wrote: >>> >> > >>> >>> >> > >>>> The precedent that we use verbs has many exceptions. We >have >>> >> > >>>> ApproximateQuantiles, Values, Keys, WithTimestamps, and I >>> >would >>> >> > >even >>> >> > >>>> include Sum (at least when I read it). >>> >> > >>>> >>> >> > >>>> Historical note: the predilection towards verbs is from >the >>> >Google >>> >> > >>>Style >>> >> > >>>> Guide for Java method names >>> >> > >>>> >>> >> > >>><https://google.github.io/styleguide/javaguide.html#s5. >>> >> 2.3-method-names >>> >> > >, >>> >> > >>>> which states "Method names are typically verbs or verb >>> >phrases". >>> >> > >But >>> >> > >>>even >>> >> > >>>> in Google code there are lots of exceptions when it makes >>> >sense, >>> >> > >like >>> >> > >>>> Guava's >>> >> > >>>> Iterables.any(), Iterables.all(), Iterables.toArray(), the >>> >entire >>> >> > >>>> Predicates module, etc. Just an aside; Beam isn't Google >code. >>> >I >>> >> > >>>suggest we >>> >> > >>>> use our judgment rather than a policy. >>> >> > >>>> >>> >> > >>>> I think "Distinct" is one of those exceptions. It is a >>> >standard >>> >> > >>>widespread >>> >> > >>>> name and also reads better as an adjective. I prefer it, >but >>> >also >>> >> > >>>don't >>> >> > >>>> care strongly enough to change it or to change it back :-) >>> >> > >>>> >>> >> > >>>> If we must have a verb, I like it as-is more than >MakeDistinct >>> >and >>> >> > >>>> AvoidDuplicate. >>> >> > >>>> >>> >> > >>>> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson >>> >> > >>><[email protected]> >>> >> > >>>> wrote: >>> >> > >>>> >>> >> > >>>> > My original thought for this change was that Crunch uses >the >>> >> > >class >>> >> > >>>name >>> >> > >>>> > Distinct. SQL also uses the keyword distinct. >>> >> > >>>> > >>> >> > >>>> > Maybe the rule should be changed to adjectives or verbs >>> >depending >>> >> > >>>on the >>> >> > >>>> > context. >>> >> > >>>> > >>> >> > >>>> > Using a verb to describe this class really doesn't >connote >>> >what >>> >> > >the >>> >> > >>>class >>> >> > >>>> > does as succinctly as the adjective. >>> >> > >>>> > >>> >> > >>>> > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian >>> >> > >>><[email protected]> >>> >> > >>>> > wrote: >>> >> > >>>> > >>> >> > >>>> > > Hello, >>> >> > >>>> > > >>> >> > >>>> > > First of all, thank you to Daniel, Robert and Jesse >for >>> >their >>> >> > >>>review on >>> >> > >>>> > > this: https://issues.apache.org/jira/browse/BEAM-239 >>> >> > >>>> > > >>> >> > >>>> > > A point that came up was using verbs explicitly for >>> >Transforms. >>> >> > >>>> > > Here is the PR: >>> >> > >>>https://github.com/apache/incubator-beam/pull/1164 >>> >> > >>>> > > >>> >> > >>>> > > Posting it to help understand if we have a consensus >for >>> >it and >>> >> > >>>if yes, >>> >> > >>>> > we >>> >> > >>>> > > could perhaps document it for future changes. >>> >> > >>>> > > >>> >> > >>>> > > Thank you. >>> >> > >>>> > > >>> >> > >>>> > > -- >>> >> > >>>> > > Neelesh Srinivas Salian >>> >> > >>>> > > Engineer >>> >> > >>>> > > >>> >> > >>>> > >>> >> > >>>> >>> >> > >>> >> >>>
