You did well ! It's an interesting discussion we have and it's great to have it 
on the mailing list (better than in Jira or PR comments IMHO).

Thanks !

Regards
JB

⁣​

On Oct 27, 2016, 20:39, at 20:39, Robert Bradshaw <[email protected]> 
wrote:
>+1 to all Dan says.
>
>I only brought this up because it seemed new contributors (yay)
>jumping in and renaming a core transform based on "Something to
>consider" deserved a couple more more eyeballs, but didn't intend for
>it to become a big deal.
>
>On Thu, Oct 27, 2016 at 11:03 AM, Dan Halperin
><[email protected]> wrote:
>> Folks, I don't think this needs to be a "vote". This is just not that
>big a
>> deal :). It is important to be transparent and have these discussions
>on
>> the list, which is why we brought it here from GitHub/JIRA, but at
>the end
>> of the day I hope that a small group of committers and developers can
>> assess "good enough" consensus for these minor issues.
>>
>> Here's my assessment:
>> * We don't really have any rules about naming transforms. "Should be
>a
>> verb" is a sort of guiding principle inherited from the Google Flume
>> project from which Dataflow evolved, but honestly we violate this
>rule for
>> clarity all over the place. ("Values", for example).
>> * The "Big Data" community is significantly more familiar with the
>concept
>> of Distinct -- Jesse, who filed the original JIRA, is a good example
>here.
>> * Finally, nobody feels very strongly. We could argue minor points of
>each
>> solution, but at the end of the day I don't think anyone wants to
>block a
>> change.
>>
>> Let's go with Distinct. It's important to align Beam with the open
>source
>> big data community. (And thanks Jesse, our newest (*tied) committer,
>for
>> pushing us in the right direction!)
>>
>> Jesse, can you please take charge of wrapping up the PR and merging
>it?
>>
>> Thanks!
>> Dan
>>
>> On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré
><[email protected]>
>> wrote:
>>
>>> Just to clarify. Davor is right for a code modification change: -1
>means a
>>> veto.
>>> I meant that -1 is not a veto for a release vote.
>>>
>>> Anyway, even if it's not a formal code, we can have a discussion
>with
>>> "options" a,b and c.
>>>
>>> Regards
>>> JB
>>>
>>> ⁣
>>>
>>> On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci
><[email protected]>
>>> wrote:
>>> >In terms of reaching a decision on any code or design changes,
>>> >including
>>> >this one, I'd suggest going without formal votes. Voting process
>for
>>> >code
>>> >modifications between choices A and B doesn't necessarily end with
>a
>>> >decision A or B -- a single (qualified) -1 vote is a veto and
>cannot be
>>> >overridden [1]. Said differently, the guideline is that code
>changes
>>> >should
>>> >be made by consensus; not by one group outvoting another. I'd like
>to
>>> >avoid
>>> >setting such precedent; we should try to drive consensus, as
>opposed to
>>> >attempting to outvote another part of the community.
>>> >
>>> >In this particular case, we have had a great discussion. Many
>>> >contributors
>>> >brought different perspectives. Consequently, some opinions have
>been
>>> >likely changed. At this point, someone should summarize the
>arguments,
>>> >try
>>> >to critique them from a neutral standpoint, and suggest a refined
>>> >proposal
>>> >that takes these perspectives into account. If nobody objects in a
>>> >short
>>> >time, we should consider this decided. [ I can certainly help here,
>but
>>> >I'd
>>> >love to see somebody else do it! ]
>>> >
>>> >[1] http://www.apache.org/foundation/voting.html
>>> >
>>> >On Wed, Oct 26, 2016 at 7:35 AM, Ben Chambers
>>> ><[email protected]>
>>> >wrote:
>>> >
>>> >> I also like Distinct since it doesn't make it sound like it
>modifies
>>> >any
>>> >> underlying collection. RemoveDuplicates makes it sound like the
>>> >duplicates
>>> >> are removed, rather than a new PCollection without duplicates
>being
>>> >> returned.
>>> >>
>>> >> On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré
><[email protected]>
>>> >> wrote:
>>> >>
>>> >> > Agree. It was more a transition proposal.
>>> >> >
>>> >> > Regards
>>> >> > JB
>>> >> >
>>> >> > ⁣
>>> >> >
>>> >> > On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw
>>> >> > <[email protected]> wrote:
>>> >> > >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré
>>> >> > ><[email protected]> wrote:
>>> >> > >> And what about use RemoveDuplicates and create an alias
>Distinct
>>> >?
>>> >> > >
>>> >> > >I'd really like to avoid (long term) aliases--you end up
>having to
>>> >> > >document (and maintain) them both, and it adds confusion as to
>>> >which
>>> >> > >one to use (especially if they every diverge), and means
>searching
>>> >for
>>> >> > >one or the other yields half the results.
>>> >> > >
>>> >> > >> It doesn't break the API and would address both SQL users
>and
>>> >more
>>> >> > >"big data" users.
>>> >> > >>
>>> >> > >> My $0.01 ;)
>>> >> > >>
>>> >> > >> Regards
>>> >> > >> JB
>>> >> > >>
>>> >> > >> ⁣
>>> >> > >>
>>> >> > >> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin
>>> >> > ><[email protected]> wrote:
>>> >> > >>>I find "MakeDistinct" more confusing. My votes in decreasing
>>> >> > >>>preference:
>>> >> > >>>
>>> >> > >>>1. Keep `RemoveDuplicates` name, ensure that important
>keywords
>>> >are
>>> >> > >in
>>> >> > >>>the
>>> >> > >>>Javadoc. This reduces churn on our users and is honestly
>pretty
>>> >dang
>>> >> > >>> descriptive.
>>> >> > >>>2. Rename to `Distinct`, which is clear if you're a SQL user
>and
>>> >> > >likely
>>> >> > >>>less clear otherwise. This is a backwards-incompatible API
>>> >change, so
>>> >> > >>>we
>>> >> > >>>should do it before we go stable.
>>> >> > >>>
>>> >> > >>>I am not super strong that 1 > 2, but I am very strong that
>>> >> > >"Distinct"
>>> >> > >>>>>>
>>> >> > >>>"MakeDistinct" or and "RemoveDuplicates" >>>
>"AvoidDuplicate".
>>> >> > >>>
>>> >> > >>>Dan
>>> >> > >>>
>>> >> > >>>On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles
>>> >> > >>><[email protected]>
>>> >> > >>>wrote:
>>> >> > >>>
>>> >> > >>>> The precedent that we use verbs has many exceptions. We
>have
>>> >> > >>>> ApproximateQuantiles, Values, Keys, WithTimestamps, and I
>>> >would
>>> >> > >even
>>> >> > >>>> include Sum (at least when I read it).
>>> >> > >>>>
>>> >> > >>>> Historical note: the predilection towards verbs is from
>the
>>> >Google
>>> >> > >>>Style
>>> >> > >>>> Guide for Java method names
>>> >> > >>>>
>>> >> > >>><https://google.github.io/styleguide/javaguide.html#s5.
>>> >> 2.3-method-names
>>> >> > >,
>>> >> > >>>> which states "Method names are typically verbs or verb
>>> >phrases".
>>> >> > >But
>>> >> > >>>even
>>> >> > >>>> in Google code there are lots of exceptions when it makes
>>> >sense,
>>> >> > >like
>>> >> > >>>> Guava's
>>> >> > >>>> Iterables.any(), Iterables.all(), Iterables.toArray(), the
>>> >entire
>>> >> > >>>> Predicates module, etc. Just an aside; Beam isn't Google
>code.
>>> >I
>>> >> > >>>suggest we
>>> >> > >>>> use our judgment rather than a policy.
>>> >> > >>>>
>>> >> > >>>> I think "Distinct" is one of those exceptions. It is a
>>> >standard
>>> >> > >>>widespread
>>> >> > >>>> name and also reads better as an adjective. I prefer it,
>but
>>> >also
>>> >> > >>>don't
>>> >> > >>>> care strongly enough to change it or to change it back :-)
>>> >> > >>>>
>>> >> > >>>> If we must have a verb, I like it as-is more than
>MakeDistinct
>>> >and
>>> >> > >>>> AvoidDuplicate.
>>> >> > >>>>
>>> >> > >>>> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson
>>> >> > >>><[email protected]>
>>> >> > >>>> wrote:
>>> >> > >>>>
>>> >> > >>>> > My original thought for this change was that Crunch uses
>the
>>> >> > >class
>>> >> > >>>name
>>> >> > >>>> > Distinct. SQL also uses the keyword distinct.
>>> >> > >>>> >
>>> >> > >>>> > Maybe the rule should be changed to adjectives or verbs
>>> >depending
>>> >> > >>>on the
>>> >> > >>>> > context.
>>> >> > >>>> >
>>> >> > >>>> > Using a verb to describe this class really doesn't
>connote
>>> >what
>>> >> > >the
>>> >> > >>>class
>>> >> > >>>> > does as succinctly as the adjective.
>>> >> > >>>> >
>>> >> > >>>> > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian
>>> >> > >>><[email protected]>
>>> >> > >>>> > wrote:
>>> >> > >>>> >
>>> >> > >>>> > > Hello,
>>> >> > >>>> > >
>>> >> > >>>> > > First of all, thank you to Daniel, Robert and Jesse
>for
>>> >their
>>> >> > >>>review on
>>> >> > >>>> > > this: https://issues.apache.org/jira/browse/BEAM-239
>>> >> > >>>> > >
>>> >> > >>>> > > A point that came up was using verbs explicitly for
>>> >Transforms.
>>> >> > >>>> > > Here is the PR:
>>> >> > >>>https://github.com/apache/incubator-beam/pull/1164
>>> >> > >>>> > >
>>> >> > >>>> > > Posting it to help understand if we have a consensus
>for
>>> >it and
>>> >> > >>>if yes,
>>> >> > >>>> > we
>>> >> > >>>> > > could perhaps document it for future changes.
>>> >> > >>>> > >
>>> >> > >>>> > > Thank you.
>>> >> > >>>> > >
>>> >> > >>>> > > --
>>> >> > >>>> > > Neelesh Srinivas Salian
>>> >> > >>>> > > Engineer
>>> >> > >>>> > >
>>> >> > >>>> >
>>> >> > >>>>
>>> >> >
>>> >>
>>>

Reply via email to