Sure

On Thu, Oct 27, 2016, 8:04 PM Dan Halperin <[email protected]>
wrote:

> Folks, I don't think this needs to be a "vote". This is just not that big a
> deal :). It is important to be transparent and have these discussions on
> the list, which is why we brought it here from GitHub/JIRA, but at the end
> of the day I hope that a small group of committers and developers can
> assess "good enough" consensus for these minor issues.
>
> Here's my assessment:
> * We don't really have any rules about naming transforms. "Should be a
> verb" is a sort of guiding principle inherited from the Google Flume
> project from which Dataflow evolved, but honestly we violate this rule for
> clarity all over the place. ("Values", for example).
> * The "Big Data" community is significantly more familiar with the concept
> of Distinct -- Jesse, who filed the original JIRA, is a good example here.
> * Finally, nobody feels very strongly. We could argue minor points of each
> solution, but at the end of the day I don't think anyone wants to block a
> change.
>
> Let's go with Distinct. It's important to align Beam with the open source
> big data community. (And thanks Jesse, our newest (*tied) committer, for
> pushing us in the right direction!)
>
> Jesse, can you please take charge of wrapping up the PR and merging it?
>
> Thanks!
> Dan
>
> On Wed, Oct 26, 2016 at 11:12 PM, Jean-Baptiste Onofré <[email protected]>
> wrote:
>
> > Just to clarify. Davor is right for a code modification change: -1 means a
> > veto.
> > I meant that -1 is not a veto for a release vote.
> >
> > Anyway, even if it's not a formal vote, we can have a discussion with
> > "options" a, b and c.
> >
> > Regards
> > JB
> >
> >
> > On Oct 27, 2016, 06:48, at 06:48, Davor Bonaci <[email protected]>
> > wrote:
> > >In terms of reaching a decision on any code or design changes, including
> > >this one, I'd suggest going without formal votes. Voting process for code
> > >modifications between choices A and B doesn't necessarily end with a
> > >decision A or B -- a single (qualified) -1 vote is a veto and cannot be
> > >overridden [1]. Said differently, the guideline is that code changes should
> > >be made by consensus; not by one group outvoting another. I'd like to avoid
> > >setting such precedent; we should try to drive consensus, as opposed to
> > >attempting to outvote another part of the community.
> > >
> > >In this particular case, we have had a great discussion. Many contributors
> > >brought different perspectives. Consequently, some opinions have likely
> > >changed. At this point, someone should summarize the arguments, try to
> > >critique them from a neutral standpoint, and suggest a refined proposal
> > >that takes these perspectives into account. If nobody objects in a short
> > >time, we should consider this decided. [ I can certainly help here, but I'd
> > >love to see somebody else do it! ]
> > >
> > >[1] http://www.apache.org/foundation/voting.html
> > >
> > >On Wed, Oct 26, 2016 at 7:35 AM, Ben Chambers
> > ><[email protected]>
> > >wrote:
> > >
> > >> I also like Distinct since it doesn't make it sound like it modifies any
> > >> underlying collection. RemoveDuplicates makes it sound like the duplicates
> > >> are removed, rather than a new PCollection without duplicates being
> > >> returned.
> > >>
> > >> On Wed, Oct 26, 2016, 7:36 AM Jean-Baptiste Onofré <[email protected]>
> > >> wrote:
> > >>
> > >> > Agree. It was more a transition proposal.
> > >> >
> > >> > Regards
> > >> > JB
> > >> >
> > >> >
> > >> > On Oct 26, 2016, 08:31, at 08:31, Robert Bradshaw
> > >> > <[email protected]> wrote:
> > >> > >On Mon, Oct 24, 2016 at 11:02 PM, Jean-Baptiste Onofré
> > >> > ><[email protected]> wrote:
> > >> > >> And what about using RemoveDuplicates and creating an alias Distinct?
> > >> > >
> > >> > >I'd really like to avoid (long term) aliases--you end up having to
> > >> > >document (and maintain) them both, and it adds confusion as to which
> > >> > >one to use (especially if they ever diverge), and means searching for
> > >> > >one or the other yields half the results.
> > >> > >
> > >> > >> It doesn't break the API and would address both SQL users and more
> > >> > >> "big data" users.
> > >> > >>
> > >> > >> My $0.01 ;)
> > >> > >>
> > >> > >> Regards
> > >> > >> JB
> > >> > >>
> > >> > >>
> > >> > >> On Oct 24, 2016, 22:23, at 22:23, Dan Halperin
> > >> > ><[email protected]> wrote:
> > >> > >>>I find "MakeDistinct" more confusing. My votes in decreasing
> > >> > >>>preference:
> > >> > >>>
> > >> > >>>1. Keep `RemoveDuplicates` name, ensure that important keywords are in
> > >> > >>>the Javadoc. This reduces churn on our users and is honestly pretty dang
> > >> > >>>descriptive.
> > >> > >>>2. Rename to `Distinct`, which is clear if you're a SQL user and likely
> > >> > >>>less clear otherwise. This is a backwards-incompatible API change, so we
> > >> > >>>should do it before we go stable.
> > >> > >>>
> > >> > >>>I am not super strong that 1 > 2, but I am very strong that
> > >> > >>>"Distinct" >>> "MakeDistinct" and "RemoveDuplicates" >>> "AvoidDuplicate".
> > >> > >>>
> > >> > >>>Dan
> > >> > >>>
> > >> > >>>On Mon, Oct 24, 2016 at 10:12 AM, Kenneth Knowles
> > >> > >>><[email protected]>
> > >> > >>>wrote:
> > >> > >>>
> > >> > >>>> The precedent that we use verbs has many exceptions. We have
> > >> > >>>> ApproximateQuantiles, Values, Keys, WithTimestamps, and I would even
> > >> > >>>> include Sum (at least when I read it).
> > >> > >>>>
> > >> > >>>> Historical note: the predilection towards verbs is from the Google Style
> > >> > >>>> Guide for Java method names
> > >> > >>>> <https://google.github.io/styleguide/javaguide.html#s5.2.3-method-names>,
> > >> > >>>> which states "Method names are typically verbs or verb phrases". But even
> > >> > >>>> in Google code there are lots of exceptions when it makes sense, like
> > >> > >>>> Guava's Iterables.any(), Iterables.all(), Iterables.toArray(), the entire
> > >> > >>>> Predicates module, etc. Just an aside; Beam isn't Google code. I suggest we
> > >> > >>>> use our judgment rather than a policy.
> > >> > >>>>
> > >> > >>>> I think "Distinct" is one of those exceptions. It is a standard widespread
> > >> > >>>> name and also reads better as an adjective. I prefer it, but also don't
> > >> > >>>> care strongly enough to change it or to change it back :-)
> > >> > >>>>
> > >> > >>>> If we must have a verb, I like it as-is more than MakeDistinct and
> > >> > >>>> AvoidDuplicate.
> > >> > >>>>
> > >> > >>>> On Mon, Oct 24, 2016 at 9:46 AM Jesse Anderson
> > >> > >>><[email protected]>
> > >> > >>>> wrote:
> > >> > >>>>
> > >> > >>>> > My original thought for this change was that Crunch uses the class name
> > >> > >>>> > Distinct. SQL also uses the keyword distinct.
> > >> > >>>> >
> > >> > >>>> > Maybe the rule should be changed to adjectives or verbs depending on the
> > >> > >>>> > context.
> > >> > >>>> >
> > >> > >>>> > Using a verb to describe this class really doesn't connote what the class
> > >> > >>>> > does as succinctly as the adjective.
> > >> > >>>> >
> > >> > >>>> > On Mon, Oct 24, 2016 at 9:40 AM Neelesh Salian
> > >> > >>><[email protected]>
> > >> > >>>> > wrote:
> > >> > >>>> >
> > >> > >>>> > > Hello,
> > >> > >>>> > >
> > >> > >>>> > > First of all, thank you to Daniel, Robert and Jesse for their review on
> > >> > >>>> > > this: https://issues.apache.org/jira/browse/BEAM-239
> > >> > >>>> > >
> > >> > >>>> > > A point that came up was using verbs explicitly for Transforms.
> > >> > >>>> > > Here is the PR: https://github.com/apache/incubator-beam/pull/1164
> > >> > >>>> > >
> > >> > >>>> > > Posting it to help understand if we have a consensus for it and if yes,
> > >> > >>>> > > we could perhaps document it for future changes.
> > >> > >>>> > >
> > >> > >>>> > > Thank you.
> > >> > >>>> > >
> > >> > >>>> > > --
> > >> > >>>> > > Neelesh Srinivas Salian
> > >> > >>>> > > Engineer
> > >> > >>>> > >
> > >> > >>>> >
> > >> > >>>>
> > >> >
> > >>
> >
>
