Re: Hazelcast Jet Runner

Maximilian Michels Tue, 09 Jul 2019 13:24:34 -0700

We should fork the discussion around removing instances of @Experimental, but 
it was good to mention it here.


As for the Jet runner, I can only second Ismael: The Jet runner is the first 
runner I can think of that came with ValidatesRunner and Nexmark out of the 
box. Of course that doesn't mean the runner is "battle-tested", but we do not 
have other means to test its maturity.

For the future, we could come up with other criteria, e.g. a "probation 
period", but enforcing this now seems arbitrary.

If the authors of the Runners decide that it is experimental, so be it. 
Otherwise I would leave it to the user to decide (it might be helpful to list 
the inception date of each runner). That said, I value your concern, Kenn. I can 
see us establishing a consistent onboarding process for new runners, which may 
involve marking them experimental for a while.

-Max

On 01.07.19 22:20, Kenneth Knowles wrote:
>
>
> On Wed, Jun 12, 2019 at 2:32 AM Ismaël Mejía <[email protected]
> <mailto:[email protected]>> wrote:
>
>     Seems the discussion moved a bit away from my original intent, which was
>     to have the Jet runner directory simply be called runners/jet and to
>     mark the 'experimental' part of it in documentation as
>     we do for all other things in Beam.
>
>
> Thanks for returning to the one question at hand. We don't have to make
> an overall decision about all "experimental" things.
>  
>
>     Can we do this or is there still any considerable argument to not do it?
>
>
> I think we actually have some competing goals:
>
>     I agree 100% on the arguments, but let’s think in the reverse terms,
>     highlighting lack of maturity can play against the intended goal of
>     use and adoption even if for a noble reason. It is basic priming 101
>     [1].
>
>
> _My_ goal is exactly to highlight lack of maturity so that users are not
> harmed by either (1) necessary breaking changes or (2) permanent low
> quality. Only users who are willing to follow along with the project and
> update their own code regularly should use experimental features.
>
> Evaluating the Jet runner I am convinced by your arguments, because
> looking at the two dangers:
> (1) necessary breaking changes -- runners don't really have their own
> APIs to break, except their own small set of APIs and pipeline options
> (2) permanent low quality -- because there is no API design possible,
> there's no risk of permanent low quality except by fundamental
> mismatches. Plus as you mention the testing is already quite good.
>
> So I am OK to not call it experimental. But I have a slight remaining
> concern that it did not really go through what other runners went
> through. I hope this just means it is more mature. I hope it does not
> indicate that we are reducing rigor.
>
> Kenn
>  
>
>     On Wed, May 29, 2019 at 3:02 PM Reza Rokni <[email protected]
>     <mailto:[email protected]>> wrote:
>     >
>     > Hi,
>     >
>     > Over 800 usages under java, might be worth doing a few PR...
>     >
>     > Also suggest we use a very light review process: First round go
>     for low hanging fruit, if anyone does a -1 against a change then we
>     leave that for round two.
>     >
>     > Thoughts?
>     >
>     > Cheers
>     >
>     > Reza
>     >
>     > On Wed, 29 May 2019 at 12:05, Kenneth Knowles <[email protected]
>     <mailto:[email protected]>> wrote:
>     >>
>     >>
>     >>
>     >> On Mon, May 27, 2019 at 4:05 PM Reza Rokni <[email protected]
>     <mailto:[email protected]>> wrote:
>     >>>
>     >>> "Many APIs that have been in place for years and are used by
>     most Beam users are still marked Experimental."
>     >>>
>     >>> Should there be a formal process in place to start 'graduating'
>     features out of @Experimental? Perhaps even target an upcoming
>     release with a PR to remove the annotation from well-established APIs?
>     >>
>     >>
>     >> Good idea. I think a PR like this would be an opportunity to
>     discuss whether the feature is non-experimental. Probably many of
>     them are ready. It would help to address Ismael's very good point
>     that this new practice could make users think the old Experimental
>     stuff is not experimental. Maybe it is true that it is not really
>     still Experimental.
>     >>
>     >> Kenn
>     >>
>     >>
>     >>>
>     >>> On Tue, 28 May 2019 at 06:44, Reuven Lax <[email protected]
>     <mailto:[email protected]>> wrote:
>     >>>>
>     >>>> We generally use Experimental for two different things, which
>     leads to confusion.
>     >>>>   1. Features that work stably, but where we think we might
>     still make some changes to the API.
>     >>>>   2. New features that we think might not yet be stable.
>     >>>>
>     >>>> This dual usage leads to a lot of confusion IMO. The fact that
>     we tend to forget to remove the @Experimental tag also makes it
>     somewhat useless. Many APIs that have been in place for years and
>     are used by most Beam users are still marked Experimental.
>     >>>>
>     >>>> Reuven
>     >>>>
>     >>>> On Mon, May 27, 2019 at 2:16 PM Ismaël Mejía <[email protected]
>     <mailto:[email protected]>> wrote:
>     >>>>>
>     >>>>> > Personally, I think that it is good that moving from
>     experimental to non-experimental is a breaking change in the
>     dependency - one has backwards-incompatible changes and the other
>     does not. If artifacts had separate versioning we could use 0.x for
>     this.
>     >>>>>
>     >>>>> In theory it seems so, but in practice it is an annoyance to
>     an end
>     >>>>> user that already took the ‘risk’ of using an experimental
>     feature.
>     >>>>> Awareness is probably not the most important reason to break
>     existing
>     >>>>> code (even if it could be easily fixed). The alternative of
>     doing this
>     >>>>> with version numbers at least seems less impacting but can be
>     >>>>> confusing.
>     >>>>>
>     >>>>> > But biggest motivation for me are these:
>     >>>>> >
>     >>>>> >  - using experimental features should be opt-in
>     >>>>> >  - should be impossible to use an experimental feature
>     without knowing it (so "opt-in" to a normal-looking feature is not
>     enough)
>     >>>>> > - developers of an experimental feature should be motivated
>     to "graduate" it
>     >>>>>
>     >>>>> The fundamental problem of this approach is inconsistency with our
>     >>>>> present/past. So far we have ‘Experimental’ features
>     everywhere. So
>     >>>>> suddenly becoming opt-in leaves us in an inconsistent state. For
>     example
>     >>>>> all IOs are marked internally as Experimental but not at the
>     level of
>     >>>>> directories/artifacts. Adding this suffix to a new IO, apart from
>     adding
>     >>>>> fear of use to the end users may also give the fake impression
>     that
>     >>>>> the older ones not explicitly marked are not experimental.
>     >>>>>
>     >>>>> What will be the state for example in the case of runner
>     modules that
>     >>>>> contain both mature and well tested runners like old Flink and
>     Spark
>     >>>>> runners vs the more experimental new translations for Portability,
>     >>>>> again more confusion.
>     >>>>>
>     >>>>> > FWIW I don't think "experimental" should be viewed as a bad
>     thing. It just means you are able to make backwards-incompatible
>     changes, and that users should be aware that they will need to
>     adjust APIs (probably only a little) with new releases. Most
>     software is not very good until it has been around for a long time,
>     and in my experience the problem is missing the mark on
>     abstractions, so backwards compatibility *must* be broken to achieve
>     quality. Freezing it early dooms it to never achieving high quality.
>     I know of projects where the users explicitly requested that the
>     developers not freeze the API but instead prioritize speed and quality.
>     >>>>>
>     >>>>> I agree 100% on the arguments, but let’s think in the reverse
>     terms,
>     >>>>> highlighting lack of maturity can play against the intended
>     goal of
>     >>>>> use and adoption even if for a noble reason. It is basic
>     priming 101
>     >>>>> [1].
>     >>>>>
>     >>>>> > Maybe the word is just too negative-sounding? Alternatives
>     might be "unstable" or "incubating".
>     >>>>>
>     >>>>> Yes! “experimental” should not be viewed as a bad thing unless
>     you are
>     >>>>> a company that has fewer resources and is trying to protect its
>     >>>>> investment, so in that case they may hesitate to use it. In this case
>     >>>>> probably incubating is a better term because it has less of the
>     >>>>> ‘tentative’ dimension associated with Experimental.
>     >>>>>
>     >>>>> > Now, for the Jet runner, most runners sit on a branch for a
>     while, not being released at all, and move to master as their
>     "graduation". I think releasing under an "experimental" name is an
>     improvement, making it available to users to try out. But we
>     probably should have discussed before doing something different than
>     all the other runners.
>     >>>>>
>     >>>>> There is something I don’t get in the case of Jet runner. From the
>     >>>>> discussion in this thread it seems it has everything required
>     to not
>     >>>>> be ‘experimental’. It passes ValidatesRunner and can even run
>     Nexmark
>     >>>>> that’s more than some runners already merged in master, so I still
>     >>>>> don’t get why we want to give it a different connotation.
>     >>>>>
>     >>>>> [1] https://en.wikipedia.org/wiki/Priming_(psychology)
>     >>>>>
>     >>>>> On Sun, May 26, 2019 at 4:43 AM Kenneth Knowles
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >
>     >>>>> > Personally, I think that it is good that moving from
>     experimental to non-experimental is a breaking change in the
>     dependency - one has backwards-incompatible changes and the other
>     does not. If artifacts had separate versioning we could use 0.x for
>     this.
>     >>>>> >
>     >>>>> > But biggest motivation for me are these:
>     >>>>> >
>     >>>>> >  - using experimental features should be opt-in
>     >>>>> >  - should be impossible to use an experimental feature
>     without knowing it (so "opt-in" to a normal-looking feature is not
>     enough)
>     >>>>> >  - developers of an experimental feature should be motivated
>     to "graduate" it
>     >>>>> >
>     >>>>> > So I think a user of an experimental feature should have to
>     actually type the word "experimental" either on the command line or
>     in their dependencies. That's just my opinion. In the thread [1]
>     myself and Robert were the ones that went in this direction of
>     opt-in. But it was mostly lazy consensus, plus the review on the
>     pull request, that got us to this state. Definitely worth discussing
>     more.
>     >>>>> >
>     >>>>> > FWIW I don't think "experimental" should be viewed as a bad
>     thing. It just means you are able to make backwards-incompatible
>     changes, and that users should be aware that they will need to
>     adjust APIs (probably only a little) with new releases. Most
>     software is not very good until it has been around for a long time,
>     and in my experience the problem is missing the mark on
>     abstractions, so backwards compatibility *must* be broken to achieve
>     quality. Freezing it early dooms it to never achieving high quality.
>     I know of projects where the users explicitly requested that the
>     developers not freeze the API but instead prioritize speed and quality.
>     >>>>> >
>     >>>>> > Maybe the word is just too negative-sounding? Alternatives
>     might be "unstable" or "incubating".
>     >>>>> >
>     >>>>> > Now, for the Jet runner, most runners sit on a branch for a
>     while, not being released at all, and move to master as their
>     "graduation". I think releasing under an "experimental" name is an
>     improvement, making it available to users to try out. But we
>     probably should have discussed before doing something different than
>     all the other runners.
>     >>>>> >
>     >>>>> > Kenn
>     >>>>> >
>     >>>>> > [1] https://lists.apache.org/thread.html/302bd51c77feb5c9ce39882316d391535a0fc92e7608a623d9139160@%3Cdev.beam.apache.org%3E
>     >>>>> >
>     >>>>> > On Sat, May 25, 2019 at 1:03 AM Ismaël Mejía
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >>
>     >>>>> >> Including the experimental suffix in artifact names is not
>     a good idea
>     >>>>> >> either because once we decide that it is not experimental
>     anymore this
>     >>>>> >> will be a breaking change for users, who will then need to
>     update the
>     >>>>> >> dependencies in their code. Also it is error-prone to use different
>     mappings
>     >>>>> >> for directories and artifacts (even if possible).
>     >>>>> >>
>     >>>>> >> May we reconsider this Kenn? I understand the motivation
>     but I hardly
>     >>>>> >> see this making things better or more clear. Any runner
>     user will end
>     >>>>> >> up reading the runner documentation and capability matrix
>     so he will
>     >>>>> >> catch the current status that way.
>     >>>>> >>
>     >>>>> >>
>     >>>>> >>
>     >>>>> >> On Sat, May 25, 2019 at 8:35 AM Jozsef Bartok
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >> >
>     >>>>> >> > I missed Kenn's input when writing my previous mail. Sorry.
>     >>>>> >> > So, to recap: I should remove "experimental" from any
>     directory names, but find another way of configuring the artifact
>     so that it still has "experimental" in its name.
>     >>>>> >> > Right?
>     >>>>> >> >
>     >>>>> >> > On Sat, May 25, 2019 at 9:32 AM Jozsef Bartok
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >> >>
>     >>>>> >> >> Yes, I'll gladly fix it, we aren't particularly keen to
>     be labeled as experimental either..
>     >>>>> >> >>
>     >>>>> >> >> Btw. initially the "experimental" word was only in the
>     Gradle module name, but then there was some change
>     >>>>> >> >> ([BEAM-4046] decouple gradle project names and maven
>     artifact ids - 4/2/19) which kind of ended up
>     >>>>> >> >> putting it in the directory name. Maybe I should have
>     merged with that differently, but this is how
>     >>>>> >> >> it seemed consistent.
>     >>>>> >> >>
>     >>>>> >> >> Anyways, will fix it in my next PR.
>     >>>>> >> >>
>     >>>>> >> >> On Fri, May 24, 2019 at 5:53 PM Ismaël Mejía
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >> >>>
>     >>>>> >> >>> I see thanks Jozsef, marking things as Experimental was
>     discussed but
>     >>>>> >> >>> we never agreed on doing this at the directory level.
>     We can cover the
>     >>>>> >> >>> same ground by putting an annotation in the classes (in
>     particular the
>     >>>>> >> >>> JetRunner and JetPipelineOptions classes which are the
>     real public
>     >>>>> >> >>> interface, or in the documentation (in particular
>     website), I do not
>     >>>>> >> >>> see how putting this in the directory name helps and if
>     so we may need
>     >>>>> >> >>> to put this in many other directories which is far from
>     ideal. Any
>     >>>>> >> >>> chance this can be fixed (jet-experimental -> jet) ?
>     >>>>> >> >>>
>     >>>>> >> >>> On Fri, May 24, 2019 at 9:08 AM Jozsef Bartok
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >> >>> >
>     >>>>> >> >>> > Hi Ismaël!
>     >>>>> >> >>> >
>     >>>>> >> >>> > Quoting Kenn (from PR-8410): "We discussed on list
>     that it would be better to have new things always start as
>     experimental in a way that clearly distinguishes them from the core."
>     >>>>> >> >>> >
>     >>>>> >> >>> > Rgds
>     >>>>> >> >>> >
>     >>>>> >> >>> > On Thu, May 23, 2019 at 10:44 PM Ismaël Mejía
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >> >>> >>
>     >>>>> >> >>> >> I saw that the runner was merged but I don’t get why
>     the folder is
>     >>>>> >> >>> >> called ‘runners/jet-experimental’ and not simply
>     ‘runners/jet’. Is it
>     >>>>> >> >>> >> because the runner does not pass ValidatesRunner? Or
>     because the
>     >>>>> >> >>> >> contributors are few? I don’t really see any reason
>     behind this
>     >>>>> >> >>> >> suffix. And even if the status is not mature that’s
>     not different from
>     >>>>> >> >>> >> other already merged runners.
>     >>>>> >> >>> >>
>     >>>>> >> >>> >> On Fri, Apr 26, 2019 at 9:43 PM Kenneth Knowles
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >> >>> >> >
>     >>>>> >> >>> >> > Nice! That is *way* more than the PR I was looking
>     for. I just meant that you could update the website/ directory. It
>     is fine to keep the runner in your own repository if you want.
>     >>>>> >> >>> >> >
>     >>>>> >> >>> >> > But I think it is great if you want to contribute
>     it to Apache Beam (hence donate it to the Apache Software
>     Foundation). The benefits include: low-latency testing, free updates
>     when someone does a refactor. Things to consider are: subject to ASF
>     / Beam governance, PMC, committers, subject to Beam's release cadence
>     (and we might exclude from Beam releases for a little bit).
>     Typically, we have kept runners on a branch until they are somewhat
>     stable. I don't feel strongly about this for disjoint codebases that
>     can easily be excluded from releases. We might want to suffix
>     `-experimental` to the artifacts for some time.
>     >>>>> >> >>> >> >
>     >>>>> >> >>> >> > I commented on the PR about the necessary i.p.
>     clearance steps.
>     >>>>> >> >>> >> >
>     >>>>> >> >>> >> > Kenn
>     >>>>> >> >>> >> >
>     >>>>> >> >>> >> > On Fri, Apr 26, 2019 at 3:59 AM
>     [email protected] <mailto:[email protected]>
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >> >>> >> >>
>     >>>>> >> >>> >> >> Hi Kenn.
>     >>>>> >> >>> >> >>
>     >>>>> >> >>> >> >> It took me a while to migrate our code to the
>     Beam repo, but I finally have been able to create the Pull Request
>     you asked for, this is it: https://github.com/apache/beam/pull/8410
>     >>>>> >> >>> >> >>
>     >>>>> >> >>> >> >> Looking forward to your feedback!
>     >>>>> >> >>> >> >>
>     >>>>> >> >>> >> >> Best regards,
>     >>>>> >> >>> >> >> Jozsef
>     >>>>> >> >>> >> >>
>     >>>>> >> >>> >> >> On 2019/04/19 20:52:42, Kenneth Knowles
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >> >>> >> >> > The ValidatesRunner tests are the best source
>     we have for knowing the
>     >>>>> >> >>> >> >> > capabilities of a runner. Are there
>     instructions for running the tests?
>     >>>>> >> >>> >> >> >
>     >>>>> >> >>> >> >> > Assuming we can check it out, then just open a
>     PR to the website with the
>     >>>>> >> >>> >> >> > current capabilities and caveats. Since it is a
>     big deal and could use lots
>     >>>>> >> >>> >> >> > of eyes, I would share the PR link on this thread.
>     >>>>> >> >>> >> >> >
>     >>>>> >> >>> >> >> > Kenn
>     >>>>> >> >>> >> >> >
>     >>>>> >> >>> >> >> > On Thu, Apr 18, 2019 at 11:53 AM Jozsef Bartok
>     <[email protected] <mailto:[email protected]>> wrote:
>     >>>>> >> >>> >> >> >
>     >>>>> >> >>> >> >> > > Hi. We at Hazelcast Jet have been working for
>     a while now to implement a
>     >>>>> >> >>> >> >> > > Java Beam Runner (non-portable) based on
>     Hazelcast Jet (
>     >>>>> >> >>> >> >> > > https://jet.hazelcast.org/). The process is
>     still ongoing (
>     >>>>> >> >>> >> >> > >
>     https://github.com/hazelcast/hazelcast-jet-beam-runner), but we are
>     >>>>> >> >>> >> >> > > aiming for a fully functional, reliable
>     Runner which can proudly join the
>     >>>>> >> >>> >> >> > > Capability Matrix. For that purpose I would
>     like to ask what’s your process
>     >>>>> >> >>> >> >> > > of validating runners? We are already running
>     the @ValidatesRunner tests
>     >>>>> >> >>> >> >> > > and the Nexmark test suite, but beyond that
>     what other steps do we need to
>     >>>>> >> >>> >> >> > > take to get our Runner to the level it needs
>     to be at?
>     >>>>> >> >>> >> >> > >
>     >>>>> >> >>> >> >> >
>     >>>
>     >>>
>     >>>
>     >>> --
>     >>>
>     >>> This email may be confidential and privileged. If you received
>     this communication by mistake, please don't forward it to anyone
>     else, please erase all copies and attachments, and please let me
>     know that it has gone to the wrong person.
>     >>>
>     >>> The above terms reflect a potential business arrangement, are
>     provided solely as a basis for further discussion, and are not
>     intended to be and do not constitute a legally binding obligation.
>     No legally binding obligations will be created, implied, or inferred
>     until an agreement in final form is executed in writing by all
>     parties involved.
>     >
>     >
>     >
>
>
