Re: Hazelcast Jet Runner

Kenneth Knowles Tue, 28 May 2019 21:04:09 -0700

On Mon, May 27, 2019 at 3:44 PM Reuven Lax <re...@google.com> wrote:

> We generally use Experimental for two different things, which leads to
> confusion.
>   1. Features that work stably, but where we think we might still make
> some changes to the API.
>   2. New features that we think might not yet be stable.
>


Part of my point is that these tend to be related. Often you discover that
you cannot achieve high quality without changing the API. I think once
quality is achieved, verified, and assured it can graduate from being
Experimental. We may still have a better idea later, but we can probably
just give it a different name.

Kenn




> This dual usage leads to a lot of confusion IMO. The fact that we tend to
> forget to remove the @Experimental tag also makes it somewhat useless. Many
> APIs that have been in place for years and are used by most Beam users are
> still marked Experimental.
>
> Reuven
>
> On Mon, May 27, 2019 at 2:16 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>
>> > Personally, I think that it is good that moving from experimental to
>> non-experimental is a breaking change in the dependency - one has
>> backwards-incompatible changes and the other does not. If artifacts had
>> separate versioning we could use 0.x for this.
>>
>> In theory it seems so, but in practice it is an annoyance to an end
>> user that already took the ‘risk’ of using an experimental feature.
>> Awareness is probably not the most important reason to break existing
>> code (even if it could be easily fixed). The alternative of doing this
>> with version numbers at least seems less impacting but can be
>> confusing.
>>
>> > But the biggest motivations for me are these:
>> >
>> >  - using experimental features should be opt-in
>> >  - should be impossible to use an experimental feature without knowing
>> it (so "opt-in" to a normal-looking feature is not enough)
>> > - developers of an experimental feature should be motivated to
>> "graduate" it
>>
>> The fundamental problem with this approach is inconsistency with our
>> present/past. So far we have ‘Experimental’ features everywhere, so
>> suddenly becoming opt-in leaves us in an inconsistent state. For
>> example, all IOs are marked internally as Experimental but not at the
>> level of directories/artifacts. Adding this suffix to a new IO, apart
>> from making end users afraid to use it, may also give the false
>> impression that the older ones not explicitly marked are not
>> experimental.
>>
>> What will be the state, for example, of runner modules that contain
>> both mature, well-tested runners like the old Flink and Spark runners
>> and the more experimental new translations for Portability? Again,
>> more confusion.
>>
>> > FWIW I don't think "experimental" should be viewed as a bad thing. It
>> just means you are able to make backwards-incompatible changes, and that
>> users should be aware that they will need to adjust APIs (probably only a
>> little) with new releases. Most software is not very good until it has been
>> around for a long time, and in my experience the problem is missing the
>> mark on abstractions, so backwards compatibility *must* be broken to
>> achieve quality. Freezing it early dooms it to never achieving high
>> quality. I know of projects where the users explicitly requested that the
>> developers not freeze the API but instead prioritize speed and quality.
>>
>> I agree 100% with the arguments, but let’s think in reverse terms:
>> highlighting a lack of maturity can work against the intended goal of
>> use and adoption, even if done for a noble reason. It is basic priming
>> 101 [1].
>>
>> > Maybe the word is just too negative-sounding? Alternatives might be
>> "unstable" or "incubating".
>>
>> Yes! “Experimental” should not be viewed as a bad thing, unless you are
>> a company that has fewer resources and is trying to protect its
>> investment; in that case you may hesitate to use it. So “incubating”
>> is probably a better term because it has less of the ‘tentative’
>> dimension associated with Experimental.
>>
>> > Now, for the Jet runner, most runners sit on a branch for a while, not
>> being released at all, and move to master as their "graduation". I think
>> releasing under an "experimental" name is an improvement, making it
>> available to users to try out. But we probably should have discussed before
>> doing something different than all the other runners.
>>
>> There is something I don’t get in the case of the Jet runner. From the
>> discussion in this thread it seems it has everything required to not
>> be ‘experimental’. It passes ValidatesRunner and can even run Nexmark;
>> that’s more than some runners already merged in master, so I still
>> don’t get why we want to give it a different connotation.
>>
>> [1] https://en.wikipedia.org/wiki/Priming_(psychology)
>>
>> On Sun, May 26, 2019 at 4:43 AM Kenneth Knowles <k...@apache.org> wrote:
>> >
>> > Personally, I think that it is good that moving from experimental to
>> non-experimental is a breaking change in the dependency - one has
>> backwards-incompatible changes and the other does not. If artifacts had
>> separate versioning we could use 0.x for this.
>> >
>> > But the biggest motivations for me are these:
>> >
>> >  - using experimental features should be opt-in
>> >  - should be impossible to use an experimental feature without knowing
>> it (so "opt-in" to a normal-looking feature is not enough)
>> >  - developers of an experimental feature should be motivated to
>> "graduate" it
>> >
>> > So I think a user of an experimental feature should have to actually
>> type the word "experimental" either on the command line or in their
>> dependencies. That's just my opinion. In the thread [1] Robert and I
>> were the ones that went in this direction of opt-in. But it was mostly lazy
>> consensus, plus the review on the pull request, that got us to this state.
>> Definitely worth discussing more.
>> >
>> > FWIW I don't think "experimental" should be viewed as a bad thing. It
>> just means you are able to make backwards-incompatible changes, and that
>> users should be aware that they will need to adjust APIs (probably only a
>> little) with new releases. Most software is not very good until it has been
>> around for a long time, and in my experience the problem is missing the
>> mark on abstractions, so backwards compatibility *must* be broken to
>> achieve quality. Freezing it early dooms it to never achieving high
>> quality. I know of projects where the users explicitly requested that the
>> developers not freeze the API but instead prioritize speed and quality.
>> >
>> > Maybe the word is just too negative-sounding? Alternatives might be
>> "unstable" or "incubating".
>> >
>> > Now, for the Jet runner, most runners sit on a branch for a while, not
>> being released at all, and move to master as their "graduation". I think
>> releasing under an "experimental" name is an improvement, making it
>> available to users to try out. But we probably should have discussed before
>> doing something different than all the other runners.
>> >
>> > Kenn
>> >
>> > [1]
>> https://lists.apache.org/thread.html/302bd51c77feb5c9ce39882316d391535a0fc92e7608a623d9139160@%3Cdev.beam.apache.org%3E
>> >
>> > On Sat, May 25, 2019 at 1:03 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>> >>
>> >> Including the experimental suffix in artifact names is not a good idea
>> >> either, because once we decide that it is not experimental anymore this
>> >> will be a breaking change for users, who will then need to update their
>> >> dependencies. Also, it is error-prone to use different mappings
>> >> for directories and artifacts (even if possible).
>> >>
>> >> May we reconsider this, Kenn? I understand the motivation but I hardly
>> >> see this making things better or clearer. Any runner user will end
>> >> up reading the runner documentation and capability matrix, so they will
>> >> learn the current status that way.
>> >>
>> >>
>> >>
>> >> On Sat, May 25, 2019 at 8:35 AM Jozsef Bartok <jo...@hazelcast.com>
>> wrote:
>> >> >
>> >> > I missed Kenn's input when writing my previous mail. Sorry.
>> >> > So, to recap: I should remove "experimental" from any directory
>> names, but find another way of configuring the artifact so that it still
>> has "experimental" in its name.
>> >> > Right?
>> >> >
>> >> > On Sat, May 25, 2019 at 9:32 AM Jozsef Bartok <jo...@hazelcast.com>
>> wrote:
>> >> >>
>> >> >> Yes, I'll gladly fix it, we aren't particularly keen to be labeled
>> as experimental either..
>> >> >>
>> >> >> Btw. initially the "experimental" word was only in the Gradle
>> module name, but then there was some change
>> >> >> ([BEAM-4046] decouple gradle project names and maven artifact ids -
>> 4/2/19) which kind of ended up
>> >> >> putting it in the directory name. Maybe I should have merged with
>> that differently, but this is how
>> >> >> it seemed consistent.
>> >> >>
>> >> >> Anyways, will fix it in my next PR.
>> >> >>
>> >> >> On Fri, May 24, 2019 at 5:53 PM Ismaël Mejía <ieme...@gmail.com>
>> wrote:
>> >> >>>
>> >> >>> I see, thanks Jozsef. Marking things as Experimental was discussed,
>> >> >>> but we never agreed on doing this at the directory level. We can
>> >> >>> cover the same ground by putting an annotation on the classes (in
>> >> >>> particular the JetRunner and JetPipelineOptions classes, which are
>> >> >>> the real public interface) or in the documentation (in particular
>> >> >>> the website). I do not see how putting this in the directory name
>> >> >>> helps, and if it does we may need to put it in many other
>> >> >>> directories, which is far from ideal. Any chance this can be fixed
>> >> >>> (jet-experimental -> jet)?
>> >> >>>
>> >> >>> On Fri, May 24, 2019 at 9:08 AM Jozsef Bartok <jo...@hazelcast.com>
>> wrote:
>> >> >>> >
>> >> >>> > Hi Ismaël!
>> >> >>> >
>> >> >>> > Quoting Kenn (from PR-8410): "We discussed on list that it would
>> be better to have new things always start as experimental in a way that
>> clearly distinguishes them from the core."
>> >> >>> >
>> >> >>> > Rgds
>> >> >>> >
>> >> >>> > On Thu, May 23, 2019 at 10:44 PM Ismaël Mejía <ieme...@gmail.com>
>> wrote:
>> >> >>> >>
>> >> >>> >> I saw that the runner was merged, but I don’t get why the folder
>> >> >>> >> is called ‘runners/jet-experimental’ and not simply
>> >> >>> >> ‘runners/jet’. Is it because the runner does not pass
>> >> >>> >> ValidatesRunner? Or because the contributors are few? I don’t
>> >> >>> >> really see any reason behind this suffix. And even if the status
>> >> >>> >> is not mature, that’s not different from other already merged
>> >> >>> >> runners.
>> >> >>> >>
>> >> >>> >> On Fri, Apr 26, 2019 at 9:43 PM Kenneth Knowles <
>> k...@apache.org> wrote:
>> >> >>> >> >
>> >> >>> >> > Nice! That is *way* more than the PR I was looking for. I
>> just meant that you could update the website/ directory. It is fine to keep
>> the runner in your own repository if you want.
>> >> >>> >> >
>> >> >>> >> > But I think it is great if you want to contribute it to
>> Apache Beam (hence donate it to the Apache Software Foundation). The
>> benefits include: low-latency testing, free updates when someone does a
>> refactor. Things to consider are: subject to ASF / Beam governance, PMC,
>> committers, subject to Beam's release cadence (and we might exclude from
>> Beam releases for a little bit). Typically, we have kept runners on a
>> branch until they are somewhat stable. I don't feel strongly about this for
>> disjoint codebases that can easily be excluded from releases. We might want
>> to suffix `-experimental` to the artifacts for some time.
>> >> >>> >> >
>> >> >>> >> > I commented on the PR about the necessary IP clearance
>> steps.
>> >> >>> >> >
>> >> >>> >> > Kenn
>> >> >>> >> >
>> >> >>> >> > On Fri, Apr 26, 2019 at 3:59 AM jo...@hazelcast.com <
>> jo...@hazelcast.com> wrote:
>> >> >>> >> >>
>> >> >>> >> >> Hi Kenn.
>> >> >>> >> >>
>> >> >>> >> >> It took me a while to migrate our code to the Beam repo, but
>> I finally have been able to create the Pull Request you asked for, this is
>> it: https://github.com/apache/beam/pull/8410
>> >> >>> >> >>
>> >> >>> >> >> Looking forward to your feedback!
>> >> >>> >> >>
>> >> >>> >> >> Best regards,
>> >> >>> >> >> Jozsef
>> >> >>> >> >>
>> >> >>> >> >> On 2019/04/19 20:52:42, Kenneth Knowles <k...@apache.org>
>> wrote:
>> >> >>> >> >> > The ValidatesRunner tests are the best source we have for
>> knowing the
>> >> >>> >> >> > capabilities of a runner. Are there instructions for
>> running the tests?
>> >> >>> >> >> >
>> >> >>> >> >> > Assuming we can check it out, then just open a PR to the
>> website with the
>> >> >>> >> >> > current capabilities and caveats. Since it is a big deal
>> and could use lots
>> >> >>> >> >> > of eyes, I would share the PR link on this thread.
>> >> >>> >> >> >
>> >> >>> >> >> > Kenn
>> >> >>> >> >> >
>> >> >>> >> >> > On Thu, Apr 18, 2019 at 11:53 AM Jozsef Bartok <
>> jo...@hazelcast.com> wrote:
>> >> >>> >> >> >
>> >> >>> >> >> > > Hi. We at Hazelcast Jet have been working for a while
>> now to implement a
>> >> >>> >> >> > > Java Beam Runner (non-portable) based on Hazelcast Jet (
>> >> >>> >> >> > > https://jet.hazelcast.org/). The process is still
>> ongoing (
>> >> >>> >> >> > > https://github.com/hazelcast/hazelcast-jet-beam-runner),
>> but we are
>> >> >>> >> >> > > aiming for a fully functional, reliable Runner which can
>> proudly join the
>> >> >>> >> >> > > Capability Matrix. For that purpose I would like to ask
>> what’s your process
>> >> >>> >> >> > > of validating runners? We are already running the
>> @ValidatesRunner tests
>> >> >>> >> >> > > and the Nexmark test suite, but beyond that what other
>> steps do we need to
>> >> >>> >> >> > > take to get our Runner to the level it needs to be at?
>> >> >>> >> >> > >
>> >> >>> >> >> >
>>
>
