On Wed, May 8, 2019 at 9:29 AM Ahmet Altay <al...@google.com> wrote:

> *From: *Kenneth Knowles <k...@apache.org>
> *Date: *Wed, May 8, 2019 at 9:24 AM
> *To: *dev
>
>> On Fri, Apr 19, 2019 at 3:09 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>>
>>> It seems we mostly agree that @Experimental is important, and that API changes (removals) on experimental features should happen quickly but still give users some time, so the purpose of Experimental is not lost.
>>>
>>> Ahmet's proposal, given our current release calendar, is close to 2 releases. Can we settle on 2 releases as a 'minimum time' before removal? (This leaves maintainers the option to support a feature for longer if they want, as discussed in the related KafkaIO thread, while still being friendly to users.)
>>>
>>> Do we agree?
>>
>> This sounds pretty good to me.
>
> Sounds good to me too.
>
>> How can we manage this? Right now we tie most activities (like re-triaging flakes) to the release process, since it is the only thing that happens regularly for the community. If we don't have some forcing function then I expect the whole thing will just be forgotten.
>
> Can we pre-create a list of future releases in JIRA, and for each experimental feature require that a JIRA issue is created for resolving the experimental status and tag it with the release that will happen after the minimum time period?
>
Great idea. I just created the 2.15.0 release so it reaches far enough ahead for right now.

Kenn

>
>> Kenn
>>
>>> Note: for the other subjects (e.g. when an Experimental feature should stop being experimental) I think we will hardly reach agreement, so this should be handled on a case-by-case basis by the maintainers. If you want to follow up on that discussion we can open another thread for it.
>>>
>>> On Sat, Apr 6, 2019 at 1:04 AM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> I agree that the Experimental feature is still very useful. I was trying to argue that we diluted its value, so +1 to reclaiming that.
>>>>
>>>> Back to the original question: in my opinion, removing existing "experimental and deprecated" features in n=1 release will confuse users. This will likely be a surprise to them because we have been maintaining this state release after release. I would propose warning users of such a change in the next release and giving them at least 3 months to upgrade to the suggested newer paths. In the future we can have shorter timelines, assuming that we set user expectations right.
>>>>
>>>> On Fri, Apr 5, 2019 at 3:01 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>>>>
>>>>> I agree 100% with Kenneth on the multiple advantages that the Experimental feature gave us. I can also count multiple places where this has been essential in modules other than core. I disagree that the @Experimental annotation has lost its meaning; it is simply ill-defined, and probably so by design, because its advantages come from that.
>>>>>
>>>>> Most of the topics in this thread are a consequence of this loose definition, e.g. (1) not defining how a feature becomes stable, and (2) what to do when we want to remove an experimental feature. These are things we need to decide whether to define or just continue to handle as we do today.
>>>>>
>>>>> Defining a target for graduating an Experimental feature is a bit too aggressive, with not much benefit; in this case we could be losing the advantages of Experimental (unless we could change the proposed version in the future). This probably makes sense for the removal of features but makes less sense for deciding when some feature becomes stable. Of course in the case of the core SDK packages this is probably more critical, but nothing guarantees that things will be ready when we expect them to be. When will we tag things like SDF or the portability APIs as stable? We cannot predict when features will be complete.
>>>>>
>>>>> Nobody has mentioned the LTS releases. Couldn't these be the midpoints for these decisions? That would at least give LTS some value, because so far I still have trouble understanding the value of this idea, given that we can do a minor release of any pre-released version.
>>>>>
>>>>> This debate is super important and nice to have, but we lost focus on my initial question. I like the proposal to remove a deprecated experimental feature (or part of it) after one release, in particular if the feature has a clear replacement path. However, for cases like the removal of previously supported versions of Kafka, one release may be too short. Other opinions on this (or on the other topics)?
>>>>>
>>>>> On Fri, Apr 5, 2019 at 10:52 AM Robert Bradshaw <rober...@google.com> wrote:
>>>>>
>>>>>> If it's technically feasible, I am also in favor of requiring experimental features to be opt-in only (per-tag; Python should be updated). We should probably regularly audit the set of experimental features we ship (I'd say as part of the release, but that process is laborious enough; perhaps we should do it on a half-release cycle?)
>>>>>> I think imposing hard deadlines (chosen when a feature is introduced) is too extreme, but might be valuable if opt-in plus regular audit is insufficient.
>>>>>>
>>>>>> On Thu, Apr 4, 2019 at 5:28 AM Kenneth Knowles <k...@apache.org> wrote:
>>>>>>
>>>>>>> This all makes me think that we should rethink how we ship experimental features. My experience is also that (1) users don't know if something is experimental or don't think hard about it and (2) we don't use the experimental time period to gather feedback and make changes.
>>>>>>>
>>>>>>> How can we change both of these? Perhaps we could require experimental features to be opt-in. Flags work, and so do clearly marked experimental dependencies that a user has to add. Changing the core is sometimes tricky to put behind a flag but rarely impossible. This way a contributor is also motivated to gather feedback to mature their feature so it becomes default instead of opt-in.
>>>>>>>
>>>>>>> The need that @Experimental was trying to address is real. We *do* need a way to try things and get feedback prior to committing to forever support. We have discovered real problems far too late, or not had the will to fix the issues we did find:
>>>>>>> - many trigger combinators should probably be deleted
>>>>>>> - many triggers cannot meet a good spec with merging windows
>>>>>>> - the continuation trigger idea doesn't work well
>>>>>>> - CombineFn had to have its spec changed in order to be both correct and efficient
>>>>>>> - OutputTimeFn as a UDF is convenient for Java but it turns out an enum is better for portability
>>>>>>> - Coder contexts turned out to be a major usability problem
>>>>>>> - The built-in data types for schemas are evolving (luckily these are really being worked on!)
>>>>>>>
>>>>>>> That's just what I can think of off the top of my head.
>>>>>>> I expect the examples from IOs are more numerous; in that case it is pretty easy to fork and make a new and better IO.
>>>>>>>
>>>>>>> And as an extreme view, I would prefer that if we add a deadline for experimental features, then our default action is to remove them, not declare them stable. If no one is trying to mature a feature and get it out of opt-in status, then it probably has not matured. And perhaps if no one cares enough to do that work, it also isn't that important.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Wed, Apr 3, 2019 at 5:57 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>>
>>>>>>>> I agree with Reuven that our experimental annotation is not useful any more. For example, Datastore IO in the Python SDK has been experimental for 2 years now. Even though it is marked as experimental, an upgrade is being carefully planned [1] as if it were not experimental. Given that, I do not think we can remove features within a small number of minor releases. (An exception would be if we have clear knowledge of very low usage of a certain IO.)
>>>>>>>>
>>>>>>>> I am worried that tagging experimental features with release versions will add toil to the release process, as mentioned, and will also add to user confusion. What would be the signal to a user if they see an experimental feature's target release bumped between releases? How about tagging experimental features with JIRAs (similar to TODOs) with an action to either promote them to supported features or remove them? These JIRAs could have fix version targets like any other release-blocking JIRAs. It will also clarify who is responsible for a given experimental feature.
>>>>>>>>
>>>>>>>> [1] https://lists.apache.org/thread.html/5ec88967aa4a382db07a60e0101c4eb36165909076867155ab3546a6@%3Cdev.beam.apache.org%3E
>>>>>>>>
>>>>>>>> On Wed, Apr 3, 2019 at 5:24 PM Reuven Lax <re...@google.com> wrote:
>>>>>>>>
>>>>>>>>> Experiments are already tagged with a Kind enum (e.g. @Experimental(Kind.Schemas)).
>>>>>>>>
>>>>>>>> This is not the case for Python's annotations. It would be a good idea to add it there as well.
>>>>>>>>
>>>>>>>>> On Wed, Apr 3, 2019 at 4:56 PM Ankur Goenka <goe...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> I think a release version with the Experimental flag makes sense. In addition, I think many of our users start to rely on experimental features because they are not even aware that these features are experimental, and it's really hard to find the experimental features used without taking a good look at the Beam code and having some knowledge about it.
>>>>>>>>>>
>>>>>>>>>> It would be good if we could have a step at pipeline submission time which prints all the experiments used in verbose mode. This might also require adding a meaningful group name for the experiment, for example:
>>>>>>>>>>
>>>>>>>>>> @Experimental("SDF", 2.15.0)
>>>>>>>>>>
>>>>>>>>>> This will of course add additional effort and require additional context while tagging experiments.
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 3, 2019 at 4:43 PM Reuven Lax <re...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Our Experimental annotation has become almost useless. Many core, widely-used parts of the API (e.g. triggers) are still all marked as experimental.
>>>>>>>>>>> So many users use these features that we couldn't really change them (in a backwards-incompatible way) without hurting many users, so the fact they are marked Experimental has become a fiction.
>>>>>>>>>>>
>>>>>>>>>>> Could we add a deadline to the Experimental tag - a release version when it will be removed? e.g.
>>>>>>>>>>>
>>>>>>>>>>> @Experimental(2.15.0)
>>>>>>>>>>>
>>>>>>>>>>> We can have a test that ensures that the tag is removed at this version. Of course if we're not ready to remove experimental by that version, it's fine - we can always bump the tagged version. However this forces us to think about each one.
>>>>>>>>>>>
>>>>>>>>>>> Downside - it might add more toil to the existing release process.
>>>>>>>>>>>
>>>>>>>>>>> Reuven
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 3, 2019 at 4:00 PM Kyle Weaver <kcwea...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> > We might also want to get in the habit of reviewing if something should no longer be experimental.
>>>>>>>>>>>>
>>>>>>>>>>>> +1
>>>>>>>>>>>>
>>>>>>>>>>>> Kyle Weaver | Software Engineer | kcwea...@google.com | +16502035555
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 3, 2019 at 3:53 PM Kenneth Knowles <k...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I think option 2 with n=1 minor version seems OK. So users get the message for one release and it is gone the next. We should make sure the deprecation warning says "this is an experimental feature, so it will be removed after 1 minor version". And we need a process for doing it so it doesn't sit around.
>>>>>>>>>>>>> I think we should also leave room for using our own judgment about whether the user pain is so small that a deprecation cycle is not needed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We might also want to get in the habit of reviewing if something should no longer be experimental.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 3, 2019 at 2:33 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> When we did the first stable release of Beam (2.0.0) we decided to annotate most of the Beam IOs as @Experimental because we were cautious about not getting the APIs right on the first try. This was a really good decision because we could make serious improvements and refactorings to them in the first releases without the hassle of keeping backwards compatibility. However, after some more releases users started to rely on features and supported versions, so we ended up in a situation where we could not change them arbitrarily without consequences for end users.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So we started to deprecate some features and parts of the API without removing them, e.g. the introduction of HadoopFormatIO deprecated HadoopInputFormatIO, we deprecated methods of MongoDbIO and MqttIO to improve the APIs (in most cases with valid/improved replacements), and recently the removal of support for older versions in KafkaIO was discussed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Keeping deprecated stuff in experimental APIs does not seem to make sense, but it is what we have started to do to be ‘user friendly’. It is probably a good moment to define a clear path for removal and breaking changes of experimental features. Some options:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Stay as we were, do not mark things as deprecated and remove them at will because this is the contract of @Experimental.
>>>>>>>>>>>>>> 2. Deprecate stuff and remove it after n versions (where n could be 3 releases).
>>>>>>>>>>>>>> 3. Deprecate stuff and remove it just after a new LTS is decided to ensure users who need these features may still have them for some time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to know your opinions about this, or if you have other ideas. Notice that in this discussion I refer only to @Experimental features.
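
For reference, the deadline-tagged annotation floated in this thread (a group name plus a release version, and a test that flags the tag once that version is reached) could look roughly like the sketch below. Everything in it is an assumption made for illustration: `ExperimentalUntil`, its `group` and `removeBy` fields, the two sample classes, and the `overdue` helper are hypothetical names, not Beam's actual @Experimental annotation or any existing Beam test.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

class ExperimentalDeadlineCheck {

  /** Hypothetical annotation (not Beam's real @Experimental). */
  @Retention(RetentionPolicy.RUNTIME) // must be visible to reflection at test time
  @Target(ElementType.TYPE)
  @interface ExperimentalUntil {
    String group();    // meaningful group name, e.g. "SDF"
    String removeBy(); // release by which the experimental status must be resolved
  }

  // Sample features, made up for this sketch.
  @ExperimentalUntil(group = "SDF", removeBy = "2.15.0")
  static class SplittableThing {}

  @ExperimentalUntil(group = "Schemas", removeBy = "2.13.0")
  static class SchemaThing {}

  /** Compares dotted numeric versions, so that 2.15.0 sorts after 2.9.0. */
  static int compareVersions(String a, String b) {
    String[] x = a.split("\\.");
    String[] y = b.split("\\.");
    int n = Math.max(x.length, y.length);
    for (int i = 0; i < n; i++) {
      int xi = i < x.length ? Integer.parseInt(x[i]) : 0; // missing parts count as 0
      int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
      if (xi != yi) {
        return Integer.compare(xi, yi);
      }
    }
    return 0;
  }

  /** Lists the classes whose tagged release is at or before the current version. */
  static List<String> overdue(String currentVersion, Class<?>... classes) {
    List<String> result = new ArrayList<>();
    for (Class<?> c : classes) {
      ExperimentalUntil tag = c.getAnnotation(ExperimentalUntil.class);
      if (tag != null && compareVersions(currentVersion, tag.removeBy()) >= 0) {
        result.add(c.getSimpleName() + " (" + tag.group() + ", due " + tag.removeBy() + ")");
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // A release-time check could fail the build when this list is non-empty.
    System.out.println(overdue("2.14.0", SplittableThing.class, SchemaThing.class));
  }
}
```

Bumping `removeBy` is a one-line change, which matches the point made above that the deadline mainly forces someone to look at each feature again. A real check would also need a way to discover annotated classes (for example by scanning the classpath), which this sketch sidesteps by passing them in explicitly.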