> As a general rule, fixes pertaining to new functionality are not a good candidate for a cherry-pick.
I disagree with this statement. RCs are the point at which we expect people to discover and test features; that is the whole point of RCs, otherwise we would release as is. So they are the perfect moment to fix issues, in particular if during the RC tests we discover that new features produce unexpected regressions, inconsistent behavior, badly designed APIs or security issues.

The task of release manager is not easy, and I understand that we should follow the rules to get the release out, but getting a release out quickly is not necessarily the main goal. Quality matters, and the goal of release validation is in part to ensure quality. If this implies cherry-picks and a new vote + RCs, that's a pity, but it is worth it.

Now, talking about this release: I don't know if somebody has mentioned it, but when I looked at the Nexmark dashboards [1] I saw a consistent performance regression in all classic runners starting around 16/06, so it is probably included in this version. I am OOO so I do not have enough free cycles to check this, but if someone has, I think it is worth taking a look. Whether this is important enough to block the release is again a gray area for Beam, but it is still worth tracking, especially following the conversation that Max opened recently [2].

[1] http://104.154.241.245/d/ahuaA_zGz/nexmark?orgId=1&from=now-90d&to=now
[2] https://lists.apache.org/thread.html/r2f6834a64cbc5610663007f5f0ec4d1c6a9726fadf0678d4cc17b018%40%3Cdev.beam.apache.org%3E

On Mon, Jul 20, 2020 at 10:20 PM Kenneth Knowles <[email protected]> wrote:
>
> Agree. Great management of this release discussion.
>
> While I think Robert laid out the reasons for avoiding cherry picks very
> clearly, I just want to emphasize that it is *not* appropriate to treat every
> cherry pick according to risk/reward* which ignores the policy. The reasons
> for following a *policy* of avoiding cherrypicks are more important
> (community > code). Clear published policies:
>
> - set expectations for people developing code so they can know in advance
> whether or not their cherrypick fits the guidelines
> - they also know that other cherrypicks will not delay their release unless
> it meets the guidelines
> - objective guidelines help to eliminate bias, and also communicate that
> lack of bias; even just perception or suspicion of bias harms the community
>
> It is by agreeing on then following policy that we get a predictable and fair
> community process. Any "back to first principles" discussion needs to take
> into account the meta pro/con of having vs not having a policy. Assertions
> about difficulty of rolling a new RC or the risk of a change miss the bigger
> picture.
>
> Valentyn did a great job of being careful - and communicating - about all
> these things, so that's doubly excellent.
>
> One approach that helps to avoid risk in feature launches and cherry picks is
> to have the big announcement correspond with a flag flip, aka graduating to
> no longer be experimental. Ideally the completed code will have been
> available to users for (at least) a release cycle before considering
> graduation and widespread announcement. In this pattern it is also easier to
> weigh the impact of bugfixes for exceptions to the guidelines.
>
> Kenn
>
> *also risk/reward of a cherrypick is mostly uncertain subjective hand waving
> except for showstopper bugs or big stage product announcements
>
> p.s. FWIW setting a wrong environment is a critical correctness bug that I
> agree with Cham's assessment and totally agree with a cherrypick. Even though
> it isn't a regression itself, correct changes elsewhere can cause a
> regression so the user risk could be pretty high.
>
> On Mon, Jul 20, 2020 at 1:41 AM Maximilian Michels <[email protected]> wrote:
>>
>> @Valentyn: Thank you for your transparency in the release process and
>> for considering pending cherry-pick requests. No blockers from my side.
>> >> -Max >> >> On 18.07.20 01:11, Ahmet Altay wrote: >> > Thank you Valentyn. Being a release manager is difficult. It requires >> > balancing between stability, following the process, regressions, >> > timelines. Thank you for following the process, thank you for asking the >> > right questions, thank you for doing the release. >> > >> > >> > On Fri, Jul 17, 2020 at 3:59 PM Robert Bradshaw <[email protected] >> > <mailto:[email protected]>> wrote: >> > >> > Thank you, Valentyn! >> > >> > On Fri, Jul 17, 2020 at 3:25 PM Chamikara Jayalath >> > <[email protected] <mailto:[email protected]>> wrote: >> > > >> > > >> > > >> > > On Fri, Jul 17, 2020 at 3:01 PM Valentyn Tymofieiev >> > <[email protected] <mailto:[email protected]>> wrote: >> > >> >> > >> As a general rule, fixes pertaining to new functionality are not >> > a good candidate for a cherry-pick. >> > >> >> > >> A case for an exception can be made for polishing features >> > related to major wide announcements with a hard deadline, which >> > appears to be the case for xlang on Dataflow. >> > >> >> > >> I will prepare an RC2 with xlang fixes and consider other >> > low-risk additions from issues that were brought to my attention. >> > > >> > > >> > > Thanks Valentyn. >> > > >> > >> >> > >> >> > >> Thanks >> > >> >> > >> >> > >> On Fri, Jul 17, 2020 at 10:36 AM Chamikara Jayalath >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>> >> > >>> >> > >>> >> > >>> On Fri, Jul 17, 2020 at 10:01 AM Robert Bradshaw >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >> > >>>> Taking a step back, the goal of avoiding cherry-picks is to >> > reduce >> > >>>> risk and increase the velocity of our releases, as otherwise the >> > >>>> release manager gets inundated by a never ending list of features >> > >>>> people want to get in that puts the releases further and further >> > >>>> behind (increasing the desire to get features in in a vicious >> > cycle). 
>> > >>>> On the flip side, the reason we have a release process with candidates
>> > >>>> and voting (as opposed to just declaring a commit id every N weeks to
>> > >>>> be "the release") is to give us the flexibility to achieve a level of
>> > >>>> quality and polish that may not ever occur in HEAD itself.
>> > >>>>
>> > >>>> With regards to this specific cross-language fix, the motivation is
>> > >>>> that those working on it at Google want to widely publish this feature
>> > >>>> as newly available on Dataflow. The question to answer here (Cham) is
>> > >>>> whether this bug is debilitating enough that were it not to be in the
>> > >>>> release we would want to hold off advertising this (and related)
>> > >>>> features until the next release. (In my understanding, it would result
>> > >>>> in a poor enough user experience that it is.)
>> > >>>
>> > >>>
>> > >>> Yes, I think we will have to either hold off on widely publishing the feature or list this as a potential issue that will be fixed in the next release for anybody who tries cross-language pipelines and runs into this.
>> > >>> Note that we are getting in a Python Kafka example [1]. So users will potentially try this out anyways.
>> > >>>
>> > >>> [1] https://github.com/apache/beam/pull/12188
>> > >>>
>> > >>>
>> > >>>>
>> > >>>>
>> > >>>> On the other hand, there's the question of the cost of getting this
>> > >>>> fix into the release. The change is simple and well contained, so I
>> > >>>> think the risk is low (and, in particular, the cost to include it here
>> > >>>> is low enough that it's worth the value provided above).
>> > >>>> >> > >>>> Looking at the other proposals, >> > >>>> https://github.com/apache/beam/pull/12196 also seems to meet >> > this bar >> > >>>> (there are possible xlang correctness issues at play here), as >> > does >> > >>>> https://github.com/apache/beam/pull/12175 (mostly due to its >> > >>>> simplicity and the fact that doing it later would be a backwards >> > >>>> compatible change). I'm on the fence about >> > >>>> https://github.com/apache/beam/pull/12171 (if an RC2 is in the >> > works >> > >>>> anyway), and IMHO the others are less compelling as having to >> > be done >> > >>>> now. >> > >>> >> > >>> >> > >>> +1 >> > >>> >> > >>>> >> > >>>> >> > >>>> (On the question of a point release, IMHO anything worth >> > considering >> > >>>> for an x.y.1 release definitely meets the bar for inclusion >> > into an RC >> > >>>> of an ongoing release.) >> > >>>> >> > >>>> - Robert >> > >>>> >> > >>>> >> > >>>> On Thu, Jul 16, 2020 at 8:00 PM Chamikara Jayalath >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> > >> > >>>> > >> > >>>> > >> > >>>> > On Thu, Jul 16, 2020 at 7:46 PM Chamikara Jayalath >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >> >> > >>>> >> >> > >>>> >> >> > >>>> >> On Thu, Jul 16, 2020 at 7:28 PM Valentyn Tymofieiev >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>> >> > >>>> >>> >> > >>>> >>> >> > >>>> >>> On Thu, Jul 16, 2020, 19:07 Chamikara Jayalath >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>>> >> > >>>> >>>> >> > >>>> >>>> >> > >>>> >>>> On Thu, Jul 16, 2020 at 6:16 PM Valentyn Tymofieiev >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>>>> >> > >>>> >>>>> Thanks for the feedback, help with release validation, >> > and for reaching out on dev@ regarding a cherry-pick request. >> > >>>> >>>>> >> > >>>> >>>>> BEAM-10397 pertains to new functionality (xlang support >> > on Dataflow). 
Are there any reasons that this fix cannot wait until >> > 2.24.0 (release cut date 4 weeks from now)? >> > >>>> >>>>> >> > >>>> >>>>> For transparency, I would like to list other cherry-pick >> > requests that I received off-the list (stakeholders bcc'ed): >> > >>>> >>>>> - https://github.com/apache/beam/pull/12175 >> > >>>> >>>>> - https://github.com/apache/beam/pull/12196 >> > >>>> >>>>> - https://github.com/apache/beam/pull/12171 >> > >>>> >>>>> - https://issues.apache.org/jira/browse/BEAM-10492 >> > (recently added) >> > >>>> >>>>> - https://issues.apache.org/jira/browse/BEAM-10385 >> > >>>> >>>>> - https://github.com/apache/beam/pull/12187 (was >> > available before any of RC1 artifacts were created and integrated) >> > >>>> >>>> >> > >>>> >>>> >> > >>>> >>>> My main concern is Python changes in >> > https://github.com/apache/beam/pull/12164. Other changes (at least >> > related to x-lang) can wait. >> > >>>> >>>> >> > >>>> >>>>> >> > >>>> >>>>> >> > >>>> >>>>> My response to such requests is guided by the release >> > guide [1]: >> > >>>> >>>>> >> > >>>> >>>>> - None of the issues were a regression from a previous >> > release. >> > >>>> >>>>> - Most are related to new or recently introduced >> > functionality. >> > >>>> >>>>> - 3 of the requests are related to xlang io, which is >> > very exciting and important functionality, but arguably does not >> > impact a large percentage of [existing] users. >> > >>>> >>>> >> > >>>> >>>> >> > >>>> >>>> Agree that this is not a regression from the previous >> > release but it may result in inconsistent behavior when users >> > execute x-lang pipelines. Actually I think this is a pretty serious >> > issue for portability (we are not setting the environment in >> > WindowingStrategy) but for some reason we are not hitting this in >> > other tests. >> > >>>> >>>> >> > >>>> >>>>> >> > >>>> >>>>> >> > >>>> >>>>> So they do not seem to be release-blocking according to >> > the guide. 
>> > >>>> >>>>> >> > >>>> >>>>> At this point creating a new RC would delay 2.23.0 >> > availability by at least a week. While a new RC will improve the >> > stability of xlang IO, it will also delay the release of features >> > and bug fixes available in 2.23.0. It will also create a precedent >> > of inconsistency with release policy. Should we delay the release if >> > we discover another xlang issue during validation next week? >> > >>>> >>>> >> > >>>> >>>> >> > >>>> >>>> To be honest, I don't think re-validating after the >> > cherry-pick mentioned above will take a week (unless we find other >> > issues). We just need to rebuild and re-validate the Python >> > distribution and may be rebuild Dataflow containers. I'm >> > volunteering to help you with this :) >> > >>>> >>> >> > >>>> >>> >> > >>>> >>> I was taking 72hrs of voting Window into account that must >> > happen outside of the weekend and the fact that I will be OOO for >> > one day. >> > >>>> >> >> > >>>> >> >> > >>>> >> Got it. >> > >>>> >> >> > >>>> >>> >> > >>>> >>> >> > >>>> >>> If the issue you mention seriously impacts (can cause data >> > loss, pipeline failures) all of users on portable stack or other >> > large user base (not just cross-language support in Dataflow (new >> > user-base) ), this is definitely a candidate for an ASAP fix. >> > >>>> >>> >> > >>>> >>> What is your assessment of the size of the user base that >> > is affected by the issue (large, medium, small, does not affect >> > production for any of existing users)? >> > >>>> >> >> > >>>> >> >> > >>>> >> Impact today I think is low but potential for impact in the >> > future is high. For example, if we update Dataflow service or >> > portable runners to require environment in WindowingStrategy, we'll >> > have to either fork for this or require users to upgrade to a Beam >> > version with the fix. >> > >>>> > >> > >>>> > >> > >>>> > Actually, ignore the "portable runners" part. 
Seems like we >> > already set "context.default_environment_id()" in the >> > WindowingStrategy so impact is likely only for Dataflow where we do >> > not set an environment_id in serialized WindowingStrategy that is >> > set in GBK. >> > >>>> > >> > >>>> >> >> > >>>> >> >> > >>>> >> Thanks, >> > >>>> >> Cham >> > >>>> >> >> > >>>> >>> >> > >>>> >>> >> > >>>> >>> Thanks! >> > >>>> >>> >> > >>>> >>>> >> > >>>> >>>>> >> > >>>> >>>>> >> > >>>> >>>>> My preferred course of action is to continue with RC0, >> > since release velocity is important for product health. >> > >>>> >>>>> >> > >>>> >>>>> Given that we are having this conversation, we can >> > revise the cherry-pick policy if we think it does not adequately >> > cover this situation. >> > >>>> >>>> >> > >>>> >>>> >> > >>>> >>>> Agree. We have a very strong policy currently regarding >> > cherry-picks but it's up to the release manager to look into >> > requests on a case-by-case basis. >> > >>>> >>>> >> > >>>> >>>>> >> > >>>> >>>>> >> > >>>> >>>>> We can also propose a patch-version release with urgent >> > cherry-picks (release 2.23.1), or consider a faster release cadence >> > if 6 weeks is too slow. >> > >>>> >>>> >> > >>>> >>>> >> > >>>> >>>> Honestly I don't think this is practical. Making a new >> > patch release, validation, vote etc will take 2 weeks or so. We >> > either should cherry-pick this into current release or wait till the >> > next one. I think patch releases should be reserved for critical >> > updates to LTS releases. >> > >>>> >>>> >> > >>>> >>>> Thanks, >> > >>>> >>>> Cham >> > >>>> >>>> >> > >>>> >>>>> >> > >>>> >>>>> >> > >>>> >>>>> Thanks, >> > >>>> >>>>> Valentyn >> > >>>> >>>>> >> > >>>> >>>>> [1] >> > https://beam.apache.org/contribute/release-guide/#review-cherry-picks >> > >>>> >>>>> >> > >>>> >>>>> >> > >>>> >>>>> >> > >>>> >>>>> On Wed, Jul 15, 2020 at 5:41 PM Chamikara Jayalath >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>>>>> >> > >>>> >>>>>> I agree. 
I think Dataflow x-lang users could run into >> > flaky pipelines due to this. Valentyn, are you OK with creating a >> > new RC that includes the fix (already merged - >> > https://github.com/apache/beam/pull/12164) and preferably >> > https://github.com/apache/beam/pull/12196 ? >> > >>>> >>>>>> >> > >>>> >>>>>> Thanks, >> > >>>> >>>>>> Cham >> > >>>> >>>>>> >> > >>>> >>>>>> On Wed, Jul 15, 2020 at 5:27 PM Heejong Lee >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>>>>>> >> > >>>> >>>>>>> I think we need to cherry-pick >> > https://issues.apache.org/jira/browse/BEAM-10397 which fixes missing >> > environment errors for Dataflow xlang pipelines. Internally, we have >> > a flaky xlang kafkaio test because of missing environment errors and >> > any xlang pipelines using GroupByKey could encounter this. >> > >>>> >>>>>>> >> > >>>> >>>>>>> On Wed, Jul 15, 2020 at 5:08 PM Ahmet Altay >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>>>>>>> >> > >>>> >>>>>>>> >> > >>>> >>>>>>>> >> > >>>> >>>>>>>> On Wed, Jul 15, 2020 at 4:55 PM Robert Bradshaw >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>>>>>>>> >> > >>>> >>>>>>>>> All the artifacts, signatures, and hashes look good. >> > >>>> >>>>>>>>> >> > >>>> >>>>>>>>> I would like to understand the severity of >> > >>>> >>>>>>>>> https://issues.apache.org/jira/browse/BEAM-10397 >> > before giving my >> > >>>> >>>>>>>>> vote. >> > >>>> >>>>>>>> >> > >>>> >>>>>>>> >> > >>>> >>>>>>>> +Heejong Lee to comment on this. >> > >>>> >>>>>>>> >> > >>>> >>>>>>>>> >> > >>>> >>>>>>>>> >> > >>>> >>>>>>>>> On Wed, Jul 15, 2020 at 10:51 AM Pablo Estrada >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>>>>>>>> > >> > >>>> >>>>>>>>> > +1 >> > >>>> >>>>>>>>> > I was able to run the python 3.8 quickstart from >> > wheels on DirectRunner. >> > >>>> >>>>>>>>> > I verified hashes for Python files. >> > >>>> >>>>>>>>> > -P. 
>> > >>>> >>>>>>>>> > >> > >>>> >>>>>>>>> > On Fri, Jul 10, 2020 at 4:34 PM Ahmet Altay >> > <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>>>>>>>> >> >> > >>>> >>>>>>>>> >> I validated the python 3 quickstarts. I had >> > issues with running with python 3.8 wheel files, but did not have >> > issues with source distributions, or other python wheel files. I >> > have not tested python 2 quickstarts. >> > >>>> >>>>>>>> >> > >>>> >>>>>>>> >> > >>>> >>>>>>>> Did someone validate python 3.8 wheels on Dataflow? I >> > was not able to run that. >> > >>>> >>>>>>>> >> > >>>> >>>>>>>>> >> > >>>> >>>>>>>>> >> >> > >>>> >>>>>>>>> >> On Thu, Jul 9, 2020 at 10:53 PM Valentyn >> > Tymofieiev <[email protected] <mailto:[email protected]>> wrote: >> > >>>> >>>>>>>>> >>> >> > >>>> >>>>>>>>> >>> Hi everyone, >> > >>>> >>>>>>>>> >>> >> > >>>> >>>>>>>>> >>> Please review and vote on the release candidate >> > #1 for the version 2.23.0, as follows: >> > >>>> >>>>>>>>> >>> [ ] +1, Approve the release >> > >>>> >>>>>>>>> >>> [ ] -1, Do not approve the release (please >> > provide specific comments) >> > >>>> >>>>>>>>> >>> >> > >>>> >>>>>>>>> >>> >> > >>>> >>>>>>>>> >>> The complete staging area is available for your >> > review, which includes: >> > >>>> >>>>>>>>> >>> * JIRA release notes [1], >> > >>>> >>>>>>>>> >>> * the official Apache source release to be >> > deployed to dist.apache.org <http://dist.apache.org> [2], which is >> > signed with the key with fingerprint 1DF50603225D29A4 [3], >> > >>>> >>>>>>>>> >>> * all artifacts to be deployed to the Maven >> > Central Repository [4], >> > >>>> >>>>>>>>> >>> * source code tag "v2.23.0-RС1" [5], >> > >>>> >>>>>>>>> >>> * website pull request listing the release [6], >> > publishing the API reference manual [7], and the blog post [8]. >> > >>>> >>>>>>>>> >>> * Java artifacts were built with Maven 3.6.0 and >> > Oracle JDK 1.8.0_201-b09 . 
>> > >>>> >>>>>>>>> >>> * Python artifacts are deployed along with the >> > source release to the dist.apache.org <http://dist.apache.org> [2]. >> > >>>> >>>>>>>>> >>> * Validation sheet with a tab for 2.23.0 release >> > to help with validation [9]. >> > >>>> >>>>>>>>> >>> * Docker images published to Docker Hub [10]. >> > >>>> >>>>>>>>> >>> >> > >>>> >>>>>>>>> >>> The vote will be open for at least 72 hours. It >> > is adopted by majority approval, with at least 3 PMC affirmative votes. >> > >>>> >>>>>>>>> >>> >> > >>>> >>>>>>>>> >>> Thanks, >> > >>>> >>>>>>>>> >>> Release Manager >> > >>>> >>>>>>>>> >>> >> > >>>> >>>>>>>>> >>> [1] >> > >> > https://jira.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12347145 >> > >>>> >>>>>>>>> >>> [2] >> > https://dist.apache.org/repos/dist/dev/beam/2.23.0/ >> > >>>> >>>>>>>>> >>> [3] >> > https://dist.apache.org/repos/dist/release/beam/KEYS >> > >>>> >>>>>>>>> >>> [4] >> > https://repository.apache.org/content/repositories/orgapachebeam-1105/ >> > >>>> >>>>>>>>> >>> [5] https://github.com/apache/beam/tree/v2.23.0-RC1 >> > >>>> >>>>>>>>> >>> [6] https://github.com/apache/beam/pull/12212 >> > >>>> >>>>>>>>> >>> [7] https://github.com/apache/beam-site/pull/605 >> > >>>> >>>>>>>>> >>> [8] https://github.com/apache/beam/pull/12213 >> > >>>> >>>>>>>>> >>> [9] >> > >> > https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=596347973 >> > >>>> >>>>>>>>> >>> [10] >> > https://hub.docker.com/search?q=apache%2Fbeam&type=image >> >
