Re: Apache Beam 2.4.0 release process retrospective and automation possibilities

Robert Bradshaw Fri, 23 Mar 2018 01:53:14 -0700

To put this in context, this was a brain dump of some of the things I
encountered while doing the release. Were I to do a release again, it would
be a lot easier (though still not ideal).

At the high level, rather than focusing on steps, I think it's more
interesting to focus on why we need a human to do the release. IMHO, the
role of a human is

1) Choose a commit. Any commit in master should do, if we had proper test
and code hygiene.
2) Sign the release artifacts. (Frankly, I would be happy with a robot
signing everything but the tag in github, if the keys could be properly
maintained.)
3) Manage the email thread.

To this end, I would like a process where one would propose a release
candidate via a lightweight, fast tool, and jenkins (or some other system)
would recognize the release branch and test, generate artifacts, push
artifacts, and test against those artifacts (including, ideally, nexus and
svn staging). One should be able to easily run this locally, but that
shouldn't be needed. If a human needs to sign, there could be a script for
one to download artifacts, sign them, and push the signatures. On success,
an email body (with all the links and details) would be created, to be sent
out by a human.
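
A minimal sketch of what the "sign them" part of such a script could look like, in Python. The directory layout, the suffix list, and the function names are assumptions made up for illustration, not anything that exists in the Beam repo; the only external tool used is gpg with its standard --armor, --detach-sign and --output options. Downloading the artifacts and pushing the signatures are left out.

```python
import subprocess
from pathlib import Path


def signing_commands(artifact_dir, suffixes=(".zip", ".jar", ".pom")):
    """Return one gpg command per artifact that has no .asc signature yet."""
    commands = []
    for path in sorted(Path(artifact_dir).iterdir()):
        if path.suffix not in suffixes:
            continue  # not an artifact we sign (hypothetical suffix list)
        signature = path.with_name(path.name + ".asc")
        if signature.exists():
            continue  # already signed, so rerunning the script is harmless
        commands.append(
            ["gpg", "--armor", "--detach-sign",
             "--output", str(signature), str(path)]
        )
    return commands


def sign_all(artifact_dir, dry_run=True):
    """Print the commands by default; only invoke gpg when dry_run is False."""
    for command in signing_commands(artifact_dir):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

Defaulting to a dry run means the human can see exactly what will be signed before any key is touched, and skipping files that already have a signature keeps the step restartable.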

More responses inline below (and thanks for the feedback!).

On Thu, Mar 22, 2018 at 11:10 PM Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:

> Hi
>
> On 23 Mar 2018 04:29, "Alan Myrvold" <amyrv...@google.com> wrote:
>
> Robert explained his experience with the release process
> <https://beam.apache.org/contribute/release-guide/> as the release
> engineer for 2.4.0, and we discussed the prototype shell script for
> checking release progress in pull/4896
> <https://github.com/apache/beam/pull/4896>.
>
> I'd like to help automate the release process, initially just checking
> that all steps look ok, then automating all feasible steps, with the goal
> of reducing the effort of the release engineer per release to less than 1
> hour for creating the first RC.
>
> Overall, it is a large, scattered process. For someone who has done this
> many times (like jb@ in the previous release), it is likely easy. Robert
> was familiar with pypi, making that part easy for him. He was not as
> familiar with the java release artifacts or nexus, making that more of a
> challenge.
>
>    - Several steps are not reversible, so it isn't a restartable process
>    if there are errors.
>
> Can we ensure they are done last? Maven introduced the deployAtEnd feature,
> so I guess it can be a model to use here.
>

In retrospect, aside from pushing stuff, things weren't strictly
irreversible. I got to be good friends with git clean, git checkout, and
git reset. That last one, however, always feels wrong to use (and is
error-prone).

I also don't think I'll ever get used to the build system (mvn or gradle)
editing code and creating commits, but perhaps that's needed with the way
-SNAPSHOTS are special (though having to edit every file seems overkill).

>
>    - Several steps have high latency; it may take 30min of work before
>    prompting for a password.
>
> What prevents putting the password in settings.xml or using an agent, so as
> not to be prompted?
>

Foreknowledge :). I did end up using an agent. (Personally, I have some
qualms about putting my gpg password in a plaintext xml file.) There were
also gitbox prompts in some of the steps, and of course svn commit (though
the svn steps weren't high latency). I can't remember if there were others.

>
>    - Problems with both his laptop and desktop; GCS wasn't working well
>    on the laptop preventing running tests. Maven extremely slow on his
>    desktop, but he discovered a workaround in his configuration. Would have
>    been nice to use jenkins for most steps instead of relying on
>    laptop/desktop configurations.
>

As JB mentioned, these were environmental issues. But it wasn't clear at
the time (being new to the release process), and it could have been
alleviated had I not had to do so many manual steps (including running the
tests). I happened to get particularly unlucky with the timing of one of
them too.

>
>    - Many of the steps ran the same tests over and over. Sometimes tests
>    were flaky, so needed to restart a long process due to a test that had
>    passed earlier now flaking.
>
> I wonder if remote tests shouldn't be mocked *during* a release to avoid
> that bad-luck effect.
>

We should be able to release from a known good commit, as verified by
jenkins, and never run tests again.

>
>    - Robert was new to Nexus, so setting up permissions and navigating
>    the UI was confusing.
>
> When that happens, don't hesitate to ping here.
>
>
>    - Needed to rebuild the google cloud dataflow containers to get
>    dataflow working with the RC, and that ended up being a painful process.
>    The github sdk/python/container is part of the portability effort and
>    should help eliminate googlers needing to do steps like this with each
>    release because that container can be built externally.
>
> Is there a way to see this process or is it "closed"?
>

No, there's no way to see this process. Nothing secret here, but the
spanning of multiple revision control systems, multiple languages,
branching (on both sides) and dependencies (upstream and downstream) and
some archaic setups make this less than ideal. The move to portability
should be a huge win here in decoupling what's built internally vs.
externally (containers).

>
>
>    - Automated release notes were not seen as valuable due to limitations
>    in any automated documentation.
>
>
> If needed, I have a script to do it from jira; I filter issues by the label
> "changelog". With that, the RM must review tickets before the release.
>
>
>    - Missed a step of changing the java/python version numbers but was
>    able to fix that.
>
> Is it doable through maven/gradle filtering?
>

Not sure. Ideally the version numbers could be pulled out of the pom/a
central location.
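
A rough sketch of what "pulled out of the pom" could mean, in Python. It reads the version from pom text with the standard library XML parser and maps it to a Python-style version. The function names and the ".dev" mapping for -SNAPSHOT versions are assumptions for illustration; the only fixed fact used is the standard Maven POM XML namespace.

```python
import xml.etree.ElementTree as ET

# Standard namespace used by Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def pom_version(pom_xml):
    """Return the project version from pom text, else the parent's version."""
    root = ET.fromstring(pom_xml)
    for xpath in ("m:version", "m:parent/m:version"):
        node = root.find(xpath, POM_NS)
        if node is not None and node.text:
            return node.text.strip()
    raise ValueError("no <version> element found in pom")


def python_version(maven_version):
    """Map a Maven version to a Python-style one (assumed convention)."""
    suffix = "-SNAPSHOT"
    if maven_version.endswith(suffix):
        return maven_version[: -len(suffix)] + ".dev"
    return maven_version
```

With something like this, the Python side could derive its version at build time instead of carrying a second hand-edited copy.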

>
>    - Some copy/paste errors when creating the voting emails.
>
>
> At tomee we had a cli to do it if interested.
>

+1
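
A minimal sketch of such a tool, in Python, to show how the copy/paste errors could be avoided: every link in the email is derived from a few inputs. The email wording, the tag format, and the function name are hypothetical placeholders (the real wording lives in the release guide); the staging URL shape follows the orgapachebeam-1031 URL quoted elsewhere in this thread.

```python
from string import Template

# Hypothetical wording; only the placeholders matter for this sketch.
VOTE_EMAIL = Template("""\
Subject: [VOTE] Release $version, release candidate #$rc

Please review and vote on release candidate #$rc for version $version.

The staging area contains:
* the source release and signatures: $dist_url
* all artifacts to be deployed to Maven Central: $staging_url
* the source code tag "$tag"
""")


def render_vote_email(version, rc, staging_repo_id, dist_url):
    """Fill in the template so that no link or number is typed by hand."""
    return VOTE_EMAIL.substitute(
        version=version,
        rc=rc,
        tag=f"v{version}-RC{rc}",  # assumed tag naming scheme
        dist_url=dist_url,
        staging_url=(
            "https://repository.apache.org/content/repositories/"
            f"{staging_repo_id}/"
        ),
    )


if __name__ == "__main__":
    print(render_vote_email("2.4.0", 1, "orgapachebeam-1031",
                            "<dist staging URL goes here>"))
```

Template.substitute raises KeyError if a placeholder is left unfilled, so a missing value fails loudly instead of ending up in the email.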

>
>    - Many steps are not possible for non-committers.
>
> Sadly intended/expected, but you can help with jira review and snapshot
> validation before the release.
>
>
>    - The prototype shell script was seen as helpful, especially since it
>    can be restarted. Some concerns over the maintainability of such a large
>    shell script.
>
> Move to groovy? Or is it just a size issue?
>

In my (first and second hand) experience, bash scripts become difficult to
maintain (readability and correctness issues) once they reach a certain
size and complexity (not just size, but control structure, etc.). There are
several alternatives, groovy and Python among them.
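
As an illustration of the restartable part in Python, here is a small sketch of a step runner that records finished steps in a state file and skips them on the next run. The step names, the state file format, and the function name are invented for the example; this is not how the prototype shell script works.

```python
import json
from pathlib import Path


def run_steps(steps, state_file):
    """Run (name, callable) steps in order, skipping those already recorded."""
    state_path = Path(state_file)
    if state_path.exists():
        done = set(json.loads(state_path.read_text()))
    else:
        done = set()
    for name, action in steps:
        if name in done:
            print(f"skip  {name}")
            continue
        print(f"run   {name}")
        action()  # an exception stops the run; finished steps stay recorded
        done.add(name)
        state_path.write_text(json.dumps(sorted(done)))
    return done
```

A failing step (say, a flaky test) leaves the earlier steps marked as done, so rerunning the same command picks up at the step that failed rather than starting over.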

>
> The steps that should not be automated, and need human involvement:
>
>    1. All emails (propose release, ask for votes)
>    2. Picking the commit to start the release
>    3. Signing artifacts
>
> Most everything else should be possible to automate, although
> non-committers do not have access to logging into jira, nexus, or the
> jenkins ui, making some of this tricky to automate for non-contributors.
> Also not clear to us how nexus picks the sequential artifact suffix (1031)?
>
>
> It is a sequence, like a db sequence; I don't recall if it is per user or
> per repo (repo from memory, but not 100% sure). If the goal is to automate
> the close and the mailing, maybe check the sonatype plugin, which replaces
> the deploy one (make sure to deactivate auto release), or just the nexus
> api, which returns this value. A maven (gradle) extension should be
> writable too.
>
>
> Next steps for me are to enhance the release-checking script, automate
> feasible actions, and pair with the next release engineer to make this
> smoother, especially if they are at google, but even if they are not.
>
>
> +1, anything should be bound to release:perform except the vote process
> and the postvote tasks which can be automated (dist update, site update,
> release staging, ....)
>
>
> Alan
>
>
>

On Thu, Mar 22, 2018 at 11:53 PM Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

>>   * Robert was new to Nexus, so setting up permissions and navigating the
>>     UI was confusing.
>
> Not a release concern IMHO; Nexus is straightforward for a release with a
> staging repo. I think it's well explained in the release guide.

Nexus felt a bit clunky, but was usable. I think it would make a lot more
sense to someone well versed in maven packaging structure and terminology.
E.g. I still don't really know what the difference is between
https://repository.apache.org/content/repositories/staging/org/apache/beam/
and https://repository.apache.org/content/repositories/orgapachebeam-1031/

(It also didn't help that this is my first release with the exception of a
Python-only bugfix release last year, and so I skipped the "one time setup"
steps at first this time thinking I'd done them, but last time I didn't
bother with nexus because I didn't need it. Again, a one-off, but someone
else might have similar one-offs.)

>>   * Many steps are not possible for non-committers.
>
> And it's normal per Apache rule.

Yes, I think it's fine to require a committer to be in the loop, but maybe
not everywhere.

>>   * Missed a step of changing the java/python version numbers but was able
>>     to fix that.

> Changing version numbers where? The release plugin (in the prepare goal)
> already changes the versions in the POMs (like a mvn versions:set). I guess
> you are talking about some other files? Maybe those files could use the
> project.version from the pom?

There are version numbers in the code as well. Yes, there should be one
source of truth.
