Robert explained his experience with the release process
<https://beam.apache.org/contribute/release-guide/> as the release engineer
for 2.4.0, and we discussed the prototype shell script for checking release
progress in pull/4896 <https://github.com/apache/beam/pull/4896>.

I'd like to help automate the release process, initially just checking that
all steps look ok, then automating all feasible steps, with the goal of
reducing the release engineer's effort to less than 1 hour per release
for creating the first RC.

Overall, it is a large, scattered process. For someone who has done this
many times (like jb@ in the previous release), it is likely easy. Robert
was familiar with PyPI, making that part easy for him. He was not as
familiar with the Java release artifacts or Nexus, making that more of a
challenge.

   - Several steps are not reversible, so it isn't a restartable process if
   there are errors.
   - Several steps have high latency; it may take 30 minutes of work before
   a step prompts for a password.
   - He had problems with both his laptop and desktop. GCS wasn't working
   well on the laptop, which prevented running tests. Maven was extremely
   slow on his desktop, but he discovered a workaround in his configuration.
   It would have been nice to use Jenkins for most steps instead of relying
   on laptop/desktop configurations.
   - Many of the steps ran the same tests over and over. Tests were
   sometimes flaky, so he needed to restart a long process because a test
   that had passed earlier was now flaking.
   - Robert was new to Nexus, so setting up permissions and navigating the
   UI was confusing.
   - He needed to rebuild the Google Cloud Dataflow containers to get
   Dataflow working with the RC, and that ended up being a painful process.
   The GitHub sdk/python/container is part of the portability effort and
   should help eliminate the need for Googlers to do steps like this with
   each release, because that container can be built externally.
   - Automated release notes were not seen as valuable due to limitations
   in any automated documentation.
   - He missed the step of changing the Java/Python version numbers but was
   able to fix that.
   - Some copy/paste errors when creating the voting emails.
   - Many steps are not possible for non-committers.
   - The prototype shell script was seen as helpful, especially since it
   can be restarted. Some concerns over the maintainability of such a large
   shell script.
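
To make the "restartable" point concrete, here is a minimal sketch of a
checkpointed step runner. It is not the prototype script from pull/4896;
the function name, the state file, and the step names in the usage note
are all hypothetical. The idea is only that each completed step is
recorded, so a rerun after a failure skips what already succeeded.

```python
import json
from pathlib import Path


def run_steps(steps, state_file):
    """Run (name, func) pairs in order, skipping names already recorded as done.

    Hypothetical helper for illustration; state_file is a JSON list of
    completed step names.
    """
    state_path = Path(state_file)
    done = json.loads(state_path.read_text()) if state_path.exists() else []
    for name, func in steps:
        if name in done:
            continue  # completed in an earlier run, so do not repeat it
        func()  # an exception here aborts the run; earlier checkpoints remain
        done.append(name)
        state_path.write_text(json.dumps(done))  # checkpoint after each step
    return done
```

For example, run_steps([("run-tests", run_tests), ("stage-artifacts",
stage)], "release_state.json") could be rerun after a flaky failure in the
second step without repeating the first. This only helps for steps that are
safe to skip once done; the non-reversible steps noted above would still
need care.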

The steps that should not be automated, and need human involvement:

   1. All emails (propose release, ask for votes)
   2. Picking the commit to start the release
   3. Signing artifacts
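
One way a script could respect this split is to mark human-only steps
explicitly and pause at them. The sketch below is an assumption about how
that might look, not an existing tool; the Step class, run_plan, and the
"done" convention are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    name: str
    # None marks a step that a human must perform (hypothetical convention).
    action: Optional[Callable[[], None]] = None


def run_plan(steps, confirm=input):
    """Run automated steps; pause for confirmation at human-only ones."""
    log = []
    for step in steps:
        if step.action is None:
            answer = confirm(f"Manual step: {step.name}. Type 'done' when finished: ")
            if answer.strip().lower() != "done":
                log.append(("stopped", step.name))
                break  # do not continue past an unconfirmed manual step
            log.append(("manual", step.name))
        else:
            step.action()
            log.append(("automated", step.name))
    return log
```

A plan could then list "Pick the commit to start the release" and "Sign
artifacts" with no action, interleaved with automated checks, so the script
never sends email, picks a commit, or signs anything on its own.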

Almost everything else should be possible to automate, although
non-committers cannot log in to Jira, Nexus, or the Jenkins UI, making
some of this tricky to automate for non-committers.
It is also not clear to us how Nexus picks the sequential artifact suffix (1031).

Next steps for me are to enhance the release-checking script, automate
feasible actions, and pair with the next release engineer to make this
smoother, especially if they are at Google, but even if they are not.

Alan
