Robert explained his experience with the release process <https://beam.apache.org/contribute/release-guide/> as the release engineer for 2.4.0, and we discussed the prototype shell script for checking release progress in pull/4896 <https://github.com/apache/beam/pull/4896>.

I'd like to help automate the release process, initially just checking that all steps look ok, then automating all feasible steps, with the goal of reducing the effort of the release engineer per release to less than 1 hour for creating the first RC.

Overall, it is a large, scattered process. For someone who has done this many times (like jb@ in the previous release), it is likely easy. Robert was familiar with PyPI, making that part easy for him. He was not as familiar with the Java release artifacts or Nexus, making that more of a challenge.

- Several steps are not reversible, so it isn't a restartable process if there are errors.
- Several steps have high latency; it may take 30 min of work before prompting for a password.
- Problems with both his laptop and desktop: GCS wasn't working well on the laptop, preventing him from running tests. Maven was extremely slow on his desktop, but he discovered a workaround in his configuration. It would have been nice to use Jenkins for most steps instead of relying on laptop/desktop configurations.
- Many of the steps ran the same tests over and over. Sometimes tests were flaky, so he needed to restart a long process because a test that had passed earlier was now flaking.
- Robert was new to Nexus, so setting up permissions and navigating the UI was confusing.
- Needed to rebuild the Google Cloud Dataflow containers to get Dataflow working with the RC, and that ended up being a painful process. The GitHub sdk/python/container is part of the portability effort and should help eliminate Googlers needing to do steps like this with each release, because that container can be built externally.
- Automated release notes were not seen as valuable due to limitations in any automated documentation.
- Missed a step of changing the Java/Python version numbers but was able to fix that.
- Some copy/paste errors when creating the voting emails.
- Many steps are not possible for non-committers.
- The prototype shell script was seen as helpful, especially since it can be restarted. There were some concerns over the maintainability of such a large shell script.

The steps that should not be automated, and need human involvement:

1. All emails (propose release, ask for votes)
2. Picking the commit to start the release
3. Signing artifacts

Most everything else should be possible to automate, although non-committers do not have access to logging into JIRA, Nexus, or the Jenkins UI, making some of this tricky to automate for non-contributors. It is also not clear to us how Nexus picks the sequential artifact suffix (1031).

Next steps for me are to enhance the release-checking script, automate feasible actions, and pair with the next release engineer to make this smoother, especially if they are at Google, but even if they are not.

Alan
