rmetzger opened a new pull request #11222: [FLINK-15834] Set up nightly builds in Azure & various CI improvements
URL: https://github.com/apache/flink/pull/11222

## What is the purpose of the change

This change sets up a nightly end-to-end test run that covers special scenarios such as Java 11, Hadoop 2.8, or Scala 2.12.

## Brief change log

- Change `build-apache-repo.yaml` to trigger the tests nightly (I left out release branches for now, as only master has the Azure files present).
- Change `job-templates.yml` to build the end-to-end tests with every commit (PR and push builds). This is an experiment, as we might not have enough resources (but I believe we do). I had two options for this: either transfer the build artifact from the compile phase into the e2e test job, or build there from scratch. I decided on the latter, as the build always takes 20 minutes. So in case the test machines are busy but the Azure-provided machines are available, we'll start compiling right away. The precommit tests are executed in this stage, before the end-to-end tests.

This change also fixes many issues with the end-to-end tests:

**Fix Kubernetes E2E tests**
Problem: Low disk space was causing K8s to mark the kubelet as "full disk", so Flink was not scheduled there.
Problem: Low disk space let the kubelet delete unused Docker images, including images generated for the test.

**Fix Kerberized YARN e2e test**
The problem was that YARN was decommissioning NodeManagers because of low disk space.

**Fix queryable state e2e test**
Problem: The logging pattern recently changed, which is why the extraction of the port / IP failed.

## Verifying this change

This nightly build failed only because of known issues (which also caused the Travis nightly test to fail).
I believe the structure / environment of the individual nightly tests is fine: https://dev.azure.com/rmetzger/Flink/_build/results?buildId=5590&view=results

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
  - The S3 file system connector: (yes / **no** / don't know)
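
The queryable state fix in the change log concerns extracting a port and IP from a log line after the logging pattern changed. As a rough illustration of why such extraction breaks and how it can be made less sensitive to the line prefix, here is a minimal Python sketch. It is not the script changed in this pull request: the function name `extract_endpoint`, the regular expression, and both sample log lines are hypothetical, since the actual Flink log format is not shown here.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: matches the first "<ipv4>:<port>" found anywhere in a line.
# The real Flink log format is not shown in this PR, so this is an assumption.
ENDPOINT_RE = re.compile(r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d{1,5})")


def extract_endpoint(log_line: str) -> Optional[Tuple[str, int]]:
    """Return (ip, port) for the first IPv4 endpoint in the line, or None."""
    match = ENDPOINT_RE.search(log_line)
    if match is None:
        return None
    return match.group("ip"), int(match.group("port"))


# Made-up log lines for illustration: the same message with two different prefixes.
old_style = "INFO  Server started @ /127.0.0.1:9069"
new_style = "2020-02-26 10:00:00,000 INFO  [main] Server started @ /127.0.0.1:9069"

print(extract_endpoint(old_style))
print(extract_endpoint(new_style))
```

Searching for the endpoint itself, rather than splitting the line on fixed column positions, means a change to the timestamp or thread prefix does not affect the result. A line without an endpoint yields `None`, so the caller can fail with a clear message instead of passing an empty value along.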

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
