rmetzger opened a new pull request #11222: [FLINK-15834] Set up nightly builds 
in Azure & various CI improvements
URL: https://github.com/apache/flink/pull/11222
 
 
   ## What is the purpose of the change
   
   This change sets up a nightly end-to-end test run that covers special 
scenarios such as Java 11, Hadoop 2.8 or Scala 2.12.
   
   
   ## Brief change log
   
   - Change `build-apache-repo.yaml` to trigger the tests nightly (I left 
out release branches for now, as only master has the Azure files present).
   - Change `job-templates.yml` to build the end-to-end tests with every 
commit (PR and push builds). This is an experiment, as we might not have enough 
resources (but I believe we do).
   I had two options for this: either transfer the build artifact from the 
compile phase into the e2e test job, or build from scratch there. I chose 
the latter: the build always takes 20 minutes, so if the test machines 
are busy but the Azure-provided machines are available, we can start compiling 
right away.
   The precommit tests are executed in this stage, before the end-to-end tests.
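   
   For illustration, a nightly trigger in an Azure Pipelines YAML file is 
typically expressed with a `schedules` block. The following is only a minimal 
sketch of that mechanism, not the actual content of `build-apache-repo.yaml`; 
the cron expression and display name are assumptions chosen for the example.
   
   ```yaml
   # Hypothetical sketch: run the pipeline once per night on master only.
   schedules:
   - cron: "0 0 * * *"          # assumed time: every day at 00:00 UTC
     displayName: Nightly build # illustrative name
     branches:
       include:
       - master                 # release branches are left out for now
     always: true               # run even if there were no new commits
   ```
   
   Setting `always: true` makes the scheduled run fire even when master has 
not changed since the last run, which is usually what a nightly test needs.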
   
   This change also fixes many issues with the end to end tests:
    
   **Fix Kubernetes E2E tests**
   
   Problem: Low disk space was causing K8s to mark the kubelet as "full disk", 
so Flink was not scheduled there.
   Problem: Low disk space caused the kubelet to delete unused Docker images, 
including images generated for the test.
   
   **Fix Kerberized YARN e2e test**
   
   The problem was that YARN was decommissioning NodeManagers because of low 
disk space.
   
   **Fix queryable state e2e test**
   
   Problem: The logging pattern recently changed, which is why the extraction 
of the port / IP failed.
   
   
   ## Verifying this change
   
   This nightly build failed only because of known issues (which also caused 
the Travis nightly test to fail). I believe the structure / environment of the 
individual nightly tests is fine: 
https://dev.azure.com/rmetzger/Flink/_build/results?buildId=5590&view=results
   
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** / 
don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
