zentol opened a new pull request #6642: [FLINK-8819][travis] Rework travis 
script to use stages
URL: https://github.com/apache/flink/pull/6642
 
 
   ## What is the purpose of the change
   
   This PR reworks the travis scripts to use stages. Stages allow jobs to be 
organized in sequential steps, in contrast to the current approach of all jobs 
running in parallel. This allows jobs to depend on each other, with the obvious 
use-case of separating code compilation and test execution.
   A subsequent stage is only executed if the previous stage has completed 
successfully, i.e. if all builds in that stage have completed successfully. In 
other words, if checkstyle fails, no tests are executed, so be mindful of that.
   
   The real benefit is that we no longer compile (parts of) Flink in 
each profile, and instead move part of the compilation overhead into a separate 
profile. The total runtime does not decrease, because of the added overhead 
(upload/download of the cache), but the individual builds are faster and more 
manageable in the long term.
   
   An example build can be seen here: 
https://travis-ci.org/zentol/flink/builds/422925766
   
   ## High-level overview
   
   The new scripts define 3 stages: Compile, Test and Cleanup.
   
   In the compile stage we compile Flink and run QA checks like checkstyle. The 
compiled Flink project is placed into the travis cache to make it accessible to 
subsequent builds.
   
   The test stage consists of 5 jobs based on our existing test splitting 
(core, libs, connectors, tests, misc). These builds retrieve the compiled Flink 
version from the cache, install it into the local repository and subsequently 
run the tests.
   
   The cleanup job deletes the compiled Flink artifact from the cache. This 
step isn't strictly necessary, but still nice to have.
   
   Some additional small refactorings have been made to separate 
`travis_mvn_watchdog.sh` into individual parts, which we can build on in the 
future.
   
   ## Low-level details
   
   ### Caching
   
   The downside of stages is that there is no easy-to-use way to pass on build 
artifacts. The caching approach _works_ but has the caveat that builds have to 
share the same cache. The travis cache is only shared between builds if the 
build configurations are identical; most notably they can't call different 
scripts nor have different environment variables.
   
   As a workaround we map the `TRAVIS_JOB_NUMBER` to a specific stage. (If you 
look at the build linked in the PR, `4583.1` would be the value I'm talking 
about). The order of jobs is deterministic, so for example we always know that 
`1-2` belong to the compile stage, with `2` always being configured for the 
legacy codebase.
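
To illustrate the mapping described above, here is a minimal sketch, not the actual `travis_controller` code. The function name `stage_for_job` and its parameters are hypothetical. Only the fact that jobs `1` and `2` belong to the compile stage comes from this PR; the number of test jobs is passed in as an argument because it is an assumption here.

```shell
#!/usr/bin/env bash

# Hypothetical helper: map a TRAVIS_JOB_NUMBER such as "4583.1" to a stage name.
#   $1 = job number (build number, a dot, then the job index)
#   $2 = number of compile jobs (2 according to the PR description)
#   $3 = number of test jobs (assumption, not stated per build in the PR)
stage_for_job() {
    local job_number="$1"
    local compile_jobs="$2"
    local test_jobs="$3"

    # Strip everything up to and including the last dot: "4583.1" -> "1".
    local index="${job_number##*.}"

    if [ "$index" -le "$compile_jobs" ]; then
        echo "compile"
    elif [ "$index" -le $((compile_jobs + test_jobs)) ]; then
        echo "test"
    else
        echo "cleanup"
    fi
}
```

Because the job order is deterministic, a lookup like `stage_for_job "$TRAVIS_JOB_NUMBER" 2 10` lets every job run the same script with the same environment, which is what keeps the cache shared.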
   
   ### travis_controller
   All stage-related logic is handled by the `travis_controller` script.
   In short:
   * it determines where we are in the build process based on 
`TRAVIS_JOB_NUMBER`
   * if in compile step
     * remove existing cached flink versions (fail-safe cleanup to prevent 
cache from growing larger over time)
     * compile Flink and do QA checks (shading, dependency convergence, 
checkstyle etc.)
     * copy flink to cache location
     * drop unnecessary files (like original jars) from compiled version
   * if in test step
     * fetch flink from cache
     * update all timestamps to prevent compiler plugins from recompiling 
classes
     * execute `travis_mvn_watchdog.sh`
   * if in cleanup step
     * delete the compiled Flink artifact from the cache
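
Two of the steps above can be sketched as small shell functions. This is an illustration under assumptions, not the code in this PR: the function names are hypothetical, and the `original-*.jar` pattern assumes the "original jars" are the unshaded jars that the maven-shade-plugin leaves next to the shaded ones.

```shell
#!/usr/bin/env bash

# Hypothetical helper for the compile step: drop jars that the test stage
# does not need. The name pattern is an assumption (see the note above).
drop_original_jars() {
    local dir="$1"
    find "$dir" -type f -name 'original-*.jar' -delete
}

# Hypothetical helper for the test step: set the modification time of every
# file fetched from the cache to "now", so that compiler plugins do not
# treat the compiled classes as stale and recompile them.
refresh_timestamps() {
    local dir="$1"
    find "$dir" -type f -exec touch {} +
}
```

Touching all files, rather than only class files, keeps sources and compiled output consistently "new", so no modification-time comparison between the two can trigger a rebuild.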
   
   ### travis_mvn_watchdog
   
   Despite the above changes `travis_mvn_watchdog.sh` works pretty much like it 
did before. It first `install`s Flink (except now without `clean` as this would 
remove already compiled classes) and then runs `mvn verify`.
   This has the downside that we still package jars twice, which actually takes 
a while. We could skip this in theory by directly invoking the `surefire` 
plugin, but various issues in our build/tests prevent this from working at the 
moment, and I don't want to delay this change further.
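
The two-step flow described above can be sketched as follows. This is a simplified illustration, not the contents of `travis_mvn_watchdog.sh`: the function name, the overridable `MVN` variable, and the `-DskipTests` flag on the install step are assumptions, and the real script passes additional options.

```shell
#!/usr/bin/env bash

# Maven executable; overridable so the flow can be dry-run (e.g. MVN=echo).
MVN="${MVN:-mvn}"

# Hypothetical sketch of the watchdog's two Maven invocations.
run_install_then_verify() {
    # No "clean" here: it would delete the classes restored from the cache.
    # Skipping tests during install is an assumption of this sketch.
    $MVN install -DskipTests || return 1

    # Run the tests; this packages the jars a second time.
    $MVN verify
}
```

Running it with `MVN=echo` prints the two command lines instead of executing Maven, which is a cheap way to check the order of the steps.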
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
