Hi Vino,

These are really good initiatives.

Regarding the warnings and error logs: some of them come from negative test cases, where parts of the system are expected to throw exceptions. There are also a bunch of warnings/errors that come with bringing up and shutting down the Hive/Spark/HBase services for tests. Maybe we can try controlling log levels for such specific tests. For the others, we need to look at the warnings/errors more closely to fix them.

Historically, we have been bitten more by another CI problem: tests taking too long to finish and timing out after 50 minutes. We mitigated this by eliminating redundant tests. We are also introducing integration tests (new PR coming soon), which run for another 20-30 minutes. We have used the parallel jobs feature of Travis (https://docs.travis-ci.com/user/speeding-up-the-build/#parallelizing-your-builds-across-virtual-machines) to run the unit and integration tests in parallel.

We realized that the unit tests themselves need to be refactored holistically. We need to do a better job of standardizing the test setup utility functions and abstractions. Also, some tests have huge setup costs even though they don't need the entire setup. We need help in looking at the quality of unit tests across modules (https://jira.apache.org/jira/browse/HUDI-203).

Regarding adding multiple stages, I like the idea of using such pipelines. It surely helps improve the developer experience in handling CI failures. We would need to make this change work with job parallelization as well (please see https://github.com/apache/incubator-hudi/pull/782/files).

This is pretty exciting. Thanks a lot for your suggestions and help in improving the CI experience.

Balaji.V

On Monday, August 12, 2019, 01:18:07 AM PDT, vino yang wrote:

Hi guys,

As you can see, there are many warning and error log messages and exception messages in the Travis log [1]. Although they do not cause Travis to fail, they make the log file very large and make it hard to find the useful messages. So I propose that we try to fix and reduce these unhelpful messages to make the build job's log cleaner.

Another suggestion: we can split the single build job of Hudi's Travis into multiple stages, e.g. Compile, Test, and Clean. Please consider this example [2]. There are two advantages to this suggestion:

1) It fails the Travis build as early as possible if the PR cannot be compiled successfully;
2) Splitting test jobs by module lets the contributor locate the failed module as quickly as possible, instead of waiting for and reading through the whole log file.

What do you think?

Best,
Vino

[1]: https://api.travis-ci.org/v3/job/570626673/log.txt
[2]: https://travis-ci.org/apache/flink/builds/555743542
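
For illustration only, the staged layout suggested above might look roughly like this in .travis.yml. This is a sketch, not Hudi's actual configuration: the stage names, the Maven commands, and the module names (module-a, module-b) are all assumptions made up for the example. It relies on Travis build stages, where stages run one after another, jobs inside a stage run in parallel, and a failed stage cancels the stages after it.

```yaml
language: java

jobs:
  include:
    # Stage 1: compile only, so a PR that does not build fails early.
    - stage: compile
      script: mvn clean install -DskipTests

    # Stage 2: one job per module (hypothetical module names).
    # Jobs in the same stage run in parallel on separate VMs.
    # Each job starts on a fresh VM, so -am also builds the modules
    # that the selected module depends on.
    - stage: test
      name: "module-a tests"
      script: mvn test -pl module-a -am
    - stage: test
      name: "module-b tests"
      script: mvn test -pl module-b -am
```

With a layout like this, the job name in the Travis UI would show which module failed, without reading through a single combined log.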
