Hi Vino,
These are really good initiatives. Regarding warnings and error logs, some of 
these come from negative test cases where parts of the system are expected to 
throw exceptions. There are also a bunch of warnings/errors that come with 
bringing up/shutting down Hive/Spark/HBase services for tests. Maybe we can 
try controlling log levels for those specific tests. For the others, we need to 
look at the warnings/errors more closely to fix them.
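To make the per-test log-level idea concrete, here is a minimal sketch. It assumes Python's standard logging module purely as a stand-in (our tests are Java, so the real change would go through the project's own logging setup), and the helper name log_level and the logger name "noisy.service" are hypothetical.

```python
import logging
from contextlib import contextmanager


@contextmanager
def log_level(logger_name, level):
    """Temporarily set one logger's level, restoring the old level afterwards."""
    logger = logging.getLogger(logger_name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore even if the test body raises, so other tests are unaffected.
        logger.setLevel(previous)


# Hypothetical negative test: the expected failure is logged at ERROR,
# so inside the block only CRITICAL records from this logger are emitted.
with log_level("noisy.service", logging.CRITICAL) as log:
    log.error("expected failure in negative test")  # filtered out
```

The point of scoping the change to a single logger and a single test is that unexpected warnings from everything else still show up in the build log.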
Historically, we have been bitten more by another CI problem: tests taking too 
long to finish and timing out after 50 minutes. We mitigated this by 
eliminating redundant tests. We are also introducing integration tests (new PR 
coming soon), which run for another 20-30 minutes. We have used the parallel 
jobs feature of Travis 
(https://docs.travis-ci.com/user/speeding-up-the-build/#parallelizing-your-builds-across-virtual-machines)
 to have both unit and integration tests run in parallel. We realized that the 
unit tests themselves need to be refactored holistically. We would need to do 
a better job of standardizing the test setup utility functions and 
abstractions. Also, some tests have huge setup costs even though they don't 
need the entire setup. We need help in looking at the quality of unit tests 
across modules (https://jira.apache.org/jira/browse/HUDI-203).
Regarding adding multiple stages, I like the idea of using such pipelines. It 
surely helps improve the developer experience in handling CI failures. We would 
need to make this change work with job parallelization as well (please see 
https://github.com/apache/incubator-hudi/pull/782/files).
This is pretty exciting. Thanks a lot for your suggestions and help in 
improving the CI experience.
Balaji.V


      On Monday, August 12, 2019, 01:18:07 AM PDT, vino yang 
<[email protected]> wrote:  
 
 Hi guys,

As you can see, there are many warning, error, and exception messages in
the Travis log.[1] Although they do not cause the Travis build to fail,
they make the log file very large and make useful messages hard to find.
So, I propose that we fix or reduce these unhelpful messages and make the
build job's log cleaner.

Another suggestion: we can split the single build job of Hudi's Travis into
multiple stages, e.g. Compile, Test, and Clean. Please consider this
example.[2] There are two advantages to this suggestion:

1) It can fail the Travis build as soon as possible if the PR cannot be
compiled successfully;
2) Splitting test jobs by module will let the contributor locate the
failed module as soon as possible, instead of waiting and then reading the
whole log file.

What do you think?

Best,
Vino

[1]: https://api.travis-ci.org/v3/job/570626673/log.txt
[2]: https://travis-ci.org/apache/flink/builds/555743542
  
