Hi Bastian!

> Failures with reason "eunit_replicator"

The replicator tests were fixed very recently and should be fine from now on.

> Failures with reason "eunit_compression"

The compression failure is new; this test seems to be flaky on rare occasions. We need some debug io:format/2 calls to track it down.

> Failures with reason "network"

Sometimes git-wip-us.apache.org is inaccessible. When that happens there are only two options:

1. Retry until it is back up
2. Fall back to the GitHub mirror

> Failures with reason "docker"

I guess something occasionally goes wrong with the Docker service. It's worth getting the logs for these, but Clemens (@klaemo) may have more ideas.

> Failures with reason "libdl"

Unfortunately, the only build log referenced returns HTTP 404 NOT FOUND.

> Failures with reason "aborted"

The actual reason is:

/usr/src/couchdb/apache-couchdb-2.0.0-7e892d6/bin/couchjs: error while loading shared libraries: libmozjs185.so.1.0: cannot open shared object file: No such file or directory

So it seems SpiderMonkey was not installed correctly.

P.S. It would also be nice to send build failure notifications to the dev@ ML so that we are aware of them.

P.P.S. Thanks for working on CI! That's really cool and helpful.

--
,,,^..^,,,

On Sat, Feb 27, 2016 at 8:11 PM, Bastian Krol <[email protected]> wrote:
> Hi folks,
>
> some updates regarding the CouchDB CI setup on builds.apache.org.
>
> The CouchDB build job (https://builds.apache.org/job/CouchDB/) has now six
> variations:
>
> * Ubuntu 14.04 with default Erlang
> * Ubuntu 14.04 with Erlang 18.2
> * Debian 8 with default Erlang
> * Debian 8 with Erlang 18.2
> * CentOS 7 with default Erlang
> * CentOS 7 with Erlang 18.2
>
> However, builds fail abysmally often.
>
> I need your help to sort this out and improve the build stability.
>
> I wrote some quick scripts to categorize the failing builds.
> Please check the result here:
>
> https://github.com/basti1302/couchdb-ci/blob/master/utils/analyze-jenkins-logs/ci-errors.markdown
>
> We can ignore the categories "network", "docker" and "aborted". Most
> failures come from failing eunit tests, either replicator (30 failures) or
> compression (10 failures).
>
> Why is that? Are these tests inherently fragile? Is it a symptom of a
> problem with the CI setup (a bug in Docker or something similar)?
>
> Maybe the categorization/root cause analysis is not even correct?
>
> I'd be grateful if people could chime in here.
>
> I really would like to get closer to "a failing build usually means there is
> a problem in the code"-kind of situation :-)
>
> Best regards
>
> Bastian
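
For the "network" case, the two options (retry, then fall back to the GitHub mirror) could be combined roughly as follows. This is only a sketch and not what the CI job currently does: the function name `clone_with_fallback`, its parameters, and the retry counts are all hypothetical, and the repository URLs in the usage comment are assumptions that should be checked against the real job configuration.

```python
import subprocess
import time


def clone_with_fallback(urls, dest, attempts=3, delay=5.0,
                        run=subprocess.call, sleep=time.sleep):
    """Try each remote in order, retrying each one up to `attempts` times.

    Returns the URL that was cloned successfully, or None if every
    remote failed. `run` and `sleep` are injectable so the logic can
    be exercised without network access (hypothetical helper).
    """
    for url in urls:
        for attempt in range(1, attempts + 1):
            # `git clone` exits with status 0 on success.
            if run(["git", "clone", url, dest]) == 0:
                return url
            # Wait before retrying the same remote, but not after the
            # last attempt: move straight on to the next remote.
            if attempt < attempts:
                sleep(delay)
    return None


# Example usage (URLs are assumptions, primary first, mirror second):
#   clone_with_fallback(
#       ["https://git-wip-us.apache.org/repos/asf/couchdb.git",
#        "https://github.com/apache/couchdb.git"],
#       "couchdb")
```

Trying the primary remote a few times before switching keeps builds on the canonical repository whenever possible and only uses the mirror during a real outage.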
