We can now reproduce the builds locally, without the web UI, with a
single command.

To demonstrate, building the master branch and building a pull request
require the following commands:

$ ursabot project build 'AMD64 Ubuntu 18.04 C++'
$ ursabot project build -pr <num> 'AMD64 Ubuntu 18.04 C++'

See the output here:
https://travis-ci.org/ursa-labs/ursabot/builds/566057077#L988

This effectively means that the builders defined in ursabot can be run
directly, with a single command, on machines or CI services that have
docker installed. It also replaces the need for the docker-compose setup.

I'm going to write some documentation and prepare the arrow builders for
a donation to the arrow codebase (which of course requires a vote).

If anyone has a question, please don't hesitate to ask!

Regards, Krisztian

On Tue, Jul 30, 2019 at 4:45 PM Krisztián Szűcs <szucs.kriszt...@gmail.com> wrote:

> Ok, but the configuration movement to arrow is orthogonal to
> the local reproducibility feature. Could we proceed with that?
>
> On Tue, Jul 30, 2019 at 4:38 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
>> I will defer to others to investigate this matter further but I would
>> really like to see a concrete and practical path to local
>> reproducibility before moving forward on any changes to our current
>> CI.
>>
>> On Tue, Jul 30, 2019 at 7:38 AM Krisztián Szűcs
>> <szucs.kriszt...@gmail.com> wrote:
>> >
>> > Fixed it and restarted a bunch of builds.
>> >
>> > On Tue, Jul 30, 2019 at 5:13 AM Wes McKinney <wesmck...@gmail.com> wrote:
>> >
>> > > By the way, can you please disable the Buildbot builders that are
>> > > causing builds on master to fail? We haven't had a passing build in
>> > > over a week.
>> > > Until we reconcile the build configurations we shouldn't
>> > > be failing contributors' builds
>> > >
>> > > On Mon, Jul 29, 2019 at 8:23 PM Wes McKinney <wesmck...@gmail.com> wrote:
>> > > >
>> > > > On Mon, Jul 29, 2019 at 7:58 PM Krisztián Szűcs
>> > > > <szucs.kriszt...@gmail.com> wrote:
>> > > > >
>> > > > > On Tue, Jul 30, 2019 at 1:38 AM Wes McKinney <wesmck...@gmail.com> wrote:
>> > > > >
>> > > > > > hi Krisztian,
>> > > > > >
>> > > > > > Before talking about any code donations or where to run builds, I
>> > > > > > think we first need to discuss the worrisome situation where we have
>> > > > > > in some cases 3 (or more) CI configurations for different components
>> > > > > > in the project.
>> > > > > >
>> > > > > > Just taking into account our C++ build, we have:
>> > > > > >
>> > > > > > * A config for Travis CI
>> > > > > > * Multiple configurations in Dockerfiles under cpp/
>> > > > > > * A brand new (?) configuration in this third party ursa-labs/ursabot
>> > > > > >   repository
>> > > > > >
>> > > > > > I note for example that the "AMD64 Conda C++" Buildbot build is
>> > > > > > failing while Travis CI is succeeding
>> > > > > >
>> > > > > > https://ci.ursalabs.org/#builders/66/builds/3196
>> > > > > >
>> > > > > > Starting from first principles, at least for Linux-based builds, what
>> > > > > > I would like to see is:
>> > > > > >
>> > > > > > * A single build configuration (which can be driven by yaml-based
>> > > > > >   configuration files and environment variables), rather than 3 like we
>> > > > > >   have now. This build configuration should be decoupled from any CI
>> > > > > >   platform, including Travis CI and Buildbot
>> > > > > >
>> > > > > Yeah, this would be the ideal setup, but I'm afraid the situation is a bit
>> > > > > more complicated.
>> > > > >
>> > > > > TravisCI
>> > > > > --------
>> > > > >
>> > > > > Constructed from a bunch of scripts optimized for travis, this setup is
>> > > > > slow and hardly compatible with any of the remaining setups.
>> > > > > I think we should ditch it.
>> > > > >
>> > > > > The "docker-compose setup"
>> > > > > --------------------------
>> > > > >
>> > > > > Most of the Dockerfiles are part of the docker-compose setup we've
>> > > > > developed.
>> > > > > This might be a good candidate as the tool to centralize around our future
>> > > > > setup, mostly because docker-compose is widely used, and we could set up
>> > > > > buildbot builders (or any other CI's) to execute the sequence of
>> > > > > docker-compose build and docker-compose run commands.
>> > > > > However docker-compose is not suitable for building and running
>> > > > > hierarchical images. This is why we have added Makefile [1] to execute a
>> > > > > "build" with a single make command instead of manually executing multiple
>> > > > > commands involving multiple images (which is error prone). It can also
>> > > > > leave a lot of garbage after both containers and images.
>> > > > > Docker-compose shines when one needs to orchestrate multiple containers and
>> > > > > their networks / volumes on the same machine. We made it work (with a
>> > > > > couple of hacky workarounds) for arrow though.
>> > > > > Despite that, I still consider the docker-compose setup a good solution,
>> > > > > mostly because of its biggest advantage, the local reproducibility.
>> > > > >
>> > > >
>> > > > I think what is missing here is an orchestration tool (for example, a
>> > > > Python program) to invoke Docker-based development workflows involving
>> > > > multiple steps.
>> > > >
>> > > > > Ursabot
>> > > > > -------
>> > > > >
>> > > > > Ursabot uses low level docker commands to spin up and down the containers
>> > > > > and it also has a utility to nicely build the hierarchical images (with
>> > > > > much less maintainable code). The builders are reliable, fast (thanks to
>> > > > > docker) and it's great so far.
>> > > > > Where it falls short compared to docker-compose is the lack of the local
>> > > > > reproducibility, currently the docker worker cleans up everything after it
>> > > > > except the mounted volumes for caching. `docker-compose run` is a pretty
>> > > > > nice way to shell into the container.
>> > > > >
>> > > > > Use docker-compose from ursabot?
>> > > > > --------------------------------
>> > > > >
>> > > > > So assume that we should use docker-compose commands in the buildbot
>> > > > > builders.
>> > > > > Then:
>> > > > > - there would be a single build step for all builders [2] (which means a
>> > > > >   single chunk of unreadable log) - it also complicates working with
>> > > > >   esoteric
>> > > >
>> > > > I think this is too much of a black-and-white way of looking at
>> > > > things. What I would like to see is a build orchestration tool, which
>> > > > can be used via command line interface, not unlike the current
>> > > > crossbow.py and archery command line scripts, that can invoke a build
>> > > > locally or in a CI setting.
>> > > >
>> > > > >   builders like the on-demand crossbow trigger and the benchmark runner
>> > > > > - no possibility to customize the buildsteps (like aggregating the
>> > > > >   count of warnings)
>> > > > > - no time statistics for the steps which would make it harder to
>> > > > >   optimize the build times
>> > > > > - to properly clean up the container some custom solution would be
>> > > > >   required
>> > > > > - if we'd need to introduce additional parametrizations to the
>> > > > >   docker-compose.yaml (for example to add other architectures) then it
>> > > > >   might require full yaml duplication
>> > > >
>> > > > I think the tool would need to be higher level than docker-compose
>> > > >
>> > > > In general I'm not very comfortable introducing a hard dependency on
>> > > > Buildbot (or any CI platform, for that matter) into the project. So we
>> > > > have to figure out a way to move forward without such hard dependency
>> > > > or go back to the drawing board.
>> > > >
>> > > > > - exchanging data between the docker-compose container and buildbot
>> > > > >   would be more complicated, for example the benchmark comment reporter
>> > > > >   reads the result from a file, in order to do the same (reading
>> > > > >   structured output on stdout and stderr from scripts is more error
>> > > > >   prone) mounted volumes are required, which brings the usual permission
>> > > > >   problems on linux.
>> > > > > - local reproducibility still requires manual intervention because the
>> > > > >   scripts within the docker containers are not pausable, they exit and
>> > > > >   the steps until the failed one must be re-executed* after ssh-ing into
>> > > > >   the running container.
>> > > > >
>> > > > > Honestly I see more issues than advantages here. Let's see the other way
>> > > > > around.
>> > > > >
>> > > > > Local reproducibility with ursabot?
>> > > > > -----------------------------------
>> > > > >
>> > > > > The most wanted feature that docker-compose has but ursabot doesn't is the
>> > > > > local reproducibility. First of all, ursabot can be run locally, including
>> > > > > all of its builders, so the local reproducibility is partially resolved.
>> > > > > The missing piece is the interactive shell into the running container,
>> > > > > because buildbot instantly stops and aggressively cleans up everything
>> > > > > after the container.
>> > > > >
>> > > > > I have three solutions / workarounds in mind:
>> > > > >
>> > > > > 1. We have all the power of docker and docker-compose from ursabot through
>> > > > >    docker-py, and we can easily keep the container running by simply not
>> > > > >    stopping it [3]. Configuring the locally running buildbot to keep the
>> > > > >    containers running after a failure seems quite easy. *It has the
>> > > > >    advantage that all of the buildsteps preceding the failed one are
>> > > > >    already executed, so it requires less manual intervention.
>> > > > >    This could be done on the web UI or even from the CLI, like
>> > > > >    `ursabot reproduce <builder-name>`
>> > > > > 2. Generate the docker-compose.yaml and required scripts from the Ursabot
>> > > > >    builder configurations, including the shell scripts.
>> > > > > 3. Generate a set of commands to reproduce the failure without (even asking
>> > > > >    the comment bot "how to reproduce the failing one"). The response would
>> > > > >    look similar to:
>> > > > >    ```bash
>> > > > >    $ docker pull <image>
>> > > > >    $ docker run -it <image> bash
>> > > > >    # cmd1
>> > > > >    # cmd2
>> > > > >    # <- error occurs here ->
>> > > > >    ```
>> > > > >
>> > > > > TL;DR
>> > > > > -----
>> > > > > In the first iteration I'd remove the travis configurations.
>> > > > > In the second iteration I'd develop a feature for ursabot to make local
>> > > > > reproducibility possible.
>> > > > >
>> > > > > [1]: https://github.com/apache/arrow/blob/master/Makefile.docker
>> > > > > [2]: https://ci.ursalabs.org/#/builders/87/builds/929
>> > > > > [3]: https://github.com/buildbot/buildbot/blob/e7ff2a3b959cff96c77c07891fa07a35a98e81cb/master/buildbot/worker/docker.py#L343
>> > > > >
>> > > > > > * A local tool to run any Linux-based builds locally using Docker at
>> > > > > >   the command line, so that CI behavior can be exactly reproduced
>> > > > > >   locally
>> > > > > >
>> > > > > > Does that seem achievable?
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Wes
>> > > > > >
>> > > > > > On Mon, Jul 29, 2019 at 6:22 PM Krisztián Szűcs
>> > > > > > <szucs.kriszt...@gmail.com> wrote:
>> > > > > > >
>> > > > > > > Hi All,
>> > > > > > >
>> > > > > > > Ursabot works pretty well so far, and the CI feedback times have become
>> > > > > > > even better* after enabling the docker volume caches, but the development
>> > > > > > > and maintenance of it are still not available for the whole Arrow
>> > > > > > > community.
>> > > > > > >
>> > > > > > > While it wasn't straightforward I've managed to separate the source code
>> > > > > > > required to configure the Arrow builders into a separate directory, which
>> > > > > > > eventually can be donated to Arrow.
>> > > > > > > The README is under construction, but the code is available here [1].
>> > > > > > >
>> > > > > > > Until this codebase is governed by the Arrow community,
>> > > > > > > decommissioning slow travis builds is not possible, so the overall CI
>> > > > > > > times required to merge a PR will remain high.
>> > > > > > >
>> > > > > > > Regards, Krisztian
>> > > > > > >
>> > > > > > > * C++ builder times have dropped from ~6-7 minutes to ~3-4 minutes
>> > > > > > > * Python builder times have dropped from ~7-8 minutes to ~3-5 minutes
>> > > > > > > * ARM C++ builder times have dropped from ~19-20 minutes to ~9-12 minutes
>> > > > > > >
>> > > > > > > [1]: https://github.com/ursa-labs/ursabot/tree/a46c6aa7b714346b3e4bb7921decb4d4d2f5ed70/projects/arrow
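
Workaround 3 in the quoted thread (generating a set of commands to
reproduce a failure) could look roughly like the Python sketch below. It is
only an illustration under assumptions: the `Step` class, the
`reproduction_script` function and the image name are hypothetical and are
not part of ursabot or buildbot. The sketch simply renders the
`docker pull` / `docker run -it` sequence shown in the thread, followed by
the build steps up to and including the failing one.

```python
import shlex
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """Hypothetical stand-in for one build step of a builder."""
    name: str
    command: List[str]


def reproduction_script(image: str, steps: List[Step], failed: int) -> str:
    """Render shell commands that replay a build up to the failed step.

    `failed` is the zero-based index of the step that failed.
    """
    if not 0 <= failed < len(steps):
        raise ValueError("failed must be a valid index into steps")

    quoted_image = shlex.quote(image)
    lines = [
        f"$ docker pull {quoted_image}",
        f"$ docker run -it {quoted_image} bash",
    ]
    # Commands typed inside the container are shown with a '#' root prompt,
    # matching the format used in the thread.
    for step in steps[: failed + 1]:
        lines.append("# " + shlex.join(step.command))
    lines.append("# <- error occurs here ->")
    return "\n".join(lines)


if __name__ == "__main__":
    example_steps = [
        Step("configure", ["cmake", ".."]),
        Step("build", ["make", "-j4"]),
        Step("test", ["ctest"]),
    ]
    # The image name is made up for the example.
    print(reproduction_script("example/cpp-builder:latest", example_steps, 1))
```

Steps after the failing one are left out on purpose, so the person
reproducing the failure ends up in a shell at the point where the build
broke.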