Re: [DISCUSS] Build Times are getting out of hand

Casey Stella Tue, 07 Feb 2017 07:05:07 -0800

I believe that some people use Travis and some people request Jenkins from
Apache Infra.  That being said, personally, I think we should take the
opportunity to correct the underlying issues.  50 minutes for a build seems
excessive to me.


On Mon, Feb 6, 2017 at 10:07 PM, Otto Fowler <[email protected]>
wrote:

> Is there an alternative to Travis?  Do other like-sized Apache projects
> have these problems?  Do they use Travis?
>
>
> On February 6, 2017 at 17:02:37, Casey Stella ([email protected]) wrote:
>
> For those with pending/building pull requests, it will come as no surprise
> that our build times are increasing at a pace that is worrisome. In fact,
> we have hit a fundamental limit associated with Travis over the weekend.
> We have crept up into 40+ minute build territory, and Travis seems to
> error out at around 49 minutes.
>
> Taking the current build (
> https://travis-ci.org/apache/incubator-metron/jobs/198929446), looking at
> just job times, we're spending about 19 - 20 minutes (1176.53 seconds) in
> tests out of 44 minutes and 42 seconds to do the build. This places the
> unit tests at around 43% of the build time. I say all of this to point out
> that while unit tests are a portion of the build, they are not even the
> majority of the build time. We need an approach that addresses build
> performance as a whole, and we need it soon.
>
> To seed the discussion, I will point to a few things that come to mind
> that fit into three broad categories:
>
> *Tests are Slow*
>
>
> - *Tactical*: We have around 13 tests that take more than 30 seconds and
> make up 14 minutes of the build. Looking at what we can do to speed up
> those tests may be worthwhile as a tactical approach.
> - We are spinning up the same services (e.g. Kafka, Storm) for multiple
> tests. Instead, we could use the Docker infrastructure to spin them up
> once and then use them throughout the tests.
>
>
> *Tests aren't parallel*
>
> Currently we cannot run the build in parallel due to the integration test
> infrastructure spinning up its own services that bind to the same ports.
> If we correct this, we can run the builds in parallel with mvn -T (for
> example, mvn -T 1C for one thread per core).
>
> - Correct this by decoupling the infrastructure from the tests and
> refactoring the tests to run in parallel.
> - Make the integration testing infrastructure bind intelligently to
> whatever port is available.
> - Move the integration tests to their own project. This will let us run
> the build in parallel, since an individual project's tests will still be
> run serially.
>
> *Packaging is Painful*
>
> We have a sensitive environment in terms of dependencies. As such, we are
> careful to shade and relocate dependencies that we want to isolate from
> our transitive dependencies. The consequence of this is that we spend a
> lot of time in the build shading and relocating Maven module output.
>
> - Do the hard work to walk our transitive dependencies and ensure that
> we are including only one copy of every library by using exclusions
> effectively. This will not only bring down build times, it will make sure
> we know what we're including.
> - Try to devise a strategy where we only shade once at the end. This
> could look like some combination of:
>    - standardizing on the lowest common denominator of a troublesome
>      library
>    - We shade in dependencies so they can use different versions of
>      libraries (e.g. metron-common with a modern version of Guava) than
>      the final jars.
>    - exclusions
>    - externalizing infrastructure out to not necessitate spinning up
>      Hadoop components in-process for integration tests (i.e. HBase
>      server conflicts with Storm in a few dependencies)
>
> *Final Thoughts*
>
> If I had three to pick, I'd pick
>
> - moving off of the in-memory component infrastructure to docker images
> - fixing the maven poms to exclude correctly
> - ensuring the resulting tests are parallelizable
>
> I will point out that fixing the maven poms to exclude correctly (i.e. we
> choose the version of every jar that we depend on transitively) ticks
> multiple boxes, not just making things faster.
>
> What are your thoughts? What did I miss? We need a plan and we need to
> execute on it soon, otherwise Travis is going to keep smacking us hard. It
> may be worthwhile constructing a tactical plan and then a more strategic
> plan that we can work toward. I was heartened at how much some of these
> suggestions dovetail with the discussion around the future of the docker
> infrastructure.
>
> Best,
>
> Casey
>
>
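
As a quick check of the build-time figures quoted above (1176.53 seconds of
tests in a build of 44 minutes and 42 seconds), the arithmetic can be
reproduced in a few lines of Java. The class and method names are
illustrative only and do not come from the Metron code base.

```java
final class BuildTimeShare {
    // Percentage of the total build time that was spent running tests.
    static double testSharePercent(double testSeconds, int buildMinutes, int buildSeconds) {
        double totalSeconds = buildMinutes * 60 + buildSeconds;
        return 100.0 * testSeconds / totalSeconds;
    }

    public static void main(String[] args) {
        // Figures taken from the Travis job discussed in the thread.
        double share = testSharePercent(1176.53, 44, 42);
        System.out.println("tests as a share of the build (percent): " + share);
    }
}
```

The result comes out just under 44%, in line with the rough "around 43%"
figure: the tests are a large part of the build but not the majority.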

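The suggestion above to have the integration testing infrastructure bind to
whatever port is available could be sketched roughly as follows. This is a
minimal illustration of the underlying mechanism only (asking the operating
system for a free port by binding to port 0), not Metron's actual test
infrastructure. The class and method names are hypothetical, and a real
in-process Kafka or Storm component would still need to be configured with
the port that was chosen.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

final class FreePorts {
    // Hypothetical helper: binds a listening socket to a port chosen by the OS.
    // Passing port 0 asks the OS for any currently free port, so two test
    // components started at the same time do not collide on a fixed port.
    static ServerSocket bindToFreePort() {
        try {
            return new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Two stand-in "services" bound at the same time get different ports.
        try (ServerSocket a = bindToFreePort(); ServerSocket b = bindToFreePort()) {
            System.out.println("service A listening on port " + a.getLocalPort());
            System.out.println("service B listening on port " + b.getLocalPort());
        }
    }
}
```

Keeping the socket open and handing it (or its port number) to the component
under test avoids the race where a port is found free, released, and then
taken by another process before the component binds to it.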