Re: [DISCUSS] Build Times are getting out of hand

Casey Stella Tue, 07 Feb 2017 07:02:15 -0800

Mike, I can verify that the integration tests do not run in parallel via
mvn -T 1C clean install


At a minimum the integration test infrastructure will need to hunt for an
open port to bind to rather than assuming one.
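
As a minimal sketch of what "hunting for an open port" could look like (the
class and method names here are hypothetical, not existing Metron code), one
option is to bind to port 0 and let the OS hand back an unused ephemeral port:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper, not part of the Metron code base.
public class FreePort {
    /** Asks the OS for an unused port by binding a socket to port 0. */
    public static int find() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            // The OS picked the port; read it back before the socket closes.
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("free port: " + find());
    }
}
```

Note that this is racy: another process can grab the port between the socket
closing and the test component binding to it, so the component would ideally
bind to port 0 itself and report the port it got.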

On Tue, Feb 7, 2017 at 9:26 AM, Michael Miklavcic <
[email protected]> wrote:

> I can't recall, did we have a good solution around Docker and remote
> debugging integration tests from the IDE? On the topic of test refactoring
> and running in parallel, I'm all for it. I know JJ had been doing this on
> his local machine at one point, but we'd need to be sure all tests are
> truly independent. E.g. counts on hbase tables would need to be very
> specific or every test should use unique tables. Also, can we spin up
> something like Docker in Travis? How many cores do we get? I'll look into
> that and see what we get.
>
> I'm all for simplifying our dependencies. Shading the jars takes an
> incredible amount of time and has bitten us repeatedly.
> Another bummer about the jar shading has been that the build runs
> differently in IntelliJ than it does from the Maven command line. I don't
> think we'll get away from it entirely, but we may be able to make this
> better as well.
>
> From my most recent local build, these are the biggest offending modules:
> metron-profiler .................................... SUCCESS [05:56 min]
> metron-parsers ..................................... SUCCESS [09:38 min]
> metron-data-management ............................. SUCCESS [09:15 min]
> elasticsearch-shaded ............................... SUCCESS [08:05 min]
>
> I'm going to take a look at Travis and also see what pom dependencies I can
> start excluding.
>
>
> On Mon, Feb 6, 2017 at 3:02 PM, Casey Stella <[email protected]> wrote:
>
> > For those with pending/building pull requests, it will come as no surprise
> > that our build times are increasing at a worrisome pace.  In fact, we hit
> > a fundamental limit associated with Travis over the weekend.  We have
> > crept up into 40+ minute build territory, and Travis seems to error out
> > at around 49 minutes.
> >
> > Taking the current build (
> > https://travis-ci.org/apache/incubator-metron/jobs/198929446), looking at
> > just job times, we're spending about 19 - 20 minutes (1176.53 seconds) in
> > tests out of 44 minutes and 42 seconds to do the build.  This places the
> > unit tests at around 44% of the build time.  I say all of this to point out
> > that while unit tests are a portion of the build, they are not even the
> > majority of the build time.  We need an approach that addresses build
> > performance holistically, and we need it soon.
> >
> > To seed the discussion, I will point to a few things that come to mind that
> > fit into three broad categories:
> >
> > *Tests are Slow*
> >
> >
> >    - *Tactical*: We have around 13 tests that take more than 30 seconds and
> >    make up 14 minutes of the build.  Speeding up those tests may be worth
> >    considering as a tactical approach.
> >    - We are spinning up the same services (e.g. kafka, storm) for multiple
> >    tests.  Instead, use the docker infrastructure to spin them up once and
> >    then use them throughout the tests.
> >
> >
> > *Tests aren't parallel*
> >
> > Currently we cannot run the build in parallel due to the integration test
> > infrastructure spinning up its own services that bind to the same ports.
> > If we correct this, we can run the builds in parallel with mvn -T
> >
> >    - Correct this by decoupling the infrastructure from the tests and
> >    refactoring the tests to run in parallel.
> >    - Make the integration testing infrastructure bind intelligently to
> >    whatever port is available.
> >    - Move the integration tests to their own project.  This will let us run
> >    the build in parallel since an individual project's tests will be run
> >    serially.
> >
> > *Packaging is Painful*
> >
> > We have a sensitive environment in terms of dependencies.  As such, we are
> > careful to shade and relocate dependencies that we want to isolate from our
> > transitive dependencies.  The consequence of this is that we spend a lot
> > of time in the build shading and relocating maven module output.
> >
> >    - Do the hard work to walk our transitive dependencies and ensure that
> >    we are including only one copy of every library by using exclusions
> >    effectively.  This will not only bring down build times, it will make
> >    sure we know what we're including.
> >    - Try to devise a strategy where we only shade once at the end.  This
> >    could look like some combination of
> >       - standardizing on the lowest common denominator of a troublesome
> >       library
> >          - We shade in dependencies so they can use different versions of
> >          libraries (e.g. metron-common with a modern version of guava)
> >          than the final jars.
> >       - exclusions
> >       - externalizing infrastructure out to not necessitate spinning up
> >       hadoop components in-process for integration tests (i.e. hbase
> >       server conflicts with storm in a few dependencies)
> >
> > *Final Thoughts*
> >
> > If I had three to pick, I'd pick
> >
> >    - moving off of the in-memory component infrastructure to docker images
> >    - fixing the maven poms to exclude correctly
> >    - ensuring the resulting tests are parallelizable
> >
> > I will point out that fixing the maven poms to exclude correctly (i.e. we
> > choose the version of every jar that we depend on transitively) ticks
> > multiple boxes, not just making things faster.
> >
> > What are your thoughts?  What did I miss?  We need a plan and we need to
> > execute on it soon, otherwise Travis is going to keep smacking us hard.
> > It may be worthwhile constructing a tactical plan and then a more
> > strategic plan that we can work toward.  I was heartened at how much some
> > of these suggestions dovetail with the discussion around the future of
> > the docker infrastructure.
> >
> > Best,
> >
> > Casey
> >
>
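
Mike's point above about every test using unique tables could be sketched
roughly as follows. The class and method names are hypothetical, and this only
generates names; it assumes each test would create and drop its own table:

```java
import java.util.UUID;

// Hypothetical helper, not part of the Metron code base.
public class TestTables {
    /** Returns a table name that is unique per call, e.g. "enrichment_3f2a...". */
    public static String uniqueName(String prefix) {
        // Strip dashes so the suffix is purely alphanumeric.
        String suffix = UUID.randomUUID().toString().replace("-", "");
        return prefix + "_" + suffix;
    }

    public static void main(String[] args) {
        System.out.println(uniqueName("enrichment"));
    }
}
```

With names like these, counts on a table only reflect what the owning test
wrote, so tests running in parallel would not see each other's rows.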
