Mike, I can verify that the integration tests cannot be run in parallel via mvn -T 1C clean install.
At a minimum, the integration test infrastructure will need to hunt for an open port to bind to rather than assuming one.

On Tue, Feb 7, 2017 at 9:26 AM, Michael Miklavcic <[email protected]> wrote:

> I can't recall, did we have a good solution around Docker and remote
> debugging integration tests from the IDE? On the topic of test refactoring
> and running in parallel, I'm all for it. I know JJ had been doing this on
> his local machine at one point, but we'd need to be sure all tests are
> truly independent. E.g. counts on hbase tables would need to be very
> specific or every test should use unique tables. Also, can we spin up
> something like Docker in Travis? How many cores do we get? I'll look into
> that and see what we get.
>
> I'm all for simplifying our dependencies. Shading the jars takes an
> incredible amount of time and has bitten us repeatedly. Another bummer
> about the jar shading has been that the build runs differently in IntelliJ
> than it does from the Maven command line. I don't think we'll get away
> from it entirely, but we may be able to make this better as well.
>
> From my most recent local build, these are the biggest offending modules:
> metron-profiler .................................... SUCCESS [05:56 min]
> metron-parsers ..................................... SUCCESS [09:38 min]
> metron-data-management ............................. SUCCESS [09:15 min]
> elasticsearch-shaded ............................... SUCCESS [08:05 min]
>
> I'm going to take a look at Travis and also see what pom dependencies I can
> start excluding.
>
>
> On Mon, Feb 6, 2017 at 3:02 PM, Casey Stella <[email protected]> wrote:
>
> > For those with pending/building pull requests, it will come as no surprise
> > that our build times are increasing at a pace that is worrisome. In fact,
> > we have hit a fundamental limit associated with Travis over the weekend.
> > We have crept up into the 40+ minute build territory and travis seems to
> > error out at around 49 minutes.
> >
> > Taking the current build (
> > https://travis-ci.org/apache/incubator-metron/jobs/198929446), looking at
> > just job times, we're spending about 19 - 20 minutes (1176.53 seconds) in
> > tests out of 44 minutes and 42 seconds to do the build. This places the
> > unit tests at around 43% of the build time. I say all of this to point out
> > that while unit tests are a portion of the build, they are not even the
> > majority of the build time. We need an approach that addresses the whole
> > build performance holistically and we need it soonest.
> >
> > To seed the discussion, I will point to a few things that come to mind that
> > fit into three broad categories:
> >
> > *Tests are Slow*
> >
> >
> >    - *Tactical*: We have around 13 tests that take more than 30 seconds and
> >    make up 14 minutes of the build. Looking at what we can do to speed up
> >    those tests may be worth considering as a tactical approach.
> >    - We are spinning up the same services (e.g. kafka, storm) for multiple
> >    tests; instead, use the docker infrastructure to spin them up once and
> >    then use them throughout the tests.
> >
> >
> > *Tests aren't parallel*
> >
> > Currently we cannot run the build in parallel due to the integration test
> > infrastructure spinning up its own services that bind to the same ports.
> > If we correct this, we can run the builds in parallel with mvn -T
> >
> >    - Correct this by decoupling the infrastructure from the tests and
> >    refactoring the tests to run in parallel.
> >    - Make the integration testing infrastructure bind intelligently to
> >    whatever port is available.
> >    - Move the integration tests to their own project. This will let us run
> >    the build in parallel since an individual project's test will be run
> >    serially.
> >
> > *Packaging is Painful*
> >
> > We have a sensitive environment in terms of dependencies. As such, we are
> > careful to shade and relocate dependencies that we want to isolate from our
> > transitive dependencies. The consequence of this is that we spend a lot
> > of time in the build shading and relocating maven module output.
> >
> >    - Do the hard work to walk our transitive dependencies and ensure that
> >    we are including only one copy of every library by using exclusions
> >    effectively. This will not only bring down build times, it will make
> >    sure we know what we're including.
> >    - Try to devise a strategy where we only shade once at the end. This
> >    could look like some combination of
> >       - standardizing on the lowest common denominator of a troublesome
> >       library
> >          - We shade in dependencies so they can use different versions of
> >          libraries (e.g. metron-common with a modern version of guava)
> >          than the final jars.
> >       - exclusions
> >       - externalizing infrastructure out to not necessitate spinning up
> >       hadoop components in-process for integration tests (i.e. hbase server
> >       conflicts with storm in a few dependencies)
> >
> > *Final Thoughts*
> >
> > If I had three to pick, I'd pick
> >
> >    - moving off of the in-memory component infrastructure to docker images
> >    - fixing the maven poms to exclude correctly
> >    - ensuring the resulting tests are parallelizable
> >
> > I will point out that fixing the maven poms to exclude correctly (i.e. we
> > choose the version of every jar that we depend on transitively) ticks
> > multiple boxes, not just making things faster.
> >
> > What are your thoughts? What did I miss? We need a plan and we need to
> > execute on it soon, otherwise travis is going to keep smacking us hard. It
> > may be worthwhile constructing a tactical plan and then a more strategic
> > plan that we can work toward. I was heartened at how much some of these
> > suggestions dovetail with the discussion around the future of the docker
> > infrastructure.
> >
> > Best,
> >
> > Casey
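
P.S. On hunting for an open port rather than assuming one: a minimal sketch of one common way to do it in plain Java, by asking the OS for an ephemeral port (binding to port 0) and handing the result to whatever component the test starts. FreePortFinder and findFreePort are hypothetical names for illustration, not anything in the Metron codebase, and this assumes the in-process test components accept a port at startup.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper, not part of Metron: finds a port that is free right now.
final class FreePortFinder {

    // Binding to port 0 asks the OS to pick any unused ephemeral port.
    // The socket is closed on return so the caller can bind the port itself.
    static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        int port = findFreePort();
        // A test would pass this value to the service it spins up
        // instead of relying on a hard-coded port.
        System.out.println("Free port: " + port);
    }
}
```

There is a small window between closing the probe socket and the service binding the port, so another process could grab it first. Where a component supports it, letting the component itself bind to port 0 and then reading back the port it got avoids that race.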
