Mike, unfortunately something changed recently, and I can't run `mvn clean install -T 2C` locally anymore.

I'd like to echo that I think working on fixing the dependency issue is a very good idea. We've actually faced issues with this on the REST API PR. Working to fix this and having a standard way of including/excluding dependencies will be helpful to all, and to Ryan's point will benefit us outside of this context.

On Tue, Feb 7, 2017 at 9:36 AM, Ryan Merriman <[email protected]> wrote:

Debugging integration tests in an IDE uses the same approach with our current infrastructure or with Docker: start up the topology with LocalRunner. I've had mixed success with our current infrastructure. As Mike alluded to, some tests work fine (most of the parser topologies and the enrichment topology) while others fail when run in my IDE but work on the command line (the ES integration test due to Guava issues, and the Squid topology due to some issue with the remove subdomains Stellar function). Of course, with Docker infrastructure you will need a test runner to launch topologies in LocalRunner. They are short and simple, though, and I have one written for each topology that I can share when appropriate.

There are some advantages and disadvantages to switching the integration tests to use Docker. The infrastructure we have now works and could be adjusted to overcome its primary weaknesses (a single classloader, and start up/shutdown after each test). With Docker the classloader issue goes away for the most part (or is much better than it is now) without any extra work. For spinning services up/down once instead of with each test, we will need to adjust our tests to clean up after themselves or (even better) namespace all testing objects so that tests don't step on each other. That work would have to be done no matter which infrastructure approach we take. Probably the biggest downside to using Docker is that all integration tests will need to be adjusted, and we'll likely hit some issues that we'll need to resolve. I was bitten several times by services that broadcast their host address (Kafka, for example) and I bet we'll hit more of those. We'll also need to add a few more containers (HDFS for sure), but those are easy to create as long as you don't hit the issue I just mentioned.

I think all of the suggestions so far are good ideas. I think it goes without saying that we should do one at a time and maybe even reassess after we see the impact of each change. I would vote for doing the Maven/shading one first because it is all around beneficial, even outside of this context.
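The runners Ryan mentions aren't shown in the thread. As a rough illustration of the general idea, a minimal local runner might look like the sketch below. It assumes Storm's LocalCluster API and uses a placeholder topology rather than one of Metron's real parser or enrichment topologies.

```java
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.testing.TestWordSpout;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.utils.Utils;

/**
 * Minimal local-runner sketch: build a topology, submit it to an in-process
 * LocalCluster, let it run for a while, then shut everything down.
 * The spout and names here are placeholders, not Metron code.
 */
public class LocalTopologyRunner {

  public static void main(String[] args) throws Exception {
    // Placeholder topology; a real runner would build the topology under test.
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("test-spout", new TestWordSpout(), 1);

    Config conf = new Config();
    conf.setDebug(true);

    LocalCluster cluster = new LocalCluster();
    try {
      cluster.submitTopology("test-topology", conf, builder.createTopology());
      Utils.sleep(60_000); // give the topology time to process test data
    } finally {
      cluster.shutdown();
    }
  }
}
```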
On Tue, Feb 7, 2017 at 9:04 AM, Casey Stella <[email protected]> wrote:

I believe that some people use Travis and some people request Jenkins from Apache Infra. That being said, personally, I think we should take the opportunity to correct the underlying issues. 50 minutes for a build seems excessive to me.

On Mon, Feb 6, 2017 at 10:07 PM, Otto Fowler <[email protected]> wrote:

Is there an alternative to Travis? Do other like-sized Apache projects have these problems? Do they use Travis?

On February 6, 2017 at 17:02:37, Casey Stella ([email protected]) wrote:

For those with pending/building pull requests, it will come as no surprise that our build times are increasing at a pace that is worrisome. In fact, we hit a fundamental limit associated with Travis over the weekend. We have crept up into 40+ minute build territory, and Travis seems to error out at around 49 minutes.

Taking the current build (https://travis-ci.org/apache/incubator-metron/jobs/198929446) and looking at just the job times, we're spending about 19-20 minutes (1176.53 seconds) in tests out of 44 minutes and 42 seconds for the whole build. This places the unit tests at around 43% of the build time. I say all of this to point out that while unit tests are a portion of the build, they are not even the majority of the build time. We need an approach that addresses build performance holistically, and we need it soon.

To seed the discussion, I will point to a few things that come to mind. They fit into three broad categories:

*Tests are Slow*

- *Tactical*: We have around 13 tests that take more than 30 seconds and make up 14 minutes of the build. Looking at what we can do to speed up those tests may be worth considering as a tactical approach.
- We are spinning up the same services (e.g. Kafka, Storm) for multiple tests; instead, we could use the Docker infrastructure to spin them up once and reuse them throughout the tests.

*Tests aren't parallel*

Currently we cannot run the build in parallel because the integration test infrastructure spins up its own services that bind to the same ports. If we correct this, we can run the builds in parallel with `mvn -T`.

- Correct this by decoupling the infrastructure from the tests and refactoring the tests to run in parallel.
- Make the integration testing infrastructure bind intelligently to whatever port is available.
- Move the integration tests to their own project. This will let us run the build in parallel, since an individual project's tests will still be run serially.
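A minimal sketch of the "bind intelligently to whatever port is available" item in the list above: ask the OS for a free ephemeral port and hand it to the in-memory component. The port lookup is plain JDK; the component wiring in the trailing comment is hypothetical.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Finds a free ephemeral port so parallel test runs don't collide on fixed ports. */
public final class FreePort {

  private FreePort() {}

  /**
   * Binds to port 0 so the OS assigns an unused port, then releases it.
   * There is a small race between closing the socket and the test component
   * binding to the port, so callers should be prepared to retry.
   */
  public static int find() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }
}

// Hypothetical usage when wiring up an in-memory service for an integration test:
//   int kafkaPort = FreePort.find();
//   // start the Kafka (or ZooKeeper, Storm, etc.) test component on kafkaPort
```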
*Packaging is Painful*

We have a sensitive environment in terms of dependencies. As such, we are careful to shade and relocate dependencies that we want to isolate from our transitive dependencies. The consequence of this is that we spend a lot of time in the build shading and relocating Maven module output.

- Do the hard work of walking our transitive dependencies and ensure that we are including only one copy of every library by using exclusions effectively. This will not only bring down build times, it will make sure we know what we're including.
- Try to devise a strategy where we only shade once, at the end. This could look like some combination of:
  - standardizing on the lowest common denominator of a troublesome library
  - shading in dependencies so modules can use different versions of libraries (e.g. metron-common with a modern version of Guava) than the final jars
  - exclusions
  - externalizing infrastructure so we don't have to spin up Hadoop components in-process for integration tests (i.e. the HBase server conflicts with Storm in a few dependencies)

*Final Thoughts*

If I had three to pick, I'd pick:

- moving off of the in-memory component infrastructure to Docker images
- fixing the Maven poms to exclude correctly
- ensuring the resulting tests are parallelizable

I will point out that fixing the Maven poms to exclude correctly (i.e. we choose the version of every jar that we depend on transitively) ticks multiple boxes, not just making things faster.

What are your thoughts? What did I miss? We need a plan and we need to execute on it soon, otherwise Travis is going to keep smacking us hard. It may be worthwhile constructing a tactical plan and then a more strategic plan that we can work toward. I was heartened at how much some of these suggestions dovetail with the discussion around the future of the Docker infrastructure.

Best,

Casey
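Ryan's earlier suggestion to namespace all testing objects, so tests can share long-running services without stepping on each other, could be as simple as prefixing every topic, table, and index with a per-run identifier. A rough sketch; the object names in the usage comment are purely illustrative.

```java
import java.util.UUID;

/**
 * Generates per-test-run names so concurrent runs sharing the same
 * Kafka/HBase/ES services never collide on topics, tables, or indices.
 */
public final class TestNamespace {

  // One random prefix per JVM/test run, e.g. "t_3f9a12b4".
  private static final String PREFIX =
      "t_" + UUID.randomUUID().toString().replace("-", "").substring(0, 8);

  private TestNamespace() {}

  /** For example, name("enrichments") might return "t_3f9a12b4_enrichments". */
  public static String name(String base) {
    return PREFIX + "_" + base;
  }
}

// Hypothetical usage in a test:
//   String inputTopic = TestNamespace.name("indexing");
//   String hbaseTable = TestNamespace.name("threatintel");
```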
