+1 for option 3. I am in favor of using Docker for the integration tests for all the reasons that you mentioned.
On Fri, Mar 8, 2019 at 9:47 AM Ryan Merriman <merrim...@gmail.com> wrote:

> I have been researching the effort involved to upgrade to HDP 3. Along the
> way I've found a couple of challenging issues that we will need to solve,
> both involving our integration testing strategy.
>
> The first issue is Kafka. We are moving from 0.10.0 to 2.0.0 and there
> have been significant changes to the API. This creates an issue in the
> KafkaComponent class, which we use as an in-memory Kafka server in
> integration tests. Most of the classes that were previously used have gone
> away and, to the best of my knowledge, were not supported as public APIs.
> I also don't see any publicly documented APIs to replace them.
>
> The second issue is HBase. We are moving from 1.1.2 to 2.0.2, another
> significant change. This creates an issue in the MockHTable class
> because the HTableInterface class has changed to Table, essentially
> requiring that MockHTable be rewritten to conform to the new interface.
> It's my opinion that this class is complicated and difficult to maintain as
> it is anyway.
>
> These 2 issues have the potential to add a significant amount of work to
> upgrading Metron to HDP 3. I want to take a step back and review our
> options before we move forward. Here are some initial thoughts I had on
> how to approach this. For HBase:
>
> 1. Update MockHTable to work with the new HBase API. We would continue
>    using a mock server approach for HBase.
> 2. Research replacing MockHTable with an in-memory HBase server.
> 3. Replace MockHTable with a Docker container running HBase.
>
> For Kafka:
>
> 1. Replace KafkaComponent with a mock server implementation.
> 2. Update KafkaComponent to work with the new API. We would probably
>    need to leverage some internal Kafka classes. I do not see a testing
>    API documented publicly.
> 3. Replace KafkaComponent with a Docker container running Kafka.
>
> What other options are there?
> Whatever we choose I think we should follow
> a similar approach for both (mock servers, in-memory servers, Docker, other
> options I'm not thinking of).
>
> This will not shock anyone but I would be in favor of Docker containers.
> They have the advantage of classpath isolation, easy upgrades, and accurate
> integration testing. The downside is we will have to adjust our tests and
> Travis script to incorporate these Docker containers into our build
> process. We have discussed this at length in the past and it has generally
> stalled for various reasons. Maybe if we move a few services at a time it
> might be more palatable? As for the other 2 approaches, I think if either
> worked well we wouldn't be having this discussion. Mock servers are hard
> to maintain and I don't see in-memory testing classes documented in
> javadocs for either service.
>
> Thoughts?