About moving some streams test utils into a separate package: I think this has been requested before, and a JIRA was filed:
https://issues.apache.org/jira/browse/KAFKA-3625

Guozhang

On Tue, Aug 16, 2016 at 10:18 AM, Michael Noll <mich...@confluent.io> wrote:

> Addendum:
>
> > Unfortunately, Apache Kafka does not publish these testing facilities as
> > maven artifacts -- that's why everyone is rolling their own.
>
> Some testing facilities (like kafka.utils.TestUtils) are published via
> maven, but other helpful testing facilities are not.
>
> Since Radek provided a snippet how to pull in the artifact that includes
> k.u.TestUtils, here's the same snippet for Maven/pom.xml, with dependency
> scope set to `test`:
>
>     <dependency>
>       <groupId>org.apache.kafka</groupId>
>       <artifactId>kafka_2.11</artifactId>
>       <version>0.10.0.0</version>
>       <classifier>test</classifier>
>       <scope>test</scope>
>     </dependency>
>
> On Tue, Aug 16, 2016 at 7:14 PM, Michael Noll <mich...@confluent.io> wrote:
>
> > Mathieu,
> >
> > FWIW here are some pointers to run embedded Kafka/ZK instances for
> > integration testing. The second block of references below uses Curator's
> > TestingServer for running embedded ZK instances. See also the relevant
> > pom.xml for how the integration tests are being run (e.g. disabled JVM
> > reusage to ensure test isolation).
> >
> > Unfortunately, Apache Kafka does not publish these testing facilities as
> > maven artifacts -- that's why everyone is rolling their own.
> >
> > In Apache Kafka:
> >
> > Helper classes (e.g. embedded Kafka)
> > https://github.com/apache/kafka/tree/trunk/streams/src/test/java/org/apache/kafka/streams/integration/utils
> >
> > Integration test example:
> > https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/FanoutIntegrationTest.java
> >
> > Also, for kafka.utils.TestUtils usage:
> > https://github.com/apache/kafka/blob/trunk/core/src/test/scala/integration/kafka/api/IntegrationTestHarness.scala
> >
> > In confluentinc/examples:
> >
> > Helper classes (e.g.
> > embedded Kafka, embedded Confluent Schema Registry for Avro testing)
> > https://github.com/confluentinc/examples/tree/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/kafka
> >
> > Some more sophisticated integration tests:
> > https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/WordCountLambdaIntegrationTest.java
> > https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/SpecificAvroIntegrationTest.java
> >
> > Best,
> > Michael
> >
> > On Tue, Aug 16, 2016 at 3:36 PM, Mathieu Fenniak <mathieu.fenn...@replicon.com> wrote:
> >
> >> Hi Radek,
> >>
> >> No, I'm not familiar with these tools. I see that Curator's TestingServer
> >> looks pretty straightforward, but I'm not really sure what
> >> kafka.utils.TestUtils is. I can't find any documentation referring to it,
> >> and it doesn't seem to be a part of any published maven artifacts in the
> >> Kafka project; can you point me at what you're using a little more
> >> specifically?
> >>
> >> Mathieu
> >>
> >> On Mon, Aug 15, 2016 at 2:39 PM, Radoslaw Gruchalski <ra...@gruchalski.com> wrote:
> >>
> >> > Out of curiosity, are you aware of kafka.utils.TestUtils and Apache
> >> > Curator TestingServer?
> >> > I’m using this successfully to test publish / consume scenarios with
> >> > things like Flink, Spark and custom apps.
> >> > What would stop you from taking the same approach?
> >> >
> >> > –
> >> > Best regards,
> >> > Radek Gruchalski
> >> > ra...@gruchalski.com
> >> >
> >> > On August 15, 2016 at 9:41:37 PM, Mathieu Fenniak (mathieu.fenn...@replicon.com) wrote:
> >> >
> >> > Hi Michael,
> >> >
> >> > It would definitely be an option.
> >> > I am not currently doing any testing like that; it could replace the
> >> > ProcessorTopologyTestDriver-style testing that I'd like to do, but
> >> > there are some trade-offs to consider:
> >> >
> >> > - I can't do an isolated test of just the TopologyBuilder; I'd be
> >> >   bringing in configuration management code (e.g. configuring where to
> >> >   access ZK + Kafka).
> >> > - Tests using a running Kafka server wouldn't have a clear end-point; if
> >> >   something in the topology doesn't publish a message where I expected
> >> >   it to, my test can only fail via a timeout.
> >> > - Tests are likely to be slower; this might not be significant, but a
> >> >   small difference in test speed has a big impact on productivity after
> >> >   a few months of development.
> >> > - Tests will be more complex & fragile; some additional component needs
> >> >   to manage starting up that Kafka server, making sure it's ready-to-go,
> >> >   running tests, and then tearing it down.
> >> > - Tests will have to be cautious of state existing in Kafka, e.g. two
> >> >   test suites that touch the same topics could be influenced by the
> >> >   state of a previous test. Either you take a "destroy the world"
> >> >   approach between test cases (or test suites), which probably makes
> >> >   test speed much worse, or you find another way to isolate each test's
> >> >   state.
> >> >
> >> > I'd have to face all these problems at the higher level that I'm calling
> >> > "systems-level tests", but I think it would be better to do the majority
> >> > of the automated testing at a lower level that doesn't bring these
> >> > considerations into play.
> >> >
> >> > Mathieu
> >> >
> >> > On Mon, Aug 15, 2016 at 12:13 PM, Michael Noll <mich...@confluent.io> wrote:
> >> >
> >> > > Mathieu,
> >> > >
> >> > > follow-up question: Are you also doing or considering integration
> >> > > testing by spawning a local Kafka cluster and then reading/writing to
> >> > > that cluster (often called embedded or in-memory cluster)? This
> >> > > approach would be in the middle between ProcessorTopologyTestDriver
> >> > > (that does not spawn a Kafka cluster) and your system-level testing
> >> > > (which I suppose is running against a "real" test Kafka cluster).
> >> > >
> >> > > -Michael
> >> > >
> >> > > On Mon, Aug 15, 2016 at 3:44 PM, Mathieu Fenniak <mathieu.fenn...@replicon.com> wrote:
> >> > >
> >> > > > Hey all,
> >> > > >
> >> > > > At my workplace, we have a real focus on software automated testing.
> >> > > > I'd love to be able to test the composition of a TopologyBuilder with
> >> > > > org.apache.kafka.test.ProcessorTopologyTestDriver
> >> > > > <https://github.com/apache/kafka/blob/14934157df7aaf5e9c37a302ef9fd9317b95efa4/streams/src/test/java/org/apache/kafka/test/ProcessorTopologyTestDriver.java>;
> >> > > > has there ever been any thought given to making this part of the
> >> > > > public API of Kafka Streams?
> >> > > >
> >> > > > For some background, here are some details on the automated testing
> >> > > > plan that I have in mind for a Kafka Streams application. Our goal is
> >> > > > to enable continuous deployment of any new development we do, so, it
> >> > > > has to be rigorously tested with complete automation.
> >> > > >
> >> > > > As part of our pre-commit testing, we'd first have these gateways; no
> >> > > > code would reach our master branch without passing these tests:
> >> > > >
> >> > > > - At the finest level, unit tests covering individual pieces like a
> >> > > >   Serde, ValueMapper, ValueJoiner, aggregate adder/subtractor, etc.
> >> > > >   These pieces are very isolated, very easy to unit test.
> >> > > > - At a higher level, I'd like to have component tests of the
> >> > > >   composition of the TopologyBuilder; this is where
> >> > > >   ProcessorTopologyTestDriver would be valuable. There'd be far fewer
> >> > > >   of these tests than the lower-level tests. There are no external
> >> > > >   dependencies to these tests, so they'd be very fast.
> >> > > >
> >> > > > Having passed that level of testing, we'd deploy the Kafka Streams
> >> > > > application to an integration testing area where the rest of our
> >> > > > application is kept up-to-date, and proceed with these integration
> >> > > > tests:
> >> > > >
> >> > > > - Systems-level tests where we synthesize inputs to the Kafka topics,
> >> > > >   wait for the Streams app to process the data, and then inspect the
> >> > > >   output that it pushes into other Kafka topics. These tests will be
> >> > > >   fewer in nature than the above tests, but they serve to ensure that
> >> > > >   the application is well-configured, executing, and handling inputs
> >> > > >   & outputs as expected.
> >> > > > - UI-level tests where we verify behaviors that are expected from the
> >> > > >   system as a whole. As our application is a web app, we'd be using
> >> > > >   Selenium to drive a web browser and verifying interactions and
> >> > > >   outputs that are expected from the Streams application matching our
> >> > > >   real-world use-cases. These tests are even fewer in nature than the
> >> > > >   above.
> >> > > >
> >> > > > This is an adaptation of the automated testing scaffold that we
> >> > > > currently use for microservices; I'd love any input on the plan as a
> >> > > > whole.
> >> > > >
> >> > > > Thanks,
> >> > > >
> >> > > > Mathieu

--
-- Guozhang
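
On the trade-off raised in the thread that a test against a running Kafka server "can only fail via a timeout": a minimal sketch of such a bounded wait, using only the JDK. The names PollingAssert and waitForRecords are hypothetical and not part of Kafka. The supplier stands in for whatever drains the output topic (for example a consumer poll), and it is assumed to return only records that have not been returned before.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of Kafka: collects records from a polling
// function until the expected count is reached or the timeout expires.
final class PollingAssert {

    private PollingAssert() {}

    static <T> List<T> waitForRecords(Supplier<List<T>> poll,
                                      int expectedCount,
                                      Duration timeout,
                                      Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        List<T> received = new ArrayList<>();
        while (true) {
            // Assumption: each call returns only records not seen before.
            received.addAll(poll.get());
            if (received.size() >= expectedCount) {
                return received;
            }
            // Overflow-safe comparison against the monotonic clock.
            if (System.nanoTime() - deadline >= 0) {
                throw new AssertionError("Expected " + expectedCount
                        + " records but got " + received.size()
                        + " within " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and fail the test.
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting for records", e);
            }
        }
    }
}
```

A test would call something like waitForRecords(pollFn, 3, Duration.ofSeconds(30), Duration.ofMillis(100)) and then assert on the returned list. The failure message at least reports how many records arrived before the deadline, which helps when diagnosing a timeout.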