Re: Automated Testing w/ Kafka Streams

Michael Noll Tue, 16 Aug 2016 10:20:12 -0700

Addendum:

> Unfortunately, Apache Kafka does not publish these testing facilities as
> maven artifacts -- that's why everyone is rolling their own.


Some testing facilities (like kafka.utils.TestUtils) are published via
maven, but other helpful testing facilities are not.

Since Radek provided a snippet showing how to pull in the artifact that
includes k.u.TestUtils, here's the equivalent snippet for Maven/pom.xml, with
the dependency scope set to `test`:

  <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.11</artifactId>
      <version>0.10.0.0</version>
      <classifier>test</classifier>
      <scope>test</scope>
  </dependency>
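
On the trade-offs Mathieu lists in the quoted thread below (a test against a
running broker can only fail via a timeout, and topic state can leak between
tests): here is a rough Java sketch of two small helpers that are commonly used
to soften both.  Everything in it is an assumption for illustration;
`IntegrationTestSupport`, `waitUntil` and `uniqueTopic` are hypothetical names
and not part of Kafka or of the helper classes linked below.

```java
import java.time.Duration;
import java.util.UUID;
import java.util.function.BooleanSupplier;

// Hypothetical helper class, not part of Kafka or any published artifact.
final class IntegrationTestSupport {

    private IntegrationTestSupport() {
    }

    // Polls `condition` until it returns true or `timeout` has elapsed.
    // Returns true if the condition was met, false on timeout or interrupt,
    // so the calling test can fail with its own descriptive message.
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    // Assumed naming scheme: appends a random UUID so that each test works on
    // its own topic and cannot see records left behind by an earlier test.
    static String uniqueTopic(String prefix) {
        return prefix + "-" + UUID.randomUUID();
    }
}
```

A test would then name its topics via `uniqueTopic("word-count-input")` and
assert on `waitUntil(...)` with whatever consumer-side check it needs, which
keeps the timeout explicit without requiring a "destroy the world" reset
between test cases.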



On Tue, Aug 16, 2016 at 7:14 PM, Michael Noll <mich...@confluent.io> wrote:

> Mathieu,
>
> FWIW here are some pointers for running embedded Kafka/ZK instances for
> integration testing.  The second block of references below uses Curator's
> TestingServer for running embedded ZK instances.  See also the relevant
> pom.xml for how the integration tests are being run (e.g. with JVM reuse
> disabled to ensure test isolation).
>
> Unfortunately, Apache Kafka does not publish these testing facilities as
> maven artifacts -- that's why everyone is rolling their own.
>
> In Apache Kafka:
>
>     Helper classes (e.g. embedded Kafka)
>     https://github.com/apache/kafka/tree/trunk/streams/src/test/java/org/apache/kafka/streams/integration/utils
>
>     Integration test example:
>     https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/FanoutIntegrationTest.java
>
>     Also, for kafka.utils.TestUtils usage:
>     https://github.com/apache/kafka/blob/trunk/core/src/test/scala/integration/kafka/api/IntegrationTestHarness.scala
>
> In confluentinc/examples:
>
>     Helper classes (e.g. embedded Kafka, embedded Confluent Schema Registry
>     for Avro testing)
>     https://github.com/confluentinc/examples/tree/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/kafka
>
>     Some more sophisticated integration tests:
>     https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/WordCountLambdaIntegrationTest.java
>     https://github.com/confluentinc/examples/blob/kafka-0.10.0.0-cp-3.0.0/kafka-streams/src/test/java/io/confluent/examples/streams/SpecificAvroIntegrationTest.java
>
> Best,
> Michael
>
>
>
>
> On Tue, Aug 16, 2016 at 3:36 PM, Mathieu Fenniak <
> mathieu.fenn...@replicon.com> wrote:
>
>> Hi Radek,
>>
>> No, I'm not familiar with these tools.  I see that Curator's TestingServer
>> looks pretty straightforward, but I'm not really sure what
>> kafka.utils.TestUtils is.  I can't find any documentation referring to it,
>> and it doesn't seem to be a part of any published maven artifacts in the
>> Kafka project; can you point me at what you're using a little more
>> specifically?
>>
>> Mathieu
>>
>>
>> On Mon, Aug 15, 2016 at 2:39 PM, Radoslaw Gruchalski <
>> ra...@gruchalski.com>
>> wrote:
>>
>> > Out of curiosity, are you aware of kafka.utils.TestUtils and Apache
>> > Curator TestingServer?
>> > I’m using these successfully to test publish / consume scenarios with
>> > things like Flink, Spark and custom apps.
>> > What would stop you from taking the same approach?
>> >
>> > –
>> > Best regards,
>> > Radek Gruchalski
>> > ra...@gruchalski.com
>> >
>> >
>> > On August 15, 2016 at 9:41:37 PM, Mathieu Fenniak (
>> > mathieu.fenn...@replicon.com) wrote:
>> >
>> > Hi Michael,
>> >
>> > It would definitely be an option. I am not currently doing any testing
>> > like that; it could replace the ProcessorTopologyTestDriver-style testing
>> > that I'd like to do, but there are some trade-offs to consider:
>> >
>> > - I can't do an isolated test of just the TopologyBuilder; I'd be
>> > bringing in configuration management code (e.g. configuring where to
>> > access ZK + Kafka).
>> > - Tests using a running Kafka server wouldn't have a clear end-point; if
>> > something in the topology doesn't publish a message where I expected it
>> > to, my test can only fail via a timeout.
>> > - Tests are likely to be slower; this might not be significant, but a
>> > small difference in test speed has a big impact on productivity after a
>> > few months of development.
>> > - Tests will be more complex & fragile; some additional component needs
>> > to manage starting up that Kafka server, making sure it's ready-to-go,
>> > running tests, and then tearing it down.
>> > - Tests will have to be cautious of state existing in Kafka, e.g. two
>> > test suites that touch the same topics could be influenced by the state
>> > of a previous test. Either you take a "destroy the world" approach
>> > between test cases (or test suites), which probably makes test speed much
>> > worse, or you find another way to isolate each test's state.
>> >
>> > I'd have to face all these problems at the higher level that I'm calling
>> > "systems-level tests", but I think it would be better to do the majority
>> > of the automated testing at a lower level that doesn't bring these
>> > considerations into play.
>> >
>> > Mathieu
>> >
>> >
>> > On Mon, Aug 15, 2016 at 12:13 PM, Michael Noll <mich...@confluent.io>
>> > wrote:
>> >
>> > > Mathieu,
>> > >
>> > > follow-up question: Are you also doing or considering integration
>> > > testing by spawning a local Kafka cluster and then reading/writing to
>> > > that cluster (often called an embedded or in-memory cluster)? This
>> > > approach would sit between ProcessorTopologyTestDriver (which does not
>> > > spawn a Kafka cluster) and your system-level testing (which I suppose
>> > > is running against a "real" test Kafka cluster).
>> > >
>> > > -Michael
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Mon, Aug 15, 2016 at 3:44 PM, Mathieu Fenniak <
>> > > mathieu.fenn...@replicon.com> wrote:
>> > >
>> > > > Hey all,
>> > > >
>> > > > At my workplace, we have a real focus on automated software testing.
>> > > > I'd love to be able to test the composition of a TopologyBuilder with
>> > > > org.apache.kafka.test.ProcessorTopologyTestDriver
>> > > > <https://github.com/apache/kafka/blob/14934157df7aaf5e9c37a302ef9fd9317b95efa4/streams/src/test/java/org/apache/kafka/test/ProcessorTopologyTestDriver.java>;
>> > > > has there ever been any thought given to making this part of the
>> > > > public API of Kafka Streams?
>> > > >
>> > > > For some background, here are some details on the automated testing
>> > > > plan that I have in mind for a Kafka Streams application. Our goal is
>> > > > to enable continuous deployment of any new development we do, so it
>> > > > has to be rigorously tested with complete automation.
>> > > >
>> > > > As part of our pre-commit testing, we'd first have these gateways; no
>> > > > code would reach our master branch without passing these tests:
>> > > >
>> > > > - At the finest level, unit tests covering individual pieces like a
>> > > > Serde, ValueMapper, ValueJoiner, aggregate adder/subtractor, etc.
>> > > > These pieces are very isolated, very easy to unit test.
>> > > > - At a higher level, I'd like to have component tests of the
>> > > > composition of the TopologyBuilder; this is where
>> > > > ProcessorTopologyTestDriver would be valuable. There'd be far fewer of
>> > > > these tests than the lower-level tests. There are no external
>> > > > dependencies to these tests, so they'd be very fast.
>> > > >
>> > > > Having passed that level of testing, we'd deploy the Kafka Streams
>> > > > application to an integration testing area where the rest of our
>> > > > application is kept up-to-date, and proceed with these integration
>> > > > tests:
>> > > >
>> > > > - Systems-level tests where we synthesize inputs to the Kafka topics,
>> > > > wait for the Streams app to process the data, and then inspect the
>> > > > output that it pushes into other Kafka topics. These tests will be
>> > > > fewer in number than the above tests, but they serve to ensure that
>> > > > the application is well-configured, executing, and handling inputs &
>> > > > outputs as expected.
>> > > > - UI-level tests where we verify behaviors that are expected from the
>> > > > system as a whole. As our application is a web app, we'd be using
>> > > > Selenium to drive a web browser and verifying interactions and outputs
>> > > > that are expected from the Streams application matching our real-world
>> > > > use-cases. These tests are even fewer in number than the above.
>> > > >
>> > > > This is an adaptation of the automated testing scaffold that we
>> > > > currently use for microservices; I'd love any input on the plan as a
>> > > > whole.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Mathieu
>> > > >
>> > >
>> >
>> >
>>
>
>
