Balint Molnar commented on KAFKA-1954:
I realized nearly every test case recreates the server infrastructure
(Kafka/ZooKeeper) before it runs, even when that is not needed, so first I
would like to refactor the classes to restart the infrastructure only when required.
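A minimal sketch of that refactoring idea, in Python's unittest rather than the actual Kafka harness classes. FakeCluster, PerTestHarness, SharedHarness and count_starts are hypothetical names used only for illustration; the point is the difference between a per-test fixture and a per-class fixture.

```python
import unittest


class FakeCluster:
    """Hypothetical stand-in for the Kafka/ZooKeeper test infrastructure."""

    starts = 0  # how many times the infrastructure has been started

    def start(self):
        FakeCluster.starts += 1

    def stop(self):
        pass


class PerTestHarness(unittest.TestCase):
    """Recreates the infrastructure before every test case."""

    def setUp(self):
        self.cluster = FakeCluster()
        self.cluster.start()

    def tearDown(self):
        self.cluster.stop()

    def test_a(self):
        pass

    def test_b(self):
        pass

    def test_c(self):
        pass


class SharedHarness(unittest.TestCase):
    """Starts the infrastructure once and reuses it for every test case."""

    @classmethod
    def setUpClass(cls):
        cls.cluster = FakeCluster()
        cls.cluster.start()

    @classmethod
    def tearDownClass(cls):
        cls.cluster.stop()

    def test_a(self):
        pass

    def test_b(self):
        pass

    def test_c(self):
        pass


def count_starts(case):
    """Run one TestCase class and return how many cluster starts it caused."""
    FakeCluster.starts = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    suite.run(unittest.TestResult())
    return FakeCluster.starts
```

With three test cases, the per-test variant starts the fake cluster three times and the shared variant once. Sharing is only safe when the tests do not depend on a fresh broker state, which is an assumption each test class would need to be checked against.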
> Speed Up The Unit Tests
> Key: KAFKA-1954
> URL: https://issues.apache.org/jira/browse/KAFKA-1954
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jay Kreps
> Assignee: Sriharsha Chintalapani
> Labels: newbie++
> Attachments: KAFKA-1954.patch
> The server unit tests are pretty slow. They take about 8m40s on my machine.
> Combined with slow Scala compile times this is kind of painful.
> Almost all of this time comes from the integration tests which start one or
> more brokers and then shut them down.
> Our finding has been that these integration tests are actually quite useful
> so we probably can't just get rid of them.
> Here are some times:
> Zk startup: 100ms
> Kafka server startup: 600ms
> Kafka server shutdown: 500ms
> So you can see that an integration test suite with 10 tests that starts and
> stops a 3 node cluster for each test will take ~34 seconds even if the tests
> themselves are instantaneous.
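The ~34 second figure can be reproduced from the quoted timings. This is a rough sketch that assumes brokers start and stop one after another, that ZooKeeper is started once per test, and that ZooKeeper shutdown costs nothing (no figure is given for it). The function name harness_overhead_ms and its parameters are hypothetical.

```python
# Timings quoted in the issue description, in milliseconds.
ZK_STARTUP_MS = 100
BROKER_STARTUP_MS = 600
BROKER_SHUTDOWN_MS = 500


def harness_overhead_ms(tests, brokers, restart_per_test=True):
    """Estimate time spent only on starting and stopping the infrastructure.

    Assumes sequential broker startup/shutdown and ignores ZooKeeper shutdown.
    """
    per_cycle = ZK_STARTUP_MS + brokers * (BROKER_STARTUP_MS + BROKER_SHUTDOWN_MS)
    cycles = tests if restart_per_test else 1
    return cycles * per_cycle


if __name__ == "__main__":
    print(harness_overhead_ms(tests=10, brokers=3) / 1000, "seconds")
    print(harness_overhead_ms(tests=10, brokers=3, restart_per_test=False) / 1000, "seconds")
```

Each start/stop cycle costs 100 + 3 * (600 + 500) = 3400 ms, so ten cycles cost 34 seconds, while a single shared cycle would cost 3.4 seconds under the same assumptions.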
> I think the best solution to this is to get the test harness classes in shape
> and then performance tune them a bit as this would potentially speed
> everything up. There are several test harness classes:
> - ZooKeeperTestHarness
> - KafkaServerTestHarness
> - ProducerConsumerTestHarness
> - IntegrationTestHarness (similar to ProducerConsumerTestHarness but using
> new clients)
> Unfortunately tests often don't use the right harness: they use a
> lower-level harness than they should and create things manually. Usually the
> cause of this is that the harness is missing some feature.
> I think the right thing to do here is:
> 1. Get the tests converted to the best possible harness. If you are testing
> producers and consumers then you should use the harness that creates all that
> and shuts it down for you.
> 2. Optimize the harnesses to be faster.
> How can we optimize the harnesses? I'm not sure; I would solicit ideas. Here
> are a few:
> 1. It's worth analyzing the logging to see what is taking up time in the
> startup and shutdown.
> 2. There may be things like controlled shutdown that we can disable (since we
> are going to discard the brokers after shutdown anyway).
> 3. The harnesses could probably start all the servers and all the clients in
> 4. We may be able to tune down the resource usage in the server config for
> test cases a bit.
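As a complement to reading the logs (idea 1), the startup and shutdown phases could also be timed directly. The sketch below is generic instrumentation and not part of any Kafka harness; the timed helper, the timings dictionary and the phase names are all hypothetical.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per named phase.
timings = {}


@contextmanager
def timed(phase):
    """Add the elapsed time of the wrapped block to timings[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[phase] = timings.get(phase, 0.0) + elapsed


if __name__ == "__main__":
    # The sleeps stand in for real startup and shutdown calls.
    with timed("broker startup"):
        time.sleep(0.02)
    with timed("broker shutdown"):
        time.sleep(0.01)
    for phase, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
        print(f"{phase}: {seconds * 1000:.1f} ms")
```

Wrapping each harness step this way would show which phase dominates before deciding what to tune.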