[ https://issues.apache.org/jira/browse/KAFKA-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321668#comment-14321668 ]
Jay Kreps commented on KAFKA-1954: ---------------------------------- Some data: Here are the slow tests: ||Class||Tests||Duration||| |kafka.integration.UncleanLeaderElectionTest|5|52.848| |kafka.api.ConsumerTest|9|49.334| |kafka.api.test.ProducerFailureHandlingTest|11|49.213|| |kafka.admin.DeleteTopicTest|9|48.783| |kafka.integration.RollingBounceTest|1|48.382| |kafka.consumer.ZookeeperConsumerConnectorTest|6|29.052| |kafka.admin.AddPartitionsTest|5|26.375| |kafka.admin.AdminTest|12|24.464| |kafka.admin.DeleteConsumerGroupTest|7|22.477| |kafka.server.LogRecoveryTest|4|21.186| |kafka.server.AdvertiseBrokerTest|1|16.004| |kafka.producer.ProducerTest|5|15.201| |kafka.integration.PrimitiveApiTest|8|12.037| |kafka.api.test.ProducerSendTest|5|10.104| |kafka.integration.AutoOffsetResetTest|4|9.035| |kafka.producer.SyncProducerTest|8|8.344| |kafka.server.ServerGenerateBrokerIdTest|4|7.54| |kafka.server.OffsetCommitTest|3|7.422| |kafka.producer.AsyncProducerTest|13|6.781| |unit.kafka.consumer.PartitionAssignorTest|2|6.429| |kafka.server.LeaderElectionTest|2|5.677| |kafka.server.LogOffsetTest|5|5.013| |kafka.integration.TopicMetadataTest|4|4.956| |kafka.server.ServerShutdownTest|4|4.885| |kafka.api.test.ProducerCompressionTest|4|4.77| |kafka.consumer.MetricsTest|1|3.072| |kafka.consumer.ConsumerIteratorTest|2|2.49| |kafka.server.ReplicaFetchTest|1|2.467| |kafka.javaapi.consumer.ZookeeperConsumerConnectorTest|1|2.066| |kafka.server.DynamicConfigChangeTest|2|1.892| |kafka.log4j.KafkaLog4jAppenderTest|2|1.881| |kafka.log.LogManagerTest|10|1.865| |kafka.integration.FetcherTest|1|1.235| |kafka.server.ReplicaManagerTest|3|1.229| Here is the server startup and shutdown logging to get a sense of timings: {noformat} [2015-02-14 11:07:58,350] INFO Verifying properties (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,392] INFO Property broker.id is overridden to 0 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,392] INFO Property log.cleaner.enable is overridden to false (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,392] INFO Property log.dirs is overridden to /tmp/kafka-logs (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,392] INFO Property log.retention.check.interval.ms is overridden to 300000 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,392] INFO Property log.retention.hours is overridden to 168 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,392] INFO Property log.segment.bytes is overridden to 1073741824 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,392] INFO Property num.io.threads is overridden to 8 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,393] INFO Property num.network.threads is overridden to 3 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,393] INFO Property num.partitions is overridden to 1 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,393] INFO Property num.recovery.threads.per.data.dir is overridden to 1 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,393] INFO Property port is overridden to 9092 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,393] INFO Property socket.receive.buffer.bytes is overridden to 102400 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,393] INFO Property socket.request.max.bytes is overridden to 104857600 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,393] INFO Property socket.send.buffer.bytes is overridden to 102400 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,394] INFO Property zookeeper.connect is overridden to localhost:2181 (kafka.utils.VerifiableProperties) [2015-02-14 11:07:58,394] INFO Property zookeeper.connection.timeout.ms is overridden to 6000 (kafka.utils.VerifiableProperties) SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/Users/jay/work/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/Users/jay/work/kafka/core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] [2015-02-14 11:07:58,438] INFO starting (kafka.server.KafkaServer) [2015-02-14 11:07:58,441] INFO Connecting to zookeeper on localhost:2181 (kafka.server.KafkaServer) [2015-02-14 11:07:58,453] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread) [2015-02-14 11:07:58,462] INFO Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,462] INFO Client environment:host.name=10.0.0.248 (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,462] INFO Client environment:java.version=1.7.0_51 (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,462] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,462] INFO Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,462] INFO Client environment:java.class.path=:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/jopt-simple-3.2.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/log4j-1.2.16.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/lz4-1.2.0.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/metrics-core-2.2.0.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/scala-library-2.10.4.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/slf4j-api-1.7.6.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/slf4j-log4j12-1.6.1.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/slf4j-log4j12-1.7.6.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/snappy-java-1.1.1.6.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/zkclient-0.3.jar:/Users/jay/work/kafka/bin/../core/build/dependant-libs-2.10.4/zookeeper-3.4.6.jar:/Users/jay/work/kafka/bin/../examples/build/libs//kafka-examples-0.8.3-SNAPSHOT.jar:/Users/jay/work/kafka/bin/../contrib/hadoop-consumer/build/libs//kafka-hadoop-consumer-0.8.3-SNAPSHOT.jar:/Users/jay/work/kafka/bin/../contrib/hadoop-producer/build/libs//kafka-hadoop-producer-0.8.3-SNAPSHOT.jar:/Users/jay/work/kafka/bin/../clients/build/libs/kafka-clients-0.8.3-SNAPSHOT.jar:/Users/jay/work/kafka/bin/../libs/*.jar:/Users/jay/work/kafka/bin/../core/build/libs/kafka_2.10-0.8.3-SNAPSHOT.jar (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,463] INFO Client environment:java.library.path=/Users/jay/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:. (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,463] INFO Client environment:java.io.tmpdir=/var/folders/jx/tqq9jls95k547pxmjj4xpq6h0000gn/T/ (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,463] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,463] INFO Client environment:os.name=Mac OS X (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,463] INFO Client environment:os.arch=x86_64 (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,463] INFO Client environment:os.version=10.10.2 (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,463] INFO Client environment:user.name=jay (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,463] INFO Client environment:user.home=/Users/jay (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,463] INFO Client environment:user.dir=/Users/jay/work/kafka (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,464] INFO Initiating client connection, connectString=localhost:2181 sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@ef82188 (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:07:58,487] INFO Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn) [2015-02-14 11:07:58,495] INFO Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session (org.apache.zookeeper.ClientCnxn) [2015-02-14 11:07:58,530] INFO Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x14b897d43750000, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn) [2015-02-14 11:07:58,532] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient) [2015-02-14 11:07:58,629] INFO Log directory '/tmp/kafka-logs' not found, creating it. (kafka.log.LogManager) [2015-02-14 11:07:58,640] INFO Loading logs. (kafka.log.LogManager) [2015-02-14 11:07:58,649] INFO Logs loading complete. (kafka.log.LogManager) [2015-02-14 11:07:58,650] INFO Starting log cleanup with a period of 300000 ms. (kafka.log.LogManager) [2015-02-14 11:07:58,655] INFO Starting log flusher with a default period of 9223372036854775807 ms. (kafka.log.LogManager) [2015-02-14 11:07:58,657] WARN No meta.properties file under dir /tmp/kafka-logs/meta.properties (kafka.server.BrokerMetadataCheckpoint) [2015-02-14 11:07:58,703] INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.Acceptor) [2015-02-14 11:07:58,704] INFO [Socket Server on Broker 0], Started (kafka.network.SocketServer) [2015-02-14 11:07:58,731] INFO [ExpirationReaper-0], Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2015-02-14 11:07:58,731] INFO [ExpirationReaper-0], Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2015-02-14 11:07:58,791] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$) [2015-02-14 11:07:58,846] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector) [2015-02-14 11:07:58,935] INFO Registered broker 0 at path /brokers/ids/0 with address 10.0.0.248:9092. (kafka.utils.ZkUtils$) [2015-02-14 11:07:58,949] INFO [Kafka Server 0], started (kafka.server.KafkaServer) [2015-02-14 11:07:59,022] INFO New leader is 0 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener) ^C[2015-02-14 11:08:51,696] INFO [Kafka Server 0], shutting down (kafka.server.KafkaServer) [2015-02-14 11:08:51,698] INFO [Kafka Server 0], Starting controlled shutdown (kafka.server.KafkaServer) [2015-02-14 11:08:51,785] INFO [Kafka Server 0], Controlled shutdown succeeded (kafka.server.KafkaServer) [2015-02-14 11:08:51,790] INFO Deregistered broker 0 at path /brokers/ids/0. (kafka.utils.ZkUtils$) [2015-02-14 11:08:51,791] INFO [Socket Server on Broker 0], Shutting down (kafka.network.SocketServer) [2015-02-14 11:08:51,791] INFO Closing socket connection to /10.0.0.248. (kafka.network.Processor) [2015-02-14 11:08:51,823] INFO [Socket Server on Broker 0], Shutdown completed (kafka.network.SocketServer) [2015-02-14 11:08:51,824] INFO [Kafka Request Handler on Broker 0], shutting down (kafka.server.KafkaRequestHandlerPool) [2015-02-14 11:08:51,826] INFO [Kafka Request Handler on Broker 0], shut down completely (kafka.server.KafkaRequestHandlerPool) [2015-02-14 11:08:51,830] INFO [Replica Manager on Broker 0]: Shutting down (kafka.server.ReplicaManager) [2015-02-14 11:08:51,830] INFO [ReplicaFetcherManager on broker 0] shutting down (kafka.server.ReplicaFetcherManager) [2015-02-14 11:08:51,831] INFO [ReplicaFetcherManager on broker 0] shutdown completed (kafka.server.ReplicaFetcherManager) [2015-02-14 11:08:51,831] INFO [ExpirationReaper-0], Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2015-02-14 11:08:51,946] INFO [ExpirationReaper-0], Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2015-02-14 11:08:51,946] INFO [ExpirationReaper-0], Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2015-02-14 11:08:51,946] INFO [ExpirationReaper-0], Shutting down (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2015-02-14 11:08:52,151] INFO [ExpirationReaper-0], Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2015-02-14 11:08:52,151] INFO [ExpirationReaper-0], Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2015-02-14 11:08:52,158] INFO [Replica Manager on Broker 0]: Shut down completely (kafka.server.ReplicaManager) [2015-02-14 11:08:52,158] INFO Shutting down. (kafka.log.LogManager) [2015-02-14 11:08:52,166] INFO Shutdown complete. (kafka.log.LogManager) [2015-02-14 11:08:52,171] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread) [2015-02-14 11:08:52,173] INFO Session: 0x14b897d43750000 closed (org.apache.zookeeper.ZooKeeper) [2015-02-14 11:08:52,173] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn) [2015-02-14 11:08:52,173] INFO [Kafka Server 0], shut down completed (kafka.server.KafkaServer) {noformat} > Speed Up The Unit Tests > ----------------------- > > Key: KAFKA-1954 > URL: https://issues.apache.org/jira/browse/KAFKA-1954 > Project: Kafka > Issue Type: Improvement > Reporter: Jay Kreps > > The server unit tests are pretty slow. They take about 8m40s on my machine. > Combined with slow scala compile time this is kind of painful. > Almost all of this time comes from the integration tests which start one or > more brokers and then shut them down. > Our finding has been that these integration tests are actually quite useful > so we probably can't just get rid of them. > Here are some times: > Zk startup: 100ms > Kafka server startup: 600ms > Kafka server shutdown: 500ms > > So you can see that an integration test suite with 10 tests that starts and > stops a 3 node cluster for each test will take ~34 seconds even if the tests > themselves are instantaneous. > I think the best solution to this is to get the test harness classes in shape > and then performance tune them a bit as this would potentially speed > everything up. There are several test harness classes: > - ZooKeeperTestHarness > - KafkaServerTestHarness > - ProducerConsumerTestHarness > - IntegrationTestHarness (similar to ProducerConsumerTestHarness but using > new clients) > Unfortunately often tests don't use the right harness, they often use a > lower-level harness than they should and manually create stuff. Usually the > cause of this is that the harness is missing some feature. > I think the right thing to do here is > 1. Get the tests converted to the best possible harness. If you are testing > producers and consumers then you should use the harness that creates all that > and shuts it down for you. > 2. Optimize the harnesses to be faster. > How can we optimize the harnesses? I'm not sure, I would solicit ideas. Here > are a few: > 1. It's worth analyzing the logging to see what is taking up time in the > startup and shutdown. > 2. There may be things like controlled shutdown that we can disable (since we > are anyway going to discard the brokers after shutdown. > 3. The harnesses could probably start all the servers and all the clients in > parallel. > 4. We maybe able to tune down the resource usage in the server config for > test cases a bit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)