[
https://issues.apache.org/jira/browse/STORM-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541253#comment-14541253
]
Nathan Leung commented on STORM-814:
------------------------------------
How is your data pushed to Kafka? It is plausible that the numbers you see are
due to max spout pending.
If you are receiving data in bursts, 8 spouts would pull it off Kafka much
faster than 1. The spout output queues would then be relatively large, and
since each tuple sits in the output queue longer, the overall latency is
higher. The burstiness would also account for the low bolt capacity figures;
typically, large spout output queues correlate with higher bolt utilization
than you're seeing.
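To put rough numbers on this, here is a back-of-the-envelope sketch using
Little's law (average time in system = items in system / throughput). It
assumes the max spout pending cap applies per spout task, that the cap is
actually reached during a burst, and that tuples complete at the 10k
messages/sec rate quoted in the issue. The class and method names are made up
for illustration; nothing here calls Storm itself.

```java
public class PendingLatencyBound {

    /**
     * Rough upper bound on average complete latency when every spout task
     * has filled its pending window (Little's law: W = L / lambda).
     *
     * Assumption: the pending cap is per spout task, so the topology-wide
     * number of un-acked tuples can reach spoutTasks * maxSpoutPending.
     */
    static double worstCaseLatencySeconds(int spoutTasks,
                                          int maxSpoutPending,
                                          double tuplesPerSecond) {
        long maxInFlight = (long) spoutTasks * maxSpoutPending;
        return maxInFlight / tuplesPerSecond;
    }

    public static void main(String[] args) {
        double rate = 10_000.0; // messages/sec, taken from the issue description

        // Pending values are the two ends of the 1000-5000 range in the issue.
        for (int pending : new int[] {1000, 5000}) {
            for (int tasks : new int[] {1, 8}) {
                System.out.printf("tasks=%d pending=%d -> up to %.1f s%n",
                        tasks, pending,
                        worstCaseLatencySeconds(tasks, pending, rate));
            }
        }
    }
}
```

Under these assumptions, one spout task allows up to 0.1 to 0.5 s of queueing,
while 8 tasks allow 0.8 to 4.0 s, so a measured 1.2 sec at parallelism 8 is
within what the pending window alone permits. This is only a bound, not a
prediction; the actual latency depends on how bursty the input is.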
Anyway, it seems odd to me to point at a very specific application profile and
then conclude that there is a bug in the spout. You should run throughput
tests with the spout only. If you pull less data off Kafka with 8 spouts than
with 1, then I would agree that something is odd. But even that (purely
hypothetical) situation would more likely lie with the Kafka client than with
the Storm implementation of the Kafka spout.
> Kafka Spout performance
> -----------------------
>
> Key: STORM-814
> URL: https://issues.apache.org/jira/browse/STORM-814
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-kafka
> Affects Versions: 0.9.3
> Environment: Centos, AWS , M3.2xlarge instances
> Reporter: Rajesh Kucharlapati
>
> I am running a few tests for a Storm topology with Kafka.
> Created a topic with 16 partitions and emitted 10k messages/sec, with each
> message 1k in size.
> When I set the Kafka spout parallelism to 1, I am getting a latency of 8.9 ms,
> but when I increase the parallelism to 8, I am getting a latency of 1.2 sec.
> Settings:
> MAX_SPOUT_PENDING set to 1000-5000
> String zkConnString = "zookeeper1:2181";
> String kafkaTopic = "messages";
> String zkRootPath = "/kafkaStorm";
> String zkOffSetID = "kafka";