[ https://issues.apache.org/jira/browse/KAFKA-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Manikumar resolved KAFKA-2550.
------------------------------
    Resolution: Auto Closed

{color:#000000}Closing inactive issue. Old clients are deprecated. Please reopen if you think the issue still exists in newer versions.{color}

> [Kafka][0.8.2.1][Performance] When a topic has a large number of partitions, there is serious performance degradation.
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2550
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2550
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer, producer
>    Affects Versions: 0.8.2.1
>            Reporter: yanwei
>            Assignee: Neha Narkhede
>            Priority: Major
>
> Because our business needs a large number of partitions, I tested how many partitions can be supported.
> I found that when a topic has many partitions, performance degrades severely. The analysis shows that, besides the hard disk, the client itself is a bottleneck.
> Using JProfiler, I produced and consumed 1,000,000 messages (message size: 500 bytes).
>
> 1. Consumer high-level API (I was unable to upload screenshots):
> ZookeeperConsumerConnector.scala --> rebalance
>   --> val assignmentContext = new AssignmentContext(group, consumerIdString, config.excludeInternalTopics, zkClient)
>   --> ZkUtils.getPartitionsForTopics(zkClient, myTopicThreadIds.keySet.toSeq)
>   --> getPartitionAssignmentForTopics
>   --> Json.parseFull(jsonPartitionMap)
> 1) One topic, 400 partitions: JProfiler shows 48.6% CPU run time.
> 2) One topic, 3000 partitions: JProfiler shows 97.8% CPU run time.
> The data (jsonPartitionMap) may be so large that parsing it is very slow. However, this function is executed only once, so the problem should not be too serious.
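The parse cost in section 1 can be illustrated with a small sketch. This uses hypothetical data in the shape ZooKeeper stores partition assignments, not actual Kafka code; the point is only that the JSON handed to Json.parseFull during rebalance grows linearly with the partition count, so a 3000-partition topic gives the parser roughly 8x more input to walk than a 400-partition one.

```java
// Sketch (hypothetical data, not Kafka code): build a ZooKeeper-style
// partition-assignment map and compare its size at two partition counts.
public class PartitionMapSize {

    // Builds a map shaped like {"version":1,"partitions":{"0":[1,2],"1":[1,2],...}}
    // with a fixed two-broker replica list per partition.
    static String buildAssignmentJson(int partitions) {
        StringBuilder sb = new StringBuilder("{\"version\":1,\"partitions\":{");
        for (int p = 0; p < partitions; p++) {
            if (p > 0) sb.append(',');
            sb.append('"').append(p).append("\":[1,2]");
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        int small = buildAssignmentJson(400).length();
        int large = buildAssignmentJson(3000).length();
        System.out.println("400 partitions:  " + small + " bytes");
        System.out.println("3000 partitions: " + large + " bytes");
        // The 3000-partition map is roughly 8x larger, so any full parse of it
        // (such as Json.parseFull on rebalance) has roughly 8x more input.
    }
}
```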
> 2. Producer Scala API:
> BrokerPartitionInfo.scala --> getBrokerPartitionInfo:
>   partitionMetadata.map { m =>
>     m.leader match {
>       case Some(leader) =>
>         // y00163442: delete log print
>         debug("Partition [%s,%d] has leader %d".format(topic, m.partitionId, leader.id))
>         new PartitionAndLeader(topic, m.partitionId, Some(leader.id))
>       case None =>
>         // y00163442: delete log print
>         // debug("Partition [%s,%d] does not have a leader yet".format(topic, m.partitionId))
>         new PartitionAndLeader(topic, m.partitionId, None)
>     }
>   }.sortWith((s, t) => s.partitionId < t.partitionId)
>
> When the partition count is greater than 25, the 'format' function accounts for 44.8% of CPU run time, so nearly half the time is consumed in 'format'. This format call is executed whether or not debug log printing is enabled, which caused a five-fold drop in TPS (25,000 --> 5,000).
>
> 3. Producer Java client (clients module):
> Function: org.apache.kafka.clients.producer.KafkaProducer.send
> I found that the CPU run time of 'send' rises with the number of partitions; with 5000 partitions, its CPU run time is 60.8%.
> Since CPU, memory, disk, and network on the Kafka broker side did not reach a bottleneck, and the results are similar whether request.required.acks is set to 0 or 1, I suspect there may be some bottleneck in 'send'.
>
> Unfortunately the pictures failed to upload, so the profiling results cannot be seen here.
> In my test results, on a single server, a single hard disk can support 1000 partitions and 7 hard disks can support 3000 partitions. If the client bottleneck can be solved, I estimate that 7 hard disks could support even more partitions.
> In an actual production configuration, the partitions could be spread across more topics, which might make things better.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
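The eager-format cost described in section 2 can be sketched as follows. The names here (debugEager, debugLazy, buildMessage, formatCount) are illustrative, not Kafka's actual logging API; the sketch shows why a message argument that is evaluated at the call site pays the String.format cost even when debug logging is off, while a deferred (Supplier-based) message skips it entirely.

```java
import java.util.function.Supplier;

// Sketch: eager vs. deferred construction of a debug-log message.
// All names are illustrative; this is not the Kafka 0.8.2.1 logging code.
public class LazyLogDemo {
    static int formatCount = 0;                 // how many times the message was built
    static final boolean DEBUG_ENABLED = false; // debug logging is off, as in the report

    static String buildMessage(String topic, int partition) {
        formatCount++;
        return String.format("Partition [%s,%d] has leader", topic, partition);
    }

    // Eager: the argument is evaluated at the call site, logged or not.
    static void debugEager(String msg) {
        if (DEBUG_ENABLED) System.out.println(msg);
    }

    // Deferred: the Supplier is invoked only when debug logging is on.
    static void debugLazy(Supplier<String> msg) {
        if (DEBUG_ENABLED) System.out.println(msg.get());
    }

    public static void main(String[] args) {
        formatCount = 0;
        debugEager(buildMessage("t", 0));
        System.out.println("eager formatCount = " + formatCount); // 1: format ran anyway

        formatCount = 0;
        debugLazy(() -> buildMessage("t", 0));
        System.out.println("lazy formatCount = " + formatCount);  // 0: format skipped
    }
}
```

Per message and partition this cost is small, but getBrokerPartitionInfo runs it for every partition on the send path, which matches the profile showing 'format' at 44.8% of CPU time.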