[
https://issues.apache.org/jira/browse/KAFKA-734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Neha Narkhede updated KAFKA-734:
--------------------------------
Attachment: kafka-734-v1.patch
A few changes to the migration tool:
1. Do not share the list of producers amongst all the consumer threads. Since
every consumer thread used the circular iterator over the same ordered list of
producers, they spent most of their time locking the producer queues. This is
fixed by partitioning the list of producers amongst the consumer threads.
Intuitively, it seems that it is enough to have one consumer thread use one
producer, but since the producer spends a significant amount of time waiting
for an ack from the broker, it is useful to have multiple producers per
consumer thread on the migration tool.
2. There was an unprotected debug statement in the MigrationThread that
caused the thread to spend time blocked on log4j.
3. There is still another problem that is not related directly to the migration
tool, but is a problem with the rebalancing logic in the consumer for wildcard
subscription. I tried firing up 2 migration tool instances, each with 32
consumer threads to migrate ~300 topics, most of which had 1 partition. Since
the rebalancing logic is per topic, it causes the 1st migration tool to take
almost all of the load. This is a bug in the rebalancing logic that prevents
us from horizontally scaling out consumption in the presence of multiple
topics, each with a small number of partitions. Will file another bug to track
this.
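
To illustrate change 1, here is a minimal sketch of splitting one shared producer list into disjoint per-thread slices. It is not the code in the attached patch: ProducerPartitioner and partition are hypothetical names, the round-robin split is just one possible way to partition, and plain strings stand in for real producer objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not taken from kafka-734-v1.patch.
final class ProducerPartitioner {

    /**
     * Deals the producers round-robin into numThreads disjoint groups, so that
     * each consumer thread iterates only over its own producers and never
     * contends with other threads for the same producer queue.
     */
    static <P> List<List<P>> partition(List<P> producers, int numThreads) {
        if (numThreads <= 0) {
            throw new IllegalArgumentException("numThreads must be positive");
        }
        List<List<P>> groups = new ArrayList<>();
        for (int t = 0; t < numThreads; t++) {
            groups.add(new ArrayList<>());
        }
        for (int i = 0; i < producers.size(); i++) {
            // Producer i goes to thread (i mod numThreads).
            groups.get(i % numThreads).add(producers.get(i));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Six placeholder producers shared out between three consumer threads.
        List<String> producers = Arrays.asList("p0", "p1", "p2", "p3", "p4", "p5");
        List<List<String>> groups = partition(producers, 3);
        for (int t = 0; t < groups.size(); t++) {
            System.out.println("thread-" + t + " -> " + groups.get(t));
        }
    }
}
```

With more producers than threads, each thread gets several producers, which matches the point above that a thread should have more than one producer to hide the time spent waiting for broker acks.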
> Migration tool needs a revamp, it was poorly written and has many performance
> bugs
> ----------------------------------------------------------------------------------
>
> Key: KAFKA-734
> URL: https://issues.apache.org/jira/browse/KAFKA-734
> Project: Kafka
> Issue Type: Bug
> Components: tools
> Affects Versions: 0.8
> Reporter: Neha Narkhede
> Assignee: Neha Narkhede
> Priority: Blocker
> Labels: p1
> Attachments: kafka-734-v1.patch
>
>
> Migration tool has a number of problems ranging from poor logging to poor
> design. This needs to be thought through again.