[ 
https://issues.apache.org/jira/browse/KAFKA-13007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Kim updated KAFKA-13007:
-----------------------------
    Description: 
From [KafkaAdminClient#getListOffsetsCalls|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/admin/KafkaAdminClient.java#L4215]:

```
for (Map.Entry<TopicPartition, OffsetSpec> entry : topicPartitionOffsets.entrySet()) {
    ...
    Node node = mr.cluster().leaderFor(tp);
```

Here mr.cluster() builds a new cluster snapshot on every iteration, i.e. once 
for each topic partition. Instead, we should build the snapshot once before the 
loop and reuse it. This will reduce the time complexity from O(n^2) to O(n).
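
To illustrate the idea, here is a minimal toy sketch, not the actual KafkaAdminClient code. FakeMetadataResponse, its cluster() method, and the string partition names are hypothetical stand-ins; the only assumption carried over from the report is that each cluster() call rebuilds the whole snapshot. It counts how many snapshots each loop shape builds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: FakeMetadataResponse is a hypothetical stand-in, not a Kafka class.
class SnapshotReuseSketch {

    static final class FakeMetadataResponse {
        private final List<String> partitions;
        int snapshotsBuilt = 0; // counts how often cluster() rebuilt the snapshot

        FakeMetadataResponse(List<String> partitions) {
            this.partitions = partitions;
        }

        // Builds a fresh partition -> leader map on every call, so each call is O(n).
        Map<String, Integer> cluster() {
            snapshotsBuilt++;
            Map<String, Integer> leaders = new HashMap<>();
            for (int i = 0; i < partitions.size(); i++) {
                leaders.put(partitions.get(i), i % 3); // made-up leader ids
            }
            return leaders;
        }
    }

    // Generates n made-up partition names.
    static List<String> partitions(int n) {
        List<String> tps = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            tps.add("topic-" + i);
        }
        return tps;
    }

    // Pattern from the report: one snapshot per partition, O(n^2) overall.
    static int resolveWithoutReuse(FakeMetadataResponse mr, List<String> tps) {
        int resolved = 0;
        for (String tp : tps) {
            if (mr.cluster().get(tp) != null) {
                resolved++;
            }
        }
        return resolved;
    }

    // Proposed pattern: build the snapshot once and reuse it, O(n) overall.
    static int resolveWithReuse(FakeMetadataResponse mr, List<String> tps) {
        Map<String, Integer> snapshot = mr.cluster();
        int resolved = 0;
        for (String tp : tps) {
            if (snapshot.get(tp) != null) {
                resolved++;
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        List<String> tps = partitions(2000);
        FakeMetadataResponse perPartition = new FakeMetadataResponse(tps);
        FakeMetadataResponse reused = new FakeMetadataResponse(tps);
        resolveWithoutReuse(perPartition, tps);
        resolveWithReuse(reused, tps);
        System.out.println("snapshots built without reuse: " + perPartition.snapshotsBuilt);
        System.out.println("snapshots built with reuse: " + reused.snapshotsBuilt);
    }
}
```

Both loops resolve the same leaders; the only difference is whether the snapshot is built inside or outside the loop.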

For manual testing (using AK 2.8), I passed a map of 6K topic partitions 
to listOffsets:

without snapshot reuse:
 duration of building futures from metadata response: *15582* milliseconds
 total duration of listOffsets: 15743 milliseconds

with reuse:
 duration of building futures from metadata response: *24* milliseconds
 total duration of listOffsets: 235 milliseconds

Affects all versions since Admin & KafkaAdminClient introduced listOffsets 
(original PR: [https://github.com/apache/kafka/pull/7296])

  was:
From KafkaAdminClient#getListOffsetsCalls [line of code|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/admin/KafkaAdminClient.java#L4215]

```
for (Map.Entry<TopicPartition, OffsetSpec> entry: 
topicPartitionOffsets.entrySet()) {
            ...
                Node node = mr.cluster().leaderFor(tp);
```

here we build the cluster snapshot for each topic partition. instead, we should 
reuse a snapshot. this will reduce the time complexity from O(n^2) to O(n).


for manual testing (used AK 2.8), i've passed in a map of 6K topic partitions 
to listOffsets

without snapshot reuse:
duration of building futures from metadata response: 15582 milliseconds
total duration of listOffsets: 15743 milliseconds

with reuse:
duration of building futures from metadata response: 24 milliseconds
total duration of listOffsets: 235 milliseconds

Affects all versions since Admin & KafkaAdminClient introduced listOffsets 
(original PR: https://github.com/apache/kafka/pull/7296)


> KafkaAdminClient getListOffsetsCalls builds cluster snapshot for every topic 
> partition
> --------------------------------------------------------------------------------------
>
>                 Key: KAFKA-13007
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13007
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.8.0
>            Reporter: Jeff Kim
>            Assignee: Jeff Kim
>            Priority: Blocker



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
