[GitHub] [hudi] bvaradar commented on a diff in pull request #8376: [HUDI-6019] support config minPartitions when reading from kafka

via GitHub Sat, 15 Apr 2023 13:42:07 -0700


bvaradar commented on code in PR #8376:
URL: https://github.com/apache/hudi/pull/8376#discussion_r1167635549



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java:
##########
@@ -148,9 +166,58 @@ public static OffsetRange[] computeOffsetRanges(Map<TopicPartition, Long> fromOf
         }

Review Comment:
   @waitingF: Considering your example (which assumes MAX_EVENTS_PER_BATCH 
= 300), I do not follow your reasoning for why the result is not evenly 
distributed. In your case, doesn't Partition 0 have only 100 events 
(0 -> 100), so that it is exhausted after the first range? If so, the 3rd 
Spark partition has to be filled with events from TP 1. 
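
   To make the arithmetic concrete, here is a minimal sketch of the allocation 
I have in mind. It is not the code in KafkaOffsetGen.computeOffsetRanges; the 
class OffsetSplitSketch, the method split, and the Range type are hypothetical 
names used only for illustration. It also assumes minPartitions = 3 and that 
TP 1 has far more than 200 events available (1000 here), neither of which is 
stated above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: NOT Hudi's KafkaOffsetGen implementation.
final class OffsetSplitSketch {

    // One Spark partition's share: Kafka partition id plus a half-open offset range [from, until).
    static final class Range {
        final int partition;
        final long from;
        final long until;

        Range(int partition, long from, long until) {
            this.partition = partition;
            this.from = from;
            this.until = until;
        }

        @Override
        public String toString() {
            return "TP " + partition + ": [" + from + ", " + until + ")";
        }
    }

    // Round-robin over Kafka partitions, handing out chunks of at most
    // ceil(maxEvents / minPartitions) events until maxEvents are allocated
    // or every partition is exhausted. Assumes minPartitions > 0.
    static List<Range> split(long[] fromOffsets, long[] latestOffsets,
                             long maxEvents, int minPartitions) {
        long[] cursor = fromOffsets.clone();
        long perRange = Math.max(1, (maxEvents + minPartitions - 1) / minPartitions);
        List<Range> ranges = new ArrayList<>();
        long allocated = 0;
        boolean progressed = true;
        while (allocated < maxEvents && progressed) {
            progressed = false;
            for (int p = 0; p < cursor.length && allocated < maxEvents; p++) {
                long remaining = latestOffsets[p] - cursor[p];
                if (remaining <= 0) {
                    continue; // this Kafka partition is exhausted, skip it
                }
                long take = Math.min(perRange, Math.min(remaining, maxEvents - allocated));
                ranges.add(new Range(p, cursor[p], cursor[p] + take));
                cursor[p] += take;
                allocated += take;
                progressed = true;
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        // TP 0 has 100 events (0 -> 100); TP 1 is assumed to have 1000.
        List<Range> ranges = split(new long[] {0, 0}, new long[] {100, 1000}, 300, 3);
        for (Range r : ranges) {
            System.out.println(r);
        }
    }
}
```

   With those inputs the sketch yields three ranges of 100 events each: TP 0 
[0, 100), TP 1 [0, 100) and TP 1 [100, 200). TP 0 is exhausted after the first 
pass, so the third Spark partition is taken from TP 1, and each Spark 
partition still carries the same number of events.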



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
