Savonitar commented on code in PR #195:
URL:
https://github.com/apache/flink-connector-kafka/pull/195#discussion_r2535143513
##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/enumerator/KafkaSourceEnumerator.java:
##########
@@ -587,7 +627,7 @@ public static class PartitionOffsetsRetrieverImpl
private final String groupId;
public PartitionOffsetsRetrieverImpl(AdminClient adminClient, String
groupId) {
- this.adminClient = adminClient;
+ this.adminClient = checkNotNull(adminClient);
Review Comment:
(irrelevant to this PR):
I noticed that `PartitionOffsetsRetrieverImpl` closes the `adminClient` in
`close()` method. However, `PartitionOffsetsRetrieverImpl` doesn't "own" the
`adminClient`, it is created and owned by `KafkaSourceEnumerator` which also
closes it in itsown `close()` method.
Should we either remove the `close()` implementation from
`PartitionOffsetsRetrieverImpl` (make it a no-op), OR remove `AutoCloseable` if
it's not needed? (probably in another PR).
Looks like currently it's not an issue because
`PartitionOffsetsRetrieverImpl` are never closed, but it's a potential future
bug, for example, if someone adds the usage with try-with-resources.
##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/enumerator/KafkaSourceEnumerator.java:
##########
@@ -331,34 +323,50 @@ private PartitionSplitChange findNewPartitionSplits() {
* partitions
*/
private PartitionSplitChange initializePartitionSplits(PartitionChange
partitionChange) {
- Set<TopicPartition> newPartitions =
-
Collections.unmodifiableSet(partitionChange.getNewPartitions());
+ Set<TopicPartition> newPartitions = partitionChange.getNewPartitions();
+ Set<TopicPartition> initialPartitions =
partitionChange.getInitialPartitions();
- OffsetsInitializer.PartitionOffsetsRetriever offsetsRetriever =
getOffsetsRetriever();
// initial partitions use OffsetsInitializer specified by the user
while new partitions use
// EARLIEST
- final OffsetsInitializer initializer;
- if (!initialDiscoveryFinished) {
- initializer = startingOffsetInitializer;
- } else {
- initializer = newDiscoveryOffsetsInitializer;
- }
Map<TopicPartition, Long> startingOffsets =
- initializer.getPartitionOffsets(newPartitions,
offsetsRetriever);
-
+ newLinkedHashMapWithExpectedSize(newPartitions.size() +
initialPartitions.size());
Map<TopicPartition, Long> stoppingOffsets =
- stoppingOffsetInitializer.getPartitionOffsets(newPartitions,
offsetsRetriever);
+ newLinkedHashMapWithExpectedSize(newPartitions.size() +
initialPartitions.size());
+ if (!newPartitions.isEmpty()) {
+ initOffsets(
+ newPartitions,
+ newDiscoveryOffsetsInitializer,
+ startingOffsets,
+ stoppingOffsets);
+ }
+ if (!initialPartitions.isEmpty()) {
+ initOffsets(
+ initialPartitions, startingOffsetInitializer,
startingOffsets, stoppingOffsets);
+ }
Set<KafkaPartitionSplit> partitionSplits = new
HashSet<>(newPartitions.size());
Review Comment:
Shouldn't it be
`newPartitions.size() + initialPartitions.size()`
?
##########
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/split/KafkaPartitionSplit.java:
##########
@@ -123,6 +123,10 @@ public static String toSplitId(TopicPartition tp) {
return tp.toString();
}
+ public boolean hasValidStartOffset() {
Review Comment:
A minor suggestion. Maybe it's better to rename the method to
```
public boolean isMigrated() {
return startingOffset == MIGRATED;
}
```
This method could then be used in the following two places:
1. In
`org.apache.flink.connector.kafka.source.enumerator.KafkaSourceEnumerator#start`
```
final List<KafkaPartitionSplit> preinitializedSplits =
unassignedSplits.values().stream()
.filter(split -> !split.isMigrated())
.collect(Collectors.toList());
```
2. In
`org.apache.flink.connector.kafka.source.enumerator.KafkaSourceEnumerator#getPartitionChange`
```
if (split.isMigrated()) {
initialPartitions.add(split.getTopicPartition());
}
```
By using `isMigrated()` instead of mixing the direct constant check and
`hasValidStartOffset()`, the code becomes consistent and relies only on a
helpful encapsulation method.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]