scwhittle commented on code in PR #33596:
URL: https://github.com/apache/beam/pull/33596#discussion_r1930527649
##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java:
##########
@@ -1073,7 +1093,7 @@ public Read<K, V> withRedistribute() {
}
public Read<K, V> withAllowDuplicates(Boolean allowDuplicates) {
- if (!isAllowDuplicates()) {
+ if (!isRedistributed()) {
LOG.warn("Setting this value without setting withRedistribute() will have no effect.");
Review Comment:
nit: seems like this log should be printed at build time instead of here, since
withAllowDuplicates could be called before withRedistribute.
##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java:
##########
@@ -1086,6 +1106,17 @@ public Read<K, V> withRedistributeNumKeys(int redistributeNumKeys) {
return toBuilder().setRedistributeNumKeys(redistributeNumKeys).build();
}
+ public Read<K, V> withOffsetDeduplication(boolean offsetDeduplication) {
+ /*
+ * TODO(tomstepp): Auto-enable offset deduplication if: redistributed and !allowDuplicates.
+ * Until then, enforce explicit enablement only with redistributed without duplicates.
+ */
+ checkState(
+ isRedistributed() && !isAllowDuplicates(),
Review Comment:
Similarly, throw this when building instead of here, so it's not sensitive to
the order in which the builder methods are called.
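A minimal sketch of what build-time validation could look like, covering both this check and the warning above. `ReadConfig` and `validate()` are hypothetical stand-ins, not the real `KafkaIO.Read` API, and the sketch assumes the check only needs to fire when offset deduplication is actually enabled:

```java
public class BuildTimeValidationSketch {

  /** Hypothetical stand-in for KafkaIO.Read; not the real Beam class. */
  static final class ReadConfig {
    private final boolean redistributed;
    private final boolean allowDuplicates;
    private final boolean offsetDeduplication;

    ReadConfig() {
      this(false, false, false);
    }

    private ReadConfig(boolean redistributed, boolean allowDuplicates, boolean offsetDeduplication) {
      this.redistributed = redistributed;
      this.allowDuplicates = allowDuplicates;
      this.offsetDeduplication = offsetDeduplication;
    }

    // The setters only record the value; none of them inspects the other fields.
    ReadConfig withRedistribute() {
      return new ReadConfig(true, allowDuplicates, offsetDeduplication);
    }

    ReadConfig withAllowDuplicates(boolean value) {
      return new ReadConfig(redistributed, value, offsetDeduplication);
    }

    ReadConfig withOffsetDeduplication(boolean value) {
      return new ReadConfig(redistributed, allowDuplicates, value);
    }

    /** Hypothetical hook, called once after all setters have run. */
    ReadConfig validate() {
      if (allowDuplicates && !redistributed) {
        // Stand-in for LOG.warn(...) in the real code.
        System.err.println(
            "WARN: withAllowDuplicates has no effect without withRedistribute().");
      }
      // Assumption: the combination is only rejected when the feature is enabled.
      if (offsetDeduplication && !(redistributed && !allowDuplicates)) {
        throw new IllegalStateException(
            "Offset deduplication requires redistribute without duplicates.");
      }
      return this;
    }
  }

  public static void main(String[] args) {
    // Setter order no longer matters: the check runs after both calls.
    new ReadConfig().withOffsetDeduplication(true).withRedistribute().validate();
    new ReadConfig().withRedistribute().withOffsetDeduplication(true).validate();

    try {
      new ReadConfig().withOffsetDeduplication(true).validate();
    } catch (IllegalStateException e) {
      System.out.println("Rejected at build time: " + e.getMessage());
    }
  }
}
```

Where the validation actually belongs in the real class (for example in the transform's expansion path) is left open here.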
##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java:
##########
@@ -332,10 +360,12 @@ public long getSplitBacklogBytes() {
private final String name;
private @Nullable Consumer<byte[], byte[]> consumer = null;
private final List<PartitionState<K, V>> partitionStates;
- private @Nullable KafkaRecord<K, V> curRecord = null;
+ private @MonotonicNonNull KafkaRecord<K, V> curRecord = null;
Review Comment:
seems like curTimestamp would need this too
##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java:
##########
@@ -299,6 +300,29 @@ public Instant getCurrentTimestamp() throws NoSuchElementException {
return curTimestamp;
}
+ @Override
+ public byte[] getCurrentRecordId() throws NoSuchElementException {
Review Comment:
Can you extend the unit tests to cover the new behavior?