sjvanrossum commented on code in PR #34201: URL: https://github.com/apache/beam/pull/34201#discussion_r2095512482
########## sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java: ########## @@ -173,19 +172,6 @@ public boolean advance() throws IOException { elementsReadBySplit.inc(); ConsumerRecord<byte[], byte[]> rawRecord = pState.recordIter.next(); - long expected = pState.nextOffset; - long offset = rawRecord.offset(); - - if (offset < expected) { // -- (a) - // this can happen when compression is enabled in Kafka (seems to be fixed in 0.10) Review Comment: If I recall correctly, the Beam module Gradle plugin forces dependency versions, but the dependency scope for Kafka clients is set to `provided` so it doesn't apply for the `beam-sdks-java-io-kafka` artifact and the project build file tests against these versions of the Kafka client library: ``` def kafkaVersions = [ '201': "2.0.1", '231': "2.3.1", '241': "2.4.1", '251': "2.5.1", '282': "2.8.2", '312': "3.1.2", '390': "3.9.0", ] ``` I don't think that's how we should test backwards compatibility since there's built-in support for version ranges in both POM and Gradle metadata. In any case, we're miles ahead of the version mentioned in this comment and I've gone back to check on 0.11.0+ and found that the component responsible for tracking this (from the fetch collector to the poll loop) hasn't changed much during that time. I cross-referenced a few other Kafka source integrations (Flink, Spark, Venice) and only Spark (pinned to client version 0.10.x) still has this type of safeguard in place ([source](https://github.com/apache/spark/blob/v3.5.5/connector/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaDataConsumer.scala#L131)). We can also deprecate all those older client library versions at some point since every release <4.0.0 still supports communication with brokers >=0.8.0. Client library versions >=4.0.0 bump that up to broker versions >=2.1.0 so if there's reason to build against multiple client library versions it should simply be 3.x.y and 4.x.y going forward. :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@beam.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org