johnjcasey commented on code in PR #26142:
URL: https://github.com/apache/beam/pull/26142#discussion_r1164193380


##########
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/ReadFromKafkaDoFn.java:
##########
@@ -424,6 +425,29 @@ public ProcessContinuation processElement(
     }
   }
 
+  // see https://github.com/apache/beam/issues/25962
+  private ConsumerRecords<byte[], byte[]> poll(
+      Consumer<byte[], byte[]> consumer, TopicPartition topicPartition) {
+    final Stopwatch sw = Stopwatch.createStarted();
+    long previousPosition = -1;
+    while (true) {
+      final ConsumerRecords<byte[], byte[]> rawRecords =
+          consumer.poll(KAFKA_POLL_TIMEOUT.minus(sw.elapsed()));
+      if (!rawRecords.isEmpty()) {
+        // return as we have found some entries
+        return rawRecords;
+      }
+      if (previousPosition == (previousPosition = consumer.position(topicPartition))) {

Review Comment:
   I'm OK either way here. Based on your analysis, I don't think we need this 
check, but I also don't think it hurts performance enough to matter.
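
   For readers following the thread, here is a minimal, self-contained sketch of the kind of loop under discussion: keep polling while the consumer's position still advances, and stop on records, on no progress, or on timeout. The quoted hunk is cut off at the position check, so everything after that check is an assumption, not the PR's actual code. `PollUntilNoProgress`, `SimpleConsumer`, and `ScriptedConsumer` are hypothetical stand-ins so the example runs without a Kafka dependency.

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

public class PollUntilNoProgress {

  /** Hypothetical stand-in for the two consumer calls the loop relies on. */
  interface SimpleConsumer {
    List<String> poll(Duration timeout);

    long position();
  }

  /** Polls until records arrive, the position stops advancing, or the timeout is spent. */
  static List<String> poll(SimpleConsumer consumer, Duration pollTimeout) {
    final long startNanos = System.nanoTime();
    long previousPosition = -1;
    Duration remaining = pollTimeout;
    while (true) {
      List<String> records = consumer.poll(remaining);
      if (!records.isEmpty()) {
        // Found some entries, so return them.
        return records;
      }
      // Same comparison as the one-line idiom in the diff, split up for readability.
      long currentPosition = consumer.position();
      if (currentPosition == previousPosition) {
        // Assumed exit: an empty poll with no position change means nothing more to read now.
        return records;
      }
      previousPosition = currentPosition;
      remaining = pollTimeout.minus(Duration.ofNanos(System.nanoTime() - startNanos));
      if (remaining.isZero() || remaining.isNegative()) {
        // Assumed exit: the overall poll budget is used up.
        return records;
      }
    }
  }

  /** Hypothetical scripted consumer: each step is a batch plus the position after it. */
  static final class ScriptedConsumer implements SimpleConsumer {
    private final Deque<List<String>> batches = new ArrayDeque<>();
    private final Deque<Long> positions = new ArrayDeque<>();
    private long position = 0;
    int pollCount = 0;

    ScriptedConsumer step(List<String> batch, long positionAfter) {
      batches.add(batch);
      positions.add(positionAfter);
      return this;
    }

    @Override
    public List<String> poll(Duration timeout) {
      pollCount++;
      if (batches.isEmpty()) {
        // Script exhausted: no records and no position change.
        return Collections.emptyList();
      }
      position = positions.remove();
      return batches.remove();
    }

    @Override
    public long position() {
      return position;
    }
  }
}
```

   With a scripted consumer whose first poll is empty but moves the position, the loop polls a second time instead of returning early. The position check costs one extra `position()` call per empty poll, which is the overhead being weighed here.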



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
