[ 
https://issues.apache.org/jira/browse/BEAM-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16322604#comment-16322604
 ] 

ASF GitHub Bot commented on BEAM-1847:
--------------------------------------

iemejia closed pull request #4391: [BEAM-1847]: Consider both max records/time 
in KafkaIO bounded read.
URL: https://github.com/apache/beam/pull/4391
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
index 33fc2899a50..4b9c2a2b9f7 100644
--- a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
+++ b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
@@ -488,7 +488,7 @@
      * Mainly used for tests and demo applications.
      */
     public Read<K, V> withMaxNumRecords(long maxNumRecords) {
-      return toBuilder().setMaxNumRecords(maxNumRecords).setMaxReadTime(null).build();
+      return toBuilder().setMaxNumRecords(maxNumRecords).build();
     }
 
     /**
@@ -516,7 +516,7 @@
      * applications.
      */
     public Read<K, V> withMaxReadTime(Duration maxReadTime) {
-      return toBuilder().setMaxNumRecords(Long.MAX_VALUE).setMaxReadTime(maxReadTime).build();
+      return toBuilder().setMaxReadTime(maxReadTime).build();
     }
 
     /**
@@ -619,10 +619,10 @@
 
       PTransform<PBegin, PCollection<KafkaRecord<K, V>>> transform = unbounded;
 
-      if (getMaxNumRecords() < Long.MAX_VALUE) {
-        transform = unbounded.withMaxNumRecords(getMaxNumRecords());
-      } else if (getMaxReadTime() != null) {
-        transform = unbounded.withMaxReadTime(getMaxReadTime());
+      if (getMaxNumRecords() < Long.MAX_VALUE || getMaxReadTime() != null) {
+        transform = unbounded
+            .withMaxReadTime(getMaxReadTime())
+            .withMaxNumRecords(getMaxNumRecords());
       }
 
       return input.getPipeline().apply(transform);
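
To illustrate the effect of the builder change in the diff above, here is a minimal, self-contained sketch. ReadConfig is a hypothetical stand-in for KafkaIO.Read and is not part of Beam; it also uses java.time.Duration for self-containment, whereas the Beam API takes a Joda-Time Duration. Before the change, each setter reset the other limit (withMaxNumRecords cleared the read time, withMaxReadTime reset the record count to Long.MAX_VALUE), so only one limit could survive. In the sketch, as in the patched code, each setter touches only its own field.

```java
import java.time.Duration;

/** Hypothetical, simplified model of the patched builder; not the Beam class. */
final class ReadConfig {
  private final long maxNumRecords;   // Long.MAX_VALUE means "no record limit"
  private final Duration maxReadTime; // null means "no time limit"

  private ReadConfig(long maxNumRecords, Duration maxReadTime) {
    this.maxNumRecords = maxNumRecords;
    this.maxReadTime = maxReadTime;
  }

  /** Starts with neither limit set, i.e. an unbounded read. */
  static ReadConfig unbounded() {
    return new ReadConfig(Long.MAX_VALUE, null);
  }

  /** Sets only the record limit and leaves any time limit untouched. */
  ReadConfig withMaxNumRecords(long n) {
    return new ReadConfig(n, maxReadTime);
  }

  /** Sets only the time limit and leaves any record limit untouched. */
  ReadConfig withMaxReadTime(Duration d) {
    return new ReadConfig(maxNumRecords, d);
  }

  /** Mirrors the patched condition: bounded if either limit is set. */
  boolean isBounded() {
    return maxNumRecords < Long.MAX_VALUE || maxReadTime != null;
  }

  long getMaxNumRecords() {
    return maxNumRecords;
  }

  Duration getMaxReadTime() {
    return maxReadTime;
  }
}
```

With this model, chaining withMaxNumRecords and withMaxReadTime in either order keeps both values, which is the behavior the issue asks for.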


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> KafkaIO can't specify both max records and max duration.
> --------------------------------------------------------
>
>                 Key: BEAM-1847
>                 URL: https://issues.apache.org/jira/browse/BEAM-1847
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Ryan Skraba
>            Assignee: Ryan Skraba
>            Priority: Minor
>
> Some Beam IOs expose the ability to turn an unbounded source into a 
> bounded source.  
> For example, KafkaIO uses the underlying Read.from() API to specify the 
> {{withMaxNumRecords}} and/or {{withMaxReadTime}}.  If the former is 
> specified, the latter is silently ignored.  
> I would expect that the first stopping condition to be reached (either max 
> records OR max duration) would stop the source.  
> The underlying implementation {{BoundedReadFromUnboundedSource}} has this 
> logic, but it is not supported -in Read.Unbounded- or the Beam IOs that 
> expose this feature.
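
The "first stopping condition reached" semantics described in the issue can be sketched as follows. This is an illustrative model only, not the code of BoundedReadFromUnboundedSource: the class BoundedReadSketch, the method readBounded, and the injectable millisecond clock are all assumptions made for the example. Long.MAX_VALUE stands for "no limit" on either parameter.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical sketch: stop reading at whichever limit is hit first. */
final class BoundedReadSketch {

  /**
   * Reads from source until it is exhausted, maxNumRecords records have been
   * read, or maxReadMillis have elapsed on the supplied clock, whichever
   * happens first.
   */
  static <T> List<T> readBounded(
      Iterator<T> source,
      long maxNumRecords,
      long maxReadMillis,
      LongSupplier clockMillis) {
    List<T> out = new ArrayList<>();
    long start = clockMillis.getAsLong();
    // Both limits are checked before every record; either one ends the read.
    while (source.hasNext()
        && out.size() < maxNumRecords
        && clockMillis.getAsLong() - start < maxReadMillis) {
      out.add(source.next());
    }
    return out;
  }
}
```

Passing the clock as a parameter is a design choice for the sketch: it lets the time limit be exercised deterministically with a fake clock instead of System.currentTimeMillis().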



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
