[jira] [Work logged] (BEAM-13171) Support for stopReadTime on KafkaIO SDF

ASF GitHub Bot (Jira) Wed, 02 Feb 2022 07:12:07 -0800


     [ 
https://issues.apache.org/jira/browse/BEAM-13171?focusedWorklogId=719439&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-719439
 ]


ASF GitHub Bot logged work on BEAM-13171:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Feb/22 15:11
            Start Date: 02/Feb/22 15:11
    Worklog Time Spent: 10m 
      Work Description: nbali commented on pull request #15951:
URL: https://github.com/apache/beam/pull/15951#issuecomment-1028039066


   Well, I tested with Dataflow, and the job was always initialized as a 
streaming pipeline and never stopped. And yes, I don't have any other source; 
it's literally a simple KafkaIO.read. At some point (around the Kafka read) the 
bounded PCollection becomes unbounded, according to this: 
https://github.com/apache/beam/blob/cc0b2c5f3529e1896778dddeb6c740d40c7fb977/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java#L1467-L1483
   
   Maybe the DataflowRunner is wrong in calling this?
   ```java
       if (containsUnboundedPCollection(pipeline)) {
         options.setStreaming(true);
       }
   ```
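
   For illustration only, a minimal self-contained sketch of the decision that snippet appears to make: if any collection in the pipeline is unbounded, streaming mode is forced on, even when the collections on either side are bounded. `ToyCollection`, `Boundedness`, `containsUnbounded` and `resolveStreaming` are hypothetical stand-ins invented for this sketch; they are not Beam or Dataflow classes, and the real runner may take more into account.

   ```java
   import java.util.List;

   // Toy model only: these types are hypothetical stand-ins, not Beam classes.
   public class StreamingDecisionSketch {

       enum Boundedness { BOUNDED, UNBOUNDED }

       // A named collection that is tagged as bounded or unbounded.
       static final class ToyCollection {
           final String name;
           final Boundedness boundedness;

           ToyCollection(String name, Boundedness boundedness) {
               this.name = name;
               this.boundedness = boundedness;
           }
       }

       // True if at least one collection in the pipeline is unbounded.
       static boolean containsUnbounded(List<ToyCollection> collections) {
           return collections.stream()
               .anyMatch(c -> c.boundedness == Boundedness.UNBOUNDED);
       }

       // Mirrors the quoted check: an unbounded collection switches streaming on,
       // regardless of what the user requested.
       static boolean resolveStreaming(boolean requestedStreaming,
                                       List<ToyCollection> collections) {
           return requestedStreaming || containsUnbounded(collections);
       }

       public static void main(String[] args) {
           // Assumed shape: bounded input, an unbounded step in the middle,
           // bounded output again.
           List<ToyCollection> pipeline = List.of(
               new ToyCollection("source descriptors", Boundedness.BOUNDED),
               new ToyCollection("Kafka records", Boundedness.UNBOUNDED),
               new ToyCollection("final output", Boundedness.BOUNDED));

           System.out.println("streaming = " + resolveStreaming(false, pipeline));
       }
   }
   ```

   Under this model a single unbounded intermediate collection is enough to make the whole job streaming, which would match the behavior described above.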


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 719439)
    Time Spent: 4.5h  (was: 4h 20m)

> Support for stopReadTime on KafkaIO SDF 
> ----------------------------------------
>
>                 Key: BEAM-13171
>                 URL: https://issues.apache.org/jira/browse/BEAM-13171
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-kafka
>            Reporter: Mostafa Aghajani
>            Assignee: Mostafa Aghajani
>            Priority: P2
>             Fix For: 2.36.0
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> There is already support for startReadTime using SDF when the Kafka 
> version is supported.
> I want to add support for stopReadTime so we can extract messages from 
> Kafka only up to a point in time, after which the task finishes.
> One use case: when you want to only re-process (re-read) a period of time for 
> a Kafka topic in your pipeline.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
