[
https://issues.apache.org/jira/browse/BEAM-13171?focusedWorklogId=719439&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-719439
]
ASF GitHub Bot logged work on BEAM-13171:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Feb/22 15:11
Start Date: 02/Feb/22 15:11
Worklog Time Spent: 10m
Work Description: nbali commented on pull request #15951:
URL: https://github.com/apache/beam/pull/15951#issuecomment-1028039066
Well, I tested with Dataflow, and the job was always initialized as a streaming
pipeline and it never stops. And yes, I don't have any other source; it's
literally a simple KafkaIO.read. At some point (around the Kafka read) the
bounded PCollection becomes unbounded, according to this:
https://github.com/apache/beam/blob/cc0b2c5f3529e1896778dddeb6c740d40c7fb977/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java#L1467-L1483
Maybe the DataflowRunner is wrong in calling this?
```java
if (containsUnboundedPCollection(pipeline)) {
  options.setStreaming(true);
}
```
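To make the behavior above concrete, here is a minimal, self-contained sketch of the rule the snippet seems to apply. It is not the DataflowRunner code: `IsBounded` is a local stand-in for Beam's boundedness flag, and `containsUnbounded` and `resolveStreaming` are hypothetical helper names used only for illustration.

```java
import java.util.Arrays;
import java.util.List;

class StreamingModeSketch {
  // Local stand-in for a PCollection's boundedness (assumption, not Beam's type).
  enum IsBounded { BOUNDED, UNBOUNDED }

  // Hypothetical helper: true if any collection in the pipeline is unbounded.
  static boolean containsUnbounded(List<IsBounded> collections) {
    return collections.contains(IsBounded.UNBOUNDED);
  }

  // Mirrors the quoted check: one unbounded collection forces streaming mode,
  // regardless of what the user requested.
  static boolean resolveStreaming(boolean requestedStreaming, List<IsBounded> collections) {
    return requestedStreaming || containsUnbounded(collections);
  }

  public static void main(String[] args) {
    // A pipeline whose Kafka read output is still marked unbounded.
    List<IsBounded> kafkaJob = Arrays.asList(IsBounded.BOUNDED, IsBounded.UNBOUNDED);
    System.out.println(resolveStreaming(false, kafkaJob));
  }
}
```

Under this rule, a read that is logically finite still yields a streaming job as long as its output collection is marked unbounded, which would match what I saw.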
Issue Time Tracking
-------------------
Worklog Id: (was: 719439)
Time Spent: 4.5h (was: 4h 20m)
> Support for stopReadTime on KafkaIO SDF
> ----------------------------------------
>
> Key: BEAM-13171
> URL: https://issues.apache.org/jira/browse/BEAM-13171
> Project: Beam
> Issue Type: Improvement
> Components: io-java-kafka
> Reporter: Mostafa Aghajani
> Assignee: Mostafa Aghajani
> Priority: P2
> Fix For: 2.36.0
>
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> Support for startReadTime using SDF already exists when the Kafka
> version is supported.
> I want to add support for stopReadTime so that we can extract messages from
> Kafka only up to a given point in time, after which the task finishes.
> One use case: you want to re-process (re-read) only a specific period of time
> for a Kafka topic in your pipeline.
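
The requested semantics can be sketched in plain Java as follows. This is an illustration only, not KafkaIO code: the `Rec` type and `readWindow` helper are hypothetical, `java.time.Instant` is used for self-containment, and treating the stop time as exclusive is an assumption rather than something the issue states.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StopReadTimeSketch {
  // Hypothetical record type: a payload plus its timestamp.
  static final class Rec {
    final String value;
    final Instant timestamp;

    Rec(String value, Instant timestamp) {
      this.value = value;
      this.timestamp = timestamp;
    }
  }

  // Scans a timestamp-ordered log, keeps records in [start, stop), and
  // finishes at the first record at or after stop instead of waiting for more.
  static List<String> readWindow(List<Rec> log, Instant start, Instant stop) {
    List<String> out = new ArrayList<>();
    for (Rec r : log) {
      if (r.timestamp.isBefore(start)) {
        continue; // before the start time: skip
      }
      if (!r.timestamp.isBefore(stop)) {
        break; // reached the stop time: the read is done
      }
      out.add(r.value);
    }
    return out;
  }

  public static void main(String[] args) {
    List<Rec> log = Arrays.asList(
        new Rec("a", Instant.ofEpochSecond(0)),
        new Rec("b", Instant.ofEpochSecond(10)),
        new Rec("c", Instant.ofEpochSecond(20)),
        new Rec("d", Instant.ofEpochSecond(30)));
    System.out.println(
        readWindow(log, Instant.ofEpochSecond(10), Instant.ofEpochSecond(30)));
  }
}
```

The point of the sketch is the `break`: with a stop time the read has a defined end, so re-reading a fixed period can finish as a bounded task.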