[ 
https://issues.apache.org/jira/browse/BEAM-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu updated BEAM-604:
--------------------------
    Description: 
Currently, streaming job with bounded input can't be terminated automatically 
and TestDataflowRunner can't handle this case. Need to update 
TestDataflowRunner so that streaming integration test such as 
WindowedWordCountIT can run with it.

Implementation:
Query watermark of each step and wait until all watermarks set to MAX then 
cancel the job.

Update:
Suggesting by [[email protected]], implement checkMaxWatermark in 
DataflowPipelineJob#waitUntilFinish. Thus, all dataflow streaming jobs with 
bounded input will take advantage of this change and are canceled automatically 
when watermarks reach to max value. Also Dataflow runners can keep simple and 
free from handling batch and streaming two cases.

  was:
Currently, streaming job with bounded input can't be terminated automatically 
and TestDataflowRunner can't handle this case. Need to update 
TestDataflowRunner so that streaming integration test such as 
WindowedWordCountIT can run with it.

Implementation:
Query watermark of each step and wait until all watermarks set to MAX then 
cancel the job.

Update:
Suggesting by [[email protected]], implement checkMaxWatermark in 
DataflowPipelineJob#waitUntilFinish. Thus, all dataflow streaming jobs with 
bounded input will take advantage of this change and are canceled automatically 
when watermarks reach to max value. Also 


> Use Watermark Check Streaming Job Finish in DataflowPipelineJob
> ---------------------------------------------------------------
>
>                 Key: BEAM-604
>                 URL: https://issues.apache.org/jira/browse/BEAM-604
>             Project: Beam
>          Issue Type: Improvement
>            Reporter: Mark Liu
>            Assignee: Mark Liu
>            Priority: Minor
>
> Currently, streaming job with bounded input can't be terminated automatically 
> and TestDataflowRunner can't handle this case. Need to update 
> TestDataflowRunner so that streaming integration test such as 
> WindowedWordCountIT can run with it.
> Implementation:
> Query watermark of each step and wait until all watermarks set to MAX then 
> cancel the job.
> Update:
> Suggesting by [[email protected]], implement checkMaxWatermark in 
> DataflowPipelineJob#waitUntilFinish. Thus, all dataflow streaming jobs with 
> bounded input will take advantage of this change and are canceled 
> automatically when watermarks reach to max value. Also Dataflow runners can 
> keep simple and free from handling batch and streaming two cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to