mik-laj commented on a change in pull request #11726:
URL: https://github.com/apache/airflow/pull/11726#discussion_r517609877



##########
File path: airflow/providers/google/cloud/operators/dataflow.py
##########
@@ -324,6 +344,23 @@ class DataflowTemplatedJobStartOperator(BaseOperator):
             `https://cloud.google.com/dataflow/pipelines/specifying-exec-params
             
<https://cloud.google.com/dataflow/docs/reference/rest/v1b3/RuntimeEnvironment>`__
     :type environment: Optional[dict]
+    :param wait_until_finished: (Optional)
+        If True, wait for the end of pipeline execution before exiting. If False,
+        it only waits for it to starts (``JOB_STATE_RUNNING``).
+
+        The default behavior depends on the type of pipeline:
+
+        * for the streaming pipeline, wait for jobs to start,
+        * for the batch pipeline, wait for the jobs to complete.
+
+        .. warning::
+
+            You cannot call ``PipelineResult.wait_until_finish`` method in your pipeline code for the operator

Review comment:
       Starting a Dataflow job in Airflow consists of two steps:
   - running a subprocess and reading its stdout/stderr log to find the job ID,
   - looping until the job with the ID from the previous step finishes. This
loop checks the status of the job.
   
   Step two starts only after step one has finished, so if you call
`wait_until_finish` in your pipeline code, step two will not start until the
subprocess exits. When it exits, step two runs, but it executes only one
iteration because the job is already in a terminal state.
   
   If your pipeline does not call `wait_until_finish` but you pass
`wait_until_finished=True` to the operator, the loop in step two waits for the
job to reach a terminal state.
   
   If your pipeline does not call `wait_until_finish` and you pass
`wait_until_finished=False` to the operator, the loop in step two waits only
for the running state.
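
   To make the cases above concrete, here is a minimal sketch of the step-two
loop. It is not the actual hook code: `wait_for_job`, `fake_job`, `is_streaming`
and `max_polls` are hypothetical names used only for illustration, the state
names are a small subset of the Dataflow job states, and a real loop would
sleep between polls.

```python
from typing import Callable, List, Optional

# Illustrative subset of Dataflow job states (assumption: enough for the sketch).
RUNNING = "JOB_STATE_RUNNING"
TERMINAL_STATES = {"JOB_STATE_DONE", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}


def wait_for_job(
    get_state: Callable[[], str],
    is_streaming: bool,
    wait_until_finished: Optional[bool] = None,
    max_polls: int = 100,
) -> int:
    """Poll get_state() until the awaited state is reached; return the poll count."""
    if wait_until_finished is None:
        # Default from the docstring under review: batch jobs wait for
        # completion, streaming jobs only wait for the job to start.
        wait_until_finished = not is_streaming

    for polls in range(1, max_polls + 1):
        state = get_state()
        if state in TERMINAL_STATES:
            # A finished job always ends the loop.
            return polls
        if state == RUNNING and not wait_until_finished:
            # The caller only asked us to wait for the job to start.
            return polls
        # A real implementation would sleep here before polling again.
    raise TimeoutError("job did not reach the expected state")


def fake_job(states: List[str]) -> Callable[[], str]:
    """Hypothetical stand-in for a status call: replays states, then repeats the last."""
    iterator = iter(states)
    last = states[-1]
    return lambda: next(iterator, last)


history = ["JOB_STATE_PENDING", RUNNING, RUNNING, "JOB_STATE_DONE"]

# wait_until_finished=True: the loop runs until the terminal state.
polls_until_done = wait_for_job(fake_job(history), False, True)

# wait_until_finished=False: the loop stops as soon as the job is running.
polls_until_running = wait_for_job(fake_job(history), False, False)

# The pipeline itself blocked in step one, so the job is already finished
# when step two starts and the loop performs a single iteration.
polls_after_blocking = wait_for_job(fake_job(["JOB_STATE_DONE"]), False, True)
```

   In this sketch the third call mirrors the case where the pipeline code
blocks on its own: by the time the loop starts, there is nothing left to wait
for.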
   
   
   
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

