karenbraganz opened a new issue, #61180: URL: https://github.com/apache/airflow/issues/61180
### Description

When `wait_for_completion=True` is set on `EmrCreateJobFlowOperator`, the operator does not actually wait for the cluster to complete successfully before returning a success state. Instead, it returns success as soon as the cluster starts running. As a result, the task can succeed even if the cluster terminates with errors after it begins running.

I believe this is due to [this line of code](https://github.com/apache/airflow/blob/1b3329eb670a1fbf70d2f7eeaa21aaf7baa7bacd/providers/amazon/src/airflow/providers/amazon/aws/operators/emr.py#L761), which assigns the `WAIT_FOR_COMPLETION` `WaitPolicy` to the waiter. This corresponds to the ["job_flow_waiting" wait policy](https://github.com/apache/airflow/blob/1b3329eb670a1fbf70d2f7eeaa21aaf7baa7bacd/providers/amazon/src/airflow/providers/amazon/aws/utils/waiter.py#L103), with which the waiter only waits for the cluster to start running before returning a success state. If the user wants the waiter to wait until the cluster completes, the [`WAIT_FOR_STEPS_COMPLETION` policy, corresponding to "job_flow_terminated"](https://github.com/apache/airflow/blob/1b3329eb670a1fbf70d2f7eeaa21aaf7baa7bacd/providers/amazon/src/airflow/providers/amazon/aws/utils/waiter.py#L104), must be used.

The operator hard-codes the "job_flow_waiting" wait policy, so the user cannot configure it. I propose adding a `wait_policy` parameter to the operator that lets the user specify which wait policy to use.

### Use case/motivation

_No response_

### Related issues

_No response_

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
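
A rough sketch of the behaviour proposed in the Description, for illustration only. It does not import Airflow and is not the provider's actual code: the enum values, the waiter-name strings, and the helper `resolve_waiter_name` are assumptions based on the description and the linked mapping.

```python
from enum import Enum
from typing import Optional


class WaitPolicy(str, Enum):
    # Member names come from the issue text; the string values are assumed.
    WAIT_FOR_COMPLETION = "wait_for_completion"
    WAIT_FOR_STEPS_COMPLETION = "wait_for_steps_completion"


# Assumed mapping from policy to waiter name, as described in the issue.
WAITER_POLICY_NAME_MAPPING = {
    # Returns once the cluster is up and running.
    WaitPolicy.WAIT_FOR_COMPLETION: "job_flow_waiting",
    # Returns only once the cluster has terminated.
    WaitPolicy.WAIT_FOR_STEPS_COMPLETION: "job_flow_terminated",
}


def resolve_waiter_name(
    wait_for_completion: bool,
    wait_policy: Optional[WaitPolicy] = None,
) -> Optional[str]:
    """Hypothetical helper: choose which waiter the operator would use."""
    if wait_policy is not None:
        # Proposed behaviour: an explicit user-supplied policy wins.
        return WAITER_POLICY_NAME_MAPPING[wait_policy]
    if wait_for_completion:
        # Behaviour reported in the issue: the policy is fixed.
        return WAITER_POLICY_NAME_MAPPING[WaitPolicy.WAIT_FOR_COMPLETION]
    # No waiting requested.
    return None
```

With this shape, `resolve_waiter_name(True)` keeps the behaviour reported above, while passing `wait_policy=WaitPolicy.WAIT_FOR_STEPS_COMPLETION` would make the task wait for the cluster to terminate.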
