ahmad-maruf opened a new issue #9127: URL: https://github.com/apache/airflow/issues/9127
A bug in the latest stable version of Airflow (1.10.10) causes the following API mismatch error when `EmrAddStepsOperator` executes:

```
[2020-06-03 18:05:06,862] {taskinstance.py:1145} ERROR - 'EMR' object has no attribute 'get_cluster_id_by_name'
Traceback (most recent call last):
  File "/home/ubuntu/.pyenv/versions/3.7.7/envs/.venv_python377/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 983, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/home/ubuntu/.pyenv/versions/3.7.7/envs/.venv_python377/lib/python3.7/site-packages/airflow/contrib/operators/emr_add_steps_operator.py", line 74, in execute
    job_flow_id = emr.get_cluster_id_by_name(self.job_flow_name, self.cluster_states)
  File "/home/ubuntu/.pyenv/versions/3.7.7/envs/.venv_python377/lib/python3.7/site-packages/botocore/client.py", line 575, in __getattr__
    self.__class__.__name__, item)
AttributeError: 'EMR' object has no attribute 'get_cluster_id_by_name'
[2020-06-03 18:05:06,864] {taskinstance.py:1202} INFO - Marking task as FAILED.dag_id=my_spark_job_dag_id, task_id=my_spark_job_emr_add_step_id, execution_date=20200603T180500, start_date=20200603T180506, end_date=20200603T180506
[2020-06-03 18:05:16,153] {logging_mixin.py:112} INFO - [2020-06-03 18:05:16,153] {local_task_job.py:103} INFO - Task exited with return code 1
```

After digging through the library code, I found the bug here: https://github.com/apache/airflow/blob/b099571b9af739c5a96e7aed41be9f22912a3443/airflow/contrib/operators/emr_add_steps_operator.py#L74

The root cause is that the operator calls `get_cluster_id_by_name` on the `botocore.client.EMR` object, which has no such attribute. The method belongs to the `airflow.contrib.hooks.emr_hook.EmrHook` object instead.
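To illustrate the mismatch without needing an Airflow install, here is a minimal sketch using stand-in classes. `StubEmrClient` and `StubEmrHook` are hypothetical names, not Airflow or botocore classes. The lookup logic inside the stub hook (filter `list_clusters` results by name and return an id only on a single match) is my assumption about how the hook resolves a name, not a copy of its implementation.

```python
class StubEmrClient:
    """Hypothetical stand-in for the botocore EMR client; it only has list_clusters."""

    def __init__(self, clusters):
        self._clusters = clusters

    def list_clusters(self, ClusterStates):
        # Mimics the response shape of the EMR ListClusters call.
        return {
            "Clusters": [
                {"Id": c["Id"], "Name": c["Name"]}
                for c in self._clusters
                if c["State"] in ClusterStates
            ]
        }


class StubEmrHook:
    """Hypothetical stand-in for EmrHook: the name lookup lives on the hook."""

    def __init__(self, client):
        self._client = client

    def get_conn(self):
        return self._client

    def get_cluster_id_by_name(self, emr_cluster_name, cluster_states):
        response = self.get_conn().list_clusters(ClusterStates=cluster_states)
        matches = [c["Id"] for c in response["Clusters"] if c["Name"] == emr_cluster_name]
        # Assumption: return an id only when exactly one cluster matches.
        return matches[0] if len(matches) == 1 else None


def resolve_buggy(hook, name, states):
    # Same shape as the failing line in the traceback: method looked up on the client.
    emr = hook.get_conn()
    return emr.get_cluster_id_by_name(name, states)


def resolve_fixed(hook, name, states):
    # Method looked up on the hook, where it is actually defined.
    return hook.get_cluster_id_by_name(name, states)


hook = StubEmrHook(StubEmrClient([
    {"Id": "j-EXAMPLE1", "Name": "my_cluster", "State": "RUNNING"},
    {"Id": "j-EXAMPLE2", "Name": "other_cluster", "State": "TERMINATED"},
]))

try:
    resolve_buggy(hook, "my_cluster", ["RUNNING", "WAITING"])
    buggy_error = None
except AttributeError as exc:
    buggy_error = str(exc)

fixed_id = resolve_fixed(hook, "my_cluster", ["RUNNING", "WAITING"])
```

In this sketch `resolve_buggy` fails with an `AttributeError` of the same kind as in the log above, while `resolve_fixed` returns the id of the running cluster. The cluster ids and names are made up for the example.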
Compare the above bug with the **corrected** corresponding code in the Airflow 2.0.0dev version on the `master` branch: https://github.com/apache/airflow/blob/ff5dcccbbd49e7a4632f93fa915565ac31730110/airflow/providers/amazon/aws/operators/emr_add_steps.py#L77

In 1.10.10, this bug forces the user to provide `job_flow_id` directly when instantiating `EmrAddStepsOperator`, which in my opinion is not the best practice.

Versions in use:

- apache-airflow 1.10.10
- boto 2.49.0
- boto3 1.13.18
- botocore 1.16.21
- Python 3.7.7

If this issue has already been fixed in Airflow 1.10.10 somehow, please provide instructions, as I'm not aware of a fix.

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org