pankajastro commented on PR #24306: URL: https://github.com/apache/airflow/pull/24306#issuecomment-1149876650
> Is it actually an error? Possibly due to lack of documentation of this connection type
>
> As far as I remember since Airflow 1.10 (and probably earlier) Amazon Elastic Map Reduce connection stored only kwargs for [run_job_flow](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html#EMR.Client.run_job_flow) and `aws_conn_id` uses for actual `boto3` emr client connection

`emr_conn_id` is used only in `EmrCreateJobFlowOperator`, and I think it expects the `emr_conn_id` extra field to contain the JSON request body for the `run_job_flow` API. But keeping the boto3 API request body in the DAG looks more convenient to me than storing it in the connection, and if I'm passing the `job_flow_overrides` param in the task, then `emr_conn_id` should not be mandatory.

Here, I'm doing the following:

- check if `aws_conn_id` is available, and use it for the AWS credentials
- if `aws_conn_id` is not available, assume that the credentials are in `emr_conn_id`
- if the `emr_conn_id` connection extra contains a request body, override it with the `job_flow_overrides` request body
- if `emr_conn_id` does not exist or contains AWS credentials, then use `job_flow_overrides` as the request body

After this change, if `emr_conn_id` is not available, it will use `aws_conn_id` for auth and `job_flow_overrides` for the request body. Earlier it was failing with the error `conn_id` not found.
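A rough sketch of the lookup order described above, using plain dicts in place of Airflow connections so it runs on its own. This is not the code in the PR: `resolve_emr_config`, the `connections` mapping, and the `CREDENTIAL_KEYS` check for telling credentials apart from a request body are all hypothetical names and assumptions made for illustration.

```python
from typing import Any, Dict, Optional, Tuple

# Assumption: an extra holding any of these keys is treated as AWS credentials
# rather than a run_job_flow request body.
CREDENTIAL_KEYS = {"aws_access_key_id", "aws_secret_access_key"}


def resolve_emr_config(
    connections: Dict[str, Dict[str, Any]],
    aws_conn_id: Optional[str],
    emr_conn_id: Optional[str],
    job_flow_overrides: Optional[Dict[str, Any]] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Return (conn_id used for credentials, request body for run_job_flow).

    `connections` is a hypothetical stand-in mapping conn_id -> extra dict.
    """
    overrides = dict(job_flow_overrides or {})
    emr_extra = connections.get(emr_conn_id) if emr_conn_id else None

    # Credentials: prefer aws_conn_id, otherwise fall back to emr_conn_id.
    if aws_conn_id and aws_conn_id in connections:
        auth_conn_id = aws_conn_id
    elif emr_extra is not None:
        auth_conn_id = emr_conn_id
    else:
        raise LookupError("no connection available for AWS credentials")

    # Request body: use the overrides alone when emr_conn_id is missing or
    # holds credentials; otherwise lay the overrides on top of the extra.
    if emr_extra is None or CREDENTIAL_KEYS.intersection(emr_extra):
        body = overrides
    else:
        body = {**emr_extra, **overrides}  # shallow merge, overrides win

    return auth_conn_id, body
```

With this sketch, a missing `emr_conn_id` no longer raises as long as `aws_conn_id` resolves; the request body then comes entirely from `job_flow_overrides`. The merge is shallow, so a nested key such as `Instances` in the overrides replaces the whole nested value from the connection extra.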
