Taragolis commented on PR #24306:
URL: https://github.com/apache/airflow/pull/24306#issuecomment-1149934903

   > emr_conn_id is used in only EmrCreateJobFlowOperator and I feel it 
expecting that emr_conn_id extra filed should contain JSON request body for 
run_job_flow API. But keeping the boto3 API request body in DAG looks more 
convenient to me than storing it in connection and if I'm passing 
job_flow_overrides param in the task in that case then emr_conn_id should not 
be mandatory.
   
   If only case case make `emr_conn_id` non mandatory than probably better just 
change receiving config
   
   ```python
   config = {}
   if self.emr_conn_id:
       emr_conn = self.get_connection(self.emr_conn_id)
       config = emr_conn.extra_dejson.copy()
   ```
   
   Assume that user provide AWS Connection rather than EMR Connection 
personally for me it is not good point.
   I could pass `aws_conn_id=None` and assume that Airflow will use Instance 
Profile / Task Role
   
   > If the user will store aws_access_key_id in emr_conn_id extra then I'm 
assuming that intention to use this connection as authentication rather than 
using it in building request body.
   
   If user deploy Airflow in AWS Cloud and follow AWS Best Practices he/she 
won't use `aws_access_key_id` and most probably use Instance Profile / Task Role
   
   Examples:
    - User use `role_arn` in EMR Connection and expected that he/she switch to 
this role, however error raise
    - User use `region_name` in EMR Connection and expected that region 
changed, however error raise
   
   For me it is not clear why we only need to check `aws_access_key_id` ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to