romibuzi commented on issue #29423:
URL: https://github.com/apache/airflow/issues/29423#issuecomment-1422427027

   Hi @vgutkovsk!
   
   Oh damn, indeed I realize I introduced a breaking change. Previously, the check `if self.s3_bucket is None` was only performed when the operator was creating the job. Now it is done at the start of the `create_glue_job_config()` method here:
https://github.com/apache/airflow/blob/44024564cb3dd6835b0375d61e682efc1acd7d2c/airflow/providers/amazon/aws/hooks/glue.py#L103-L104
   
   And this method is now called in all cases here:
https://github.com/apache/airflow/blob/44024564cb3dd6835b0375d61e682efc1acd7d2c/airflow/providers/amazon/aws/hooks/glue.py#L328
   
   Also, `s3_bucket` is only used to determine `s3_log_path`:
https://github.com/apache/airflow/blob/44024564cb3dd6835b0375d61e682efc1acd7d2c/airflow/providers/amazon/aws/hooks/glue.py#L112
   
   `script_location`, on the other hand, can be None and is not concatenated with
`s3_bucket` at all.
   
   Maybe the best way to handle the problem would be to remove this check on
`s3_bucket` and, if it is None, omit the `"LogUri"` parameter (which uses
`s3_log_path`), since it is not a mandatory parameter for a Glue job:
https://github.com/apache/airflow/blob/44024564cb3dd6835b0375d61e682efc1acd7d2c/airflow/providers/amazon/aws/hooks/glue.py#L118
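   A minimal sketch of that idea (the function name and parameters here are illustrative, not the hook's actual signature; the real method builds a larger config from `self`):

   ```python
   def create_glue_job_config(
       job_name: str,
       role_name: str,
       script_location: str | None,
       s3_bucket: str | None = None,
       s3_glue_logs: str = "logs/glue/",  # assumed log prefix for illustration
   ) -> dict:
       """Build a Glue job config, omitting "LogUri" when no log bucket is set."""
       config = {
           "Name": job_name,
           "Role": role_name,
           "Command": {"Name": "glueetl", "ScriptLocation": script_location},
       }
       # "LogUri" is optional for a Glue job, so only include it
       # when an S3 bucket for logs was actually provided.
       if s3_bucket is not None:
           config["LogUri"] = f"s3://{s3_bucket}/{s3_glue_logs}{job_name}"
       return config
   ```

   With this shape, operators that never set `s3_bucket` would keep working as before, instead of failing at config-creation time.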
   
   cc @Taragolis 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
