luos-fc opened a new issue, #32330: URL: https://github.com/apache/airflow/issues/32330
### Apache Airflow version

2.6.2

### What happened

We are currently on AWS Provider 6.0.0 and looking to upgrade to the latest version, 8.2.0. However, an issue with the GlueCrawlerOperator makes the upgrade challenging: the operator attempts to update the crawler tags on every run.

Because we manage our resource tagging through Terraform, we do not provide any tags to the operator, which results in all of the tags being deleted. It also means the `glue:GetTags` and `glue:UntagResource` permissions must be added to the relevant IAM roles just to run the crawler.

It seems strange that the default behaviour of the operator has been changed to modify infrastructure, especially as this differs from the GlueJobOperator, which only performs updates when certain parameters are set. Potentially something similar could be done here: if no `Tags` are present in the `config` dict, they aren't modified at all. Not sure what the best approach is.

### What you think should happen instead

The crawler should run without any alterations to the existing infrastructure.

### How to reproduce

Run a GlueCrawlerOperator without tags in `config` against a crawler that has tags present.

### Operating System

Debian GNU/Linux 11 (bullseye)

### Versions of Apache Airflow Providers

Amazon 8.2.0

### Deployment

Official Apache Airflow Helm Chart

### Deployment details

_No response_

### Anything else

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
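For illustration, the behaviour suggested under "What happened" (leave tags alone unless `Tags` is present in `config`) might look roughly like the sketch below. This is not the provider's actual code: `sync_crawler_tags` and `FakeGlueClient` are hypothetical names, and the ARN is a placeholder. The only assumption about the real API is that the boto3 Glue client exposes `get_tags`, `tag_resource` and `untag_resource` with the keyword arguments used here.

```python
from typing import Any


def sync_crawler_tags(glue_client: Any, crawler_arn: str, config: dict) -> bool:
    """Hypothetical helper: touch crawler tags only when `Tags` is supplied.

    Returns True if tags were compared against the crawler, False if they
    were left alone because `config` has no `Tags` key.
    """
    if "Tags" not in config:
        # No tags supplied: leave externally managed tags (e.g. Terraform) untouched.
        return False

    wanted = config["Tags"]
    current = glue_client.get_tags(ResourceArn=crawler_arn).get("Tags", {})

    # Remove tags that are no longer wanted; add or overwrite changed ones.
    to_remove = [key for key in current if key not in wanted]
    to_add = {key: value for key, value in wanted.items() if current.get(key) != value}

    if to_remove:
        glue_client.untag_resource(ResourceArn=crawler_arn, TagsToRemove=to_remove)
    if to_add:
        glue_client.tag_resource(ResourceArn=crawler_arn, TagsToAdd=to_add)
    return True


class FakeGlueClient:
    """In-memory stand-in for a boto3 Glue client, for illustration only."""

    def __init__(self, tags: dict):
        self.tags = dict(tags)
        self.calls = []

    def get_tags(self, ResourceArn: str) -> dict:
        self.calls.append("get_tags")
        return {"Tags": dict(self.tags)}

    def tag_resource(self, ResourceArn: str, TagsToAdd: dict) -> None:
        self.calls.append("tag_resource")
        self.tags.update(TagsToAdd)

    def untag_resource(self, ResourceArn: str, TagsToRemove: list) -> None:
        self.calls.append("untag_resource")
        for key in TagsToRemove:
            self.tags.pop(key, None)


# Placeholder ARN, not a real resource.
EXAMPLE_ARN = "arn:aws:glue:us-east-1:123456789012:crawler/example"
```

With this shape, a `config` without `Tags` makes no tag-related API calls at all, so the extra `glue:GetTags` and `glue:UntagResource` permissions would only be needed by users who opt in to tag management.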
