manugarri opened a new issue, #28468: URL: https://github.com/apache/airflow/issues/28468
### Apache Airflow Provider(s)

amazon

### Versions of Apache Airflow Providers

_No response_

### Apache Airflow version

latest

### Operating System

any

### Deployment

Other

### Deployment details

_No response_

### What happened

First of all, apologies if this is not the right place to post this GitHub issue. I looked for a section for provider-specific feature requests but couldn't find one.

We use the AWS provider at my company to interact with AWS services from Airflow. We use Poetry to build the testing environment for our DAGs. However, build times are quite long, and the reason is building pandas, which is a [dependency](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/provider.yaml#L62) of the amazon provider.

Checking the provider's code, it seems pandas is used in only a small minority of functions inside the provider:

```
./aws/transfers/hive_to_dynamodb.py:93:        data = hive.get_pandas_df(self.sql, schema=self.schema)
```

and

```
./aws/transfers/sql_to_s3.py:159:        data_df = sql_hook.get_pandas_df(sql=self.query, parameters=self.parameters)
```

Forcing every AWS Airflow user who does not use Hive, or who does not want to turn SQL results into an S3 file, to install pandas is a bit cumbersome.

### What you think should happen instead

Given how heavy the package is and how little it is used in the amazon provider, pandas should be an optional dependency.

### How to reproduce

_No response_

### Anything else

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
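To illustrate what "optional dependency" could mean in practice, here is a minimal sketch of the usual deferred-import pattern. It is not the provider's actual code: `require_optional`, `rows_to_dataframe`, and the `pandas` extra name are all hypothetical names chosen for this example. The idea is that pandas is imported only inside the code path that needs it, so users who never call that path can skip installing it.

```python
import importlib


def require_optional(module_name, extra):
    """Import an optional dependency on demand.

    Hypothetical helper: raises an ImportError that names the extra
    to install when the module is not available.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this feature. "
            f"Install the provider with the {extra!r} extra to enable it."
        ) from exc


def rows_to_dataframe(rows):
    # Hypothetical transfer step. pandas is imported here, at call time,
    # rather than at module import time, so importing this module does
    # not require pandas to be installed.
    pd = require_optional("pandas", extra="pandas")
    return pd.DataFrame(rows)
```

With this approach the package metadata would list pandas under an extra rather than as a hard requirement, and only operators such as the two shown above would fail (with a clear message) when it is missing.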
