manugarri opened a new issue, #28468:
URL: https://github.com/apache/airflow/issues/28468

   ### Apache Airflow Provider(s)
   
   amazon
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Apache Airflow version
   
   latest
   
   ### Operating System
   
   any
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   First of all, apologies if this is not the right place to post a GH issue. 
I looked for a section for provider-specific feature requests but couldn't find one.
   
   At my company we use the AWS provider to interact with AWS services from 
Airflow. We use Poetry to build the environment in which we test our DAGs.
   
   However the build times are quite long, and the reason is building pandas, 
which is a [dependency](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/provider.yaml#L62) 
of the amazon provider.
   
   Looking at the provider's code, pandas seems to be used in only a couple of 
places:
   ```
   ./aws/transfers/hive_to_dynamodb.py:93:        data = hive.get_pandas_df(self.sql, schema=self.schema)
   ```
   and
   ```
   ./aws/transfers/sql_to_s3.py:159:        data_df = sql_hook.get_pandas_df(sql=self.query, parameters=self.parameters)
   ```
   
   Forcing every AWS Airflow user who does not use Hive or export SQL query 
results to an S3 file to install pandas is cumbersome.
   
   ### What you think should happen instead
   
   Given how heavy the package is and how little it is used in the amazon 
provider, pandas should be an optional dependency.
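
As a rough sketch of what "optional" could look like in practice, the two call sites could import pandas only when they actually run, and fail with a clear message otherwise. This is not the provider's actual code: `require_optional`, `rows_to_csv`, and the `pandas` extra name are hypothetical names used only for illustration.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency at call time, or fail with a clear hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # The extra name is an assumption; the provider would define its own.
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install the provider with the (hypothetical) extra [{extra}]."
        ) from exc


def rows_to_csv(rows, columns):
    """Hypothetical transfer step: only this code path needs pandas."""
    pd = require_optional("pandas", extra="pandas")
    return pd.DataFrame(rows, columns=columns).to_csv(index=False)
```

With something along these lines, users who never call the pandas-backed transfers would not need pandas installed, and the others would get an actionable error instead of a bare `ModuleNotFoundError`.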
   
   ### How to reproduce
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's 
[Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
