GitHub user raphaelauv added a comment to the discussion: HttpToS3Operator OOM when downloading large file
Hi,

HttpToS3Operator is an Airflow operator that performs the data transfer itself, using the code you linked. The data therefore passes through the Airflow worker (if you are using the Celery executor), and because the operator does not stream the transfer, a large file can exhaust the worker's memory.

You have multiple options:

1) Use the KubernetesExecutor (or any other executor that lets you customize the RAM available to the task) for this task. The operator code still does not stream the transfer, but it is no longer bound by the hardware limits of the Airflow worker.

2) Use the KubernetesPodOperator to launch an efficient, specialized tool that performs the transfer, such as rclone: https://rclone.org/docs/

GitHub link: https://github.com/apache/airflow/discussions/46066#discussioncomment-11958178
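
As a rough illustration of option 2, here is a minimal sketch of how the rclone arguments for such a pod might be assembled. It relies on the `rclone copyurl` subcommand, which copies a URL's content to a destination without saving it in temporary storage. The helper name `build_rclone_copyurl_args`, the remote name `s3remote`, and the bucket and key values are hypothetical, and the sketch assumes the S3 remote is already configured for rclone inside the pod.

```python
from typing import List
from urllib.parse import urlparse


def build_rclone_copyurl_args(source_url: str, remote: str, bucket: str, key: str) -> List[str]:
    """Build the argument list for `rclone copyurl <url> <remote>:<bucket>/<key>`.

    Hypothetical helper: `remote` must name an rclone remote that is already
    configured in the container (an assumption, not shown here).
    """
    parsed = urlparse(source_url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URL, got: {source_url!r}")
    if not remote or not bucket or not key:
        raise ValueError("remote, bucket and key must all be non-empty")
    # rclone addresses an S3 object as <remote>:<bucket>/<key>.
    destination = f"{remote}:{bucket}/{key.lstrip('/')}"
    return ["copyurl", source_url, destination]


if __name__ == "__main__":
    # Example values only; replace them with your own URL, remote, bucket and key.
    print(build_rclone_copyurl_args(
        "https://example.com/big.bin", "s3remote", "my-bucket", "incoming/big.bin"
    ))
```

The resulting list could then be passed as the `arguments` parameter of a KubernetesPodOperator, assuming the task uses an image whose entrypoint is the rclone binary. The transfer then runs in its own pod with its own memory limits, and the Airflow worker only supervises it.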
