Taragolis commented on issue #27065: URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293984562
> BTW. I've heard VERY bad things about EFS when EFS is used to share DAGs. It has a profound impact on the stability and performance of Airflow if you have a big number of DAGs, unless you pay big bucks for IOPS. I've heard that from many people.

> This is the moment when I usually STRONGLY recommend GitSync instead: https://medium.com/apache-airflow/shared-volumes-in-airflow-the-good-the-bad-and-the-ugly-22e9f681afca

It always depends on configuration and monitoring. I personally ran into this issue, possibly on Airflow 2.1.x, and I do not know whether it was actually related to Airflow itself or to something else. Working with EFS definitely takes more effort than GitSync.

For anyone who finds this thread in the future while dealing with EFS performance degradation, the following might help:

**Do not write Python bytecode inside the NFS (AWS EFS) mount**

+ Mount the share as read-only
+ Disable Python bytecode by setting `PYTHONDONTWRITEBYTECODE=x`
+ Or set a separate location for bytecode with `PYTHONPYCACHEPREFIX`, for example `/tmp/pycaches`

Bursting throughput mode looks like a miracle at first, but once the whole burst credit balance drops to zero it can turn your life into hell. Each newly created EFS share has about 2.1 TB of bursting capacity. What can be done here:

- Switch to Provisioned Throughput mode permanently, which might cost a lot: something like 6 USD per 1 MiB/s, excluding VAT
- Switch to Provisioned Throughput mode only when the bursting capacity falls below some threshold, such as 0.5 TB, and switch back once it recovers to the 2.1 TB limit. Unfortunately there is no autoscaling for this, so it has to be done manually or with a combination of CloudWatch alerting and AWS Lambda.
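As a rough illustration of the bytecode advice in this comment, the sketch below shows what the two environment variables control inside the interpreter. `PYTHONDONTWRITEBYTECODE` maps to `sys.dont_write_bytecode`, and `PYTHONPYCACHEPREFIX` (Python 3.8+) maps to `sys.pycache_prefix`. The helper names and the DAG path `/mnt/efs/dags/my_dag.py` are hypothetical, chosen only for the example; for Airflow itself the variables would normally be set in the environment before the scheduler and workers start.

```python
import importlib.util
import sys


def bytecode_settings() -> dict:
    """Report how the running interpreter handles .pyc files."""
    return {
        # True when PYTHONDONTWRITEBYTECODE is set (or -B was passed)
        "dont_write_bytecode": sys.dont_write_bytecode,
        # Directory from PYTHONPYCACHEPREFIX, or None when unset
        "pycache_prefix": sys.pycache_prefix,
    }


def redirect_bytecode(prefix: str = "/tmp/pycaches") -> None:
    """Runtime equivalent of PYTHONPYCACHEPREFIX for later imports."""
    sys.pycache_prefix = prefix


def pyc_location(source_path: str) -> str:
    """Return where the .pyc file for source_path would be written."""
    return importlib.util.cache_from_source(source_path)


if __name__ == "__main__":
    redirect_bytecode("/tmp/pycaches")
    print(bytecode_settings())
    # Hypothetical DAG file on the EFS mount
    print(pyc_location("/mnt/efs/dags/my_dag.py"))
```

With the prefix set, the cache path for a file under the EFS mount resolves to a location under `/tmp/pycaches` instead of a `__pycache__` directory next to the DAG, so importing DAG files no longer causes writes to the shared volume.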

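The CloudWatch plus Lambda idea for switching throughput modes could be sketched roughly as follows. This is only a sketch under stated assumptions, not a tested deployment: the thresholds mirror the 0.5 TB and 2.1 TB figures from the comment (treated here as 10^12 bytes per TB), the function names and the default of 10 MiB/s are hypothetical, and the credit value is assumed to come from the `BurstCreditBalance` metric in the `AWS/EFS` CloudWatch namespace. The `boto3` call used is `update_file_system` on the EFS client. AWS also restricts how often the throughput mode can be changed, so the current limits should be checked before automating this.

```python
TB = 10 ** 12  # assumption: the comment's "TB" read as decimal terabytes

LOW_WATERMARK = int(0.5 * TB)   # switch to provisioned below this
HIGH_WATERMARK = int(2.1 * TB)  # switch back to bursting at or above this


def choose_throughput_mode(
    current_mode: str,
    burst_credit_bytes: int,
    low: int = LOW_WATERMARK,
    high: int = HIGH_WATERMARK,
) -> str:
    """Pick the EFS throughput mode using two thresholds (hysteresis).

    Between the thresholds the current mode is kept, so the file system
    does not flip back and forth while credits are recovering.
    """
    if current_mode == "bursting" and burst_credit_bytes < low:
        return "provisioned"
    if current_mode == "provisioned" and burst_credit_bytes >= high:
        return "bursting"
    return current_mode


def apply_mode(file_system_id: str, mode: str, provisioned_mibps: float = 10.0) -> None:
    """Apply the chosen mode through the EFS API (requires boto3 and credentials)."""
    import boto3  # imported lazily so the decision logic has no dependencies

    efs = boto3.client("efs")
    kwargs = {"FileSystemId": file_system_id, "ThroughputMode": mode}
    if mode == "provisioned":
        # Hypothetical default; size this to the actual workload
        kwargs["ProvisionedThroughputInMibps"] = provisioned_mibps
    efs.update_file_system(**kwargs)
```

A Lambda handler triggered by a CloudWatch alarm would read the current mode and credit balance, call `choose_throughput_mode`, and call `apply_mode` only when the result differs from the current mode.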