Taragolis commented on issue #27065:
URL: https://github.com/apache/airflow/issues/27065#issuecomment-1293984562

   > BTW. I've heard VERY bad things about EFS when EFS is used to share DAGs. 
It has a profound impact on the stability and performance of Airflow if you have a big 
number of DAGs unless you pay big bucks for IOPS. I've heard that from many 
people.
   > This is the moment when I usually STRONGLY recommend GitSync instead: 
https://medium.com/apache-airflow/shared-volumes-in-airflow-the-good-the-bad-and-the-ugly-22e9f681afca
   
   It always depends on configuration and monitoring. I personally hit 
this issue, possibly on Airflow 2.1.x, and I do not know whether it was actually related 
to Airflow itself or to something else. Working with EFS definitely takes more 
effort than GitSync.
   
   For anyone who finds this thread in the future while dealing with EFS 
performance degradation, the following might help:
   
   **Avoid writing Python bytecode inside the NFS (AWS EFS) mount**
      + Mount the share as read-only
      + Disable Python bytecode writing by setting `PYTHONDONTWRITEBYTECODE=x`
      + Or redirect bytecode to a local directory by setting `PYTHONPYCACHEPREFIX` (Python 3.8+), for example 
to `/tmp/pycaches`
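
To check which of these settings is actually in effect inside a scheduler or worker container, a minimal sketch along these lines might help. The helper name `bytecode_target` and the DAG path are hypothetical; `sys.dont_write_bytecode`, `sys.pycache_prefix` (Python 3.8+) and `importlib.util.cache_from_source` are the standard-library counterparts of the environment variables above.

```python
import importlib.util
import sys
from typing import Optional


def bytecode_target(source_path: str) -> Optional[str]:
    """Return the path where Python would cache bytecode for source_path.

    Returns None when bytecode writing is disabled
    (PYTHONDONTWRITEBYTECODE set, or python -B).
    """
    if sys.dont_write_bytecode:
        return None
    # Honors sys.pycache_prefix (set via PYTHONPYCACHEPREFIX) when present.
    return importlib.util.cache_from_source(source_path)


if __name__ == "__main__":
    # Hypothetical DAG file location on the EFS mount.
    print(bytecode_target("/opt/airflow/dags/example_dag.py"))
```

If the printed path still points inside the EFS mount, the interpreter would write `.pyc` files there and neither variable has taken effect for that process.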
   
   Throughput in Bursting mode looks like a miracle at first, but once the 
burst credit balance drops to zero it can turn your life into hell. Each 
newly created EFS file system starts with about 2.1 TB of bursting capacity.
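
To get a feeling for how quickly that balance can disappear, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which credits drain at the difference between the throughput actually used and the file system's baseline rate; the function name and both rates are hypothetical inputs that you would take from your own monitoring, not figures from this thread.

```python
from typing import Optional

MIB = 1024 ** 2
TIB = 1024 ** 4


def hours_until_credits_exhausted(
    credit_bytes: float, used_mib_per_s: float, baseline_mib_per_s: float
) -> Optional[float]:
    """Estimate hours until the burst credit balance reaches zero.

    Returns None when usage does not exceed the baseline rate,
    i.e. when credits are not being drained.
    """
    net_drain_mib_per_s = used_mib_per_s - baseline_mib_per_s
    if net_drain_mib_per_s <= 0:
        return None
    seconds = credit_bytes / (net_drain_mib_per_s * MIB)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical rates: 20 MiB/s used against a 1 MiB/s baseline.
    print(hours_until_credits_exhausted(2.1 * TIB, 20.0, 1.0))
```

The estimate is only as good as the two rates you feed it, so treat it as an order-of-magnitude check rather than a forecast.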
   
   What can be done here:
   - Switch to Provisioned Throughput mode permanently, which might cost a lot, 
something like 6 USD per 1 MiB/s per month, without VAT
   - Switch to Provisioned Throughput mode only when the burst credit balance drops below 
some threshold, like 0.5 TB, and switch back once it recovers to the 
2.1 TB limit. Unfortunately there is no built-in autoscaling, so this has to be done manually or with a 
combination of CloudWatch alarms + AWS Lambda.
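
The second option could be sketched roughly as below, for example as the core of a Lambda function triggered by a CloudWatch alarm. This is only an illustration under stated assumptions: the function names, the thresholds and the 64 MiB/s provisioned value are hypothetical, and the burst credit balance is expected to be read elsewhere (for instance from the `BurstCreditBalance` metric in the `AWS/EFS` CloudWatch namespace). The only AWS call used is `update_file_system` on a boto3 `efs` client, which is passed in so the decision logic stays testable without AWS access.

```python
TIB = 1024 ** 4

# Thresholds taken from the comment above; adjust to your own file system.
LOW_WATERMARK_BYTES = int(0.5 * TIB)
HIGH_WATERMARK_BYTES = int(2.1 * TIB)


def choose_throughput_mode(
    burst_credit_bytes: float,
    current_mode: str,
    low: int = LOW_WATERMARK_BYTES,
    high: int = HIGH_WATERMARK_BYTES,
) -> str:
    """Pick 'bursting' or 'provisioned' using two thresholds (hysteresis)."""
    if current_mode == "bursting" and burst_credit_bytes < low:
        return "provisioned"
    if current_mode == "provisioned" and burst_credit_bytes >= high:
        return "bursting"
    return current_mode


def apply_throughput_mode(
    efs_client, file_system_id: str, mode: str, provisioned_mibps: float = 64.0
) -> None:
    """Apply the chosen mode; efs_client is expected to be boto3.client('efs')."""
    kwargs = {"FileSystemId": file_system_id, "ThroughputMode": mode}
    if mode == "provisioned":
        # Hypothetical default; size this to your real workload.
        kwargs["ProvisionedThroughputInMibps"] = provisioned_mibps
    efs_client.update_file_system(**kwargs)
```

Using two separate thresholds keeps the file system from flapping between modes. As far as I know, EFS also limits how often the throughput mode can be changed, so it is worth checking the current AWS limits before automating this.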
   
   
![image](https://user-images.githubusercontent.com/3998685/198383225-2b101e42-726f-4f60-90e2-44ab3e4a1098.png)
   

