potiuk commented on issue #18598:
URL: https://github.com/apache/airflow/issues/18598#issuecomment-934906238


   As I understand  travelling to Hashicorp seems to be the culprit. As 
mentioned in the best practices - you should not rech out from DAGs while 
parsing (i.e. in top-level-code). This is the most likely reason of the problem.
   
   Back of the envelipe calculation :  imagine your request to Hashicorp takes 
200ms, each dags reads 5 variables and you have 200 dags parsed  by 4 
processes. Then the min time JUST for Hashicorp calls is 0.2 * 5 *200/4 = 50 s 
(in perfect conditions) . Parsing DAG is serial - any retrieval frome external 
source is serialized per process. Which means that any time parser reaches out, 
it has to wait before proceeding. And likely the numbers are different (worse) 
and parallelism does not scale linearly.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to