vandonr-amz commented on PR #30259:
URL: https://github.com/apache/airflow/pull/30259#issuecomment-1520646760
> As I see it - this PR reduce severity of one problem yet it creates
> another one.
I don't really agree with you here. It is not creating a problem that didn't
exist before. I see your argument, but I think the situation is actually either:
- The cluster admin wants to be able to detect "bad" DAGs by monitoring
parsing time, they already do it, and they help the DAG authors in fixing it.
This cluster admin would not enable this cache because their solution is to
attack the problem at the root. They already have their own solution, this
cache is not a new problem nor a new solution to them.
or
- The cluster admin has on their hands plenty of DAGs that take a long time
to parse, they are not in the business of educating the DAG authors (for
whatever reason), and they'd be happy to have a switch to flip that would lower
parsing time, network traffic and possibly cloud provider costs. They had a big
problem before, now they have a smaller problem.
Maybe I'm wrong in my view of things; I've never been a cluster admin,
and I have very limited experience of even being an airflow user.
I'm basing my assumptions on the fact that it was mentioned earlier that
this switch would be controlled by cluster admins. If it truly is a nightmare
for (some of) them, why would they enable it?
Also, a cluster admin could enable this to get the time & cost
benefits, but disable it for a couple of parsing cycles every once in a while
to check that things are not getting too out of hand? idk.