jedcunningham commented on PR #30259:
URL: https://github.com/apache/airflow/pull/30259#issuecomment-1644767177
> Or do you think we should go zero -> all in one step ?
Yes. If we do it, I feel it should be a shared cache across components, so
everything operates off one TTL and is consistent within a given Airflow instance.
> > I’ve seen enough folks using variables to dictate the structure of their
DAG to know this will be problematic. Having parsing be stale is asking for
runtime problems.
>
> For those cases, what do you think about providing something like
`get("key", skip_cache=True)`? With any caching system, there are always some
edge cases and I think we can build a solution around it.
The problem is knowing it'll be a problem and knowing you need to skip the
cache. I suspect there would be a big overlap between the folks who naively
write DAGs that would benefit from the cache and the folks who aren't advanced
enough in Airflow to know to skip the cache. And my biggest concern here is that
the problems can show up weeks or months later. "No, I haven't touched my DAG code
or Airflow config in weeks, it's been running fine, and it just broke over the
weekend" == nightmare
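
To make the staleness concern concrete, here is a minimal sketch of a per-process TTL cache. Everything in it is an assumption for illustration: `TTLVariableCache`, its `clock` parameter, and the plain dict standing in for the metadata DB are hypothetical, and `skip_cache` only mirrors the signature proposed in the quoted comment. It is not the PR's implementation.

```python
import time
from typing import Callable, Dict, Tuple


class TTLVariableCache:
    """Hypothetical per-process cache; not Airflow's actual implementation."""

    def __init__(
        self,
        backend: Dict[str, str],
        ttl: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._backend = backend  # stand-in for the metadata DB
        self._ttl = ttl          # seconds an entry is considered fresh
        self._clock = clock      # injectable so the demo is deterministic
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str, skip_cache: bool = False) -> str:
        now = self._clock()
        cached = self._entries.get(key)
        # Serve from the cache only if the entry is still within its TTL.
        if not skip_cache and cached is not None and now - cached[1] < self._ttl:
            return cached[0]
        # Cache miss, expired entry, or explicit bypass: read the backend.
        value = self._backend[key]
        self._entries[key] = (value, now)
        return value


# Deterministic demo using a fake clock.
now = [0.0]
db = {"num_tasks": "3"}
cache = TTLVariableCache(db, ttl=30.0, clock=lambda: now[0])

first = cache.get("num_tasks")                   # populates the cache
db["num_tasks"] = "5"                            # someone edits the Variable
now[0] = 10.0                                    # 10s later, still inside the TTL
stale = cache.get("num_tasks")                   # served from cache: old value
fresh = cache.get("num_tasks", skip_cache=True)  # bypass reads the backend
```

The bypass only helps a DAG author who already knows the value may be stale, which is exactly the knowledge gap described above.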
> * In addition, as far as I know, if an Airflow customer utilizes a
standalone DAG processor (AIP-43), the issues of multiple schedulers causing
conflict and increasing DB load are entirely eliminated.
Sure, that solves the scheduler fighting scenario. That's not the only
footgun here though. And if this is the answer to this problem, it needs to be
documented.
> Again, this PR is a step towards a more holistic product offering that
provides caching across all components, where applicable.
I respectfully disagree. I'd likely be more on board with the "off and
experimental" plan if I thought this specific implementation could evolve to be
used across all components. Having _local_ component caching, if more
components adopted this approach, would make the problematic scenarios worse,
not better.
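
A rough sketch of why local caches compound the problem, assuming two components that each hold their own copy of a hypothetical `LocalTTLCache` (the class, the explicit `now` argument, and the dict standing in for the DB are all illustrative, not Airflow APIs):

```python
from typing import Dict, Tuple


class LocalTTLCache:
    """Hypothetical component-local cache; time is passed in explicitly."""

    def __init__(self, backend: Dict[str, str], ttl: float) -> None:
        self._backend = backend
        self._ttl = ttl
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str, now: float) -> str:
        cached = self._entries.get(key)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]
        value = self._backend[key]
        self._entries[key] = (value, now)
        return value


# Two components, each with its own local cache over the same backend.
db = {"num_tasks": "3"}
parser_cache = LocalTTLCache(db, ttl=30.0)
worker_cache = LocalTTLCache(db, ttl=30.0)

parser_cache.get("num_tasks", now=0.0)                 # parser caches "3" at t=0
db["num_tasks"] = "5"                                  # Variable changes at t=20
worker_view = worker_cache.get("num_tasks", now=25.0)  # worker's first read
parser_view = parser_cache.get("num_tasks", now=28.0)  # parser still inside its TTL

# Same timeline with one shared cache: possibly stale, but consistent.
db["num_tasks"] = "3"
shared = LocalTTLCache(db, ttl=30.0)
shared.get("num_tasks", now=0.0)
db["num_tasks"] = "5"
shared_worker_view = shared.get("num_tasks", now=25.0)
shared_parser_view = shared.get("num_tasks", now=28.0)
```

With local caches the parser and the worker disagree about the same Variable at nearly the same moment; with a shared cache both see the same value until the single TTL expires.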
I do see both sides of this and I don't want (or like) to be a stick in the
mud; I just think that in the wild this is likely to cause heartache, even if it
is opt-in and "experimental".
That said, I won't block it. I will, however, request that the docs reflect its
experimental nature and ideally state that there can be side effects.