jedcunningham commented on PR #30259:
URL: https://github.com/apache/airflow/pull/30259#issuecomment-1644767177

   > Or do you think we should go zero -> all in one step ?
   
   Yes. I feel that if we do it, it should be a shared cache across components so 
things operate off one TTL and stay consistent within a given Airflow instance.
   
   > > I’ve seen enough folks using variables to dictate the structure of their 
DAG to know this will be problematic. Having parsing be stale is asking for 
runtime problems.
   > 
   > For those cases, what do you think about providing something like 
`get("key", skip_cache=True)`? With any caching system, there are always some 
edge cases, and I think we can build a solution around it.
   
   The problem is knowing it'll be a problem and knowing you need to skip the 
cache. I suspect there would be a big overlap between the folks who naively 
write DAGs that would benefit from the cache and the folks who aren't advanced 
enough in Airflow to know to skip the cache. My biggest concern here is that the 
problems can surface weeks or months later. "No, I haven't touched my DAG code or 
Airflow config in weeks, it's been running fine, and it just broke over the 
weekend" == nightmare.
   
   > * In addition, as far as I know, if an Airflow customer utilizes a 
standalone DAG processor (AIP-43), the issues of multiple schedulers causing 
conflict and increasing DB load are entirely eliminated.
   
   Sure, that solves the scheduler fighting scenario. That's not the only 
footgun here though. And if this is the answer to this problem, it needs to be 
documented.
   
   > Again, this PR is a step towards a more holistic product offering that 
provides caching across all components, where applicable.
   
   I respectfully disagree. I'd likely be more on board with the "off and 
experimental" plan if I thought this specific implementation could evolve to be 
used across all components. Having _local_ component caching, if more 
components adopted this approach, would make the problematic scenarios worse, 
not better.
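   
   To make the local-cache concern concrete, here is a minimal sketch. It is 
not the implementation in this PR: `LocalTTLCache`, `FakeClock`, the `db` dict 
standing in for the metadata database, and the `num_tasks` key are all 
hypothetical names chosen for illustration. It assumes each component keeps its 
own cache with its own expiry times.

```python
class FakeClock:
    """Hypothetical controllable clock so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class LocalTTLCache:
    """Hypothetical per-component TTL cache (an assumption, not Airflow code)."""

    def __init__(self, backend, ttl, clock):
        self._backend = backend  # stands in for the metadata DB
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # still fresh from this component's point of view
        value = self._backend[key]
        self._entries[key] = (value, now + self._ttl)
        return value


def simulate():
    db = {"num_tasks": 3}
    clock = FakeClock()
    # Two components, each with its own local cache and the same TTL.
    parser = LocalTTLCache(db, ttl=30, clock=clock)
    worker = LocalTTLCache(db, ttl=30, clock=clock)

    parser.get("num_tasks")  # parser caches the old value at t=0
    clock.advance(20)
    db["num_tasks"] = 5  # the variable is changed at t=20

    seen_by_worker = worker.get("num_tasks")  # first read, goes to the DB
    seen_by_parser = parser.get("num_tasks")  # served from the stale entry

    clock.advance(15)  # t=35, the parser's entry has expired
    parser_after_expiry = parser.get("num_tasks")
    return seen_by_parser, seen_by_worker, parser_after_expiry
```

   In this sketch the two components disagree about the same key until the 
parser's entry expires, even though both use the same TTL. A shared cache would 
give them a single expiry and therefore a single view.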
   
   I do see both sides of this, and I don't want (or like) to be a stick in the 
mud. I just think that in the wild this is likely to cause heartache, even if it 
is opt-in and "experimental".
   
   That said, I won't block it. I will, however, request that the docs reflect 
its experimental nature and ideally state that there can be side effects.


