Re: [DISCUSS] a cache for Airflow Variables

2023-04-05 Thread Jarek Potiuk
Let's. I dismissed my review and added a comment on how I propose to see it there. On Wed, Apr 5, 2023 at 2:11 AM Vandon, Raphael wrote: > I like the proposal of having a way to invalidate the cache manually when > changes need to be propagated quickly. > I think this is something we can ship

Re: [DISCUSS] a cache for Airflow Variables

2023-04-04 Thread Vandon, Raphael
I like the proposal of having a way to invalidate the cache manually when changes need to be propagated quickly. I think this is something we can ship without in the first iteration, provided we make the cache disabled by default. Then we could add it as an option in the Airflow CLI, and

RE: [DISCUSS] a cache for Airflow Variables

2023-03-30 Thread Scheffler Jens (XC-DX/ETV5)
21:16 To: dev@airflow.apache.org Subject: Re: [DISCUSS] a cache for Airflow Variables Hello here. I would like to propose something. 1) I really like the idea of conditionally disabling parsing - and I would really love that this idea is hashed out and discussed more (especially if those who

Re: [DISCUSS] a cache for Airflow Variables

2023-03-30 Thread Jarek Potiuk
Hello here. I would like to propose something. 1) I really like the idea of conditionally disabling parsing - and I would really love that this idea is hashed out and discussed more (especially if those who came with it have a bit more time than me, I seem to be involved in more things recently

Re: [DISCUSS] a cache for Airflow Variables

2023-03-27 Thread Vandon, Raphael
My initial goal when working on this cache was mostly to shorten DAG parsing times, simply because that's what I was looking at, so I'd be happy with restricting this cache to dag parsing. I'm still relatively new to airflow codebase, so I don't know all the implications this change has, so I'm

Re: [DISCUSS] a cache for Airflow Variables

2023-03-25 Thread Jarek Potiuk
Two comments here as well (just technical rather than whether we should do it or not) - to clarify maybe what the proposal really is. Hussein: Yeah I had exactly the same thoughts and that was even my original response in the PR

Re: [DISCUSS] a cache for Airflow Variables

2023-03-25 Thread Daniel Standish
Surprised to hear that it doesn't work with celery. Is that right? I assumed that this was the main target. If it's really only a benefit in dag processor, it's surprising that it provides much benefit because it should be one call per var-file-parse; in worker it will be once per ti and I

Re: [DISCUSS] a cache for Airflow Variables

2023-03-25 Thread Hussein Awala
n From: Jarek Potiuk Sent: Friday, March 24, 2023 11:59:44 AM To: dev@airflow.apache.org Subject: Re: [DISCUSS] a cache for Airflow Variables Two points: > I like Daniel's idea: implement this _in_ the provider itself (means you > can't use in memory cache,

Re: [DISCUSS] a cache for Airflow Variables

2023-03-24 Thread Jarek Potiuk
Two points: > I like Daniel's idea: implement this _in_ the provider itself (means you > can't use in memory cache, but can use the DB or redis etc) then it doesn't > even need user to upgrade to 2.6. > Also it means that a problem with a single provider doesn't make all of core >

Re: [DISCUSS] a cache for Airflow Variables

2023-03-24 Thread Elad Kalif
Jarek I can argue that by doing that the problem is not gone but just reduced to a level where it's even harder to spot and adresses by the users. This is the worst kind of problems to tackle! In my prespective if we don't address the issue directly (general solution to top level code issues) any

Re: [DISCUSS] a cache for Airflow Variables

2023-03-24 Thread Jarek Potiuk
TL;DR; I am in favour of the change (limiting it to DagFileProcessor only) and turning using Variables at top-level as good practice (2.6+). Here is why. I partially agree with Elad /Ash, but I am not 0/1 on that subject any more. I think there is a more nuanced approach that we could take here.

Re: [DISCUSS] a cache for Airflow Variables

2023-03-23 Thread Daniel Standish
It would not help with kubernetes executor. Did you try with local executor? Another option to consider is to implement this specifically on the AWS secrets manager secrets backend itself.

Re: [DISCUSS] a cache for Airflow Variables

2023-03-23 Thread Mehta, Shubham
Considering the current framing of the problem/solution - as a means to address poorly written DAGs - I agree with Elad's perspective. Nevertheless, I believe that caching for the secrets backend has the potential to enhance Airflow performance even for DAGs adhering to best practices,

Re: [DISCUSS] a cache for Airflow Variables

2023-03-23 Thread Ash Berlin-Taylor
I second Elad's view here. I would also propose an alternative fix: let's come up with a way to tell Airflow to not continuously reparse a file! A strawman example: ``` from airflow import ReparseMode, DAG AIRFLOW_REPARSE = ReparseMode.ON_FILE_CHANGED with DAG(...): ``` We could

Re: [DISCUSS] a cache for Airflow Variables

2023-03-23 Thread Elad Kalif
To me Airflow is not "open and play". It is not a closed system that guides you how to develop. Workflow as code requires more skills. There are stuff to learn before authoring dags. This discussion is about Variables but I might ask similar question about users who query API as part of top level

[DISCUSS] a cache for Airflow Variables

2023-03-23 Thread Vandon, Raphael
Hello, I’d like to submit to discussion the idea of having a cache on Airflow Variables. The aim is to reduce DAG parsing time and Secret Manager API bill for users who make a liberal use of Variable.get in their DAG files. The drawback, of course, is that caches introduce a delay in how fast