I will be waiting for responses on this discussion until *Tue, Nov 11, 3:00 PM UTC* before calling a lazy consensus.

So, if you have thoughts, feel free to chime in now :)

Thanks & Regards,
Amogh Desai

On Fri, Nov 7, 2025 at 4:57 AM Buğra Öztürk <[email protected]> wrote:

> Great initiative Amogh, thanks! I agree with others on 1, and on not
> encouraging 2 as well.
>
> The idea of filling the gaps by adding more endpoints would enable more
> automation in a secure environment in the long run. In addition, we can
> consider providing more granular cleanup/DB functionality in the CLI too,
> which could be automated on the server side with admin commands rather
> than from Dags - just an idea.
>
> I hope we will add airflowctl there soon, of course with limited
> operations. 🤞
>
> Bugra Ozturk
>
> On Thu, 6 Nov 2025, 14:32 Amogh Desai, <[email protected]> wrote:
>
>> Looking for some more eyes on this one.
>>
>> Thanks & Regards,
>> Amogh Desai
>>
>> On Thu, Nov 6, 2025 at 12:55 PM Amogh Desai <[email protected]> wrote:
>>
>>>> Yes, an API could do this, with 5 times more code, including the
>>>> limits per response, where you need to loop over all pages until you
>>>> have a full list (e.g. the API is limited to 100 results). Not
>>>> impossible, but a lot of re-implementation.
>>>
>>> Just wondering, why not vanilla task mapping?
>>>
>>>> Might be something that could be a potential contribution to
>>>> "airflow db clean"
>>>
>>> Maybe, yes.
>>>
>>> Thanks & Regards,
>>> Amogh Desai
>>>
>>> On Thu, Nov 6, 2025 at 12:53 PM Amogh Desai <[email protected]>
>>> wrote:
>>>
>>>>> I think our efforts should be way more focused on adding some
>>>>> missing API calls in the Task SDK that our users miss, rather than
>>>>> on allowing them to use "old ways". Every time someone says "I
>>>>> cannot migrate because I did this", our first thought should be:
>>>>>
>>>>> * is it a valid way?
>>>>> * is it acceptable to have an API call for it in the SDK?
>>>>> * should we do it?
>>>>
>>>> That is currently a grey zone we need to define better, I think.
>>>> Certain use cases might be general enough that we need an execution
>>>> API endpoint for them, and we can certainly do that. But there will
>>>> also be cases where the use case is niche and we will NOT want to
>>>> have execution API endpoints for it, for various reasons. The harder
>>>> problem to solve is the latter.
>>>>
>>>> But you make a fair point here.
>>>>
>>>> Thanks & Regards,
>>>> Amogh Desai
>>>>
>>>> On Thu, Nov 6, 2025 at 2:33 AM Jens Scheffler <[email protected]>
>>>> wrote:
>>>>
>>>>>> Thanks for your comments too, Jens.
>>>>>>
>>>>>>> * Aggregate status of tasks in the upstream of the same Dag
>>>>>>> (pass, fail, listing)
>>>>>>
>>>>>> Does the DAG run page not show that?
>>>>>
>>>>> Partly yes, but in our environment it is a bit more complex than
>>>>> "pass/fail": we want more details about the failures, and we want
>>>>> to aggregate them. So, at a high level: get the XCom from the
>>>>> failed tasks and then aggregate the details. Imagine all tasks have
>>>>> an owner and we want to send a notification to each owner - but if
>>>>> 10 tasks from one owner fail, we want to send 1 notification
>>>>> listing the 10 failures in the text. And yes, this can be done via
>>>>> the API.
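>>>>>
>>>>> For illustration, a rough sketch of that aggregation against the
>>>>> REST API (endpoint paths assume Airflow 3's /api/v2; the base URL,
>>>>> the token and the "error_details" XCom key are placeholders for our
>>>>> setup):
>>>>>
>>>>>     import collections
>>>>>     import requests
>>>>>
>>>>>     BASE_URL = "http://localhost:8080/api/v2"   # placeholder URL
>>>>>     HEADERS = {"Authorization": "Bearer TOKEN"}  # placeholder auth
>>>>>     DAG_ID, RUN_ID = "my_dag", "manual__2025-11-06T00:00:00"
>>>>>
>>>>>     # Owners are defined on tasks, not on task instances.
>>>>>     tasks = requests.get(
>>>>>         f"{BASE_URL}/dags/{DAG_ID}/tasks", headers=HEADERS
>>>>>     ).json()["tasks"]
>>>>>     owner_of = {t["task_id"]: t.get("owner") or "unknown" for t in tasks}
>>>>>
>>>>>     # Collect the failed task instances of the run.
>>>>>     resp = requests.get(
>>>>>         f"{BASE_URL}/dags/{DAG_ID}/dagRuns/{RUN_ID}/taskInstances",
>>>>>         params={"state": "failed", "limit": 100},
>>>>>         headers=HEADERS,
>>>>>     )
>>>>>     resp.raise_for_status()
>>>>>
>>>>>     # Pull the XCom details of each failure and group them by owner.
>>>>>     failures_by_owner = collections.defaultdict(list)
>>>>>     for ti in resp.json()["task_instances"]:
>>>>>         xcom = requests.get(
>>>>>             f"{BASE_URL}/dags/{DAG_ID}/dagRuns/{RUN_ID}"
>>>>>             f"/taskInstances/{ti['task_id']}/xcomEntries/error_details",
>>>>>             headers=HEADERS,
>>>>>         )
>>>>>         detail = xcom.json().get("value") if xcom.ok else None
>>>>>         failures_by_owner[owner_of.get(ti["task_id"], "unknown")].append(
>>>>>             (ti["task_id"], detail)
>>>>>         )
>>>>>
>>>>>     # One notification per owner, listing all of that owner's failures.
>>>>>     for owner, failed in failures_by_owner.items():
>>>>>         print(f"notify {owner}: {len(failed)} failed tasks: {failed}")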
>>>>>>
>>>>>>> * Custom mass-triggering of other dags and collection of results
>>>>>>> from triggered dags as scale-out option for dynamic task mapping
>>>>>>
>>>>>> Can't an API do that?
>>>>>
>>>>> Yes, an API could do this, with 5 times more code, including
>>>>> handling the limits per response, where you need to loop over all
>>>>> pages until you have a full list (e.g. the API is limited to 100
>>>>> results). Not impossible, but a lot of re-implementation.
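>>>>>
>>>>> Just to illustrate the paging boilerplate, a sketch for collecting
>>>>> all runs of a triggered Dag (paths assume /api/v2; base URL and
>>>>> token are placeholders):
>>>>>
>>>>>     import requests
>>>>>
>>>>>     BASE_URL = "http://localhost:8080/api/v2"   # placeholder URL
>>>>>     HEADERS = {"Authorization": "Bearer TOKEN"}  # placeholder auth
>>>>>
>>>>>     def all_dag_runs(dag_id: str, page_size: int = 100) -> list[dict]:
>>>>>         """Collect every run of a Dag, paging past the response limit."""
>>>>>         runs, offset = [], 0
>>>>>         while True:
>>>>>             resp = requests.get(
>>>>>                 f"{BASE_URL}/dags/{dag_id}/dagRuns",
>>>>>                 params={"limit": page_size, "offset": offset},
>>>>>                 headers=HEADERS,
>>>>>             )
>>>>>             resp.raise_for_status()
>>>>>             page = resp.json()["dag_runs"]
>>>>>             runs.extend(page)
>>>>>             offset += len(page)
>>>>>             if len(page) < page_size:  # last page reached
>>>>>                 return runs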
>>>>>>
>>>>>>> * And the famous: Partial database clean on a per Dag level with
>>>>>>> different retention
>>>>>>
>>>>>> Can you elaborate this one a bit :D
>>>>>
>>>>> Yes. We have one Dag that is called 50k-100k times per day, others
>>>>> that are called 12 times a day, and a lot in between, like 25k runs
>>>>> per month. The Dag with 100k runs per day we want to archive ASAP,
>>>>> probably after 3 days for all non-failed runs, to reduce DB
>>>>> overhead. The failed ones we keep for 14 days for potential
>>>>> re-processing in case there was an outage.
>>>>>
>>>>> Most other Dag Runs we keep for a month. And for some we add a cap:
>>>>> we archive once there are more than 25k runs.
>>>>>
>>>>> Might be something that could be a potential contribution to
>>>>> "airflow db clean".
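>>>>>
>>>>> Until something like that exists, a per-Dag retention sweep can be
>>>>> approximated from outside the workers. A sketch against the REST
>>>>> API (paths assume /api/v2; note this deletes runs rather than
>>>>> archiving them, and the Dag ID and retention numbers are
>>>>> placeholders):
>>>>>
>>>>>     from datetime import datetime, timedelta, timezone
>>>>>     import requests
>>>>>
>>>>>     BASE_URL = "http://localhost:8080/api/v2"   # placeholder URL
>>>>>     HEADERS = {"Authorization": "Bearer TOKEN"}  # placeholder auth
>>>>>
>>>>>     # Per-Dag retention policy: (dag_id, state, days to keep).
>>>>>     RETENTION = [
>>>>>         ("high_volume_dag", "success", 3),
>>>>>         ("high_volume_dag", "failed", 14),
>>>>>     ]
>>>>>
>>>>>     for dag_id, state, days in RETENTION:
>>>>>         cutoff = (datetime.now(timezone.utc) - timedelta(days=days)).isoformat()
>>>>>         resp = requests.get(
>>>>>             f"{BASE_URL}/dags/{dag_id}/dagRuns",
>>>>>             params={"state": state, "end_date_lte": cutoff, "limit": 100},
>>>>>             headers=HEADERS,
>>>>>         )
>>>>>         resp.raise_for_status()
>>>>>         # A real sweep would page through results as in the loop above.
>>>>>         for run in resp.json()["dag_runs"]:
>>>>>             # DELETE removes the run and its task instances from the DB.
>>>>>             requests.delete(
>>>>>                 f"{BASE_URL}/dags/{dag_id}/dagRuns/{run['dag_run_id']}",
>>>>>                 headers=HEADERS,
>>>>>             ).raise_for_status()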
>>>>>>
>>>>>> Thanks & Regards,
>>>>>> Amogh Desai
>>>>>>
>>>>>> On Wed, Nov 5, 2025 at 3:12 AM Jens Scheffler <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Amogh for adding docs with migration hints.
>>>>>>>
>>>>>>> We actually have a lot of integrations that were built in the
>>>>>>> past, which now makes it a hard and serious effort to migrate to
>>>>>>> version 3. So most probably we ourselves need to take option 2,
>>>>>>> knowing (like in the past) that we cannot ask for support. But at
>>>>>>> least this un-blocks us from staying on 2.x.
>>>>>>>
>>>>>>> I'd love to take route 1 as well, but then a lot of code needs to
>>>>>>> be rewritten. This will take time, and in the mid term we will
>>>>>>> migrate to option 1.
>>>>>>>
>>>>>>> As said in the dev call, I'd love it if in Airflow 3.2 we could
>>>>>>> have option 1 supported out-of-the-box - knowing that some
>>>>>>> security discussion is implied, so it may need to be turned on
>>>>>>> explicitly and not be enabled by default.
>>>>>>>
>>>>>>> The use cases we have which require some kind of DB access and
>>>>>>> where the Task SDK does not help:
>>>>>>>
>>>>>>> * Adding task and dag run notes to tasks as better readable
>>>>>>> status while and after execution
>>>>>>> * Aggregate status of tasks in the upstream of the same Dag
>>>>>>> (pass, fail, listing)
>>>>>>> * Custom mass-triggering of other dags and collection of results
>>>>>>> from triggered dags as scale-out option for dynamic task mapping
>>>>>>> * Adjusting Pools based on available workers
>>>>>>> * Checking pass/fail results per edge worker and, depending on
>>>>>>> stability, adjusting Queues on Edge workers based on the status
>>>>>>> and errors of workers
>>>>>>> * Adjusting Pools based on the time of day
>>>>>>> * And the famous: Partial database clean on a per Dag level with
>>>>>>> different retention
>>>>>>>
>>>>>>> I would be okay with removing option 3, and a clear warning on
>>>>>>> option 2 is also okay.
>>>>>>>
>>>>>>> Jens
>>>>>>>
>>>>>>> On 11/4/25 13:06, Jarek Potiuk wrote:
>>>>>>>> My take (and details can be found in the discussion):
>>>>>>>>
>>>>>>>> 2. Don't give the impression it is something that we will
>>>>>>>> support - and explain to the users that it **WILL** break in the
>>>>>>>> future and that it's on **THEM** to fix it when it breaks.
>>>>>>>>
>>>>>>>> The 2 is **kinda** possible, but we should strongly discourage
>>>>>>>> it and say "this can break at any time and it's you who have to
>>>>>>>> adapt to any future changes in schema" - we have had a lot of
>>>>>>>> similar cases in the past where our users felt entitled when
>>>>>>>> **something** they saw as a "valid way of using things" was
>>>>>>>> broken by our changes. If we say "recommended", they will take
>>>>>>>> it as "all the usage there is expected to keep working when
>>>>>>>> Airflow gets a new version, so I should be fully entitled to
>>>>>>>> open a valid issue when things change". I think "recommended" in
>>>>>>>> this case is far too strong from our side.
>>>>>>>>
>>>>>>>> 3. Absolutely remove.
>>>>>>>>
>>>>>>>> Sounds like we are going back to Airflow 2 behaviour, and we've
>>>>>>>> made all the effort to break out of that. Various things will
>>>>>>>> start breaking in Airflow 3.2 and beyond. Once we complete the
>>>>>>>> task isolation work, Airflow workers will NOT have the
>>>>>>>> sqlalchemy package installed by default - it simply will not be
>>>>>>>> a task-sdk dependency. The fact that you **can** use sqlalchemy
>>>>>>>> now is mostly a by-product of the fact that we have not
>>>>>>>> completed the split yet - but it was not even **SUPPOSED** to
>>>>>>>> work.
>>>>>>>>
>>>>>>>> J.
>>>>>>>>
>>>>>>>> On Tue, Nov 4, 2025 at 10:03 AM Amogh Desai
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I'm working on expanding the Airflow 3 upgrade documentation to
>>>>>>>>> address a frequently asked question from users migrating from
>>>>>>>>> Airflow 2.x: "How do I access the metadata database from my
>>>>>>>>> tasks now that direct database access is blocked?"
>>>>>>>>>
>>>>>>>>> Currently, Step 5 of the upgrade guide [1] only mentions that
>>>>>>>>> direct DB access is blocked and points to a GitHub issue.
>>>>>>>>> However, users need concrete guidance on migration options.
>>>>>>>>>
>>>>>>>>> I've drafted documentation via [2] describing three approaches,
>>>>>>>>> but before finalising it, I'd like to get community consensus
>>>>>>>>> on how we should present these options, especially given the
>>>>>>>>> architectural principles we've established with Airflow 3.
>>>>>>>>>
>>>>>>>>> ## Proposed Approaches
>>>>>>>>>
>>>>>>>>> Approach 1: Airflow Python Client (REST API)
>>>>>>>>> - Uses `apache-airflow-client` [3] to interact via the REST API
>>>>>>>>> - Pros: No DB drivers needed, aligned with the Airflow 3
>>>>>>>>> architecture, API-first
>>>>>>>>> - Cons: Requires package installation, API server dependency,
>>>>>>>>> auth token management, limited operations possible
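>>>>>>>>>
>>>>>>>>> To give a flavour, a minimal sketch of e.g. resizing a Pool
>>>>>>>>> through the client (illustrative only: module and method names
>>>>>>>>> follow the generated client and may differ between client
>>>>>>>>> versions, and the auth setup is deployment-specific):
>>>>>>>>>
>>>>>>>>>     import airflow_client.client
>>>>>>>>>     from airflow_client.client.api import pool_api
>>>>>>>>>
>>>>>>>>>     # Placeholder host and token.
>>>>>>>>>     configuration = airflow_client.client.Configuration(
>>>>>>>>>         host="http://localhost:8080"
>>>>>>>>>     )
>>>>>>>>>     configuration.access_token = "TOKEN"
>>>>>>>>>
>>>>>>>>>     with airflow_client.client.ApiClient(configuration) as api_client:
>>>>>>>>>         pools = pool_api.PoolApi(api_client)
>>>>>>>>>         # Resize a (hypothetical) "workers" pool, e.g. based on the
>>>>>>>>>         # number of available workers. The request-body model differs
>>>>>>>>>         # between client versions; check the generated docs for the
>>>>>>>>>         # exact signature of patch_pool.
>>>>>>>>>         pools.patch_pool("workers", {"slots": 42})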
>>>>>>>>>
>>>>>>>>> Approach 2: Database Hooks (PostgresHook/MySqlHook)
>>>>>>>>> - Create a connection to the metadata DB and use DB hooks to
>>>>>>>>> execute SQL directly
>>>>>>>>> - Pros: Uses Airflow connection management, simple SQL interface
>>>>>>>>> - Cons: Requires DB drivers, direct network access, bypasses the
>>>>>>>>> Airflow API server and connects to the DB directly
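>>>>>>>>>
>>>>>>>>> A sketch of what the documented workaround would show (it
>>>>>>>>> assumes a "metadata_db" connection created by the deployment
>>>>>>>>> admin; the metadata schema is not a public interface, so a
>>>>>>>>> query like this can break on any upgrade):
>>>>>>>>>
>>>>>>>>>     from airflow.providers.postgres.hooks.postgres import PostgresHook
>>>>>>>>>
>>>>>>>>>     # "metadata_db" is a placeholder connection pointing at the
>>>>>>>>>     # metadata database.
>>>>>>>>>     hook = PostgresHook(postgres_conn_id="metadata_db")
>>>>>>>>>     # WARNING: queries the internal schema directly; column and
>>>>>>>>>     # table names can change in any Airflow release.
>>>>>>>>>     rows = hook.get_records(
>>>>>>>>>         "SELECT run_id, state FROM dag_run"
>>>>>>>>>         " WHERE dag_id = %s ORDER BY logical_date DESC LIMIT 10",
>>>>>>>>>         parameters=("my_dag",),
>>>>>>>>>     )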
>>>>>>>>>
>>>>>>>>> Approach 3: Direct SQLAlchemy Access (last resort)
>>>>>>>>> - Use an environment variable with the DB connection string and
>>>>>>>>> create a SQLAlchemy session directly
>>>>>>>>> - Pros: Maximum flexibility
>>>>>>>>> - Cons: Bypasses all Airflow protections, schema coupling,
>>>>>>>>> manual connection management; the worst possible option
>>>>>>>>>
>>>>>>>>> I was expecting some pushback regarding these approaches, and
>>>>>>>>> there were (rightly) some important concerns raised by Jarek
>>>>>>>>> about Approaches 2 and 3:
>>>>>>>>>
>>>>>>>>> 1. Breaks Task Isolation - Contradicts Airflow 3's core promise
>>>>>>>>> 2. DB as Public Interface - Schema changes would require
>>>>>>>>> release notes and break user code
>>>>>>>>> 3. Performance Impact - Approach 2 creates direct DB access and
>>>>>>>>> can bring back Airflow 2's connection-per-task overhead
>>>>>>>>> 4. Security Model Violation - Contradicts documented isolation
>>>>>>>>> principles
>>>>>>>>>
>>>>>>>>> Considering these comments, this is what I want to document now:
>>>>>>>>>
>>>>>>>>> 1. Approach 1 - Keep as the primary/recommended solution
>>>>>>>>> (aligns with the Airflow 3 architecture)
>>>>>>>>> 2. Approach 2 - Present as a "known workaround" (not a
>>>>>>>>> recommendation) with explicit warnings about breaking
>>>>>>>>> isolation, the schema not being a public API, performance
>>>>>>>>> implications, and no support guarantees
>>>>>>>>> 3. Approach 3 - Remove entirely, or keep with the strongest
>>>>>>>>> possible warnings (would love to hear what others think about
>>>>>>>>> this one in particular)
>>>>>>>>>
>>>>>>>>> Once we converge on the discussion points, I would like to call
>>>>>>>>> for a lazy consensus, for posterity and visibility in the
>>>>>>>>> community.
>>>>>>>>>
>>>>>>>>> Looking forward to your feedback!
>>>>>>>>>
>>>>>>>>> [1] https://github.com/apache/airflow/blob/main/airflow-core/docs/installation/upgrading_to_airflow3.rst#step-5-review-custom-operators-for-direct-db-access
>>>>>>>>> [2] https://github.com/apache/airflow/pull/57479
>>>>>>>>> [3] https://github.com/apache/airflow-client-python
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: [email protected]
>>>>> For additional commands, e-mail: [email protected]