Re: [DISCUSS] Persisting State in a Trigger

2025-07-31 Thread Vincent Beck
I commented on the AIP. Overall, I agree on the issue and that this is something we should solve. I brought this up in AIP-82 when I wrote it, calling it the "infinite scheduling issue" (without offering any solution :)). I also think this is a blocker to implement many different triggers based

Re: [DISCUSS] Persisting State in a Trigger

2025-07-28 Thread Jake Roach
Guangyang and I have created a draft AIP, which you can find here: https://cwiki.apache.org/confluence/display/AIRFLOW/%5BDRAFT%5D+AIP-93. Curious to get folks' thoughts and opinions! On Fri, Jul 25, 2025 at 9:12 PM Guangyang Li wrote: > Jake and I have put down the details in this AIP draft do

Re: [DISCUSS] Persisting State in a Trigger

2025-07-25 Thread Guangyang Li
Jake and I have put down the details in this AIP draft doc. Instead of calling this feature state persisting, we decided to call it asset watermark, as it's mainly an enhancement of asset-oriented proce

Re: [DISCUSS] Persisting State in a Trigger

2025-06-12 Thread Karen Braganza
I think the process_state model would be useful in the HttpEventTrigger that I am working on. The HttpEventTrigger sends requests to an API and triggers an event based on a user-defined response_check function. If the response_check function needs to evaluate multiple API responses cumulatively, it

Re: [DISCUSS] Persisting State in a Trigger

2025-06-12 Thread Daniel Standish
Alright, since I was summoned... When I was an Airflow user, I did a lot of incremental processes. Pretty much everything was incremental. Data warehousing / analytics shop / e-commerce reporting / integrations, this kind of thing. One common use case is implementing something like a Fivetran, wh

Re: [DISCUSS] Persisting State in a Trigger

2025-06-12 Thread Jarek Potiuk
Indeed, I think it's not very obvious whether metadata DB storage for state or external storage is better. I think we might underestimate the speed and latency of modern distributed object storage vs. our centralized API server (with essentially the HTTP overhead that S3 storage has per request + DB read/

Re: [DISCUSS] Persisting State in a Trigger

2025-06-11 Thread Ash Berlin-Taylor
Paging Daniel Standish: he’s got an old AIP (30, I think) that was targeting this problem, but the Airflow “model” has moved on a bit since, so it might need re-working for the Airflow 3 world. -ash > On 10 Jun 2025, at 19:08, Ryan Hatter wrote: > > Like... XComs it makes sense to ship to o

Re: [DISCUSS] Persisting State in a Trigger

2025-06-10 Thread Ryan Hatter
I don't think we should try to take on Kafka, but better supporting event-driven scheduling is one of the oft-repeated highlights of Airflow 3. IMO, it doesn't make sense to manage state using object storage. A simple model in Airflow would be suitable. On Mon, Jun 9, 2025 at 8:54 AM Jarek Potiuk

Re: [DISCUSS] Persisting State in a Trigger

2025-06-10 Thread Ryan Hatter
Like... for XComs, it makes sense to ship them to object storage, since it can be necessary to share large amounts of data between tasks. But something to track trigger state for event-driven scheduling should consistently be small? On Tue, Jun 10, 2025 at 1:58 PM Ryan Hatter wrote: > I don't think we shou

Re: [DISCUSS] Persisting State in a Trigger

2025-06-09 Thread Jarek Potiuk
Loose thoughts to spark the discussion: I think in the past, the basic architectural assumptions of Airflow made us say "no" to such requests, and to keeping state, because we had strong assumptions about data intervals and "idempotency" (You can look up devlist for our long past discussions what ide

[DISCUSS] Persisting State in a Trigger

2025-06-04 Thread Jake Roach
*Problem* Currently, there is no way to persist state within a Trigger, or within any Airflow object, for that matter. This presents a challenge to Airflow users, especially those looking to “watch” Assets. AssetWatchers leverage Triggers (inheriting from the BaseEventTrigger) to create Asset e
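
To make the problem concrete, here is a rough sketch of what persisted state would buy a polling watcher. Nothing in it is Airflow API: `FileStateStore`, `poll_once`, the integer `id` field, and the JSON file used as a stand-in store are assumptions for illustration only. The watcher records a watermark after each poll, so a restarted process does not re-emit records it has already seen.

```python
import json
import tempfile
from pathlib import Path


class FileStateStore:
    """Hypothetical stand-in for wherever trigger state would actually live."""

    def __init__(self, path: Path) -> None:
        self.path = Path(path)

    def load(self) -> dict:
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text())

    def save(self, state: dict) -> None:
        self.path.write_text(json.dumps(state))


def poll_once(records: list[dict], store: FileStateStore) -> list[dict]:
    """Return only records newer than the stored watermark, then advance it."""
    watermark = store.load().get("watermark", 0)
    new_records = [r for r in records if r["id"] > watermark]
    if new_records:
        store.save({"watermark": max(r["id"] for r in new_records)})
    return new_records


with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "watermark.json"
    source = [{"id": 1}, {"id": 2}]

    first = poll_once(source, FileStateStore(state_path))

    # A fresh store object on the same path simulates a trigger restart.
    source.append({"id": 3})
    second = poll_once(source, FileStateStore(state_path))
    third = poll_once(source, FileStateStore(state_path))
```

Without the stored watermark, the second poll would return all three records again, which is the repeated-event behavior a stateless trigger runs into.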