Re: [DISCUSS] Persisting State in a Trigger

2025-06-10 Thread Ryan Hatter
I don't think we should try to take on Kafka, but better supporting event-driven scheduling is one of the oft-repeated highlights of Airflow 3. IMO, it doesn't make sense to manage state using object storage. A simple model in Airflow would be suitable. On Mon, Jun 9, 2025 at 8:54 AM Jarek Potiuk

Re: [DISCUSS] Persisting State in a Trigger

2025-06-10 Thread Ryan Hatter
Like... XComs it makes sense to ship to object storage since it can be necessary to share large amounts of data between tasks. But something to track trigger state for event-driven scheduling should consistently be small? On Tue, Jun 10, 2025 at 1:58 PM Ryan Hatter wrote: > I don't

Re: [DISCUSS] Deadline Alert Callbacks

2025-05-21 Thread Ryan Hatter
+1 for option 2, primarily because of: It would be more robust and resilient, and therefore be able to run the > callbacks *even in presence of certain kinds of issues like the scheduler > being bogged-down* On Wed, May 21, 2025 at 5:09 PM Kataria, Ramit wrote: > Hi all, > > I’m working with

Re: [ANNOUNCE] S3 Docs Publishing

2025-05-14 Thread Ryan Hatter
Great work. This is something that no one wanted to do -- thanks for digging in here and making such a huge impact! On Mon, May 12, 2025 at 7:49 AM Vincent Beck wrote: > That is awesome!!! Huge huge improvement! > > On 2025/05/09 19:56:58 Vikram Koka wrote: > > Great job Pavan and team! > > 3 mi

Re: Apache - GSOC'25 projects / Contributions

2025-02-25 Thread Ryan Hatter
Feel free to DM me in the Airflow OSS Slack and I'd be happy to point you in the right direction :) On Mon, Feb 24, 2025 at 4:39 PM Bishundeo, Rajeshwar wrote: > Hi Mohith, > > It's always good to see newcomers to Airflow. To address your question > below on getting started with Airflow, here a

Re: Airflow should deprecate the term "DAG" for end users

2025-02-18 Thread Ryan Hatter
Long after opening this can of worms, I also agree with Daniel S: Let's define "DAG" in the context of Airflow and be done with it (at least for now :) ). I've opened a docs PR attempting to do just that: https://github.com/apache/airflow/pull/46875 O

Re: Updating "zombie task" terminology to "task heartbeat timeout"

2025-02-13 Thread Ryan Hatter
> > Tbh "heartbeat" itself is an overused term/concept in Airflow. I think we > already have 6 configurations with "heartbeat" in it, and they're different > types of heartbeats. Anyways, I am against this name change: > scheduler_zombie_task_threshold --> > scheduler_task_heartbeat_timeout_thr

Re: Updating "zombie task" terminology to "task heartbeat timeout"

2025-02-11 Thread Ryan Hatter
I love it. "heartbeat timeout" is obvious and has meaning in software beyond Airflow, so it makes sense to stick with this verbiage and use it to replace "zombie" in docs, configs, logs, and code IMO. On Tue, Feb 11, 2025 at 4:15 PM Karen Braganza wrote: > Hi, > > I have been working on this PR

Re: Very strange (AI generated) issues

2025-01-30 Thread Ryan Hatter
Here's a boilerplate response that I'm going to start using moving forward: Hello, > > This Issue appears to be AI-generated spam. If this is a mistake, please > let us know—otherwise, any further spam Issues may lead us to report your > account to GitHub. > On Mon, Jan 27, 2025 at 10:16 AM Jarek

Re: New committer: Gopal Dirisao (dirrao)

2024-10-29 Thread Ryan Hatter
Congrats! So well deserved! On Tue, Oct 29, 2024 at 2:40 PM Vikram Koka wrote: > Congratulations Gopal, well deserved! > > Vikram > > > On Tue, Oct 29, 2024 at 12:51 PM Oliveira, Niko > > wrote: > > > Congrats! Welcome aboard :) > > > > Cheers, > > Niko > > > >

Airflow should deprecate the term "DAG" for end users

2024-10-21 Thread Ryan Hatter
Everyone please sheathe your swords... at least for now. The term "DAG" has very little meaning to Airflow users. Indeed, it has little meaning outside of some mathematicians and software engineers for whom the properties of a DAG actually matter. For someone new to data engineering or workflow or

Re: [LAZY CONSENSUS] No "override pools" feature in backfill

2024-10-21 Thread Ryan Hatter
I could imagine a use case where pools are far more important during business hours and allowing backfills to ignore pools outside business hours would be very useful. On Mon, Oct 21, 2024 at 3:58 PM Daniel Standish wrote: > This was discussed in open source slack and I have documented the > dis

Re: [DISCUSS] Remove `max_active_tasks_per_dag`? Or at least the default

2024-10-04 Thread Ryan Hatter
I think I agree with this: I feel it should be applied at the dag *run* scope > and not across all dag runs. > Just a thought: If someone *did* want to run multiple DAG runs at the same time and limit the max active tasks per DAG, they could create a pool for that DAG and pass the pool in default

Re: [VOTE] Airflow 2.11 as bridge release

2024-09-09 Thread Ryan Hatter
If there are no features, why wouldn't we follow semver here and release 2.10.x? On Wed, Sep 4, 2024 at 9:07 PM Kaxil Naik wrote: > Hi all, > > As discussed in > https://lists.apache.org/thread/7jf12p2mk0nr5495f26r67gnpm3jq8oj I am > calling for a lazy consensus on marking 2.11 as a bridge

Re: [ANNOUNCE] New committer: Ryan Hatter

2024-07-01 Thread Ryan Hatter
gt; > > -Original Message- > > > > From: Jed Cunningham > > > > Sent: Friday, 28 June 2024 19:58 > > > > To: dev@airflow.apache.org > > > > Subject: [ANNOUNCE] New committer: Ryan Hatter > > > > > > > > The Project Managemen

Refactor Scheduler Timed Events to be Async?

2024-05-03 Thread Ryan Hatter
This might be a dumb question as I don't have experience with asyncio, but should the EventScheduler in the Airflow scheduler be rewritten to be asynchronous? The so-called "timed events" (e.g. zombie reaping,

Re: [CALL FOR HELP] Help on Connexion 3 migration needed

2024-04-16 Thread Ryan Hatter
Does the scope of this PR warrant an AIP? On Tue, Apr 16, 2024 at 6:40 AM Jarek Potiuk wrote: > Hello here, > > I have a kind request for help from maintainers (and other contributors who > are not maintainers) - on the Connexion 3 migration for Airflow. PR here > (unfortunately - it's one big P

Re: Subject: Expressing Interest in Contributing to Apache Airflow

2024-02-15 Thread Ryan Hatter
It's also worth checking out the community page, and specifically joining the Airflow community Slack. On Thu, Feb 15, 2024 at 10:24 AM Jarek Potiuk wrote: > Hello Rahul, > > There is nothing more than the Cont

Re: [LAZY CONSENSUS] Rename slack channels

2024-02-15 Thread Ryan Hatter
Ah! Great idea! On Tue, Feb 13, 2024 at 12:22 PM Akash Sharma <2akash111...@gmail.com> wrote: > +1 > > On Tue, 13 Feb 2024, 22:50 Briana Okyere, > wrote: > > > +1 > > > > On Mon, Feb 12, 2024 at 11:42 AM Jarek Potiuk wrote: > > > > > Hey here, > > > > > > Following the earlier discussion, I am

Re: [VOTE] AIP 61 - Hybrid Executors

2024-02-01 Thread Ryan Hatter
+1 non-binding. This will be a great feature. On Thu, Feb 1, 2024 at 1:27 PM Ferruzzi, Dennis wrote: > +1 binding > > > - ferruzzi > > > > From: Igor Kholopov > Sent: Thursday, February 1, 2024 5:31 AM > To: dev@airflow.apache.org > Subject: RE: [EXTERNAL] [COU

Re: [ANNOUNCE] Starting experimenting with "Require conversation resolution" setting

2024-01-30 Thread Ryan Hatter
In my experience outside of Airflow, the benefit of not missing a review comment outweighs the friction of being required to resolve each conversation. On Mon, Jan 29, 2024 at 8:47 PM Wei Lee wrote: > I didn't notice much of a difference as a contributor. +1 vote > > Best, > Wei > > > On Jan 30,

Re: [VOTE] January 2024 PR of the Month

2024-01-23 Thread Ryan Hatter
Gotta agree with Constance and go with 22253 -- how cool that the author stuck with it all this time! On Tue, Jan 23, 2024 at 12:25 AM Aritra Basu wrote: > My vote is for #36537 it's been a huge effort and it makes huge > improvements in our packaging. Great to see it make it into airflow. > > -

Re: [DISCUSSION] Enhanced Multi-Tenant Dataset Management in Airflow: Potential First Steps

2024-01-22 Thread Ryan Hatter
I don't think it makes sense to include the create endpoint without also including dataset update and delete endpoints and updating the Datasets view in the UI to be able to manage externally created Datasets. With that said, I don't think the fact that Datasets are tightly coupled with DAGs is a

Re: AIP-61 - Hybrid Executors

2024-01-18 Thread Ryan Hatter
d be different than it was before. And > note that this is how Airflow behaves today already. > > Does that clear things up? Let me know if it doesn't and I'll have a third > go at it :) > > Cheers, > Niko > > > From: Ryan

Re: [DISCUSSION] Enabling `pre-commit.ci` application for Airflow

2024-01-18 Thread Ryan Hatter
I'm in favor of this. I love making docs changes directly in GitHub, but I often make a tiny mistake like a trailing space and the tests fail. I think things like this discourage new contributors, as contributing to docs is the easiest way to start getting involved. On Thu, Jan 4, 2024 at 12:16 PM

Re: AIP-61 - Hybrid Executors

2024-01-18 Thread Ryan Hatter
> > *IMPORTANT NOTE*: task instances that run on the default/environment > executor (i.e. with no specific override provided) will not persist the > executor in the same way so that they can be re-run/retried on any executor. Does this mean that any task that doesn't have the `executor` parameter

Re: The "no_status" state

2023-11-28 Thread Ryan Hatter
t; > > > propose if scheduler passes along a task and decides that it is not > > ready > > > > to schedule to have an additional state calling e.g. “not_ready” in > the > > > > state model between “none” and “scheduled”. > > > > > > > &g

Re: [PROPOSE] Airflow Monthly Town-Hall

2023-11-28 Thread Ryan Hatter
I'd like to be involved! On Tue, Nov 28, 2023 at 4:16 PM Viraj Parekh wrote: > I think this is a great idea -- I've heard from a lot of folks in the > community that it can be hard to keep up with everything going on with > Airflow. I think the community is really good at communicating these thi

Re: [VOTE] November PR of the Month

2023-11-28 Thread Ryan Hatter
Another +1 for 32646... this will make DAG owners' lives much easier :) On Tue, Nov 28, 2023 at 4:21 PM Hussein Awala wrote: > +1 for #32646 > > On Tue, Nov 28, 2023 at 6:32 AM Rahul Vats wrote: > > > +1 for #32646 > > > > Regards, > > Rahul Vats > > > > On Tue, 28 Nov, 2023, 09:55 Aritra Basu,

Re: [VOTE] Add providers for Pinecone, OpenAI & Cohere to enable first-class LLMOps

2023-10-27 Thread Ryan Hatter
+1 (non-binding) On Thu, Oct 26, 2023 at 9:32 AM Oliveira, Niko wrote: > +1 (binding) > > looking forward to having more native LLM capabilities in Airflow! > > > From: Aritra Basu > Sent: Wednesday, October 25, 2023 12:10:00 PM > To: dev@airflow.apache.org > Su

Re: Airflow Docs Development Issues

2023-10-26 Thread Ryan Hatter
> > > >> > > > > > Maybe I jumped to conclusions, but the easiest, > > > tactical > > > > > > > >> solution > > > > > > > >> > > (for > > > > > > > >> >

Airflow Docs Development Issues

2023-10-18 Thread Ryan Hatter
*tl;dr* 1. The GitHub Action for building docs is running out of space. I think we should archive really old documentation for large packages to cloud storage. 2. Contributing to and building Airflow docs is hard. We should migrate to a framework, preferably one that uses markdown (

The "no_status" state

2023-09-28 Thread Ryan Hatter
Over the last couple weeks I've come across a rather tricky problem a few times. One DAG run gets "stuck" in the queued state, while subsequent DAG runs will be stuck running (screenshot below). One of these issues was caused by `max_active_runs` being met when a task instance from a previously run

Re: Airflow projects

2023-09-27 Thread Ryan Hatter
You can find some Airflow community resources here: https://airflow.apache.org/community/ Also feel free to join the Airflow community Slack: https://apache-airflow-slack.herokuapp.com/ On Mon, Sep 18, 2023 at 7:59 AM Avitabayan Sarmah wrote: > Thank you, I will do that. > > Regards, > Avitab A

Re: [VOTE] September 2023 PR of the Month

2023-09-27 Thread Ryan Hatter
Gotta go with #28900 -- what a huge scope! On Tue, Sep 26, 2023 at 1:45 PM Michael Robinson wrote: > Hi folks, > > It’s once again time to vote for the PR of the Month. > > With the help of the `get_important_pr_candidates` script in dev/stats, > I’ve identified the following candidates: > > *

Re: [VOTE] Airflow Providers prepared on September 08, 2023

2023-09-11 Thread Ryan Hatter
+1 (non-binding). My change works *mostly* as expected, and the unexpected behavior isn't really a problem On Mon, Sep 11, 2023 at 1:40 PM Josh Fell wrote: > +1 (non-binding) > > Tested my changes (and another related one).

Re: Lazy Consensus - Removing the Experimental tag for Pluggy

2023-09-09 Thread Ryan Hatter
+1 (non-binding) I've seen this used as a workaround for implementing a cluster policy when (for whatever reason) modifying airflow_local_settings.py is not possible. On Fri, Sep 8, 2023 at

Re: [DISCUSS] move from semver to a more "rolling" release cycle for core

2023-09-01 Thread Ryan Hatter
What about more frequently using news fragments + using configuration settings as a way to introduce breaking changes that users can revert? Disabling "trigger dag with config" without Params by default

Re: [VOTE] Drop MsSQL as supported backend

2023-08-30 Thread Ryan Hatter
+1 non-binding On Mon, Aug 28, 2023 at 3:29 PM Aritra Basu wrote: > +1 (non-binding) > Based on reading the previous mails, looks like a good idea to drop along > with the migration support > > -- > Regards, > Aritra Basu > > On Mon, Aug 28, 2023, 11:33 PM Oliveira, Niko > > wrote: > > > +1 (bi

Re: [DISCUSS] AIP-1 and Airflow multi-tenancy

2021-04-14 Thread Ryan Hatter
I’d also like to be added please :) > On Apr 13, 2021, at 21:27, Xinbin Huang wrote: > > Hi Daniel & Ian, > > I am also interested in the idea of a serialization representation that can > be executed by workers directly. Can you also add me to the call? > > Thanks > Bin > >> On Tue, Apr

Re: [DISCUSS] Add Breeze Support for CeleryExecutor and KubernetesExecutor

2021-03-27 Thread Ryan Hatter
uld add it then as > an option. > > BTW. We already have a number of CeleryExecutor tests that use the > integrations, so Breeze has all what's needed: > > > https://github.com/apache/airflow/blob/master/tests/executors/test_celery_executor.py#L109 > > J. > >

Re: [DISCUSS] Add Breeze Support for CeleryExecutor and KubernetesExecutor

2021-03-22 Thread Ryan Hatter
master/tests/executors/test_celery_executor.py#L109 > > J. > > On Mon, Mar 22, 2021 at 2:24 PM Ryan Hatter wrote: > >> Hmm, maybe I was just getting twisted around with docker then. I’ll have >> a look at what you shared. >> >> Thanks Bin :) >> >>

Re: [DISCUSS] Add Breeze Support for CeleryExecutor and KubernetesExecutor

2021-03-22 Thread Ryan Hatter
the flag --skip-mounting-local-sources. You can find > more details here > https://github.com/apache/airflow/blob/master/BREEZE.rst#mounting-local-sources-to-breeze > > Best > Bin > > >> On Sun, Mar 21, 2021 at 5:32 PM Ryan Hatter wrote: >> I recently had some

[DISCUSS] Add Breeze Support for CeleryExecutor and KubernetesExecutor

2021-03-21 Thread Ryan Hatter
I recently had some trouble trying to fix a bug in the CeleryExecutor. The code change was small, but it was really difficult to set up a development environment using the CeleryExecutor. I ultimately had to muck around with the test case that covers t

Re: dbt Provider

2021-03-05 Thread Ryan Hatter
flow-operator/ >>> >>> I'd much rather we work with dbt to do what ever is needed to make it a >>> full provider than pull it in tree when it already exists as a third party >>> package. >>> >>> -ash >>> >>>> On 3 March

Re: dbt Provider

2021-03-04 Thread Ryan Hatter
already exists as a third party package. > > -ash > >> On 3 March 2021 18:04:01 GMT, Ryan Hatter wrote: >> Hey all, >> >> dbt seems to continue to gain momentum. There's already an airflow-dbt >> project that is essentially a provider package. Would it

dbt Provider

2021-03-03 Thread Ryan Hatter
Hey all, dbt seems to continue to gain momentum. There's already an airflow-dbt project that is essentially a provider package. Would it make sense to fold the dbt_hook

Re: [VOTE] Airflow Providers - release candidates from 2021-02-27

2021-03-02 Thread Ryan Hatter
sting it. Providing that you tested it before >> with real GSuite account is for me enough of a confirmation ;). >> >> J. >> >> On Sun, Feb 28, 2021 at 10:00 PM Abdur-Rahmaan Janhangeer >> wrote: >>> Salutes for having a GSuite account just for th

Re: New Committers: James Timmins, Elad Kalif & Daniel Standish

2021-03-01 Thread Ryan Hatter
Awesome! Congrats all! > On Mar 1, 2021, at 11:20, Tomasz Urbaszek wrote: > > Congrats Elad, Daniel and James! Well deserved indeed! > > T. > >> On Mon, 1 Mar 2021 at 19:12, Jarek Potiuk wrote: >> Woohoo! >> >>> On Mon, Mar 1, 2021 at 6:04 PM Sumit Maheshwari >>> wrote: >>> Congratula

Re: [VOTE] Airflow Providers - release candidates from 2021-02-27

2021-02-28 Thread Ryan Hatter
ome testing.: >> >> * amazon : Cristòfol Torrens, Ruben Laguna, Arati Nagmal, Ivica Kolenkaš, >> JavierLopezT >> * apache.druid: Xinbin Huang >> * apache.spark: Igor Khrol >> * cncf.kubernetes: jpyen, Ash Berlin-Taylor, Daniel Imberman >> * google: Vivek

Re: Create official Apache Airflow publication on Medium.com

2021-02-20 Thread Ryan Hatter
I’d be interested in helping edit/proofread articles :) > On Feb 20, 2021, at 11:40, Deng Xiaodong wrote: > > It's a good idea to me! > > Maybe worthwhile discussing how the Publication will be managed to ensure > high quality, like: criteria of stories/articles to include, review/approval