Makes a lot of sense to me!

On 2026/04/19 13:56:56 Elad Kalif wrote:
> Great idea!
> Love it!
>
> I have some questions / comments:
> 1. The current interface suggests rules that contain a RetryRule object,
> but I wonder if we should change exception to exceptions and accept a
> list.
>
> rules=[
>     RetryRule(
>         exceptions=["requests.exceptions.HTTPError",
>                     "google.auth.exceptions.RefreshError"]
>         ...,
>     )]
>
> I'm thinking about a case where several exceptions need the same behaviour
> and the user may not wish to offer different reasoning for each.
>
> 2. Does it make sense to extend the interface for xcom values? I'm thinking
> about a case where dag authors don't have full control over the exception
> raised, or even some upstream library changing the exception, which results
> in the retry logic being broken. Maybe we should also offer the option to
> set retry based on the previous attempt's xcom value?
>
> 3. Maybe something for the longer run but still worth discussing - one of
> the main motivations for custom weight rules
> https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/priority-weight.html#custom-weight-rule
> was to set priority based on try number. I wonder if we may want to somehow
> combine it with the Retry rule. For retries, I can argue that the weight of
> the task is a property of retry instructions and it can very well be that
> the weight will change depending on the exception.
>
> On Sun, Apr 19, 2026 at 6:30 AM Shahar Epstein <[email protected]> wrote:
>
> > Great idea! I liked both the deterministic approach as well as the
> > AI-integrated one.
> >
> >
> > Shahar
> >
> > On Sat, Apr 18, 2026 at 3:02 AM Kaxil Naik <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > Continuing the push to make Airflow AI-native, I have put together
> > > AIP-105: Pluggable Retry Policies.
> > >
> > > Wiki:
> > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-105%3A+Pluggable+Retry+Policies
> > > PR (core): https://github.com/apache/airflow/pull/65450
> > > PR (LLM-powered, common-ai provider):
> > > https://github.com/apache/airflow/pull/65451
> > >
> > > The problem is straightforward: Airflow retries every failure the same
> > > way. An expired API key gets retried 3 times over 15 minutes. A
> > > rate-limited API gets retried immediately, hitting the same 429. Users
> > > who want smarter retries today have to wrap every task in try/except and
> > > raise AirflowFailException manually, mixing retry logic into business
> > > logic.
> > >
> > > This AIP adds a retry_policy parameter to BaseOperator. The policy
> > > evaluates the actual exception at failure time and returns RETRY (with a
> > > custom delay), FAIL (skip remaining retries), or DEFAULT (standard
> > > behaviour). It runs in the worker process, not the scheduler.
> > >
> > > Declarative example:
> > >
> > > ```python
> > > @task(
> > >     retries=5,
> > >     retry_policy=ExceptionRetryPolicy(
> > >         rules=[
> > >             RetryRule(
> > >                 exception="requests.exceptions.HTTPError",
> > >                 action=RetryAction.RETRY,
> > >                 retry_delay=timedelta(minutes=5)
> > >             ),
> > >             RetryRule(
> > >                 exception="google.auth.exceptions.RefreshError",
> > >                 action=RetryAction.FAIL
> > >             ),
> > >         ]
> > >     ),
> > > )
> > > def call_api():
> > >     ...
> > > ```
> > >
> > > LLM-powered example -- uses any pydantic-ai provider (OpenAI, Anthropic,
> > > Bedrock, Ollama):
> > >
> > > @task(retries=5, retry_policy=(llm_conn_id="my_llm"))
> > > def call_flaky_api(): ...
> > >
> > > The LLM version classifies errors into categories (auth, rate_limit,
> > > network, data, transient, permanent) using structured output with a
> > > 30-second timeout and declarative fallback rules for when the LLM itself
> > > is down.
> > >
> > > I have attached demo videos and screenshots to both PRs showing both
> > > policies running end-to-end in Airflow -- including the LLM correctly
> > > classifying 4 different error types via Claude Haiku.
> > >
> > > Full design, done criteria, and implementation details are in the wiki
> > > page above.
> > >
> > > Feedback welcome.
> > >
> > > Thanks,
> > > Kaxil
> > >
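
To make point 1 in the thread concrete, here is a rough sketch of how a rule carrying a list of exceptions might be matched against a raised exception. It is only an illustration under stated assumptions, not the AIP-105 implementation: `RetryAction` and `RetryRule` below are local stand-ins defined in the snippet (the `exceptions` field is the proposed variant, not the current `exception` one), and `qualified_names` and `decide` are hypothetical helpers. Matching on dotted names of the whole class hierarchy is also an assumption about how subclasses could be handled.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum
from typing import Optional, Sequence, Set, Tuple


class RetryAction(Enum):
    # Stand-in for the three outcomes named in the proposal.
    RETRY = "retry"
    FAIL = "fail"
    DEFAULT = "default"


@dataclass(frozen=True)
class RetryRule:
    # Hypothetical shape: several dotted exception paths share one action.
    exceptions: Tuple[str, ...]
    action: RetryAction
    retry_delay: Optional[timedelta] = None


def qualified_names(exc: BaseException) -> Set[str]:
    """Dotted names of the exception's class and every base class."""
    return {f"{cls.__module__}.{cls.__qualname__}" for cls in type(exc).__mro__}


def decide(
    exc: BaseException, rules: Sequence[RetryRule]
) -> Tuple[RetryAction, Optional[timedelta]]:
    """Return (action, delay) of the first matching rule, else DEFAULT."""
    names = qualified_names(exc)
    for rule in rules:
        if names.intersection(rule.exceptions):
            return rule.action, rule.retry_delay
    return RetryAction.DEFAULT, None


# Builtin exceptions are used so the sketch runs without third-party packages.
rules = [
    RetryRule(
        exceptions=("builtins.ConnectionError", "builtins.TimeoutError"),
        action=RetryAction.RETRY,
        retry_delay=timedelta(minutes=5),
    ),
    RetryRule(exceptions=("builtins.PermissionError",), action=RetryAction.FAIL),
]

retry_decision = decide(ConnectionResetError("peer reset"), rules)
fail_decision = decide(PermissionError("denied"), rules)
default_decision = decide(ValueError("bad input"), rules)
```

Comparing dotted names rather than importing the classes keeps the rules declarative, which seems to be the intent of the string-based `exception` field in the quoted example. Whether subclasses should match is a design choice the AIP would have to settle.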
