Hi team,

Thanks for deliberating on this and sorry for the delayed reply.

I agree with Ash: error codes seem pointless if we're going with
narrower exception classes.

In my opinion, error codes can ONLY add value for Airflow if they
identify the root cause or context. For example, raise
AirflowNotFoundException(error_code="AERR-SECRET-NOT-FOUND") conveys
that it's a not-found error, but in context it's a **secrets**
not-found problem.
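To make the contextual idea concrete, here is a minimal sketch. The classes below are stand-ins defined only for this example, and the error_code argument is hypothetical; as far as I know the real Airflow exceptions do not accept it.

```python
from __future__ import annotations


# Stand-in classes for illustration only; not the real Airflow exceptions.
class AirflowException(Exception):
    """Stand-in base exception."""


class AirflowNotFoundException(AirflowException):
    """Stand-in 'not found' exception carrying a hypothetical error code."""

    def __init__(self, message: str, error_code: str | None = None):
        super().__init__(message)
        # The code names the context; the class names the kind of failure.
        self.error_code = error_code


def get_secret(name: str, store: dict[str, str]) -> str:
    """Look up a secret, raising a context-specific 'not found' error."""
    if name not in store:
        raise AirflowNotFoundException(
            f"Secret {name!r} was not found",
            error_code="AERR-SECRET-NOT-FOUND",
        )
    return store[name]


try:
    get_secret("db_password", {})
except AirflowNotFoundException as exc:
    caught_code = exc.error_code
    print(caught_code, "-", exc)
```

The same exception class could be raised with a different code for, say, a missing connection, which is exactly the kind of context a 1:1 mapping cannot express.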

But the inclination in this thread has been towards a 1:1 mapping of
error codes to exceptions, rightly, to avoid complexity. That also makes
error codes redundant, and the side benefit of reduced LLM token use (NOT
to be misunderstood as generally better LLM performance) is simply not
enough reason for the additional complexity.

So yes, unless we're talking about using error codes contextually, I
suppose it'll be best to leave it at this and focus our energy on the
main reason we started this discussion in the first place: better error
messages (as Jarek and Sameer rightly mentioned).

**Better** as in there should be comprehensive docs (why it occurred,
what the user should do) for every exception, and not just "Raised when
this happened...". This is particularly needed for providers, where
exceptions are more likely to be generated (as Jens mentioned).
This continues what we already discussed above in this thread about
embedding error info in the exception class (
https://github.com/apache/airflow/pull/65423/files#diff-743dc549559a0d0682597ce4d917ac237c145af5a53bbb386aaf1cae24adb1eeR57-R82
).

We could simply keep the error metadata alongside the exception's docstring:

class AirflowNotFoundException(AirflowException):
    """Raise when the requested object/resource is not available..."""

    user_facing_error_message = "Requested resource was not found"
    description = "This error occurs when Airflow is unable to locate..."
    first_steps = "Verify that the requested..."
    documentation = "https://airflow.apache.org/docs/..."

No error mixin or additional properties, just the plain docstring used to
generate a doc page for the exception classes. No breaking changes would
be required, and support for providers would come out of the box with the
docstring.
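As a rough sketch of what the doc-generation step could look like (not a proposal for the actual implementation): the classes are stand-ins defined here so the example runs on its own, and META_FIELDS and render_exception_doc are hypothetical names I made up for illustration.

```python
import inspect


# Stand-in classes so the sketch is self-contained; not the real Airflow ones.
class AirflowException(Exception):
    """Stand-in base class for this sketch."""


class AirflowNotFoundException(AirflowException):
    """Raise when the requested object/resource is not available..."""

    user_facing_error_message = "Requested resource was not found"
    first_steps = "Verify that the requested..."


# Hypothetical field names, mirroring the example in this email.
META_FIELDS = (
    "user_facing_error_message",
    "description",
    "first_steps",
    "documentation",
)


def render_exception_doc(exc_class: type) -> str:
    """Build a plain-text doc entry from the docstring and any metadata present."""
    name = exc_class.__name__
    lines = [name, "=" * len(name), ""]
    lines.append(inspect.getdoc(exc_class) or "No description available.")
    lines.append("")
    for field in META_FIELDS:
        # Fields a class does not define are skipped rather than required.
        value = getattr(exc_class, field, None)
        if value:
            lines.append(f"{field}: {value}")
    return "\n".join(lines)


page = render_exception_doc(AirflowNotFoundException)
print(page)
```

Because missing fields are simply skipped, an exception that only has a docstring would still get a page, which is why I think older provider versions would not break.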

Let me know if you have any thoughts or concerns on this approach, thanks.

Regards,

Omkar

On Mon, May 18, 2026 at 9:26 PM Jarek Potiuk wrote:

> > I am +1 on Jarek's grouping idea but even then I am not really sure
> > what additional value it gives over just having better/descriptive
> > error messages.
>
> > I think better error messages themselves would probably help users
> > more than error codes. Clear exceptions with actionable context are
> > already searchable and reasonably LLM-friendly without introducing
> > another taxonomy layer on top of Python exceptions.
>
> Changing my vote to -0.5. I thought about it a lot, and this is the most
> valid point actually. Possibly what we **really** want is to ensure our
> error descriptions are good. I also discussed it today with a friend. The
> point of the discussion was that the documentation should be easy for
> agents to read. Surprisingly, unique error codes are quite a bit better
> for humans than for agents: agents will find their way in even slightly
> chaotic text as long as the description of the error is good and somewhat
> actionable.
>
> J
>
>
> On Mon, May 18, 2026 at 10:21 PM Sameer Mesiah wrote:
>
> > Hi,
> >
> > Thanks for starting this discussion.
> >
> > Overall, I am +0 on this. I have to agree somewhat with Ash here
> > because I am struggling to see how these AERR codes add enough value
> > to justify the additional complexity/process around them.
> >
> > I can see some benefit from an observability/aggregation perspective
> > (timeouts, auth failures, transient network issues, etc.) but I am not
> > convinced globally defined sequential error codes are the right solution
> > for this, especially because most operational exceptions in Airflow come
> > from providers, not core. I agree with Jens here that this feature
> > becomes nearly useless if it does not cover providers.
> >
> > I am +1 on Jarek's grouping idea but even then I am not really sure what
> > additional value it gives over just having better/descriptive error
> > messages.
> >
> > One thing that stands out to me (which has already been hinted at by Ash)
> > is that providers have increasingly been moving away from wrapping
> > everything in AirflowException and instead preserving native SDK/runtime
> > exceptions and/or using provider-specific exceptions where it makes
> > sense. I am not really sure how that direction fits with a centralized
> > AERR registry (if you can clarify this that would be great!). Otherwise
> > we end up in a situation where core exceptions have codes while provider
> > exceptions (which are the majority of real-world failures) do not.
> >
> > To me, the strongest argument for structured IDs/categories would
> > actually be observability systems that need a strict
> > filtering/aggregation field. I can definitely see value there. But I
> > think the LLM argument is weaker since LLMs already work reasonably
> > well with descriptive exception messages/stack traces as-is.
> >
> > The provider rollout also feels tricky to me. For example, if a user is
> > running the latest version of provider A (with this error system) but an
> > older version of provider B (without it), then the same Airflow
> > deployment suddenly produces two different styles of errors depending on
> > which provider raised the exception. That inconsistency feels difficult
> > to avoid in an independently versioned provider ecosystem like Airflow’s
> > (and I don't believe implementing this in the common provider is enough
> > to mitigate this concern).
> >
> > I think better error messages themselves would probably help users more
> > than error codes. Clear exceptions with actionable context are already
> > searchable and reasonably LLM-friendly without introducing another
> > taxonomy layer on top of Python exceptions.
> >
> > Thanks,
> > Sameer Mesiah.
> >
>
