> I am +1 on Jarek's grouping idea but even then I am not really sure what additional value it gives over just having better/descriptive error messages.
> I think better error messages themselves would probably help users more than error codes. Clear exceptions with actionable context are already searchable and reasonably LLM-friendly without introducing another taxonomy layer on top of Python exceptions.

Changing my vote to -0.5. I thought about it a lot, and this is actually the most valid point. Possibly what we **really** want is to ensure our error descriptions are good.

I also discussed it today with a friend. The point of the discussion was that the documentation should be easy for agents to read. Surprisingly, unique error codes are quite a bit better for humans than for agents: agents will find their way in even slightly chaotic text as long as the description of the error is good and somewhat actionable.

J

On Mon, May 18, 2026 at 10:21 PM Sameer Mesiah <[email protected]> wrote:

> Hi,
>
> Thanks for starting this discussion.
>
> Overall, I am +0 on this. I have to agree somewhat with Ash here because I
> am struggling to see how these AERR codes add enough value to justify the
> additional complexity/process around them.
>
> I can see some benefit from an observability/aggregation perspective
> (timeouts, auth failures, transient network issues, etc.) but I am not
> convinced globally defined sequential error codes are the right solution
> for this. Especially because most operational exceptions in Airflow come
> from providers, not core. I agree with Jens here that this feature becomes
> nearly useless if it does not cover providers.
>
> I am +1 on Jarek's grouping idea but even then I am not really sure what
> additional value it gives over just having better/descriptive error
> messages.
>
> One thing that stands out to me (which has already been hinted at by Ash)
> is that providers have increasingly been moving away from wrapping
> everything in AirflowException and instead preserving native SDK/runtime
> exceptions and/or using provider-specific exceptions where it makes sense.
> I am not really sure how that direction fits with a centralized AERR
> registry (if you can clarify this that would be great!). Otherwise we end
> up in a situation where core exceptions have codes while provider
> exceptions (which are the majority of real-world failures) do not.
>
> To me, the strongest argument for structured IDs/categories would actually
> be observability systems that need a strict filtering/aggregation field. I
> can definitely see value there. But I think the LLM argument is weaker
> since LLMs already work reasonably well with descriptive exception
> messages/stack traces as-is.
>
> The provider rollout also feels tricky to me. For example, if a user is
> running the latest version of provider A (with this error system) but an
> older version of provider B (without it), then the same Airflow deployment
> suddenly produces two different styles of errors depending on which
> provider raised the exception. That inconsistency feels difficult to avoid
> in an independently versioned provider ecosystem like Airflow’s (and I
> don't believe implementing this in the common provider is enough to
> mitigate this concern).
>
> I think better error messages themselves would probably help users more
> than error codes. Clear exceptions with actionable context are already
> searchable and reasonably LLM-friendly without introducing another taxonomy
> layer on top of Python exceptions.
>
> Thanks,
> Sameer Mesiah.
>
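
For illustration, here is a minimal sketch of what "clear exceptions with actionable context" plus a coarse grouping field for observability could look like. All names here (`ExampleProviderTimeout`, `category`, `to_log_fields`) are hypothetical and not part of Airflow or any provider; this only illustrates the idea and is not a proposed implementation.

```python
class ExampleProviderTimeout(Exception):
    """Hypothetical provider-specific exception (not an Airflow class)."""

    # Coarse group for filtering/aggregation; an assumed attribute name,
    # not a globally registered sequential error code.
    category = "timeout"

    def __init__(self, service: str, timeout_s: float, hint: str) -> None:
        self.service = service
        self.timeout_s = timeout_s
        self.hint = hint
        # The message itself carries the actionable context.
        super().__init__(
            f"Request to {service} timed out after {timeout_s:g}s. {hint}"
        )


def to_log_fields(exc: BaseException) -> dict:
    """Build structured log fields for any exception.

    Exceptions without a `category` attribute (for example, ones raised by
    an older provider or a native SDK) fall back to "uncategorized".
    """
    return {
        "error_type": type(exc).__name__,
        "error_category": getattr(exc, "category", "uncategorized"),
        "message": str(exc),
    }
```

With this shape, an exception from a provider that has not adopted the attribute still produces usable log fields, so the descriptive message does the main work and the category is only an optional aggregation key.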
