Hi team,

Starting this thread to discuss the design of Airflow error codes. These
are LLM-friendly strings prefixed with AERR that Airflow devs can attach
when raising exceptions, to convey the error context to DAG users
succinctly. The current design is described below.

PR: https://github.com/apache/airflow/pull/65423

Feature flow:
1. An Airflow dev identifies an error case and defines a new error code
(say AERR002) in the error mapping YAML.
2. The dev adds AirflowErrorCodeMixin to the exception class that they
want to raise with an error_code.
3. The dev specifies the error_code at the raise site (e.g. raise
AirflowNotFoundException(..., error_code="AERR002")).
4. The dev runs breeze build-docs, which generates a new docs page,
AERR002.rst.
5. A breeze static check validates that the error code is mapped to the
correct exception class.
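
To make steps 2 and 3 concrete, here is a rough sketch of how the mixin
and a raise site could fit together. This is not the code from the PR:
the mixin body, the "[AERR002] message" formatting, the stand-in
exception class, and the get_connection helper are all assumptions made
for illustration.

```python
class AirflowErrorCodeMixin:
    """Hypothetical mixin body: stores an optional error code on the exception."""

    def __init__(self, *args, error_code=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.error_code = error_code

    def __str__(self):
        # Prefix the message with the code so it shows up in the stack trace.
        message = super().__str__()
        if self.error_code:
            return f"[{self.error_code}] {message}"
        return message


class AirflowNotFoundException(AirflowErrorCodeMixin, Exception):
    """Stand-in for the real Airflow class, redefined so the sketch runs alone."""


def get_connection(conn_id):
    # Hypothetical raise site passing the code defined in the YAML registry.
    raise AirflowNotFoundException(
        f"The conn_id `{conn_id}` isn't defined", error_code="AERR002"
    )
```

With this shape, a raise without error_code still works and simply
prints the plain message, which is one possible answer to open question
1 below.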

User side:
Airflow users now see the error code as part of the stack trace and can
quote it when reporting problems instead of pasting verbose stack
traces. Error codes also improve LLM-based discovery of Airflow errors,
since a code is far more deterministic and well-defined than a plain
stack trace.

Open questions:
1. Should the error code be mandatory for all raises of an exception
class that uses them?
2. Where should the error code info be stored? Is a YAML-based registry
good enough?
3. Should we have a 1:1 mapping between an error code and an exception
class, e.g. AirflowNotFoundException mapped only to AERR002? (The current
implementation in the PR supports a many-to-one mapping: one exception
class can carry multiple error codes, depending on the context.)
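
To help frame questions 2 and 3, here is a small sketch of what the
static check could look like against a parsed registry. The registry
shape (code -> {"exception": ...}), the function names, and the second
code AERR003 are assumptions for illustration only, not what the PR
implements.

```python
from collections import defaultdict

# Assumed shape of the YAML registry once parsed into a dict.
REGISTRY = {
    "AERR002": {"exception": "AirflowNotFoundException"},
    "AERR003": {"exception": "AirflowNotFoundException"},  # hypothetical code
}


def is_valid_raise(exception_name, error_code, registry=REGISTRY):
    """Return True if the code exists and is mapped to this exception class."""
    entry = registry.get(error_code)
    return entry is not None and entry["exception"] == exception_name


def exceptions_with_multiple_codes(registry=REGISTRY):
    """List exception classes that carry more than one code (many-to-one)."""
    codes = defaultdict(list)
    for code, entry in registry.items():
        codes[entry["exception"]].append(code)
    return {exc: sorted(found) for exc, found in codes.items() if len(found) > 1}
```

If we settle on a strict 1:1 mapping, the check would fail whenever
exceptions_with_multiple_codes() returns anything; with many-to-one it
would only need is_valid_raise() at each raise site.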

Looking forward to your thoughts on the open questions above, or any
other design suggestions you'd like to add. Thanks!

Regards,
Omkar
