Hi everyone,

I would like to start a discussion regarding the configuration design for
the new REST API and UI metrics feature currently being proposed in PR
#64523 <https://github.com/apache/airflow/pull/64523>.

The core capability adds request monitoring (QPS, latency, and errors) to
Airflow’s FastAPI/Starlette stack. While there is general agreement that
having these metrics in production would be highly valuable, we have hit a
architectural design question regarding how users should configure these
metrics, and we’d love to get the community's feedback.

The Core Question

How flexible should the mapping between URL prefixes and metric tags
(api_surface) be for end-users? Currently, we have two design directions:

Option A (Current PR Design): Fully User-Configurable via JSON

This allows deployment managers to define a custom mapping in airflow.cfg
(e.g., {"/api/v2": "public", "/ui": "ui", "/execution": "execution",
"/my-plugin": "plugin"}).

   - *Pros:* High flexibility. It allows users to gain metrics coverage for
     additional custom mounted APIs, Execution APIs, or third-party plugin
     routes without needing Airflow core code changes.
   - *Cons:* Adds slightly more complexity to the configuration.
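
To make Option A concrete, here is a minimal sketch of how a JSON prefix mapping could be turned into an api_surface tag for a request path. This is not the code in the PR: the function names, the longest-prefix-wins rule, and the "other" fallback tag are all assumptions made for illustration.

```python
import json
from typing import List, Tuple

# Example value taken from the mapping shown above; how it is read from
# airflow.cfg (section and option name) is deliberately left out.
RAW_MAPPING = (
    '{"/api/v2": "public", "/ui": "ui", '
    '"/execution": "execution", "/my-plugin": "plugin"}'
)


def load_prefix_mapping(raw: str) -> List[Tuple[str, str]]:
    """Parse the JSON mapping and order prefixes longest-first (assumed rule)."""
    mapping = json.loads(raw)
    if not isinstance(mapping, dict):
        raise ValueError("prefix mapping must be a JSON object")
    return sorted(mapping.items(), key=lambda item: len(item[0]), reverse=True)


def resolve_api_surface(
    path: str, prefixes: List[Tuple[str, str]], default: str = "other"
) -> str:
    """Return the tag of the first prefix matching on a path-segment boundary."""
    for prefix, tag in prefixes:
        base = prefix.rstrip("/")
        if path == prefix or path.startswith(base + "/"):
            return tag
    # Hypothetical fallback tag for paths that no prefix covers.
    return default


prefixes = load_prefix_mapping(RAW_MAPPING)
print(resolve_api_surface("/api/v2/dags", prefixes))
print(resolve_api_surface("/my-plugin/status", prefixes))
```

Matching on a segment boundary means a path such as /uix would not be tagged as "ui", which is one of the details a user-supplied mapping would need to define clearly.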

Option B (Alternative Suggestion): Hardcoded in Code with Simple Toggles

Airflow maintainers hardcode the specific route-to-tag mappings (e.g.,
strictly /api/v2 and /ui) directly in the codebase. Users would only have a
simple boolean flag to turn the metrics on or off, but cannot customize the
URL-to-tag mappings.

   - *Pros:* Simpler configuration, lower cognitive load for the average user.
   - *Cons:* Lacks extensibility for custom plugins or enterprise-specific
     API extensions.
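
For comparison, Option B could look roughly like the sketch below: the mapping lives in code and the only user-facing input is a boolean. The constant name, the function name, and the way the flag is passed in are hypothetical; the two prefixes are the ones mentioned above.

```python
from typing import Optional, Tuple

# Hardcoded by maintainers; not user-configurable in this option.
HARDCODED_SURFACES: Tuple[Tuple[str, str], ...] = (
    ("/api/v2", "public"),
    ("/ui", "ui"),
)


def api_surface_for(path: str, metrics_enabled: bool) -> Optional[str]:
    """Return the tag for a path, or None when metrics are off or the path is unmapped."""
    if not metrics_enabled:
        return None
    for prefix, tag in HARDCODED_SURFACES:
        if path == prefix or path.startswith(prefix + "/"):
            return tag
    # Plugin or custom routes are simply not covered in this option.
    return None


print(api_surface_for("/api/v2/dags", metrics_enabled=True))
print(api_surface_for("/my-plugin/status", metrics_enabled=True))
```

The second call returns None, which illustrates the extensibility gap: a plugin route gets no metrics tag unless the hardcoded tuple is changed in Airflow itself.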

Discussion Context

Jason suggested that allowing users to define the mapping themselves
(Option A) provides the necessary extensibility for production environments
where custom plugins or additional mounted endpoints are heavily used.
On the other hand, ashb raised concerns about whether users actually need
this level of configurability and whether it duplicates existing metric
filtering mechanisms.

I really appreciate Jason, ashb, Pierre, and Jens taking the time to share
their insights on this.

We would appreciate your thoughts on which approach makes more sense for
Airflow's architecture moving forward.

You can find the full discussion and implementation details in the PR here:
https://github.com/apache/airflow/pull/64523

Thanks,
Henry
