This is an automated email from the ASF dual-hosted git repository.

vatsrahul1001 pushed a commit to branch v3-2-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 17e44700acd593e9b8cbaf6512273c3debb7aeaf
Author: Rahul Vats <[email protected]>
AuthorDate: Mon May 25 18:34:31 2026 +0530

    Docs: refresh JWT and security model for v3.2 with mermaid diagrams (#67435) (#67466)

    * Docs: refresh JWT and security model for v3.2 with mermaid diagrams

    Catch up the public security documentation to match the security-relevant
    changes flowing into the 3.2 release branch. Adds six mermaid diagrams
    (four in jwt_token_authentication.rst, two in security_model.rst) and
    documents:

    - Typed TIClaims Pydantic schema validation of Execution API tokens.
    - Unconditional revoke_token() on /auth/logout so external IdP redirects
      no longer leave the Airflow JWT valid.
    - Router-level Depends(get_user) as a defense-in-depth backstop on
      /api/v2 and /ui.
    - ExecutionAPISecretsBackend raising PermissionError on 401/403 so a deny
      no longer falls through to less-restrictive backends.
    - Tightened deserialization allowlist regex (full-string match).

    Registers sphinxcontrib-mermaid as a new docs dependency in devel-common
    and BASIC_SPHINX_EXTENSIONS.

    * Docs: improve security-diagram readability and add credential matrix

    - Replace the arrow-spaghetti credential-distribution mermaid with a
      component-grouped layout (least- to most-privileged left-to-right)
      plus an explicit RST table for true matrix lookup.
    - Bump all six security-diagram color palettes from very-pale tints to
      medium-saturation fills with explicit black text and 2px strokes, so
      labels stay readable in both light and dark mode renderers.

    * Fix HTTP verb in JWT auth mermaid diagram (PATCH, not POST)

    The /run endpoint is PATCH /{task_instance_id}/run, not POST. Spotted in
    review of #67435.

    (cherry picked from commit 0a506b18b9dd1e86df52302677c41f0c078f8de2)

    Co-authored-by: Jarek Potiuk <[email protected]>
---
 .../docs/security/jwt_token_authentication.rst     | 135 ++++++++++++++-
 airflow-core/docs/security/security_model.rst      | 192 +++++++++++++++++++++
 devel-common/pyproject.toml                        |   1 +
 devel-common/src/docs/utils/conf_constants.py      |   1 +
 uv.lock                                            |  18 ++
 5 files changed, 345 insertions(+), 2 deletions(-)

diff --git a/airflow-core/docs/security/jwt_token_authentication.rst b/airflow-core/docs/security/jwt_token_authentication.rst
index e823c9f787a..286ca2f7308 100644
--- a/airflow-core/docs/security/jwt_token_authentication.rst
+++ b/airflow-core/docs/security/jwt_token_authentication.rst
@@ -40,6 +40,31 @@ Both flows share the same underlying JWT infrastructure (``JWTGenerator`` and ``
 classes in ``airflow.api_fastapi.auth.tokens``) but differ in audience, token lifetime, subject claims,
 and scope semantics.
 
+.. mermaid::
+
+    flowchart LR
+        subgraph Clients
+            UI[UI / browser]
+            CLI[CLI]
+            EXT[External REST clients]
+        end
+        subgraph Internal["Internal Airflow components"]
+            WORKER[Worker / Task]
+            DFP[Dag File Processor]
+            TRG[Triggerer]
+        end
+        APISVR[API Server]
+        EXECAPI[Execution API]
+        UI -->|JWT cookie / Bearer| APISVR
+        CLI -->|Bearer| APISVR
+        EXT -->|Bearer| APISVR
+        WORKER -->|Bearer<br/>workload → execution| EXECAPI
+        DFP -. in-process<br/>JWT bypassed .-> EXECAPI
+        TRG -. in-process<br/>JWT bypassed .-> EXECAPI
+
+        classDef internal fill:#bbdefb,stroke:#1565c0,stroke-width:2px,color:#000
+        class WORKER,DFP,TRG internal
+
 Signing and Cryptography
 ------------------------
 
@@ -66,6 +91,39 @@ Airflow supports two mutually exclusive signing modes:
   - The public key derived from the configured private key (automatic fallback when
     ``trusted_jwks_url`` is not set).
 
+.. mermaid::
+
+    flowchart TB
+        subgraph Sym["Symmetric (HS512)"]
+            direction LR
+            S1[Scheduler / API Server]
+            S2[Shared secret<br/>jwt_secret]
+            S3[Token validator]
+            S1 -->|sign| S2 -->|same secret<br/>also validates| S3
+        end
+        subgraph Asym["Asymmetric (RS256 / EdDSA)"]
+            direction LR
+            A1[Scheduler / API Server]
+            A2[Private key<br/>jwt_private_key_path]
+            A3[Public key /<br/>JWKS endpoint]
+            A4[Token validator]
+            A1 -->|sign| A2
+            A2 -. derives or<br/>publishes .-> A3
+            A3 -->|verify only| A4
+        end
+
+        classDef secret fill:#ffcdd2,stroke:#c62828,stroke-width:2px,color:#000
+        classDef pub fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#000
+        class S2 secret
+        class A2 secret
+        class A3 pub
+
+In asymmetric mode, validators (workers, downstream services) need only the public key — the
+private signing key can be tightly scoped to the issuing components (API Server, Scheduler).
+In symmetric mode, any component that can validate tokens can also forge them, because there
+is only one key. See :ref:`jwt-authentication-and-workload-isolation` for the deployment
+implications.
+
 REST API Authentication Flow
 -----------------------------
 
@@ -131,6 +189,14 @@ Revoked tokens are tracked in the ``revoked_token`` database table by their ``jt
 On logout or explicit revocation, the token's ``jti`` and ``exp`` are inserted into this table.
 Expired entries are automatically cleaned up at a cadence of ``2× jwt_expiration_time``.
 
+The ``/auth/logout`` endpoint always invokes ``auth_manager.revoke_token()`` before any
+redirect or cookie deletion. This includes deployments where the configured auth manager
+(for example ``FabAuthManager`` or ``KeycloakAuthManager``) redirects the user to an external
+logout URL — the JWT is invalidated in Airflow's ``revoked_token`` table regardless of what
+the external Identity Provider does with its own session. The ``revoke_token`` call is
+unconditional; auth managers that do not implement server-side revocation can keep the
+default no-op implementation.
+
 Token refresh (REST API)
 ^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -242,6 +308,37 @@ The token flows through the execution stack as follows:
 6. The client's ``_update_auth()`` hook detects the header and transparently updates
    the ``BearerAuth`` instance to use the new ``execution`` token for all subsequent requests.
 
+.. mermaid::
+
+    sequenceDiagram
+        autonumber
+        participant SCH as Scheduler
+        participant EXE as Executor<br/>(Celery / K8s / Local)
+        participant WRK as Worker
+        participant API as Execution API
+
+        Note over SCH: Task ready to dispatch
+        SCH->>SCH: generate workload token<br/>scope=workload<br/>exp = task_queued_timeout
+        SCH->>EXE: workload JSON<br/>(includes token)
+        Note over EXE: Task waits in queue<br/>(can be minutes)
+        EXE->>WRK: dispatch (workload JSON)
+        WRK->>API: PATCH /run<br/>Bearer: workload token
+        Note over API: validates workload scope<br/>checks TI in QUEUED/RESTARTING<br/>409 if not
+        API-->>WRK: 200 OK<br/>Refreshed-API-Token: execution token<br/>(scope=execution, ~10 min)
+        WRK->>WRK: BearerAuth swaps to<br/>execution token
+        loop For all subsequent calls (heartbeats, XComs, ...)
+            WRK->>API: Bearer: execution token
+            alt token expiring (less than 20% left)
+                API-->>WRK: 200 OK<br/>Refreshed-API-Token: new execution token
+                WRK->>WRK: BearerAuth swaps again
+            end
+        end
+
+Even if a workload token is intercepted in transit, it can only call ``/run``. That endpoint
+rejects re-runs (``409 Conflict`` unless the task instance is in ``QUEUED`` or ``RESTARTING``),
+so the attack surface for the longer-lived token is bounded to "start a task that is already
+queued". All other endpoints require ``scope=execution`` and reject workload tokens.
+
 Token validation (Execution API)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -251,8 +348,16 @@ The ``JWTBearer`` security dependency validates the token once per request:
 2. Performs cryptographic signature validation via ``JWTValidator``.
 3. Verifies standard claims (``exp``, ``iat``, ``aud`` — ``nbf`` and ``iss`` if configured).
 4. Defaults the ``scope`` claim to ``"execution"`` if absent.
-5. Creates a ``TIToken`` object with the task instance ID and claims.
-6. Caches the validated token on the ASGI request scope for the duration of the request.
+5. **Validates the task identity claims against a typed Pydantic schema.** ``TIClaims``
+   (in ``airflow.api_fastapi.execution_api.datamodels.token``) enforces that ``scope`` is one
+   of the declared ``TokenScope`` literals (``"execution"`` or ``"workload"``); ``TIToken``
+   then parses the ``sub`` claim through a ``UUID`` field, which rejects non-UUID values.
+   A token whose ``scope`` is unknown, or whose ``sub`` is not a valid UUID, is rejected with
+   ``403 Forbidden`` even when the cryptographic signature checks pass. ``TIClaims`` keeps
+   ``extra="allow"`` so auth managers can attach additional, deployment-specific claims
+   without modifying the core schema; only the security-critical fields are typed.
+6. Creates a ``TIToken`` object with the task instance ID and validated claims.
+7. Caches the validated token on the ASGI request scope for the duration of the request.
 
 Route-level enforcement is handled by ``require_auth``:
 
@@ -262,6 +367,32 @@ Route-level enforcement is handled by ``require_auth``:
   ``{task_instance_id}`` path parameter, preventing a worker from accessing another task's
   endpoints.
 
+.. mermaid::
+
+    flowchart TD
+        REQ([Incoming request<br/>Authorization: Bearer ...])
+        REQ --> CACHE{Cached on<br/>request.scope?}
+        CACHE -->|yes| RET([Return cached TIToken])
+        CACHE -->|no| SIG[JWTValidator:<br/>verify signature]
+        SIG -->|fail| F1([403 Forbidden])
+        SIG -->|ok| STD[Verify exp / iat / nbf<br/>aud / iss]
+        STD -->|fail| F1
+        STD -->|ok| SCOPE[Default scope to<br/>'execution' if absent]
+        SCOPE --> SCHEMA[TIClaims:<br/>typed Pydantic schema]
+        SCHEMA -->|ValidationError| F1
+        SCHEMA -->|ok| TYP{require_auth:<br/>scope in<br/>route.allowed_token_types?}
+        TYP -->|no| F1
+        TYP -->|yes| SELF{ti:self scope<br/>declared?}
+        SELF -->|no| OK([Return TIToken])
+        SELF -->|yes| MATCH{token.sub ==<br/>task_instance_id?}
+        MATCH -->|no| F1
+        MATCH -->|yes| OK
+
+        classDef fail fill:#ffcdd2,stroke:#c62828,stroke-width:2px,color:#000
+        classDef pass fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#000
+        class F1 fail
+        class OK,RET pass
+
 Token refresh (Execution API)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/airflow-core/docs/security/security_model.rst b/airflow-core/docs/security/security_model.rst
index d46323745b4..27294d1f99a 100644
--- a/airflow-core/docs/security/security_model.rst
+++ b/airflow-core/docs/security/security_model.rst
@@ -329,6 +329,77 @@ Execution API. For a detailed description of the JWT authentication flows, token
 configuration, see :doc:`/security/jwt_token_authentication`. For the current state of workload
 isolation protections and their limitations, see :ref:`workload-isolation`.
 
+The diagram below summarizes the trust boundaries between Airflow components and the metadata
+database. Solid arrows are authenticated network calls (JWT-bearing HTTP). Dashed arrows are
+direct database access — components on the right of the dashed line can read and write the
+metadata DB, so any code they execute is implicitly trusted with the metadata database.
+
+.. mermaid::
+
+    flowchart LR
+        subgraph users["Users (untrusted by default)"]
+            UI[UI / browser]
+            CLI[CLI]
+            EXT[External REST clients]
+        end
+
+        subgraph dataplane["Worker plane (no metadata DB access)"]
+            WRK[Worker / Task]
+        end
+
+        subgraph controlplane["Control plane (metadata DB access)"]
+            APISVR[API Server]
+            SCH[Scheduler]
+            DFP[Dag File Processor]
+            TRG[Triggerer]
+        end
+
+        DB[(Metadata DB)]
+
+        UI -->|JWT| APISVR
+        CLI -->|JWT| APISVR
+        EXT -->|JWT| APISVR
+        WRK -->|JWT<br/>Execution API| APISVR
+
+        APISVR -. SQL .-> DB
+        SCH -. SQL .-> DB
+        DFP -. SQL .-> DB
+        TRG -. SQL .-> DB
+
+        DFP -. in-process<br/>JWT bypassed .-> APISVR
+        TRG -. in-process<br/>JWT bypassed .-> APISVR
+
+        classDef untrusted fill:#ffcdd2,stroke:#c62828,stroke-width:2px,color:#000
+        classDef trusted fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#000
+        classDef data fill:#fff9c4,stroke:#f57f17,stroke-width:2px,color:#000
+        class UI,CLI,EXT,WRK untrusted
+        class APISVR,SCH,DFP,TRG trusted
+        class DB data
+
+The intentional asymmetry: **workers have no direct DB credentials and reach data only through
+the JWT-authenticated Execution API**, while DFP and Triggerer share the control plane's DB
+access and use an *in-process* transport that bypasses the JWT bearer dependency. This is why
+Dag File Processor and Triggerer are treated as part of the control-plane trust boundary even
+though they run user-supplied code — see :ref:`workload-isolation` and the limitations below.
+
+Defense in depth at the router level
+....................................
+
+Both the public REST API router (``/api/v2``) and the UI router (``/ui``) declare
+``Depends(get_user)`` at the router level. This is purely a defense-in-depth backstop: every
+authenticated route already declares its own ``GetUserDep`` or ``requires_access_*`` dependency
+that itself resolves ``get_user``, and FastAPI caches dependency resolutions per request, so
+the duplicate declaration is resolved only once and the runtime cost is zero. The value is preventing a future route from being added
+under either router without an authentication check — the router-level dependency catches the
+regression at registration time. The explicit no-auth carve-outs are limited to
+``monitor_router`` (health probes), ``version_router``, and the public ``auth_router`` (login
+endpoints), which are mounted directly on the public router rather than under
+``authenticated_router``.
+
+A structural test asserts both routers carry the router-level ``Depends(get_user)``, so a
+refactor that drops the dependency without considering its purpose fails CI rather than
+silently widening the unauthenticated surface.
+
 Current isolation limitations
 .............................
 
@@ -409,6 +480,24 @@ potentially still executes with direct database access in the Dag File Processor
   variables, and XComs are accessible to all tasks. There is no isolation between tasks belonging to
   different teams or Dag authors at the Execution API level.
 
+  What the Execution API **does** enforce, beyond ``ti:self``:
+
+  * **Secrets-backend deny is honoured.** When the Execution API returns ``401``/``403`` for a
+    connection or variable lookup, the SDK's ``ExecutionAPISecretsBackend`` raises
+    ``AirflowSecretsBackendAccessDenied`` (a subclass of ``PermissionError``) instead of returning
+    ``None``. The secrets-backend dispatcher must not fall through to a less-restrictive backend
+    (such as ``EnvironmentVariablesBackend``, which performs no authorization checks) when the
+    authoritative backend has explicitly denied the request. ``NOT_FOUND`` responses keep the
+    existing fall-through behaviour so the not-found-here path remains usable. This closes a
+    "Type C" gap where the authorization control fired but its rejection was treated as a miss.
+  * **Tightened deserialization allowlist.** The serializer's class-name allowlist regex is
+    anchored to require a full-string match, so attacker-controlled values cannot smuggle
+    untrusted class names by appending an allowed suffix to a non-allowed name.
+  * **Typed JWT claims schema.** Even after cryptographic validation, claims are run through a
+    typed Pydantic schema (``TIClaims``) that enforces the ``scope`` literal, and then through
+    ``TIToken`` which parses ``sub`` as a UUID. A token whose ``scope`` is unknown or whose
+    ``sub`` is not a valid UUID is rejected before any route handler runs.
+
 **Token signing key might be a shared secret**
   In symmetric key mode (``[api_auth] jwt_secret``), the same secret key is used to both generate and
   validate tokens. Any component that has access to this secret can forge tokens with arbitrary claims,
@@ -466,6 +555,109 @@ model — Airflow does not enforce these natively.
   the same Unix user. Environment variables can also be scoped to individual processes or
   containers, making it easier to restrict which components have access to which secrets.
 
+  The diagram and table below summarize which components need which classes of sensitive value
+  in a well-hardened deployment. Workers should not see DB credentials or the JWT signing key
+  at all; the API Server is the only component that needs *all* of the privileged values.
+  Components are arranged from least-privileged (Worker, green) to most-privileged
+  (API Server, blue) — the growing column visually conveys "more privilege ⇒ more secrets":
+
+  .. mermaid::
+
+      flowchart LR
+          subgraph WRK["Worker (least privileged)"]
+              direction TB
+              W1[Fernet key]
+              W2[Worker secrets backend credentials]
+              W3[Remote log handler kwargs]
+          end
+
+          subgraph TRG["Triggerer"]
+              direction TB
+              T1[DB connection]
+              T2[Fernet key]
+              T3[Non-worker secrets backend credentials]
+              T4[Remote log handler kwargs]
+          end
+
+          subgraph DFP["Dag File Processor"]
+              direction TB
+              D1[DB connection]
+              D2[Fernet key]
+              D3[Non-worker secrets backend credentials]
+          end
+
+          subgraph SCH["Scheduler"]
+              direction TB
+              S1[DB connection]
+              S2[JWT signing key]
+              S3[Fernet key]
+              S4[Non-worker secrets backend credentials]
+              S5[Remote log handler kwargs]
+          end
+
+          subgraph API["API Server (most privileged)"]
+              direction TB
+              A1[DB connection]
+              A2[JWT signing key]
+              A3[Fernet key]
+          end
+
+          classDef wrk fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:#000
+          classDef ctrl fill:#bbdefb,stroke:#1565c0,stroke-width:2px,color:#000
+          class WRK wrk
+          class API,SCH,DFP,TRG ctrl
+
+  Read row-by-row to check which components need a given secret; read column-by-column to
+  check what one component is allowed to see:
+
+  .. list-table::
+      :header-rows: 1
+      :widths: 32 14 14 13 14 13
+      :align: left
+
+      * - Sensitive value
+        - API Server
+        - Scheduler
+        - Dag File Processor
+        - Triggerer
+        - Worker
+      * - DB connection
+        - ✓
+        - ✓
+        - ✓
+        - ✓
+        - —
+      * - JWT signing key
+        - ✓
+        - ✓
+        - —
+        - —
+        - —
+      * - Fernet key
+        - ✓
+        - ✓
+        - ✓
+        - ✓
+        - ✓
+      * - Secrets backend credentials (non-worker)
+        - —
+        - ✓
+        - ✓
+        - ✓
+        - —
+      * - Secrets backend credentials (worker)
+        - —
+        - —
+        - —
+        - —
+        - ✓
+      * - Remote log handler kwargs
+        - —
+        - ✓
+        - —
+        - ✓
+        - ✓
+
 The following tables list all security-sensitive configuration variables (marked ``sensitive: true``
 in Airflow's configuration). Deployment Managers should review each variable and ensure it is only
 provided to the components that need it. The "Needed by" column indicates which components
diff --git a/devel-common/pyproject.toml b/devel-common/pyproject.toml
index caa5b39b8fc..2e24abfb6b2 100644
--- a/devel-common/pyproject.toml
+++ b/devel-common/pyproject.toml
@@ -96,6 +96,7 @@ dependencies = [
     "sphinxcontrib-httpdomain>=1.8.1",
     "sphinxcontrib-jquery>=4.1",
     "sphinxcontrib-jsmath>=1.0.1",
+    "sphinxcontrib-mermaid>=1.0.0",
     "sphinxcontrib-qthelp>=1.0.3",
     "sphinxcontrib-redoc>=1.6.0",
     "sphinxcontrib-serializinghtml>=1.1.5",
diff --git a/devel-common/src/docs/utils/conf_constants.py b/devel-common/src/docs/utils/conf_constants.py
index 6b75f3e8990..31ee18fb523 100644
--- a/devel-common/src/docs/utils/conf_constants.py
+++ b/devel-common/src/docs/utils/conf_constants.py
@@ -86,6 +86,7 @@ BASIC_SPHINX_EXTENSIONS = [
     "removemarktransform",
     "sphinx_copybutton",
     "airflow_intersphinx",
+    "sphinxcontrib.mermaid",
     "sphinxcontrib.spelling",
     "sphinx_airflow_theme",
     "redirects",
diff --git a/uv.lock b/uv.lock
index 79a7b7ec820..8e0015f9a94 100644
--- a/uv.lock
+++ b/uv.lock
@@ -2364,6 +2364,7 @@ docs = [
     { name = "sphinxcontrib-httpdomain" },
     { name = "sphinxcontrib-jquery" },
     { name = "sphinxcontrib-jsmath" },
+    { name = "sphinxcontrib-mermaid" },
     { name = "sphinxcontrib-qthelp" },
     { name = "sphinxcontrib-redoc" },
     { name = "sphinxcontrib-serializinghtml" },
@@ -2552,6 +2553,7 @@ requires-dist = [
     { name = "sphinxcontrib-httpdomain", marker = "extra == 'docs'", specifier = ">=1.8.1" },
     { name = "sphinxcontrib-jquery", marker = "extra == 'docs'", specifier = ">=4.1" },
     { name = "sphinxcontrib-jsmath", marker = "extra == 'docs'", specifier = ">=1.0.1" },
+    { name = "sphinxcontrib-mermaid", marker = "extra == 'docs'", specifier = ">=1.0.0" },
     { name = "sphinxcontrib-qthelp", marker = "extra == 'docs'", specifier = ">=1.0.3" },
     { name = "sphinxcontrib-redoc", marker = "extra == 'docs'", specifier = ">=1.6.0" },
     { name = "sphinxcontrib-serializinghtml", marker = "extra == 'docs'", specifier = ">=1.1.5" },
@@ -20903,6 +20905,22 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/c2/42/4c8646762ee83602e3fb3fbe774c2fac12f317deb0b5dbeeedd2d3ba4b77/sphinxcontrib_jsmath-1.0.1-py2.py3-none-any.whl", hash = "sha256:2ec2eaebfb78f3f2078e73666b1415417a116cc848b72e5172e596c871103178", size = 5071, upload-time = "2019-01-21T16:10:14.333Z" },
 ]
 
+[[package]]
+name = "sphinxcontrib-mermaid"
+version = "2.0.2"
+source = { registry = "https://pypi.org/simple" }
+dependencies = [
+    { name = "jinja2" },
+    { name = "pyyaml" },
+    { name = "sphinx", version = "8.1.3", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.11'" },
+    { name = "sphinx", version = "9.0.4", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version == '3.11.*'" },
+    { name = "sphinx", version = "9.1.0", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.12'" },
+]
+sdist = { url = "https://files.pythonhosted.org/packages/19/75/3a1cc926da8c563c58ddc124a7b3fe5ccadcae96c96e3a6f8ac3653a210a/sphinxcontrib_mermaid-2.0.2.tar.gz", hash = "sha256:f09576c78ca93fa0e3034fd9c45aaffa7c44ab449de9c43b8b8d262afe52bc66", size = 19265, upload-time = "2026-05-05T13:59:02.959Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/16/8d/93be7e0f7fa915a576859b3bfac7a7baa3303181c44d7db7eefbd3e8a69f/sphinxcontrib_mermaid-2.0.2-py3-none-any.whl", hash = "sha256:d862e514991279fb4816302c5cfe167d2557bf3ce7125ae0cb47dac80a0f46ce", size = 14094, upload-time = "2026-05-05T13:59:01.585Z" },
+]
+
 [[package]]
 name = "sphinxcontrib-qthelp"
 version = "2.0.0"
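
Reader's note on the jwt_token_authentication.rst hunks above: the route-level checks they document (scope allowed for the route, ``sub`` matching the path's task instance ID, and ``/run`` answering ``409 Conflict`` unless the task instance is ``QUEUED`` or ``RESTARTING``) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Airflow's implementation: ``Token``, ``ALLOWED_SCOPES``, ``STARTABLE_STATES`` and ``authorize`` are hypothetical names, the ``heartbeat`` route stands in for "all other endpoints", and whether ``/run`` also accepts an ``execution`` token is an assumption the text does not settle.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4

# Hypothetical route table: which token scopes each endpoint accepts.
# Assumption: the text only states that /run accepts workload tokens and that
# every other endpoint requires scope=execution; "execution" on /run is a guess.
ALLOWED_SCOPES = {
    "run": {"workload", "execution"},
    "heartbeat": {"execution"},
}

# Task-instance states from which /run may start a task, per the documented text.
STARTABLE_STATES = {"queued", "restarting"}


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for an already signature-verified token."""

    sub: UUID
    scope: str = "execution"  # a missing scope defaults to "execution"


def authorize(token: Token, route: str, path_ti_id: UUID, ti_state: str) -> int:
    """Return the HTTP status this sketch would answer with."""
    # 1. The token's scope must be allowed for the route.
    if token.scope not in ALLOWED_SCOPES[route]:
        return 403
    # 2. ti:self check: the token subject must match the path's task instance ID.
    if token.sub != path_ti_id:
        return 403
    # 3. /run refuses re-runs: only a queued or restarting task may be started.
    if route == "run" and ti_state not in STARTABLE_STATES:
        return 409
    return 200


if __name__ == "__main__":
    ti_id = uuid4()
    intercepted = Token(sub=ti_id, scope="workload")
    # A workload token is only useful for starting an already-queued task.
    print(authorize(intercepted, "run", ti_id, "queued"))
    print(authorize(intercepted, "run", ti_id, "running"))
    print(authorize(intercepted, "heartbeat", ti_id, "running"))
```

The order of the checks mirrors the flowchart in the diff: scope first, then the ``ti:self`` comparison, and only then the state check that is specific to ``/run``.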
