This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch 3.2.0-docs in repository https://gitbox.apache.org/repos/asf/airflow.git
commit 7ce38194186f1f386ce16dca9b6731cab05a0445 Author: Jarek Potiuk <[email protected]> AuthorDate: Mon Apr 6 13:30:40 2026 +0200 Docs: Add JWT authentication docs and strengthen security model Add comprehensive JWT token authentication documentation covering both the REST API and Execution API flows, including token structure, timings, refresh mechanisms, and the DFP/Triggerer in-process bypass. Update the security model to: - Document current isolation limitations (DFP/Triggerer DB access, shared Execution API resources, multi-team not guaranteeing task-level isolation) - Add deployment hardening guidance (per-component config, asymmetric JWT keys, env vars with PR_SET_DUMPABLE protection) - Add "What is NOT a security vulnerability" section covering all categories from the security team's response policies - Fix contradicting statements across docs that overstated isolation guarantees or recommended sharing all config across components Update AGENTS.md with security model awareness so AI agents performing security research distinguish intentional design choices from actual vulnerabilities. 
--- .github/instructions/code-review.instructions.md | 2 +- AGENTS.md | 64 ++- .../production-deployment.rst | 9 +- airflow-core/docs/best-practices.rst | 6 +- airflow-core/docs/configurations-ref.rst | 25 +- airflow-core/docs/core-concepts/multi-team.rst | 2 +- airflow-core/docs/howto/set-config.rst | 23 +- .../docs/installation/upgrading_to_airflow3.rst | 2 +- airflow-core/docs/public-airflow-interface.rst | 7 +- .../docs/security/jwt_token_authentication.rst | 449 +++++++++++++++++++++ airflow-core/docs/security/security_model.rst | 332 ++++++++++++++- .../src/airflow/config_templates/config.yml | 10 +- 12 files changed, 878 insertions(+), 53 deletions(-) diff --git a/.github/instructions/code-review.instructions.md b/.github/instructions/code-review.instructions.md index 0d4ce8a8791..411f0814289 100644 --- a/.github/instructions/code-review.instructions.md +++ b/.github/instructions/code-review.instructions.md @@ -11,7 +11,7 @@ Use these rules when reviewing pull requests to the Apache Airflow repository. - **Scheduler must never run user code.** It only processes serialized Dags. Flag any scheduler-path code that deserializes or executes Dag/task code. - **Flag any task execution code that accesses the metadata DB directly** instead of through the Execution API (`/execution` endpoints). -- **Flag any code in Dag Processor or Triggerer that breaks process isolation** — these components run user code in isolated processes. +- **Flag any code in Dag Processor or Triggerer that breaks process isolation** — these components run user code in separate processes from the Scheduler and API Server, but note that they have direct metadata database access and bypass JWT authentication via in-process Execution API transport. This is an intentional design choice documented in the security model, not a security vulnerability. - **Flag any provider importing core internals** like `SUPERVISOR_COMMS` or task-runner plumbing. 
Providers interact through the public SDK and execution API only. ## Database and Query Correctness diff --git a/AGENTS.md b/AGENTS.md index f01a0112733..1925cce4a86 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -66,15 +66,73 @@ UV workspace monorepo. Key paths: ## Architecture Boundaries 1. Users author Dags with the Task SDK (`airflow.sdk`). -2. Dag Processor parses Dag files in isolated processes and stores serialized Dags in the metadata DB. +2. Dag File Processor parses Dag files in separate processes and stores serialized Dags in the metadata DB. It has **direct database access** and uses an in-process Execution API transport that **bypasses JWT authentication**. 3. Scheduler reads serialized Dags — **never runs user code** — and creates Dag runs / task instances. -4. Workers execute tasks via Task SDK and communicate with the API server through the Execution API — **never access the metadata DB directly**. +4. Workers execute tasks via Task SDK and communicate with the API server through the Execution API — **never access the metadata DB directly**. Each task receives a short-lived JWT token scoped to its task instance ID. 5. API Server serves the React UI and handles all client-database interactions. -6. Triggerer evaluates deferred tasks/sensors in isolated processes. +6. Triggerer evaluates deferred tasks/sensors in separate processes. Like the Dag File Processor, it has **direct database access** and uses an in-process Execution API transport that **bypasses JWT authentication**. 7. Shared libraries that are symbolically linked to different Python distributions are in the `shared` folder. 8. Airflow uses the `uv workspace` feature to keep all the distributions sharing dependencies and a venv. 9. 
Each of the distributions should declare other needed distributions: `uv --project <FOLDER> sync` command acts on the selected project in the monorepo with only dependencies that it has +## Security Model + +When reviewing code, writing security documentation, or performing security research, keep in +mind the following aspects of Airflow's security model. The authoritative reference is +[`airflow-core/docs/security/security_model.rst`](airflow-core/docs/security/security_model.rst) +and [`airflow-core/docs/security/jwt_token_authentication.rst`](airflow-core/docs/security/jwt_token_authentication.rst). + +**The following are intentional design choices, not security vulnerabilities:** + +- **Dag File Processor and Triggerer bypass JWT authentication.** They use `InProcessExecutionAPI` + which overrides the JWT bearer dependency to always allow access. This is by design — these + components run within trusted infrastructure and need direct database access for their core + operations (storing serialized Dags, managing trigger state). +- **Dag File Processor and Triggerer have direct metadata database access.** User-submitted code + (Dag files, trigger code) executes in these components and can potentially access the database. + This is a known limitation documented in the security model, not an undiscovered vulnerability. +- **Worker Execution API tokens grant access to shared resources.** While `ti:self` scope prevents + cross-task state manipulation, connections, variables, and XComs are accessible to all tasks. + This is the current design — finer-grained scoping is planned for future versions. +- **The experimental multi-team feature (`[core] multi_team`) does not guarantee task-level + isolation.** It provides UI-level and REST API-level RBAC isolation only. At the Execution API + and database level, there is no enforcement of team boundaries. This is documented and expected. 
+- **Execution API tokens are not subject to revocation.** They are short-lived (default 10 min) + with automatic refresh, so revocation is intentionally not part of the Execution API security model. +- **A single Dag File Processor and Triggerer instance serves all teams by default.** Per-team + instances require deployment-level configuration by the Deployment Manager. + +**The following are NOT security vulnerabilities (per Airflow's security policy and trust model):** + +- Dag authors executing arbitrary code, accessing credentials, or reading environment variables — + Dag authors are trusted users with broad capabilities by design. +- Dag author code passing unsanitized input to operators/hooks — responsibility lies with the Dag + author, not Airflow. SQL injection or command injection is only a vulnerability if exploitable by + a non-Dag-author role without the Dag author deliberately writing unsafe code. +- Connection configuration users being able to trigger RCE/DoS/arbitrary reads via connection + parameters — these users are highly privileged by design. Test connection is disabled by default. +- DoS by authenticated users — Airflow is an internal application with known, authenticated users. + DoS by authenticated users is an operational concern, not a CVE-worthy vulnerability. +- Self-XSS by authenticated users — only considered a vulnerability if it crosses privilege + boundaries (lower-privileged user's payload executes in higher-privileged session). +- Simple Auth Manager security issues — it is for development/testing only, with a prominent warning. +- Third-party dependency CVEs in Docker images — expected over time; users should build their own + images. Only report if you have a proof-of-concept exploiting the vulnerability in Airflow's context. +- Automated scanner results without human verification against the security model. + +**When flagging security concerns, distinguish between:** + +1. 
**Actual vulnerabilities** — code that violates the documented security model (e.g., a worker + gaining database access it shouldn't have, a Scheduler executing user code, an unauthenticated + user accessing protected endpoints). +2. **Known limitations** — documented gaps where the current implementation doesn't provide full + isolation (e.g., DFP/Triggerer database access, shared Execution API resources, multi-team + not enforcing task-level isolation). These are tracked for improvement in future versions and + should not be reported as new findings. +3. **Deployment hardening opportunities** — measures a Deployment Manager can take to improve + isolation beyond what Airflow enforces natively (e.g., per-component configuration, asymmetric + JWT keys, network policies). These belong in deployment guidance, not as code-level issues. + # Shared libraries - shared libraries provide implementation of some common utilities like logging, configuration where the code should be reused in different distributions (potentially in different versions) diff --git a/airflow-core/docs/administration-and-deployment/production-deployment.rst b/airflow-core/docs/administration-and-deployment/production-deployment.rst index e69d4364887..2f9537aad3c 100644 --- a/airflow-core/docs/administration-and-deployment/production-deployment.rst +++ b/airflow-core/docs/administration-and-deployment/production-deployment.rst @@ -62,9 +62,12 @@ the :doc:`Celery executor <apache-airflow-providers-celery:celery_executor>`. Once you have configured the executor, it is necessary to make sure that every node in the cluster contains -the same configuration and Dags. Airflow sends simple instructions such as "execute task X of Dag Y", but -does not send any Dag files or configuration. You can use a simple cronjob or any other mechanism to sync -Dags and configs across your nodes, e.g., checkout Dags from git repo every 5 minutes on all nodes. +the Dags and configuration appropriate for its role. 
Airflow sends simple instructions such as +"execute task X of Dag Y", but does not send any Dag files or configuration. You can use a simple cronjob +or any other mechanism to sync Dags across your nodes, e.g., check out Dags from a git repo every 5 minutes +on all nodes. For security-sensitive deployments, restrict sensitive configuration (JWT signing keys, +database credentials, Fernet keys) to only the components that need them rather than sharing all +configuration across all nodes — see :doc:`/security/security_model` for guidance. Logging diff --git a/airflow-core/docs/best-practices.rst b/airflow-core/docs/best-practices.rst index 4c0eb029866..216142300ec 100644 --- a/airflow-core/docs/best-practices.rst +++ b/airflow-core/docs/best-practices.rst @@ -1098,8 +1098,10 @@ The benefits of using those operators are: environment is optimized for the case where you have multiple similar, but different environments. * The dependencies can be pre-vetted by the admins and your security team; no unexpected new code will be added dynamically. This is good for both security and stability. -* Complete isolation between tasks. They cannot influence one another in other ways than using standard - Airflow XCom mechanisms. +* Strong process-level isolation between tasks. Tasks run in separate containers/pods and cannot + influence one another at the process or filesystem level. They can still interact through standard + Airflow mechanisms (XComs, connections, variables) via the Execution API. See + :doc:`/security/security_model` for the full isolation model. 
The drawbacks: diff --git a/airflow-core/docs/configurations-ref.rst b/airflow-core/docs/configurations-ref.rst index 83c5d8a8ed5..1afe00f1e2c 100644 --- a/airflow-core/docs/configurations-ref.rst +++ b/airflow-core/docs/configurations-ref.rst @@ -22,15 +22,22 @@ Configuration Reference This page contains the list of all the available Airflow configurations that you can set in ``airflow.cfg`` file or using environment variables. -Use the same configuration across all the Airflow components. While each component -does not require all, some configurations need to be same otherwise they would not -work as expected. A good example for that is :ref:`secret_key<config:api__secret_key>` which -should be same on the Webserver and Worker to allow Webserver to fetch logs from Worker. - -The webserver key is also used to authorize requests to Celery workers when logs are retrieved. The token -generated using the secret key has a short expiry time though - make sure that time on ALL the machines -that you run Airflow components on is synchronized (for example using ntpd) otherwise you might get -"forbidden" errors when the logs are accessed. +Different Airflow components may require different configuration parameters, and for +improved security, you should restrict sensitive configuration to only the components that +need it. Some configuration values must be shared across specific components to work +correctly — for example, the JWT signing key (``[api_auth] jwt_secret`` or +``[api_auth] jwt_private_key_path``) must be consistent across all components that generate +or validate JWT tokens (Scheduler, API Server). However, other sensitive parameters such as +database connection strings or Fernet keys should only be provided to components that need them. + +For security-sensitive deployments, pass configuration values via environment variables +scoped to individual components rather than sharing a single configuration file across all +components. 
See :doc:`/security/security_model` for details on which configuration +parameters should be restricted to which components. + +Make sure that time on ALL the machines that you run Airflow components on is synchronized +(for example using ntpd) otherwise you might get "forbidden" errors when the logs are +accessed or API calls are made. .. note:: For more information see :doc:`/howto/set-config`. diff --git a/airflow-core/docs/core-concepts/multi-team.rst b/airflow-core/docs/core-concepts/multi-team.rst index 6beccc249b1..609a79cdf18 100644 --- a/airflow-core/docs/core-concepts/multi-team.rst +++ b/airflow-core/docs/core-concepts/multi-team.rst @@ -38,7 +38,7 @@ Multi-Team mode is designed for medium to large organizations that typically hav **Use Multi-Team mode when:** - You have many teams that need to share Airflow infrastructure -- You need resource isolation (Variables, Connections, Secrets, etc) between teams +- You need resource isolation (Variables, Connections, Secrets, etc) between teams at the UI and API level (see :doc:`/security/security_model` for task-level isolation limitations) - You want separate execution environments per team - You want separate views per team in the Airflow UI - You want to minimize operational overhead or cost by sharing a single Airflow deployment diff --git a/airflow-core/docs/howto/set-config.rst b/airflow-core/docs/howto/set-config.rst index 30d29c924c6..c35df0f4c89 100644 --- a/airflow-core/docs/howto/set-config.rst +++ b/airflow-core/docs/howto/set-config.rst @@ -157,15 +157,20 @@ the example below. See :doc:`/administration-and-deployment/modules_management` for details on how Python and Airflow manage modules. .. note:: - Use the same configuration across all the Airflow components. While each component - does not require all, some configurations need to be same otherwise they would not - work as expected. 
A good example for that is :ref:`secret_key<config:api__secret_key>` which - should be same on the Webserver and Worker to allow Webserver to fetch logs from Worker. - - The webserver key is also used to authorize requests to Celery workers when logs are retrieved. The token - generated using the secret key has a short expiry time though - make sure that time on ALL the machines - that you run Airflow components on is synchronized (for example using ntpd) otherwise you might get - "forbidden" errors when the logs are accessed. + Different Airflow components may require different configuration parameters. For improved + security, restrict sensitive configuration to only the components that need it rather than + sharing all configuration across all components. Some values must be consistent across specific + components — for example, the JWT signing key must match between components that generate and + validate tokens. However, sensitive parameters such as database connection strings, Fernet keys, + and secrets backend credentials should only be provided to components that actually need them. + + For security-sensitive deployments, pass configuration values via environment variables scoped + to individual components. See :doc:`/security/security_model` for detailed guidance on + restricting configuration parameters. + + Make sure that time on ALL the machines that you run Airflow components on is synchronized + (for example using ntpd) otherwise you might get "forbidden" errors when the logs are + accessed or API calls are made. .. _set-config:configuring-local-settings: diff --git a/airflow-core/docs/installation/upgrading_to_airflow3.rst b/airflow-core/docs/installation/upgrading_to_airflow3.rst index 2d9c878390d..2f5cfea324c 100644 --- a/airflow-core/docs/installation/upgrading_to_airflow3.rst +++ b/airflow-core/docs/installation/upgrading_to_airflow3.rst @@ -54,7 +54,7 @@ In Airflow 3, direct metadata database access from task code is now restricted. 
- **No Direct Database Access**: Task code can no longer directly import and use Airflow database sessions or models. - **API-Based Resource Access**: All runtime interactions (state transitions, heartbeats, XComs, and resource fetching) are handled through a dedicated Task Execution API. -- **Enhanced Security**: This ensures isolation and security by preventing malicious task code from accessing or modifying the Airflow metadata database. +- **Enhanced Security**: This improves isolation and security by preventing worker task code from directly accessing or modifying the Airflow metadata database. Note that Dag author code still executes with direct database access in the Dag File Processor and Triggerer — see :doc:`/security/security_model` for details. - **Stable Interface**: The Task SDK provides a stable, forward-compatible interface for accessing Airflow resources without direct database dependencies. Step 1: Take care of prerequisites diff --git a/airflow-core/docs/public-airflow-interface.rst b/airflow-core/docs/public-airflow-interface.rst index c768c36a7b1..2c271a580b3 100644 --- a/airflow-core/docs/public-airflow-interface.rst +++ b/airflow-core/docs/public-airflow-interface.rst @@ -548,9 +548,10 @@ but in Airflow they are not parts of the Public Interface and might change any t internal implementation detail and you should not assume they will be maintained in a backwards-compatible way. -**Direct metadata database access from task code is no longer allowed**. -Task code cannot directly access the metadata database to query Dag state, task history, -or Dag runs. Instead, use one of the following alternatives: +**Direct metadata database access from worker task code is no longer allowed**. +Worker task code cannot directly access the metadata database to query Dag state, task history, +or Dag runs — workers communicate exclusively through the Execution API. 
Instead, use one +of the following alternatives: * **Task Context**: Use :func:`~airflow.sdk.get_current_context` to access task instance information and methods like :meth:`~airflow.sdk.types.RuntimeTaskInstanceProtocol.get_dr_count`, diff --git a/airflow-core/docs/security/jwt_token_authentication.rst b/airflow-core/docs/security/jwt_token_authentication.rst new file mode 100644 index 00000000000..bd897a681d6 --- /dev/null +++ b/airflow-core/docs/security/jwt_token_authentication.rst @@ -0,0 +1,449 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +JWT Token Authentication +======================== + +This document describes how JWT (JSON Web Token) authentication works in Apache Airflow +for both the public REST API (Core API) and the internal Execution API used by workers. + +.. contents:: + :local: + :depth: 2 + +Overview +-------- + +Airflow uses JWT tokens as the primary authentication mechanism for its APIs. There are two +distinct JWT authentication flows: + +1. **REST API (Core API)** — used by UI users, CLI tools, and external clients to interact + with the Airflow public API. +2. 
**Execution API** — used internally by workers, the Dag File Processor, and the Triggerer + to communicate task state and retrieve runtime data (connections, variables, XComs). + +Both flows share the same underlying JWT infrastructure (``JWTGenerator`` and ``JWTValidator`` +classes in ``airflow.api_fastapi.auth.tokens``) but differ in audience, token lifetime, subject +claims, and scope semantics. + + +Signing and Cryptography +------------------------ + +Airflow supports two mutually exclusive signing modes: + +**Symmetric (shared secret)** + Uses a pre-shared secret key (``[api_auth] jwt_secret``) with the **HS512** algorithm. + All components that generate or validate tokens must share the same secret. If no secret + is configured, Airflow auto-generates a random 16-byte key at startup — but this key is + ephemeral and different across processes, which will cause authentication failures in + multi-component deployments. Deployment Managers must explicitly configure this value. + +**Asymmetric (public/private key pair)** + Uses a PEM-encoded private key (``[api_auth] jwt_private_key_path``) for signing and + the corresponding public key for validation. Supported algorithms: **RS256** (RSA) and + **EdDSA** (Ed25519). The algorithm is auto-detected from the key type when + ``[api_auth] jwt_algorithm`` is set to ``GUESS`` (the default). + + Validation can use either: + + - A JWKS (JSON Web Key Set) endpoint configured via ``[api_auth] trusted_jwks_url`` + (local file or remote HTTP/HTTPS URL, polled periodically for updates). + - The public key derived from the configured private key (automatic fallback when + ``trusted_jwks_url`` is not set). + +The asymmetric mode is recommended for production deployments because it separates credentials +by role: components that sign tokens (such as the Scheduler, which generates Execution API tokens) +need only the private key, while components that merely validate tokens need only the JWKS. 
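As a rough illustration of the symmetric mode described above, the claim set and HS512 signing step can be sketched with PyJWT. This is a simplified stand-in for Airflow's ``JWTGenerator``, not its actual implementation; the function name, parameters, and defaults here are illustrative:

```python
import uuid
from datetime import datetime, timedelta, timezone

import jwt  # PyJWT


def sign_hs512(secret: str, subject: str, lifetime_s: int = 600) -> str:
    """Sign a token with a pre-shared secret (symmetric HS512 mode).

    Illustrative sketch only -- the real JWTGenerator additionally handles
    audience/issuer claims and asymmetric key auto-detection.
    """
    now = datetime.now(tz=timezone.utc)
    claims = {
        "jti": uuid.uuid4().hex,                  # unique token identifier
        "sub": subject,                           # e.g. a user id or TI UUID
        "iat": now,                               # issued-at
        "nbf": now,                               # not-before
        "exp": now + timedelta(seconds=lifetime_s),
    }
    return jwt.encode(claims, secret, algorithm="HS512")
```

Every component that validates such a token must hold the same ``secret``; in the asymmetric mode the validator would instead call ``jwt.decode`` with the public key and ``algorithms=["RS256"]`` or ``["EdDSA"]``.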
+ + +REST API Authentication Flow +----------------------------- + +Token acquisition +^^^^^^^^^^^^^^^^^ + +1. A client sends a ``POST`` request to ``/auth/token`` with credentials (e.g., username + and password in JSON body). +2. The auth manager validates the credentials and creates a user object. +3. The auth manager serializes the user into JWT claims and calls ``JWTGenerator.generate()``. +4. The generated token is returned in the response as ``access_token``. + +For UI-based authentication, the token is stored in a secure, HTTP-only cookie (``_token``) +with ``SameSite=Lax``. + +The CLI uses a separate endpoint (``/auth/token/cli``) with a different (shorter) expiration +time. + +Token structure (REST API) +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :header-rows: 1 + :widths: 15 85 + + * - Claim + - Description + * - ``jti`` + - Unique token identifier (UUID4 hex). Used for token revocation. + * - ``iss`` + - Issuer (from ``[api_auth] jwt_issuer``). Optional but recommended. + * - ``aud`` + - Audience (from ``[api_auth] jwt_audience``). Optional but recommended. + * - ``sub`` + - User identifier (serialized by the auth manager). + * - ``iat`` + - Issued-at timestamp (Unix epoch seconds). + * - ``nbf`` + - Not-before timestamp (same as ``iat``). + * - ``exp`` + - Expiration timestamp (``iat + jwt_expiration_time``). + +Token validation (REST API) +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +On each API request, the token is extracted in this order of precedence: + +1. ``Authorization: Bearer <token>`` header. +2. OAuth2 query parameter. +3. ``_token`` cookie. + +The ``JWTValidator`` verifies the signature, expiry (``exp``), not-before (``nbf``), +issued-at (``iat``), audience, and issuer claims. A configurable leeway +(``[api_auth] jwt_leeway``, default 10 seconds) accounts for clock skew. 
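The leeway behaviour can be sketched as follows — a minimal stand-in for ``JWTValidator`` using PyJWT, assuming the symmetric HS512 mode; the real class also performs audience/issuer checks and JWKS-based asymmetric validation:

```python
import jwt  # PyJWT


def validate_with_leeway(token: str, secret: str, leeway_s: int = 10) -> dict:
    """Verify the signature and time-based claims, tolerating `leeway_s`
    seconds of clock skew between signer and validator.

    Illustrative sketch: requires exp/iat/nbf to be present, mirroring the
    claims Airflow puts in its tokens.
    """
    return jwt.decode(
        token,
        secret,
        algorithms=["HS512"],
        leeway=leeway_s,
        options={"require": ["exp", "iat", "nbf"]},
    )
```

With the default 10-second leeway, a token whose ``exp`` passed a few seconds ago on the validator's clock is still accepted, which is exactly what absorbs small clock differences between machines.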
+ +Token revocation (REST API only) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Token revocation applies only to REST API and UI tokens — it is **not** used for Execution API +tokens issued to workers. + +Revoked tokens are tracked in the ``revoked_token`` database table by their ``jti`` claim. +On logout or explicit revocation, the token's ``jti`` and ``exp`` are inserted into this +table. Expired entries are automatically cleaned up at a cadence of ``2× jwt_expiration_time``. + +Execution API tokens are not subject to revocation. They are short-lived (default 10 minutes) +and automatically refreshed by the ``JWTReissueMiddleware``, so revocation is not part of the +Execution API security model. Once an Execution API token is issued to a worker, it remains +valid until it expires. + +Token refresh (REST API) +^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``JWTRefreshMiddleware`` runs on UI requests. When the middleware detects that the +current token's ``_token`` cookie is approaching expiry, it calls +``auth_manager.refresh_user()`` to generate a new token and sets it as the updated cookie. + +Default timings (REST API) +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :header-rows: 1 + :widths: 50 50 + + * - Setting + - Default + * - ``[api_auth] jwt_expiration_time`` + - 86400 seconds (24 hours) + * - ``[api_auth] jwt_cli_expiration_time`` + - 3600 seconds (1 hour) + * - ``[api_auth] jwt_leeway`` + - 10 seconds + + +Execution API Authentication Flow +---------------------------------- + +The Execution API is an internal API used by workers to report task state transitions, +heartbeats, and to retrieve connections, variables, and XComs at task runtime. + +Token generation (Execution API) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +1. The **Scheduler** (via the executor) generates a JWT for each task instance before + dispatching it to a worker. The executor's ``jwt_generator`` property creates a + ``JWTGenerator`` configured with the ``[execution_api]`` settings. +2. 
The token's ``sub`` (subject) claim is set to the **task instance UUID**. +3. The token is embedded in the workload JSON payload (``BaseWorkloadSchema.token`` field) + that is sent to the worker process. + +Token structure (Execution API) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :header-rows: 1 + :widths: 15 85 + + * - Claim + - Description + * - ``jti`` + - Unique token identifier (UUID4 hex). + * - ``iss`` + - Issuer (from ``[api_auth] jwt_issuer``). Optional. + * - ``aud`` + - Audience (from ``[execution_api] jwt_audience``, default: ``urn:airflow.apache.org:task``). + * - ``sub`` + - Task instance UUID — the identity of the workload. + * - ``scope`` + - Token scope: ``"execution"`` (default) or ``"workload"`` (restricted). + * - ``iat`` + - Issued-at timestamp. + * - ``nbf`` + - Not-before timestamp. + * - ``exp`` + - Expiration timestamp (``iat + [execution_api] jwt_expiration_time``). + +Token scopes (Execution API) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Execution API defines two token scopes: + +**execution** (default) + Accepted by all Execution API endpoints. This is the standard scope for worker + communication. + +**workload** + A restricted scope accepted only on endpoints that explicitly opt in via + ``Security(require_auth, scopes=["token:workload"])``. Used for endpoints that + manage task state transitions. + +Tokens without a ``scope`` claim default to ``"execution"`` for backwards compatibility. + +Token delivery to workers +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The token flows through the execution stack as follows: + +1. **Executor** generates the token and embeds it in the workload JSON payload. +2. The workload JSON is passed to the worker process (via the executor-specific mechanism: + Celery message, Kubernetes Pod spec, local subprocess arguments, etc.). +3. The worker's ``execute_workload()`` function reads the workload JSON and extracts the token. +4. 
The ``supervise()`` function receives the token and creates an ``httpx.Client`` instance + with ``BearerAuth(token)`` for all Execution API HTTP requests. +5. The token is included in the ``Authorization: Bearer <token>`` header of every request. + +Token validation (Execution API) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``JWTBearer`` security dependency validates the token once per request: + +1. Extracts the token from the ``Authorization: Bearer`` header. +2. Performs cryptographic signature validation via ``JWTValidator``. +3. Verifies standard claims (``exp``, ``iat``, ``aud`` — ``nbf`` and ``iss`` if configured). +4. Defaults the ``scope`` claim to ``"execution"`` if absent. +5. Creates a ``TIToken`` object with the task instance ID and claims. +6. Caches the validated token on the ASGI request scope for the duration of the request. + +Route-level enforcement is handled by ``require_auth``: + +- Checks the token's ``scope`` against the route's ``allowed_token_types`` (precomputed + by ``ExecutionAPIRoute`` from ``token:*`` Security scopes at route registration time). +- Enforces ``ti:self`` scope — verifies that the token's ``sub`` claim matches the + ``{task_instance_id}`` path parameter, preventing a worker from accessing another task's + endpoints. + +Token refresh (Execution API) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The ``JWTReissueMiddleware`` automatically refreshes tokens that are approaching expiry: + +1. After each response, the middleware checks the token's remaining validity. +2. If less than **20%** of the total validity remains (minimum 30 seconds), the server + generates a new token preserving all original claims (including ``scope`` and ``sub``). +3. The refreshed token is returned in the ``Refreshed-API-Token`` response header. +4. The client's ``_update_auth()`` hook detects this header and transparently updates + the ``BearerAuth`` instance for subsequent requests. 
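The threshold arithmetic in step 2 can be sketched as follows — an illustrative reconstruction of the check, not the middleware's actual code:

```python
def needs_reissue(iat: float, exp: float, now: float,
                  fraction: float = 0.20, floor_s: float = 30.0) -> bool:
    """Return True when a token should be refreshed: less than 20% of its
    total lifetime remains, subject to a 30-second minimum threshold.

    Illustrative sketch of the documented behaviour; parameter names are
    not Airflow's.
    """
    remaining = exp - now
    threshold = max(fraction * (exp - iat), floor_s)
    return remaining < threshold
```

With the default 600-second Execution API token lifetime, the threshold works out to 120 seconds, i.e. the server starts reissuing roughly two minutes before expiry; the 30-second floor only matters for tokens with lifetimes under 150 seconds.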
+ +This mechanism ensures long-running tasks do not lose API access due to token expiry, +without requiring the worker to re-authenticate. + +Default timings (Execution API) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. list-table:: + :header-rows: 1 + :widths: 50 50 + + * - Setting + - Default + * - ``[execution_api] jwt_expiration_time`` + - 600 seconds (10 minutes) + * - ``[execution_api] jwt_audience`` + - ``urn:airflow.apache.org:task`` + * - Token refresh threshold + - 20% of validity remaining (minimum 30 seconds, i.e., at ~120 seconds before expiry + with the default 600-second token lifetime) + + +Dag File Processor and Triggerer +--------------------------------- + +The **Dag File Processor** and **Triggerer** are internal Airflow components that also +interact with the Execution API, but they do so via an **in-process** transport +(``InProcessExecutionAPI``) rather than over the network. This in-process API: + +- Runs the Execution API application directly within the same process, using an ASGI/WSGI + bridge. +- **Bypasses JWT authentication entirely** — the JWT bearer dependency is overridden to + always return a synthetic ``TIToken`` with the ``"execution"`` scope. +- Also bypasses per-resource access controls (connection, variable, and XCom access checks + are overridden to always allow). + +This design means that code running in the Dag File Processor or Triggerer has **unrestricted +access** to all Execution API operations without needing a valid JWT token. Since the Dag File +Processor parses user-submitted Dag files and the Triggerer executes user-submitted trigger +code, Dag authors whose code runs in these components effectively have the same level of +access as the internal API itself. + +In the default deployment, a **single Dag File Processor instance** parses Dag files for all +teams and a **single Triggerer instance** handles all triggers across all teams. 
This means +that Dag author code from different teams executes within the same process, with shared access +to the in-process Execution API and the metadata database. + +For multi-team deployments that require isolation, Deployment Managers must run **separate +Dag File Processor and Triggerer instances per team** as a deployment-level measure — Airflow +does not provide built-in support for per-team DFP or Triggerer instances. However, even with +separate instances, these components still have direct access to the metadata database +(the Dag File Processor needs it to store serialized Dags, and the Triggerer needs it to +manage trigger state). A Dag author whose code runs in these components can potentially +access the database directly, including reading or modifying data belonging to other teams, +or obtaining the JWT signing key if it is available in the process environment. + +See :doc:`/security/security_model` for the full security implications and deployment +hardening guidance. + + +Workload Isolation and Current Limitations +------------------------------------------ + +The current JWT authentication model operates under the following assumptions and limitations: + +**Worker process memory protection (Linux)** + On Linux, the supervisor process calls ``prctl(PR_SET_DUMPABLE, 0)`` at the start of + ``supervise()`` before forking the task process. This flag is inherited by the forked + child. Marking processes as non-dumpable prevents same-UID sibling processes from reading + ``/proc/<pid>/mem``, ``/proc/<pid>/environ``, or ``/proc/<pid>/maps``, and blocks + ``ptrace(PTRACE_ATTACH)``. This is critical because each supervisor holds a distinct JWT + token in memory — without this protection, a malicious task process running as the same + Unix user could steal tokens from sibling supervisor processes. 
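The non-dumpable protection described above can be illustrated with a minimal Linux-only sketch using the standard library. This mirrors the documented behavior but is not Airflow's actual implementation; the function name is hypothetical:

```python
import ctypes
import ctypes.util

PR_SET_DUMPABLE = 4  # constant from <sys/prctl.h>


def make_process_nondumpable():
    """Mark the current process non-dumpable, so same-UID sibling processes
    cannot read /proc/<pid>/mem or /proc/<pid>/environ, and cannot attach
    via ptrace. Children created by fork() inherit the flag."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    if libc.prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) != 0:
        errno = ctypes.get_errno()
        raise OSError(errno, "prctl(PR_SET_DUMPABLE, 0) failed")
```

Calling such a function before forking the task process is what makes the supervisor's in-memory JWT token unreadable to sibling processes running as the same Unix user.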
+ + This protection is one of the reasons that passing sensitive configuration via environment + variables is safer than via configuration files: environment variables are only readable + by the process itself (and root), whereas configuration files on disk are readable by any + process with filesystem access running as the same user. + + .. note:: + + This protection is Linux-specific. On non-Linux platforms, the + ``_make_process_nondumpable()`` call is a no-op. Deployment Managers running Airflow + on non-Linux platforms should implement alternative isolation measures. + +**No cross-workload isolation** + All worker workloads authenticate to the same Execution API with tokens that share the + same signing key, audience, and issuer. While the ``ti:self`` scope enforcement prevents + a worker from accessing *another task instance's* specific endpoints (e.g., heartbeat, + state transitions), the token grants access to shared resources such as connections, + variables, and XComs that are not scoped to individual tasks. + +**No team-level isolation in Execution API (experimental multi-team feature)** + The experimental multi-team feature (``[core] multi_team``) provides UI-level and REST + API-level RBAC isolation between teams, but **does not yet guarantee task-level isolation**. + At the Execution API level, there is no enforcement of team-based access boundaries. + A task from one team can access the same connections, variables, and XComs as a task from + another team. All workloads share the same JWT signing keys and audience regardless of team + assignment. + + In deployments where additional hardening measures are not implemented at the deployment + level, a task from one team can potentially access resources belonging to another team + (see :doc:`/security/security_model`). A deep understanding of configuration and deployment + security is required by Deployment Managers to configure it in a way that can guarantee + separation between teams. 
Task-level team isolation will be improved in future versions + of Airflow. + +**Dag File Processor and Triggerer bypass** + As described above, the default deployment runs a single Dag File Processor and a single + Triggerer for all teams. Both bypass JWT authentication entirely via in-process transport. + For multi-team isolation, Deployment Managers must run separate instances per team, but + even then, each instance retains direct database access. A Dag author whose code runs + in these components can potentially access the database directly — including data belonging + to other teams or the JWT signing key configuration — unless the Deployment Manager + restricts the database credentials and configuration available to each instance. + +**Planned improvements** + Future versions of Airflow will address these limitations with: + + - Finer-grained token scopes tied to specific resources (connections, variables) and teams. + - Enforcement of team-based isolation in the Execution API. + - Built-in support for per-team Dag File Processor and Triggerer instances. + - Improved sandboxing of user-submitted code in the Dag File Processor and Triggerer. + - Full task-level isolation for the multi-team feature. + + +Configuration Reference +------------------------ + +All JWT-related configuration parameters: + +.. list-table:: + :header-rows: 1 + :widths: 40 15 45 + + * - Parameter + - Default + - Description + * - ``[api_auth] jwt_secret`` + - Auto-generated + - Symmetric secret key for signing tokens. Must be the same across all components. Mutually exclusive with ``jwt_private_key_path``. + * - ``[api_auth] jwt_private_key_path`` + - None + - Path to PEM-encoded private key (RSA or Ed25519). Mutually exclusive with ``jwt_secret``. + * - ``[api_auth] jwt_algorithm`` + - ``GUESS`` + - Signing algorithm. Auto-detected from key type: HS512 for symmetric, RS256 for RSA, EdDSA for Ed25519. 
+ * - ``[api_auth] jwt_kid`` + - Auto (RFC 7638 thumbprint) + - Key ID placed in token header. Ignored for symmetric keys. + * - ``[api_auth] jwt_issuer`` + - None + - Issuer claim (``iss``). Recommended to be unique per deployment. + * - ``[api_auth] jwt_audience`` + - None + - Audience claim (``aud``) for REST API tokens. + * - ``[api_auth] jwt_expiration_time`` + - 86400 (24h) + - REST API token lifetime in seconds. + * - ``[api_auth] jwt_cli_expiration_time`` + - 3600 (1h) + - CLI token lifetime in seconds. + * - ``[api_auth] jwt_leeway`` + - 10 + - Clock skew tolerance in seconds for token validation. + * - ``[api_auth] trusted_jwks_url`` + - None + - JWKS endpoint URL or local file path for token validation. Mutually exclusive with ``jwt_secret``. + * - ``[execution_api] jwt_expiration_time`` + - 600 (10 min) + - Execution API token lifetime in seconds. + * - ``[execution_api] jwt_audience`` + - ``urn:airflow.apache.org:task`` + - Audience claim for Execution API tokens. + +.. important:: + + Time synchronization across all Airflow components is critical. Use NTP (e.g., ``ntpd`` or + ``chrony``) to keep clocks in sync. Clock skew beyond the configured ``jwt_leeway`` will cause + authentication failures. diff --git a/airflow-core/docs/security/security_model.rst b/airflow-core/docs/security/security_model.rst index 15b59b25090..d030e879096 100644 --- a/airflow-core/docs/security/security_model.rst +++ b/airflow-core/docs/security/security_model.rst @@ -62,11 +62,15 @@ Dag authors ........... They can create, modify, and delete Dag files. The -code in Dag files is executed on workers and in the Dag Processor. -Therefore, Dag authors can create and change code executed on workers -and the Dag Processor and potentially access the credentials that the Dag -code uses to access external systems. Dag authors have full access -to the metadata database. +code in Dag files is executed on workers, in the Dag File Processor, +and in the Triggerer. 
+Therefore, Dag authors can create and change code executed on workers, +the Dag File Processor, and the Triggerer, and potentially access the credentials that the Dag +code uses to access external systems. In Airflow 3, worker task code communicates with +the API server exclusively through the Execution API and does not have direct access to +the metadata database. However, Dag author code that executes in the Dag File Processor +and Triggerer still has direct access to the metadata database, as these components +require it for their operation (see :ref:`jwt-authentication-and-workload-isolation` for details). Authenticated UI users ....................... @@ -115,6 +119,8 @@ The primary difference between an operator and admin is the ability to manage an to other users, and access audit logs - only admins are able to do this. Otherwise assume they have the same access as an admin. +.. _connection-configuration-users: + Connection configuration users .............................. @@ -170,6 +176,8 @@ Viewers also do not have permission to access audit logs. For more information on the capabilities of authenticated UI users, see :doc:`apache-airflow-providers-fab:auth-manager/access-control`. +.. _capabilities-of-dag-authors: + Capabilities of Dag authors --------------------------- @@ -193,15 +201,21 @@ not open new security vulnerabilities. Limiting Dag Author access to subset of Dags -------------------------------------------- -Airflow does not have multi-tenancy or multi-team features to provide isolation between different groups of users when -it comes to task execution. While, in Airflow 3.0 and later, Dag Authors cannot directly access database and cannot run -arbitrary queries on the database, they still have access to all Dags in the Airflow installation and they can +Airflow does not yet provide full task-level isolation between different groups of users when +it comes to task execution. 
While, in Airflow 3.0 and later, worker task code cannot directly access the +metadata database (it communicates through the Execution API), Dag author code that runs in the Dag File +Processor and Triggerer still has direct database access. Regardless of execution context, Dag authors +have access to all Dags in the Airflow installation and they can modify any of those Dags - no matter which Dag the task code is executed for. This means that Dag authors can modify state of any task instance of any Dag, and there are no finer-grained access controls to limit that access. -There is a work in progress on multi-team feature in Airflow that will allow to have some isolation between different -groups of users and potentially limit access of Dag authors to only a subset of Dags, but currently there is no -such feature in Airflow and you can assume that all Dag authors have access to all Dags and can modify their state. +There is an **experimental** multi-team feature in Airflow (``[core] multi_team``) that provides UI-level and +REST API-level RBAC isolation between teams. However, this feature **does not yet guarantee task-level isolation**. +At the task execution level, workloads from different teams still share the same Execution API, signing keys, +connections, and variables. A task from one team can access the same shared resources as a task from another team. +The multi-team feature is a work in progress — task-level isolation and Execution API enforcement of team +boundaries will be improved in future versions of Airflow. Until then, you should assume that all Dag authors +have access to all Dags and shared resources, and can modify their state regardless of team assignment. Security contexts for Dag author submitted code @@ -239,8 +253,14 @@ Triggerer In case of Triggerer, Dag authors can execute arbitrary code in Triggerer. 
Currently there are no enforcement mechanisms that would allow to isolate tasks that are using deferrable functionality from -each other and arbitrary code from various tasks can be executed in the same process/machine. Deployment -Manager must trust that Dag authors will not abuse this capability. +each other and arbitrary code from various tasks can be executed in the same process/machine. The default +deployment runs a single Triggerer instance that handles triggers from all teams — there is no built-in +support for per-team Triggerer instances. Additionally, the Triggerer uses an in-process Execution API +transport that bypasses JWT authentication and has direct access to the metadata database. For +multi-team deployments, Deployment Managers must run separate Triggerer instances per team as a +deployment-level measure, but even then each instance retains direct database access and a Dag author +whose trigger code runs there can potentially access the database directly — including data belonging +to other teams. Deployment Manager must trust that Dag authors will not abuse this capability. Dag files not needed for Scheduler and API Server ................................................. @@ -282,6 +302,141 @@ Access to all Dags All Dag authors have access to all Dags in the Airflow deployment. This means that they can view, modify, and update any Dag without restrictions at any time. +.. _jwt-authentication-and-workload-isolation: + +JWT authentication and workload isolation +----------------------------------------- + +Airflow uses JWT (JSON Web Token) authentication for both its public REST API and its internal +Execution API. For a detailed description of the JWT authentication flows, token structure, and +configuration, see :doc:`/security/jwt_token_authentication`. + +Current isolation limitations +............................. 
+ +While Airflow 3 significantly improved the security model by preventing worker task code from +directly accessing the metadata database (workers now communicate exclusively through the +Execution API), **perfect isolation between Dag authors is not yet achieved**. Dag author code +still executes with direct database access in the Dag File Processor and Triggerer. The +following gaps exist: + +**Dag File Processor and Triggerer bypass JWT authentication** + The Dag File Processor and Triggerer use an in-process transport to access the Execution API, + which bypasses JWT authentication entirely. Since these components execute user-submitted code + (Dag files and trigger code respectively), a Dag author whose code runs in these components has + unrestricted access to all Execution API operations — including the ability to read any connection, + variable, or XCom — without needing a valid JWT token. + + Furthermore, the Dag File Processor has direct access to the metadata database (it needs this to + store serialized Dags). Dag author code executing in the Dag File Processor context could potentially + access the database directly, including the signing key configuration if it is available in the + process environment. If a Dag author obtains the JWT signing key, they could forge arbitrary tokens. + +**Dag File Processor and Triggerer are shared across teams** + In the default deployment, a **single Dag File Processor instance** parses all Dag files and a + **single Triggerer instance** handles all triggers — regardless of team assignment. There is no + built-in support for running per-team Dag File Processor or Triggerer instances. This means that + Dag author code from different teams executes within the same process, sharing the in-process + Execution API and direct database access. 
+ + For multi-team deployments that require separation, Deployment Managers must run **separate + Dag File Processor and Triggerer instances per team** as a deployment-level measure (for example, + by configuring each instance to only process bundles belonging to a specific team). However, even + with separate instances, each Dag File Processor and Triggerer retains direct access to the + metadata database — a Dag author whose code runs in these components can potentially access the + database directly, including reading or modifying data belonging to other teams, unless the + Deployment Manager restricts the database credentials and configuration available to each instance. + +**No cross-workload isolation in the Execution API** + All worker workloads authenticate to the same Execution API with tokens signed by the same key and + sharing the same audience. While the ``ti:self`` scope enforcement prevents a worker from accessing + another task's specific endpoints (heartbeat, state transitions), shared resources such as connections, + variables, and XComs are accessible to all tasks. There is no isolation between tasks belonging to + different teams or Dag authors at the Execution API level. + +**Token signing key is a shared secret** + In symmetric key mode (``[api_auth] jwt_secret``), the same secret key is used to both generate and + validate tokens. Any component that has access to this secret can forge tokens with arbitrary claims, + including tokens for other task instances or with elevated scopes. + +.. _deployment-hardening-for-improved-isolation: + +Deployment hardening for improved isolation +........................................... + +Deployment Managers who require stronger isolation between Dag authors and teams can take the following +measures. Note that these are deployment-specific actions that go beyond Airflow's built-in security +model — Airflow does not enforce these natively. 
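As background for the measures that follow, the shared-secret risk described above can be illustrated with a minimal standard-library sketch of HMAC-based JWT signing (HS256-style here for brevity; real deployments use a JWT library, and the claim values shown are hypothetical):

```python
import base64
import hashlib
import hmac
import json


def b64url(data):
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_hs256(claims, secret):
    """Any holder of the symmetric secret can mint a token with arbitrary
    claims, which is why the secret must be tightly restricted."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(sig)}"


# A process that obtained the shared secret could forge a token for any
# task instance or scope (hypothetical claim values):
forged = sign_hs256({"sub": "some-other-task-id", "scope": "execution"}, b"shared-secret")
```

Because signing and validation use the same key in symmetric mode, there is no way for the validator to distinguish a forged token from a legitimately issued one, which motivates both restricting the secret and preferring asymmetric keys.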
+ +**Mandatory code review of Dag files** + Implement a review process for all Dag submissions to Dag bundles. This can include: + + * Requiring pull request reviews before Dag files are deployed. + * Static analysis of Dag code to detect suspicious patterns (e.g., direct database access attempts, + reading environment variables, importing configuration modules). + * Automated linting rules that flag potentially dangerous code. + +**Restrict sensitive configuration parameters to the components that need them** + Do not share all configuration parameters across all components. In particular: + + * The JWT signing key (``[api_auth] jwt_secret`` or ``[api_auth] jwt_private_key_path``) should only + be available to components that need to generate tokens (Scheduler/Executor, API Server) and + components that need to validate tokens (API Server). Workers should not have access to the signing + key — they only need the tokens provided to them. + * Connection credentials for external systems should only be available to the API Server + (which serves them to workers via the Execution API), not to the Scheduler, Dag File Processor, + or Triggerer processes directly. + * Database connection strings should only be available to components that need direct database access + (API Server, Scheduler, Dag File Processor), not to workers. + +**Pass configuration via environment variables** + For higher security, pass sensitive configuration values via environment variables rather than + configuration files. Environment variables are inherently safer than configuration files in + Airflow's worker processes because of a built-in protection: on Linux, the supervisor process + calls ``prctl(PR_SET_DUMPABLE, 0)`` before forking the task process, and this flag is inherited + by the forked child. This marks both processes as non-dumpable, which prevents same-UID sibling + processes from reading ``/proc/<pid>/environ``, ``/proc/<pid>/mem``, or attaching via + ``ptrace``.
In contrast, configuration files on disk are readable by any process running as + the same Unix user. Environment variables can also be scoped to individual processes or + containers, making it easier to restrict which components have access to which secrets. + + The following is a non-exhaustive list of security-sensitive configuration variables that should + be carefully restricted: + + * ``AIRFLOW__API_AUTH__JWT_SECRET`` — JWT signing key (symmetric mode). + * ``AIRFLOW__API_AUTH__JWT_PRIVATE_KEY_PATH`` — Path to JWT private key (asymmetric mode). + * ``AIRFLOW__DATABASE__SQL_ALCHEMY_CONN`` — Metadata database connection string. + * ``AIRFLOW__CELERY__RESULT_BACKEND`` — Celery result backend connection string. + * ``AIRFLOW__CELERY__BROKER_URL`` — Celery broker URL. + * ``AIRFLOW__CORE__FERNET_KEY`` — Fernet encryption key for connections and variables at rest. + * ``AIRFLOW__SECRETS__BACKEND_KWARGS`` — Secrets backend credentials. + + This is not a complete list. Deployment Managers should review the full configuration reference + and identify all parameters that contain credentials or secrets relevant to their deployment. + +**Use asymmetric keys for JWT signing** + Using asymmetric keys (``[api_auth] jwt_private_key_path`` with a JWKS endpoint) provides better + security than symmetric keys because: + + * The private key (used for signing) can be restricted to the Scheduler/Executor. + * The API Server only needs the public key (via JWKS) for validation. + * Workers cannot forge tokens even if they could access the JWKS endpoint, since they would + not have the private key. + +**Network-level isolation** + Use network policies, VPCs, or similar mechanisms to restrict which components can communicate + with each other. For example, workers should only be able to reach the Execution API endpoint, + not the metadata database or internal services directly. 
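The per-component restriction of sensitive variables described above can be sketched as a launcher-side environment filter. This is a hypothetical deployment helper, not an Airflow API; the allow-list shown is one possible policy for illustration:

```python
import os

# Security-sensitive variables and the components allowed to see them.
# A deployment-specific policy, shown here purely for illustration.
ALLOWED = {
    "AIRFLOW__API_AUTH__JWT_SECRET": {"api-server", "scheduler"},
    "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN": {"api-server", "scheduler", "dag-processor"},
    "AIRFLOW__CORE__FERNET_KEY": {"api-server"},
    "AIRFLOW__SECRETS__BACKEND_KWARGS": {"api-server"},
}


def env_for(component, base_env=None):
    """Build the environment for one component, dropping every sensitive
    variable the component is not allowed to see (e.g. workers get neither
    the JWT secret nor the database connection string)."""
    base_env = dict(os.environ) if base_env is None else base_env
    return {
        k: v
        for k, v in base_env.items()
        if k not in ALLOWED or component in ALLOWED[k]
    }
```

A process launcher or container entrypoint built this way ensures that a worker process never even receives the signing key or database credentials in its environment.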
+ +**Other measures** + Deployment Managers may need to implement additional measures depending on their security requirements. + These may include monitoring and auditing of Execution API access patterns, runtime sandboxing of + Dag code, or dedicated infrastructure per team. Future versions of Airflow will address workload + isolation in a more complete way, with finer-grained token scopes, team-based Execution API enforcement, + and improved sandboxing of user-submitted code. The Airflow community is actively working on these + features. + + Custom RBAC limitations ----------------------- @@ -309,6 +464,8 @@ you trust them not to abuse the capabilities they have. You should also make sur properly configured the Airflow installation to prevent Dag authors from executing arbitrary code in the Scheduler and API Server processes. +.. _deploying-and-protecting-airflow-installation: + Deploying and protecting Airflow installation ............................................. @@ -354,13 +511,150 @@ Examples of fine-grained access control include (but are not limited to): * Access restrictions to views or Dags: Controlling user access to certain views or specific Dags, ensuring that users can only view or interact with authorized components. -Future: multi-tenancy isolation -............................... +Future: multi-team isolation +............................ These examples showcase ways in which Deployment Managers can refine and limit user privileges within Airflow, providing tighter control and ensuring that users have access only to the necessary components and functionalities based on their roles and responsibilities. However, fine-grained access control does not -provide full isolation and separation of access to allow isolation of different user groups in a -multi-tenant fashion yet. 
In future versions of Airflow, some fine-grained access control features could -become part of the Airflow security model, as the Airflow community is working on a multi-tenant model -currently. +yet provide full isolation and separation of access between different groups of users. + +The experimental multi-team feature (``[core] multi_team``) is a step towards cross-team isolation, but it +currently only enforces team-based isolation at the UI and REST API level. **Task-level isolation is not yet +guaranteed** — workloads from different teams share the same Execution API, JWT signing keys, and access to +connections, variables, and XComs. In deployments where additional hardening measures (described in +:ref:`deployment-hardening-for-improved-isolation`) are not implemented, a task belonging to one team can +potentially access shared resources available to tasks from other teams. Deployment Managers who enable the +multi-team feature should not rely on it alone for security-critical isolation between teams at the task +execution layer — a deep understanding of configuration and deployment security is required by Deployment +Managers to configure it in a way that can guarantee separation between teams. + +Future versions of Airflow will improve task-level isolation, including team-scoped Execution API enforcement, +finer-grained JWT token scopes, and better sandboxing of user-submitted code. The Airflow community is +actively working on these improvements. + + +What is NOT considered a security vulnerability +----------------------------------------------- + +The following scenarios are **not** considered security vulnerabilities in Airflow. They are either +intentional design choices, consequences of the trust model described above, or issues that fall +outside Airflow's threat model. Security researchers (and AI agents performing security analysis) +should review this section before reporting issues to the Airflow security team. 
+ +For full details on reporting policies, see +`Airflow's Security Policy <https://github.com/apache/airflow/security/policy>`_. + +Dag authors executing arbitrary code +..................................... + +Dag authors can execute arbitrary code on workers, the Dag File Processor, and the Triggerer. This +includes accessing credentials, environment variables, and (in the case of the Dag File Processor +and Triggerer) the metadata database directly. This is the intended behavior as described in +:ref:`capabilities-of-dag-authors` — Dag authors are trusted users. Reports that a Dag author can +"achieve RCE" or "access the database" by writing Dag code are restating a documented capability, +not discovering a vulnerability. + +Dag author code passing unsanitized input to operators and hooks +................................................................ + +When a Dag author writes code that passes unsanitized UI user input (such as Dag run parameters, +variables, or connection configuration values) to operators, hooks, or third-party libraries, the +responsibility lies with the Dag author. Airflow's hooks and operators are low-level interfaces — +Dag authors are Python programmers who must sanitize inputs before passing them to these interfaces. + +SQL injection or command injection is only considered a vulnerability if it can be triggered by a +**non-Dag-author** user role (e.g., an authenticated UI user) **without** the Dag author deliberately +writing code that passes that input unsafely. If the only way to exploit the injection requires writing +or modifying a Dag file, it is not a vulnerability — the Dag author already has the ability to execute +arbitrary code. See also :doc:`/security/sql`. + +An exception exists when official Airflow documentation explicitly recommends a pattern that leads to +injection — in that case, the documentation guidance itself is the issue and may warrant an advisory. 
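The Dag-author responsibility described above can be illustrated with a short sketch contrasting unsafe and safer handling of a UI-supplied parameter (hypothetical Dag code; the table and parameter names are made up, and sqlite3 stands in for any database hook):

```python
import sqlite3


def risky(cursor, table_name_from_params):
    # UNSAFE: interpolating a Dag run parameter straight into SQL.
    # If a UI user controls the parameter, this is the Dag author's bug,
    # not an Airflow vulnerability.
    cursor.execute(f"SELECT * FROM {table_name_from_params}")


def safer(cursor, user_id_from_params):
    # SAFER: values go through bind parameters; identifiers (table or
    # column names) should additionally be checked against an allow-list.
    cursor.execute("SELECT * FROM events WHERE user_id = ?", (user_id_from_params,))
```

Only the first pattern turns an authenticated UI user's input into an injection vector, and the responsibility for avoiding it lies with the Dag author who wrote the interpolation.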
+ +Dag File Processor and Triggerer having database access +....................................................... + +The Dag File Processor requires direct database access to store serialized Dags. The Triggerer requires +direct database access to manage trigger state. Both components execute user-submitted code (Dag files +and trigger code respectively) and bypass JWT authentication via an in-process Execution API transport. +These are intentional architectural choices, not vulnerabilities. They are documented in +:ref:`jwt-authentication-and-workload-isolation`. + +Workers accessing shared Execution API resources +................................................. + +Worker tasks can access connections, variables, and XComs via the Execution API using their JWT token. +While the ``ti:self`` scope prevents cross-task state manipulation, shared resources are accessible to +all tasks. This is the current design — not a vulnerability. Reports that "a task can read another +team's connection" are describing a known limitation of the current isolation model, documented in +:ref:`jwt-authentication-and-workload-isolation`. + +Execution API tokens not being revocable +........................................ + +Execution API tokens issued to workers are short-lived (default 10 minutes) with automatic refresh +and are intentionally not subject to revocation. This is a design choice documented in +:doc:`/security/jwt_token_authentication`, not a missing security control. + +Connection configuration capabilities +...................................... + +Users with the **Connection configuration** role can configure connections with arbitrary credentials +and connection parameters. When the ``test connection`` feature is enabled, these users can potentially +trigger RCE, arbitrary file reads, or Denial of Service through connection parameters. This is by +design — connection configuration users are highly privileged and must be trusted not to abuse these +capabilities. 
The ``test connection`` feature is disabled by default since Airflow 2.7.0, and enabling +it is an explicit Deployment Manager decision that acknowledges these risks. See +:ref:`connection-configuration-users` for details. + +Denial of Service by authenticated users +........................................ + +Airflow is not designed to be exposed to untrusted users on the public internet. All users who can +access the Airflow UI and API are authenticated and known. Denial of Service scenarios triggered by +authenticated users (such as creating very large Dag runs, submitting expensive queries, or flooding +the API) are not considered security vulnerabilities. They are operational concerns that Deployment +Managers should address through rate limiting, resource quotas, and monitoring — standard measures +for any internal application. See :ref:`deploying-and-protecting-airflow-installation`. + +Self-XSS by authenticated users +................................ + +Cross-site scripting (XSS) scenarios where the only victim is the user who injected the payload +(self-XSS) are not considered security vulnerabilities. Airflow's users are authenticated and +known, and self-XSS does not allow an attacker to compromise other users. If you discover an XSS +scenario where a lower-privileged user can inject a payload that executes in a higher-privileged +user's session without that user's action, that is a valid vulnerability and should be reported. + +Simple Auth Manager +................... + +The Simple Auth Manager is intended for development and testing only. This is clearly documented and +a prominent warning banner is displayed on the login page. Security issues specific to the Simple +Auth Manager (such as weak password handling, lack of rate limiting, or missing CSRF protections) are +not considered production security vulnerabilities. Production deployments must use a production-grade +auth manager. 
+ +Third-party dependency vulnerabilities in Docker images +....................................................... + +Airflow's reference Docker images are built with the latest available dependencies at release time. +Vulnerabilities found by scanning these images against CVE databases are expected to appear over time +as new CVEs are published. These should **not** be reported to the Airflow security team. Instead, +users should build their own images with updated dependencies as described in the +`Docker image documentation <https://airflow.apache.org/docs/docker-stack/index.html>`_. + +If you discover that a third-party dependency vulnerability is **actually exploitable** in Airflow +(with a proof-of-concept demonstrating the exploitation in Airflow's context), that is a valid +report and should be submitted following the security policy. + +Automated scanning results without human verification +..................................................... + +Automated security scanner reports that list findings without human verification against Airflow's +security model are not considered valid vulnerability reports. Airflow's trust model differs +significantly from typical web applications — many scanner findings (such as "admin user can execute +code" or "database credentials accessible in configuration") are expected behavior. Reports must +include a proof-of-concept that demonstrates how the finding violates the security model described +in this document, including identifying the specific user role involved and the attack scenario. diff --git a/airflow-core/src/airflow/config_templates/config.yml b/airflow-core/src/airflow/config_templates/config.yml index e1f1c228a61..c83d8b629ec 100644 --- a/airflow-core/src/airflow/config_templates/config.yml +++ b/airflow-core/src/airflow/config_templates/config.yml @@ -1977,8 +1977,14 @@ api_auth: description: | Secret key used to encode and decode JWTs to authenticate to public and private APIs. 
- It should be as random as possible. However, when running more than 1 instances of API services, - make sure all of them use the same ``jwt_secret`` otherwise calls will fail on authentication. + It should be as random as possible. This key must be consistent across all components that + generate or validate JWT tokens (Scheduler, API Server). For improved security, consider + using asymmetric keys (``jwt_private_key_path``) instead, which allow you to restrict the + signing key to only the components that need to generate tokens. + + For security-sensitive deployments, pass this value via environment variable + (``AIRFLOW__API_AUTH__JWT_SECRET``) rather than storing it in a configuration file, and + restrict it to only the components that need it. Mutually exclusive with ``jwt_private_key_path``. version_added: 3.0.0
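A sufficiently random secret of this kind can be generated with the Python standard library, for example (one possible approach, not an Airflow requirement):

```python
import secrets

# 64 random bytes, URL-safe base64 encoded: a high-entropy symmetric
# signing secret suitable for passing via AIRFLOW__API_AUTH__JWT_SECRET.
jwt_secret = secrets.token_urlsafe(64)
print(jwt_secret)
```

The ``secrets`` module draws from the operating system's CSPRNG, which is what "as random as possible" requires; ordinary ``random`` is not suitable for secrets.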
