This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch 3.2.0-docs
in repository https://gitbox.apache.org/repos/asf/airflow.git
commit bd0274d0b6d000675f28bdf57270e04a6b22cd93
Author: Jarek Potiuk <[email protected]>
AuthorDate: Mon Apr 6 16:19:09 2026 +0200

    Fix spelling errors and use 'potentially' for DFP/Triggerer access

    - Add dumpable, sandboxing, unsanitized, XSS to spelling wordlist
    - Use 'potentially' consistently when describing Dag File Processor and
      Triggerer database access and JWT authentication bypass, since these are
      capabilities that Dag author code could exploit rather than guaranteed
      behaviors of normal operation
---
 .github/instructions/code-review.instructions.md |  2 +-
 AGENTS.md                                        | 19 +++----
 .../docs/installation/upgrading_to_airflow3.rst  |  2 +-
 .../docs/security/jwt_token_authentication.rst   | 41 +++++++--------
 airflow-core/docs/security/security_model.rst    | 60 +++++++++++-----------
 docs/spelling_wordlist.txt                       |  4 ++
 6 files changed, 68 insertions(+), 60 deletions(-)

diff --git a/.github/instructions/code-review.instructions.md b/.github/instructions/code-review.instructions.md
index 411f0814289..cd480bdcaf7 100644
--- a/.github/instructions/code-review.instructions.md
+++ b/.github/instructions/code-review.instructions.md
@@ -11,7 +11,7 @@ Use these rules when reviewing pull requests to the Apache Airflow repository.
 
 - **Scheduler must never run user code.** It only processes serialized Dags. Flag any scheduler-path code that deserializes or executes Dag/task code.
 - **Flag any task execution code that accesses the metadata DB directly** instead of through the Execution API (`/execution` endpoints).
-- **Flag any code in Dag Processor or Triggerer that breaks process isolation** — these components run user code in separate processes from the Scheduler and API Server, but note that they have direct metadata database access and bypass JWT authentication via in-process Execution API transport. This is an intentional design choice documented in the security model, not a security vulnerability.
+- **Flag any code in Dag Processor or Triggerer that breaks process isolation** — these components run user code in separate processes from the Scheduler and API Server, but note that they potentially have direct metadata database access and potentially bypass JWT authentication via in-process Execution API transport. This is an intentional design choice documented in the security model, not a security vulnerability.
 - **Flag any provider importing core internals** like `SUPERVISOR_COMMS` or task-runner plumbing. Providers interact through the public SDK and execution API only.
 
 ## Database and Query Correctness
diff --git a/AGENTS.md b/AGENTS.md
index 1925cce4a86..ac347fd2e91 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -66,11 +66,11 @@ UV workspace monorepo. Key paths:
 ## Architecture Boundaries
 
 1. Users author Dags with the Task SDK (`airflow.sdk`).
-2. Dag File Processor parses Dag files in separate processes and stores serialized Dags in the metadata DB. It has **direct database access** and uses an in-process Execution API transport that **bypasses JWT authentication**.
+2. Dag File Processor parses Dag files in separate processes and stores serialized Dags in the metadata DB. It potentially has **direct database access** and uses an in-process Execution API transport that **potentially bypasses JWT authentication**.
 3. Scheduler reads serialized Dags — **never runs user code** — and creates Dag runs / task instances.
 4. Workers execute tasks via Task SDK and communicate with the API server through the Execution API — **never access the metadata DB directly**. Each task receives a short-lived JWT token scoped to its task instance ID.
 5. API Server serves the React UI and handles all client-database interactions.
-6. Triggerer evaluates deferred tasks/sensors in separate processes. Like the Dag File Processor, it has **direct database access** and uses an in-process Execution API transport that **bypasses JWT authentication**.
+6. Triggerer evaluates deferred tasks/sensors in separate processes. Like the Dag File Processor, it potentially has **direct database access** and uses an in-process Execution API transport that **potentially bypasses JWT authentication**.
 7. Shared libraries that are symbolically linked to different Python distributions are in `shared` folder.
 8. Airflow uses `uv workspace` feature to keep all the distributions sharing dependencies and venv
 9. Each of the distributions should declare other needed distributions: `uv --project <FOLDER> sync` command acts on the selected project in the monorepo with only dependencies that it has
@@ -84,13 +84,14 @@ and [`airflow-core/docs/security/jwt_token_authentication.rst`](airflow-core/doc
 
 **The following are intentional design choices, not security vulnerabilities:**
 
-- **Dag File Processor and Triggerer bypass JWT authentication.** They use `InProcessExecutionAPI`
-  which overrides the JWT bearer dependency to always allow access. This is by design — these
-  components run within trusted infrastructure and need direct database access for their core
-  operations (storing serialized Dags, managing trigger state).
-- **Dag File Processor and Triggerer have direct metadata database access.** User-submitted code
-  (Dag files, trigger code) executes in these components and can potentially access the database.
-  This is a known limitation documented in the security model, not an undiscovered vulnerability.
+- **Dag File Processor and Triggerer potentially bypass JWT authentication.** They use
+  `InProcessExecutionAPI` which overrides the JWT bearer dependency to always allow access. This
+  is by design — these components run within trusted infrastructure and potentially need direct
+  database access for their core operations (storing serialized Dags, managing trigger state).
+- **Dag File Processor and Triggerer potentially have direct metadata database access.**
+  User-submitted code (Dag files, trigger code) executes in these components and can potentially
+  access the database. This is a known limitation documented in the security model, not an
+  undiscovered vulnerability.
 - **Worker Execution API tokens grant access to shared resources.** While `ti:self` scope prevents
   cross-task state manipulation, connections, variables, and XComs are accessible to all tasks.
   This is the current design — finer-grained scoping is planned for future versions.
diff --git a/airflow-core/docs/installation/upgrading_to_airflow3.rst b/airflow-core/docs/installation/upgrading_to_airflow3.rst
index 2f5cfea324c..ad0b5507b62 100644
--- a/airflow-core/docs/installation/upgrading_to_airflow3.rst
+++ b/airflow-core/docs/installation/upgrading_to_airflow3.rst
@@ -54,7 +54,7 @@ In Airflow 3, direct metadata database access from task code is now restricted.
 
 - **No Direct Database Access**: Task code can no longer directly import and use Airflow database sessions or models.
 - **API-Based Resource Access**: All runtime interactions (state transitions, heartbeats, XComs, and resource fetching) are handled through a dedicated Task Execution API.
-- **Enhanced Security**: This improves isolation and security by preventing worker task code from directly accessing or modifying the Airflow metadata database. Note that Dag author code still executes with direct database access in the Dag File Processor and Triggerer — see :doc:`/security/security_model` for details.
+- **Enhanced Security**: This improves isolation and security by preventing worker task code from directly accessing or modifying the Airflow metadata database. Note that Dag author code potentially still executes with direct database access in the Dag File Processor and Triggerer — see :doc:`/security/security_model` for details.
 - **Stable Interface**: The Task SDK provides a stable, forward-compatible interface for accessing Airflow resources without direct database dependencies.
 
 Step 1: Take care of prerequisites
diff --git a/airflow-core/docs/security/jwt_token_authentication.rst b/airflow-core/docs/security/jwt_token_authentication.rst
index bd897a681d6..87354039447 100644
--- a/airflow-core/docs/security/jwt_token_authentication.rst
+++ b/airflow-core/docs/security/jwt_token_authentication.rst
@@ -298,28 +298,29 @@ interact with the Execution API, but they do so via an **in-process** transport
 
 - Runs the Execution API application directly within the same process, using an ASGI/WSGI
   bridge.
-- **Bypasses JWT authentication entirely** — the JWT bearer dependency is overridden to
-  always return a synthetic ``TIToken`` with the ``"execution"`` scope.
-- Also bypasses per-resource access controls (connection, variable, and XCom access checks
-  are overridden to always allow).
-
-This design means that code running in the Dag File Processor or Triggerer has **unrestricted
-access** to all Execution API operations without needing a valid JWT token. Since the Dag File
-Processor parses user-submitted Dag files and the Triggerer executes user-submitted trigger
-code, Dag authors whose code runs in these components effectively have the same level of
-access as the internal API itself.
+- **Potentially bypasses JWT authentication** — the JWT bearer dependency is overridden to
+  always return a synthetic ``TIToken`` with the ``"execution"`` scope, effectively bypassing
+  token validation.
+- Also potentially bypasses per-resource access controls (connection, variable, and XCom access
+  checks are overridden to always allow).
+
+This design means that code running in the Dag File Processor or Triggerer potentially has
+**unrestricted access** to all Execution API operations without needing a valid JWT token. Since
+the Dag File Processor parses user-submitted Dag files and the Triggerer executes user-submitted
+trigger code, Dag authors whose code runs in these components could potentially have the same
+level of access as the internal API itself.
 
 In the default deployment, a **single Dag File Processor instance** parses Dag files for all
 teams and a **single Triggerer instance** handles all triggers across all teams. This means
-that Dag author code from different teams executes within the same process, with shared access
-to the in-process Execution API and the metadata database.
+that Dag author code from different teams executes within the same process, with potentially
+shared access to the in-process Execution API and the metadata database.
 
 For multi-team deployments that require isolation, Deployment Managers must run **separate
 Dag File Processor and Triggerer instances per team** as a deployment-level measure — Airflow
 does not provide built-in support for per-team DFP or Triggerer instances. However, even with
-separate instances, these components still have direct access to the metadata database
-(the Dag File Processor needs it to store serialized Dags, and the Triggerer needs it to
-manage trigger state). A Dag author whose code runs in these components can potentially
+separate instances, these components still potentially have direct access to the metadata
+database (the Dag File Processor needs it to store serialized Dags, and the Triggerer needs it
+to manage trigger state). A Dag author whose code runs in these components can potentially
 access the database directly, including reading or modifying data belonging to other teams,
 or obtaining the JWT signing key if it is available in the process environment.
 
@@ -374,13 +375,13 @@ The current JWT authentication model operates under the following assumptions an
     separation between teams. Task-level team isolation will be improved in future versions of
     Airflow.
-**Dag File Processor and Triggerer bypass**
+**Dag File Processor and Triggerer potentially bypass JWT and access the database**
     As described above, the default deployment runs a single Dag File Processor and a single
-    Triggerer for all teams. Both bypass JWT authentication entirely via in-process transport.
+    Triggerer for all teams. Both potentially bypass JWT authentication via in-process transport.
     For multi-team isolation, Deployment Managers must run separate instances per team, but
-    even then, each instance retains direct database access. A Dag author whose code runs
-    in these components can potentially access the database directly — including data belonging
-    to other teams or the JWT signing key configuration — unless the Deployment Manager
+    even then, each instance potentially retains direct database access. A Dag author whose code
+    runs in these components can potentially access the database directly — including data
+    belonging to other teams or the JWT signing key configuration — unless the Deployment Manager
     restricts the database credentials and configuration available to each instance.
 
 **Planned improvements**
diff --git a/airflow-core/docs/security/security_model.rst b/airflow-core/docs/security/security_model.rst
index d030e879096..cb1ade8e4f8 100644
--- a/airflow-core/docs/security/security_model.rst
+++ b/airflow-core/docs/security/security_model.rst
@@ -69,7 +69,7 @@ the Dag File Processor, and the Triggerer, and potentially access the credential
 code uses to access external systems. In Airflow 3, worker task code communicates with the
 API server exclusively through the Execution API and does not have direct access to the
 metadata database. However, Dag author code that executes in the Dag File Processor
-and Triggerer still has direct access to the metadata database, as these components
+and Triggerer potentially still has direct access to the metadata database, as these components
 require it for their operation (see :ref:`jwt-authentication-and-workload-isolation` for details).
 
 Authenticated UI users
@@ -204,7 +204,7 @@ Limiting Dag Author access to subset of Dags
 Airflow does not yet provide full task-level isolation between different groups of users when it comes
 to task execution. While, in Airflow 3.0 and later, worker task code cannot directly access the metadata
 database (it communicates through the Execution API), Dag author code that runs in the Dag File
-Processor and Triggerer still has direct database access. Regardless of execution context, Dag authors
+Processor and Triggerer potentially still has direct database access. Regardless of execution context, Dag authors
 have access to all Dags in the Airflow installation and they can modify any of those Dags - no matter
 which Dag the task code is executed for. This means that Dag authors can modify state of any task
 instance of any Dag, and there are no finer-grained access controls to limit that access.
@@ -256,9 +256,10 @@ enforcement mechanisms that would allow to isolate tasks that are using deferrab
 each other and arbitrary code from various tasks can be executed in the same process/machine. The
 default deployment runs a single Triggerer instance that handles triggers from all teams — there is no
 built-in support for per-team Triggerer instances. Additionally, the Triggerer uses an in-process Execution API
-transport that bypasses JWT authentication and has direct access to the metadata database. For
-multi-team deployments, Deployment Managers must run separate Triggerer instances per team as a
-deployment-level measure, but even then each instance retains direct database access and a Dag author
+transport that potentially bypasses JWT authentication and potentially has direct access to the metadata
+database. For multi-team deployments, Deployment Managers must run separate Triggerer instances per team
+as a deployment-level measure, but even then each instance potentially retains direct database access
+and a Dag author
 whose trigger code runs there can potentially access the database directly — including data belonging
 to other teams. Deployment Manager must trust that Dag authors will not abuse this capability.
 
@@ -317,34 +318,35 @@ Current isolation limitations
 While Airflow 3 significantly improved the security model by preventing worker task code from
 directly accessing the metadata database (workers now communicate exclusively through the
 Execution API), **perfect isolation between Dag authors is not yet achieved**. Dag author code
-still executes with direct database access in the Dag File Processor and Triggerer. The
-following gaps exist:
+potentially still executes with direct database access in the Dag File Processor and Triggerer.
+The following gaps exist:
 
-**Dag File Processor and Triggerer bypass JWT authentication**
+**Dag File Processor and Triggerer potentially bypass JWT authentication**
   The Dag File Processor and Triggerer use an in-process transport to access the Execution API,
-  which bypasses JWT authentication entirely. Since these components execute user-submitted code
-  (Dag files and trigger code respectively), a Dag author whose code runs in these components has
-  unrestricted access to all Execution API operations — including the ability to read any connection,
-  variable, or XCom — without needing a valid JWT token.
+  which potentially bypasses JWT authentication. Since these components execute user-submitted code
+  (Dag files and trigger code respectively), a Dag author whose code runs in these components
+  potentially has unrestricted access to all Execution API operations — including the ability to
+  read any connection, variable, or XCom — without needing a valid JWT token.
 
-  Furthermore, the Dag File Processor has direct access to the metadata database (it needs this to
-  store serialized Dags). Dag author code executing in the Dag File Processor context could potentially
-  access the database directly, including the signing key configuration if it is available in the
-  process environment. If a Dag author obtains the JWT signing key, they could forge arbitrary tokens.
+  Furthermore, the Dag File Processor potentially has direct access to the metadata database (it
+  needs this to store serialized Dags). Dag author code executing in the Dag File Processor context
+  could potentially access the database directly, including the signing key configuration if it is
+  available in the process environment. If a Dag author obtains the JWT signing key, they could
+  potentially forge arbitrary tokens.
 
 **Dag File Processor and Triggerer are shared across teams**
   In the default deployment, a **single Dag File Processor instance** parses all Dag files and a
   **single Triggerer instance** handles all triggers — regardless of team assignment. There is no
   built-in support for running per-team Dag File Processor or Triggerer instances. This means that
-  Dag author code from different teams executes within the same process, sharing the in-process
-  Execution API and direct database access.
+  Dag author code from different teams executes within the same process, potentially sharing the
+  in-process Execution API and direct database access.
 
   For multi-team deployments that require separation, Deployment Managers must run **separate
   Dag File Processor and Triggerer instances per team** as a deployment-level measure (for example,
   by configuring each instance to only process bundles belonging to a specific team). However, even
-  with separate instances, each Dag File Processor and Triggerer retains direct access to the
-  metadata database — a Dag author whose code runs in these components can potentially access the
-  database directly, including reading or modifying data belonging to other teams, unless the
+  with separate instances, each Dag File Processor and Triggerer potentially retains direct access
+  to the metadata database — a Dag author whose code runs in these components can potentially access
+  the database directly, including reading or modifying data belonging to other teams, unless the
   Deployment Manager restricts the database credentials and configuration available to each instance.
 
 **No cross-workload isolation in the Execution API**
@@ -550,7 +552,7 @@ Dag authors executing arbitrary code
 
 Dag authors can execute arbitrary code on workers, the Dag File Processor, and the Triggerer. This
 includes accessing credentials, environment variables, and (in the case of the Dag File Processor
-and Triggerer) the metadata database directly. This is the intended behavior as described in
+and Triggerer) potentially the metadata database directly. This is the intended behavior as described in
 :ref:`capabilities-of-dag-authors` — Dag authors are trusted users. Reports that a Dag author can
 "achieve RCE" or "access the database" by writing Dag code are restating a documented capability,
 not discovering a vulnerability.
@@ -572,14 +574,14 @@ arbitrary code. See also :doc:`/security/sql`.
 An exception exists when official Airflow documentation explicitly recommends a pattern that leads
 to injection — in that case, the documentation guidance itself is the issue and may warrant
 an advisory.
 
-Dag File Processor and Triggerer having database access
-.......................................................
+Dag File Processor and Triggerer potentially having database access
+...................................................................
 
-The Dag File Processor requires direct database access to store serialized Dags. The Triggerer requires
-direct database access to manage trigger state. Both components execute user-submitted code (Dag files
-and trigger code respectively) and bypass JWT authentication via an in-process Execution API transport.
-These are intentional architectural choices, not vulnerabilities. They are documented in
-:ref:`jwt-authentication-and-workload-isolation`.
+The Dag File Processor potentially has direct database access to store serialized Dags. The Triggerer
+potentially has direct database access to manage trigger state. Both components execute user-submitted
+code (Dag files and trigger code respectively) and potentially bypass JWT authentication via an
+in-process Execution API transport. These are intentional architectural choices, not vulnerabilities.
+They are documented in :ref:`jwt-authentication-and-workload-isolation`.
 
 Workers accessing shared Execution API resources
 .................................................
diff --git a/docs/spelling_wordlist.txt b/docs/spelling_wordlist.txt
index bd5539dc85a..2ba6bf200d5 100644
--- a/docs/spelling_wordlist.txt
+++ b/docs/spelling_wordlist.txt
@@ -510,6 +510,7 @@ dttm
 dtypes
 du
 duckdb
+dumpable
 dunder
 dup
 durations
@@ -1384,6 +1385,7 @@ salesforce
 samesite
 saml
 sandboxed
+sandboxing
 sanitization
 sas
 Sasl
@@ -1728,6 +1730,7 @@ unpause
 unpaused
 unpausing
 unpredicted
+unsanitized
 untestable
 untransformed
 untrusted
@@ -1832,6 +1835,7 @@ Xiaodong
 xlarge
 xml
 xpath
+XSS
 xyz
 yaml
 Yandex
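The docs changed above repeatedly describe an in-process Execution API transport whose JWT bearer dependency is overridden to always return a synthetic token with the "execution" scope. As a rough, framework-free sketch of that dependency-override pattern (every name below is hypothetical and merely stands in for Airflow's real identifiers such as `InProcessExecutionAPI` and `TIToken`; this is not Airflow code):

```python
# Hypothetical sketch of the dependency-override pattern described in the
# docs above. None of these names are Airflow's actual identifiers.


class AuthError(Exception):
    """Raised when a request carries no valid bearer token."""


def jwt_bearer_dependency(token=None):
    # Normal HTTP path: reject requests without a valid, signed JWT.
    # (A stand-in string comparison replaces real signature validation.)
    if token != "valid-signed-jwt":
        raise AuthError("missing or invalid JWT")
    return {"scope": "execution", "sub": "task-instance-id"}


def in_process_override(token=None):
    # In-process transport: the bearer dependency is swapped out, so every
    # call is allowed and a synthetic token with the "execution" scope is
    # returned without any validation.
    return {"scope": "execution", "sub": "in-process"}


def call_execution_api(auth_dependency, token=None):
    # The API handler only sees whatever the injected dependency returns.
    claims = auth_dependency(token)
    return f"allowed with scope={claims['scope']}"
```

With `jwt_bearer_dependency` injected, a call without a token raises `AuthError`; with `in_process_override` injected, the same call succeeds unconditionally, which is the "potentially bypasses JWT authentication" behavior the docs describe for components hosting the API in-process.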
