Thank you so much, Ash and Jarek, for the clarifications. I learned a lot from these responses.
On Thu, Feb 26, 2026 at 8:40 AM Jarek Potiuk <[email protected]> wrote: > > Good stuff Ash. Thanks for all those well-thought-out items :). > > If we can make it, I think we can have a bit more discussion at the devlist > today (I added a topic for that). > Let me add a few comments; that might lead to a few points we can cover > during the call. > > > > No, the supervisor does nothing with those secrets. It does not generate > > any tokens. The token is generated on the executor and sent to the worker > > via some mechanism (outside of the Task Execution API. That’s the executor’s > > responsibility.) > > > > Good. We received a few reports about it from security researchers (we > discarded them as invalid because, until now, the JWT_TOKEN "scope" did not > matter). Consequently, it's not clearly documented, and we are about to > start making some claims here. > > > > > If the API server were the sole > > > token issuer, the scheduler would dispatch tasks with just the task > > > identity, no token, no signing key > > > > > > You can (and probably should) run the Workers _without_ that setting set. > > This is a bit of a departure from the (unwritten?) rule that all Airflow > > components should have the same config. I will propose this change in our > > docs. Or if someone wants to do that I’ll approve it. > > > > Unfortunately, it is well-documented: > https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html > > We even explain that even if some configuration values refer to other > components, they should be configured the same for both the worker and the > web server. We also provide an example of the API server secret key. We do > not have a clear explanation of which configuration values should be set for > which components - just a "configuration group" is not enough, as the example given shows. Here is the paragraph: > > *> Use the same configuration across all the Airflow components. 
While each > component does not require all, some configurations need to be same > otherwise they would not work as expected. A good example for that is > secret_key which should be same on the Webserver and Worker to allow > Webserver to fetch logs from Worker.* > > Also, I believe it's more than just documentation. We also have: > > * As described in > https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html#setting-configuration-options, > the "airflow config --defaults" tool produces an example configuration for > airflow (a single file containing only the necessary configurations, > without separate configurations for each component). We should likely have > a way to use separate configuration files for separate components. Even > better - I think - we should have a "secure" version that does not produce > defaults for sensitive configurations. I think we must force users to pass > these via environment variables (see my comments about PR_SET_DUMPABLE > below). > > * the docker compose file we release, where not only is a single "airflow.cfg" > shared between different components, but AIRFLOW__API_AUTH__JWT_SECRET is also in the > `<<: *airflow-common` section that is shared by all components > > * the Helm chart, where "airflow_config_mount" shares the same "airflow.cfg" file > between all components. > > IMHO, if we want to introduce a **new** approach where different > configuration parameters should be set for different components, all those > places should cover this and provide a way to set different values (or > explicitly document that they are not implementing the isolation > properties). This might actually be simpler - almost no work needed except > having only non-sensitive config - if we implement a "secure" running mode > where sensitive "configuration" is only available via env vars (see below). > > Also, it does not really solve the problem of the DagFileProcessor and Triggerer. 
> Both our processes and user processes currently access the same > configuration files (for example, the DB configuration string), allowing > the user code to do "anything". If we want to promise certain levels of > isolation (which we have done partially but not precisely so far), we need > a solution for running user code in both DagFileProcessor child processes > and the triggerer's async loop. > > > > Tasks need to be able to operate while only being _pushed_ information. > > > > Also there is https://github.com/apache/airflow/pull/60108 in progress > > which, along with a change I’m working on, will mean that the token that is being > > sent along with a workload can only be used to call the /run endpoint once, > > which helps a lot here. > > > > > Yep. That's a nice improvement. However, each executor should have a > somewhat secure way to pass the original token and clearly state its > security "level" in this context (for example, the LocalExecutor provides > no isolation or security, as you noted). > > > > I’m looking at whether we can drop capabilities in the supervisor, such that > > the forked user-code process is then also unable to read memory of another > > process even of the same user. This change, when coupled with > > removing/documenting/warning that the jwt_secret should not be on workers would, I > > think, make this entirely secure. > > > > Not quite the capabilities subsystem, but there is a prctl syscall and the > > PR_SET_DUMPABLE flag > > https://man7.org/linux/man-pages/man2/PR_SET_DUMPABLE.2const.html we can > > use - see some simple testing > > https://gist.github.com/ashb/5d62f244b837cb4052743318eb18fdc6 - PR > > incoming today. I’ll make sure to include the Workers (i.e. celery > > processes, as the JWT transits through those as well) > > > > Yep - good idea. That's one of the options I looked up (with suid and > cgroups). 
It seems to add a good layer of defense against reading memory > from the parent process (as long as we do it early enough so sensitive data > isn't already in memory before forking - if the data is already there, the > child process will access it). The celery process forking dance should > ensure the celery "master" process that forks the supervisors never sees > any sensitive credentials. > > Still, I think that alone does not prevent the DagFileProcessor parsing > processes and Triggerer async coroutines from accessing the configuration > the main process ran with if the values are configured in the configuration > files. My thinking so far was that we could use `PR_SET_DUMPABLE` similarly > for both the Triggerer and DagFileProcessor, but any sensitive data should > **only** ever be set via environment variables - because PR_SET_DUMPABLE > also protects /proc/pid/environ in the same way it protects /proc/pid/mem > (but it does not prevent reading from configuration files). > > I believe we could enforce that if we ensure no "non-shareable" sensitive > configuration variable is ever read from the config file. I would say that > we might need a "security_isolation" flag or similar that will, for > example, prevent reading such sensitive information from config files (and > even actively fail when one is found). Using PR_SET_DUMPABLE for all > components needing isolation might be a very secure solution. It could also > perform other checks and fail to start a component if any security-related > checks fail (for example, permissions for the home directory are too > open) - similar to what `ssh` does when it refuses keys with overly broad > permissions. It is also deployment-independent, which is very cool, because > we do not want to "force" users to configure sudo-allowed UNIX users, for > example, or generally define several UNIX users (as was the case with > impersonation). > > I think applying all of that brings us very close to a pretty "safe" > isolation feature. 
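For readers unfamiliar with the prctl mechanism being discussed, here is a minimal sketch of dropping dumpability early in a supervisor-like process. This is illustrative only (Linux-specific; the function names are made up for this sketch and are not Airflow code):

```python
import ctypes
import os
import sys

# Constants from <linux/prctl.h>
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4


def make_undumpable() -> None:
    """Mark this process non-dumpable via prctl(PR_SET_DUMPABLE, 0).

    Afterwards /proc/<pid>/mem and /proc/<pid>/environ become owned by
    root, so other processes of the same unprivileged user (e.g. a forked
    task) can no longer read this process's memory or original environment.
    This must run before secrets are handed to any child - and ideally
    before secrets enter memory at all.
    """
    if sys.platform != "linux":  # prctl is Linux-specific
        return
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def is_dumpable() -> bool:
    """Return the current dumpable flag (always True on non-Linux here)."""
    if sys.platform != "linux":
        return True
    libc = ctypes.CDLL(None, use_errno=True)
    return bool(libc.prctl(PR_GET_DUMPABLE, 0, 0, 0, 0))
```

Note that the flag is per-process, so each component (supervisor, celery worker, triggerer) would need to set it for itself, early in startup.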
> > > > --- > > Claims > > > > > > So I’m not sure we _need_ to include the dag/task/run id etc. in the claim, > > as the UUID already uniquely identifies the TI row, and we need to fetch > > that on every API access to validate that the TI is actually still in a > > “valid” state to be able to speak to the Execution API. Specifically, that > > it is still the latest TI try (because if it’s not the latest attempt, the > > UUID will only be found in the ti history table, and that can't be running, > > so this task shouldn’t be able to call any API endpoints.) Given that, > > we can keep the tokens shorter and make the loaded TI object available in > > the request context. > > > > Indeed, I don't think we need to do it if we always make such a check - > the question is whether adding, for example, team_id would allow us to avoid > some join queries. This is likely a question for Vincent and Nicolas. > > > > > > --- > > LocalExecutor > > > > Honestly, there’s not _much_ protection you can do when running things > > with the local executor. I’d say we document “if you care about this > > pattern of protection, do not use local exec”. (Because the secrets are > > almost certainly readable on disk.) > > > > Oh absolutely. We should make sure that this is well documented. > > > > > > > > > On 25 Feb 2026, at 06:33, Anish Giri <[email protected]> wrote: > > > > > > Hi Jarek, > > > > > > As we are waiting for Ash/Kaxil/Amogh's input, I tried to trace the > > > token flow through the codebase. I would appreciate your thoughts on > > > it. > > > > > > Your point about the forked task always being able to extract the > > > signing key made me rethink the whole approach. I was wondering if the > > > signing key actually needs to be in the scheduler/worker process at > > > all? > > > > > > Right now the scheduler loads the signing key at startup and generates > > > the tokens in its scheduling loop. 
If the API server were the sole > > > token issuer, the scheduler would dispatch tasks with just the task > > > identity, no token, no signing key. The worker would request a scoped > > > execution token from the API server before calling `start()`. Nothing > > > for a forked task to extract. Some of the foundation for this exists > > > in the scope-based token work I'm doing in PR #60108. This would fully > > > cover distributed deployments (Celery, Kubernetes) where the task has > > > no path to the API server's memory. > > > > > > For co-located deployments (LocalExecutor), a task running as the same > > > Unix user could still reach the API server's memory via `/proc`. But > > > if the API server registers every token's JTI at issuance and rejects > > > unregistered JTIs, a forged token gets rejected even with a valid > > > signature, because its JTI was never issued. The infrastructure for > > > this seems straightforward: a table similar to `revoked_token` from > > > #61339, with the same JTI lookup pattern but inverted logic. > > > > > > I think the combination would cover all deployment topologies. > > > Please correct me if I am wrong. OS-level hardening could still be > > > recommended, but I think it wouldn't be required. > > > > > > I might be missing something obvious. I would love to hear if there's > > > a flaw in this reasoning, or if the original authors had a different > > > approach in mind. > > > > > > Anish > > > > > > On Fri, Feb 20, 2026 at 6:58 PM Jarek Potiuk <[email protected]> wrote: > > >> > > >>> You mentioned having some ideas on the cryptographically strong > > >> provenance side and I would really like to hear them. 
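The "register at issuance, reject unknown JTIs" idea above could be sketched like this. This is an in-memory stand-in for the proposed DB table; the class and method names are hypothetical, not existing Airflow APIs:

```python
import secrets
import time


class JTIRegistry:
    """Sketch of an issued-token allowlist (inverse of a revocation list).

    The real thing would be a DB table similar to `revoked_token`, with the
    same JTI lookup pattern but inverted logic: a token is valid only if
    its JTI was recorded by the API server when the token was issued, so a
    forged token fails even with a valid signature.
    """

    def __init__(self) -> None:
        # jti -> monotonic expiry timestamp
        self._issued: dict[str, float] = {}

    def issue(self, ttl_seconds: float = 600.0) -> str:
        """Generate a JTI and record it as issued."""
        jti = secrets.token_urlsafe(16)
        self._issued[jti] = time.monotonic() + ttl_seconds
        return jti

    def is_valid(self, jti: str) -> bool:
        """A JTI is valid only if we issued it and it has not expired."""
        expiry = self._issued.get(jti)
        return expiry is not None and time.monotonic() < expiry
```

A lookup on every Execution API call is the cost of this design, which is why the discussion keeps it alongside, rather than instead of, OS-level hardening.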
> > >> > > >> I would first like to hear the original thinking - from Ash, Kaxil, > > Amogh > > >> - I do not want to introduce too much complexity, because maybe I > > am > > >> indeed overcomplicating it; maybe there are some ways that were > > discussed > > >> before to protect the secret key and JWT signing process. > > >> > > >> So far my ideas are pretty complicated and generally involve instructing > > >> users to do **a number of extra** things in their deployment process that > > are > > >> far beyond installing the app and beyond the "python" realm, but I might be > > >> missing something obvious. > > >> > > >> J. > > >> > > >> > > >> On Sat, Feb 21, 2026 at 1:52 AM Jarek Potiuk <[email protected]> wrote: > > >> > > >>> It does not change much (and is not good for performance). The "spawn" > > >>> approach suffers from the same problem of "having access to the configuration that the > > >>> supervisor has". If the supervisor can read all configuration needed > > to get the > > >>> JWT_secret, then the process spawned from it can just repeat the same > > steps > > >>> that the supervisor process did to obtain the JWT_secret, and create a > > >>> JWT_token with any claims. Also, such spawned processes can dump memory > > of > > >>> the parent process via `/proc/<pid>/mem` if they are run with the same > > >>> user. Or use `gcore PID` to dump memory of the process to a file. > > >>> > > >>> This is controlled by the "ptrace" permission, which is generally enabled on > > all > > >>> Linux systems by default (in order to enable debugging - for example > > gdb > > >>> attaching to a running process or dumping core with gcore). You can > > disable > > >>> this permission via SELinux or the Yama Linux Security Module. And even that > > >>> does not restrict the capability of a spawned process to just read the > > same > > >>> configuration files or environment variables that the main process had > > and > > >>> re-create the JWT token with any claims. 
> > >>> > > >>> It's just how unix user process separation works - any process of a > > UNIX > > >>> user by default can do **anything** with any other processes of the > > same > > >>> UNIX user. > > >>> > > >>> J. > > >>> > > >>> > > >>> On Sat, Feb 21, 2026 at 1:27 AM Anish Giri <[email protected]> > > >>> wrote: > > >>> > > >>>> Hi Jarek, Vikram, > > >>>> > > >>>> Thanks for this, and I am really very glad that I posted it before > > >>>> writing any code. > > >>>> > > >>>> I spent some time going through your point about the fork model and > > >>>> the signing key. That's something I hadn't considered at all. I went > > >>>> and looked now at how the key flows through the code, and you're right > > >>>> that with fork the scheduler's heap gets inherited via copy-on-write, > > >>>> so the key material ends up in the worker's address space even though > > >>>> it is never explicitly passed. The task code runs in a second fork > > >>>> inside the supervisor, so it inherits the same memory. So the identity > > >>>> model isn't secure in the fork model, no matter what we build on top > > of > > >>>> it. > > >>>> > > >>>> There is one thing I was wondering, and please correct me if I am > > >>>> wrong: would switching from **fork** to **spawn** for its worker > > >>>> processes help here? Spawned workers start with a clean interpreter, > > so > > >>>> the signing key never enters their address space. And since the > > >>>> supervisor's fork inherits from the worker (which never had the key), > > >>>> the task would not have it either. > > >>>> > > >>>> Not sure if I'm oversimplifying it though. You mentioned having some > > >>>> ideas on the cryptographically strong provenance side and I would > > >>>> really like to hear them. > > >>>> > > >>>> Anish > > >>>> > > >>>> On Fri, Feb 20, 2026 at 3:32 PM Vikram Koka via dev > > >>>> <[email protected]> wrote: > > >>>>> > > >>>>> +1 to Jarek's comments and questions here. 
> > >>>>> > > >>>>> I am concerned that these proposed changes at the PR level could > > create > > >>>> an > > >>>>> illusion of security, potentially leading to many "security bugs" > > >>>> reported > > >>>>> by users who may have a very different expectation. > > >>>>> > > >>>>> We need to clearly articulate a set of security expectations here > > before > > >>>>> addressing this in a set of PRs. > > >>>>> > > >>>>> Vikram > > >>>>> > > >>>>> On Fri, Feb 20, 2026 at 1:23 PM Jarek Potiuk <[email protected]> > > wrote: > > >>>>> > > >>>>>> I think there is one more thing that I mentioned some time > > >>>> ago, and > > >>>>>> it's time to put it in more concrete words. > > >>>>>> > > >>>>>> Currently there is **no** protection against the tasks making claims > > >>>> that > > >>>>>> they belong to other tasks. While the running task by default > > >>>> receives the > > >>>>>> generated token to use from the supervisor - there is absolutely no > > >>>> problem > > >>>>>> for the forked task to inspect parent process memory to get the > > >>>>>> "supervisor" token that is used to sign the "task" token and > > generate > > >>>> a new > > >>>>>> token with **any** dag_id or task_id or basically any other claim. > > >>>>>> > > >>>>>> This is by design currently, because we do not have any control > > >>>> implemented, > > >>>>>> and part of the security model of Airflow 3.0 - 3.1 is that any task > > >>>> can > > >>>>>> perform **any** action on the task SDK and we never even attempt to > > verify > > >>>>>> which task or dag state it can modify, which connections or > > variables > > >>>> it > > >>>>>> accesses. We only need to know that this "task" was authorised by > > the > > >>>>>> scheduler to call the "task-sdk" API. > > >>>>>> > > >>>>>> With multi-team, this assumption is broken. We **need** to know and > > >>>>>> **enforce** task_id provenance. 
The situation when one task pretends > > >>>> to be > > >>>>>> another task is not acceptable any more - and violates basic > > isolation > > >>>>>> between the teams. > > >>>>>> > > >>>>>> As I understand it, the way the current supervisor -> task JWT token > > >>>> generation > > >>>>>> works is (and please correct me if I am wrong): > > >>>>>> > > >>>>>> * when the supervisor starts, it reads the configuration ("jwt_secret" / > > >>>>>> "jwt_private_key_path" / "jwt_kid") > > >>>>>> * when it starts a task, it uses this "secret" to generate a > > >>>> JWT_token for > > >>>>>> that task (with "dag_id", "dag_run_id", "task_instance_id" claims) - > > >>>> and it > > >>>>>> is used by the supervisor to communicate with the api_server > > >>>>>> * the forked task does not have a direct reference to that token nor to > > the > > >>>>>> jwt_secret when started - it does not get it passed > > >>>>>> * the executing task process is only supposed to communicate with the > > >>>>>> supervisor via in-process communication; it does not open a connection > > >>>> nor > > >>>>>> use the JWT_token directly > > >>>>>> > > >>>>>> Now ... the interesting thing is that while the forked process has no > > >>>>>> "easy" API to get the token and use it directly, > > or > > >>>> to > > >>>>>> generate a NEW token, no matter how hard we try, the forked > > >>>> task > > >>>>>> will **always** be able to access the "jwt_secret" and create its own > > >>>> JWT_token > > >>>>>> - and add **ANY** claims to that token. That's simply a consequence > > of > > >>>>>> using our fork model; an additional thing is that the (default) > > >>>> approach of > > >>>>>> using the same unix user in the forked process enables the forked > > >>>> process > > >>>>>> to read **any** of the information that the supervisor process accesses > > >>>>>> (including configuration files, env variables and even memory of the > > >>>>>> supervisor process). 
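A tiny, runnable demonstration of the fork consequence described above (any Unix; the secret value and function name here are stand-ins for illustration, not Airflow's actual jwt_secret handling):

```python
import os

# Stands in for a secret the supervisor loaded before forking the task.
SECRET_IN_PARENT_MEMORY = "jwt-secret-loaded-before-fork"


def forked_child_can_see_secret() -> str:
    """Fork a child and have it report a value from the parent's heap.

    The child gets a copy-on-write view of everything the parent had in
    memory at fork time, so a secret loaded before the fork is readable
    by the child even though it was never explicitly passed to it.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: the "task" reads the secret from inherited memory
        os.close(read_fd)
        os.write(write_fd, SECRET_IN_PARENT_MEMORY.encode())
        os._exit(0)
    # parent: collect whatever the child reported
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 1024)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks).decode()
```

The same applies transitively: the task fork inside the supervisor inherits whatever the supervisor inherited or loaded before that second fork.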
> > >>>>>> > > >>>>>> There are two ways the running task can get the JWT_SECRET: > > >>>>>> > > >>>>>> * since the task process is forked from the supervisor - everything > > >>>> that the > > >>>>>> parent process has in memory is available to it - even if the method executed in the > > >>>> fork has > > >>>>>> no direct reference to it. The forked process can use "globals" and > > >>>> get to > > >>>>>> any variable, function, class, or method that the parent supervisor > > >>>> process > > >>>>>> has. It can read any data in the memory of the process. So if the > > >>>> JWT_Secret is > > >>>>>> already in memory of the parent process when the task process is > > >>>> forked, it > > >>>>>> is also in memory of the task process > > >>>>>> > > >>>>>> * since the task process is the same unix user as the parent process > > >>>> - it > > >>>>>> has access to all the same configuration and environment data. Even if > > >>>> the > > >>>>>> parent process clears os.environ - the child process can read the > > >>>> original > > >>>>>> environment variables the parent process was started with using the > > >>>>>> `/proc` filesystem (it just needs to know the parent process id - > > >>>> which it > > >>>>>> always has). Unless more sophisticated mechanisms are used - such as SELinux > > >>>>>> (requires a kernel with SELinux and configured system-level SELinux > > >>>> rules), > > >>>>>> user impersonation, or cgroups/proper access control to files > > >>>> (requires > > >>>>>> sudo access for the parent process) - such a forked process can do > > >>>>>> **everything** the parent process can do - including reading the > > >>>>>> configuration of the JWT_secret and creating JWT_tokens with (again) any > > >>>>>> task_instance_id, any dag_id, and dag_run_id claim. 
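To make the second point concrete, here is a small, Linux-only illustration of how any same-user process can recover the environment another process was *started* with, regardless of later os.environ changes (illustrative sketch; the function name is made up):

```python
def read_initial_environ(pid: int) -> dict[str, str]:
    """Parse /proc/<pid>/environ (NUL-separated KEY=VALUE entries).

    This file reflects the environment at process start, so a parent
    clearing os.environ after startup does not hide values from a
    same-user child - and a child always knows its parent's pid via
    os.getppid(). Requires Linux and same-UID (or root) access; a
    non-dumpable target (PR_SET_DUMPABLE=0) makes the file root-owned
    and unreadable to the child, which is the point of that hardening.
    """
    with open(f"/proc/{pid}/environ", "rb") as f:
        raw = f.read()
    result: dict[str, str] = {}
    for entry in raw.split(b"\x00"):
        if entry:
            key, _, value = entry.partition(b"=")
            result[key.decode(errors="replace")] = value.decode(errors="replace")
    return result
```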
> > >>>>>> So no matter what we do on the "server" side - the client side > > >>>> (supervisor) > > >>>>>> - in the default configuration already allows the task to pretend > > >>>> it is > > >>>>>> "whatever dag_id" - in which case server-side verification is > > >>>> pointless. > > >>>>>> > > >>>>>> I believe (Ashb? Kaxil? Amogh?) that was a deliberate decision when > > >>>> the API > > >>>>>> was designed and the Task SDK / JWT token for Airflow 3.0 was > > >>>> implemented > > >>>>>> (because we did not need it). > > >>>>>> > > >>>>>> I would love to hear if my thinking is wrong, but I highly doubt it, > > >>>> so I > > >>>>>> wonder what the original thoughts here were on how the task identity > > >>>> can > > >>>>>> have "cryptographically strong" provenance? I have some ideas for > > >>>> that, > > >>>>>> but I would love to hear what the original authors' thoughts are. > > >>>>>> > > >>>>>> J. > > >>>>>> > > >>>>>> On Fri, Feb 20, 2026 at 8:49 PM Anish Giri < > > [email protected]> > > >>>>>> wrote: > > >>>>>> > > >>>>>>> Thanks, Vincent! I appreciate your review. I'll get started on the > > >>>>>>> implementation and tag you on the PRs. > > >>>>>>> > > >>>>>>> Anish > > >>>>>>> > > >>>>>>> On Fri, Feb 20, 2026 at 8:23 AM Vincent Beck <[email protected]> > > >>>>>> wrote: > > >>>>>>>> > > >>>>>>>> Hey Anish, > > >>>>>>>> > > >>>>>>>> Everything you said makes sense to me. I might have questions on > > >>>>>>> specifics, but I'd rather keep them for the PRs; that'll make everything > > >>>> way > > >>>>>>> easier. > > >>>>>>>> > > >>>>>>>> Feel free to ping me on all your PRs, > > >>>>>>>> Vincent > > >>>>>>>> > > >>>>>>>> On 2026/02/20 07:34:47 Anish Giri wrote: > > >>>>>>>>> Hello everyone, > > >>>>>>>>> > > >>>>>>>>> Jarek asked for a proposal on #60125 [1] before implementing > > >>>> access > > >>>>>>>>> control for the Execution API's resource endpoints (variables, > > >>>>>>>>> connections, XComs), so here it is. 
> > >>>>>>>>> > > >>>>>>>>> After going through the codebase, I think this is really about > > >>>>>>>>> completing AIP-67's [2] multi-team boundary enforcement rather > > >>>> than > > >>>>>>>>> introducing a new security model. Most of the infrastructure > > >>>> already > > >>>>>>>>> exists. What's missing are the actual authorization checks. > > >>>>>>>>> > > >>>>>>>>> The current state: > > >>>>>>>>> > > >>>>>>>>> The Execution API has three authorization stubs that always > > >>>> return > > >>>>>>> True: > > >>>>>>>>> > > >>>>>>>>> - has_variable_access() in execution_api/routes/variables.py > > >>>>>>>>> - has_connection_access() in execution_api/routes/connections.py > > >>>>>>>>> - has_xcom_access() in execution_api/routes/xcoms.py > > >>>>>>>>> > > >>>>>>>>> All three have a "# TODO: Placeholder for actual implementation" > > >>>>>>> comment. > > >>>>>>>>> > > >>>>>>>>> For variables and connections, vincbeck's data-layer team > > >>>> scoping > > >>>>>>>>> (#58905 [4], #59476 [5]) already prevents cross-team data > > >>>> retrieval > > >>>>>> in > > >>>>>>>>> practice. A cross-team request returns a 404 rather than the > > >>>>>> resource. > > >>>>>>>>> So the data isolation is there, but the auth stubs don't reject > > >>>> these > > >>>>>>>>> requests early with a proper 403, and there's no second layer of > > >>>>>>>>> protection at the auth check itself. > > >>>>>>>>> > > >>>>>>>>> For XComs, the situation is different. There is no isolation at > > >>>> any > > >>>>>>>>> layer. XCom routes take dag_id, run_id, and task_id directly > > >>>> from URL > > >>>>>>>>> path parameters with no validation against the calling task's > > >>>>>>>>> identity. A task in Team-A's bundle can currently read and write > > >>>>>>>>> Team-B's XComs. 
> > >>>>>>>>> > > >>>>>>>>> There's already a get_team_name_dep() function in deps.py that > > >>>>>>>>> resolves a task's team via TaskInstance -> DagModel -> > > >>>> DagBundleModel > > >>>>>>>>> -> Team in a single join query. The variable and connection > > >>>> endpoints > > >>>>>>>>> already use it. XCom routes don't use it at all. > > >>>>>>>>> > > >>>>>>>>> Proposed approach: > > >>>>>>>>> > > >>>>>>>>> I'm thinking of this in two parts: > > >>>>>>>>> > > >>>>>>>>> 1) Team boundary checks for variables and connections > > >>>>>>>>> > > >>>>>>>>> Fill the auth stubs with team boundary checks. For reference, > > >>>> the > > >>>>>> Core > > >>>>>>>>> API does this in security.py. requires_access_variable() > > >>>> resolves the > > >>>>>>>>> resource's team via Variable.get_team_name(key), wraps it in > > >>>>>>>>> VariableDetails, and passes it to > > >>>>>>>>> auth_manager.is_authorized_variable(method, details, user). The > > >>>> auth > > >>>>>>>>> manager then checks team membership. > > >>>>>>>>> > > >>>>>>>>> For the Execution API, the flow would be similar but without > > >>>> going > > >>>>>>>>> through the auth manager (I'll explain why below): > > >>>>>>>>> > > >>>>>>>>> variable_key -> Variable.get_team_name(key) -> resource_team > > >>>>>>>>> token.id -> get_team_name_dep() -> task_team > > >>>>>>>>> Deny if resource_team != task_team (when both are non-None) > > >>>>>>>>> > > >>>>>>>>> When core.multi_team is disabled, get_team_name_dep returns > > >>>> None and > > >>>>>>>>> the check is skipped, so current single-team behavior stays > > >>>> exactly > > >>>>>>>>> the same. > > >>>>>>>>> > > >>>>>>>>> 2) XCom authorization > > >>>>>>>>> > > >>>>>>>>> This is the harder part. For writes, I think we should verify > > >>>> the > > >>>>>>>>> calling task is writing its own XComs -- the task identity from > > >>>> the > > >>>>>>>>> JWT should match the dag_id/task_id in the URL path. 
For reads, > > >>>>>>>>> enforce team boundary so a task can only read XComs from tasks > > >>>> within > > >>>>>>>>> the same team. This would allow cross-DAG xcom_pull within a > > >>>> team > > >>>>>>>>> (which people already do) while preventing cross-team access. > > >>>>>>>>> > > >>>>>>>>> To avoid a DB lookup on every request, I'd propose adding > > >>>> dag_id to > > >>>>>>>>> the JWT claims at generation time. The dag_id is already on the > > >>>>>>>>> TaskInstance schema in ExecuteTask.make() (workloads.py:142). > > >>>> The > > >>>>>>>>> JWTReissueMiddleware already preserves all claims during token > > >>>>>>>>> refresh, so this wouldn't break anything. Adding task_id and > > >>>> run_id > > >>>>>> to > > >>>>>>>>> the token could be done as a follow-up -- there's a TODO at > > >>>>>>>>> xcoms.py:315 about eventually deriving these from the token > > >>>> instead > > >>>>>> of > > >>>>>>>>> the URL. > > >>>>>>>>> > > >>>>>>>>> I'm not proposing to add team_name to the token. It's not > > >>>> available > > >>>>>> on > > >>>>>>>>> the TaskInstance schema at generation time. Resolving it > > >>>> requires a > > >>>>>> DB > > >>>>>>>>> join through DagModel -> DagBundleModel -> Team, which would > > >>>> slow > > >>>>>> down > > >>>>>>>>> the scheduler's task queuing path. Better to resolve it at > > >>>> request > > >>>>>>>>> time via get_team_name_dep. > > >>>>>>>>> > > >>>>>>>>> Why not go through BaseAuthManager? > > >>>>>>>>> > > >>>>>>>>> One design question I want to raise: the Execution API auth > > >>>> stubs > > >>>>>>>>> currently don't call BaseAuthManager.is_authorized_*(), and I > > >>>> think > > >>>>>>>>> they probably shouldn't. The BaseAuthManager interface is > > >>>> designed > > >>>>>>>>> around human identity (BaseUser with roles and team > > >>>> memberships), but > > >>>>>>>>> the Execution API operates on task identity (TIToken with a > > >>>> UUID). > > >>>>>>>>> These are very different things. 
A task doesn't have a "role" > > >>>> in the > > >>>>>>>>> RBAC sense, it has a team derived from its DAG's bundle. > > >>>>>>>>> > > >>>>>>>>> I'm leaning toward keeping the authorization logic directly in > > >>>> the > > >>>>>>>>> has_*_access dependency functions, using get_team_name_dep for > > >>>> team > > >>>>>>>>> resolution. This keeps the Execution API auth simple and avoids > > >>>> tying > > >>>>>>>>> task authorization to the human auth manager. But I'd like to > > >>>> hear if > > >>>>>>>>> others think we should instead extend BaseAuthManager with > > >>>>>>>>> task-identity-aware methods. > > >>>>>>>>> > > >>>>>>>>> What about single-team deployments? > > >>>>>>>>> > > >>>>>>>>> When core.multi_team=False (the default for most deployments), > > >>>> the > > >>>>>>>>> team boundary checks would be skipped entirely for variables and > > >>>>>>>>> connections. For XComs, I think write ownership verification > > >>>> (task > > >>>>>> can > > >>>>>>>>> only write its own XComs) is worth keeping regardless of > > >>>> multi-team > > >>>>>>>>> mode -- it's more of a correctness concern than an > > >>>> authorization one. > > >>>>>>>>> But I can also see the argument for a complete no-op when > > >>>> multi_team > > >>>>>>>>> is off to keep things simple. > > >>>>>>>>> > > >>>>>>>>> Out of scope: > > >>>>>>>>> > > >>>>>>>>> AIP-72 [3] mentions three possible authorization models: > > >>>>>>>>> pre-declaration (DAGs declare required resources), runtime > > >>>> request > > >>>>>>>>> with deployment-level policy, and OPA integration via WASM > > >>>> bindings. > > >>>>>>>>> I'm not trying to address any of those here. The team-boundary > > >>>>>>>>> enforcement is the base that all three future models need. > > >>>>>>>>> > > >>>>>>>>> Implementation plan: > > >>>>>>>>> > > >>>>>>>>> 1. Add dag_id claim to JWT token generation in workloads.py > > >>>>>>>>> 2. Implement has_variable_access team boundary check > > >>>>>>>>> 3. 
Implement has_connection_access team boundary check > > >>>>>>>>> 4. Implement has_xcom_access with write ownership + team > >>>> boundary > > >>>>>>>>> 5. Add XCom team resolution (XCom routes currently have no > > >>>>>>>>> get_team_name_dep usage) > > >>>>>>>>> 6. Tests for all authorization scenarios including cross-team > >>>> denial > > >>>>>>>>> 7. Documentation update for multi-team authorization behavior > > >>>>>>>>> > > >>>>>>>>> This should be a fairly small change -- mostly filling in the > > >>>>>> existing > > >>>>>>>>> stubs with actual checks. > > >>>>>>>>> > > >>>>>>>>> Let me know what you think. > > >>>>>>>>> > > >>>>>>>>> Anish > > >>>>>>>>> > > >>>>>>>>> [1] > > >>>>>>> > > >>>> > > https://github.com/apache/airflow/issues/60125#issuecomment-3712218766 > > >>>>>>>>> [2] > > >>>>>>> > > >>>>>> > > >>>> > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components > > >>>>>>>>> [3] > > >>>>>>> > > >>>>>> > > >>>> > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-72+Task+Execution+Interface+aka+Task+SDK > > >>>>>>>>> [4] https://github.com/apache/airflow/pull/58905 > > >>>>>>>>> [5] https://github.com/apache/airflow/pull/59476 > > >>>>>>>>> > > >>>>>>>>> > > >>>> --------------------------------------------------------------------- > > >>>>>>>>> To unsubscribe, e-mail: [email protected] > > >>>>>>>>> For additional commands, e-mail: [email protected]
