kaxil commented on code in PR #67878:
URL: https://github.com/apache/airflow/pull/67878#discussion_r3352417425
##########
airflow-core/src/airflow/api_fastapi/auth/dag_processor_token.py:
##########
@@ -0,0 +1,103 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+Mint and provision the bearer token the DAG processor presents to the API server (AIP-92).
+
+The DAG processor parses (and forks) user code, so it must never hold the deployment signing key
+or mint its own token. A *trusted* component runs the helpers here -- the deployment's provisioning
+step (a Helm init container, a docker-compose init service) or ``airflow standalone`` -- mints the
+token and writes it to ``[dag_processor] api_token_path``. The processor only ever reads that file
+(re-reading it as it is rotated), so it carries a token without being able to forge one.
+"""
+
+from __future__ import annotations
+
+import logging
+import os
+from pathlib import Path
+
+from airflow.api_fastapi.auth.tokens import JWTGenerator, get_signing_args
+from airflow.configuration import conf
+
+log = logging.getLogger(__name__)
+
+# The Execution API is task-instance scoped: its ``sub`` is validated as a UUID. The DAG processor
+# is not a task instance, so its token carries an all-zero sentinel UUID rather than a real id.
+DAG_PROCESSOR_TOKEN_SUBJECT = "00000000-0000-0000-0000-000000000000"

Review Comment:
Agreed, it's a hack. The `sub` is a UUID only because the processor reuses the Execution API for parse-time `Connection`/`Variable` reads. Those routes go through `CurrentTIToken` -> `TIToken(id: UUID)` (the `id` also feeds team scoping), so a non-UUID `sub` is rejected before the route runs. The all-zero value is a stand-in for a non-TI principal, and the single token currently carries both the execution and dag-processing audiences. The DAG Processing validator accepts any `sub`, so the UUID is only needed on the Execution side.

Two ways to give dag processing a real `sub`:

1. Two tokens: a dag-processing token with `sub=dag-processor` and a separate execution token that keeps the UUID (the Execution API genuinely is TI-scoped).
2. Generalise the Execution principal so the read-only conn/var routes accept a non-TI `sub`.

I lean towards (2) if you're open to it (one token, no sentinel anywhere); otherwise I'll do (1). Which would you prefer?
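
For illustration, a minimal sketch of why the sentinel has to be UUID-shaped. It assumes the Execution-side check amounts to parsing `sub` as a UUID; `parse_execution_subject` and `is_valid_execution_subject` are hypothetical names used only here, not Airflow APIs, and the real check lives in `TIToken`.

```python
from __future__ import annotations

from uuid import UUID

# Sentinel from the diff: an all-zero UUID standing in for a non-TI principal.
DAG_PROCESSOR_TOKEN_SUBJECT = "00000000-0000-0000-0000-000000000000"


def parse_execution_subject(sub: str) -> UUID:
    """Hypothetical stand-in for the UUID validation applied to ``sub``."""
    try:
        return UUID(sub)
    except ValueError as exc:
        # Mirrors the behaviour described in the comment: a non-UUID ``sub``
        # is rejected before any route logic would run.
        raise ValueError(f"sub {sub!r} is not a valid UUID") from exc


def is_valid_execution_subject(sub: str) -> bool:
    """Return True if ``sub`` would pass the assumed UUID check."""
    try:
        parse_execution_subject(sub)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in (DAG_PROCESSOR_TOKEN_SUBJECT, "dag-processor"):
        print(candidate, is_valid_execution_subject(candidate))
```

Under that assumption the sentinel passes and `dag-processor` does not, which is why option 1 needs a second token for the Execution side while option 2 relaxes the principal type instead.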
