GayathriSrividya opened a new pull request, #67944:
URL: https://github.com/apache/airflow/pull/67944

   ## Problem
   
   Long-running tasks fail with repeated 403 errors when their JWT token 
expires while a heartbeat request is in-flight. The race condition is:
   
   1. Security middleware validates the token — it is still valid (or within 
its leeway).
   2. Request processing starts.
   3. The token's `exp` boundary is crossed during processing.
   4. `JWTReissueMiddleware.dispatch` calls `avalidated_claims(token, {})` — 
this now raises `ExpiredSignatureError`.
   5. The exception is caught by the outer `except Exception` block, logged as 
a warning, and **no `Refreshed-API-Token` header is set**.
   6. The client receives a 403 with no refreshed token, so it cannot update 
its `Bearer` token.
   7. After `MAX_FAILED_HEARTBEATS` consecutive failures the supervisor kills 
the task.
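
The timing in steps 1 to 4 can be illustrated with a small sketch. It assumes the common JWT expiry rule (reject once `exp <= now - leeway`); the timestamps, the `LEEWAY` value, and the `is_expired` helper are hypothetical and only stand in for the real validator.

```python
def is_expired(exp: float, now: float, leeway: float = 0.0) -> bool:
    """Return True when a token with expiry `exp` is rejected at time `now`."""
    # Assumed rule, modeled on typical JWT validation: reject once exp <= now - leeway.
    return exp <= now - leeway


EXP = 1_000.0  # hypothetical expiry timestamp, in seconds
LEEWAY = 5.0   # hypothetical validator leeway, standing in for self.leeway

# Step 1: the security middleware checks the token just inside the leeway.
rejected_by_security = is_expired(EXP, now=1_004.0, leeway=LEEWAY)

# Step 4: the reissue middleware re-validates two seconds later,
# after the expiry boundary (including leeway) has been crossed.
rejected_by_reissue = is_expired(EXP, now=1_006.0, leeway=LEEWAY)

print(rejected_by_security, rejected_by_reissue)
```

The same token passes the first check and fails the second, even though both happen within a single request.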
   
   ## Fix
   
   When `avalidated_claims` raises `ExpiredSignatureError` inside 
`JWTReissueMiddleware`, retry the validation with a 60-second grace leeway 
(`REISSUE_GRACE_LEEWAY`). If the token is within that grace window its claims 
are extracted and a fresh replacement token is issued and returned in the 
`Refreshed-API-Token` response header.
   
   The signature and all other claims are still fully verified; only the expiry 
window is relaxed in this specific code path. The new token is generated from 
the same claims (same `sub`, `scope`, `ti_id`), so there is no privilege 
escalation.
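
A minimal, standard-library-only sketch of the retry described above. It is not the code from this PR: the toy HMAC token format, `SECRET`, `TOKEN_LIFETIME`, `issue`, and `reissue_if_possible` are hypothetical, and the local `ExpiredSignatureError` only stands in for the PyJWT exception. Unlike the real middleware, the sketch reissues on every call rather than only near expiry.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"demo-secret"       # hypothetical signing key
LEEWAY = 0.0                  # stands in for the validator's self.leeway
REISSUE_GRACE_LEEWAY = 60.0   # grace window named in this PR
TOKEN_LIFETIME = 600.0        # hypothetical lifetime of an issued token


class ExpiredSignatureError(Exception):
    """Stand-in for the PyJWT exception of the same name."""


def _sign(body: bytes) -> str:
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()


def issue(claims: dict, now: float) -> str:
    """Create a toy signed token: base64(payload) + '.' + HMAC signature."""
    body = json.dumps({**claims, "exp": now + TOKEN_LIFETIME}, sort_keys=True).encode()
    return base64.urlsafe_b64encode(body).decode() + "." + _sign(body)


def validated_claims(token: str, now: float, extra_leeway: float = 0.0) -> dict:
    """Always verify the signature; relax only the expiry check by extra_leeway."""
    encoded, signature = token.rsplit(".", 1)
    body = base64.urlsafe_b64decode(encoded)
    if not hmac.compare_digest(signature, _sign(body)):
        raise ValueError("bad signature")
    claims = json.loads(body)
    if claims["exp"] <= now - (LEEWAY + extra_leeway):
        raise ExpiredSignatureError("token expired")
    return claims


def reissue_if_possible(token: str, now: float) -> Optional[str]:
    """Return a fresh token, retrying validation once with the grace leeway."""
    try:
        claims = validated_claims(token, now)
    except ExpiredSignatureError:
        try:
            claims = validated_claims(token, now, extra_leeway=REISSUE_GRACE_LEEWAY)
        except ExpiredSignatureError:
            return None  # outside the grace window: no replacement token
    # Carry over the same claims; only the expiry is renewed.
    fresh = {k: v for k, v in claims.items() if k != "exp"}
    return issue(fresh, now)
```

A token issued at `now=0.0` expires at 600 seconds in this sketch, so `reissue_if_possible(token, now=630.0)` returns a replacement while `reissue_if_possible(token, now=700.0)` returns `None`. A tampered token still fails the signature check regardless of the grace window.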
   
   The client's `_update_auth` hook already updates the `Bearer` token from 
`Refreshed-API-Token` before raising on 4xx/5xx, so the next retry uses the 
fresh token and succeeds.
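
The ordering that matters on the client side can be sketched as follows. This is an illustration only, not the actual client: `Response`, `HTTPStatusError`, `Client`, and `handle` are hypothetical names, and only `_update_auth` and the `Refreshed-API-Token` header come from the description above.

```python
from dataclasses import dataclass, field


class HTTPStatusError(Exception):
    """Hypothetical error raised for 4xx/5xx responses."""


@dataclass
class Response:
    status_code: int
    headers: dict = field(default_factory=dict)


class Client:
    def __init__(self, token: str) -> None:
        self.token = token

    def auth_header(self) -> dict:
        return {"Authorization": f"Bearer {self.token}"}

    def _update_auth(self, response: Response) -> None:
        # Pick up a replacement token if the server sent one.
        refreshed = response.headers.get("Refreshed-API-Token")
        if refreshed:
            self.token = refreshed

    def handle(self, response: Response) -> Response:
        # Update the token first, so even a failed response refreshes it.
        self._update_auth(response)
        if response.status_code >= 400:
            raise HTTPStatusError(f"HTTP {response.status_code}")
        return response


client = Client("old-token")
try:
    client.handle(Response(403, {"Refreshed-API-Token": "new-token"}))
except HTTPStatusError:
    pass  # the request failed, but the next retry will use the new token

print(client.auth_header())
```

Because the token is updated before the error is raised, a 403 that carries the header still leaves the client holding the fresh token for its next attempt.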
   
   ## Changes
   
   - `airflow-core/src/airflow/api_fastapi/auth/tokens.py`: add an `extra_leeway: 
float = 0` keyword argument to `validated_claims` and `avalidated_claims`; its 
value is added to `self.leeway` in the leeway passed to `jwt.decode`.
   - `airflow-core/src/airflow/api_fastapi/execution_api/app.py`: add a 
`REISSUE_GRACE_LEEWAY = 60` class constant; catch `ExpiredSignatureError` on 
the inner `avalidated_claims` call and retry with the grace leeway.
   - `airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_router.py`: 
add regression test `test_just_expired_token_is_reissued_within_grace_period`.
   
   closes: #67939

