andrewhharmon opened a new issue, #61736:
URL: https://github.com/apache/airflow/issues/61736

   ### Apache Airflow Provider(s)
   
   amazon
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==9.18.0
   
   ### Apache Airflow version
   
   3.x.x
   
   ### Operating System
   
   Debian/Ubuntu-based containers (Astronomer Runtime, official Airflow images)
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   Any deployment where the triggerer runs on a different host than the worker 
(Astronomer, MWAA, distributed Airflow with separate triggerer pods).
   
   ### What happened
   
   `EksPodOperator` with `deferrable=True` fails with 401 Unauthorized when the 
triggerer runs on a separate host from the worker (e.g., Astronomer, MWAA).
   
   The root cause is that the credential temp file created during `execute()` 
is not available on the triggerer when it tries to poll the pod.
   
   **The credential lifecycle during deferral:**
   
   1. `EksPodOperator.execute()` calls `eks_hook.get_session()` to extract AWS 
credentials
   2. `_secure_credential_context()` writes them to a temp file on the 
**worker** (e.g., `/tmp/tmpXYZ`)
   3. `generate_config_file()` creates a kubeconfig with an exec block that 
references that temp file:
      ```yaml
      users:
        - name: aws
          user:
            exec:
              command: sh
              args: ["-c", ". /tmp/tmpXYZ; python -m airflow...utils.eks_get_token ..."]
      ```
   4. `KubernetesPodOperator.invoke_defer_method()` calls 
`convert_config_file_to_dict()`, which reads the kubeconfig into a dict — 
**including the exec block with the temp file path**
   5. The task defers, the context managers exit, and **both temp files are 
deleted** (the credential file and the kubeconfig file)
   6. The trigger is serialized to the metadata DB with `config_dict` 
containing the now-stale temp file path
   7. The **triggerer** (on a different host) deserializes the trigger and 
calls `load_kube_config_from_dict(config_dict)`
   8. `kubernetes_asyncio` processes the exec block and runs
`sh -c ". /tmp/tmpXYZ; ..."`
   9. `/tmp/tmpXYZ` doesn't exist on the triggerer → credentials not loaded → 
401 Unauthorized
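
The lifecycle above can be reproduced in miniature without Airflow. This is only a sketch under stated assumptions: `credential_file`, `build_config_dict`, and `simulate_deferral` are hypothetical stand-ins, not the provider's actual functions, and the exec command is a placeholder. It shows that a dict built inside the context manager keeps the temp file path after the file itself has been deleted.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def credential_file(content: str):
    """Hypothetical stand-in for the provider's credential context manager."""
    fd, path = tempfile.mkstemp(prefix="creds-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        yield path
    finally:
        # The temp file is removed as soon as the context exits.
        os.remove(path)


def build_config_dict(credential_path: str) -> dict:
    """Build a kubeconfig-like dict whose exec block sources the credential file."""
    return {
        "users": [
            {
                "name": "aws",
                "user": {
                    "exec": {
                        "command": "sh",
                        # Placeholder command; the real one calls the token helper.
                        "args": ["-c", f". {credential_path}; echo token"],
                    }
                },
            }
        ]
    }


def simulate_deferral() -> tuple[dict, bool, bool]:
    """Return the captured dict and whether the file existed before/after exit."""
    with credential_file("export AWS_ACCESS_KEY_ID=placeholder\n") as path:
        config_dict = build_config_dict(path)
        exists_during_execute = os.path.exists(path)
    # At this point the "task has deferred": the dict survives, the file does not.
    exists_after_deferral = os.path.exists(path)
    return config_dict, exists_during_execute, exists_after_deferral


config_dict, during, after = simulate_deferral()
print("file existed during execute:", during)
print("file exists after deferral:", after)
print("exec args still reference:", config_dict["users"][0]["user"]["exec"]["args"][1])
```

Even on a single host the path is already stale once the context exits, so a triggerer on a separate host has no way to read the file.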
   
   **Error output:**
   ```
   kubernetes_asyncio.client.exceptions.ApiException: (401)
   Reason: Unauthorized
   HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}
   ```
   
   **Note:** This is a separate issue from #60269 (POSIX shell compatibility). 
Even with the `. ` fix from that issue, the credential temp file still won't 
exist on the triggerer.
   
   **Note:** The `trigger_reentry()` path works correctly because it generates 
fresh credentials on the worker when the trigger fires. The problem is only 
during the triggerer's polling phase.
   
   ### What you think should happen instead
   
   The exec block in the serialized `config_dict` should not depend on temp 
files that only exist on the worker.
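
One way to make that dependency visible is a small check over the serialized dict. This is a diagnostic sketch, not a proposed fix: `find_sourced_paths` and `missing_sourced_paths` are hypothetical helpers, and the regular expression assumes the exec args source an absolute path with the POSIX `. ` form shown in the kubeconfig above.

```python
import os
import re

# Matches a POSIX "dot" source of an absolute path, either at the start of
# the shell snippet or right after a ';' separator.
_SOURCED_PATH = re.compile(r"(?:^|;)\s*\.\s+(/[^\s;]+)")


def find_sourced_paths(config_dict: dict) -> list[str]:
    """Collect absolute paths that exec blocks in a kubeconfig dict source."""
    paths: list[str] = []
    for user in config_dict.get("users", []):
        exec_block = user.get("user", {}).get("exec") or {}
        for arg in exec_block.get("args") or []:
            paths.extend(_SOURCED_PATH.findall(arg))
    return paths


def missing_sourced_paths(config_dict: dict) -> list[str]:
    """Return the sourced paths that do not exist on the current host."""
    return [p for p in find_sourced_paths(config_dict) if not os.path.exists(p)]


# Example dict mirroring the kubeconfig from this report (command shortened).
example = {
    "users": [
        {
            "name": "aws",
            "user": {
                "exec": {
                    "command": "sh",
                    "args": ["-c", ". /tmp/tmpXYZ; echo token"],
                }
            },
        }
    ]
}
print(find_sourced_paths(example))
```

Running `missing_sourced_paths` on the triggerer against the deserialized `config_dict` would list any temp files the exec block expects but cannot find.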
   
   ### How to reproduce
   
   1. Deploy Airflow with a triggerer running on a separate host from the 
worker (Astronomer, MWAA, or distributed Helm deployment with dedicated 
triggerer pods)
   2. Configure an `EksPodOperator` with `deferrable=True` connecting to an EKS 
cluster
   3. Run the DAG
   4. The pod launches successfully (worker has valid credentials)
   5. The task defers to the triggerer
   6. The triggerer fails to poll the pod with 401 Unauthorized
   
   
   ### Anything else
   
   I used AI tools to help troubleshoot this, and my understanding is still 
limited. I'm hoping some discussion can help determine the correct fix, which 
I'm happy to help implement.
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

