andrewhharmon opened a new issue, #61735:
URL: https://github.com/apache/airflow/issues/61735
### Apache Airflow version
3.1.7
### If "Other Airflow 3 version" selected, which one?
_No response_
### What happened?
`EksPodOperator` with `deferrable=True` fails with 401 Unauthorized when the
triggerer runs on a separate host from the worker (e.g., Astronomer, MWAA).
The root cause is that the credential temp file created during `execute()`
is not available on the triggerer when it tries to poll the pod.
**The credential lifecycle during deferral:**
1. `EksPodOperator.execute()` calls `eks_hook.get_session()` to extract AWS
credentials
2. `_secure_credential_context()` writes them to a temp file on the
**worker** (e.g., `/tmp/tmpXYZ`)
3. `generate_config_file()` creates a kubeconfig with an exec block that
references that temp file:
```yaml
users:
- name: aws
  user:
    exec:
      command: sh
      args: ["-c", ". /tmp/tmpXYZ; python -m airflow...utils.eks_get_token ..."]
```
4. `KubernetesPodOperator.invoke_defer_method()` calls
`convert_config_file_to_dict()`, which reads the kubeconfig into a dict —
**including the exec block with the temp file path**
5. The task defers, the context managers exit, and **both temp files are
deleted** (the credential file and the kubeconfig file)
6. The trigger is serialized to the metadata DB with `config_dict`
containing the now-stale temp file path
7. The **triggerer** (on a different host) deserializes the trigger and
calls `load_kube_config_from_dict(config_dict)`
8. `kubernetes_asyncio` processes the exec block and runs
`sh -c ". /tmp/tmpXYZ; ..."`
9. `/tmp/tmpXYZ` doesn't exist on the triggerer → credentials not loaded →
401 Unauthorized
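
The lifecycle above can be reproduced outside Airflow. The following is a minimal sketch, not the provider's actual code: `credential_file`, `build_exec_args`, and `DEMO_ACCESS_KEY` are hypothetical stand-ins for the credential temp file, the kubeconfig exec block, and the AWS credentials. It only shows that a shell command captured while the temp file exists stops working once the context manager has exited.

```python
import os
import subprocess
import tempfile
from contextlib import contextmanager


@contextmanager
def credential_file(env_vars):
    """Hypothetical stand-in for the credential temp file.

    The file exists only while the `with` block is active.
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "w") as fh:
            for key, value in env_vars.items():
                fh.write(f"export {key}='{value}'\n")
        yield path
    finally:
        os.remove(path)


def build_exec_args(cred_path):
    # Same shape as the exec block: source the file, then run a command
    # that depends on the variables it exports.
    return ["sh", "-c", f'. {cred_path}; echo "$DEMO_ACCESS_KEY"']


def run_exec(args):
    # Use a minimal environment so the variable can only come from the file.
    env = {"PATH": os.environ.get("PATH", os.defpath)}
    return subprocess.run(args, capture_output=True, text=True, env=env)


with credential_file({"DEMO_ACCESS_KEY": "abc123"}) as path:
    exec_args = build_exec_args(path)  # captured while the file exists
    on_worker = run_exec(exec_args)    # file present: variable is loaded

# The context manager has exited, so the file is gone, but the captured
# command still points at it (this mirrors the serialized config_dict).
after_defer = run_exec(exec_args)

print("inside context:", repr(on_worker.stdout.strip()))
print("after exit:    ", repr(after_defer.stdout.strip()))
```

Inside the `with` block the command prints the value; after it exits, the same command prints nothing because the sourced file no longer exists. On a triggerer running on a different host the file would never have existed at all.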
**Error output:**
```
kubernetes_asyncio.client.exceptions.ApiException: (401)
Reason: Unauthorized
HTTP response body:
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",
"message":"Unauthorized","reason":"Unauthorized","code":401}
```
**Note:** This is a separate issue from #60269 (POSIX shell compatibility).
Even with the `. ` fix from that issue, the credential temp file still won't
exist on the triggerer.
**Note:** The `trigger_reentry()` path works correctly because it generates
fresh credentials on the worker when the trigger fires. The problem is only
during the triggerer's polling phase.
### What you think should happen instead?
I'm honestly not sure what the right fix is here. Hoping some discussion will
point to the correct path.
### How to reproduce
1. Deploy Airflow with a triggerer running on a separate host from the
worker (Astronomer, MWAA, or distributed Helm deployment with dedicated
triggerer pods)
2. Configure an `EksPodOperator` with `deferrable=True` connecting to an EKS
cluster
3. Run the DAG
4. The pod launches successfully (worker has valid credentials)
5. The task defers to the triggerer
6. The triggerer fails to poll the pod with 401 Unauthorized
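
To confirm this diagnosis on a given host, one option is to inspect the kubeconfig dict for exec arguments that source a file which is not present locally. The helper below is a hypothetical sketch (`stale_exec_paths` is not part of Airflow or the Kubernetes client), and it assumes the dict has the `users` / `user` / `exec` / `args` shape shown in the kubeconfig snippet above.

```python
import os
import re

# Matches `. /path` or `source /path` at the start of a shell command segment.
_SOURCED_FILE = re.compile(r"(?:^|;)\s*(?:\.|source)\s+(\S+?)(?=;|\s|$)")


def stale_exec_paths(config_dict):
    """Return sourced file paths in exec args that do not exist on this host."""
    missing = []
    for user in config_dict.get("users", []):
        exec_block = (user.get("user") or {}).get("exec") or {}
        for arg in exec_block.get("args") or []:
            for match in _SOURCED_FILE.finditer(arg):
                path = match.group(1)
                if not os.path.exists(path):
                    missing.append(path)
    return missing


# Example dict shaped like the kubeconfig above; the path is made up.
example_config = {
    "users": [
        {
            "name": "aws",
            "user": {
                "exec": {
                    "command": "sh",
                    "args": ["-c", ". /tmp/no-such-credential-file; echo token"],
                }
            },
        }
    ]
}

print(stale_exec_paths(example_config))
```

Running a check like this where the trigger is deserialized would show whether the exec block still refers to a file that was created and deleted on the worker.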
### Operating System
astro
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon==9.18.0
### Deployment
Astronomer
### Deployment details
_No response_
### Anything else?
I used AI tools to help troubleshoot this, so my understanding is a bit
limited. I don't fully understand how the trigger does its polling. So if I'm
off here, my apologies, but hoping some discussion can help solve this.
### Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)