Taragolis commented on code in PR #29623:
URL: https://github.com/apache/airflow/pull/29623#discussion_r1112384223
##########
airflow/providers/amazon/aws/hooks/base_aws.py:
##########
@@ -312,19 +312,35 @@ def _get_web_identity_credential_fetcher(
base_session = self.basic_session._session or
botocore.session.get_session()
client_creator = base_session.create_client
federation =
self.extra_config.get("assume_role_with_web_identity_federation")
- if federation == "google":
- web_identity_token_loader =
self._get_google_identity_token_loader()
- else:
- raise AirflowException(
- f'Unsupported federation: {federation}. Currently "google"
only are supported.'
- )
+
+ web_identity_token_loader = (
+ {
+ "file": self._get_file_token_loader,
+ "google": self._get_google_identity_token_loader,
+ }.get(federation)()
+ if type(federation) == str
+ else None
+ )
Review Comment:
> The objective is more about using multiple role_arn than multiple tokens.
It is widespread in practice for users not to have the ability to override
config files (for instance, in corporate environments or on PaaS), so defining
profiles in the AWS config or shared credentials file is not always practical.
Seems like it provide ability to set different web identity token file,
isn't it? 🤔
```python
token_file =
self.extra_config.get("assume_role_with_web_identity_token_file") or os.getenv(
AssumeRoleWithWebIdentityProvider._CONFIG_TO_ENV_VAR["web_identity_token_file"]
)
```
> This shares the same concern as point 1 regarding the inability to
overwrite config files. Also, I think this option would be dismissed as too
complex and burdensome by 99% of Airflow users.
I guess this option exists for allow create own method auth to AWS without
overwrite everything in provider. If it complicated better to think how it make
easier to end users.
> This does not permit configuration through the Airflow connections
subsystem without relying on external configs, which are (1) not necessarily
mutable and (2) more-or-less static.
Airflow
[`AssumeRole`](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html)
in two steps
1. Obtain initial session by: either direct credentials or use `boto3`
default strategy for obtain credentials (Environment Variables, EC2 IAM
Profile, ECS Execution Role and etc)
2. Use session obtained in Step 1 for assume role thought STS API
[`AssumeRole`](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html).
So I can't see any problem for obtain in EKS IRSA or by other SA
(`AWS_ROLE_ARN`, `AWS_WEB_IDENTITY_TOKEN_FILE` and `AWS_ROLE_SESSION_NAME`
environment variables) and after that assume to required role.
That simple snippet DAG for check
```python
import os
from airflow import DAG
from airflow.decorators import task
from airflow.models import Connection
import pendulum
with DAG(
dag_id="test-irsa-role",
start_date=pendulum.datetime(2023, 2, 1, tz="UTC"),
schedule=None,
catchup=False,
tags=["aws", "assume-role", "29623"]
):
@task
def check_connection():
os.environ["AWS_ROLE_ARN"] =
"arn:aws:iam::111111111111:role/spam-egg-role"
os.environ["AWS_WEB_IDENTITY_TOKEN_FILE"] =
"/var/run/secrets/eks.amazonaws.com/serviceaccount/token"
conn_id = "fake-conn-for-test"
region_name = "eu-west-1"
assumed_role = "arn:aws:iam::111111111111:role/foo-bar-role"
assert assumed_role != os.environ["AWS_ROLE_ARN"]
conn = Connection(
conn_id=conn_id,
conn_type="aws",
extra={
"role_arn": assumed_role,
"region_name": region_name
}
)
env_key = f"AIRFLOW_CONN_{conn.conn_id.upper()}"
os.environ[env_key] = conn.get_uri()
result, message = conn.test_connection()
if result:
return message
raise PermissionError(message)
check_connection()
```
> This would be possible if boto3 allowed passing web_identity_token_file,
role_arn, or role_session_name as parameters to clients or sessions, but sadly
it does not at this point.
This would be nice to address to [boto team](https://github.com/boto), and
may be they could share why it is not implemented into their SDK, I guess some
reason exists for that.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]