mcreenan opened a new issue #16972:
URL: https://github.com/apache/airflow/issues/16972
**Apache Airflow version**: 2.1.0
**Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T18:49:28Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.16-eks-7737de", GitCommit:"7737de131e58a68dda49cdd0ad821b4cb3665ae8", GitTreeState:"clean", BuildDate:"2021-03-10T21:33:25Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
**Environment**: Local/Development
- **Cloud provider or hardware configuration**: Docker container
- **OS** (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
- **Kernel** (e.g. `uname -a`): Linux 243e98509628 5.10.25-linuxkit #1 SMP Tue Mar 23 09:27:39 UTC 2021 x86_64 GNU/Linux
- **Install tools**:
- **Others**:
**What happened**:
* Using the AWS Secrets Manager secrets backend
* Using S3Hook with aws_conn_id="foo/bar/baz" (example, but the slashes are important)
* Secret value is `aws://?role_arn=arn%3Aaws%3Aiam%3A%3A<account_id>%3Arole%2F<role_name>&region_name=us-east-1` (decoded in the sketch after this list)
* Get the following error: `botocore.exceptions.ClientError: An error occurred (ValidationError) when calling the AssumeRole operation: 1 validation error detected: Value 'Airflow_data/foo/bar/baz' at 'roleSessionName' failed to satisfy constraint: Member must satisfy regular expression pattern: [\w+=,.@-]*`
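To make the percent-encoding concrete, here is a small sketch that decodes such a secret value (the account ID and role name below are made-up placeholders):

```python
from urllib.parse import parse_qs, urlparse

# Placeholder account ID and role name, for illustration only.
secret_value = (
    'aws://?role_arn=arn%3Aaws%3Aiam%3A%3A123456789012%3Arole%2Fmy_role'
    '&region_name=us-east-1'
)

query = parse_qs(urlparse(secret_value).query)
print(query['role_arn'][0])     # arn:aws:iam::123456789012:role/my_role
print(query['region_name'][0])  # us-east-1
```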
**What you expected to happen**:
No error; boto should attempt to assume the role in the connection URI.
The `_SessionFactory._assume_role` method sets the role session name to `f"Airflow_{self.conn.conn_id}"` with no encoding or sanitization, so any conn_id containing characters outside `[\w+=,.@-]` (such as `/`) produces an invalid `RoleSessionName`.
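For illustration only, a minimal sketch of the kind of sanitization that would satisfy the constraint (the helper name is hypothetical, not part of Airflow's API):

```python
import re

def sanitized_role_session_name(conn_id: str) -> str:
    # STS requires RoleSessionName to match [\w+=,.@-]*, so replace
    # anything outside that set (e.g. '/' in a slash-delimited
    # conn_id) with '-'.
    return re.sub(r'[^\w+=,.@-]', '-', f'Airflow_{conn_id}')

print(sanitized_role_session_name('foo/bar/baz'))  # Airflow_foo-bar-baz
```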
**How to reproduce it**:
* Create an AWS connection with forward slashes in the name/id (one way to seed it via Secrets Manager is sketched after the DAG example)
  * Use a role_arn in the connection string (e.g. `aws://?role_arn=...`)
* Create a test DAG using an AWS hook. Example below:
```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

# Connection ID containing forward slashes; the slashes trigger the bug
AWS_ASSUME_ROLE_CONN_ID = 'foo/bar/baz'

with DAG(
    dag_id='test_assume_role',
    start_date=datetime(2021, 6, 1),
    schedule_interval=None,  # no schedule, triggered manually/ad-hoc
    tags=['test'],
) as dag:

    def write_to_s3(**kwargs):
        s3_hook = S3Hook(aws_conn_id=AWS_ASSUME_ROLE_CONN_ID)
        s3_hook.load_string(
            string_data='test',
            bucket_name='test_bucket',
            key='test/{{ execution_date }}',
        )

    write_test_object = PythonOperator(
        task_id='write_test_object',
        python_callable=write_to_s3,
    )
```
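If you are reproducing with the Secrets Manager backend, one way to seed the connection is sketched below (assuming the backend's default `airflow/connections` prefix; the account ID and role name are placeholders):

```python
import boto3

client = boto3.client('secretsmanager', region_name='us-east-1')

# Everything after the 'airflow/connections/' prefix becomes the
# conn_id, slashes included, so this yields conn_id 'foo/bar/baz'.
client.create_secret(
    Name='airflow/connections/foo/bar/baz',
    SecretString=(
        'aws://?role_arn=arn%3Aaws%3Aiam%3A%3A123456789012%3Arole%2Fmy_role'
        '&region_name=us-east-1'
    ),
)
```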
**Anything else we need to know**:
This is a redacted log from my actual test while using AWS Secrets Manager. You should see a similar result *without* Secrets Manager, though; a sketch of reproducing that way follows.
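For completeness, a minimal sketch of creating the same slash-named connection directly in the metadata database instead of Secrets Manager (assuming you can write to the DB; the URI values are illustrative placeholders):

```python
from airflow.models import Connection
from airflow.utils.session import create_session

# Hypothetical setup snippet: registers a conn_id containing slashes
# without going through a secrets backend.
conn = Connection(
    conn_id='foo/bar/baz',
    uri='aws://?role_arn=arn%3Aaws%3Aiam%3A%3A123456789012%3Arole%2Fmy_role'
        '&region_name=us-east-1',
)
with create_session() as session:
    session.add(conn)
```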
<details>
<summary>1.log</summary>
[2021-07-13 12:38:10,271] {taskinstance.py:876} INFO - Dependencies all met for <TaskInstance: test_assume_role.write_test_object 2021-07-13T12:35:02.576772+00:00 [queued]>
[2021-07-13 12:38:10,288] {taskinstance.py:876} INFO - Dependencies all met for <TaskInstance: test_assume_role.write_test_object 2021-07-13T12:35:02.576772+00:00 [queued]>
[2021-07-13 12:38:10,288] {taskinstance.py:1067} INFO -
--------------------------------------------------------------------------------
[2021-07-13 12:38:10,289] {taskinstance.py:1068} INFO - Starting attempt 1 of 1
[2021-07-13 12:38:10,289] {taskinstance.py:1069} INFO -
--------------------------------------------------------------------------------
[2021-07-13 12:38:10,299] {taskinstance.py:1087} INFO - Executing <Task(PythonOperator): write_test_object> on 2021-07-13T12:35:02.576772+00:00
[2021-07-13 12:38:10,305] {standard_task_runner.py:52} INFO - Started process 38974 to run task
[2021-07-13 12:38:10,309] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'test_assume_role', 'write_test_object', '2021-07-13T12:35:02.576772+00:00', '--job-id', '2376', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/test_assume_role.py', '--cfg-path', '/tmp/tmprusuo0ys', '--error-file', '/tmp/tmp8ytd9bk8']
[2021-07-13 12:38:10,311] {standard_task_runner.py:77} INFO - Job 2376: Subtask write_test_object
[2021-07-13 12:38:10,331] {logging_mixin.py:104} INFO - Running <TaskInstance: test_assume_role.write_test_object 2021-07-13T12:35:02.576772+00:00 [running]> on host 243e98509628
[2021-07-13 12:38:10,392] {taskinstance.py:1282} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=test_assume_role
AIRFLOW_CTX_TASK_ID=write_test_object
AIRFLOW_CTX_EXECUTION_DATE=2021-07-13T12:35:02.576772+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2021-07-13T12:35:02.576772+00:00
[2021-07-13 12:38:10,419] {base_aws.py:362} INFO - Airflow Connection: aws_conn_id=foo/bar/baz
[2021-07-13 12:38:10,444] {credentials.py:1087} INFO - Found credentials in environment variables.
[2021-07-13 12:38:11,079] {base_aws.py:173} INFO - No credentials retrieved from Connection
[2021-07-13 12:38:11,079] {base_aws.py:76} INFO - Retrieving region_name from Connection.extra_config['region_name']
[2021-07-13 12:38:11,079] {base_aws.py:81} INFO - Creating session with aws_access_key_id=None region_name=us-east-1
[2021-07-13 12:38:11,096] {base_aws.py:151} INFO - role_arn is arn:aws:iam::<account_id>:role/<role_name>
[2021-07-13 12:38:11,096] {base_aws.py:97} INFO - assume_role_method=None
[2021-07-13 12:38:11,098] {credentials.py:1087} INFO - Found credentials in environment variables.
[2021-07-13 12:38:11,119] {base_aws.py:185} INFO - Doing sts_client.assume_role to role_arn=arn:aws:iam::<account_id>:role/<role_name> (role_session_name=Airflow_data/foo/bar/baz)
[2021-07-13 12:38:11,407] {taskinstance.py:1481} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1137, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1311, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1341, in _execute_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 150, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 161, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/usr/local/airflow/dags/test_assume_role.py", line 49, in write_to_s3
    key='test/{{ execution_date }}'
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 61, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 90, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 571, in load_string
    self._upload_file_obj(file_obj, key, bucket_name, replace, encrypt, acl_policy)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 652, in _upload_file_obj
    if not replace and self.check_for_key(key, bucket_name):
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 61, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 90, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 328, in check_for_key
    raise e
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 322, in check_for_key
    self.get_conn().head_object(Bucket=bucket_name, Key=key)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 455, in get_conn
    return self.conn
  File "/usr/local/lib/python3.7/site-packages/cached_property.py", line 36, in __get__
    value = obj.__dict__[self.func.__name__] = self.func(obj)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 437, in conn
    return self.get_client_type(self.client_type, region_name=self.region_name)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 403, in get_client_type
    session, endpoint_url = self._get_credentials(region_name)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 379, in _get_credentials
    conn=connection_object, region_name=region_name, config=self.config
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 69, in create_session
    return self._impersonate_to_role(role_arn=role_arn, session=session, session_kwargs=session_kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 101, in _impersonate_to_role
    sts_client=sts_client, role_arn=role_arn, assume_role_kwargs=assume_role_kwargs
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py", line 188, in _assume_role
    RoleArn=role_arn, RoleSessionName=role_session_name, **assume_role_kwargs
  File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (ValidationError) when calling the AssumeRole operation: 1 validation error detected: Value 'Airflow_data/foo/bar/baz' at 'roleSessionName' failed to satisfy constraint: Member must satisfy regular expression pattern: [\w+=,.@-]*
[2021-07-13 12:38:11,417] {taskinstance.py:1531} INFO - Marking task as FAILED. dag_id=test_assume_role, task_id=write_test_object, execution_date=20210713T123502, start_date=20210713T123810, end_date=20210713T123811
[2021-07-13 12:38:11,486] {local_task_job.py:151} INFO - Task exited with return code 1
</details>