mcreenan opened a new issue #16972:
URL: https://github.com/apache/airflow/issues/16972


   **Apache Airflow version**: 2.1.0
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl 
version`): Client Version: version.Info{Major:"1", Minor:"19", 
GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", 
GitTreeState:"clean", BuildDate:"2020-10-14T18:49:28Z", GoVersion:"go1.15.2", 
Compiler:"gc", Platform:"darwin/amd64"}
   Server Version: version.Info{Major:"1", Minor:"18+", 
GitVersion:"v1.18.16-eks-7737de", 
GitCommit:"7737de131e58a68dda49cdd0ad821b4cb3665ae8", GitTreeState:"clean", 
BuildDate:"2021-03-10T21:33:25Z", GoVersion:"go1.13.15", Compiler:"gc", 
Platform:"linux/amd64"}
   
   **Environment**: Local/Development
   
   - **Cloud provider or hardware configuration**: Docker container
   - **OS** (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster) 
   - **Kernel** (e.g. `uname -a`): Linux 243e98509628 5.10.25-linuxkit #1 SMP 
Tue Mar 23 09:27:39 UTC 2021 x86_64 GNU/Linux
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   * Using AWS Secrets Manager secrets backend
   * Using S3Hook with aws_conn_id="foo/bar/baz" (example, but the slashes are 
important)
   * Secret value is: 
aws://?role_arn=arn%3Aaws%3Aiam%3A%3A`<account_id>`%3Arole%2F`<role_name>`&region_name=us-east-1
   * Get the following error: `botocore.exceptions.ClientError: An error 
occurred (ValidationError) when calling the AssumeRole operation: 1 validation 
error detected: Value 'Airflow_data/foo/bar/baz' at 'roleSessionName' failed to 
satisfy constraint: Member must satisfy regular expression pattern: [\w+=,.@-]*`
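The rejection can be reproduced without calling AWS at all by checking the generated session name against the pattern quoted in the error (a quick sketch; the pattern and session name are copied verbatim from the ValidationError and task log):

```python
import re

# Character class STS enforces on roleSessionName, copied from the error above.
STS_SESSION_NAME_PATTERN = r"[\w+=,.@-]*"

# Session name Airflow generated, taken from the task log below.
session_name = "Airflow_data/foo/bar/baz"

# '/' is not in the allowed set, so the full-string match fails.
print(re.fullmatch(STS_SESSION_NAME_PATTERN, session_name))  # -> None
```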
   
   
   **What you expected to happen**:
   
   No error; boto should attempt to assume the role specified in the connection URI.
   
   The `_SessionFactory._assume_role` method sets the role session name to 
`f"Airflow_{self.conn.conn_id}"` without sanitizing it, so any character outside 
`[\w+=,.@-]` (such as the `/` in the connection id) causes the AssumeRole call to fail.
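One possible fix, sketched here and not the actual Airflow code (the helper name is hypothetical), would be to replace every disallowed character before passing the name to AssumeRole:

```python
import re

def sanitize_session_name(conn_id: str) -> str:
    # Replace anything outside the set STS allows for roleSessionName
    # (letters, digits, and _+=,.@-) with a hyphen; illustrative only.
    return re.sub(r"[^\w+=,.@-]", "-", f"Airflow_{conn_id}")

print(sanitize_session_name("foo/bar/baz"))  # -> Airflow_foo-bar-baz
```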
   
   **How to reproduce it**:
   
   * Create an AWS connection with forward slashes in the name/id
     * Use a role_arn in the connection string (e.g. `aws://?role_arn=...`)
   * Create a test DAG using an AWS hook. Example below:
   
   ```python
   from datetime import datetime

   from airflow import DAG
   from airflow.operators.python import PythonOperator
   from airflow.providers.amazon.aws.hooks.s3 import S3Hook

   AWS_ASSUME_ROLE_CONN_ID = 'foo/bar/baz'  # connection id containing slashes

   with DAG(
       dag_id='test_assume_role',
       start_date=datetime(2021, 6, 1),
       schedule_interval=None,  # no schedule, triggered manually/ad-hoc
       tags=['test'],
   ) as dag:
       def write_to_s3(**kwargs):
           s3_hook = S3Hook(aws_conn_id=AWS_ASSUME_ROLE_CONN_ID)
           s3_hook.load_string(
               string_data='test',
               bucket_name='test_bucket',
               key='test/{{ execution_date }}',
           )

       write_test_object = PythonOperator(
           task_id='write_test_object',
           python_callable=write_to_s3,
       )
   ```
   
   **Anything else we need to know**:
   
   This is a redacted log from my actual test while using AWS Secrets Manager. 
You should get a similar result *without* Secrets Manager, though.
   
   <details>
   <summary>1.log</summary>
   [2021-07-13 12:38:10,271] {taskinstance.py:876} INFO - Dependencies all met 
for <TaskInstance: test_assume_role.write_test_object 
2021-07-13T12:35:02.576772+00:00 [queued]>
   [2021-07-13 12:38:10,288] {taskinstance.py:876} INFO - Dependencies all met 
for <TaskInstance: test_assume_role.write_test_object 
2021-07-13T12:35:02.576772+00:00 [queued]>
   [2021-07-13 12:38:10,288] {taskinstance.py:1067} INFO - 
   
--------------------------------------------------------------------------------
   [2021-07-13 12:38:10,289] {taskinstance.py:1068} INFO - Starting attempt 1 
of 1
   [2021-07-13 12:38:10,289] {taskinstance.py:1069} INFO - 
   
--------------------------------------------------------------------------------
   [2021-07-13 12:38:10,299] {taskinstance.py:1087} INFO - Executing 
<Task(PythonOperator): write_test_object> on 2021-07-13T12:35:02.576772+00:00
   [2021-07-13 12:38:10,305] {standard_task_runner.py:52} INFO - Started 
process 38974 to run task
   [2021-07-13 12:38:10,309] {standard_task_runner.py:76} INFO - Running: 
['airflow', 'tasks', 'run', 'test_assume_role', 'write_test_object', 
'2021-07-13T12:35:02.576772+00:00', '--job-id', '2376', '--pool', 
'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/test_assume_role.py', 
'--cfg-path', '/tmp/tmprusuo0ys', '--error-file', '/tmp/tmp8ytd9bk8']
   [2021-07-13 12:38:10,311] {standard_task_runner.py:77} INFO - Job 2376: 
Subtask write_test_object
   [2021-07-13 12:38:10,331] {logging_mixin.py:104} INFO - Running 
<TaskInstance: test_assume_role.write_test_object 
2021-07-13T12:35:02.576772+00:00 [running]> on host 243e98509628
   [2021-07-13 12:38:10,392] {taskinstance.py:1282} INFO - Exporting the 
following env vars:
   AIRFLOW_CTX_DAG_OWNER=airflow
   AIRFLOW_CTX_DAG_ID=test_assume_role
   AIRFLOW_CTX_TASK_ID=write_test_object
   AIRFLOW_CTX_EXECUTION_DATE=2021-07-13T12:35:02.576772+00:00
   AIRFLOW_CTX_DAG_RUN_ID=manual__2021-07-13T12:35:02.576772+00:00
   [2021-07-13 12:38:10,419] {base_aws.py:362} INFO - Airflow Connection: 
aws_conn_id=foo/bar/baz
   [2021-07-13 12:38:10,444] {credentials.py:1087} INFO - Found credentials in 
environment variables.
   [2021-07-13 12:38:11,079] {base_aws.py:173} INFO - No credentials retrieved 
from Connection
   [2021-07-13 12:38:11,079] {base_aws.py:76} INFO - Retrieving region_name 
from Connection.extra_config['region_name']
   [2021-07-13 12:38:11,079] {base_aws.py:81} INFO - Creating session with 
aws_access_key_id=None region_name=us-east-1
   [2021-07-13 12:38:11,096] {base_aws.py:151} INFO - role_arn is 
arn:aws:iam::<account_id>:role/<role_name>
   [2021-07-13 12:38:11,096] {base_aws.py:97} INFO - assume_role_method=None
   [2021-07-13 12:38:11,098] {credentials.py:1087} INFO - Found credentials in 
environment variables.
   [2021-07-13 12:38:11,119] {base_aws.py:185} INFO - Doing 
sts_client.assume_role to role_arn=arn:aws:iam::<account_id>:role/<role_name> 
(role_session_name=Airflow_data/foo/bar/baz)
   [2021-07-13 12:38:11,407] {taskinstance.py:1481} ERROR - Task failed with 
exception
   Traceback (most recent call last):
     File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1137, in _run_raw_task
       self._prepare_and_execute_task_with_callbacks(context, task)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1311, in _prepare_and_execute_task_with_callbacks
       result = self._execute_task(context, task_copy)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 
1341, in _execute_task
       result = task_copy.execute(context=context)
     File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", 
line 150, in execute
       return_value = self.execute_callable()
     File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", 
line 161, in execute_callable
       return self.python_callable(*self.op_args, **self.op_kwargs)
     File "/usr/local/airflow/dags/test_assume_role.py", line 49, in write_to_s3
       key='test/{{ execution_date }}'
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py",
 line 61, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py",
 line 90, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py",
 line 571, in load_string
       self._upload_file_obj(file_obj, key, bucket_name, replace, encrypt, 
acl_policy)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py",
 line 652, in _upload_file_obj
       if not replace and self.check_for_key(key, bucket_name):
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py",
 line 61, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py",
 line 90, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py",
 line 328, in check_for_key
       raise e
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/s3.py",
 line 322, in check_for_key
       self.get_conn().head_object(Bucket=bucket_name, Key=key)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py",
 line 455, in get_conn
       return self.conn
     File "/usr/local/lib/python3.7/site-packages/cached_property.py", line 36, 
in __get__
       value = obj.__dict__[self.func.__name__] = self.func(obj)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py",
 line 437, in conn
       return self.get_client_type(self.client_type, 
region_name=self.region_name)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py",
 line 403, in get_client_type
       session, endpoint_url = self._get_credentials(region_name)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py",
 line 379, in _get_credentials
       conn=connection_object, region_name=region_name, config=self.config
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py",
 line 69, in create_session
       return self._impersonate_to_role(role_arn=role_arn, session=session, 
session_kwargs=session_kwargs)
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py",
 line 101, in _impersonate_to_role
       sts_client=sts_client, role_arn=role_arn, 
assume_role_kwargs=assume_role_kwargs
     File 
"/usr/local/lib/python3.7/site-packages/airflow/providers/amazon/aws/hooks/base_aws.py",
 line 188, in _assume_role
       RoleArn=role_arn, RoleSessionName=role_session_name, **assume_role_kwargs
     File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 
357, in _api_call
       return self._make_api_call(operation_name, kwargs)
     File "/usr/local/lib/python3.7/site-packages/botocore/client.py", line 
676, in _make_api_call
       raise error_class(parsed_response, operation_name)
   botocore.exceptions.ClientError: An error occurred (ValidationError) when 
calling the AssumeRole operation: 1 validation error detected: Value 
'Airflow_data/foo/bar/baz' at 'roleSessionName' failed to satisfy constraint: 
Member must satisfy regular expression pattern: [\w+=,.@-]*
   [2021-07-13 12:38:11,417] {taskinstance.py:1531} INFO - Marking task as 
FAILED. dag_id=test_assume_role, task_id=write_test_object, 
execution_date=20210713T123502, start_date=20210713T123810, 
end_date=20210713T123811
   [2021-07-13 12:38:11,486] {local_task_job.py:151} INFO - Task exited with 
return code 1
   </details>

