Taragolis commented on code in PR #35273:
URL: https://github.com/apache/airflow/pull/35273#discussion_r1376510926


##########
airflow/providers/amazon/aws/fs/s3.py:
##########
@@ -85,7 +85,15 @@ def get_fs(conn_id: str | None) -> AbstractFileSystem:
     if proxy_uri := s3_service_config.get(S3_PROXY_URI, None):
         config_kwargs["proxies"] = {"http": proxy_uri, "https": proxy_uri}
 
-    fs = S3FileSystem(session=session, config_kwargs=config_kwargs, 
endpoint_url=endpoint_url)
+    anon = False
+    if (
+        aws.conn_config.aws_access_key_id is None
+        and aws.conn_config.aws_secret_access_key is None
+        and aws.conn_config.aws_session_token is None
+    ):
+        anon = True

Review Comment:
   > The issue is that with an empty connection id and no env variables it will 
not default to anonymous,
   
   `None` = use [default `boto3` credentials 
strategy](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials),
 and this is not mean use anonymous. Depend on situation credentials check 
might work, and might not work.
   
   E.g. user deploy on EC2 with enabled Instance IAM Profile, in this case 
credentials might be empty, but session.get_credentials() return something 
different than `None`
   
   And even more funny stuff: for historical reason AWS Connection fallback to 
boto3 default strategy even if connection id not found which is personally for 
me "Have a nice debugging time" 🙄 
   
   I'm not sure, but maybe better to check 
config_kwargs.get("signature_version") == "unsigned". 
   However it is also doesn't guarantee that we correctly guess is it required 
or not. All possible cases might be only resolved when we get finally 
constructed `botocore.config.Config` in `AwsGenericHook. 
_get_config(config=None)`
   
   
https://github.com/apache/airflow/blob/1aa91a482630f389005a9a36b0f038e675163411/airflow/providers/amazon/aws/hooks/base_aws.py#L638-L654
   
   
   
   
   



##########
airflow/providers/amazon/aws/fs/s3.py:
##########
@@ -85,7 +85,15 @@ def get_fs(conn_id: str | None) -> AbstractFileSystem:
     if proxy_uri := s3_service_config.get(S3_PROXY_URI, None):
         config_kwargs["proxies"] = {"http": proxy_uri, "https": proxy_uri}
 
-    fs = S3FileSystem(session=session, config_kwargs=config_kwargs, 
endpoint_url=endpoint_url)
+    anon = False
+    if (
+        aws.conn_config.aws_access_key_id is None
+        and aws.conn_config.aws_secret_access_key is None
+        and aws.conn_config.aws_session_token is None
+    ):
+        anon = True

Review Comment:
   > The issue is that with an empty connection id and no env variables it will 
not default to anonymous,
   
   `None` = use [default `boto3` credentials 
strategy](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#configuring-credentials),
 and this is not mean use anonymous. Depend on situation credentials check 
might work, and might not work.
   
   E.g. user deploy on EC2 with enabled Instance IAM Profile, in this case 
credentials might be empty, but session.get_credentials() return something 
different than `None`
   
   And even more funny stuff: for historical reason AWS Connection fallback to 
boto3 default strategy even if connection id not found which is personally for 
me "Have a nice debugging time" 🙄 
   
   I'm not sure, but maybe better to check 
config_kwargs.get("signature_version") == "unsigned". 
   However it is also doesn't guarantee that we correctly guess is it required 
or not. All possible cases might be only resolved when we get finally 
constructed `botocore.config.Config` in `AwsGenericHook. 
_get_config(config=None)`
   
   
https://github.com/apache/airflow/blob/1aa91a482630f389005a9a36b0f038e675163411/airflow/providers/amazon/aws/hooks/base_aws.py#L638-L654
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to