Taragolis opened a new issue, #35449:
URL: https://github.com/apache/airflow/issues/35449

   ### Body
   
   Original stack trace from Slack:
   
   ```console
   Error:
     File "/usr/local/airflow/plugins/plugins/others/data_source_monitor.py", line 53, in retrieve_data
       get_time_query = s3_hook.read_key(
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 64, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 92, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 514, in read_key
       obj = self.get_key(key, bucket_name)
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 64, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 92, in wrapper
       return func(*bound_args.args, **bound_args.kwargs)
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 493, in get_key
       s3_resource = self.get_session().resource(
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/boto3/session.py", line 446, in resource
       client = self.client(
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/boto3/session.py", line 299, in client
       return self._session.create_client(
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/session.py", line 976, in create_client
       client = client_creator.create_client(
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/client.py", line 116, in create_client
       endpoints_ruleset_data = self._load_service_endpoints_ruleset(
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/client.py", line 220, in _load_service_endpoints_ruleset
       return self._loader.load_service_model(
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 142, in _wrapper
       data = func(self, *args, **kwargs)
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 406, in load_service_model
       known_services = self.list_available_services(type_name)
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 142, in _wrapper
       data = func(self, *args, **kwargs)
     File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 311, in list_available_services
       api_versions = os.listdir(full_dirname)
   OSError: [Errno 12] Cannot allocate memory: '/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/data/efs'
   ```
   
   The cause of this error is simple: for some operations, `S3Hook` creates a
   [resource](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html)
   (the high-level client) in addition to
   [`S3.Client`](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html),
   and a new resource is created every time one of these `S3Hook` methods is called,
   which requires additional memory. For example, calling
   [`S3Hook.download_file`](https://github.com/apache/airflow/blob/95980a9bc50c1accd34166ba608bbe2b4ebd6d52/airflow/providers/amazon/aws/hooks/s3.py#L1338)
   in a loop might trigger this error.
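
The per-call pattern described above can be pictured with a small stand-in. This is only a sketch: `FakeSession` and `HookLike` are hypothetical names, not boto3 or the real `S3Hook`, and the fake `resource()` does nothing except count how often it is called.

```python
class FakeSession:
    """Hypothetical stand-in for a boto3 session; it only counts resource() calls."""

    def __init__(self):
        self.resources_created = 0

    def resource(self, service_name):
        self.resources_created += 1
        return object()


class HookLike:
    """Hypothetical hook that builds a new resource inside every method call."""

    def __init__(self, session):
        self._session = session

    def download_file(self, key):
        # A fresh resource object is requested on each call, as in the traceback.
        s3_resource = self._session.resource("s3")
        return key


session = FakeSession()
hook = HookLike(session)
for key in ["a.csv", "b.csv", "c.csv"]:
    hook.download_file(key)

print(session.resources_created)  # one resource per loop iteration
```

With a real session, each of those calls would have to load service model data again, which is where the traceback above fails.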
   
   As usual, there are at least two solutions:
   **Option 1**: Use caching in the internal methods of `S3Hook`.
   **Option 2**: Get rid of resource usage in `S3Hook` and replace it with
   `S3.Client` methods. This might be the better solution:
      - Resources do not seem to be actively maintained in `boto3`.
      - Creating a new resource object requires about 30-40 MB of memory,
        whereas everything it does (and more) can be done with `S3.Client`.
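
Both options can be sketched roughly as follows. This is an illustration under assumptions, not the change made in Airflow: `CachingHook` and the standalone `read_key` helper are hypothetical names, and the session and client are passed in so the snippet does not depend on any particular connection setup. The only boto3 call assumed is the client's `get_object(Bucket=..., Key=...)`, whose response carries a readable `Body`.

```python
from functools import cached_property


class CachingHook:
    """Option 1 sketch (hypothetical class): build the resource once per hook."""

    def __init__(self, session):
        self._session = session

    @cached_property
    def s3_resource(self):
        # Evaluated on first access only; later accesses reuse the stored object.
        return self._session.resource("s3")


def read_key(client, key, bucket_name):
    """Option 2 sketch: read an object's text through a client, with no resource."""
    response = client.get_object(Bucket=bucket_name, Key=key)
    return response["Body"].read().decode("utf-8")
```

With boto3 installed, `read_key(boto3.client("s3"), "some/key", "some-bucket")` would be one way to call the second helper; the key and bucket names here are placeholders.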
   
   ### Committer
   
   - [X] I acknowledge that I am a maintainer/committer of the Apache Airflow project.

