Taragolis opened a new issue, #35449:
URL: https://github.com/apache/airflow/issues/35449
### Body
Original stack trace from Slack:
```console
Error:
  File "/usr/local/airflow/plugins/plugins/others/data_source_monitor.py", line 53, in retrieve_data
    get_time_query = s3_hook.read_key(
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 64, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 92, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 514, in read_key
    obj = self.get_key(key, bucket_name)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 64, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 92, in wrapper
    return func(*bound_args.args, **bound_args.kwargs)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/airflow/providers/amazon/aws/hooks/s3.py", line 493, in get_key
    s3_resource = self.get_session().resource(
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/boto3/session.py", line 446, in resource
    client = self.client(
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/boto3/session.py", line 299, in client
    return self._session.create_client(
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/session.py", line 976, in create_client
    client = client_creator.create_client(
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/client.py", line 116, in create_client
    endpoints_ruleset_data = self._load_service_endpoints_ruleset(
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/client.py", line 220, in _load_service_endpoints_ruleset
    return self._loader.load_service_model(
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 142, in _wrapper
    data = func(self, *args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 406, in load_service_model
    known_services = self.list_available_services(type_name)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 142, in _wrapper
    data = func(self, *args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/loaders.py", line 311, in list_available_services
    api_versions = os.listdir(full_dirname)
OSError: [Errno 12] Cannot allocate memory: '/usr/local/airflow/.local/lib/python3.10/site-packages/botocore/data/efs'
```
The cause of this error is simple: for some operations, S3Hook creates a
[resource](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html)
(high-level client) in addition to
[`S3.Client`](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html).
This resource is created every time one of these S3Hook methods is called,
so each call requires additional memory. For example, calling
[`S3Hook.download_file`](https://github.com/apache/airflow/blob/95980a9bc50c1accd34166ba608bbe2b4ebd6d52/airflow/providers/amazon/aws/hooks/s3.py#L1338)
in a loop might cause this error.
As usual, there are at least two solutions:
**Option 1**: Use caching in the internal methods of S3Hook.
**Option 2**: Get rid of resource usage in S3Hook and replace it with
`S3.Client` methods. This might be the better solution:
- It seems that resources are not actively maintained in `boto3`.
- Creating a new resource object requires about 30-40 MB of memory,
whereas everything (and even more) can be done with `S3.Client`.
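
To make the two options concrete, here is a minimal sketch of what each could look like. It is not the actual S3Hook code: the class `S3ReaderSketch`, its method names, and the two factory arguments are hypothetical and exist only for illustration. The `Object(...).get()` and `get_object(Bucket=..., Key=...)` calls mirror the boto3 resource and client APIs.

```python
from functools import cached_property
from typing import Any, Callable


class S3ReaderSketch:
    """Hypothetical stand-in for a hook; this is not the real S3Hook."""

    def __init__(
        self,
        resource_factory: Callable[[], Any],
        client_factory: Callable[[], Any],
    ) -> None:
        # Assumption: with boto3 these factories could be, for example,
        #   lambda: boto3.session.Session().resource("s3")
        #   lambda: boto3.session.Session().client("s3")
        self._resource_factory = resource_factory
        self._client_factory = client_factory

    @cached_property
    def resource(self) -> Any:
        # Option 1: build the resource once per instance and reuse it,
        # instead of creating a new one on every method call.
        return self._resource_factory()

    @cached_property
    def client(self) -> Any:
        # The low-level client is also created once and reused.
        return self._client_factory()

    def read_key_via_resource(self, key: str, bucket_name: str) -> str:
        # Still uses the resource API, but through the cached object.
        obj = self.resource.Object(bucket_name, key)
        return obj.get()["Body"].read().decode("utf-8")

    def read_key_via_client(self, key: str, bucket_name: str) -> str:
        # Option 2: no resource at all, only the low-level client call.
        response = self.client.get_object(Bucket=bucket_name, Key=key)
        return response["Body"].read().decode("utf-8")
```

With either variant, calling the read method in a loop reuses one underlying object per instance. Option 2 additionally avoids ever loading the resource, which is where the extra memory described above is spent.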
### Committer
- [X] I acknowledge that I am a maintainer/committer of the Apache Airflow
project.