vafremov213 opened a new issue, #60836:
URL: https://github.com/apache/airflow/issues/60836
### Description
Add an optional **'max_mails'** parameter to the ImapHook attachment methods
(download_mail_attachments etc.) that limits the number of latest
emails being processed, while still retrieving all attachments
from those emails.
### Use case/motivation
Problem
ImapHook currently has only two options for processing attachments from
emails matching the given mail_filter.
For example:
- the last 3 emails match the filter
- each email contains 2 attachments
Currently, users must either:
- download all 6 attachments, or
- use `latest_only=True` and receive only a single attachment
There is no way to get all attachments from only the latest email.
There is no way to retrieve all attachments from the last N emails.
There is no way to limit the number of emails being processed other than the
`latest_only` flag.
This becomes problematic when working with emails that contain multiple
attachments.
Proposed solution
Introduce an optional **max_mails** parameter to `ImapHook` methods that
retrieve messages or attachments.
The parameter would limit the number of **latest emails** being processed,
while still returning **all attachments** from those emails.
The default value would be `None`, preserving the current behavior.
Conceptually, this can be implemented by modifying the internal
_list_mail_ids_desc method as follows:
```
import itertools
from collections.abc import Iterable


def _list_mail_ids_desc(
    self,
    mail_filter: str,
    max_mails: int | None = None,
) -> Iterable[str]:
    if not self.mail_client:
        raise RuntimeError("The 'mail_client' should be initialized before!")
    _, data = self.mail_client.search(None, mail_filter)
    mail_ids = data[0].split()
    # IMAP search returns ids oldest first, so reverse to get newest first.
    mail_ids_desc = reversed(mail_ids)
    if max_mails is not None:
        # Keep only the newest `max_mails` ids.
        return itertools.islice(mail_ids_desc, max_mails)
    return mail_ids_desc
```
Of course, this parameter should also be added to the attachment methods.
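To illustrate the intended behavior, here is a minimal standalone sketch of the scenario above (3 matching emails, 2 attachments each). It does not use the real `ImapHook`: `FakeMailClient`, `ATTACHMENTS`, `list_mail_ids_desc` and `collect_attachments` are hypothetical names made up for this example, and the fake client only mimics the shape of the search result that the method above relies on.

```python
import itertools
from collections.abc import Iterable
from typing import Optional


class FakeMailClient:
    """Hypothetical stand-in for the IMAP client; returns a fixed search result."""

    def search(self, charset, *criteria):
        # Same shape the method above expects: (status, [b"ids separated by spaces"]),
        # with ids listed oldest first.
        return "OK", [b"1 2 3"]


# Hypothetical attachment names per mail id, two per email.
ATTACHMENTS = {
    b"1": ["a1.csv", "a2.csv"],
    b"2": ["b1.csv", "b2.csv"],
    b"3": ["c1.csv", "c2.csv"],
}


def list_mail_ids_desc(client, mail_filter: str, max_mails: Optional[int] = None) -> Iterable[bytes]:
    _, data = client.search(None, mail_filter)
    mail_ids_desc = reversed(data[0].split())  # newest first
    if max_mails is not None:
        return itertools.islice(mail_ids_desc, max_mails)
    return mail_ids_desc


def collect_attachments(client, mail_filter: str, max_mails: Optional[int] = None) -> list:
    # All attachments from each selected email, newest email first.
    return [
        name
        for mail_id in list_mail_ids_desc(client, mail_filter, max_mails)
        for name in ATTACHMENTS[mail_id]
    ]


client = FakeMailClient()
print(collect_attachments(client, "ALL"))               # all 6 attachments
print(collect_attachments(client, "ALL", max_mails=2))  # 4 attachments from the 2 newest emails
print(collect_attachments(client, "ALL", max_mails=1))  # both attachments of the newest email
```

One thing to decide in the PR: `itertools.islice` raises `ValueError` for a negative stop value, so the hook may want to validate `max_mails` and raise a clearer error.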
Example use case
```python
hook.retrieve_mail_attachments(
mail_filter="SINCE 01-Jan-2024",
max_mails=2,
)
```
This would retrieve all attachments from the two most recent matching emails.
### Related issues
Haven't found any
### Are you willing to submit a PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)