dstandish commented on a change in pull request #16821:
URL: https://github.com/apache/airflow/pull/16821#discussion_r668288116
##########
File path: airflow/providers/amazon/aws/hooks/s3.py
##########
@@ -292,18 +302,24 @@ def list_keys(
'PageSize': page_size,
'MaxItems': max_items,
}
-
paginator = self.get_conn().get_paginator('list_objects_v2')
response = paginator.paginate(
- Bucket=bucket_name, Prefix=prefix, Delimiter=delimiter,
PaginationConfig=config
+ Bucket=bucket_name,
+ Prefix=prefix,
+ Delimiter=delimiter,
+ PaginationConfig=config,
+ StartAfter=start_after_key,
+ )
+ # JMESPath to query directly on paginated results
+ filtered_response = response.search(
+ "Contents[?to_string("
+ "LastModified)<='\"{}\"' && "
+ "to_string(LastModified)>='\"{"
Review comment:
i'm not sure adding this ResponseFilter is the right approach. i am not
sure we need to add this abstraction layer.
Why not just let the user put an arbitrary callable that does the filtering
that the user needs? For example, then the user could pass `lambda x:
x['LastModified'] > some_val` for greater than, or `lambda x: other_val >
x['LastModified'] > some_val` for between.
This is of course simpler to implement, but also it seems straightforward
from user perspective. This way we wouldn't be adding a new class that the
user needs to learn how to use.
Alternatively, we can go with your original thought, i.e. `from_datetime`
and `to_datetime` (though with the updated naming / logic). I think this is
probably the best approach. Simply adding the two optional datetime params,
from and to, is simple, easy to understand, useful, and importantly there
aren't other attributes in the response that we would want to filter on (which
lessens the utility of an arbitrary callable). It's true that with simple from
/ to params, we can't provide flexibility with interval edge inclusivity, but
to me that seems like an acceptable tradeoff for the simplicity and usabilitiy.
I would probably vote for doing `>=` for `from` and `<` for `to`.
But if not adding simple from / to, I'm in support of simple arbitrary
callable instead of the ResponseFilter apparatus.
What do yall think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]