dstandish commented on a change in pull request #16821:
URL: https://github.com/apache/airflow/pull/16821#discussion_r668288116



##########
File path: airflow/providers/amazon/aws/hooks/s3.py
##########
@@ -292,18 +302,24 @@ def list_keys(
             'PageSize': page_size,
             'MaxItems': max_items,
         }
-
         paginator = self.get_conn().get_paginator('list_objects_v2')
         response = paginator.paginate(
-            Bucket=bucket_name, Prefix=prefix, Delimiter=delimiter, 
PaginationConfig=config
+            Bucket=bucket_name,
+            Prefix=prefix,
+            Delimiter=delimiter,
+            PaginationConfig=config,
+            StartAfter=start_after_key,
+        )
+        # JMESPath to query directly on paginated results
+        filtered_response = response.search(
+            "Contents[?to_string("
+            "LastModified)<='\"{}\"' && "
+            "to_string(LastModified)>='\"{"

Review comment:
       i'm not sure adding this ResponseFilter is the right approach.  i am not 
sure we need to add this abstraction layer.
   
   Why not just let the user put an arbitrary callable that does the filtering 
that the user needs?  For example, then the user could pass `lambda x: 
x['LastModified'] > some_val` for greater than, or `lambda x: other_val > 
x['LastModified'] > some_val` for between.
   
   This is of course simpler to implement, but also it seems straightforward 
from user perspective.  This way we wouldn't be adding a new class that the 
user needs to learn how to use.
   
   Alternatively, we can go with your original thought, i.e. `from_datetime` 
and `to_datetime` (though with the updated naming / logic).  To me, simply 
adding the two optional datetime params, from and to, seems like the best 
approach because it's simple, easy to understand, useful, and there aren't a 
lot of other attributes that we would want to filter on (which lessens the 
utility of an arbitrary callable).  It's true that we can't provide flexibility 
with interval edge inclusivity, but to me that seems like an acceptable 
tradeoff for the simplicity and usabilitiy.  I would probably vote for doing 
`>=` for `from` and `<` for `to`.  
   
   What do yall think?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to