anaynayak commented on a change in pull request #18807:
URL: https://github.com/apache/airflow/pull/18807#discussion_r726771459
##########
File path: airflow/providers/amazon/aws/sensors/s3_prefix.py
##########
@@ -67,18 +67,15 @@ def __init__(
         super().__init__(**kwargs)
         # Parse
         self.bucket_name = bucket_name
-        self.prefix = prefix
+        self.prefix = [prefix] if isinstance(prefix, str) else prefix
         self.delimiter = delimiter
-        self.full_url = "s3://" + bucket_name + '/' + prefix
         self.aws_conn_id = aws_conn_id
         self.verify = verify
         self.hook: Optional[S3Hook] = None
-    def poke(self, context):
+    def poke(self, context: Dict[str, Any]):
         self.log.info('Poking for prefix : %s in bucket s3://%s', self.prefix, self.bucket_name)
-        return self.get_hook().check_for_prefix(
-            prefix=self.prefix, delimiter=self.delimiter, bucket_name=self.bucket_name
-        )
+        return all(self._check_for_prefix(prefix) for prefix in self.prefix)
Review comment:
I can raise another PR to rename the parameter to `prefixes`; I had the same
thought 😄. I called this out as point 1 in the
[description](https://github.com/apache/airflow/pull/18807#issue-1020070170).
I'm curious how we handle such backward-incompatible changes. Do we update
UPDATING.md?
To support finer control than a fixed `all` or `any`, I was also suggesting
passing a callable that lets the user decide based on the per-prefix results.
The default implementation could continue to be `all`-based. I can raise another
PR for both of these changes if that makes sense.
The change will lead to an extra `__init__` parameter:
```
def __init__(
    self,
    ...,
    callback: Callable[[Dict[str, bool]], bool] = lambda prefix_available: all(prefix_available.values()),
    **kwargs,
):
    ...
    self.callback = callback

def poke(self, context: Dict[str, Any]):
    self.log.info('Poking for prefix : %s in bucket s3://%s', self.prefix, self.bucket_name)
    # The callback can return True when any, a subset, or all of the prefixes are present
    return self.callback({prefix: self._check_for_prefix(prefix) for prefix in self.prefix})
```
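To make the idea concrete outside the sensor, here is a minimal standalone sketch of the same callback pattern. It is an illustration only, not the sensor's code: `poke_prefixes`, `all_prefixes_present`, `any_prefix_present`, and `fake_check` are hypothetical names, and `fake_check` is an in-memory stand-in for the real S3 lookup.

```python
from typing import Callable, Dict, List, Union

# Hypothetical alias for the proposed callback: maps {prefix: found} to a verdict.
PrefixCallback = Callable[[Dict[str, bool]], bool]


def all_prefixes_present(prefix_available: Dict[str, bool]) -> bool:
    """Default policy: succeed only when every prefix was found."""
    return all(prefix_available.values())


def any_prefix_present(prefix_available: Dict[str, bool]) -> bool:
    """Alternative policy: succeed as soon as at least one prefix was found."""
    return any(prefix_available.values())


def poke_prefixes(
    prefix: Union[str, List[str]],
    check_for_prefix: Callable[[str], bool],
    callback: PrefixCallback = all_prefixes_present,
) -> bool:
    """Check each prefix and let the callback decide the overall result."""
    # Same normalisation as in the diff: accept a single string or a list.
    prefixes = [prefix] if isinstance(prefix, str) else prefix
    return callback({p: check_for_prefix(p) for p in prefixes})


# In-memory stand-in for the S3 prefix check (assumption, for the demo only).
EXISTING_PREFIXES = {"data/2021/", "logs/"}


def fake_check(prefix: str) -> bool:
    return prefix in EXISTING_PREFIXES


if __name__ == "__main__":
    wanted = ["data/2021/", "missing/"]
    print(poke_prefixes(wanted, fake_check))                      # default, all-based
    print(poke_prefixes(wanted, fake_check, any_prefix_present))  # any-based
    print(poke_prefixes("logs/", fake_check))                     # single string
```

With this shape, the default keeps the `all` behaviour from this PR, while a user who only needs one of several prefixes can pass an `any`-style callable instead of the sensor growing a separate flag.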
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.