syun64 opened a new issue, #28272: URL: https://github.com/apache/airflow/issues/28272
### Apache Airflow Provider(s)

amazon

### Versions of Apache Airflow Providers

apache-airflow-providers-amazon==6.2.0

### Apache Airflow version

2.5.0

### Operating System

Red Hat Enterprise Linux Server 7.6 (Maipo)

### Deployment

Virtualenv installation

### Deployment details

Simple virtualenv deployment

### What happened

`bucket_key` is a template field in `S3KeySensor`, which means it is expected to be rendered when the task instance is created. The attribute supports both `str` and `list` inputs. There is also a [conditional operation in the `__init__` function](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/sensors/s3.py#L89) of the class that relies on the type of the input: a `str` input is wrapped into a single-element list.

If a list of `str` is passed in through a Jinja template, **self.bucket_key** ends up as a _**doubly nested list of strings**_ rather than a list of strings. This is because, when used as a template field, the input value of **bucket_key** can only be a `str` holding the template string, which `__init__` wraps in a list; the template fields are then rendered in place when the task instance is created.

Example log from the `__init__` function:

`scheduler | DEBUG | type: <class 'list'> | val: ["{{ ti.xcom_pull(task_ids='t1') }}"]`

Example log from the `poke` function:

`poke | DEBUG | type: <class 'list'> | val: [["s3://test_bucket/test_key1", "s3://test_bucket/test_key2"]]`

This leads to the `poke` function throwing an [exception](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/hooks/s3.py#L172), because each individual key must be a `str` so its URL can be parsed, but it is instead passed as a list (since `self.bucket_key` is a nested list).

### What you think should happen instead

Instead of wrapping the input value of **bucket_key** in a list, we should store the value as-is upon initialization of the class, and conditionally check the type of the attribute within the `poke` function.
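For context, the double nesting described under "What happened" can be simulated outside Airflow. This is an illustrative sketch, not actual Airflow code; `init_bucket_key` and `render_templates` are stand-in names for the `S3KeySensor.__init__` conditional and Jinja rendering with `render_template_as_native_obj=True`:

```python
# Minimal simulation (not actual Airflow code) of how the double nesting
# arises; names here are illustrative stand-ins for S3KeySensor internals.

def init_bucket_key(bucket_key):
    # mirrors the conditional in S3KeySensor.__init__:
    # a str input is wrapped into a single-element list
    return [bucket_key] if isinstance(bucket_key, str) else bucket_key

def render_templates(stored, rendered_value):
    # mirrors template rendering with render_template_as_native_obj=True:
    # each element of the stored list is rendered independently, so a
    # template string that renders to a list produces a nested list
    return [rendered_value if isinstance(v, str) and "{{" in v else v
            for v in stored]

template = "{{ ti.xcom_pull(task_ids='t1') }}"
xcom_value = ["s3://test_bucket/test_key1", "s3://test_bucket/test_key2"]

stored = init_bucket_key(template)   # ["{{ ti.xcom_pull(task_ids='t1') }}"]
rendered = render_templates(stored, xcom_value)
print(rendered)
# [['s3://test_bucket/test_key1', 's3://test_bucket/test_key2']]
```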
[def \_\_init\_\_](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/sensors/s3.py#L89):

`self.bucket_key = bucket_key`

(which will store the input value correctly as a `str` or a `list` once the task instance is created and the template fields are rendered)

[def poke](https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/sensors/s3.py#L127):

```
def poke(self, context: Context):
    if isinstance(self.bucket_key, str):
        return self._check_key(self.bucket_key)
    else:
        return all(self._check_key(key) for key in self.bucket_key)
```

### How to reproduce

1. Use a template field as the `bucket_key` attribute in `S3KeySensor`.
2. Pass a list of strings as the rendered template input value for the `bucket_key` attribute in the `S3KeySensor` task (e.g. a value pulled from an XCom or a Variable).

Example:

```
with DAG(
    ...
    render_template_as_native_obj=True,
) as dag:

    @task(task_id="get_list_of_str", do_xcom_push=True)
    def get_list_of_str():
        return ["s3://test_bucket/test_key1", "s3://test_bucket/test_key2"]

    t = get_list_of_str()
    op = S3KeySensor(task_id="s3_key_sensor", bucket_key="{{ ti.xcom_pull(task_ids='get_list_of_str') }}")
    t >> op
```

### Anything else

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
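The proposed fix can be exercised end-to-end with a small stand-alone sketch. `S3KeySensorSketch` and its `_check_key` are hypothetical stand-ins (the real `_check_key` parses the S3 URL via the hook), used only to show that storing `bucket_key` as-is lets `poke` handle both a rendered `str` and a rendered `list`:

```python
# Hedged sketch of the proposed fix, with stand-in names: bucket_key is
# stored as-is, and poke() branches on its type.

class S3KeySensorSketch:
    def __init__(self, bucket_key):
        # store as-is; after template rendering this is a str or a list
        self.bucket_key = bucket_key

    def _check_key(self, key):
        # placeholder for the real check, which parses the S3 URL and
        # therefore requires each key to be a str
        return isinstance(key, str) and key.startswith("s3://")

    def poke(self, context=None):
        if isinstance(self.bucket_key, str):
            return self._check_key(self.bucket_key)
        return all(self._check_key(key) for key in self.bucket_key)

# a single rendered str and a rendered list both work; the current
# doubly nested list fails because each "key" is itself a list
print(S3KeySensorSketch("s3://test_bucket/test_key1").poke())    # True
print(S3KeySensorSketch(["s3://test_bucket/test_key1",
                         "s3://test_bucket/test_key2"]).poke())  # True
```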