o-nikolas commented on code in PR #22758:
URL: https://github.com/apache/airflow/pull/22758#discussion_r848932625
##########
airflow/providers/amazon/aws/operators/s3.py:
##########
@@ -318,6 +318,94 @@ def execute(self, context: 'Context'):
)
+class S3CreateObjectOperator(BaseOperator):
+ """
+ Creates a new object from a given string or bytes.
Review Comment:
> What I fail to understand is how this operator will get the given string? (Assuming the string is not hard coded)?
The Airflow use cases I know of (see
[here](https://github.com/apache/airflow/blob/2400de2c5ece644cadb870baeea28907fa4dcf58/airflow/providers/amazon/aws/example_dags/example_s3_to_redshift.py#L36),
[here](https://github.com/apache/airflow/blob/2400de2c5ece644cadb870baeea28907fa4dcf58/airflow/providers/amazon/aws/example_dags/example_athena.py#L44)
and
[here](https://github.com/apache/airflow/blob/2400de2c5ece644cadb870baeea28907fa4dcf58/airflow/providers/amazon/aws/example_dags/example_glue.py#L75))
all use hard-coded strings/data in the DAG file. In user DAGs, I've seen both
hard-coded and runtime-generated data; a sketch of both patterns follows.
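A minimal sketch of both patterns, assuming the parameter names (`s3_bucket`, `s3_key`, `data`) and a templated `data` field as proposed in this PR; the bucket/key names and `upstream_task` are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.s3 import S3CreateObjectOperator

with DAG(
    dag_id="s3_create_object_example",
    start_date=datetime(2022, 1, 1),
    schedule_interval=None,
) as dag:
    # Hard-coded data, as in the linked example DAGs:
    create_static = S3CreateObjectOperator(
        task_id="create_static_object",
        s3_bucket="my-bucket",  # illustrative bucket name
        s3_key="static/config.json",
        data='{"env": "prod"}',
    )

    # Runtime data pulled from XCom via templating (assumes 'data' is a
    # templated field and 'upstream_task' is a hypothetical producer):
    create_dynamic = S3CreateObjectOperator(
        task_id="create_dynamic_object",
        s3_bucket="my-bucket",
        s3_key="results/{{ ds }}.csv",
        data="{{ ti.xcom_pull(task_ids='upstream_task') }}",
    )
```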
> But for that to work it means that by some other way someone stored data
> in Xcom and then you want to create S3 object from the data stored in xcom.
> This may encourage bad practices.
What is bad practice about a workflow that consumes the output of one
operation and then writes that data to S3? Taking some output (whether that
be JSON, text, CSV, etc.) from one operation and persisting it to object
storage is a very common pipeline workflow. In the S3 case today, you must
either write an unnecessary temporary file to disk or use the S3Hook
directly. I've used both; the latter is much cleaner, but not as convenient
as a dedicated operator would be.
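For comparison, a minimal sketch of the S3Hook-direct approach described above, using the hook's existing `load_string` API inside a TaskFlow task; the bucket/key names are illustrative:

```python
from airflow.decorators import task
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


@task
def persist_to_s3(payload: str) -> None:
    # load_string uploads the given string as a new S3 object.
    S3Hook().load_string(
        string_data=payload,
        key="results/output.csv",
        bucket_name="my-bucket",
        replace=True,
    )
```

This works, but every DAG that needs it ends up re-implementing the same small wrapper, which is exactly what a dedicated operator would avoid.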
But that's just my 2c