o-nikolas commented on code in PR #22758:
URL: https://github.com/apache/airflow/pull/22758#discussion_r848932625


##########
airflow/providers/amazon/aws/operators/s3.py:
##########
@@ -318,6 +318,94 @@ def execute(self, context: 'Context'):
         )
 
 
+class S3CreateObjectOperator(BaseOperator):
+    """
+    Creates a new object from a given string or bytes.

Review Comment:
   > What I fail to understand is how this operator will get the given string? 
(Assuming the string is not hard coded)?
   
   The Airflow use cases I know of (see [here](https://github.com/apache/airflow/blob/2400de2c5ece644cadb870baeea28907fa4dcf58/airflow/providers/amazon/aws/example_dags/example_s3_to_redshift.py#L36), [here](https://github.com/apache/airflow/blob/2400de2c5ece644cadb870baeea28907fa4dcf58/airflow/providers/amazon/aws/example_dags/example_athena.py#L44) and [here](https://github.com/apache/airflow/blob/2400de2c5ece644cadb870baeea28907fa4dcf58/airflow/providers/amazon/aws/example_dags/example_glue.py#L75)) all use hard-coded strings/data in the DAG file.
   
   In user DAGs, I've seen both hard-coded data and data produced at runtime; the hard-coded case might look something like the sketch below.
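
   Rough sketch of what I'd expect the hard-coded case to look like (the `s3_bucket`/`s3_key`/`data` parameter names are just my guess at this operator's interface, not taken from the diff):

```python
from airflow.providers.amazon.aws.operators.s3 import S3CreateObjectOperator

# Hard-coded data, mirroring the example DAGs linked above.
# Parameter names below are assumed, not confirmed by this PR.
create_sample_data = S3CreateObjectOperator(
    task_id="create_sample_data",
    s3_bucket="my-bucket",             # assumed parameter name
    s3_key="samples/sample.csv",       # assumed parameter name
    data="0,Airflow\n1,is\n2,neat\n",  # assumed parameter name
)
```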
   
   > But for that to work it means that by some other way someone stored data 
in Xcom and then you want to create S3 object from the data stored in xcom. 
This may encourage bad practices.
   
   What is bad practice about a workflow that consumes the output of one operation and then writes that data to S3? Taking some output (whether JSON, text, CSV, etc.) from one operation and persisting it to object storage is a very common pipeline pattern. In the S3 case today you either have to write an unnecessary temporary file to disk or use the S3Hook directly. I've used both; the latter is much cleaner, but still not as convenient as a dedicated operator would be.
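
   To make that concrete, the "clean" option today is roughly the following (`S3Hook.load_string` is existing API; the `generate_report` upstream task is hypothetical, and the commented-out operator version is my guess at how this PR's operator could read if `data` ends up being a templated field):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook


def _upload_report(ti):
    # Pull the upstream task's output from XCom and push it to S3 via the hook.
    report = ti.xcom_pull(task_ids="generate_report")  # hypothetical upstream task
    S3Hook().load_string(
        string_data=report,
        key="reports/latest.csv",
        bucket_name="my-bucket",
        replace=True,
    )


with DAG("report_pipeline", start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
    upload_report = PythonOperator(task_id="upload_report", python_callable=_upload_report)

    # With the proposed operator this could collapse to something like the below,
    # assuming `data` is templated (parameter names are a guess, not from the diff):
    #
    # upload_report = S3CreateObjectOperator(
    #     task_id="upload_report",
    #     s3_bucket="my-bucket",
    #     s3_key="reports/latest.csv",
    #     data="{{ ti.xcom_pull(task_ids='generate_report') }}",
    # )
```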
   
   But that's just my 2c


