I have been thinking about a Redshift reader/writer, basically wrapping UNLOAD and COPY in a PTransform. For example, the steps to UNLOAD into a PCollection would be:
1) JDBC to Redshift - UNLOAD <http://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD.html> TO 's3://bucket/tmp-prefix'
2) S3 to PCollection - work in progress <https://github.com/Kochava/beam-s3>
3) delete tmp files from S3

To implement steps 1 and 3, I can't see a way to perform a task exactly once, globally, in a PTransform. Sure, I could do those steps in main() or even in a separate script, but the result isn't code that can be shared and reused very well.

Am I missing something? This seems like the kind of problem that I shouldn't be the first to encounter.

Thanks,
Jacob
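
P.S. For concreteness, here is a rough sketch of the three steps done on the driver side, i.e. the "in main()" version, in plain Python with no Beam involved. The helper names (build_unload_sql, unload_and_read) and the three callables passed in are hypothetical placeholders, not an existing API, and the IAM_ROLE authorization clause is an assumption about how the cluster is set up.

```python
from typing import Callable, Iterable, List


def build_unload_sql(query: str, s3_prefix: str, iam_role: str) -> str:
    """Build the Redshift UNLOAD statement for step 1 (hypothetical helper)."""
    # The inner query is passed to UNLOAD as a quoted string, so any single
    # quotes inside it are doubled.
    escaped = query.replace("'", "''")
    return f"UNLOAD ('{escaped}') TO '{s3_prefix}' IAM_ROLE '{iam_role}'"


def unload_and_read(
    run_sql: Callable[[str], None],               # e.g. a JDBC/DB-API execute wrapper
    read_prefix: Callable[[str], Iterable[str]],  # e.g. an S3 reader, or the pipeline run
    delete_prefix: Callable[[str], None],         # e.g. an S3 delete-by-prefix wrapper
    query: str,
    s3_prefix: str,
    iam_role: str,
) -> List[str]:
    """Run steps 1-3 in order; steps 1 and 3 each happen once per call."""
    run_sql(build_unload_sql(query, s3_prefix, iam_role))  # step 1
    try:
        return list(read_prefix(s3_prefix))                # step 2
    finally:
        delete_prefix(s3_prefix)                           # step 3, even if step 2 fails
```

This works, but the once-only guarantee comes from running in the driver rather than inside a transform, which is exactly the part I would like to avoid.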
