[
https://issues.apache.org/jira/browse/AIRFLOW-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16742936#comment-16742936
]
Daniel Lamblin commented on AIRFLOW-2862:
-----------------------------------------
This would be a breaking change, so it would need to be held back for the 2.0
release (if breaking changes are acceptable then), made non-default behavior
via an option, or implemented as a separate, slightly differently named operator.
TBH if these were my Airflow deployments, I would do something like:
* Copy this operator and name it something like S3ToRedshiftTransfer2,
* put the more sensible change (it's a good suggestion) into that copy, written
so that the existing S3ToRedshiftTransfer operator can subclass the
S3ToRedshiftTransfer2 operator, and
* override the command template in the subclass named S3ToRedshiftTransfer to
preserve the existing behavior (I know that sounds backward, but...); then
* when 2.0 is released:
** rename the S3ToRedshiftTransfer2 operator back to S3ToRedshiftTransfer,
** rename the subclassed operator to S3ToRedshiftTransferDeprecated, and
** leave its implementation in documentation only, for users who are upgrading
and can't update some number of DAGs.
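A rough sketch of the subclassing idea, using plain Python stand-ins rather than the real airflow.operators code (the class names follow the plan above; the class-level template and its rendering are my assumption about how this would be wired up):

```python
class S3ToRedshiftTransfer2:
    """New-style operator: loads exactly the S3 key it is given."""

    copy_template = (
        "COPY {schema}.{table}\n"
        "FROM 's3://{s3_bucket}/{s3_key}'\n"
        "with credentials\n"
        "'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'\n"
        "{copy_options};"
    )

    def __init__(self, schema, table, s3_bucket, s3_key, copy_options=""):
        self.schema = schema
        self.table = table
        self.s3_bucket = s3_bucket
        self.s3_key = s3_key
        self.copy_options = copy_options

    def build_copy_query(self, access_key, secret_key):
        # Render the class-level template; a subclass that overrides
        # copy_template automatically gets its own SQL.
        return self.copy_template.format(
            schema=self.schema,
            table=self.table,
            s3_bucket=self.s3_bucket,
            s3_key=self.s3_key,
            access_key=access_key,
            secret_key=secret_key,
            copy_options=self.copy_options,
        )


class S3ToRedshiftTransfer(S3ToRedshiftTransfer2):
    """Legacy-behavior subclass: keeps appending /{table} to the key."""

    copy_template = S3ToRedshiftTransfer2.copy_template.replace(
        "{s3_bucket}/{s3_key}", "{s3_bucket}/{s3_key}/{table}"
    )
```

At the 2.0 release the names would then swap as described above, with only the deprecated template surviving in documentation.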
> S3ToRedshiftTransfer Copy Command Flexibility
> ---------------------------------------------
>
> Key: AIRFLOW-2862
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2862
> Project: Apache Airflow
> Issue Type: Improvement
> Components: operators
> Reporter: Micheal Ascah
> Assignee: Micheal Ascah
> Priority: Minor
>
> Currently, the S3ToRedshiftTransfer class requires that the name of the
> target table be suffixed to the end of the S3 key provided.
> It doesn't seem justifiable for the operator to require the file to be named
> by any particular convention; the S3 bucket plus S3 key should be all that is
> needed.
> This makes it possible to load any S3 Key into a Redshift table, rather than
> only files that have the table name at the end of the S3 key.
> The S3 key parameter should also be template-able so that files created in S3
> using timestamps from macros in other tasks in the current DAG run can be
> used to identify files when loading from S3 to Redshift.
> The command template should change from
> {code:java}
> COPY {schema}.{table}
> FROM 's3://{s3_bucket}/{s3_key}/{table}'
> with credentials
> 'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
> {copy_options};
> {code}
> To
>
> {code:java}
> COPY {schema}.{table}
> FROM 's3://{s3_bucket}/{s3_key}'
> with credentials
> 'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
> {copy_options};
> {code}
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)