ashb commented on a change in pull request #6396: [AIRFLOW-5726] Delete table as file name in RedshiftToS3Transfer
URL: https://github.com/apache/airflow/pull/6396#discussion_r345357187
##########
File path: airflow/operators/redshift_to_s3_operator.py
##########
@@ -103,19 +108,33 @@ def execute(self, context):
credentials = s3_hook.get_credentials()
unload_options = '\n\t\t\t'.join(self.unload_options)
select_query = "SELECT * FROM {schema}.{table}".format(schema=self.schema, table=self.table)
- unload_query = """
- UNLOAD ('{select_query}')
- TO 's3://{s3_bucket}/{s3_key}/{table}_'
- with credentials
- 'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
- {unload_options};
- """.format(select_query=select_query,
- table=self.table,
- s3_bucket=self.s3_bucket,
- s3_key=self.s3_key,
- access_key=credentials.access_key,
- secret_key=credentials.secret_key,
- unload_options=unload_options)
+ if self.table_as_file_name:
Review comment:
There's a lot of duplication between these branches. Instead of duplicating
all of this, I would like to see something like:
```
s3_key = '{}/{}_'.format(self.s3_key, self.table) if self.table_as_file_name else self.s3_key
```
Then the unload_query is the same for either path.
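For illustration, a minimal sketch of what the consolidated tail of `execute` could look like under this suggestion, assuming the `table_as_file_name` flag introduced in this PR and the variables already in scope in the hunk above (`select_query`, `credentials`, `unload_options`); this is a sketch of the reviewer's proposal, not the committed implementation:
```
# Build the key once: append '<table>_' only when table_as_file_name is set,
# so the UNLOAD statement below no longer needs two branches.
s3_key = '{}/{}_'.format(self.s3_key, self.table) if self.table_as_file_name else self.s3_key

# A single UNLOAD query now serves both paths; note the TO clause uses the
# precomputed s3_key instead of interpolating the table name itself.
unload_query = """
            UNLOAD ('{select_query}')
            TO 's3://{s3_bucket}/{s3_key}'
            with credentials
            'aws_access_key_id={access_key};aws_secret_access_key={secret_key}'
            {unload_options};
            """.format(select_query=select_query,
                       s3_bucket=self.s3_bucket,
                       s3_key=s3_key,
                       access_key=credentials.access_key,
                       secret_key=credentials.secret_key,
                       unload_options=unload_options)
```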
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services