[ 
https://issues.apache.org/jira/browse/AIRFLOW-1663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196430#comment-16196430
 ] 

Andy Hadjigeorgiou commented on AIRFLOW-1663:
---------------------------------------------

It may make more sense to extend the Postgres connection to a 'RedshiftDB' 
connection and reserve the 'Redshift' keyword for Redshift cluster management 
(as opposed to queries). This would keep the naming consistent with any 
boto-based hooks & operators.
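
For illustration only, here is a minimal sketch of how a COPY statement could 
be assembled once the AWS keys live next to the database credentials in such a 
connection. The names RedshiftDBCredentials and build_copy_sql are 
hypothetical and are not part of Airflow; the table, bucket, and key values 
are placeholders. It only builds the SQL string and does no escaping or 
validation of its inputs.

```python
from dataclasses import dataclass


@dataclass
class RedshiftDBCredentials:
    """Hypothetical holder for AWS keys kept alongside the DB connection."""
    aws_access_key_id: str
    aws_secret_access_key: str


def build_copy_sql(table, s3_path, creds, copy_options=""):
    """Return a Redshift COPY statement loading `table` from `s3_path`.

    Sketch only: inputs are interpolated as-is, so they must come from a
    trusted source (no quoting or validation is performed here).
    """
    # Redshift accepts key-based credentials as a single quoted string.
    credentials = (
        f"aws_access_key_id={creds.aws_access_key_id};"
        f"aws_secret_access_key={creds.aws_secret_access_key}"
    )
    sql = f"COPY {table} FROM '{s3_path}' CREDENTIALS '{credentials}'"
    if copy_options:
        # Optional trailing options such as "CSV GZIP".
        sql += f" {copy_options}"
    return sql + ";"


if __name__ == "__main__":
    creds = RedshiftDBCredentials("AKIA_EXAMPLE", "SECRET_EXAMPLE")
    print(build_copy_sql("public.events", "s3://my-bucket/events/", creds, "CSV GZIP"))
```

A hook extending PostgresHook could presumably pass the resulting string to 
its existing run() method, so the keys would never need to appear in a DAG 
file.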

> Redshift Connection, Hook, & Operator for COPY command usability
> ----------------------------------------------------------------
>
>                 Key: AIRFLOW-1663
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1663
>             Project: Apache Airflow
>          Issue Type: New Feature
>          Components: hooks, operators
>            Reporter: Andy Hadjigeorgiou
>            Assignee: Andy Hadjigeorgiou
>            Priority: Minor
>
> I'm using Redshift as a data warehouse in conjunction with Airflow, and I 
> found that it wasn't immediately apparent that Airflow had the 
> hooks/connections to support Redshift. In practice, because Redshift is based 
> on Postgres, a Postgres hook works for basic commands. However, when 
> running a COPY command (built into Redshift to load data in parallel), 
> more work is necessary to include AWS credentials (ideally credentials aren't 
> in version control, but in a connection). Redshift's unload-to-S3 feature 
> would also benefit from a solution where credentials could be stored in a 
> connection.
>
> My proposed solution is to add a Redshift connection that will allow us 
> to include AWS credentials along with Redshift DB connection credentials 
> (similar to an S3 connection). From here, I'll create an appropriate 
> RedshiftHook (probably an extension of PostgresHook) and a RedshiftOperator, 
> with means to simplify Redshift SQL queries that need AWS credentials (& 
> perhaps using psycopg2's copy_expert method).
>
> It's my first time posting here, and I'm looking to contribute meaningfully - 
> any feedback regarding this feature would be much appreciated! I read that 
> features which involve contributing new hooks & operators are welcome, and 
> features in line with the project Roadmap are ideal ("Adding features already 
> offered by existing workflow solutions (i.e. we need to add expected 
> features)"). Currently, Airflow only supports Redshift because of its basis 
> on Postgres, but more native support would be in line with the features of 
> other workflow solutions and would attract more Redshift users.
>
> I've already started work on this feature; once I clean it up, I'll post it 
> here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
