yehoshuadimarsky commented on PR #27947:
URL: https://github.com/apache/airflow/pull/27947#issuecomment-1403969694

   @dstandish 
   I think the Data API of Redshift is a bit different from some of the other AWS 
services. The regular SQL-based hook for Redshift requires a Redshift conn_id 
that contains all of the information needed to connect to a given database. But 
the Data API requires you to specify the cluster identifier, database name, and 
database user **in the operation itself** when making a call, such as in the 
[execute_statement](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/redshift-data.html#RedshiftDataAPIService.Client.execute_statement)
 method. This means that if we just delegate any methods down to the underlying 
boto3 client, every upstream user of the Redshift Data API hook would have to 
tediously repeat all of these connection parameters in every method they 
create/invoke. 
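   
   To make the repetition concrete, here is a rough sketch of what calling code 
looks like when everything is delegated straight to the client. The 
`RecordingClient` class is a hypothetical stand-in for 
`boto3.client("redshift-data")` so the snippet runs without AWS access, and the 
cluster, database, and user values are placeholders:

```python
from typing import Any


class RecordingClient:
    """Hypothetical stand-in for boto3.client("redshift-data").

    It records the keyword arguments of each call instead of contacting AWS.
    """

    def __init__(self) -> None:
        self.calls = []

    def execute_statement(self, **kwargs: Any) -> dict:
        self.calls.append(kwargs)
        # The real API also returns a statement "Id"; this one is made up.
        return {"Id": f"statement-{len(self.calls)}"}


def create_then_count(client: Any) -> list:
    # Every call has to restate the same three destination parameters.
    first = client.execute_statement(
        ClusterIdentifier="my-cluster",  # placeholder
        Database="dev",                  # placeholder
        DbUser="awsuser",                # placeholder
        Sql="CREATE TABLE IF NOT EXISTS t (id INT)",
    )
    second = client.execute_statement(
        ClusterIdentifier="my-cluster",
        Database="dev",
        DbUser="awsuser",
        Sql="SELECT COUNT(*) FROM t",
    )
    return [first["Id"], second["Id"]]


client = RecordingClient()
statement_ids = create_then_count(client)
```

   With a real client the shape of the calls would be the same; only the 
stand-in class would go away.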
   
   I think there are two approaches we can take here.
   1. Add more parameters to the hook itself, so that when creating a hook you 
only need to specify these connection args once. Yes, that adds another client 
layer like you were discouraging. Also, a downside of this is that a single 
hook instance can only be used for a single Redshift destination.
   2. Do nothing, and indeed expect the end user to pass these params around 
through each function call. I guess a similar service to this would be S3, 
where each call can specify using a different bucket and/or prefix key.
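   
   As a rough illustration of option 1, the sketch below binds the destination 
parameters once and reuses them on every call. `RedshiftDataTarget` and 
`FakeClient` are hypothetical names invented for this example, not anything in 
the PR or in Airflow. The per-call `overrides` argument is my own assumption 
about how the single-destination downside might be softened:

```python
from dataclasses import dataclass
from typing import Any, Optional


class FakeClient:
    """Hypothetical stand-in for boto3.client("redshift-data"); records calls."""

    def __init__(self) -> None:
        self.calls = []

    def execute_statement(self, **kwargs: Any) -> dict:
        self.calls.append(kwargs)
        return {"Id": f"statement-{len(self.calls)}"}


@dataclass
class RedshiftDataTarget:
    """Hypothetical wrapper that stores the destination parameters once."""

    client: Any
    cluster_identifier: str
    database: str
    db_user: Optional[str] = None

    def execute(self, sql: str, **overrides: Any) -> str:
        # Start from the stored defaults ...
        kwargs = {
            "ClusterIdentifier": self.cluster_identifier,
            "Database": self.database,
            "Sql": sql,
        }
        if self.db_user is not None:
            kwargs["DbUser"] = self.db_user
        # ... and let an individual call override any of them (assumption).
        kwargs.update(overrides)
        return self.client.execute_statement(**kwargs)["Id"]


fake = FakeClient()
target = RedshiftDataTarget(fake, "my-cluster", "dev", "awsuser")
default_id = target.execute("SELECT 1")
override_id = target.execute("SELECT 1", Database="analytics")
```

   Option 2 corresponds to dropping the wrapper and passing the same keyword 
arguments to the client on every call.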
   
   Not sure what the optimal approach is here going forward. As you suggested, 
I indeed started the work on the actual transfer operators with S3, and quickly 
ran into this question of how to model these "extra" connection params that the 
Data API needs.
   
   What do you (and the greater Airflow community) think or suggest?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
