yehoshuadimarsky commented on PR #27947: URL: https://github.com/apache/airflow/pull/27947#issuecomment-1403969694
@dstandish I think the Data API of Redshift is a bit different from some of the other AWS services. The regular SQL-based hook for Redshift requires a Redshift conn_id that contains all of the information needed to connect to a given database. But the Data API requires you to specify the cluster identifier, database name, and database user **in the operation itself** when making a call, such as in the [execute_statement](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/redshift-data.html#RedshiftDataAPIService.Client.execute_statement) method. This means that if we just delegate all methods down to the underlying boto3 client, every upstream user of the Redshift Data API hook would have to tediously repeat all of these connection parameters in every method they create or invoke.

I think there are two approaches we can take here.

1. Add more parameters to the hook itself, so that when creating a hook you only need to specify these connection args once. Yes, that adds another client layer like you were discouraging. Another downside is that a single hook instance can only be used for a single Redshift destination.
2. Do nothing, and expect the end user to pass these params through each function call. I guess a similar service would be S3, where each call can specify a different bucket and/or prefix key.

I'm not sure what the optimal approach is going forward. As you suggested, I started the work on the actual transfer operators with S3, and quickly ran into this question of how to model these "extra" connection params that the Data API needs.

What do you (and the greater Airflow community) think or suggest?
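
To make the trade-off concrete, here is a rough sketch of what the two approaches could look like from the caller's side. It is only an illustration, not the code in this PR: `FakeRedshiftDataClient` and `DataApiHookSketch` are hypothetical names, and the fake client stands in for `boto3.client("redshift-data")` so the example runs without AWS credentials. The `ClusterIdentifier`, `Database`, `DbUser`, and `Sql` keyword names are the ones `execute_statement` accepts. The per-call overrides are an assumption on my part about how approach 1 could soften the single-destination downside.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


class FakeRedshiftDataClient:
    """Hypothetical stand-in for boto3.client("redshift-data").

    It records the keyword arguments of each call instead of contacting AWS.
    """

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def execute_statement(self, **kwargs: Any) -> Dict[str, Any]:
        self.calls.append(kwargs)
        return {"Id": f"statement-{len(self.calls)}"}


@dataclass
class DataApiHookSketch:
    """Hypothetical hook for approach 1: connection args are supplied once."""

    client: Any
    cluster_identifier: Optional[str] = None
    database: Optional[str] = None
    db_user: Optional[str] = None

    def execute_query(
        self,
        sql: str,
        *,
        cluster_identifier: Optional[str] = None,
        database: Optional[str] = None,
        db_user: Optional[str] = None,
    ) -> str:
        # Per-call values win; otherwise fall back to the hook-level defaults.
        params = {
            "ClusterIdentifier": cluster_identifier or self.cluster_identifier,
            "Database": database or self.database,
            "DbUser": db_user or self.db_user,
        }
        missing = [name for name, value in params.items() if value is None]
        if missing:
            raise ValueError(f"Missing connection parameters: {missing}")
        response = self.client.execute_statement(Sql=sql, **params)
        return response["Id"]


client = FakeRedshiftDataClient()

# Approach 2: the caller repeats the connection parameters on every call.
client.execute_statement(
    ClusterIdentifier="my-cluster", Database="dev", DbUser="awsuser", Sql="SELECT 1"
)

# Approach 1: the hook carries the connection parameters.
hook = DataApiHookSketch(
    client, cluster_identifier="my-cluster", database="dev", db_user="awsuser"
)
statement_id = hook.execute_query("SELECT 2")

# A per-call override lets one hook instance reach a second database.
other_id = hook.execute_query("SELECT 3", database="analytics")
```

With approach 2 the call site stays a thin pass-through to boto3 but every caller carries three extra arguments. With approach 1 the hook grows a small amount of state, and the optional overrides are one way to avoid tying an instance to a single destination.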
