[
https://issues.apache.org/jira/browse/AIRFLOW-820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641164#comment-16641164
]
Kaxil Naik commented on AIRFLOW-820:
------------------------------------
At the time of writing, `DataProcClusterCreateOperator` has already been
standardized to use `gcp_conn_id` instead of `google_cloud_conn_id`. As far as
I can remember, the only service that still uses a different conn id is
BigQuery. That is because we use it in the GCStoBQ operator: if we need
separate connections for the two services, having a single conn id for both
is problematic.
Regarding [~dlamblin]'s comment:
Having `source_conn_id` and `target_conn_id` defeats the main objective of
this JIRA, which is to pass the connections in default args. The
`source_conn_id` and `target_conn_id` for `GCStoBQ` will be different from
those of `GCStoS3`.
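To illustrate the point about default args, here is a minimal sketch in plain
Python that does not import Airflow. The `build` helper and the three operator
classes are hypothetical stand-ins, and the helper only approximates how
DAG-level default args are applied: a default is passed on only if the
operator's constructor accepts an argument with that name. The argument name
`big_query_conn_id` is spelled as in the issue description below.

```python
import inspect


def build(operator_cls, default_args, **explicit):
    """Approximate DAG-level default_args: pass a default only when the
    operator's constructor accepts an argument with that name."""
    accepted = inspect.signature(operator_cls.__init__).parameters
    kwargs = {k: v for k, v in default_args.items() if k in accepted}
    kwargs.update(explicit)  # explicitly passed arguments win over defaults
    return operator_cls(**kwargs)


class StandardizedOperator:
    """Hypothetical operator that uses the standardized argument name."""

    def __init__(self, task_id, gcp_conn_id="google_cloud_default"):
        self.task_id = task_id
        self.gcp_conn_id = gcp_conn_id


class LegacyBigQueryOperator:
    """Hypothetical operator that still uses a service-specific name."""

    def __init__(self, task_id, big_query_conn_id="bigquery_default"):
        self.task_id = task_id
        self.big_query_conn_id = big_query_conn_id


class TransferOperator:
    """Hypothetical transfer operator with source/target connection ids."""

    def __init__(self, task_id,
                 source_conn_id="google_cloud_default",
                 target_conn_id="google_cloud_default"):
        self.task_id = task_id
        self.source_conn_id = source_conn_id
        self.target_conn_id = target_conn_id


# One shared connection id, specified once at the DAG level.
default_args = {"gcp_conn_id": "my_gcp_connection"}

standardized = build(StandardizedOperator, default_args, task_id="a")
legacy = build(LegacyBigQueryOperator, default_args, task_id="b")
transfer = build(TransferOperator, default_args, task_id="c")
```

With these stand-ins, only `standardized` picks up `my_gcp_connection`. The
other two silently fall back to their own defaults, because neither accepts
an argument named `gcp_conn_id`.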
> Standardize GCP related connection id names and default values
> ----------------------------------------------------------------
>
> Key: AIRFLOW-820
> URL: https://issues.apache.org/jira/browse/AIRFLOW-820
> Project: Apache Airflow
> Issue Type: Improvement
> Components: contrib
> Reporter: Feng Lu
> Assignee: Feng Lu
> Priority: Major
>
> A number of Google Cloud Platform (GCP) related operators, such as
> BigQueryCheckOperator or DataFlowJavaOperator, are using different
> connection_id var names and default values. For example,
> BigQueryCheckOperator(..., big_query_conn_id='bigquery_default', ...)
> DataFlowJavaOperator(..., gcp_conn_id='google_cloud_default', ...)
> DataProcClusterCreateOperator(..., google_cloud_conn_id='google_cloud_default', ...)
> This makes DAG-level default_args problematic: one would have to specify each
> connection_id explicitly in the default_args even though the same GCP
> connection is shared throughout the DAG. We propose to:
> - standardize all connection id names, e.g.,
>   big_query_conn_id --> gcp_conn_id
>   google_cloud_conn_id --> gcp_conn_id
> - standardize all default values, e.g.,
>   'bigquery_default' --> 'google_cloud_default'
> Therefore, if the same GCP connection is used, we only need to specify once
> in the DAG default_args, e.g.,
> default_args = {
> ...
> 'gcp_conn_id': 'some_gcp_connection_id',
> ...
> }
> Better still, if a connection with the default name 'google_cloud_default'
> has already been created and used by all GCP operators, the gcp_conn_id
> doesn't even need to be specified in DAG default_args.
>