nicolas-settembrini commented on issue #16911:
URL: https://github.com/apache/airflow/issues/16911#issuecomment-992769308
Hi, sorry to write here, but I couldn't find anywhere else discussing this.
I am using Version: 2.1.4+composer, and I have a DAG where I define the
DataprocClusterCreateOperator like this:
    from airflow.contrib.operators import dataproc_operator

    create_dataproc = dataproc_operator.DataprocClusterCreateOperator(
        task_id='create_dataproc',
        cluster_name='dataproc-cluster-demo-{{ ds_nodash }}',
        num_workers=2,
        region='us-east4',
        zone='us-east4-a',
        subnetwork_uri='projects/example',
        internal_ip_only=True,
        tags=['allow-iap-ssh'],
        init_actions_uris=['gs://goog-dataproc-initialization-actions-us-east4/connectors/connectors.sh'],
        metadata=[('spark-bigquery-connector-url', 'gs://spark-lib/bigquery/spark-2.4-bigquery-0.23.1-preview.jar')],
        labels=dict(equipo='dm', ambiente='dev', etapa='datapreparation', producto='x', modelo='x'),
        master_machine_type='n1-standard-1',
        worker_machine_type='n1-standard-1',
        image_version='1.5-debian10',
    )
I passed the metadata as a sequence of tuples, as suggested earlier in this
issue, since passing it as a dict does not work. Even so, the metadata is not
being rendered into the cluster_config.
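In case it helps anyone hitting the same thing, one pattern that may sidestep
the operator-level metadata parameter is building the cluster_config up front
with the google provider's ClusterGenerator, which accepts metadata as a plain
dict and places it under gce_cluster_config in the generated config. A minimal
sketch, assuming apache-airflow-providers-google is installed and using a
hypothetical project ID 'my-project' (not from the original post):

    from airflow.providers.google.cloud.operators.dataproc import (
        ClusterGenerator,
        DataprocCreateClusterOperator,
    )

    # ClusterGenerator takes metadata as a dict and renders it into
    # gce_cluster_config inside the config dict returned by make().
    cluster_config = ClusterGenerator(
        project_id='my-project',  # hypothetical project ID
        zone='us-east4-a',
        num_workers=2,
        subnetwork_uri='projects/example',
        internal_ip_only=True,
        tags=['allow-iap-ssh'],
        init_actions_uris=['gs://goog-dataproc-initialization-actions-us-east4/connectors/connectors.sh'],
        metadata={'spark-bigquery-connector-url': 'gs://spark-lib/bigquery/spark-2.4-bigquery-0.23.1-preview.jar'},
        master_machine_type='n1-standard-1',
        worker_machine_type='n1-standard-1',
        image_version='1.5-debian10',
    ).make()

    create_dataproc = DataprocCreateClusterOperator(
        task_id='create_dataproc',
        cluster_name='dataproc-cluster-demo-{{ ds_nodash }}',
        project_id='my-project',  # hypothetical project ID
        region='us-east4',
        cluster_config=cluster_config,
        labels=dict(equipo='dm', ambiente='dev', etapa='datapreparation', producto='x', modelo='x'),
    )

I don't know whether this is the workaround pateash had in mind, so take it
as a sketch rather than a confirmed fix.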
@pateash could you please explain your workaround in more detail? In which
part of the DAG should I apply it?
Thanks in advance.