Joel Croteau created AIRFLOW-4408:
-------------------------------------
Summary: Add template expansion support to properties arguments
for DataProc operators
Key: AIRFLOW-4408
URL: https://issues.apache.org/jira/browse/AIRFLOW-4408
Project: Apache Airflow
Issue Type: Improvement
Components: contrib, gcp, operators
Affects Versions: 1.10.3
Reporter: Joel Croteau
Most GCP Dataproc operators take some form of `properties` argument: a `dict`
mapping strings to strings that configures how the job is run on the cluster,
along with other useful properties. It would be very nice if we could expand
templates in the components of this `dict` in the same way we do for other
string arguments. In particular, I am enabling log aggregation on a cluster
and would like `yarn:yarn.nodemanager.remote-app-log-dir` to include the
cluster name, which itself includes a template for the run date.
It should be easy enough to iterate through the keys and values of the dict
and build a new dict from the template-expanded results. One potential problem
is that distinct templated keys could resolve to the same string after
expansion. That seems like a corner case we could handle by raising an
exception, or by expanding only the values and leaving the keys untouched, as
I'm not really sure how useful templated property names would actually be.
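
The approach described above could be sketched roughly as follows. This is
only an illustration, not Airflow code: `expand_properties` and `render` are
hypothetical names, `render` is a tiny stdlib stand-in for the Jinja rendering
an operator would really use, and the bucket path is made up. Keys are left
alone unless `expand_keys` is set, and a collision between expanded keys
raises an exception.

```python
import re
from typing import Dict, Mapping

# Stand-in renderer (assumption): replaces "{{ name }}" with context[name].
# A real operator would delegate to its existing Jinja template rendering.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, context: Mapping[str, str]) -> str:
    """Substitute simple {{ name }} placeholders from the context."""
    return _PLACEHOLDER.sub(lambda m: str(context[m.group(1)]), template)


def expand_properties(
    properties: Mapping[str, str],
    context: Mapping[str, str],
    expand_keys: bool = False,
) -> Dict[str, str]:
    """Build a new dict with template-expanded values (and optionally keys)."""
    expanded: Dict[str, str] = {}
    for key, value in properties.items():
        new_key = render(key, context) if expand_keys else key
        if new_key in expanded:
            # Two distinct templated keys resolved to the same string.
            raise ValueError("duplicate property after expansion: %r" % new_key)
        expanded[new_key] = render(value, context)
    return expanded


# Hypothetical usage: the log directory includes a run-date-based cluster name.
props = {
    "yarn:yarn.nodemanager.remote-app-log-dir":
        "gs://example-bucket/logs/cluster-{{ ds_nodash }}",
}
result = expand_properties(props, {"ds_nodash": "20190425"})
```

Expanding only the values by default sidesteps the key-collision question
entirely, while the `expand_keys` flag shows how the exception-raising
variant would behave if templated property names turn out to be useful.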
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)