Zen Yui created AIRFLOW-4923:
--------------------------------
Summary: Databricks hook logs API secret
Key: AIRFLOW-4923
URL: https://issues.apache.org/jira/browse/AIRFLOW-4923
Project: Apache Airflow
Issue Type: Bug
Components: contrib, hooks
Affects Versions: 1.10.3
Reporter: Zen Yui
The Databricks operator logs API keys during task instance runs. The Databricks
operator implementation encourages users to put their API key in the connection
"extra" field ([link to
docstring|https://github.com/apache/airflow/blob/1.10.3/airflow/contrib/operators/databricks_operator.py#L201-L204]),
and its accompanying Databricks hook invokes BaseHook.get_connection(), which
logs that "extra" field in plaintext via the [models.Connection.debug_info
method|https://github.com/apache/airflow/blob/1.10.3/airflow/models/connection.py#L271-L280].
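
The leak described above can be reproduced in isolation with a minimal sketch. It assumes a simplified stand-in for the connection model: FakeConnection, its fields, and safe_debug_info are hypothetical names used for illustration, not Airflow code. The stand-in masks the password but prints "extra" verbatim, which is the behavior this issue reports.

```python
import json
from dataclasses import dataclass
from typing import Optional

MASK = "XXXXXXXX"


@dataclass
class FakeConnection:
    """Hypothetical stand-in for a connection record; not the Airflow model."""
    conn_id: str
    host: Optional[str] = None
    login: Optional[str] = None
    password: Optional[str] = None
    extra: Optional[str] = None  # JSON string, as stored in the "extra" field

    @property
    def extra_dejson(self) -> dict:
        # Parse the JSON "extra" field; an empty field yields an empty dict.
        return json.loads(self.extra) if self.extra else {}

    def debug_info(self) -> str:
        # Password is masked, but "extra" is formatted as-is, so a token
        # stored there ends up in any log line built from this string.
        return "id: {}. Host: {}, Login: {}, Password: {}, extra: {}".format(
            self.conn_id, self.host, self.login,
            MASK if self.password else None,
            self.extra_dejson,
        )

    def safe_debug_info(self) -> str:
        # One possible mitigation (an assumption, not the project's fix):
        # keep the keys of "extra" for debugging but mask every value.
        masked_extra = {key: MASK for key in self.extra_dejson}
        return "id: {}. Host: {}, Login: {}, Password: {}, extra: {}".format(
            self.conn_id, self.host, self.login,
            MASK if self.password else None,
            masked_extra,
        )


conn = FakeConnection(
    conn_id="databricks_default",
    host="example.cloud.databricks.com",
    extra='{"token": "dapi-not-a-real-token"}',
)
print(conn.debug_info())       # contains the token in plaintext
print(conn.safe_debug_info())  # token value replaced by the mask
```

Masking every value in "extra" is deliberately blunt; a real change might mask only keys known to hold secrets.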
Links:
* [BaseHook.get_connection|https://github.com/apache/airflow/blob/1.10.3/airflow/hooks/base_hook.py#L69-L84]
* [DatabricksHook constructor invoking get_connection|https://github.com/apache/airflow/blob/1.10.3/airflow/contrib/hooks/databricks_hook.py#L65]
* [models.Connection.debug_info|https://github.com/apache/airflow/blob/1.10.3/airflow/models/connection.py#L271-L280]
One potential fix would be to allow the operator to emit bearer token headers
if the token is saved to the password field and/or a flag is set in "extra".
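
A rough sketch of that proposal follows, assuming a hypothetical helper build_auth_header and a hypothetical "extra" flag named use_bearer_token; neither exists in Airflow. It only illustrates the decision logic: prefer a token stored in the password field when the flag is set, fall back to a token in "extra", and otherwise use basic auth.

```python
import base64
from typing import Dict, Optional


def build_auth_header(
    login: Optional[str],
    password: Optional[str],
    extra: dict,
) -> Dict[str, str]:
    """Pick an Authorization header for a connection (illustrative only)."""
    # Proposed path: the flag (hypothetical name) says the password field
    # holds the API token, so the secret never needs to live in "extra".
    if extra.get("use_bearer_token") and password:
        return {"Authorization": "Bearer " + password}

    # Existing convention described in this issue: token stored in "extra".
    token = extra.get("token")
    if token:
        return {"Authorization": "Bearer " + token}

    # Fallback: HTTP basic auth built from login and password.
    credentials = "{}:{}".format(login or "", password or "")
    encoded = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return {"Authorization": "Basic " + encoded}


# Token kept in the password field, with only a non-secret flag in "extra".
header = build_auth_header(None, "dapi-not-a-real-token", {"use_bearer_token": True})
print(header)
```

With this layout, "extra" carries only a boolean flag, so logging it would not expose the token.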
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)