t oo created AIRFLOW-6350:
-----------------------------
Summary: security - spark submit operator logging+exceptions should mask passwords
Key: AIRFLOW-6350
URL: https://issues.apache.org/jira/browse/AIRFLOW-6350
Project: Apache Airflow
Issue Type: Bug
Components: hooks, operators
Affects Versions: 1.10.3
Reporter: t oo
contrib/hooks/spark_submit_hook.py
Mask passwords in spark submit cmd AND error stacktrace
*add*
def _mask_cmd(self, connection_cmd):
    # Mask the value of any key='value' pair in the application args whose
    # key contains "secret" or "password" (case insensitive),
    # e.g. HivePassword='abc'
    connection_cmd_masked = re.sub(
        r"(\S*?(?:secret|password)\S*?\s*=\s*')[^']*(?=')",
        r'\1******', ' '.join(connection_cmd), flags=re.I)
    return connection_cmd_masked
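
To illustrate what the proposed pattern does and does not catch, here is a minimal standalone sketch. It reuses the regex from _mask_cmd unchanged, but the free function mask_cmd, the sample command, and all argument values are hypothetical and are not taken from the Airflow code base.

```python
import re

# Same pattern as in the proposed _mask_cmd, compiled once for reuse.
_MASK_PATTERN = re.compile(
    r"(\S*?(?:secret|password)\S*?\s*=\s*')[^']*(?=')", flags=re.I
)


def mask_cmd(connection_cmd):
    """Join the command list and mask single-quoted secret/password values.

    Hypothetical free-function version of the proposed method.
    """
    return _MASK_PATTERN.sub(r"\1******", " ".join(connection_cmd))


# Made-up command for illustration only.
sample_cmd = [
    "spark-submit",
    "--name", "job",
    "app.py",
    "HivePassword='hunter2'",               # single-quoted value: masked
    "--conf", "spark.ssl.keyPassword=plain",  # unquoted value: not matched
]

masked = mask_cmd(sample_cmd)
print(masked)
```

Because the pattern requires a single quote after the equals sign, only values wrapped in single quotes are replaced; an unquoted value such as the last argument above passes through unchanged, which may be worth considering when reviewing the fix.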
*BEFORE*
self.log.info("Spark-Submit cmd: %s", connection_cmd)
*AFTER*
self.log.info("Spark-Submit cmd: %s", self._mask_cmd(connection_cmd))
*BEFORE*
if returncode or (self._is_kubernetes and self._spark_exit_code != 0):
    raise AirflowException(
        "Cannot execute: {}. Error code is: {}.".format(
            spark_submit_cmd, returncode
        )
    )
*AFTER*
if returncode or (self._is_kubernetes and self._spark_exit_code != 0):
    raise AirflowException(
        "Cannot execute: {}. Error code is: {}.".format(
            self._mask_cmd(spark_submit_cmd), returncode
        )
    )