Niha created FLINK-38035:
----------------------------
Summary: Security Vulnerability in PyFlink Logging Mechanism
(PythonEnvUtils.java)
Key: FLINK-38035
URL: https://issues.apache.org/jira/browse/FLINK-38035
Project: Flink
Issue Type: Bug
Components: API / Python
Affects Versions: 1.20.1, 1.19.1
Reporter: Niha
There is a potential security vulnerability in the logging statement in
{{PythonEnvUtils.java}}: it may expose environment variables, including
Kubernetes-mounted secrets, during PyFlink job submission.
The class
[{{org.apache.flink.client.python.PythonEnvUtils}}|https://github.com/apache/flink/blob/master/flink-python/src/main/java/org/apache/flink/client/python/PythonEnvUtils.java#L372-L377]
logs all environment variables at job startup with the following line:
{{LOG.info("Starting Python process with environment variables: {}",
environment);}}
This line is problematic because it indiscriminately logs *all environment
variables*, which may contain *sensitive credentials*.
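To illustrate why this is a leak, here is a minimal, standalone sketch. It assumes {{environment}} is a {{Map<String, String>}} (as the proposed fix further down implies) and imitates the placeholder substitution with plain string concatenation, since SLF4J renders a non-array argument through its {{toString()}}. The class name, method names and sample values are hypothetical and are not taken from Flink.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical demo class; not part of Flink. */
final class EnvLogLeakDemo {

    private EnvLogLeakDemo() {}

    /** Builds a small fake environment; the secret value is a placeholder. */
    static Map<String, String> sampleEnvironment() {
        Map<String, String> environment = new LinkedHashMap<>();
        environment.put("PATH", "/usr/bin");
        environment.put("AWS_SECRET_ACCESS_KEY", "not-a-real-secret");
        return environment;
    }

    /**
     * Imitates what ends up in the log line: the map is rendered through
     * Map.toString(), so every key AND value appears in plain text.
     */
    static String render(Map<String, String> environment) {
        return "Starting Python process with environment variables: " + environment;
    }
}
```

Calling {{EnvLogLeakDemo.render(EnvLogLeakDemo.sampleEnvironment())}} yields a line that contains the secret value verbatim next to its key.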
h4. *Context: Kubernetes Operator Users Are Especially at Risk*
When Flink is deployed using the *Flink Kubernetes Operator*, secrets are
commonly passed into pods as *environment variables* (via Kubernetes {{env}} or
{{envFrom}} fields, e.g. from {{secretRef}}).
This includes:
* Database credentials
* Cloud service keys (e.g., {{AWS_SECRET_ACCESS_KEY}})
* Tokens and encryption keys
* Custom user-defined secrets
Logging these secrets in plain text within the Flink JobManager or TaskManager
logs violates Kubernetes security best practices, which explicitly discourage
exposing sensitive environment variables in logs, and poses a serious risk in
production environments.
h4. *Proposed Fix*
* Redact known sensitive keys ({{*_SECRET_*}}, {{*_TOKEN}}, {{*_KEY}},
{{PASSWORD}}, etc.) before logging.
Example fix snippet:
{{Map<String, String> redactedEnv = redactSensitive(environment);}}
{{LOG.info("Starting Python process with environment variables: {}",
redactedEnv);}}
* Consider an opt-in mechanism (e.g., {{log.python.env=true}}) for full
environment visibility in safe/test setups.
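One possible shape for {{redactSensitive}} is sketched below, together with the opt-in switch. This is only an illustration under stated assumptions: the class {{EnvRedactor}}, the marker list, the mask string and the {{logFullEnvironment}} parameter are all hypothetical, and how that flag would be wired to an option such as {{log.python.env}} is left open. The matching is a case-insensitive substring check, which is deliberately broader than the glob patterns listed above and therefore errs on the side of over-redaction.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper; not part of Flink. */
final class EnvRedactor {

    // Assumed markers; a real fix would need a reviewed, probably configurable list.
    private static final String[] SENSITIVE_MARKERS = {"SECRET", "TOKEN", "KEY", "PASSWORD"};

    // Assumed mask shown in place of a sensitive value.
    private static final String MASK = "******";

    private EnvRedactor() {}

    /** Returns true if the variable name contains any marker, ignoring case. */
    public static boolean isSensitive(String name) {
        String upper = name.toUpperCase(Locale.ROOT);
        for (String marker : SENSITIVE_MARKERS) {
            if (upper.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    /** Returns a sorted copy with sensitive values masked; the input map is not modified. */
    public static Map<String, String> redactSensitive(Map<String, String> environment) {
        Map<String, String> redacted = new TreeMap<>();
        for (Map.Entry<String, String> entry : environment.entrySet()) {
            String value = isSensitive(entry.getKey()) ? MASK : entry.getValue();
            redacted.put(entry.getKey(), value);
        }
        return redacted;
    }

    /** Opt-in variant: only returns unmasked values when explicitly requested. */
    public static Map<String, String> forLogging(
            Map<String, String> environment, boolean logFullEnvironment) {
        return logFullEnvironment ? new TreeMap<>(environment) : redactSensitive(environment);
    }
}
```

With a helper like this, the log call would pass {{EnvRedactor.forLogging(environment, false)}} by default, so a name such as {{AWS_SECRET_ACCESS_KEY}} is logged with a masked value while {{PATH}} stays readable. Logging only the variable names would be an even more conservative alternative.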
h4. *Steps to Reproduce*
# Set Kubernetes secrets as environment variables in a FlinkDeployment (e.g.,
via {{envFrom.secretRef}}).
# Launch a PyFlink job using the Flink Kubernetes Operator.
# Examine the JobManager logs.
# Observe secrets printed via {{PythonEnvUtils.java}}.