[ https://issues.apache.org/jira/browse/FLINK-38035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Metzger reassigned FLINK-38035:
--------------------------------------

    Assignee: Niha

> Security Vulnerability in PyFlink Logging Mechanism (PythonEnvUtils.java)
> -------------------------------------------------------------------------
>
>                 Key: FLINK-38035
>                 URL: https://issues.apache.org/jira/browse/FLINK-38035
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Python
>    Affects Versions: 1.19.1, 1.20.1
>            Reporter: Niha
>            Assignee: Niha
>            Priority: Major
>
> Potential security vulnerability in the logging statement within
> {{PythonEnvUtils.java}} that may expose environment variables, including
> Kubernetes-mounted secrets, during PyFlink job submission.
>
> The class
> [{{org.apache.flink.client.python.PythonEnvUtils}}|https://github.com/apache/flink/blob/master/flink-python/src/main/java/org/apache/flink/client/python/PythonEnvUtils.java#L372-L377]
> logs all environment variables at job startup with the following line:
>
> {{LOG.info("Starting Python process with environment variables: {}", environment);}}
>
> This line is problematic because it indiscriminately logs *all environment
> variables*, which may contain *sensitive credentials*.
>
> h4. *Context: Kubernetes Operator Users Are Especially at Risk*
>
> When Flink is deployed using the *Flink Kubernetes Operator*, secrets are
> commonly passed into pods as *environment variables* (via Kubernetes {{env}}
> or {{envFrom}} fields, e.g. from {{secretRef}}).
>
> This includes:
> * Database credentials
> * Cloud service keys (e.g., {{AWS_SECRET_ACCESS_KEY}})
> * Tokens and encryption keys
> * Custom user-defined secrets
>
> Logging these secrets in plain text within the Flink JobManager or
> TaskManager logs violates Kubernetes security best practices, which
> explicitly discourage exposing sensitive environment variables in logs, and
> poses a serious risk in production environments.
>
> h4. *Proposed Fix*
>
> * Redact known sensitive keys ({{SECRET}}, {{TOKEN}}, {{KEY}},
> {{PASSWORD}}, etc.) before logging.
>
> Example fix snippet:
>
> {{Map<String, String> redactedEnv = redactSensitive(environment);}}
> {{LOG.info("Starting Python process with environment variables: {}", redactedEnv);}}
>
> * Consider an opt-in mechanism (e.g., {{log.python.env=true}}) for full
> environment visibility in safe/test setups.
>
> h4. *Steps to Reproduce*
>
> # Set Kubernetes secrets as environment variables in a FlinkDeployment
> (e.g., via {{envFrom.secretRef}}).
> # Launch a PyFlink job using the Flink Kubernetes Operator.
> # Examine the JobManager logs.
> # Observe secrets printed via {{PythonEnvUtils.java}}.
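
The proposed fix calls a {{redactSensitive}} helper without defining it. A minimal sketch of what such a helper could look like follows. It is not taken from the Flink code base: the class name {{EnvRedactor}}, the marker list, and the {{******}} mask are assumptions made for illustration, and the matching rule (case-insensitive substring match on the variable name) is only one possible choice.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Illustrative helper only; names and markers are assumptions, not Flink API. */
public class EnvRedactor {

    // Markers suggested in the issue text; a real fix may need a longer list.
    private static final List<String> SENSITIVE_MARKERS =
            List.of("SECRET", "TOKEN", "KEY", "PASSWORD");

    // Arbitrary placeholder written in place of a sensitive value.
    private static final String REDACTED = "******";

    /** Returns a copy of the map with values of sensitive-looking names masked. */
    static Map<String, String> redactSensitive(Map<String, String> environment) {
        Map<String, String> redacted = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : environment.entrySet()) {
            String upperName = entry.getKey().toUpperCase(Locale.ROOT);
            boolean sensitive =
                    SENSITIVE_MARKERS.stream().anyMatch(upperName::contains);
            redacted.put(entry.getKey(), sensitive ? REDACTED : entry.getValue());
        }
        return redacted;
    }

    public static void main(String[] args) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("PATH", "/usr/bin");
        env.put("AWS_SECRET_ACCESS_KEY", "abc123");
        // Prints {PATH=/usr/bin, AWS_SECRET_ACCESS_KEY=******}
        System.out.println(redactSensitive(env));
    }
}
```

The helper returns a copy, so the map handed to the Python process is left untouched and only the logged view is masked. Substring matching errs toward over-redaction (any name containing {{KEY}} is masked), which seems the safer default for log output.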