Biao Geng created FLINK-39215:
---------------------------------
Summary: Fix tmp dir leak in PythonDriver when launchPy4jPythonClient fails
Key: FLINK-39215
URL: https://issues.apache.org/jira/browse/FLINK-39215
Project: Flink
Issue Type: Improvement
Reporter: Biao Geng
In org.apache.flink.client.python.PythonDriver, we have code like:
{code:java}
String tmpDir =
        System.getProperty("java.io.tmpdir")
                + File.separator
                + "pyflink"
                + File.separator
                + UUID.randomUUID();

// start the python process.
Process pythonProcess =
        PythonEnvUtils.launchPy4jPythonClient(
                gatewayServer,
                config,
                commands,
                pythonDriverOptions.getEntryPointScript().orElse(null),
                tmpDir,
                true);
shutdownHook =
        new PythonEnvUtils.PythonProcessShutdownHook(
                pythonProcess, gatewayServer, tmpDir);
{code}
We rely on the shutdownHook to clean up the tmp dir, but the hook is registered
only after `launchPy4jPythonClient` returns. It is possible that
`launchPy4jPythonClient` successfully runs preparePythonEnvironment but then
fails in startPythonProcess and throws an exception. In that case the shutdown
hook registration is skipped and the tmp dir used by preparePythonEnvironment
is never deleted. We could refactor the current logic a little and introduce a
separate shutdown hook dedicated to cleaning up the tmp dir.
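
One possible shape for that refactoring, as a minimal standalone sketch rather than the actual patch: register a hook that only deletes the tmp dir before the launch is attempted, so a failure in the launch step no longer skips the cleanup. The class TmpDirCleanupSketch and the methods registerTmpDirCleanupHook and deleteQuietly are hypothetical names for illustration and are not part of PythonEnvUtils; the launch itself is simulated by a thrown exception.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;

// Hypothetical sketch; none of these names exist in Flink.
final class TmpDirCleanupSketch {

    /** Best-effort recursive delete; does nothing if the directory is absent. */
    static void deleteQuietly(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order visits children before their parent directories.
            paths.sorted(Comparator.reverseOrder())
                    .forEach(
                            p -> {
                                try {
                                    Files.deleteIfExists(p);
                                } catch (IOException e) {
                                    System.err.println("Could not delete " + p + ": " + e);
                                }
                            });
        } catch (IOException | UncheckedIOException e) {
            System.err.println("Could not clean up " + dir + ": " + e);
        }
    }

    /** Registers a JVM shutdown hook that only removes the tmp dir, and returns it. */
    static Thread registerTmpDirCleanupHook(Path tmpDir) {
        Thread hook = new Thread(() -> deleteQuietly(tmpDir), "tmp-dir-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        Path tmpDir =
                Paths.get(
                        System.getProperty("java.io.tmpdir"),
                        "pyflink",
                        UUID.randomUUID().toString());

        // Register the cleanup first, before anything writes into tmpDir.
        registerTmpDirCleanupHook(tmpDir);

        try {
            // Stands in for preparePythonEnvironment succeeding ...
            Files.createDirectories(tmpDir);
            // ... and startPythonProcess failing afterwards.
            throw new IOException("simulated startPythonProcess failure");
        } catch (IOException e) {
            System.err.println("Launch failed: " + e.getMessage());
        }
        // On JVM exit the hook still removes tmpDir despite the failed launch.
    }
}
```

With this ordering the existing process shutdown hook could keep its current responsibilities. Whether it should still receive tmpDir, or whether deleting the dir eagerly in a catch block is preferable to a second hook, is left open here.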