Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21426#discussion_r191039095
--- Diff: core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala ---
@@ -153,4 +154,30 @@ object PythonRunner {
.map { p => formatPath(p, testWindows) }
}
+ /**
+ * Resolves the ".py" files. ".py" file should not be added as is because PYTHONPATH does
+ * not expect a file. This method creates a temporary directory and puts the ".py" files
+ * if exist in the given paths.
+ */
+ private def resolvePyFiles(pyFiles: Array[String]): Array[String] = {
+ val dest = Utils.createTempDir(namePrefix = "localPyFiles")
+ pyFiles.flatMap { pyFile =>
+ // In case of client with submit, the python paths should be set before context
+ // initialization because the context initialization can be done later.
+ // We will copy the local ".py" files because ".py" file shouldn't be added
+ // alone but its parent directory in PYTHONPATH. See SPARK-24384.
+ if (pyFile.endsWith(".py")) {
+ val source = new File(pyFile)
+ if (source.exists() && source.canRead) {
--- End diff --
@vanzin, do you mean that this should be checked earlier (for example, in
SparkSubmit), before we reach this logic?
To clarify, this is just a sanity check. Previously, the path was added but
ignored; now the path is not added at all.
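
To make the behaviour under discussion concrete, here is a minimal, self-contained sketch. It is not the PR's actual code: the object name `ResolvePyFilesSketch` is hypothetical, and `java.nio.file.Files` stands in for Spark's `Utils.createTempDir` so the snippet runs without Spark on the classpath. The assumed logic is that readable ".py" files are copied into one temporary directory and that directory is returned in their place, unreadable or missing ".py" entries are dropped, and everything else passes through unchanged.

```scala
import java.io.File
import java.nio.file.{Files, StandardCopyOption}

// Hypothetical stand-in for the method in the diff; an illustration only.
object ResolvePyFilesSketch {
  def resolvePyFiles(pyFiles: Array[String]): Array[String] = {
    // Assumption: a plain JDK temp directory replaces Utils.createTempDir here.
    val dest = Files.createTempDirectory("localPyFiles").toFile
    pyFiles.flatMap { pyFile =>
      if (pyFile.endsWith(".py")) {
        val source = new File(pyFile)
        if (source.exists() && source.isFile && source.canRead) {
          // Copy the file so that its new parent directory can go on PYTHONPATH.
          Files.copy(
            source.toPath,
            new File(dest, source.getName).toPath,
            StandardCopyOption.REPLACE_EXISTING)
          Some(dest.getAbsolutePath)
        } else {
          // Sanity check failed: drop the entry rather than add a path
          // that Python would ignore anyway.
          None
        }
      } else {
        // Non-".py" entries (for example archives or directories) are kept as is.
        Some(pyFile)
      }
    }.distinct // several ".py" files map to the same temporary directory
  }
}
```

With this sketch, a missing ".py" path simply disappears from the result, which matches the "doesn't add the path" behaviour described above.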