Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21267#discussion_r188144573
  
    --- Diff: python/pyspark/context.py ---
    @@ -211,9 +211,22 @@ def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize,
             for path in self._conf.get("spark.submit.pyFiles", "").split(","):
                 if path != "":
                     (dirname, filename) = os.path.split(path)
    -                if filename[-4:].lower() in self.PACKAGE_EXTENSIONS:
    -                    self._python_includes.append(filename)
    -                    sys.path.insert(1, os.path.join(SparkFiles.getRootDirectory(), filename))
    +                try:
    +                    filepath = os.path.join(SparkFiles.getRootDirectory(), filename)
    +                    if not os.path.exists(filepath):
    +                        # In case of YARN with shell mode, 'spark.submit.pyFiles' files are
    +                        # not added via SparkContext.addFile. Here we check if the file exists,
    +                        # try to copy and then add it to the path. See SPARK-21945.
    +                        shutil.copyfile(path, filepath)
    +                    if filename[-4:].lower() in self.PACKAGE_EXTENSIONS:
    +                        self._python_includes.append(filename)
    +                        sys.path.insert(1, filepath)
    +                except Exception:
    +                    from pyspark import util
    +                    warnings.warn(
    --- End diff --
    
    Likewise, I checked the warning manually:
    
    ```
    .../pyspark/context.py:229: RuntimeWarning: Failed to add file [/home/spark/tmp.py] speficied in 'spark.submit.pyFiles' to Python path:
    
    ...
      /usr/lib64/python27.zip
      /usr/lib64/python2.7
    ...
    ```
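    To make the behavior of this branch easy to try in isolation, here is a minimal standalone sketch of the same copy-if-missing-then-warn fallback. It is not the actual Spark code: `add_py_file`, its parameters, and the hard-coded `package_extensions` tuple are illustrative stand-ins for `_do_init`, `SparkFiles.getRootDirectory()`, and `self.PACKAGE_EXTENSIONS`.

    ```python
    import os
    import shutil
    import sys
    import warnings


    def add_py_file(path, root_dir, python_includes):
        """Sketch of the fallback in the diff: if a 'spark.submit.pyFiles'
        entry is not already present under the files root directory (as can
        happen in YARN client mode, where SparkContext.addFile is not used),
        copy it there ourselves; on any failure, warn instead of raising."""
        # Assumed stand-in for self.PACKAGE_EXTENSIONS.
        package_extensions = ('.zip', '.egg', '.jar')
        filename = os.path.basename(path)
        try:
            filepath = os.path.join(root_dir, filename)
            if not os.path.exists(filepath):
                # File was not distributed for us; try to copy it in place.
                shutil.copyfile(path, filepath)
            if filename.lower().endswith(package_extensions):
                python_includes.append(filename)
                sys.path.insert(1, filepath)
        except Exception:
            # Mirror the diff: a broken pyFiles entry degrades to a
            # RuntimeWarning rather than failing context initialization.
            warnings.warn(
                "Failed to add file [%s] specified in 'spark.submit.pyFiles' "
                "to Python path:\n  %s" % (path, "\n  ".join(sys.path)),
                RuntimeWarning)
    ```

    Pointing `path` at a readable `.zip` and `root_dir` at an empty directory copies the archive and prepends it to `sys.path`; pointing it at a nonexistent file only emits the warning, which is the behavior being verified above.
    
    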

