Github user vanzin commented on the issue:
https://github.com/apache/spark/pull/20669
`SparkFiles.get()` is the official way of retrieving anything that is
distributed using `--files`. So if the application is not using that, it's
relying on undocumented behavior.
In the error logs from Jenkins, the path of the file
("/var/spark-data/spark-files/pagerank_data.txt") seems to be provided by the
test code as part of running spark-submit. That seems wrong, since the test
code doesn't really have control over whether files will show up. If the file is
already in the image somehow, then it's probably a case of the test and the
image not agreeing about what the path is. If the file is uploaded using
`--files`, then providing an absolute path to the app is wrong (or, at the very
least, the app should use just the file name and call `SparkFiles.get()` to
find its actual location).
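
To illustrate the last point, here is a rough PySpark sketch of an app that
takes only the file name and asks `SparkFiles.get()` where the file actually
is. The helper names (`distributed_file_name`, `resolve_with_spark_files`) and
the `lookup` parameter are made up for this example and are not part of Spark
or of the test code in this PR; `lookup` exists only so the helper can be tried
without a Spark installation.

```python
import os


def distributed_file_name(path_arg):
    """Reduce whatever the submitter passed to a bare file name.

    Files shipped with --files are looked up by name, so any directory
    component in the argument is dropped.
    """
    return os.path.basename(path_arg)


def resolve_with_spark_files(path_arg, lookup=None):
    """Return the local path of a file that was distributed with --files.

    `lookup` is a hypothetical hook for this sketch; when it is omitted,
    pyspark's SparkFiles.get is used.
    """
    if lookup is None:
        # Imported lazily so the helper can be exercised without pyspark.
        from pyspark import SparkFiles
        lookup = SparkFiles.get
    return lookup(distributed_file_name(path_arg))
```

In the app itself this would be called as `resolve_with_spark_files(sys.argv[1])`
once the SparkContext exists, with the test passing `pagerank_data.txt` instead
of an absolute path.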