[
https://issues.apache.org/jira/browse/SPARK-19417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177276#comment-16177276
]
Chris Kanich commented on SPARK-19417:
--------------------------------------
This is my hacked up runtime library loader:
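{{
# nonstandard imports
import glob

# sc is the already-running SparkContext
for lib in glob.glob("/home/ckanich/libs/*"):
    try:
        sc.addPyFile(lib)
    except Exception as e:
        # skip libraries this context has already registered
        if "already registered" in str(e):
            continue
        else:
            raise
}}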
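A possible variant of the loader above, as a rough sketch only: instead of matching on the error text, it remembers which file names it has already handed to addPyFile. The helper name add_py_files_once and the seen set are hypothetical; the only Spark call assumed is sc.addPyFile, as used above. It does not make Spark pick up a changed file, it only avoids the duplicate registration error.

```python
import glob
import os


def add_py_files_once(sc, pattern, seen):
    """Register each file matching `pattern` with sc.addPyFile at most once.

    `seen` is a caller-owned set of base names that were already registered;
    it is updated in place. Returns the base names added by this call.
    (Hypothetical helper, not part of Spark.)
    """
    added = []
    for lib in sorted(glob.glob(pattern)):
        name = os.path.basename(lib)
        if name in seen:
            # already registered with this context, so skip it
            continue
        sc.addPyFile(lib)
        seen.add(name)
        added.append(name)
    return added
```

Keeping the seen set alongside the context (one set per SparkContext) means a second pass over the same directory is a no-op rather than an exception.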
If I'm developing or debugging one of these libraries outside of Spark, I would
like to be able to reload() it to pick up an updated version without having to
restart the context. My Python library-fu isn't amazing, but I believe that the
library file names need to stay the same so that the rest of the code keeps
working. Looking at that PR discussion, I do believe that having
spark.files.overwrite available as an option without it actually functioning
was the maddening part that made me feel the need to write up a bug and test
case.
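For the driver-side half of this, here is a small sketch of reload() picking up an edited file whose name stays the same. It runs without Spark, so it says nothing about what the executors see. The module name mylib_sketch and the temporary directory are made up for illustration, and on Python 3 reload() is importlib.reload.

```python
import importlib
import os
import sys
import tempfile

# Don't write .pyc files, so a quick edit is never masked by cached bytecode.
sys.dont_write_bytecode = True

lib_dir = tempfile.mkdtemp()
lib_path = os.path.join(lib_dir, "mylib_sketch.py")  # hypothetical library file


def write_lib(body):
    """Overwrite the library source in place, keeping the same file name."""
    with open(lib_path, "w") as f:
        f.write(body)


# First version of the library, imported the usual way.
write_lib("def version():\n    return 'one'\n")
sys.path.insert(0, lib_dir)
importlib.invalidate_caches()
mylib_sketch = importlib.import_module("mylib_sketch")
first = mylib_sketch.version()

# Edit the same file, then reload the already-imported module object.
write_lib("def version():\n    return 'version two'\n")
mylib_sketch = importlib.reload(mylib_sketch)
second = mylib_sketch.version()
```

Because the module is re-executed under the same name, code that already holds a reference to the module object sees the new definitions. Names bound earlier with "from mylib_sketch import version" would still point at the old function.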
> spark.files.overwrite is ignored
> --------------------------------
>
> Key: SPARK-19417
> URL: https://issues.apache.org/jira/browse/SPARK-19417
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.1.0
> Reporter: Chris Kanich
>
> I have not been able to get Spark to actually overwrite a file after I have
> changed it on the driver node, re-called addFile, and then used it on the
> executors again. Here's a failing test.
> {code}
> test("can overwrite files when spark.files.overwrite is true") {
>   val dir = Utils.createTempDir()
>   val file = new File(dir, "file")
>   try {
>     Files.write("one", file, StandardCharsets.UTF_8)
>     sc = new SparkContext(new SparkConf().setAppName("test")
>       .setMaster("local-cluster[1,1,1024]")
>       .set("spark.files.overwrite", "true"))
>     sc.addFile(file.getAbsolutePath)
>     def getAddedFileContents(): String = {
>       sc.parallelize(Seq(0)).map { _ =>
>         scala.io.Source.fromFile(SparkFiles.get("file")).mkString
>       }.first()
>     }
>     assert(getAddedFileContents() === "one")
>     // Files.write (Guava) replaces the contents, so the new copy should read "two"
>     Files.write("two", file, StandardCharsets.UTF_8)
>     sc.addFile(file.getAbsolutePath)
>     assert(getAddedFileContents() === "two")
>   } finally {
>     Utils.deleteRecursively(dir)
>     sc.stop()
>   }
> }
> {code}