[
https://issues.apache.org/jira/browse/SPARK-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157389#comment-14157389
]
Josh Rosen commented on SPARK-3769:
-----------------------------------
I think that {{SparkFiles.get()}} can be called from driver code, too, so
that's one option if you'd like to achieve consistency between driver and
executor code.
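As a possible workaround for the path problem quoted below, one could pass only the file name to {{SparkFiles.get()}} instead of the full path that was given to {{sc.addFile()}}. The following is a minimal sketch, not a confirmed fix: the class {{SparkFileNames}} and the method {{baseName}} are hypothetical names and not part of Spark, and the Spark calls appear only in comments because they need a running context. It assumes the added file keeps its original file name on the executors, which matches the paths shown in the report.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class SparkFileNames {
    // Hypothetical helper: returns the last component of a path,
    // e.g. "SparkFiles.sas" for "/opt/tom/SparkFiles.sas".
    public static String baseName(String path) {
        Path fileName = Paths.get(path).getFileName();
        if (fileName == null) {
            // getFileName() returns null for a root path such as "/".
            throw new IllegalArgumentException("No file name in path: " + path);
        }
        return fileName.toString();
    }

    public static void main(String[] args) {
        String path = "/opt/tom/SparkFiles.sas";
        String name = baseName(path);
        System.out.println(name);
        // In a Spark job (assumed usage, not run here):
        //   driver:   sc.addFile(path);
        //   executor: String pgm = SparkFiles.get(name);
    }
}
```

Computing the name once on the driver and capturing it in the function passed to mapPartitions keeps the driver and executor code using the same value.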
> SparkFiles.get gives me the wrong fully qualified path
> ------------------------------------------------------
>
> Key: SPARK-3769
> URL: https://issues.apache.org/jira/browse/SPARK-3769
> Project: Spark
> Issue Type: Bug
> Components: Java API
> Affects Versions: 1.0.2, 1.1.0
> Environment: Linux host and Linux grid.
> Reporter: Tom Weber
> Priority: Minor
>
> My Spark program runs on my host and submits work to my grid:
> JavaSparkContext sc = new JavaSparkContext(conf);
> final String path = args[1]; /* args[1] = /opt/tom/SparkFiles.sas */
> sc.addFile(path);
> The log shows:
> 14/10/02 16:07:14 INFO Utils: Copying /opt/tom/SparkFiles.sas to
> /tmp/spark-4c661c3f-cb57-4c9f-a0e9-c2162a89db77/SparkFiles.sas
> 14/10/02 16:07:15 INFO SparkContext: Added file /opt/tom/SparkFiles.sas at
> http://10.20.xx.xx:49587/files/SparkFiles.sas with timestamp 1412280434986
> Those are paths on my host machine. On the grid nodes, the file ends up at:
> /opt/tom/spark-1.1.0-bin-hadoop2.4/work/app-20141002160704-0002/1/SparkFiles.sas
> The code in my mapPartitions function, which runs on the grid nodes, gets
> the path with:
> String pgm = SparkFiles.get(path);
> And this returns the following string:
> /opt/tom/spark-1.1.0-bin-hadoop2.4/work/app-20141002160704-0002/1/./opt/tom/SparkFiles.sas
> So, am I expected to parse the qualified path I was given, take only the
> file name at the end, and concatenate it to the result of
> SparkFiles.getRootDirectory() to get this to work? Or should I pass only the
> parsed file name to SparkFiles.get()? It seems I should be able to pass the
> same file specification to both sc.addFile() and SparkFiles.get() and get
> the correct location of the file.
> Thanks,
> Tom