[
https://issues.apache.org/jira/browse/SPARK-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242086#comment-14242086
]
Xuefu Zhang commented on SPARK-4687:
------------------------------------
I concur with [~sandyr]'s account of the need for addFolder(), as it saves every
Spark application from doing the same thing over and over again, possibly in a
less performant way. A folder is such a natural, and to some extent indispensable,
extension of a file in any system that deals with bits at the storage level. I'd
also contend that being hard to get right shouldn't prevent us from trying
and perfecting it along the way, if we believe it's functionally the right thing to add.
> SparkContext#addFile doesn't keep file folder information
> ---------------------------------------------------------
>
> Key: SPARK-4687
> URL: https://issues.apache.org/jira/browse/SPARK-4687
> Project: Spark
> Issue Type: Bug
> Affects Versions: 1.2.0
> Reporter: Jimmy Xiang
>
> Files added with SparkContext#addFile are loaded with Utils#fetchFile before
> a task starts. However, Utils#fetchFile puts all files under the Spark root
> on the worker node. We should have an option to keep the folder information.
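
A minimal sketch of the problem described above, assuming nothing about Spark's
internals: it does not call SparkContext#addFile or Utils#fetchFile, and the
helper names, the /work root, and the sample paths are all hypothetical. It only
illustrates why dropping folder information can make two distinct files land on
the same path, and what keeping a relative path might look like.

```python
from pathlib import PurePosixPath


def flat_target(root, source):
    # Hypothetical model of the behaviour in the report: only the file
    # name is kept, so the source folder is lost.
    return PurePosixPath(root) / PurePosixPath(source).name


def structured_target(root, source, base):
    # Hypothetical alternative: keep the path relative to a base folder.
    return PurePosixPath(root) / PurePosixPath(source).relative_to(base)


# Two different files that happen to share a file name.
sources = ["/data/conf/a/settings.xml", "/data/conf/b/settings.xml"]

flat = [flat_target("/work", s) for s in sources]
kept = [structured_target("/work", s, "/data/conf") for s in sources]

print(flat[0] == flat[1])      # both map to the same target path
print([str(p) for p in kept])  # folder information is preserved
```

With the flat layout the second file would overwrite or shadow the first, which
is one reason an option to keep the folder structure is useful.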
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)