[ https://issues.apache.org/jira/browse/SPARK-14687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-14687:
------------------------------
Assignee: Liwei Lin
> Call path.getFileSystem(conf) instead of call FileSystem.get(conf)
> ------------------------------------------------------------------
>
> Key: SPARK-14687
> URL: https://issues.apache.org/jira/browse/SPARK-14687
> Project: Spark
> Issue Type: Improvement
> Components: MLlib, Spark Core, SQL
> Affects Versions: 2.0.0
> Reporter: Liwei Lin
> Assignee: Liwei Lin
> Priority: Minor
>
> Generally we should call path.getFileSystem(conf) instead of
> FileSystem.get(conf), because the latter always returns the file system for
> the default URI (fs.defaultFS), which causes problems in these situations:
> - if {{fs.defaultFS}} is {{hdfs://clusterA/...}}, but path is
> {{hdfs://clusterB/...}}: then we'll encounter
> {{java.lang.IllegalArgumentException (Wrong FS: hdfs://clusterB/...,
> expected: hdfs://clusterA/...)}}
> - if {{fs.defaultFS}} is not specified, the scheme will default to
> {{file:///}}: then we'll encounter {{java.lang.IllegalArgumentException
> (Wrong FS: hdfs://..., expected: file:///)}}
> - if {{fs.defaultFS}} is not {{hdfs://...}}, for example {{viewfs://}} (which
> is used for federated HDFS): then we'll encounter
> {{java.lang.IllegalArgumentException (Wrong FS: hdfs://..., expected:
> viewfs:///)}}
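
The three cases above can be reproduced with a small toy model. This is a sketch in plain Java and does not use Hadoop: the class WrongFsSketch and its methods fsFromDefault, fsFromPath and checkPath are hypothetical names that only imitate the described behaviour, on the assumption that the check compares scheme and authority. The sample path hdfs://clusterB/data/part-0 is also invented for illustration.

```java
import java.net.URI;
import java.util.Objects;

/** Toy model (not Hadoop code) of how a file system is chosen and a path validated. */
final class WrongFsSketch {

    /** Models FileSystem.get(conf): always bound to the default URI (fs.defaultFS). */
    static URI fsFromDefault(URI defaultFs) {
        return defaultFs;
    }

    /** Models path.getFileSystem(conf): bound to the path's own scheme and authority. */
    static URI fsFromPath(URI path, URI defaultFs) {
        if (path.getScheme() == null) {
            return defaultFs; // unqualified path: fall back to the default
        }
        String authority = path.getAuthority() == null ? "" : path.getAuthority();
        return URI.create(path.getScheme() + "://" + authority + "/");
    }

    /** Rejects a qualified path whose scheme or authority differs from the file system's. */
    static void checkPath(URI fs, URI path) {
        if (path.getScheme() == null) {
            return; // nothing to compare for an unqualified path
        }
        boolean sameScheme = fs.getScheme().equalsIgnoreCase(path.getScheme());
        boolean sameAuthority = Objects.equals(fs.getAuthority(), path.getAuthority());
        if (!sameScheme || !sameAuthority) {
            throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + fs);
        }
    }

    public static void main(String[] args) {
        URI path = URI.create("hdfs://clusterB/data/part-0"); // hypothetical path
        String[] defaults = {"hdfs://clusterA/", "file:///", "viewfs:///"};
        for (String d : defaults) {
            URI defaultFs = URI.create(d);
            try {
                checkPath(fsFromDefault(defaultFs), path);
            } catch (IllegalArgumentException e) {
                System.out.println("default-based lookup failed: " + e.getMessage());
            }
            // The path-based lookup passes the check regardless of the default.
            checkPath(fsFromPath(path, defaultFs), path);
            System.out.println("path-based lookup accepted " + path);
        }
    }
}
```

In this model the default-based lookup rejects the path for all three default URIs, while the path-based lookup accepts it each time, which is the motivation for preferring path.getFileSystem(conf).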