[ 
https://issues.apache.org/jira/browse/SPARK-17132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15426962#comment-15426962
 ] 

Sean Owen commented on SPARK-17132:
-----------------------------------

Yeah, that would be a solution. It actually affects all the related SparkContext 
API methods, not just binaryFiles. I'm not sure it's worth adding a bunch of 
overloads to the RDD API now in Spark 2, but it's not out of the question.

It should work to escape the commas with a backslash (\,), or at least that's 
what the Hadoop classes appear to expect. I suppose that's the intended usage, 
though I would also prefer a more explicit Seq argument.
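
As a rough illustration of the escaping workaround, here is a minimal Python 
sketch. The helper names escape_commas and join_input_paths are hypothetical 
and not part of Spark or Hadoop. It only builds the string; whether the Hadoop 
input-path handling honors the backslash escape is an assumption that has not 
been verified here.

```python
from typing import Iterable


def escape_commas(path: str) -> str:
    """Prefix every comma in a single path with a backslash.

    Hypothetical helper: assumes the consumer treats '\\,' as a literal comma.
    Existing backslashes in the path are left untouched.
    """
    return path.replace(",", "\\,")


def join_input_paths(paths: Iterable[str]) -> str:
    """Escape each path, then join them with the comma path separator."""
    return ",".join(escape_commas(p) for p in paths)


if __name__ == "__main__":
    paths = ["/data/a,b.bin", "/data/c.bin"]
    # Prints: /data/a\,b.bin,/data/c.bin
    print(join_input_paths(paths))
```

The joined string would then be passed wherever a comma-separated path string 
is accepted, for example as the path argument of binaryFiles. An explicit 
sequence argument would avoid the need for this kind of escaping altogether.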

> binaryFiles method can't handle paths with embedded commas
> ----------------------------------------------------------
>
>                 Key: SPARK-17132
>                 URL: https://issues.apache.org/jira/browse/SPARK-17132
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 1.2.0, 1.2.1, 1.2.2, 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.5.0, 
> 1.5.1, 1.5.2, 1.6.0, 1.6.1, 1.6.2, 2.0.0
>            Reporter: Maximilian Najork
>
> A path with an embedded comma is treated as two separate paths by 
> binaryFiles. Since commas are legal characters in paths, this behavior is 
> incorrect. I recommend overloading binaryFiles to accept an array of path 
> strings in addition to a string of comma-separated paths. Since setInputPaths 
> is already overloaded to accept either form, this should be relatively 
> low-effort.



