[
https://issues.apache.org/jira/browse/SPARK-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533277#comment-14533277
]
Nicholas Chammas commented on SPARK-3928:
-----------------------------------------
{quote}
Comma separated lists: were supported, will not be supported anymore. Use the
varargs method to pass more than one file. This is because {{,}} is a valid
character in a filename and so the old implementation was broken for some
people.
{quote}
Isn't this inconsistent with how {{textFile()}} works? {{textFile()}} allows
you to pass a single, comma-delimited string of file paths.
If we want to support files with commas in their name -- which sounds like a
corner case -- shouldn't we instead offer some kind of escaping mechanism for
commas?
It would be more work for those who have commas in their file names, but that
seems like a fair tradeoff. If you do weird things, then you should expect to
do more work.
The advantage for the rest of us is that we get a consistent way of globbing
files across {{textFile()}} and {{parquetFile()}}.
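To make the escaping idea concrete, here is a rough sketch of how a comma-delimited path string could be split while still allowing literal commas. It assumes backslash as the escape character and uses a hypothetical helper named split_paths; neither is part of Spark, and this is not how {{textFile()}} or {{parquetFile()}} actually parse their arguments.

```python
from typing import List


def split_paths(paths: str) -> List[str]:
    """Split a comma-delimited path string into individual paths.

    Hypothetical rule (an assumption, not Spark behavior): a comma preceded
    by a backslash is a literal comma. Any other backslash is kept as-is,
    so glob escapes such as \\? pass through untouched.
    """
    parts: List[str] = []
    current: List[str] = []
    chars = iter(paths)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, None)
            if nxt == ",":
                # Escaped comma: keep the comma, drop the backslash.
                current.append(",")
            else:
                # Not a comma escape: keep the backslash and what follows.
                current.append("\\")
                if nxt is not None:
                    current.append(nxt)
        elif ch == ",":
            # Unescaped comma: end of the current path.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts
```

With a rule like this, a plain list such as a.parquet,b.parquet still splits into two paths, and only users with commas in their file names need to write the extra backslash.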
> Support wildcard matches on Parquet files
> -----------------------------------------
>
> Key: SPARK-3928
> URL: https://issues.apache.org/jira/browse/SPARK-3928
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, SQL
> Reporter: Nicholas Chammas
> Assignee: Cheng Lian
> Priority: Minor
> Fix For: 1.3.0
>
>
> {{SparkContext.textFile()}} supports patterns like {{part-*}} and
> {{2014-\?\?-\?\?}}.
> It would be nice if {{SparkContext.parquetFile()}} did the same.