GitHub user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/15376#discussion_r83621898
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
@@ -246,7 +247,28 @@ case class LoadDataCommand(
         val loadPath =
           if (isLocal) {
             val uri = Utils.resolveURI(path)
    -        if (!new File(uri.getPath()).exists()) {
    +        val filePath = uri.getPath()
    +        val exists = if (filePath.contains("*")) {
    +          val fileSystem = FileSystems.getDefault
    +          val pathPattern = fileSystem.getPath(filePath)
    +          val dir = pathPattern.getParent.toString
    +          val filePattern = pathPattern.getName(pathPattern.getNameCount - 1).toString
    +          if (dir.contains("*")) {
    +            throw new AnalysisException(
    +              s"LOAD DATA input path allows only filename wildcard: $path")
    +          }
    +
    +          val files = new File(dir).listFiles()
    +          if (files == null) {
    +            false
    +          } else {
    +            val matcher = fileSystem.getPathMatcher("glob:" + filePattern)
--- End diff --
I was looking up how this works and found
http://stackoverflow.com/a/14164134/64174, which suggests that this might not
work unless the glob starts with "**". However, I wonder if you can just pass
`"glob:" + pathPattern` to `getPathMatcher` in this case anyway, so that it
matches the whole absolute path?
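
To make the concern concrete, here is a minimal sketch of how a `java.nio` glob `PathMatcher` treats a bare file name versus a full path. It assumes a Unix-like default file system; the object name `GlobMatchSketch`, the helper `globMatches`, and the sample path `/tmp/data/part1.txt` are hypothetical and not part of the PR.

```scala
import java.nio.file.{FileSystems, Paths}

// Hypothetical helper for illustration only; not code from the pull request.
object GlobMatchSketch {
  private val fs = FileSystems.getDefault

  // Builds a glob matcher and tests it against the given path string.
  // No file needs to exist: PathMatcher only inspects the path itself.
  def globMatches(glob: String, path: String): Boolean =
    fs.getPathMatcher("glob:" + glob).matches(Paths.get(path))

  def main(args: Array[String]): Unit = {
    val full = "/tmp/data/part1.txt" // assumed sample path

    // A single "*" does not cross directory separators, so a file-name-only
    // glob matches a bare file name but not an absolute path.
    println(globMatches("*.txt", "part1.txt"))
    println(globMatches("*.txt", full))

    // "**" does cross directory separators.
    println(globMatches("**.txt", full))

    // A glob built from the whole absolute path matches the absolute path.
    println(globMatches("/tmp/data/*.txt", full))
  }
}
```

Under those assumptions, the file-name-only glob would only work if the matcher is applied to each file's name rather than to its full path, whereas a glob built from the whole `pathPattern` can be applied to the absolute paths returned by `listFiles()` directly.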