HyukjinKwon commented on a change in pull request #23288: 
[SPARK-26339][SQL]Throws better exception when reading files that start with underscore
URL: https://github.com/apache/spark/pull/23288#discussion_r241662133
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 ##########
 @@ -554,8 +554,13 @@ case class DataSource(
 
       // Sufficient to check head of the globPath seq for non-glob scenario
       // Don't need to check once again if files exist in streaming mode
-      if (checkFilesExist && !fs.exists(globPath.head)) {
-        throw new AnalysisException(s"Path does not exist: ${globPath.head}")
+      if (checkFilesExist) {
+        val firstPath = globPath.head
+        if (!fs.exists(firstPath)) {
+          throw new AnalysisException(s"Path does not exist: ${firstPath}")
+        } else if (InMemoryFileIndex.shouldFilterOut(firstPath.getName)) {
+          throw new AnalysisException(s"Path exists but is ignored: ${firstPath}")
 
 Review comment:
   One thing I'm not sure about, though: this will throw an exception for a call such as
   
   ```
   spark.read.text("_text.txt").show()
   ```
   
   instead of returning an empty DataFrame, which is kind of a behaviour change.
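
To make the concern concrete, here is a minimal sketch of the two behaviours, written without any Spark dependency. Everything in it is an assumption for illustration: `looksHidden` is a simplified, hypothetical stand-in for `InMemoryFileIndex.shouldFilterOut` (the real rule may cover more cases), `existing` is a made-up set of file names standing in for the file system, and `IllegalArgumentException` stands in for `AnalysisException`.

```scala
object HiddenPathSketch {
  // Hypothetical, simplified hidden-file rule (assumption): names that start
  // with "_" or "." are treated as hidden. Not Spark's actual implementation.
  def looksHidden(name: String): Boolean =
    name.startsWith("_") || name.startsWith(".")

  // Behaviour before the change: only existence is checked, and a hidden
  // file is then silently dropped from the listing, so the result is empty.
  def listBefore(existing: Set[String], name: String): Seq[String] = {
    if (!existing.contains(name)) {
      throw new IllegalArgumentException(s"Path does not exist: $name")
    }
    Seq(name).filterNot(looksHidden)
  }

  // Behaviour proposed in the diff: an existing but hidden path fails fast
  // instead of producing an empty result.
  def listAfter(existing: Set[String], name: String): Seq[String] = {
    if (!existing.contains(name)) {
      throw new IllegalArgumentException(s"Path does not exist: $name")
    } else if (looksHidden(name)) {
      throw new IllegalArgumentException(s"Path exists but is ignored: $name")
    }
    Seq(name)
  }
}
```

Under these assumptions, `listBefore(Set("_text.txt"), "_text.txt")` yields an empty sequence, while `listAfter` with the same arguments throws. That difference is the behaviour change a caller of `spark.read.text("_text.txt")` would see.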
   

