GitHub user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9490#discussion_r44516927
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala ---
    @@ -604,10 +609,33 @@ abstract class HadoopFsRelation private[sql](maybePartitionSpec: Option[Partitio
           }
         }
     
    -    buildInternalScan(requiredColumns, filters, inputStatuses, broadcastedConf)
    +    if (!inputExists) {
    +      throw new IOException("Input paths do not exist, input paths="
    +        + inputPaths.mkString("[", ",", "]"))
    +    } else {
    +      if (inputStatuses.isEmpty && readFromHDFS) {
    +        logWarning("Input paths are empty, input paths=" + inputPaths.mkString("[", ",", "]"))
    +        sqlContext.sparkContext.emptyRDD[InternalRow]
    +      } else {
    +        buildInternalScan(requiredColumns, filters, inputStatuses, broadcastedConf)
    +      }
    +    }
       }
     
       /**
    +   * Most of time, HadoopFsRelation should check the inputPaths, but for some cases it is not,
    +   * e.g. JsonRelation may read from RDD[String]
    +   */
    +  def inputExists: Boolean = fileStatusCache.inputExists
    +
    +  /**
    +   * Most of time, HadoopFsRelation should read from hdfs, but some cases it is not,
    +   * e.g. JsonRelation may read from RDD[String]
    +   * @return
    +   */
    +  def readFromHDFS: Boolean = true
    --- End diff --
    
    Is there any way to fix this issue without adding any public interface
    methods? In particular, it's a bit odd for a `HadoopFsRelation` not to
    `readFromHDFS`. Can we special-case `JSONRelation` without affecting
    existing public APIs?
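
One possible shape for such a special case, as a rough sketch only and not a tested patch: keep the switch behind a `private[sql]` hook so it never becomes part of the public interface. Everything below is a simplified stand-in rather than the real Spark classes; `FsRelationSketch`, `JsonFromMemorySketch`, `EmptyDirSketch`, `readsFromFileSystem`, `listFiles`, and `scanFiles` are hypothetical names, and plain `Seq[String]` stands in for both `inputStatuses` and the resulting RDD.

```scala
// Simplified stand-ins only; none of these names exist in Spark.
object sql {
  // Plays the role of HadoopFsRelation. Its public surface gains no new members.
  abstract class FsRelationSketch(val inputPaths: Seq[String]) {
    // Hypothetical hook: visible only inside `sql`, so external data sources
    // cannot see or override it.
    private[sql] def readsFromFileSystem: Boolean = true

    // Stand-in for the cached file listing (inputStatuses in the diff).
    protected def listFiles(): Seq[String]

    // Stand-in for buildInternalScan.
    protected def scanFiles(files: Seq[String]): Seq[String]

    final def scan(): Seq[String] = {
      val files = listFiles()
      // Mirrors the emptyRDD branch in the diff: skip the scan when a
      // file-backed relation has nothing to read.
      if (readsFromFileSystem && files.isEmpty) Seq.empty
      else scanFiles(files)
    }
  }

  // Plays the role of a JSONRelation built from in-memory strings.
  class JsonFromMemorySketch(rows: Seq[String]) extends FsRelationSketch(Nil) {
    private[sql] override def readsFromFileSystem: Boolean = false
    protected def listFiles(): Seq[String] = Nil
    protected def scanFiles(files: Seq[String]): Seq[String] = rows
  }

  // A file-backed relation whose input directory happens to be empty.
  class EmptyDirSketch extends FsRelationSketch(Seq("/tmp/empty")) {
    protected def listFiles(): Seq[String] = Nil
    protected def scanFiles(files: Seq[String]): Seq[String] =
      sys.error("scanFiles should not run when there are no input files")
  }
}
```

With this layout only code inside `sql` can flip the hook, so third-party subclasses keep seeing the same API as before.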

