Github user vinodkc commented on a diff in the pull request: https://github.com/apache/spark/pull/20611#discussion_r180993462 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -370,26 +339,35 @@ case class LoadDataCommand( throw new AnalysisException( s"LOAD DATA: URI scheme is required for non-local input paths: '$path'") } - // Follow Hive's behavior: // If LOCAL is not specified, and the path is relative, // then the path is interpreted relative to "/user/<username>" val uriPath = uri.getPath() val absolutePath = if (uriPath != null && uriPath.startsWith("/")) { uriPath } else { - s"/user/${System.getProperty("user.name")}/$uriPath" + s"/user/${ System.getProperty("user.name") }/$uriPath" } new URI(scheme, authority, absolutePath, uri.getQuery(), uri.getFragment()) } - val hadoopConf = sparkSession.sessionState.newHadoopConf() - val srcPath = new Path(hdfsUri) - val fs = srcPath.getFileSystem(hadoopConf) - if (!fs.exists(srcPath)) { - throw new AnalysisException(s"LOAD DATA input path does not exist: $path") - } - hdfsUri } + } + val srcPath = new Path(loadPath) + val fs = srcPath.getFileSystem(sparkSession.sessionState.newHadoopConf()) + // This handling is needed because, while resolving invalid URLs starting with file:///, + // the system throws an IllegalArgumentException from the globStatus API. So, in order to handle + // such scenarios, this code is wrapped in a try-catch block, and after catching the + // runtime exception a generic error will be displayed to the user. + try { + if (null == fs.globStatus(srcPath) || fs.globStatus(srcPath).isEmpty) { + throw new AnalysisException(s"LOAD DATA input path does not exist: $path") + } + } + catch { + case e: Exception => --- End diff -- Avoid catching the generic Exception; catch IllegalArgumentException instead, so unrelated failures are not silently swallowed.
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org For additional commands, e-mail: reviews-help@spark.apache.org