GitHub user sujith71955 opened a pull request:

    https://github.com/apache/spark/pull/20611

    [SPARK-23425][SQL] When a wildcard is used in the load command, the system
    throws an analysis exception

    ## What changes were proposed in this pull request?
    Validation logic has been added for non-local files. An error is thrown
    if the HDFS path does not exist or if no files match the wildcard defined
    in the load path.
    The fs.exists(srcPath) API cannot resolve a path that contains a wildcard
    pattern, so the globStatus() API is used instead. It can resolve paths
    that contain HDFS-supported wildcard characters such as * and ?.
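    The difference between the two calls can be sketched as follows. This is
    a minimal illustration, not the code from this patch: the object name
    LoadPathCheck and the helper matchesAnyFile are hypothetical, and the
    sketch assumes the Hadoop client libraries are on the classpath. It uses
    the local file system so it can run without a cluster.

```scala
import java.nio.file.Files

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object LoadPathCheck {
  // Hypothetical helper (not taken from the patch): returns true when
  // `pattern` resolves to at least one existing path.
  def matchesAnyFile(fs: FileSystem, pattern: Path): Boolean = {
    // globStatus expands wildcards such as * and ?. It may return null
    // (for example, a non-glob path that does not exist) or an empty
    // array (a glob that matches nothing), so both cases are handled.
    val statuses = fs.globStatus(pattern)
    statuses != null && statuses.nonEmpty
  }

  def main(args: Array[String]): Unit = {
    // Local file system, so the sketch does not need HDFS.
    val fs = FileSystem.getLocal(new Configuration())
    val dir = Files.createTempDirectory("load_path_check")
    Files.createFile(dir.resolve("part-0.txt"))

    val wildcard = new Path(dir.toString, "part-*.txt")
    val noMatch = new Path(dir.toString, "missing-*.txt")

    // exists() treats the pattern as a literal file name.
    println(fs.exists(wildcard))
    // globStatus() expands the pattern before checking.
    println(matchesAnyFile(fs, wildcard))
    println(matchesAnyFile(fs, noMatch))
  }
}
```

    Checking for both null and an empty array matters because a caller that
    tests only one of them can let a non-matching load path through.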
    ## How was this patch tested?
    Manually tested on an HDFS/YARN cluster. The verification screenshots are
    attached below.
    Before fix
    
![hdfs_path_snapshot](https://user-images.githubusercontent.com/12999161/36221919-aa322450-11e5-11e8-976a-d01db5d2cebd.PNG)
    
![wildcard_issue](https://user-images.githubusercontent.com/12999161/36221932-b50bb0da-11e5-11e8-8c64-b7c0f2ce1ec7.PNG)
    
    After fix
    
![afterfix_wildcard](https://user-images.githubusercontent.com/12999161/36221953-ca18cc74-11e5-11e8-940e-b25cff212066.PNG)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sujith71955/spark master_wldcardsupport

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20611.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20611
    
----
commit af17f65d2d60b69fe0c4addff5299153d4af37c0
Author: sujith71955 <sujithchacko.2010@...>
Date:   2018-02-13T16:41:51Z

    [SPARK-23425][SQL] When a wildcard is used in the load command, the system
    throws an analysis exception
    
    ## What changes were proposed in this pull request?
    Validation logic has been added for non-local files. An error is thrown
    if the HDFS path does not exist or if no files match the wildcard defined
    in the load path.
    The fs.exists(srcPath) API cannot resolve a path that contains a wildcard
    pattern, so the globStatus() API is used instead. It can resolve paths
    that contain HDFS-supported wildcard characters such as * and ?.
    
    ## How was this patch tested?
    Manually tested on an HDFS/YARN cluster.

----