[GitHub] spark pull request: [SPARK-8000][SQL] Support for auto-detecting d...

HyukjinKwon Fri, 19 Feb 2016 02:21:22 -0800

GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/11270


    [SPARK-8000][SQL] Support for auto-detecting data sources.

    https://issues.apache.org/jira/browse/SPARK-8000
    
    This PR adds support for detecting the data source from the file extension.
    
    As I described in the comments, detection works as follows:
    
    If `format()` is not called, this tries to determine the data source from
    the file extension.
    Auto-detection is based on the given paths and also recognizes glob
    patterns, but it does not recursively check sub-paths, even if the given
    paths are directories.
    Source detection goes through the following steps:
    
       1. Check `provider` and use it if it is not `null`.
       2. If `provider` is not given, try to detect the source type from the
          file extension. Detection succeeds only if all the given paths have
          the same extension.
       3. If detection fails, use the data source set in
          `spark.sql.sources.default`.
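
The three steps above can be illustrated with a minimal Python sketch. This is not the code in the PR: the function name `detect_source`, the `EXTENSION_TO_SOURCE` mapping, and the set of extensions in it are assumptions made for illustration, and `DEFAULT_SOURCE` merely stands in for the value of `spark.sql.sources.default`. Glob expansion is omitted; the extension is read directly from the path or pattern string.

```python
import posixpath

# Stand-in for the value of spark.sql.sources.default (assumption for this sketch).
DEFAULT_SOURCE = "parquet"

# Hypothetical extension-to-source mapping; the PR's actual mapping may differ.
EXTENSION_TO_SOURCE = {
    ".json": "json",
    ".parquet": "parquet",
    ".orc": "orc",
    ".csv": "csv",
}


def detect_source(provider, paths, default_source=DEFAULT_SOURCE):
    """Pick a data source name following the three steps described above."""
    # Step 1: an explicitly given provider always wins.
    if provider is not None:
        return provider

    # Step 2: detect by extension, but only if every path shares one extension.
    extensions = {posixpath.splitext(path)[1] for path in paths}
    if len(extensions) == 1:
        source = EXTENSION_TO_SOURCE.get(next(iter(extensions)))
        if source is not None:
            return source

    # Step 3: detection failed, so fall back to the configured default.
    return default_source
```

Under these assumptions, `detect_source(None, ["a.json", "logs/*.json"])` would pick `json`, while mixed extensions or a bare directory path would fall through to the default source.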
    
    
    Tests have been added for each data source.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-8000

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11270.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11270
    
----
commit 23ba7266358a3de4800bb65da316c20f60bbf7a8
Author: hyukjinkwon <[email protected]>
Date:   2016-02-19T10:15:44Z

    Support for auto-detecting data sources.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

