[GitHub] spark pull request: [SPARK-12833][SQL] Initial import of spark-csv

rxin Thu, 14 Jan 2016 23:10:13 -0800

GitHub user rxin opened a pull request:

    https://github.com/apache/spark/pull/10766


    [SPARK-12833][SQL] Initial import of spark-csv

    CSV is the most common data format in the "small data" world. It is often
the first format people want to try when they see Spark on a single node.
Having to rely on a third-party component for this leads to a poor user
experience for new users. This PR merges the popular spark-csv data source
package (https://github.com/databricks/spark-csv) into Spark SQL.
    
    This is a first PR to bring the functionality to Spark 2.0 master. We will
complete the items outlined in the design document (see JIRA attachment) in
follow-up pull requests.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark csv

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10766.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10766
    
----
commit f3e99bde657ece010929e04f622ccdf75588af0d
Author: Hossein <[email protected]>
Date:   2016-01-06T00:40:13Z

    Added univocity-parsers as a dependency

commit 29c15c84b511363f47dca327f791f2c5e28ffcea
Author: Hossein <[email protected]>
Date:   2016-01-06T00:42:20Z

    Added inline implementation of spark-csv in SparkSQL

commit c9900d800fddb69a74f54dcd3b1dfc0afea8e8ee
Author: Hossein <[email protected]>
Date:   2016-01-06T00:48:07Z

    Minor style and comments with some TODOs

commit da314cb9cb323b5800175e15a49fe48f5c5c5e75
Author: Hossein <[email protected]>
Date:   2016-01-06T06:37:30Z

    Ported tests from spark-csv

commit e85cd7d708dd8c7f175d936606b9744c4d7c5727
Author: Hossein <[email protected]>
Date:   2016-01-06T10:02:27Z

    Excluding test resource files from license check

commit b09c38c65aeb92566df79d31dd14cda5dc0bb262
Author: Hossein <[email protected]>
Date:   2016-01-06T10:02:54Z

    Adding test resource file for nullable types

commit b31cb893dfcd87d1269a4a932d34fed830fe55ce
Author: Hossein <[email protected]>
Date:   2016-01-06T10:08:08Z

    Remove debugging message and extra lines

commit e364c284f2d37540aa2487220b417fa433198361
Author: Hossein <[email protected]>
Date:   2016-01-06T20:12:24Z

    Updated deps

commit 1856ed33dc4b677b0f3c83f61c100640c3f8e801
Author: Hossein <[email protected]>
Date:   2016-01-06T22:38:45Z

    Using Hive numericPrecedence

commit 0fd4bd3cd177e23c46db56b2a08a12b85c57355f
Author: Hossein <[email protected]>
Date:   2016-01-07T00:15:24Z

    Not using deprecated DecimalType constructor

commit 1e312a525c85ec08f2aa76870fe812716f6699a0
Author: Hossein <[email protected]>
Date:   2016-01-07T06:39:05Z

    Further style improvement

commit 319e0edb17d02eb994bc1cd104a29df8c47a9c59
Author: Hossein <[email protected]>
Date:   2016-01-07T22:08:16Z

    Fixing write test

commit c448be766c8dddbd25fcba5d817bf66b976e0b5a
Author: Reynold Xin <[email protected]>
Date:   2016-01-15T06:55:15Z

    Merge pull request #10615 from falaki/SPARK-12420
    
    [SPARK-12420][SQL] Have a built-in CSV data source implementation

commit ff22a2c883e8236f911bd583b7e9a4da66d6e980
Author: Reynold Xin <[email protected]>
Date:   2016-01-15T07:08:19Z

    Fix scala style and add notice file.

----



