GitHub user AndreSchumacher opened a pull request:

    https://github.com/apache/spark/pull/195

    Spark parquet improvements

    A few improvements to the Parquet support for SQL queries:
    - Instead of files a ParquetRelation is now backed by a directory, which
      simplifies importing data from other sources
    - InsertIntoParquetTable operation now supports switching between
      overwriting or appending (at least in HiveQL)
    - tests now use the new API
    - Parquet logging can be set to WARNING level (Default)
    - Default compression for Parquet files (GZIP, as in parquet-mr)
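
For readers unfamiliar with directory-backed tables, the overwrite/append distinction in the list above can be illustrated with a small sketch. This is not code from the pull request and does not use the Spark or Parquet APIs: `insert_into`, `read_table`, and the `part-NNNNN.txt` file naming are hypothetical, and plain text files stand in for Parquet files. It only assumes that a table is a directory of part files, that appending adds a new part file, and that overwriting clears the directory first.

```python
import shutil
import tempfile
from pathlib import Path


def insert_into(table_dir, rows, overwrite=False):
    """Hypothetical helper: write rows as one new part file in table_dir."""
    table_dir = Path(table_dir)
    if overwrite and table_dir.exists():
        # Overwrite mode: drop every existing part file before writing.
        shutil.rmtree(table_dir)
    table_dir.mkdir(parents=True, exist_ok=True)
    # Append mode: number the new file after the existing ones so that
    # earlier part files are left untouched.
    existing = sorted(table_dir.glob("part-*.txt"))
    part = table_dir / f"part-{len(existing):05d}.txt"
    part.write_text("".join(f"{row}\n" for row in rows))
    return part


def read_table(table_dir):
    """Hypothetical helper: read all part files in name order."""
    rows = []
    for part in sorted(Path(table_dir).glob("part-*.txt")):
        rows.extend(part.read_text().splitlines())
    return rows


if __name__ == "__main__":
    table = Path(tempfile.mkdtemp()) / "example_table"
    insert_into(table, ["a", "b"])               # creates part-00000.txt
    insert_into(table, ["c"])                    # appends part-00001.txt
    print(read_table(table))
    insert_into(table, ["z"], overwrite=True)    # replaces the whole table
    print(read_table(table))
```

Under these assumptions, importing data from another source amounts to placing additional part files in the directory, which is presumably the simplification the first bullet refers to.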


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/AndreSchumacher/spark spark_parquet_improvements

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/195.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #195
    
----
commit 14a1d2c1a5df4475ffccaf3ce36a41f2234ec3b7
Author: Andre Schumacher <[email protected]>
Date:   2014-03-18T13:23:26Z

    Changing ParquetRelation underlying data from file to dir

commit d6630d408cfadd6ef67767f4794f57b7ebe1c605
Author: Andre Schumacher <[email protected]>
Date:   2014-03-19T06:04:37Z

    Optional overwrite when inserting into ParquetRelation

commit d1d3639d8c45f93dd485628cb02ab5ef1dccc93e
Author: Andre Schumacher <[email protected]>
Date:   2014-03-19T11:32:51Z

    Update of Parquet tests to new API

commit 233e67f5571b9be38c04c0205ed522455cd91661
Author: Andre Schumacher <[email protected]>
Date:   2014-03-19T17:15:09Z

    Implementing appending to existing ParquetRelation

commit b8abe01a54090191e89d49f0960096b0844b3fd7
Author: Andre Schumacher <[email protected]>
Date:   2014-03-14T19:09:15Z

    Setting Parquet log level

commit 0a07f05e094e99317913c619c9ff08bc45cc933c
Author: Andre Schumacher <[email protected]>
Date:   2014-03-21T13:14:55Z

    Adding Parquet debug parameter and default compression

commit 05ed2477718798fd373c9b478f25518bf8919381
Author: Andre Schumacher <[email protected]>
Date:   2014-03-21T15:26:12Z

    Adding future example

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
