GitHub user liancheng opened a pull request:
https://github.com/apache/spark/pull/4563
[SPARK-4553] [SPARK-5767] [SQL] Wires Parquet data source with the newly
introduced write support for data source API
This PR migrates the Parquet data source to the new data source write
support API. Users can now also overwrite and append to existing tables.
Note that inserting into partitioned tables is not supported yet.
When the Parquet data source is enabled, insertion into Hive Metastore Parquet
tables is also fulfilled by the Parquet data source. This is done by the newly
introduced `HiveMetastoreCatalog.ParquetConversions` rule, which is a "proper"
implementation of the original hacky `HiveStrategies.ParquetConversion`. The
latter is still preserved, and can be removed together with the old Parquet
support in the future.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/liancheng/spark parquet-refining
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4563.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4563
----
commit c1a4e78302dba5c0df8b353f3dbfe732bbb6e56e
Author: Cheng Lian <[email protected]>
Date: 2015-02-11T01:28:37Z
Passes Hive Metastore partitioning information to ParquetRelation2
commit 62490304f2a04a54a9677de1e89d550001fe57ab
Author: Cheng Lian <[email protected]>
Date: 2015-02-11T22:04:28Z
Rewires Parquet data source and the new data source write support
commit 38ebd4ebe37027dacd2adee7fc305d67cdd56633
Author: Cheng Lian <[email protected]>
Date: 2015-02-12T00:26:19Z
Temporary solution for moving Parquet conversion to analysis phase
Although it works, it's so ugly... I duplicated the whole Analyzer
in Hive Context. Have to fix this.
commit f7b81da52028ecaceefc01cfb5ce6d471271935c
Author: Cheng Lian <[email protected]>
Date: 2015-02-12T10:03:45Z
Cleaner solution for Metastore Parquet table conversion
commit ae17ea8a3b1123845cd0be19cd1073bbaf9d0d7b
Author: Cheng Lian <[email protected]>
Date: 2015-02-12T10:21:29Z
Fixes compilation errors introduced during rebasing
----