[ https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842804#comment-13842804 ]
Carl Steinbach commented on HIVE-5783:
--------------------------------------

[~brocknoland] Up to this point we have reserved first-class support for data formats in Hive (i.e. changing the grammar) for formats that are implemented natively in the Hive source repository. I think we should maintain this convention. There are a couple of options available if we feel that it's important for users to be able to create Parquet-formatted tables using the abbreviated syntax (both DDL forms are sketched at the end of this message):

# Add a format registry feature to Hive that allows admins to register third-party SerDe implementations and associate them with a format keyword that users can reference in a DDL statement.
# Maintain two copies of the Parquet SerDe implementation -- one in Hive and one in the parquet-mr repository -- and backport patches between these repositories as necessary. If users want to use the parquet-mr version of the SerDe with Hive, they may do so by referencing the third-party package name in their DDL.

On a side note, I think the ticket summary "Native Parquet Support in Hive" is misleading. Users who see this description in the release notes will conclude that the Parquet SerDe code lives in Hive, when the exact opposite is true.

> Native Parquet Support in Hive
> ------------------------------
>
>          Key: HIVE-5783
>          URL: https://issues.apache.org/jira/browse/HIVE-5783
>      Project: Hive
>   Issue Type: New Feature
>     Reporter: Justin Coffey
>     Assignee: Justin Coffey
>     Priority: Minor
>      Fix For: 0.11.0
>
>  Attachments: HIVE-5783.patch, hive-0.11-parquet.patch
>
>
> Problem Statement:
> Hive would be easier to use if it had native Parquet support. Our organization, Criteo, uses Hive extensively. Therefore we built the Parquet Hive integration and would like to now contribute that integration to Hive.
>
> About Parquet:
> Parquet is a columnar storage format for Hadoop and integrates with many Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native Parquet integration.
>
> Change Details:
> Parquet was built with dependency management in mind, and therefore only a single Parquet jar will be added as a dependency.
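For reference, a minimal sketch of the two DDL forms being compared, assuming the class names published by the parquet-mr Hive bindings of this period (parquet.hive.serde.ParquetHiveSerDe and the Deprecated*Format classes); these names are illustrative and may differ between parquet-mr versions. The abbreviated STORED AS PARQUET form is the one that requires grammar-level (first-class) support; the explicit form references the third-party package names directly and needs no grammar change.

{code:sql}
-- Abbreviated form: only possible if Hive's grammar recognizes the format keyword,
-- i.e. the first-class support under discussion.
CREATE TABLE parquet_events (event_id BIGINT, payload STRING)
STORED AS PARQUET;

-- Explicit form: references the third-party (parquet-mr) classes directly,
-- so no grammar change is required. Class names are assumptions and may
-- differ between parquet-mr versions.
CREATE TABLE parquet_events_explicit (event_id BIGINT, payload STRING)
ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
STORED AS
  INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
  OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat';
{code}

Option 1 in the comment would let an administrator bind a keyword such as PARQUET to a set of classes like these at runtime, preserving the short form without the SerDe code living in the Hive source tree.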