[
https://issues.apache.org/jira/browse/HIVE-17580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16333955#comment-16333955
]
ASF GitHub Bot commented on HIVE-17580:
---------------------------------------
GitHub user vihangk1 opened a pull request:
https://github.com/apache/hive/pull/294
HIVE-17580 Remove dependency of get_fields_with_environment_context API to serde
This is an alternative approach to removing the dependency of the get_fields
HMS API on serdes. The earlier attempt at HIVE-17580 was very disruptive
because it moved TypeInfo and various Type implementations to storage-api
and also created another module called serde-api.
This patch is much cleaner and less disruptive. Instead of moving
TypeInfo, it creates similar classes in standalone-metastore. The PR is broken
into multiple commits with descriptive commit messages.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vihangk1/hive vihangk1_HIVE-17580v2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/hive/pull/294.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #294
----
commit 708443af3f6356ab73133e271cf00e3418ced8ef
Author: Vihang Karajgaonkar <vihang@...>
Date: 2018-01-21T23:54:04Z
Added MetastoreTypeInfo similar to TypeInfo
This patch adds classes similar to TypeInfo, called MetastoreTypeInfo, in
standalone-metastore.
Ideally, we should move TypeInfo to standalone-metastore since it stores
the information about types. However, moving TypeInfo to standalone-metastore
is a non-trivial effort, primarily for the reasons below:
1. TypeInfo is annotated as Public API.
This means we can only alter/move these classes in a compatible way.
2. Directly moving these classes is not straightforward because TypeInfo
uses the PrimitiveEntry class, which internally maps the TypeInfo to Type
implementations. Ideally the metastore should not use Type implementations,
which makes it harder to move TypeInfo directly.
However, if we are ready to break compatibility, then TypeInfo could be broken
up such that it doesn't use PrimitiveEntry directly. In such a world TypeInfo
would store just what it needs to store: the metadata of types, i.e. the type
category, its qualified name, whether it is a parameterized type and, if so,
how to validate the parameters.
I am assuming that breaking TypeInfo is a no-go, and hence I am copying the
relevant code from TypeInfo to the metastore and calling it MetastoreTypeInfo.
MetastoreTypeInfo and its subclasses are used by TypeInfoParser (also copied)
to parse the column type strings into TypeInfos.
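To illustrate the idea of a type descriptor that stores only metadata (category, qualified name, parameters and their validation), here is a minimal sketch. SimpleTypeInfo and parsePrimitive are hypothetical names made up for this example; they are not the MetastoreTypeInfo or parser classes from the patch, and the sketch only handles primitive type strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified type descriptor; NOT the MetastoreTypeInfo from the patch.
// It stores only metadata: category, base name and (validated) parameters.
final class SimpleTypeInfo {
    enum Category { PRIMITIVE, LIST, MAP, STRUCT, UNION }

    private final Category category;
    private final String baseName;
    private final List<Integer> params;

    private SimpleTypeInfo(Category category, String baseName, List<Integer> params) {
        this.category = category;
        this.baseName = baseName;
        this.params = Collections.unmodifiableList(params);
    }

    Category category() { return category; }

    boolean isParameterized() { return !params.isEmpty(); }

    // Qualified name, e.g. "decimal(10,2)" or plain "int".
    String qualifiedName() {
        if (params.isEmpty()) return baseName;
        StringBuilder sb = new StringBuilder(baseName).append('(');
        for (int i = 0; i < params.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(params.get(i));
        }
        return sb.append(')').toString();
    }

    // Parses a primitive type string; complex types are out of scope for this sketch.
    static SimpleTypeInfo parsePrimitive(String typeString) {
        String s = typeString.trim().toLowerCase(Locale.ROOT);
        int open = s.indexOf('(');
        if (open < 0) {
            return new SimpleTypeInfo(Category.PRIMITIVE, s, new ArrayList<>());
        }
        if (!s.endsWith(")")) {
            throw new IllegalArgumentException("Unbalanced parentheses: " + typeString);
        }
        String base = s.substring(0, open).trim();
        List<Integer> params = new ArrayList<>();
        for (String p : s.substring(open + 1, s.length() - 1).split(",")) {
            params.add(Integer.parseInt(p.trim()));
        }
        validate(base, params);
        return new SimpleTypeInfo(Category.PRIMITIVE, base, params);
    }

    // Parameter limits below are stated as assumptions for the example.
    private static void validate(String base, List<Integer> p) {
        boolean ok;
        switch (base) {
            case "decimal":
                ok = (p.size() == 1 || p.size() == 2)
                        && p.get(0) >= 1 && p.get(0) <= 38
                        && (p.size() == 1 || (p.get(1) >= 0 && p.get(1) <= p.get(0)));
                break;
            case "varchar":
                ok = p.size() == 1 && p.get(0) >= 1 && p.get(0) <= 65535;
                break;
            case "char":
                ok = p.size() == 1 && p.get(0) >= 1 && p.get(0) <= 255;
                break;
            default:
                ok = false; // this sketch treats every other type as non-parameterized
        }
        if (!ok) {
            throw new IllegalArgumentException("Invalid parameters " + p + " for type " + base);
        }
    }
}
```

Because such a descriptor knows nothing about Type implementations, it could live in a module that has no dependency on serde code.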
commit 6ec0efa59408c355cfa9aec7fd9dd59d3545aff2
Author: Vihang Karajgaonkar <vihang@...>
Date: 2018-01-03T19:45:32Z
Add avro storage schema reader
This commit adds an AvroStorageSchemaReader which reads the Avro schema
files both for external schema and regular avro tables.
Most of the util methods are in the AvroSchemaUtils class, which has methods
copied from AvroSerDeUtils. Some of the needed classes like
SchemaResolutionProblem, InstanceCache, SchemaToTypeInfo, TypeInfoToSchema
are also copied from Hive. The constants defined
in AvroSerde are copied into AvroSerdeConstants. The class
AvroFieldSchemaGenerator converts the Avro schema into a List of
FieldSchema, which is returned by the AvroStorageSchemaReader.
The Avro schema reader uses MetastoreTypeInfo and MetastoreTypeInfoParser,
introduced earlier.
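As a rough illustration of the schema-to-column-type conversion step, the sketch below maps Avro primitive type names to Hive type names and unwraps a nullable union. It deliberately avoids the Avro library and works on plain strings; AvroToHiveTypeNames and its methods are hypothetical and are not the AvroFieldSchemaGenerator or SchemaToTypeInfo classes from the patch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; operates on type names, not Avro Schema objects.
final class AvroToHiveTypeNames {
    private static final Map<String, String> PRIMITIVES = new HashMap<>();
    static {
        PRIMITIVES.put("null", "void");
        PRIMITIVES.put("boolean", "boolean");
        PRIMITIVES.put("int", "int");
        PRIMITIVES.put("long", "bigint");
        PRIMITIVES.put("float", "float");
        PRIMITIVES.put("double", "double");
        PRIMITIVES.put("bytes", "binary");
        PRIMITIVES.put("string", "string");
    }

    // Maps one Avro primitive type name to the corresponding Hive type name.
    static String primitiveToHive(String avroPrimitive) {
        String hive = PRIMITIVES.get(avroPrimitive);
        if (hive == null) {
            throw new IllegalArgumentException("Unsupported Avro primitive: " + avroPrimitive);
        }
        return hive;
    }

    // A union of exactly "null" and one other type is treated as a nullable column
    // of that other type. Any other union shape is rejected by this sketch.
    static String nullableUnionToHive(List<String> branches) {
        if (branches.size() == 2 && branches.contains("null")) {
            String other = "null".equals(branches.get(0)) ? branches.get(1) : branches.get(0);
            if (!"null".equals(other)) {
                return primitiveToHive(other);
            }
        }
        throw new IllegalArgumentException("Only two-branch nullable unions are handled: " + branches);
    }
}
```

A real reader would also have to handle records, arrays, maps, enums, fixed and logical types, which this sketch leaves out.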
commit b0f6d1df1ddb627e0f3c1cff3a164c9397337be0
Author: Vihang Karajgaonkar <vihang@...>
Date: 2018-01-04T01:02:40Z
Introduce default storage schema reader
This change introduces a default storage schema reader which copies the
common code from the serdes' initialization method and uses it to parse the
column names, types and comments from the table properties. For custom
storage schema readers like Avro, we will have to add more schema readers
as and when required.
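The parsing described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: ColumnPropertyParser is a hypothetical class, the property keys "columns", "columns.types" and "columns.comments" and the NUL-separated comment format are assumed here, and the type list is split only at the top nesting level so that types such as map<string,array<int>> stay intact.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; property keys and delimiters are assumptions for illustration.
final class ColumnPropertyParser {
    static final class Column {
        final String name;
        final String type;
        final String comment;

        Column(String name, String type, String comment) {
            this.name = name;
            this.type = type;
            this.comment = comment;
        }
    }

    // Splits a type list on ':' or ',' but only outside <...> and (...).
    static List<String> splitTypes(String types) {
        List<String> out = new ArrayList<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i < types.length(); i++) {
            char ch = types.charAt(i);
            if (ch == '<' || ch == '(') {
                depth++;
            } else if (ch == '>' || ch == ')') {
                depth--;
            } else if (depth == 0 && (ch == ':' || ch == ',')) {
                out.add(types.substring(start, i).trim());
                start = i + 1;
            }
        }
        if (start < types.length()) {
            out.add(types.substring(start).trim());
        }
        return out;
    }

    static List<Column> parse(Map<String, String> props) {
        List<String> names = new ArrayList<>();
        String rawNames = props.getOrDefault("columns", "");
        if (!rawNames.isEmpty()) {
            for (String n : rawNames.split(",")) {
                names.add(n.trim());
            }
        }
        List<String> types = splitTypes(props.getOrDefault("columns.types", ""));
        if (names.size() != types.size()) {
            throw new IllegalArgumentException(
                    names.size() + " column names but " + types.size() + " column types");
        }
        // Comments are optional; missing entries default to the empty string.
        String rawComments = props.get("columns.comments");
        String[] comments = rawComments == null ? new String[0] : rawComments.split("\0", -1);
        List<Column> cols = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            String comment = i < comments.length ? comments[i] : "";
            cols.add(new Column(names.get(i), types.get(i), comment));
        }
        return cols;
    }
}
```

Since the input is just a map of table properties, nothing in this path needs a Deserializer instance.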
commit 5ae977a0bf3fd54389671bed86322d3d4652bc20
Author: Vihang Karajgaonkar <vihang@...>
Date: 2018-01-04T19:18:03Z
Integrates the avro schema reader into the DefaultStorageSchemaReader
commit 2074b16e12c1bdc7ef3781f50e01ab4dd4c71890
Author: Vihang Karajgaonkar <vihang@...>
Date: 2018-01-05T02:38:28Z
Added a test for getFields method in standalone-metastore
commit 4159b5ee9852b41a64489274040e79dbddad54f1
Author: Vihang Karajgaonkar <vihang@...>
Date: 2018-01-22T07:16:13Z
HIVE-18508 : Port schema changes from HIVE-14498 to standalone-metastore
----
> Remove dependency of get_fields_with_environment_context API to serde
> ---------------------------------------------------------------------
>
> Key: HIVE-17580
> URL: https://issues.apache.org/jira/browse/HIVE-17580
> Project: Hive
> Issue Type: Sub-task
> Components: Standalone Metastore
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-17580.003-standalone-metastore.patch
>
>
> The {{get_fields_with_environment_context}} metastore API uses the
> {{Deserializer}} class to access the fields metadata for the cases where it
> is stored along with the data files (avro tables). The problem is that the
> Deserializer classes are defined in the hive-serde module, and in order to
> make the metastore independent of Hive we will have to remove this dependency
> (at least we should change it to a runtime dependency instead of a
> compile-time one).
> The other option is to investigate whether we can use SearchArgument to
> provide this functionality.
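For readers unfamiliar with the compile-time versus runtime distinction mentioned in the description, one common way to get a runtime-only dependency is to code against a small interface and load the implementation reflectively by class name. The sketch below shows that general technique only; FieldReader, StaticFieldReader and FieldReaderLoader are hypothetical names and do not come from the patch or from the Hive code base.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical interface that the metastore-side code would compile against.
interface FieldReader {
    List<String> readFieldNames(String tableName);
}

// Hypothetical implementation. In a real build it would live in another module
// and would only need to be on the classpath at runtime.
final class StaticFieldReader implements FieldReader {
    @Override
    public List<String> readFieldNames(String tableName) {
        return Arrays.asList("id", "name");
    }
}

final class FieldReaderLoader {
    // Loads an implementation by class name, so the caller has no compile-time
    // reference to the concrete class.
    static FieldReader load(String className) {
        try {
            Class<?> raw = Class.forName(className);
            return raw.asSubclass(FieldReader.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load field reader " + className, e);
        }
    }
}
```

In such a design the class name would typically come from configuration, and a missing or incompatible implementation surfaces as an error when the reader is first loaded rather than at build time.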