[
https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250273#comment-16250273
]
Alan Gates commented on HIVE-17714:
-----------------------------------
bq. I will try bringing in TypeInfo and ObjectInspector too. What are the
specific advantages of doing that?
I think you'll be forced to by the interdependencies of the interfaces. If you
are not, then fine, we don't have to move them.
bq. Also, I didn't quite understand by "avoids the need for ORC and any other
storage format to pick it up". Can you please elaborate?
ORC today depends on the storage-api. It works hard to keep down the number of
its dependencies in order to minimize its jar size. So I suspect you'll get
pushback from the ORC community on adding Serializer et al to the storage-api.
By making serde interfaces a separate module in storage-api we can address this
concern from ORC.
bq. This assumes that SerDes implementations do not bring along other
dependencies like hive-common etc. I am not sure yet but I think it is very
likely that these SerDes will have more dependencies, so it may not be just
adding hive-serde.jar to the standalone-metastore classpath. I already see
hive-serde depends on hive-common, hive-service-rpc and hive-shims so not sure
if we will be able to create a standalone serde jar for metastore.
Fair point, though even if we could get them to only pull in common, shims, and
serdes that would be a big improvement over needing the exec jar.
> move custom SerDe schema considerations into metastore from QL
> --------------------------------------------------------------
>
> Key: HIVE-17714
> URL: https://issues.apache.org/jira/browse/HIVE-17714
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Alan Gates
>
> Columns in metastore for tables that use external schema don't have the type
> information (since HIVE-11985) and may be entirely inconsistent (since
> forever, due to issues like HIVE-17713; or for SerDes that allow a URL for
> the schema, due to a change in the underlying file).
> Currently, if you trace the usage of ConfVars.SERDESUSINGMETASTOREFORSCHEMA
> through to MetaStoreUtils.getFieldsFromDeserializer, you'd see that the code
> in QL handles this in Hive. So, for the most part, metastore just returns
> whatever is stored for columns in the database.
> One exception appears to be get_fields_with_environment_context, which is
> interesting... so getTable will return incorrect columns (potentially), but
> get_fields/get_schema will return correct ones from SerDe as far as I can
> tell.
> As part of separating the metastore, we should make sure all the APIs return
> the correct schema for the columns; it's not a good idea to have everyone
> reimplement getFieldsFromDeserializer.
> Note: this should also remove a flag introduced in HIVE-17731
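
To make the behavior described above concrete, here is a rough sketch of the decision the description attributes to the QL side: columns stored in the metastore are trusted only for SerDes on the SERDESUSINGMETASTOREFORSCHEMA list, and every other SerDe is asked for its schema. This is not the actual Hive code. The class, the Column and SchemaSource types, and the resolveColumns method are all hypothetical names made up for illustration.

```java
import java.util.List;
import java.util.Set;

// Illustrative only: none of these types exist in Hive under these names.
final class SchemaResolutionSketch {

    /** Hypothetical stand-in for a metastore column (name plus type string). */
    static final class Column {
        final String name;
        final String type;

        Column(String name, String type) {
            this.name = name;
            this.type = type;
        }

        @Override
        public String toString() {
            return name + ":" + type;
        }
    }

    /** Hypothetical stand-in for asking a SerDe for its own schema. */
    interface SchemaSource {
        List<Column> readSchema();
    }

    /**
     * Returns the stored columns only when the SerDe class is on the list of
     * SerDes that keep their schema in the metastore; otherwise defers to the
     * schema reported by the SerDe itself.
     */
    static List<Column> resolveColumns(String serdeClass,
                                       List<Column> storedColumns,
                                       Set<String> serdesUsingMetastore,
                                       SchemaSource serdeSchema) {
        if (serdesUsingMetastore.contains(serdeClass)) {
            return storedColumns;
        }
        return serdeSchema.readSchema();
    }
}
```

If something like resolveColumns sat behind every metastore API that returns columns, getTable and get_fields/get_schema would agree, and clients would not need their own copy of this logic.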