[ https://issues.apache.org/jira/browse/HIVE-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252660#comment-16252660 ]
Sergey Shelukhin edited comment on HIVE-17714 at 11/14/17 11:14 PM:
--------------------------------------------------------------------
People never hit this before because, prior to HIVE-11985 (which people are
only now starting to use), the metastore would duplicate the schema from the
deserializer into the metastore columns (2.1 in my previous comment). So, in
most cases (unless either the user or the SerDe messed with it), the schema
returned would actually be the real schema.
I filed this JIRA based on a case where someone was using a Hive table from
Presto. In addition to the problems introduced by HIVE-11985 for that scenario
(Presto would not get the correct type, although in this case it worked anyway
because the SerDe was text-based and so was always reading strings), they had
also managed to modify the columns manually, so the columns were out of sync
with the SerDe.
> move custom SerDe schema considerations into metastore from QL
> --------------------------------------------------------------
>
> Key: HIVE-17714
> URL: https://issues.apache.org/jira/browse/HIVE-17714
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Alan Gates
>
> Columns in metastore for tables that use external schema don't have the type
> information (since HIVE-11985) and may be entirely inconsistent (since
> forever, due to issues like HIVE-17713; or for SerDes that allow a URL for
> the schema, due to a change in the underlying file).
> Currently, if you trace the usage of ConfVars.SERDESUSINGMETASTOREFORSCHEMA
> through to MetaStoreUtils.getFieldsFromDeserializer, you'd see that the code
> in QL handles this in Hive. So, for the most part, metastore just returns
> whatever is stored for columns in the database.
> One exception appears to be get_fields_with_environment_context, which is
> interesting: getTable will (potentially) return incorrect columns, but
> get_fields/get_schema will return the correct ones from the SerDe, as far as
> I can tell.
> As part of separating the metastore, we should make sure all the APIs return
> the correct schema for the columns; it's not a good idea to have everyone
> reimplement getFieldsFromDeserializer.
> Note: this work should also remove the flag introduced in HIVE-17731.
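
To make the intended behavior concrete, here is a minimal sketch of the
decision described above, done in one place so that every API returns the same
columns. It is not Hive code: SchemaResolver, Table, the "name:type" column
strings, and the serdeSchemaReader callback are all hypothetical stand-ins. The
set passed to the constructor plays the role of the list behind
ConfVars.SERDESUSINGMETASTOREFORSCHEMA, and the callback plays the role of
whatever getFieldsFromDeserializer does.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch, not Hive's actual classes or method signatures.
final class SchemaResolver {

    // Stand-in for a metastore table: its SerDe class name plus the
    // columns stored in the metastore database, written as "name:type".
    static final class Table {
        final String serdeClass;
        final List<String> storedColumns;

        Table(String serdeClass, List<String> storedColumns) {
            this.serdeClass = serdeClass;
            this.storedColumns = storedColumns;
        }
    }

    // SerDes whose schema is owned by the metastore (assumed to mirror the
    // purpose of ConfVars.SERDESUSINGMETASTOREFORSCHEMA).
    private final Set<String> serdesUsingMetastoreForSchema;

    // Assumed callback that asks the SerDe itself for the real columns.
    private final Function<Table, List<String>> serdeSchemaReader;

    SchemaResolver(Set<String> serdesUsingMetastoreForSchema,
                   Function<Table, List<String>> serdeSchemaReader) {
        this.serdesUsingMetastoreForSchema = serdesUsingMetastoreForSchema;
        this.serdeSchemaReader = serdeSchemaReader;
    }

    // Returns stored columns only when the metastore owns the schema;
    // otherwise defers to the SerDe, ignoring whatever is stored.
    List<String> resolveColumns(Table table) {
        if (serdesUsingMetastoreForSchema.contains(table.serdeClass)) {
            return table.storedColumns;
        }
        return serdeSchemaReader.apply(table);
    }
}
```

If getTable and get_fields/get_schema all went through something like
resolveColumns, a client such as Presto would not need to reimplement the
SerDe lookup to get the correct types.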