[ https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282291#comment-15282291 ]
Hari Sankar Sivarama Subramaniyan commented on HIVE-13708:
----------------------------------------------------------

[~thejas]
I checked whether we could do this in a generic way. As you mentioned, we can perform a deep check of the object inspector after initialize() and see whether its types match the column types in the table definition. My concern is whether this is backward compatible, or whether it will break things that used to work. If we have not enforced this rule previously, how can we expect custom serde developers to know that it is now an enforced rule in Hive? Also, it looked cleaner to implement this check in the serde itself (for example, RegexSerDe does a similar check in initialize()), since it is the responsibility of the serde to interpret the data correctly, not the query processor. Let me know your feedback.

Thanks
Hari

> Create table should verify datatypes supported by the serde
> -----------------------------------------------------------
>
>                 Key: HIVE-13708
>                 URL: https://issues.apache.org/jira/browse/HIVE-13708
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Thejas M Nair
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Critical
>         Attachments: HIVE-13708.1.patch
>
>
> As [~Goldshuv] mentioned in HIVE-7777.
> Create table with serde such as OpenCSVSerde allows for creation of table
> with columns of arbitrary types. But 'describe table' would still return
> string datatypes, and so does selects on the table.
> This is misleading and would result in users not getting intended results.
> The create table ideally should disallow the creation of such tables with
> unsupported types.
> Example posted by [~Goldshuv] in HIVE-7777 -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10))
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\")
> STORED AS TEXTFILE
> LOCATION '<some location>'
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this sql:
> hive> select min(totalprice) from test;
> in this case given my data, the result should have been 874.89, but the
> actual result became 100001.57 (as it is first according to byte ordering of
> a string type). this is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice string from deserializer
> ...
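
To make the failure mode and the serde-side option concrete, here is a minimal, self-contained Java sketch. It is not the attached patch and does not use any Hive API. The class SerdeTypeCheckSketch and the methods minAsString, minAsDecimal and requireStringColumns are hypothetical names chosen for illustration. The first two methods show why min() over the two values quoted in the issue differs between string and decimal ordering. The third shows roughly what a check inside a string-only serde's initialize() could look like, assuming the declared column names and types are available as plain lists.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SerdeTypeCheckSketch {

    // Minimum under lexicographic (string) ordering. This mirrors what the
    // query sees when the serde hands back every column as a string.
    static String minAsString(List<String> values) {
        return Collections.min(values);
    }

    // Minimum under numeric ordering, which is what a user who declared the
    // column as DECIMAL expects. Returns null for an empty list.
    static BigDecimal minAsDecimal(List<String> values) {
        BigDecimal min = null;
        for (String v : values) {
            BigDecimal d = new BigDecimal(v);
            if (min == null || d.compareTo(min) < 0) {
                min = d;
            }
        }
        return min;
    }

    // Hypothetical serde-side validation (an assumption, not Hive code):
    // reject any declared column whose type is not "string", so that table
    // creation fails instead of silently reinterpreting the column.
    static void requireStringColumns(List<String> names, List<String> types) {
        if (names.size() != types.size()) {
            throw new IllegalArgumentException(
                "column name/type count mismatch: " + names.size() + " vs " + types.size());
        }
        for (int i = 0; i < names.size(); i++) {
            String type = types.get(i).trim();
            if (!"string".equalsIgnoreCase(type)) {
                throw new IllegalArgumentException(
                    "column '" + names.get(i) + "' has unsupported type '" + type
                        + "'; this serde only supports string columns");
            }
        }
    }

    public static void main(String[] args) {
        // The two values quoted in the issue description.
        List<String> prices = Arrays.asList("874.89", "100001.57");
        System.out.println("string min:  " + minAsString(prices));
        System.out.println("decimal min: " + minAsDecimal(prices));

        // The declaration from the issue's CREATE TABLE would be rejected.
        try {
            requireStringColumns(Arrays.asList("totalprice"), Arrays.asList("decimal(38,10)"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Whether a check like this belongs in each serde or in a generic step after initialize() is the open question in the comment above. The sketch only shows that the serde-side variant is small and fails early at table creation.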