[
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15316833#comment-15316833
]
Hari Sankar Sivarama Subramaniyan commented on HIVE-13708:
----------------------------------------------------------
Revisiting this JIRA after some test runs:
1. Making this check generic, as in patch#4, is still not entirely correct: it
throws errors too aggressively for certain serdes where it is OK to bypass the
datatype check.
2. I am OK with going ahead with patch#1, which is a fix specific to the
serde.
3. Making the check generic would require bypassing some scenarios (e.g. in
patch#4, requiring actualColumnTypes.size() and expectedColumnTypes.size() to
be non-zero), which effectively makes the check non-generic and defeats the
overall purpose of this patch. I don't see a direct way to verify the datatypes
supported by the serde, since no such API was provided when the serde was
originally designed.
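
For illustration only, a serde-specific check along the lines of point 2 could
look like the Python sketch below. This is not the code in patch#1: the
SUPPORTED_TYPES registry and the unsupported_columns helper are hypothetical
names, and the sketch assumes OpenCSVSerde exposes every column as string, as
the issue description reports. Serdes without a declared type list are skipped
rather than rejected, which avoids the overly aggressive errors from point 1.

```python
# Hypothetical registry: serde class name -> column types it can represent.
# Only serdes listed here are checked; this is an assumption for the sketch.
SUPPORTED_TYPES = {
    "org.apache.hadoop.hive.serde2.OpenCSVSerde": {"string"},
}


def unsupported_columns(serde_class, columns):
    """Return the (name, type) pairs that the serde cannot represent.

    Serdes with no entry in SUPPORTED_TYPES bypass the check entirely,
    so the result for them is always an empty list.
    """
    supported = SUPPORTED_TYPES.get(serde_class)
    if supported is None:
        return []  # no declared type list: bypass the datatype check
    return [(name, col_type) for name, col_type in columns
            if col_type.lower() not in supported]
```

With this shape, a CREATE TABLE on OpenCSVSerde with a DECIMAL(38,10) column
would be flagged, while a table on any unlisted serde would pass unchanged.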
[~ashutoshc] Please let me know if I am missing something here.
Thanks
Hari
> Create table should verify datatypes supported by the serde
> -----------------------------------------------------------
>
> Key: HIVE-13708
> URL: https://issues.apache.org/jira/browse/HIVE-13708
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Thejas M Nair
> Assignee: Hari Sankar Sivarama Subramaniyan
> Priority: Critical
> Attachments: HIVE-13708.1.patch, HIVE-13708.2.patch,
> HIVE-13708.3.patch, HIVE-13708.4.patch
>
>
> As [~Goldshuv] mentioned in HIVE-7777.
> Create table with serde such as OpenCSVSerde allows for creation of table
> with columns of arbitrary types. But 'describe table' would still return
> string datatypes, and so do selects on the table.
> This is misleading and would result in users not getting intended results.
> The create table ideally should disallow the creation of such tables with
> unsupported types.
> Example posted by [~Goldshuv] in HIVE-7777 -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10))
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\")
> STORED AS TEXTFILE
> LOCATION '<some location>'
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this SQL:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the
> actual result was 100001.57 (as it sorts first under the byte ordering of
> a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice string from deserializer
> ...
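
The wrong MIN in the example above follows from comparing the values as
strings rather than as numbers. A minimal Python sketch of the effect, using
only the two values quoted in the description (Hive itself is not involved
here, and the variable names are illustrative):

```python
from decimal import Decimal

# The two values quoted in the issue description, as they appear in the file.
raw_values = ["874.89", "100001.57"]

# The serde exposes the column as string, so the minimum is chosen by
# character-by-character comparison: "1" sorts before "8".
string_min = min(raw_values)

# What the declared DECIMAL(38,10) type was meant to give: numeric comparison.
numeric_min = min(Decimal(v) for v in raw_values)

print(string_min)   # 100001.57
print(numeric_min)  # 874.89
```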