[ 
https://issues.apache.org/jira/browse/HIVE-13708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282303#comment-15282303
 ] 

Thejas M Nair commented on HIVE-13708:
--------------------------------------

This change to CSVSerde breaks backward compatibility for anyone who had a 
scripted create table command with a non-string column. Those statements would 
now fail.
If we consider CSVSerde in isolation, the best thing to do about it is to 
address HIVE-13709, i.e. support the other types that LazySimpleSerde supports. 
That would lead to correct results and also be backward compatible.

Regarding the generic change applicable to any such serde: it is a difficult 
choice between allowing logically incorrect results and preserving backward 
compatibility. I think if we also make the changes in HIVE-13709, the only 
users affected would be those who use a custom serde with the same limitations 
(but without error checks) and also use types unsupported by that serde. That 
set is likely to be very small. I would vote for making this incompatible 
change and fixing the logical correctness issue.
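
To make the proposed check concrete, here is a rough sketch of the kind of 
validation a create table could run against the types a serde supports. It is 
written in Python for brevity and is not Hive code: the registry, the serde 
key "csv_serde", and all function names are hypothetical, and it assumes the 
serde handles only string columns.

```python
# Hypothetical sketch only: none of these names exist in Hive.
# Assumption: each serde can declare the set of column types it supports.
SERDE_SUPPORTED_TYPES = {
    "csv_serde": {"string"},  # assumed: this serde treats every column as string
}


def base_type(type_name):
    """Strip type parameters, e.g. 'DECIMAL(38,10)' -> 'decimal'."""
    return type_name.split("(", 1)[0].strip().lower()


def unsupported_columns(serde, columns):
    """Return the (name, type) pairs the serde cannot represent."""
    supported = SERDE_SUPPORTED_TYPES.get(serde)
    if supported is None:
        # The serde declares nothing, so there is nothing to verify against.
        return []
    return [(name, t) for name, t in columns if base_type(t) not in supported]


def validate_create_table(serde, columns):
    """Fail at create time instead of silently reading the column as string."""
    bad = unsupported_columns(serde, columns)
    if bad:
        details = ", ".join(f"{name} {t}" for name, t in bad)
        raise ValueError(f"serde '{serde}' does not support: {details}")
```

With this sketch, a table declaring totalprice DECIMAL(38,10) on the 
string-only serde is rejected, which is exactly the incompatible behavior 
discussed above; a serde that declares no supported types is left unchecked.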



> Create table should verify datatypes supported by the serde
> -----------------------------------------------------------
>
>                 Key: HIVE-13708
>                 URL: https://issues.apache.org/jira/browse/HIVE-13708
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Thejas M Nair
>            Assignee: Hari Sankar Sivarama Subramaniyan
>            Priority: Critical
>         Attachments: HIVE-13708.1.patch
>
>
> As [~Goldshuv] mentioned in HIVE-7777.
> Create table with a serde such as OpenCSVSerde allows the creation of tables 
> with columns of arbitrary types. But 'describe table' would still return 
> string datatypes, and so would selects on the table.
> This is misleading and would result in users not getting intended results.
> The create table ideally should disallow the creation of such tables with 
> unsupported types.
> Example posted by [~Goldshuv] in HIVE-7777 -
> {noformat}
> CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) 
> ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with 
> serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") 
> STORED AS TEXTFILE 
> LOCATION '<some location>' 
> tblproperties ("skip.header.line.count"="1");
> {noformat}
> Now consider this sql:
> hive> select min(totalprice) from test;
> In this case, given my data, the result should have been 874.89, but the 
> actual result was 100001.57 (as it comes first according to the byte ordering 
> of a string type). This is a wrong result.
> hive> desc extended test;
> OK
> o_totalprice          string                  from deserializer
> ...
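
The ordering problem in the quoted example can be reproduced outside Hive. The 
following is a small Python illustration, not Hive code; it uses only the two 
values mentioned in the description and compares them once as strings and once 
as decimals.

```python
from decimal import Decimal

# The two values mentioned in the issue description, as read from a CSV file.
raw_values = ["874.89", "100001.57"]

# Compared as strings, '1' sorts before '8', so the larger number wins.
min_as_string = min(raw_values)

# Compared as decimals, the numerically smaller value wins.
min_as_decimal = min(Decimal(v) for v in raw_values)

print(min_as_string)   # 100001.57
print(min_as_decimal)  # 874.89
```

This is the same mismatch the reporter saw: the column was declared as 
DECIMAL, but the values were compared as strings.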



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
