[jira] [Commented] (HIVE-15079) Hive cannot read Parquet string timetamps as TIMESTAMP data type

Ganesha Shreedhara (Jira) Mon, 21 Oct 2019 00:16:10 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-15079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955820#comment-16955820
 ]


Ganesha Shreedhara commented on HIVE-15079:
-------------------------------------------

I have another instance similar to this, where the data is stored as a long and 
the table schema declares the column type as timestamp. 

This works with ORC but throws a ClassCastException when Parquet is used. Long 
data can be converted to a timestamp (e.g. new Timestamp(longValue)).
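
To make the conversion concrete, here is a minimal sketch in plain Java. It is 
not Hive code: the class name, the method names and the sample value are made 
up for illustration. Note that java.sql.Timestamp(long) expects milliseconds, 
while the Hive wiki text quoted below describes integer values as UNIX 
timestamps in seconds, so both interpretations are shown.

```java
import java.sql.Timestamp;

// Hypothetical helper, not part of Hive; it only illustrates the conversion.
public class LongToTimestampSketch {

    // Treats the stored long as milliseconds since the epoch, which is what
    // the java.sql.Timestamp(long) constructor expects.
    public static Timestamp fromEpochMillis(long millis) {
        return new Timestamp(millis);
    }

    // Treats the stored long as seconds since the epoch, matching the
    // "UNIX timestamp in seconds" wording in the Hive wiki.
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    public static Timestamp fromEpochSeconds(long seconds) {
        return new Timestamp(Math.multiplyExact(seconds, 1000L));
    }

    public static void main(String[] args) {
        long stored = 1468766538L; // assumed sample value from a long column
        // The printed text depends on the JVM's default time zone.
        System.out.println(fromEpochSeconds(stored));
        System.out.println(fromEpochMillis(stored * 1000L));
    }
}
```

Which unit applies depends on how the data was written, so a reader that 
converts automatically would have to settle on one interpretation.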

Are there any plans to support automatic type conversion for the Parquet file 
format in Hive?

> Hive cannot read Parquet string timetamps as TIMESTAMP data type
> ----------------------------------------------------------------
>
>                 Key: HIVE-15079
>                 URL: https://issues.apache.org/jira/browse/HIVE-15079
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Sergio Peña
>            Priority: Major
>
> The Hive Wiki for timestamps specifies that string timestamps can be read by 
> Hive. 
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Timestamps
> {noformat}
> Supported conversions:
> Integer numeric types: Interpreted as UNIX timestamp in seconds
> Floating point numeric types: Interpreted as UNIX timestamp in seconds with 
> decimal precision
> Strings: JDBC compliant java.sql.Timestamp format "YYYY-MM-DD 
> HH:MM:SS.fffffffff" (9 decimal place precision)
> {noformat}
> This works fine with Text table formats, but when Parquet is used, then it 
> throws the following exception:
> {noformat}
> java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to 
> org.apache.hadoop.hive.serde2.io.TimestampWritable
> {noformat}
> How to reproduce
> {noformat}
> > create table t1 (id int, time string) stored as parquet;
> > insert into table t1 values (1,'2016-07-17 14:42:18');
> > alter table t1 replace columns (id int, time timestamp);
> > select * from t1;
> {noformat}
> The above example will run fine if you use a TEXT format instead of PARQUET.
> This issue was raised on PARQUET-723.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
