[
https://issues.apache.org/jira/browse/HIVE-8297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14486655#comment-14486655
]
hongyu bi commented on HIVE-8297:
---------------------------------
Hi , we hit this issue as well:
CREATE EXTERNAL TABLE `temp_db.table1`(
`c1` int,
`c2` timestamp,
`c3` string,
`c4` string,
`c5` string,
`c6` string,
`c7` string,
`t` double)
PARTITIONED BY (
`dt` string)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.RCFileInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
0: jdbc:hive2://ip:port> select * from temp_db.table1 where dt='2015-02-26'
limit 1;
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| table1.c1 | table1.c2 | table1.c3 | table1.c4 | table1.c5 | table1.c6 |
table1.c7 | t |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| 9318 | 2015-02-26 09:22:07.0 |
100005824 | 20150228 |
15022895901544 | x1 | 97
| 6 |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
1 row selected (0.976 seconds)
0: jdbc:hive2://ip:port> select * from temp_db.table1 where dt='2015-02-26'
limit 2;
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| table1.c1 | table1.c2 | table1.c3 | table1.c4 | table1.c5 | table1.c6 |
table1.c7 | t |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| 9318 | 2015-02-26 09:22:07.0 |
100005824 | 20150228 |
15022895901544 | x1 | 97
| 6 |
| 9318 | 2015-02-26 09:22:07.0 |
100005824 | 20150228 |
15022895901544 | x1 | 97
| 6 |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
2 rows selected (0.173 seconds)
0: jdbc:hive2://ip:port> select * from temp_db.table1 where dt='2015-02-26'
limit 3;
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| table1.c1 | table1.c2 | table1.c3 | table1.c4 | table1.c5 | table1.c6 |
table1.c7 | t |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| 9318 | 2015-02-26 09:23:58.0 |
100005824 | 20150228 |
15022895901544 | x1 | 97
| 6 |
| 9318 | 2015-02-26 09:23:58.0 |
100005824 | 20150228 |
15022895901544 | x1 | 97
| 6 |
| 9318 | 2015-02-26 09:23:58.0 |
100008989 | 20150228 |
15022883170834 | y1 | 97
| 1 |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
3 rows selected (0.205 seconds)
0: jdbc:hive2://ip:port> select * from temp_db.table1 where dt='2015-02-26'
limit 4;
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| table1.c1 | table1.c2 | table1.c3 | table1.c4 | table1.c5 | table1.c6 |
table1.c7 | t |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| 9318 | 2015-02-26 09:23:58.0 |
100005824 | 20150228 |
15022895901544 | x1 | 97
| 6 |
| 9318 | 2015-02-26 09:23:58.0 |
100005824 | 20150228 |
15022895901544 | x1 | 97
| 6 |
| 9318 | 2015-02-26 09:23:58.0 |
100008989 | 20150228 |
15022883170834 | y1 | 97
| 1 |
| 9318 | 2015-02-26 09:23:58.0 |
100008989 | 20150228 |
15022883170834 | y1 | 97
| 1 |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
4 rows selected (0.157 seconds)
0: jdbc:hive2://ip:port> select * from temp_db.table1 where dt='2015-02-26'
limit 5;
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| table1.c1 | table1.c2 | table1.c3 | table1.c4 | table1.c5 | table1.c6 |
table1.c7 | t |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
| 9318 | 2015-02-26 09:22:02.0 |
100005824 | 20150228 |
15022895901544 | x1 | 97
| 6 |
| 9318 | 2015-02-26 09:22:02.0 |
100005824 | 20150228 |
15022895901544 | x1 | 97
| 6 |
| 9318 | 2015-02-26 09:22:02.0 |
100008989 | 20150228 |
15022883170834 | y1 | 97
| 1 |
| 9318 | 2015-02-26 09:22:02.0 |
100008989 | 20150228 |
15022883170834 | y1 | 97
| 1 |
| 9318 | 2015-02-26 09:22:02.0 |
100015982 | 20150228 |
15022867271422 | w1 | 60
| 1 |
+-----------------------------------+-------------------------------------+-------------------------------+----------------------------------+--------------------------------+------------------------------+------------------------------------+---+
5 rows selected (0.215 seconds)
> Wrong results with JDBC direct read of TIMESTAMP column in RCFile and ORC
> format
> --------------------------------------------------------------------------------
>
> Key: HIVE-8297
> URL: https://issues.apache.org/jira/browse/HIVE-8297
> Project: Hive
> Issue Type: Bug
> Components: CLI, JDBC
> Affects Versions: 0.13.0
> Environment: Linux
> Reporter: Doug Sedlak
>
> For the case:
> SELECT * FROM [table]
> JDBC direct reads the table backing data, versus cranking up a MR and
> creating a result set. Where table format is RCFile or ORC, incorrect
> results are delivered by JDBC direct read for TIMESTAMP columns. If you
> force a result set, correct data is returned.
> To reproduce using beeline:
> 1) Create this file as follows in HDFS.
> $ cat > /tmp/ts.txt
> 2014-09-28 00:00:00
> 2014-09-29 00:00:00
> 2014-09-30 00:00:00
> <ctrl-D>
> $ hadoop fs -copyFromLocal /tmp/ts.txt /tmp/ts.txt
> 2) In beeline load above HDFS data to a TEXTFILE table, and verify ok:
> $ beeline
> > !connect jdbc:hive2://<host>:<port>/<db> hive pass
> > org.apache.hive.jdbc.HiveDriver
> > drop table `TIMESTAMP_TEXT`;
> > CREATE TABLE `TIMESTAMP_TEXT` (`ts` TIMESTAMP) ROW FORMAT DELIMITED FIELDS
> > TERMINATED BY '\001'
> LINES TERMINATED BY '\012' STORED AS TEXTFILE;
> > LOAD DATA INPATH '/tmp/ts.txt' OVERWRITE INTO TABLE
> `TIMESTAMP_TEXT`;
> > select * from `TIMESTAMP_TEXT`;
> 3) In beeline create and load an RCFile from the TEXTFILE:
> > drop table `TIMESTAMP_RCFILE`;
> > CREATE TABLE `TIMESTAMP_RCFILE` (`ts` TIMESTAMP) stored as rcfile;
> > INSERT INTO TABLE `TIMESTAMP_RCFILE` SELECT * FROM `TIMESTAMP_TEXT`;
> 4) Demonstrate incorrect direct JDBC read versus good read by inducing result
> set creation:
> > SELECT * FROM `TIMESTAMP_RCFILE`;
> +------------------------+
> | timestamp_rcfile.ts |
> +------------------------+
> | 2014-09-30 00:00:00.0 |
> | 2014-09-30 00:00:00.0 |
> | 2014-09-30 00:00:00.0 |
> +------------------------+
> > SELECT * FROM `TIMESTAMP_RCFILE` where ts is not NULL;
> +------------------------+
> | timestamp_rcfile.ts |
> +------------------------+
> | 2014-09-28 00:00:00.0 |
> | 2014-09-29 00:00:00.0 |
> | 2014-09-30 00:00:00.0 |
> +------------------------+
> Note 1: The incorrect conduct demonstrated above replicates with a standalone
> Java/JDBC program.
>
> Note 2: Don't know if this is an issue with any other data types, also don't
> know what releases affected, however this occurs in Hive 13. Direct JDBC
> read of TEXTFILE and SEQUENCEFILE work fine. As above for RCFile and ORC
> wrong results are delivered, did not test any other file types.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)