[
https://issues.apache.org/jira/browse/IMPALA-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhi tang updated IMPALA-11749:
------------------------------
Description:
Create a test table and write some data:
{code:sql}
CREATE TABLE default.test_timestamp (
f0 TIMESTAMP
)
STORED AS TEXTFILE
LOCATION 'hdfs://***/test_timestamp';
INSERT INTO default.test_timestamp VALUES("1991-04-14 00:16:00.0");
{code}
Query:
{code:sql}
SELECT f0, unix_timestamp(f0) FROM default.test_timestamp;
{code}
Result:
{code}
From Impala with timezone=Asia/Shanghai:
+---------------------+--------------------+
| f0                  | unix_timestamp(f0) |
+---------------------+--------------------+
| 1991-04-14 00:16:00 | 671558400          |
+---------------------+--------------------+
From Hive:
+------------------------+------------+--+
| f0                     | c1         |
+------------------------+------------+--+
| 1991-04-14 01:16:00.0  | 671559360  |
+------------------------+------------+--+
From HDFS: (./bin/hadoop fs -cat hdfs://***/test_timestamp/***.txt)
1991-04-14 00:16:00
{code}
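Hive and Impala return different results for the same stored value: the
displayed f0 differs by one hour (01:16:00 in Hive, 00:16:00 in Impala) and the
unix_timestamp values differ by 960 seconds. The two engines evidently handle
this timestamp differently, but I'm not sure which behavior is correct. To me,
Hive's approach seems more reasonable.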
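To make the gap easier to reason about, here is a small sketch that decodes the two epoch values from the results above. It uses only the Python standard library and fixed UTC offsets (+08:00 for China Standard Time, +09:00 for China daylight time), so it does not depend on any tz database version. The helper names `to_utc` and `at_offset` are made up for this illustration, and reading the numbers as a DST effect is an interpretation, not something confirmed by either engine's source.

```python
from datetime import datetime, timedelta, timezone

# Epoch values copied from the query results in this report.
IMPALA_EPOCH = 671558400
HIVE_EPOCH = 671559360

FMT = "%Y-%m-%d %H:%M:%S"


def to_utc(epoch_seconds: int) -> str:
    """Render an epoch value as a UTC wall-clock string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime(FMT)


def at_offset(epoch_seconds: int, hours: int) -> str:
    """Render an epoch value at a fixed UTC offset (hypothetical helper)."""
    tz = timezone(timedelta(hours=hours))
    return datetime.fromtimestamp(epoch_seconds, tz=tz).strftime(FMT)


if __name__ == "__main__":
    print("difference (s):", HIVE_EPOCH - IMPALA_EPOCH)   # 960
    print("Impala, UTC:   ", to_utc(IMPALA_EPOCH))        # 1991-04-13 16:00:00
    print("Hive, UTC:     ", to_utc(HIVE_EPOCH))          # 1991-04-13 16:16:00
    print("Impala, +08:00:", at_offset(IMPALA_EPOCH, 8))  # 1991-04-14 00:00:00
    print("Hive, +08:00:  ", at_offset(HIVE_EPOCH, 8))    # 1991-04-14 00:16:00
    print("Hive, +09:00:  ", at_offset(HIVE_EPOCH, 9))    # 1991-04-14 01:16:00
```

Read this way, Hive's epoch corresponds to the stored 00:16:00 at +08:00 and to the displayed 01:16:00 at +09:00, while Impala's epoch corresponds to 00:00:00 at +08:00, i.e. the 16 minutes are missing from its result.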
> Impala doesn't handle China Daylight saving time properly
> ---------------------------------------------------------
>
> Key: IMPALA-11749
> URL: https://issues.apache.org/jira/browse/IMPALA-11749
> Project: IMPALA
> Issue Type: Bug
> Reporter: zhi tang
> Priority: Critical