YuanGuanhu created SPARK-36604:
----------------------------------
Summary: timestamp type column analyze result is wrong
Key: SPARK-36604
URL: https://issues.apache.org/jira/browse/SPARK-36604
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.1.2, 3.1.1
Environment: Spark 3.1.1
Reporter: YuanGuanhu
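When a table has a timestamp column, the min and max values that
ANALYZE TABLE reports for that column are wrong. In the example below, the
stored value is 2021-08-15 15:30:01, but the column statistics show
2021-08-15 07:30:01.000000 for both min and max, an 8-hour difference.
For example: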
spark-sql> select * from a;
2021-08-15 15:30:01
Time taken: 2.789 seconds, Fetched 1 row(s)
spark-sql> desc formatted a a;
col_name a
data_type timestamp
comment NULL
min 2021-08-15 07:30:01.000000
max 2021-08-15 07:30:01.000000
num_nulls 0
distinct_count 1
avg_col_len 8
max_col_len 8
histogram NULL
Time taken: 0.278 seconds, Fetched 10 row(s)
spark-sql> desc a;
a timestamp NULL
Time taken: 1.432 seconds, Fetched 1 row(s)
Steps to reproduce:
create table a(a timestamp);
insert into a select '2021-08-15 15:30:01';
analyze table a compute statistics for columns a;
desc formatted a a;
select * from a;
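
The 8-hour gap between the inserted value and the reported min/max is what you would see if the statistics were rendered in UTC while the session ran in a UTC+8 time zone. The report does not state the session time zone or identify the cause, so this is only a hypothesis. The sketch below checks the arithmetic under that assumption. ASSUMED_SESSION_TZ and render_in_utc are illustrative names, not Spark APIs.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the session time zone was UTC+8; the report does not state it.
ASSUMED_SESSION_TZ = timezone(timedelta(hours=8))


def render_in_utc(local_text: str, session_tz: timezone = ASSUMED_SESSION_TZ) -> str:
    """Interpret local_text as wall-clock time in session_tz and
    format the same instant in UTC with microsecond precision."""
    local = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=session_tz)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S.%f")


inserted = "2021-08-15 15:30:01"            # value returned by: select * from a
reported_min = "2021-08-15 07:30:01.000000"  # min/max shown by: desc formatted a a

# Under the UTC+8 assumption, the UTC rendering matches the reported statistic.
print(render_in_utc(inserted))
print(render_in_utc(inserted) == reported_min)
```

If the comparison holds for the session time zone actually in use (spark.sql.session.timeZone), that would suggest the statistics are displayed without converting back to the session time zone.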