Thomas Rebele created HIVE-29432:
------------------------------------
Summary: Statistics missing for tables with a TIMESTAMP WITH LOCAL TIME ZONE
Key: HIVE-29432
URL: https://issues.apache.org/jira/browse/HIVE-29432
Project: Hive
Issue Type: Bug
Affects Versions: 4.3.0
Reporter: Thomas Rebele
Given the following qfile:
{code:java}
set hive.stats.kll.enable=true;
set metastore.stats.fetch.bitvector=true;
set metastore.stats.fetch.kll=true;
set hive.stats.autogather=true;
set hive.stats.column.autogather=true;
CREATE TABLE test_stats0 (a int, b timestamp) STORED AS TEXTFILE;
CREATE TABLE test_stats1 (a int, b timestamp with local time zone) STORED AS TEXTFILE;
INSERT INTO test_stats0 (a, b) VALUES (1, "2020-11-02 00:00:00");
INSERT INTO test_stats1 (a, b) VALUES (1, "2020-11-02 00:00:00");
DESCRIBE FORMATTED test_stats0 a;
DESCRIBE FORMATTED test_stats0 b;
DESCRIBE FORMATTED test_stats1 a;
DESCRIBE FORMATTED test_stats1 b;
{code}
The statistics for test_stats0 column a are computed successfully:
{code:java}
POSTHOOK: Input: default@test_stats0
col_name a
data_type int
min 1
max 1
num_nulls 0
distinct_count 1
avg_col_len
max_col_len
num_trues
num_falses
bit_vector HL
histogram Q1: 1, Q2: 1, Q3: 1
{code}
However, the statistics for test_stats1 column a are missing:
{code:java}
POSTHOOK: Input: default@test_stats1
col_name a
data_type int
min
max
num_nulls
distinct_count
avg_col_len
max_col_len
num_trues
num_falses
bit_vector
histogram
{code}
The same applies to column b: its statistics are available for test_stats0 but
not for test_stats1.
Even if statistics cannot be computed for a TIMESTAMP WITH LOCAL TIME ZONE
column, that failure should not prevent statistics from being gathered for the
other columns of the table.
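A possible follow-up check, offered only as a sketch and not verified against 4.3.0: gather the column statistics explicitly for the int column alone and describe it again. Whether this succeeds is an assumption. If the statistics for a then appear, the problem is probably limited to automatic gathering on insert.
{code:sql}
-- Explicitly gather column statistics for the int column only,
-- leaving the TIMESTAMP WITH LOCAL TIME ZONE column b out of the request.
ANALYZE TABLE test_stats1 COMPUTE STATISTICS FOR COLUMNS a;
-- Check whether min, max, num_nulls and distinct_count are now populated.
DESCRIBE FORMATTED test_stats1 a;
{code}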