Thomas Rebele created HIVE-29432:
------------------------------------

             Summary: Statistics missing for tables with a TIMESTAMP WITH LOCAL TIME ZONE
                 Key: HIVE-29432
                 URL: https://issues.apache.org/jira/browse/HIVE-29432
             Project: Hive
          Issue Type: Bug
    Affects Versions: 4.3.0
            Reporter: Thomas Rebele


Given the following qfile:
{code:java}
set hive.stats.kll.enable=true;
set metastore.stats.fetch.bitvector=true;
set metastore.stats.fetch.kll=true;
set hive.stats.autogather=true;
set hive.stats.column.autogather=true;

CREATE TABLE test_stats0 (a int, b timestamp) STORED AS TEXTFILE;
CREATE TABLE test_stats1 (a int, b timestamp with local time zone) STORED AS TEXTFILE;

INSERT INTO test_stats0 (a, b) VALUES (1, "2020-11-02 00:00:00");
INSERT INTO test_stats1 (a, b) VALUES (1, "2020-11-02 00:00:00");

DESCRIBE FORMATTED test_stats0 a;
DESCRIBE FORMATTED test_stats0 b;

DESCRIBE FORMATTED test_stats1 a;
DESCRIBE FORMATTED test_stats1 b;
{code}
The statistics for test_stats0 column a are computed successfully:
{code:java}
POSTHOOK: Input: default@test_stats0
col_name                a                   
data_type               int                 
min                     1                   
max                     1                   
num_nulls               0                   
distinct_count          1                   
avg_col_len                                 
max_col_len                                 
num_trues                                   
num_falses                                  
bit_vector              HL                  
histogram               Q1: 1, Q2: 1, Q3: 1 
{code}
However, the statistics for test_stats1 column a are missing:
{code:java}
POSTHOOK: Input: default@test_stats1
col_name                a                   
data_type               int                 
min                                         
max                                         
num_nulls                                   
distinct_count                              
avg_col_len                                 
max_col_len                                 
num_trues                                   
num_falses                                  
bit_vector                                  
histogram                              
{code}
The same holds for column b: its stats are available for table test_stats0, but 
not for test_stats1.

Even if the stats for a TIMESTAMP WITH LOCAL TIME ZONE column cannot be 
calculated, it should not affect the other columns.
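
A possible way to narrow this down, offered as an untested sketch rather than a confirmed workaround: request column statistics explicitly for column a only, so that the TIMESTAMP WITH LOCAL TIME ZONE column is left out of the computation, and then check whether the stats appear. This assumes the failure originates in gathering stats for column b, which has not been verified; the table is the one from the qfile above.
{code:sql}
-- Compute column statistics for column a only, leaving out column b.
ANALYZE TABLE test_stats1 COMPUTE STATISTICS FOR COLUMNS a;

-- Check whether min, max, num_nulls, etc. are now populated for column a.
DESCRIBE FORMATTED test_stats1 a;
{code}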
