konstantinb commented on PR #6505:
URL: https://github.com/apache/hive/pull/6505#issuecomment-4580556255
@zabetak there are two more considerations. The first concerns "const null"
column statistics: buildColStatForConstant() assigns them an NDV of 0, while
the Hive metastore saves such columns with an NDV of 1:
```
CREATE TABLE test_const_null_ndv (i INT, s STRING) STORED AS ORC;
INSERT INTO test_const_null_ndv VALUES (NULL, 'a'), (NULL, 'b');
DESCRIBE FORMATTED test_const_null_ndv i;
```
results in the describe output of
```
POSTHOOK: Input: default@test_const_null_ndv
col_name                i
data_type               int
min
max
num_nulls               2
distinct_count          1
avg_col_len
max_col_len
num_trues
num_falses
bit_vector              HL
comment                 from deserializer
COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"i\":\"true\",\"s\":\"true\"}}
```
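
The mismatch can be pictured with a minimal Java sketch. This is an
illustration only: the class, the nested type, and both method names are
hypothetical and do not exist in Hive. The only facts it encodes are the ones
stated above, namely that the constant-NULL path reports an NDV of 0 and that
the metastore reports num_nulls 2 and distinct_count 1 for column i.

```java
// Hypothetical illustration; none of these names are part of Hive.
public class ConstNullNdvSketch {

    /** Minimal stand-in for a column-statistics record. */
    static final class ColStat {
        final long numNulls;
        final long ndv;

        ColStat(long numNulls, long ndv) {
            this.numNulls = numNulls;
            this.ndv = ndv;
        }
    }

    /** Models the constant-NULL case as described: every row is NULL, NDV is 0. */
    static ColStat forConstantNull(long numRows) {
        return new ColStat(numRows, 0L);
    }

    /** Models the DESCRIBE FORMATTED output above: num_nulls 2, distinct_count 1. */
    static ColStat fromMetastoreExample() {
        return new ColStat(2L, 1L);
    }

    public static void main(String[] args) {
        ColStat constant = forConstantNull(2);
        ColStat metastore = fromMetastoreExample();
        // Both sources agree on the null count but disagree on the NDV.
        System.out.println("num_nulls match: " + (constant.numNulls == metastore.numNulls));
        System.out.println("ndv match: " + (constant.ndv == metastore.ndv));
    }
}
```

Running it prints "num_nulls match: true" followed by "ndv match: false",
which is the inconsistency in question.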
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]