[
https://issues.apache.org/jira/browse/IMPALA-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jim Apple resolved IMPALA-6620.
-------------------------------
Resolution: Duplicate
Duplicates IMPALA-5615
> Compute incremental stats for groups of partitions does not update stats
> correctly
> ----------------------------------------------------------------------------------
>
> Key: IMPALA-6620
> URL: https://issues.apache.org/jira/browse/IMPALA-6620
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 2.8.0
> Environment: Impala - v2.8.0-cdh5.11.1
> We are using the embedded Hive Metastore database (provided by Cloudera).
> It is PostgreSQL 8.4.20.
> OS: CentOS
> Reporter: H Milyakov
> Priority: Major
>
> Executing COMPUTE INCREMENTAL STATS `table` PARTITION (`partition clause`)
> does not compute statistics correctly (computes 0) when `partition clause`
> matches more than one partition.
> Executing the same command when `partition clause` matches just a single
> partition computes the statistics correctly (neither 0 nor -1).
> The issue was observed on our production cluster for a table with 40 000
> partitions and 20 columns.
> I copied the table to a separate, isolated cluster and observed the same
> behaviour.
> We use Impala 2.8.0 in Cloudera CDH 5.11
> The issue can be reproduced with the following steps:
> 1. CREATE TABLE my_test_table ( some_ints BIGINT )
> PARTITIONED BY ( part_1 BIGINT, part_2 STRING )
> STORED AS PARQUET;
>
> 2. The only column, 'some_ints', is populated so that there are 10 000
> distinct (part_1, part_2) partitions.
> The total number of records in the table does not matter and can equal
> the number of partitions.
>
> 3. Running COMPUTE INCREMENTAL STATS as described above then reproduces
> the issue.
> Has anybody faced a similar issue, or does anyone have more information
> on this case?
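
The reproduction steps quoted above could be scripted roughly as follows. This is a minimal sketch and not part of the original report: the helper name build_repro_sql, the partition values ('p0', 'p1', ...), and the predicate part_1 < 10 are assumptions chosen for illustration. Only the table definition and the COMPUTE INCREMENTAL STATS ... PARTITION (...) form come from the report. The script only prints Impala SQL; it does not connect to a cluster.

```python
# Hypothetical helper that emits Impala SQL following the three quoted steps.
# Table and column names come from the report; partition values and the
# predicate in the COMPUTE statement are illustrative assumptions.

def build_repro_sql(num_partitions=10000, table="my_test_table"):
    # Step 1: the table definition as given in the report.
    statements = [
        f"CREATE TABLE {table} ( some_ints BIGINT ) "
        "PARTITIONED BY ( part_1 BIGINT, part_2 STRING ) "
        "STORED AS PARQUET;"
    ]
    # Step 2: one row per distinct (part_1, part_2) pair, so the number of
    # records equals the number of partitions.
    for i in range(num_partitions):
        statements.append(
            f"INSERT INTO {table} PARTITION (part_1={i}, part_2='p{i}') "
            f"VALUES ({i});"
        )
    # Step 3: a partition clause that matches more than one partition,
    # followed by a statement to inspect the resulting statistics.
    statements.append(
        f"COMPUTE INCREMENTAL STATS {table} PARTITION (part_1 < 10);"
    )
    statements.append(f"SHOW TABLE STATS {table};")
    return statements


if __name__ == "__main__":
    print("\n".join(build_repro_sql()))
```

The printed statements can be saved to a file and run through impala-shell. Single-row inserts are slow at this scale, so a smaller num_partitions may be more practical, although the report does not say whether fewer partitions still trigger the problem.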