[jira] [Resolved] (IMPALA-6620) Compute incremental stats for groups of partitions does not update stats correctly

Jim Apple (JIRA) Mon, 02 Apr 2018 08:59:14 -0700

     [ https://issues.apache.org/jira/browse/IMPALA-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Jim Apple resolved IMPALA-6620.
-------------------------------
    Resolution: Duplicate

Duplicates IMPALA-5615

> Compute incremental stats for groups of partitions does not update stats 
> correctly
> ----------------------------------------------------------------------------------
>
>                 Key: IMPALA-6620
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6620
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.8.0
>         Environment: Impala v2.8.0-cdh5.11.1 
> We are using the embedded Hive Metastore database (provided by Cloudera), 
> which is PostgreSQL 8.4.20. 
> OS: CentOS 
>            Reporter: H Milyakov
>            Priority: Major
>
> Executing COMPUTE INCREMENTAL STATS `table` PARTITION (`partition clause`) 
> does not compute statistics correctly (computes 0) when `partition clause` 
> matches more than one partition.
> Executing the same command when `partition clause` matches just a single 
> partition results in statistics being computed correctly (neither 0 nor -1).
> The issue was observed on our production cluster for a table with 40 000 
> partitions and 20 columns.
> I copied the table to a separate, isolated cluster and observed the same 
> behaviour.
> We use Impala 2.8.0 in Cloudera CDH 5.11.
> The issue can be reproduced as follows:
>  1. CREATE TABLE my_test_table ( some_ints BIGINT )
>  PARTITIONED BY ( part_1 BIGINT, part_2 STRING ) 
>  STORED AS PARQUET;
>  
>  2. The only column, 'some_ints', is populated so that there are 10 000 
> distinct (part_1, part_2) partitions.
>  The total number of records in the table does not matter and can be the 
> same as the number of distinct partitions.
>  
>  3. Running COMPUTE INCREMENTAL STATS as described above then reproduces 
> the issue.
> Has anybody faced a similar issue, or does anyone have more information on 
> this case?
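
For reference, the three reproduction steps quoted above could be scripted roughly as follows. This is only a sketch, not the reporter's actual script: the helper name `build_repro_sql`, the partition values (`part_1=i`, `part_2='p<i>'`), the one-row-per-partition inserts, and the predicate `part_1 < 10` used to match several partitions are all assumptions made for illustration. Only the table definition and the partition count come from the report.

```python
# Hypothetical helper that emits Impala SQL following the steps in the report.
# Table and column names are taken from the report; the partition values and
# the multi-partition predicate are assumptions chosen for illustration.

NUM_PARTITIONS = 10_000  # the report mentions 10 000 distinct partitions


def build_repro_sql(num_partitions=NUM_PARTITIONS):
    """Return the reproduction statements as a list of SQL strings."""
    statements = [
        # Step 1: table definition as given in the report.
        "CREATE TABLE my_test_table ( some_ints BIGINT ) "
        "PARTITIONED BY ( part_1 BIGINT, part_2 STRING ) "
        "STORED AS PARQUET"
    ]
    # Step 2: one row per partition, so rows == partitions (assumed layout).
    for i in range(num_partitions):
        statements.append(
            f"INSERT INTO my_test_table "
            f"PARTITION (part_1={i}, part_2='p{i}') VALUES ({i})"
        )
    # Step 3: a partition clause that matches more than one partition
    # (assumed predicate), followed by a check of the per-partition row counts.
    statements.append(
        "COMPUTE INCREMENTAL STATS my_test_table PARTITION (part_1 < 10)"
    )
    statements.append("SHOW TABLE STATS my_test_table")
    return statements


if __name__ == "__main__":
    # Print a script that can be fed to a SQL client such as impala-shell.
    print(";\n".join(build_repro_sql()) + ";")
```

Issuing one INSERT per partition is slow for 10 000 partitions; it is used here only because it keeps each statement simple and makes the resulting partition layout explicit.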



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
