[ 
https://issues.apache.org/jira/browse/IMPALA-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-10208:
-----------------------------------
    Labels: newbie ramp-up  (was: )

> Drop stats doesnt remove impala_intermediate_stats_num_chunks from 
> PARTITION_PARAMS
> -----------------------------------------------------------------------------------
>
>                 Key: IMPALA-10208
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10208
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Venkat Sambath
>            Priority: Minor
>              Labels: newbie, ramp-up
>         Attachments: image-2020-10-02-10-38-16-153.png, 
> image-2020-10-02-10-38-48-144.png, image-2020-10-02-10-39-18-642.png
>
>
> Steps to replicate the issue:
> Step1: 
> {code:java}
> CREATE TABLE impala_partition_test1 (                                         
>  
>    a INT                                                                      
>          
>  )                                                                            
>          
>  PARTITIONED BY (                                                             
>          
>    b STRING                                                                   
>          
>  ); alter table impala_partition_test1 add partition(b="part1");
>  alter table impala_partition_test1 add partition(b="part2");
>  alter table impala_partition_test1 add partition(b="part3");
>  alter table impala_partition_test1 add partition(b="part4");
> {code}
> Step2: Populating the partitions
> {code:java}
> for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && 
> hdfs dfs -put text_data 
> hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part1/test_${i};
>  done
>  for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && 
> hdfs dfs -put text_data 
> hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part2/test_${i};
>  done
>  for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && 
> hdfs dfs -put text_data 
> hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part3/test_${i};
>  done
>  for i in `seq 1 10`; do base64 /dev/urandom | head -c 5000K > text_data  && 
> hdfs dfs -put text_data 
> hdfs://nameservice1/user/hive/warehouse/impala_partition_test1/b=part4/test_${i};
>  done
> {code}
> Step3: Run compute incremental stats impala_partition_test1;
> Step4: In HMS DB when you run the below query 
> {code:java}
> select A.TBL_NAME, B.PART_NAME, C.PARAM_KEY, sum(length(C.PARAM_KEY) + 
> length(C.PARAM_VALUE)) from TBLS A join PARTITIONS B join PARTITION_PARAMS C 
> on A.TBL_ID = B.TBL_ID and C.PART_ID=B.PART_ID and C.PARAM_KEY like 
> "%impala_intermediate_stats%" group by A.TBL_NAME,B.PART_NAME,C.PARAM_KEY;    
> {code}
> You will be noticing
>  !image-2020-10-02-10-39-18-642.png! 
> Step5: After you drop the stats [drop stats impala_partition_test1 ] you 
> still be noticing impala_intermediate_stats_num_chunks left unremoved.
>  !image-2020-10-02-10-38-48-144.png! 
> When you have million partitions this could contribute to 37mb I suppose. 
> Requesting you to remove impala_intermediate_stats_num_chunks while we drop 
> stats from table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to