Bharath Vissapragada has uploaded this change for review. (
http://gerrit.cloudera.org:8080/11341 )


Change subject: IMPALA-7424: Optimize in-memory representation of incremental 
stats
......................................................................

IMPALA-7424: Optimize in-memory representation of incremental stats

Currently, incremental stats are stored as chunked Base64 strings in the
HMS parameters map of partition objects. Each of these strings is a Java
'String' object that uses UTF-16 encoding and takes up to 2 bytes per
character.
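
As a rough illustration of that overhead, the sketch below estimates the
character data held by a Base64 string when every character costs 2
bytes. The class and method names are hypothetical and are not part of
the patch; object headers and HMS chunking are ignored.

```java
import java.util.Base64;

// Hypothetical illustration only; not code from the patch.
final class Base64FootprintSketch {
  // Encode raw stats bytes the way a Base64 text field would hold them.
  static String encode(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  // Bytes of character data, assuming 2 bytes per character (UTF-16).
  // Object headers and array overhead are deliberately left out.
  static long utf16CharBytes(String s) {
    return 2L * s.length();
  }
}
```

Base64 maps every 3 payload bytes to 4 characters, so under the
2-bytes-per-character assumption a 3,000-byte payload becomes 4,000
characters, or about 8,000 bytes of character data.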

This patch converts the string representation into a gzipped byte array
when the partition is loaded in the catalogd, and the stats stay in this
form when they are transmitted to the coordinators. To maintain
backward compatibility, the on-disk HMS representation of stats has not
been modified: the incremental stats are still written back in the
chunked Base64 representation when the partition state is serialized to
HMS.
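
The load and write-back paths described above can be sketched roughly as
follows. This is a minimal sketch, not the CompressionUtil.java added by
the patch: the class and method names are hypothetical, the chunks are
assumed to be already concatenated into one Base64 string, and decoding
Base64 before compressing is an assumption about the order of steps.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; the real patch may structure this differently.
final class IncrementalStatsCodecSketch {
  // Gzip a byte array entirely in memory.
  static byte[] gzip(byte[] input) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(input);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // The gzip stream is closed here, so the trailer has been written.
    return bos.toByteArray();
  }

  // Inverse of gzip(): inflate the bytes back to their original form.
  static byte[] gunzip(byte[] compressed) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPInputStream gz =
        new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buf = new byte[4096];
      int n;
      while ((n = gz.read(buf)) != -1) {
        bos.write(buf, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  // Load path (assumed): Base64 text from HMS -> raw bytes -> gzipped bytes.
  static byte[] fromHmsString(String base64) {
    return gzip(Base64.getDecoder().decode(base64));
  }

  // Write-back path (assumed): gzipped bytes -> raw bytes -> Base64 text.
  static String toHmsString(byte[] compressed) {
    return Base64.getEncoder().encodeToString(gunzip(compressed));
  }
}
```

Because the write-back path regenerates the same Base64 text, a reader
that only understands the existing HMS format would see no difference
under these assumptions.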

On a real-world catalog server whose memory footprint is dominated by
incremental stats, this patch showed a ~54% end-to-end heap size
reduction and a ~79% reduction in the memory footprint of the
incremental stats data structures.

This patch also optimizes the way callers check whether a partition has
incremental stats by computing this information once and reusing it
later. Without the patch, we deserialize the entire incremental stats
structure every time this information is needed, which triggers a spike
in working memory usage on catalogds/Impalads.
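
One way to picture the compute-once idea is a holder that derives the
flag when the stats are attached and answers later checks from that
cached value. The class below is a hypothetical sketch, not the
HdfsPartition code from the patch, and it assumes that a non-empty
compressed byte array is what marks a partition as having incremental
stats.

```java
// Hypothetical holder; names and the non-empty-array rule are assumptions.
final class PartitionStatsHolderSketch {
  // Compressed incremental stats, or null if the partition has none.
  private final byte[] compressedStats;
  // Computed once in the constructor and reused by every later check.
  private final boolean hasIncrementalStats;

  PartitionStatsHolderSketch(byte[] compressedStats) {
    this.compressedStats = compressedStats;
    this.hasIncrementalStats =
        compressedStats != null && compressedStats.length > 0;
  }

  // Cheap check: no decompression or deserialization happens here.
  boolean hasIncrementalStats() {
    return hasIncrementalStats;
  }

  // Callers that need the full structure decompress these bytes themselves.
  byte[] getCompressedStats() {
    return compressedStats;
  }
}
```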

Testing: Ran core tests on the Catalog V1 implementation. Ran some
manual queries on the Catalog V2 implementation.

Change-Id: I39f02ebfa0c6e9b0baedd0d76058a1b34efb5a02
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A fe/src/main/java/org/apache/impala/util/CompressionUtil.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
A fe/src/test/java/org/apache/impala/util/CompressionUtilTest.java
17 files changed, 342 insertions(+), 110 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/11341/2
--
To view, visit http://gerrit.cloudera.org:8080/11341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I39f02ebfa0c6e9b0baedd0d76058a1b34efb5a02
Gerrit-Change-Number: 11341
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com>
