Hello Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11341

to look at the new patch set (#6).

Change subject: IMPALA-7424: Reduce in-memory footprint of incremental stats
......................................................................

IMPALA-7424: Reduce in-memory footprint of incremental stats

Currently, incremental stats are stored as chunked Base64 strings in the
HMS parameters map of partition objects. When held in catalogd, each of
these strings is a Java 'String' object that uses UTF-16 encoding and
takes up to 2 bytes per character.

This patch converts the string representation into a deflate-compressed byte
array when the partition is loaded in catalogd, and the stats stay in this
form when they are transmitted to the coordinators. To maintain
backward compatibility, the persistent HMS representation of stats has not
been modified, so the incremental stats are still written back in the
chunked Base64 representation when the partition state is serialized to
HMS.
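
For illustration only, the round trip described above can be sketched with
the JDK's java.util.zip classes. This is not the code in CompressionUtil.java;
the class name DeflateSketch, its method names, and the choice of UTF-8 for
the intermediate bytes are assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not the patch's CompressionUtil.
public class DeflateSketch {
  // Encodes the string as UTF-8 bytes and DEFLATE-compresses them.
  public static byte[] compress(String s) {
    byte[] input = s.getBytes(StandardCharsets.UTF_8);
    Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    while (!deflater.finished()) {
      int n = deflater.deflate(buf);
      out.write(buf, 0, n);
    }
    deflater.end();
    return out.toByteArray();
  }

  // Inflates the bytes and decodes them back into the original string.
  public static String decompress(byte[] compressed) {
    Inflater inflater = new Inflater();
    inflater.setInput(compressed);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(buf);
        // No progress and nothing left to read means the input is incomplete.
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          throw new IllegalArgumentException("Truncated or unsupported input");
        }
        out.write(buf, 0, n);
      }
    } catch (DataFormatException e) {
      throw new IllegalArgumentException("Input is not valid DEFLATE data", e);
    } finally {
      inflater.end();
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }
}
```

In a scheme like this the compressed byte[] is what stays in memory, and
decompress() is only called when the original string is needed again, for
example when writing back to HMS.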

On a real-world catalogserver whose memory footprint is dominated by
incremental stats, this patch showed a ~54% end-to-end heap size reduction
and a ~79% reduction in the memory footprint of incremental stats data
structures.

This patch also improves how callers check whether a partition has
incremental stats: the information is computed once and reused later.
Without the patch, we deserialize the entire incremental stats structure
every time this information is needed, which triggers a spike in
working-memory usage on catalogds/Impalads.
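
As a rough sketch of the compute-once idea in the paragraph above, a partition
object could record the answer in a boolean when it is constructed instead of
deserializing the stats on each check. The class PartitionSketch and the key
prefix below are hypothetical and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a partition object; not the patch's HdfsPartition.
public class PartitionSketch {
  // Assumed key prefix for illustration; the real parameter names may differ.
  public static final String STATS_KEY_PREFIX = "example_incremental_stats_chunk";

  private final Map<String, String> params;
  // Computed once at construction and reused by every caller.
  private final boolean hasIncrementalStats;

  public PartitionSketch(Map<String, String> params) {
    this.params = new HashMap<>(params);
    boolean found = false;
    for (String key : this.params.keySet()) {
      if (key.startsWith(STATS_KEY_PREFIX)) {
        found = true;
        break;
      }
    }
    this.hasIncrementalStats = found;
  }

  // Cheap check: returns the cached flag without touching the stats payload.
  public boolean hasIncrementalStats() {
    return hasIncrementalStats;
  }
}
```

Callers that only need a yes/no answer read the flag, so no stats structure
is materialized for the check.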

Testing: Ran core tests on the Catalog V1 implementation. Ran some manual
queries on the Catalog V2 implementation.

Change-Id: I39f02ebfa0c6e9b0baedd0d76058a1b34efb5a02
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A fe/src/main/java/org/apache/impala/util/CompressionUtil.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoTest.java
A fe/src/test/java/org/apache/impala/util/CompressionUtilTest.java
17 files changed, 368 insertions(+), 131 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/11341/6
--
To view, visit http://gerrit.cloudera.org:8080/11341
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I39f02ebfa0c6e9b0baedd0d76058a1b34efb5a02
Gerrit-Change-Number: 11341
Gerrit-PatchSet: 6
Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
