[jira] [Comment Edited] (IMPALA-7501) Slim down metastore Partition objects in LocalCatalog cache

Paul Rogers (JIRA) Wed, 26 Sep 2018 16:22:41 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629410#comment-16629410
 ]


Paul Rogers edited comment on IMPALA-7501 at 9/26/18 11:21 PM:
---------------------------------------------------------------

Analysis:

* Impala's {{LocalCatalog}} contains a list of {{FeDb}} objects.
* Impala's {{LocalDb}}, which extends {{FeDb}}, contains a map of {{LocalTable}} 
objects.
* Impala's {{LocalTable}} contains a Hive {{Table}} object.
* The {{Table}} object is defined in [Hive's Thrift 
schema|https://github.com/apache/hive/blob/3287a097e31063cc805ca55c2ca7defffe761b6f/standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift]
 API. It does not contain a list of partitions.
* The {{LocalTable}} wraps a number of subclasses, of which the one of interest 
is {{HdfsTable}}.

Impala loads tables in the background by calling {{HdfsTable.load()}}:

* {{load()}} calls {{loadAllPartitions()}} to do the partition work.
* {{loadAllPartitions()}} calls {{MetaStoreUtil.fetchAllPartitions()}} to get the 
partitions as a list of Hive {{Partition}} objects.
* {{loadAllPartitions()}} wraps each in an {{HdfsPartition}} and calls 
{{addPartition()}} to put the partition into a couple of maps.
* Hive's 
[{{Partition}}|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Partition.java]
 is generated from Thrift and contains a {{StorageDescriptor}}.
* Hive's 
[{{StorageDescriptor}}|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/StorageDescriptor.java]
 contains the list of {{FieldSchema}} objects which Todd saw in the heap dump.
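
To illustrate why this load path retains so many {{FieldSchema}} objects, here is a minimal sketch. It does not use the real Hive or Impala classes: {{FieldSchemaSketch}}, {{StorageDescriptorSketch}}, {{PartitionSketch}} and {{LoadFlowSketch}} are hypothetical stand-ins, and the partition and column counts are made up. The only point is that when every partition carries its own storage descriptor with its own column list, the retained object count grows as partitions times columns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Hive's Thrift-generated FieldSchema.
final class FieldSchemaSketch {
  final String name;
  final String type;

  FieldSchemaSketch(String name, String type) {
    this.name = name;
    this.type = type;
  }
}

// Hypothetical stand-in for StorageDescriptor: note the per-instance column list.
final class StorageDescriptorSketch {
  final List<FieldSchemaSketch> cols = new ArrayList<>();
}

// Hypothetical stand-in for the metastore Partition.
final class PartitionSketch {
  final long id;
  final StorageDescriptorSketch sd = new StorageDescriptorSketch();

  PartitionSketch(long id) {
    this.id = id;
  }
}

final class LoadFlowSketch {
  // Simulates the metastore fetch: every partition gets its own column copies.
  static List<PartitionSketch> fetchAllPartitions(int numPartitions, int numCols) {
    List<PartitionSketch> result = new ArrayList<>();
    for (int p = 0; p < numPartitions; p++) {
      PartitionSketch part = new PartitionSketch(p);
      for (int c = 0; c < numCols; c++) {
        part.sd.cols.add(new FieldSchemaSketch("col" + c, "string"));
      }
      result.add(part);
    }
    return result;
  }

  // Simulates the "add each partition to a map" step of the load.
  static Map<Long, PartitionSketch> loadAllPartitions(List<PartitionSketch> fetched) {
    Map<Long, PartitionSketch> byId = new HashMap<>();
    for (PartitionSketch part : fetched) {
      byId.put(part.id, part);
    }
    return byId;
  }

  // Counts the FieldSchema stand-ins retained through the partition map.
  static long countRetainedFieldSchemas(Map<Long, PartitionSketch> partitions) {
    long total = 0;
    for (PartitionSketch part : partitions.values()) {
      total += part.sd.cols.size();
    }
    return total;
  }

  public static void main(String[] args) {
    Map<Long, PartitionSketch> cached = loadAllPartitions(fetchAllPartitions(1000, 50));
    System.out.println("FieldSchema objects retained: " + countRetainedFieldSchemas(cached));
  }
}
```

With the made-up figures of 1,000 partitions and 50 columns, the sketch retains 50,000 column objects for a single table, even though every partition describes the same columns.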

Things are a bit confusing because:

* Hive defines a different 
[{{Table}}|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java]
 class, which contains a {{TableSpec}}.
* Hive's 
[{{TableSpec}}|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java]
 contains a list of {{Partition}} objects.

A quick scan of the Hive code suggests that Hive's Thrift objects carry more 
info than is required in the Impala cache. Creating Impala-specific, 
high-performance versions would likely save space. (No need for parent 
pointers, no need for the two-level Hive API structure, etc.)

So, this gives us two options:

* Reach inside Hive's Thrift objects to null out fields which we don't need, or
* Design an Impala-specific, compact representation for the data that omits all 
but essential objects and fields.

The second choice provides a huge opportunity for memory optimization. The 
first is a crude-but-effective short-term solution.
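
A rough sketch of how the two options might look, under stated assumptions. The {{Ms*}} classes below are hypothetical stand-ins for Hive's Thrift-generated classes, and {{CompactPartition}} and {{SlimmingSketch}} are invented names for illustration. The choice of retained fields (partition values, location, input format) is also an assumption, not a verified list of what the Impala planner needs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for Hive's Thrift-generated metastore classes.
final class MsFieldSchema {
  final String name;
  final String type;

  MsFieldSchema(String name, String type) {
    this.name = name;
    this.type = type;
  }
}

final class MsStorageDescriptor {
  List<MsFieldSchema> cols = new ArrayList<>();
  String location;
  String inputFormat;
}

final class MsPartition {
  List<String> values = new ArrayList<>();
  MsStorageDescriptor sd = new MsStorageDescriptor();
}

// Option 2: a hypothetical compact, immutable class that copies only the
// fields assumed to be needed and drops everything else.
final class CompactPartition {
  final List<String> values;
  final String location;
  final String inputFormat;

  private CompactPartition(List<String> values, String location, String inputFormat) {
    this.values = values;
    this.location = location;
    this.inputFormat = inputFormat;
  }

  static CompactPartition fromMetastore(MsPartition p) {
    return new CompactPartition(
        Collections.unmodifiableList(new ArrayList<>(p.values)),
        p.sd.location,
        p.sd.inputFormat);
  }
}

final class SlimmingSketch {
  // Option 1: keep the metastore object but null out the column list in place.
  static void stripColumns(MsPartition p) {
    if (p.sd != null) {
      p.sd.cols = null;
    }
  }

  // Builds a small made-up partition for demonstration.
  static MsPartition samplePartition() {
    MsPartition p = new MsPartition();
    p.values.add("2018");
    p.sd.location = "hdfs://nn/warehouse/t/year=2018";
    p.sd.inputFormat = "parquet";
    p.sd.cols.add(new MsFieldSchema("id", "bigint"));
    p.sd.cols.add(new MsFieldSchema("name", "string"));
    return p;
  }

  public static void main(String[] args) {
    MsPartition p = samplePartition();
    CompactPartition compact = CompactPartition.fromMetastore(p);
    stripColumns(p);
    System.out.println("cols after strip: " + p.sd.cols);
    System.out.println("compact location: " + compact.location);
  }
}
```

Option 1 leaves the rest of the metastore object graph in place and only releases the column list, while option 2 retains nothing beyond the copied fields. That difference is why the second is the larger memory win and the first the quicker fix.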


was (Author: paul.rogers):
Analysis:

* Impala's {{LocalCatalog}} contains a list of {{FeDb}} objects.
* Impala's {{LocalDb}}, which extends {{FeDb}}, contains a map of {{LocalTable}} 
objects.
* Impala's {{LocalTable}} contains a Hive {{Table}} object.
* The {{Table}} object is defined in [Hive's Thrift 
schema|https://github.com/apache/hive/blob/3287a097e31063cc805ca55c2ca7defffe761b6f/standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift]
 API. It does not contain a list of partitions.

Things here get complex because Cloudera does not provide source jars for its 
build of Hive, so I can't step into or set breakpoints in Hive code.

Impala loads tables in the background by calling {{HdfsTable.load()}}:

* {{load()}} calls {{MetaStoreUtil.fetchAllPartitions()}} to get the partitions.
* The list of Hive {{Partition}} objects is passed to 
{{HdfsTable.loadAllPartitions()}}.
* Hive's 
[{{Partition}}|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Partition.java]
 is generated from Thrift. Contains a {{StorageDescriptor}}.
* Hive's 
[{{StorageDescriptor}}|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/StorageDescriptor.java]
 contains the list of {{FieldSchema}} objects which Todd saw in the heap dump.

Things are a bit confusing because:

* Hive defines a different 
[{{Table}}|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java]
 class, which contains a {{TableSpec}}.
* Hive's 
[{{TableSpec}}|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java]
 contains a list of {{Partition}} objects.

A quick scan of the Hive code suggests that Hive's Thrift objects carry more 
info than is required in the Impala cache. Creating Impala-specific, 
high-performance versions would likely save space. (No need for parent 
pointers, no need for the two-level Hive API structure, etc.)

So, this gives us two options:

* Reach inside Hive's Thrift objects to null out fields which we don't need, or
* Design an Impala-specific, compact representation for the data that omits all 
but essential objects and fields.

The second choice provides a huge opportunity for memory optimization. The 
first is a crude-but-effective short-term solution.

> Slim down metastore Partition objects in LocalCatalog cache
> -----------------------------------------------------------
>
>                 Key: IMPALA-7501
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7501
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Todd Lipcon
>            Priority: Minor
>
> I took a heap dump of an impalad running in LocalCatalog mode with a 2G limit 
> after running a production workload simulation for a couple hours. It had 
> 38.5M objects and 2.02GB heap (the vast majority of the heap is, as expected, 
> in the LocalCatalog cache). Of this total footprint, 1.78GB and 34.6M objects 
> are retained by 'Partition' objects. Drilling into those, 1.29GB and 33.6M 
> objects are retained by FieldSchema, which, as far as I remember, are ignored 
> on the partition level by the Impala planner. So, with a bit of slimming down 
> of these objects, we could make a huge dent in effective cache capacity given 
> a fixed budget. Reducing object count should also have the effect of improved 
> GC performance (old gen GC is more closely tied to object count than size).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
