[jira] [Comment Edited] (IMPALA-7501) Slim down metastore Partition objects in LocalCatalog cache

Paul Rogers (JIRA) Thu, 27 Sep 2018 15:38:17 -0700


    [ 
https://issues.apache.org/jira/browse/IMPALA-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629621#comment-16629621
 ]


Paul Rogers edited comment on IMPALA-7501 at 9/27/18 10:37 PM:
---------------------------------------------------------------

The path to the HMS {{Partition}} objects appears to be:

* {{HdfsTable}} holds onto a set of {{FeFsPartition}} objects.
* In local catalog mode, the {{FeFsPartition}} is an instance of 
{{LocalFsPartition}}.
* {{LocalFsPartition}} holds onto the HMS {{Partition}} objects.
* {{Partition}} holds onto a {{StorageDescriptor}} which holds onto a list of 
the {{FieldSchema}} objects that Todd noted.

However, there is no obvious path that causes code to hold onto the 
{{LocalFsPartition}} objects; in the local catalog implementation, they are 
converted to Thrift format, then discarded. It is not clear how the 
{{FeFsPartition}} objects are recreated for a query. The available tests don’t 
exercise this path.

Perhaps the code has changed since this issue was reported?

No code in {{LocalFsPartition}} accesses the columns stored in the 
{{StorageDescriptor}} associated with the {{Partition}}, so it is probably safe 
to nuke them. Added the following to the {{LocalFsPartition}} constructor:

{noformat}
msPartition_.getSd().unsetCols();
{noformat}

Rerunning the {{LocalCatalogTest}} cases showed no issues.
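
As a rough illustration of the idea (not the actual Impala or Hive code), the sketch below uses hypothetical stand-in classes ({{FieldSchemaStub}}, {{StorageDescriptorStub}}, {{PartitionStub}}, {{LocalFsPartitionSketch}}) that mimic only the reference chain described above. The {{unsetCols()}} stand-in simply nulls the column list, which is my understanding of what the Thrift-generated method does; treat that as an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the HMS FieldSchema (name and type only).
final class FieldSchemaStub {
  final String name;
  final String type;
  FieldSchemaStub(String name, String type) { this.name = name; this.type = type; }
}

// Hypothetical stand-in for the HMS StorageDescriptor.
final class StorageDescriptorStub {
  private List<FieldSchemaStub> cols;
  private final String location;
  StorageDescriptorStub(List<FieldSchemaStub> cols, String location) {
    this.cols = cols;
    this.location = location;
  }
  List<FieldSchemaStub> getCols() { return cols; }
  String getLocation() { return location; }
  boolean isSetCols() { return cols != null; }
  // Assumption: dropping the reference is all that is needed for the
  // column list to become collectable.
  void unsetCols() { cols = null; }
}

// Hypothetical stand-in for the HMS Partition.
final class PartitionStub {
  private final StorageDescriptorStub sd;
  PartitionStub(StorageDescriptorStub sd) { this.sd = sd; }
  StorageDescriptorStub getSd() { return sd; }
}

// Sketch of a wrapper that slims the partition when it is constructed.
final class LocalFsPartitionSketch {
  private final PartitionStub msPartition_;
  LocalFsPartitionSketch(PartitionStub msPartition) {
    msPartition_ = msPartition;
    // Drop the per-partition column list; the rest of the descriptor stays.
    msPartition_.getSd().unsetCols();
  }
  PartitionStub getMsPartition() { return msPartition_; }
}

final class SlimPartitionDemo {
  // Builds a partition whose descriptor carries numCols column entries.
  static PartitionStub makePartition(String location, int numCols) {
    List<FieldSchemaStub> cols = new ArrayList<>();
    for (int i = 0; i < numCols; i++) {
      cols.add(new FieldSchemaStub("c" + i, "string"));
    }
    return new PartitionStub(new StorageDescriptorStub(cols, location));
  }

  public static void main(String[] args) {
    PartitionStub p = makePartition("/warehouse/t/p=1", 3);
    System.out.println("cols set before wrap: " + p.getSd().isSetCols());
    LocalFsPartitionSketch wrapped = new LocalFsPartitionSketch(p);
    StorageDescriptorStub sd = wrapped.getMsPartition().getSd();
    System.out.println("cols set after wrap: " + sd.isSetCols());
    System.out.println("location kept: " + sd.getLocation());
  }
}
```

Note that the constructor mutates the object it is handed, so any other holder of the same {{Partition}} would also see the columns disappear. That is fine only if nothing else reads them.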


was (Author: paul.rogers):
So the above was probably looking in the wrong haystack. Todd's comment is the 
key: {{LocalCatalog}}. The local catalog caches the HMS Thrift objects, 
including {{Partition}}.

As it turns out, all {{LocalCatalogTest}}s fail in the current build, so the 
following analysis had to be done based on source code, not live debugging.

Work is further complicated because Cloudera does not provide source jars for 
its Hive jars, so we cannot simply step into or inspect the HMS source code.

There are three separate schema trees to consider:

* That in the parse tree, which represents table and column refs in the 
query: {{TableRef}}, {{SlotRef}}.
* The coordinator catalog classes: {{Db}}, {{Table}}, {{Column}}, and their 
many subclasses.
* The Local catalog implementation: {{LocalDb}}, {{LocalTable}}, and their 
subclasses.

The issue here is with the local catalog. The earlier note showed that the 
coordinator catalog classes are already Impala-specific, slimmed down 
implementations. Local catalog, because it is fairly new, still uses Hive (HMS) 
classes as part of its implementation. Here we need to figure out the structure 
that leads to {{Partition}} objects being cached.

The chain is:

* {{LocalDb}} contains a map of {{LocalTable}} objects.
* {{LocalTable.load()}} creates the proper table class for each data source, 
one of which is {{LocalFsTable}}.
* {{LocalTable}} holds a Hive {{Table}} object. As noted above, {{Table}} does 
not hold partitions.
* {{LocalFsTable}} holds a map of {{LocalPartitionSpec}}, which does not hold a 
Hive partition.

Here things get a bit muddy because I can’t actually execute the code. The 
relevant bits seem to be:

* {{LocalFsPartition}} holds onto the Hive {{Partition}}, which holds onto the 
{{FieldSchema}} objects.
* {{LocalFsTable.loadPartitions()}} fetches {{Partition}}s from HMS, wraps them 
in a {{LocalFsPartition}} (which extends {{FeFsPartition}}) and returns them.

We may be able to add a couple of lines to the {{LocalFsPartition}} constructor:

{noformat}
msPartition_.getSd().unsetCols();
{noformat}

Again, the above cannot yet be tested because the local catalog tests don’t yet 
work.

Longer term, the note earlier does apply. While the query-specific metadata 
goes to pains to avoid caching HMS objects, LocalCatalog (and presumably the 
similar version in the {{catalogd}}) does cache HMS objects which, as noted 
earlier, are rather bloated for our needs.

> Slim down metastore Partition objects in LocalCatalog cache
> -----------------------------------------------------------
>
>                 Key: IMPALA-7501
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7501
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Todd Lipcon
>            Priority: Minor
>
> I took a heap dump of an impalad running in LocalCatalog mode with a 2G limit 
> after running a production workload simulation for a couple hours. It had 
> 38.5M objects and 2.02GB heap (the vast majority of the heap is, as expected, 
> in the LocalCatalog cache). Of this total footprint, 1.78GB and 34.6M objects 
> are retained by 'Partition' objects. Drilling into those, 1.29GB and 33.6M 
> objects are retained by FieldSchema, which, as far as I remember, are ignored 
> on the partition level by the Impala planner. So, with a bit of slimming down 
> of these objects, we could make a huge dent in effective cache capacity given 
> a fixed budget. Reducing object count should also have the effect of improved 
> GC performance (old gen GC is more closely tied to object count than size)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

