[ 
https://issues.apache.org/jira/browse/IMPALA-7501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16629410#comment-16629410
 ] 

Paul Rogers edited comment on IMPALA-7501 at 9/27/18 10:38 PM:
---------------------------------------------------------------

A quick scan of the Hive code suggests that Hive's Thrift objects carry more 
info than is required in the Impala cache. It looks like the local cache 
implementation, at present, caches these Hive objects. A future iteration of 
the local cache might want to cache the slimmer, optimized Impala metadata 
objects instead.

Turns out this very discussion was held, at length, in the design document, so 
we'll just focus on the current implementation.


was (Author: paul.rogers):
A quick scan of the Hive code suggests that Hive's Thrift objects carry more 
info than is required in the Impala cache. It looks like the local cache 
implementation, at present, caches these Hive objects. A future iteration of 
the local cache might want to cache the slimmer, optimized Impala metadata 
objects instead.

> Slim down metastore Partition objects in LocalCatalog cache
> -----------------------------------------------------------
>
>                 Key: IMPALA-7501
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7501
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Todd Lipcon
>            Priority: Minor
>
> I took a heap dump of an impalad running in LocalCatalog mode with a 2G limit 
> after running a production workload simulation for a couple hours. It had 
> 38.5M objects and 2.02GB heap (the vast majority of the heap is, as expected, 
> in the LocalCatalog cache). Of this total footprint, 1.78GB and 34.6M objects 
> are retained by 'Partition' objects. Drilling into those, 1.29GB and 33.6M 
> objects are retained by FieldSchema, which, as far as I remember, are ignored 
> on the partition level by the Impala planner. So, with a bit of slimming down 
> of these objects, we could substantially increase effective cache capacity 
> given a fixed budget. Reducing object count should also improve GC 
> performance (old gen GC is more closely tied to object count than size).
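
To put the heap-dump figures quoted above in proportion, here is a rough Python sketch of the arithmetic. The numbers are taken directly from the issue description; the variable names, the share() helper, and the assumption that everything retained by FieldSchema could be dropped (which gives an upper bound, not a measured result) are illustrative only.

```python
# Figures quoted in the issue description: sizes in GB, object counts in millions.
HEAP_GB, HEAP_OBJS_M = 2.02, 38.5
PARTITION_GB, PARTITION_OBJS_M = 1.78, 34.6
FIELDSCHEMA_GB, FIELDSCHEMA_OBJS_M = 1.29, 33.6


def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


summary = {
    # Share of the whole heap retained by Partition objects.
    "partition_pct_of_heap_bytes": share(PARTITION_GB, HEAP_GB),
    "partition_pct_of_heap_objects": share(PARTITION_OBJS_M, HEAP_OBJS_M),
    # Share of the whole heap retained by FieldSchema objects.
    "fieldschema_pct_of_heap_bytes": share(FIELDSCHEMA_GB, HEAP_GB),
    "fieldschema_pct_of_heap_objects": share(FIELDSCHEMA_OBJS_M, HEAP_OBJS_M),
    # Upper bound (assumption): heap left if all FieldSchema-retained data went away.
    "heap_gb_without_fieldschema": round(HEAP_GB - FIELDSCHEMA_GB, 2),
    "heap_objs_m_without_fieldschema": round(HEAP_OBJS_M - FIELDSCHEMA_OBJS_M, 1),
}

if __name__ == "__main__":
    for name, value in summary.items():
        print(f"{name}: {value}")
```

By these figures, FieldSchema accounts for roughly 64% of the heap by size and about 87% by object count, which is why trimming it looks attractive for both cache capacity and GC behavior.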


