Jason Dere updated HIVE-18252:
    Attachment: HIVE-18252.2.patch

> Limit the size of the object inspector caches
> ---------------------------------------------
>                 Key: HIVE-18252
>                 URL: https://issues.apache.org/jira/browse/HIVE-18252
>             Project: Hive
>          Issue Type: Bug
>          Components: Types
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>            Priority: Major
>         Attachments: HIVE-18252.1.patch, HIVE-18252.2.patch
> While running tests with many queries that contain constant values, I noticed
> that ObjectInspectorFactory.cachedStandardStructObjectInspector started using
> up a lot of memory.
> It appears that StructObjectInspector caching does not work properly with 
> constant values. Constant ObjectInspectors are not cached, so each constant 
> expression creates a new constant ObjectInspector. And since object 
> inspectors do not override equals(), object inspector comparison relies on 
> object instance comparison. So even if the values are exactly the same as 
> what is already in the cache, the StructObjectInspector cache lookup would 
> fail, and Hive would create a new object inspector and add it to the cache, 
> creating another entry that would never be used. Plus, there is no max cache 
> size - it's just a map that is allowed to grow as long as values keep getting 
> added to it.
> Some possible solutions I can think of:
> 1. Limit the size of the object inspector caches, rather than growing without 
> bound.
> 2. Try to fix the caching to work with constant values. This would require 
> implementing equals() on the constant object inspectors (which could be slow 
> in nested cases), or else we would have to start caching constant object 
> inspectors, which could be expensive in terms of memory usage. This could be
> used in combination with (1); by itself it is not a great solution because
> the cache would still grow without bound.
> 3. Disable caching in the case of constant object inspectors since this 
> scenario currently doesn't work. This could be used in combination with (1).
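
To make the failure mode and option (1) concrete, here is a minimal sketch. It is not Hive code and not necessarily what the attached patches do: ConstantInspector, StructInspector, StructInspectorCache and the maxEntries parameter are hypothetical stand-ins invented for illustration. The sketch only assumes what the description states, namely that the inspectors do not override equals(), so cache keys built from them compare by instance.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a constant object inspector. It deliberately does
// NOT override equals()/hashCode(), so two instances holding the same value
// are only equal if they are the same instance.
final class ConstantInspector {
    final Object value;
    ConstantInspector(Object value) { this.value = value; }
}

// Hypothetical stand-in for the struct inspector that the cache hands out.
final class StructInspector {
    final List<ConstantInspector> fields;
    StructInspector(List<ConstantInspector> fields) { this.fields = fields; }
}

// Illustrative cache keyed by the list of field inspectors.
final class StructInspectorCache {
    private final Map<List<ConstantInspector>, StructInspector> cache;

    // maxEntries <= 0 means "unbounded", which models the behaviour reported
    // in the issue; a positive value enables least-recently-used eviction.
    StructInspectorCache(final int maxEntries) {
        // accessOrder=true makes the LinkedHashMap iterate in LRU order.
        this.cache = Collections.synchronizedMap(
            new LinkedHashMap<List<ConstantInspector>, StructInspector>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(
                        Map.Entry<List<ConstantInspector>, StructInspector> eldest) {
                    // Evict the least recently used entry once the cap is exceeded.
                    return maxEntries > 0 && size() > maxEntries;
                }
            });
    }

    StructInspector get(List<ConstantInspector> fields) {
        // List.equals() delegates to the elements' equals(), which here is
        // identity, so a list of fresh inspectors never matches an old key.
        List<ConstantInspector> key = new ArrayList<>(fields);
        return cache.computeIfAbsent(key, StructInspector::new);
    }

    int entryCount() { return cache.size(); }
}
```

With maxEntries set to 0, looking up 1000 single-field lists that each wrap a freshly created ConstantInspector(42) leaves 1000 entries in the map, none of which can be hit again. With maxEntries set to 100, the same sequence leaves 100 entries. The lookups still miss, so the cap only bounds memory; getting actual cache hits for constants would need option (2) or (3).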
