[
https://issues.apache.org/jira/browse/HIVE-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Siying Dong updated HIVE-1758:
------------------------------
Attachment: HIVE-1758.1.patch
1. Use an array instead of ArrayList<Object>.
2. Abstract KeyWrapper a little so that we can write specialized code for
different kinds of keys.
3. Use ListKeyWrapper and TestKeyWrapper.
The motivation for this task is to reduce the memory footprint. There is a minor
impact on CPU time.
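
For readers without the patch at hand, here is a minimal sketch of what points 1 and 2 could look like. It is not the code in HIVE-1758.1.patch: the class names KeyWrapperSketch, ListKeyWrapperSketch and SingleKeyWrapperSketch are hypothetical, and the single-key variant is only one assumed example of a specialization for a particular kind of key.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical sketch only; names and structure are assumptions,
// not the contents of HIVE-1758.1.patch.
abstract class KeyWrapperSketch {
    // Pick a specialized implementation based on the number of keys.
    static KeyWrapperSketch of(Object... keys) {
        return keys.length == 1
            ? new SingleKeyWrapperSketch(keys[0])
            : new ListKeyWrapperSketch(keys);
    }
}

// General case: keys held in a plain array rather than ArrayList<Object>.
final class ListKeyWrapperSketch extends KeyWrapperSketch {
    private final Object[] keys;
    private final int hashcode; // computed once and cached

    ListKeyWrapperSketch(Object[] keys) {
        this.keys = keys.clone();
        this.hashcode = Arrays.hashCode(this.keys);
    }

    @Override
    public int hashCode() {
        return hashcode;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ListKeyWrapperSketch
            && Arrays.equals(keys, ((ListKeyWrapperSketch) o).keys);
    }
}

// Assumed specialization: a single key needs no array at all.
final class SingleKeyWrapperSketch extends KeyWrapperSketch {
    private final Object key;

    SingleKeyWrapperSketch(Object key) {
        this.key = key;
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(key);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SingleKeyWrapperSketch
            && Objects.equals(key, ((SingleKeyWrapperSketch) o).key);
    }
}
```

The array saves the ArrayList object header and its size and modCount fields per key, which adds up when the hash map holds many entries.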
> optimize group by hash map memory
> ---------------------------------
>
> Key: HIVE-1758
> URL: https://issues.apache.org/jira/browse/HIVE-1758
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: Siying Dong
> Attachments: HIVE-1758.1.patch
>
>
> The hash map used by map-side Group By consumes a lot of memory, which
> reduces its effectiveness.
> We can use some of the optimizations from map-join to reduce the memory
> footprint:
> class KeyWrapper {
>   int hashcode;
>   ArrayList<Object> keys;
>   // Decides whether this key is already in the hash map (keys in the hash
>   // map are deep-copied versions, and we need to use
>   // 'currentKeyObjectInspector').
>   boolean copy = false;
> }
> 1. Change keys to an array.
> 2. Optimize the case where the number of keys is small (1, 2, etc.).
> Let us start by profiling it and take it from there.
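
The `copy` flag in the quoted description suggests a probe-then-copy pattern: look up each row with a reusable, non-copied key, and copy the key only when it is inserted for the first time. The following is a rough sketch of that idea under stated assumptions. ProbeKeySketch and GroupCountSketch are hypothetical names, a shallow `clone()` stands in for Hive's deep copy through object inspectors, and a simple count stands in for the real aggregation buffers.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not Hive's actual KeyWrapper.
final class ProbeKeySketch {
    Object[] keys;
    int hashcode;
    boolean copy; // true only for keys stored in the hash map

    // Point this wrapper at the current row's keys without copying them.
    void set(Object[] rowKeys) {
        keys = rowKeys;
        hashcode = Arrays.hashCode(rowKeys);
        copy = false;
    }

    // Make an independent key that is safe to store in the map.
    // A shallow clone stands in for a real deep copy here.
    ProbeKeySketch copyKey() {
        ProbeKeySketch k = new ProbeKeySketch();
        k.keys = keys.clone();
        k.hashcode = hashcode;
        k.copy = true;
        return k;
    }

    @Override
    public int hashCode() {
        return hashcode;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ProbeKeySketch
            && Arrays.equals(keys, ((ProbeKeySketch) o).keys);
    }
}

final class GroupCountSketch {
    private final Map<ProbeKeySketch, long[]> counts = new HashMap<>();
    private final ProbeKeySketch probe = new ProbeKeySketch(); // reused per row

    void add(Object[] rowKeys) {
        probe.set(rowKeys);
        long[] c = counts.get(probe);
        if (c == null) {
            // First time this group is seen: copy the key, then insert.
            c = new long[1];
            counts.put(probe.copyKey(), c);
        }
        c[0]++;
    }

    long count(Object... keys) {
        probe.set(keys);
        long[] c = counts.get(probe);
        return c == null ? 0 : c[0];
    }

    int groups() {
        return counts.size();
    }
}
```

With this shape, rows that fall into an existing group allocate nothing; only the first row of each group pays for a key copy.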