Brock Noland created HIVE-4838:
----------------------------------

             Summary: Refactor MapJoin HashMap code to improve testability, 
readability, and other implementations
                 Key: HIVE-4838
                 URL: https://issues.apache.org/jira/browse/HIVE-4838
             Project: Hive
          Issue Type: Bug
            Reporter: Brock Noland
            Assignee: Brock Noland


MapJoin is an essential component for high-performance joins in Hive, and the 
current code has done great service for many years. However, the code is 
showing its age and currently suffers from the following issues:

* It uses static state, via the MapJoinMetaData class, to pass serialization 
metadata to the Key and Row classes.
* The API of a logical "Table Container" is not defined, so it is unclear 
which APIs HashMapWrapper needs to expose. Additionally, HashMapWrapper has 
many unused public methods.
* HashMapWrapper contains logic to serialize, test memory bounds, and implement 
the table container. Ideally these logical units would be separated.
* HashTableSinkObjectCtx has unused fields and unused methods.
* CommonJoinOperator and its children declare ArrayList on the left-hand side 
when only List is required.
* There are unused classes: MRU, DCLLItem, MapJoinSingleKey, and 
MapJoinDoubleKeys.
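
As a rough illustration of two of the points above (a defined "Table 
Container" API, and declaring List rather than ArrayList), a minimal sketch 
could look like the following. The names TableContainerSketch and 
HashMapTableContainerSketch and all method signatures are hypothetical and are 
not taken from the Hive code base; serialization and memory-bound checks are 
deliberately left out, on the assumption that they would live in separate 
classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: the smallest API a logical table container might expose.
interface TableContainerSketch<K, V> {
    void put(K key, V row);   // add one row under a join key
    List<V> get(K key);       // rows for a key, empty if the key is absent
    int size();               // number of distinct keys
}

// Hypothetical in-memory implementation. It only stores rows; it does not
// serialize anything and does not check memory bounds.
final class HashMapTableContainerSketch<K, V> implements TableContainerSketch<K, V> {
    // Declared as Map and List; only the constructors name concrete classes.
    private final Map<K, List<V>> rows = new HashMap<>();

    @Override
    public void put(K key, V row) {
        rows.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
    }

    @Override
    public List<V> get(K key) {
        return rows.getOrDefault(key, Collections.<V>emptyList());
    }

    @Override
    public int size() {
        return rows.size();
    }
}
```

Callers written against the interface and against List would not need to 
change if the backing implementation were replaced, which is the testability 
benefit this issue is after.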

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
