-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13059/
-----------------------------------------------------------

(Updated Oct. 3, 2013, 2:17 p.m.)


Review request for hive, Eric Hanson and Jitendra Pandey.


Bugs: HIVE-4850
    https://issues.apache.org/jira/browse/HIVE-4850


Repository: hive-git


Description
-------

This is not the final iteration, but I thought it would be easier to discuss 
it with a review.
This implementation works and handles multiple aliases and multiple values per 
key. It uses the existing hash tables saved by the local task for the map 
join, which are row-mode hash tables (they have row-mode keys and store 
row-mode writable object values). Going forward we should avoid the 
size-of-big-table conversions of big table keys to row mode and the conversion 
of small table values to vector data. This would require either converting the 
hash tables to vector-friendly ones on the fly (when loaded) or changing the 
local task hashtable sink to create a vectorization-friendly hash table. The 
first approach may have memory consumption problems: potentially two hash 
tables end up in memory, so we would have to stream the transformation or 
transform while reading from the serialized format, which is nasty.
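
To make the cost concrete, here is a minimal sketch of the probe loop described 
above. It is not the code in this patch: RowModeProbeSketch, OutputBatch, 
buildSmallTable and probe are hypothetical names, a plain java.util.HashMap 
stands in for the saved row-mode hash table, and every column is assumed to be 
a long. It only illustrates the two per-row conversions (vector key to row-mode 
key, row-mode value to vector data).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are Hive classes.
class RowModeProbeSketch {

    // Stand-in for an output batch: two primitive long columns that grow on demand.
    static final class OutputBatch {
        long[] bigKey = new long[4];
        long[] smallValue = new long[4];
        int size = 0;

        void append(long key, long value) {
            if (size == bigKey.length) {
                bigKey = Arrays.copyOf(bigKey, size * 2);
                smallValue = Arrays.copyOf(smallValue, size * 2);
            }
            bigKey[size] = key;
            smallValue[size] = value;
            size++;
        }
    }

    // Row-mode hash table: boxed key -> every small-table row with that key.
    static Map<List<Object>, List<Object[]>> buildSmallTable(Object[][] rows, int keyColumn) {
        Map<List<Object>, List<Object[]>> table = new HashMap<>();
        for (Object[] row : rows) {
            List<Object> key = Arrays.asList(row[keyColumn]);
            table.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }
        return table;
    }

    // Probes the row-mode table with a primitive key column (inner join semantics).
    static OutputBatch probe(long[] bigKeyColumn, int size,
                             Map<List<Object>, List<Object[]>> table, int valueColumn) {
        OutputBatch out = new OutputBatch();
        for (int i = 0; i < size; i++) {
            // Conversion 1: primitive vector entry -> boxed row-mode key, once per big-table row.
            List<Object> key = Arrays.asList((Object) Long.valueOf(bigKeyColumn[i]));
            List<Object[]> matches = table.get(key);
            if (matches == null) {
                continue; // no match for this key
            }
            for (Object[] smallRow : matches) {
                // Conversion 2: boxed row-mode value -> primitive vector data, once per match.
                out.append(bigKeyColumn[i], (Long) smallRow[valueColumn]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Object[][] small = { {1L, 10L}, {1L, 11L}, {3L, 30L} };
        OutputBatch out = probe(new long[] {1L, 2L, 3L}, 3, buildSmallTable(small, 0), 1);
        for (int i = 0; i < out.size; i++) {
            System.out.println(out.bigKey[i] + " -> " + out.smallValue[i]);
        }
    }
}
```

Both conversions sit inside the per-row loop, so their cost scales with the big 
table. A vector-friendly hash table would let the probe work on the primitive 
columns directly.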


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java d320b47 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java 86db044 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 153b8ea 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 8ab5395 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java cde1a59 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 8b4c615 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssign.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java 9955d09 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorReduceSinkOperator.java 6df3551 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 02ebe14 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch.java ff13f89 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java 9e189c9 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java df1c5a6 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java a72ec8b 

Diff: https://reviews.apache.org/r/13059/diff/


Testing
-------

Manually ran some join queries on the alltypes_orc table.


Thanks,

Remus Rusanu
