Re: Review Request 33251: HIVE-10302 Cache small tables in memory [Spark Branch]

Jimmy Xiang Mon, 20 Apr 2015 18:37:48 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33251/
-----------------------------------------------------------


(Updated April 21, 2015, 1:37 a.m.)


Review request for hive, Chao Sun, Szehon Ho, and Xuefu Zhang.


Changes
-------

Changed the assumption. The small tables are cached only for the same work.


Bugs: HIVE-10302
    https://issues.apache.org/jira/browse/HIVE-10302


Repository: hive-git


Description
-------

Cache the small table container so that mapjoin tasks can reuse it when they
are executed on the same Spark executor.
The cache is released after the mapjoin job is done, right before the next job.
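
The caching idea described above can be sketched roughly as follows. This is
not the code from the patch: the class name SmallTableCacheSketch, the
getOrLoad method, the string work id, and the loader callback are all
hypothetical, and the release-on-next-work rule is only an assumption about
how "released right before the next job" might be realized.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical executor-local cache; names are illustrative, not from the patch. */
final class SmallTableCacheSketch {
  // A static map lives once per JVM, so tasks on the same executor share it.
  private static final Map<String, Object> CACHE = new HashMap<>();
  // Id of the work whose small tables are currently cached (assumed to be a string).
  private static String currentWorkId = null;

  private SmallTableCacheSketch() {}

  /** Returns the cached container for the path, loading it only on the first request. */
  static synchronized Object getOrLoad(
      String workId, String path, Function<String, Object> loader) {
    if (!workId.equals(currentWorkId)) {
      // A different work has started: release what was cached for the previous one.
      CACHE.clear();
      currentWorkId = workId;
    }
    return CACHE.computeIfAbsent(path, loader);
  }

  /** Number of containers currently held, exposed for inspection. */
  static synchronized int size() {
    return CACHE.size();
  }
}
```

In this sketch, a second task of the same work on the same executor gets the
already loaded container back, while the first task of a different work clears
the map before loading its own tables.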


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java fe108c4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java 3f240f5
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkRecordHandler.java 97b3471
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkUtilities.java 72ab913 

Diff: https://reviews.apache.org/r/33251/diff/


Testing
-------

Ran several queries on a live cluster. ptest pending.


Thanks,

Jimmy Xiang
