-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16313/
-----------------------------------------------------------

(Updated Dec. 18, 2013, 3:04 a.m.)


Review request for pig, Alex Bain, Daniel Dai, Mark Wagner, and Rohini 
Palaniswamy.


Changes
-------

Incorporated Rohini's comments:
* Adds an e2e test case for outer replicated join.
* Adds a unit test case for 3-way replicated join.
* Adds a unit test case for replicated join in reducer.
* Cleans up the POShuffleTezLoad code to make use of inputKeys. 
POShuffleTezLoad#attachInputs() now looks up LogicalInputs by inputKey and 
attaches only the applicable ones. For example, it attaches 
ShuffledMergedInputs but ignores ShuffledUnorderedKVInputs. This is needed 
because both broadcast and scatter/gather edges can be attached to the same 
vertex, in which case each operator in the vertex should attach only the 
inputs that apply to it.
* Includes the fix for PIG-3624 (establishing the order of joined columns).
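
To illustrate the inputKey lookup described for POShuffleTezLoad#attachInputs() 
above, here is a minimal, self-contained sketch. It is not the code in the 
patch: ShuffleLoadSketch, Input, MergedInput, and UnorderedInput are 
hypothetical stand-ins for the operator and the Tez input types, and the real 
method signature may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: all type names here are hypothetical, not Pig or Tez classes.
class ShuffleLoadSketch {
    interface Input {}
    // Plays the role of a scatter/gather (shuffled, merged) input.
    static class MergedInput implements Input {}
    // Plays the role of a broadcast (unordered key/value) input.
    static class UnorderedInput implements Input {}

    private final List<String> inputKeys;
    private final List<MergedInput> attached = new ArrayList<>();

    ShuffleLoadSketch(List<String> inputKeys) {
        this.inputKeys = inputKeys;
    }

    // Look up each input by key and attach only the applicable type.
    void attachInputs(Map<String, Input> inputs) {
        for (String key : inputKeys) {
            Input in = inputs.get(key);
            if (in instanceof MergedInput) {
                attached.add((MergedInput) in);
            }
            // Missing keys and other input types are left for other operators.
        }
    }

    int attachedCount() {
        return attached.size();
    }

    public static void main(String[] args) {
        Map<String, Input> inputs = new LinkedHashMap<>();
        inputs.put("scope-1", new MergedInput());
        inputs.put("scope-2", new UnorderedInput());

        ShuffleLoadSketch load = new ShuffleLoadSketch(Arrays.asList("scope-1", "scope-2"));
        load.attachInputs(inputs);
        System.out.println(load.attachedCount()); // prints 1
    }
}
```
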

ant test-tez passes.
e2e test passes.


Bugs: PIG-3604
    https://issues.apache.org/jira/browse/PIG-3604


Repository: pig-git


Description
-------

Implemented replicated join in Tez as follows:
- POFRJoinTez extends POFRJoin. The difference between the two is that the 
replication hash table is built from broadcast edges in Tez instead of from 
files in the distributed cache as in MR.
- TezCompiler adds a vertex per replicated table and connects it to the 
POFRJoin vertex via a broadcast edge.
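
For readers unfamiliar with replicated joins, the general technique can be 
sketched as below. This is a plain-Java illustration under stated 
assumptions, not the POFRJoinTez code: ReplicatedJoinSketch, buildReplica, 
and join are hypothetical names, rows are modeled as lists of strings, and 
the replicated rows are passed in as an in-memory collection rather than 
read from a broadcast edge.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: hypothetical names, inner join on a single key column.
class ReplicatedJoinSketch {
    // Build the in-memory hash table from the small (replicated) relation.
    static Map<String, List<List<String>>> buildReplica(
            Iterable<List<String>> rows, int keyIndex) {
        Map<String, List<List<String>>> table = new HashMap<>();
        for (List<String> row : rows) {
            table.computeIfAbsent(row.get(keyIndex), k -> new ArrayList<>()).add(row);
        }
        return table;
    }

    // Stream the large relation and probe the hash table for matches.
    // In this sketch, streamed columns come first, then replicated columns.
    static List<List<String>> join(Iterable<List<String>> streamed, int keyIndex,
            Map<String, List<List<String>>> replica) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> left : streamed) {
            List<List<String>> matches = replica.get(left.get(keyIndex));
            if (matches == null) {
                continue; // inner join: drop streamed rows without a match
            }
            for (List<String> right : matches) {
                List<String> joined = new ArrayList<>(left);
                joined.addAll(right);
                out.add(joined);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> small = Arrays.asList(
                Arrays.asList("1", "x"), Arrays.asList("2", "y"));
        List<List<String>> large = Arrays.asList(
                Arrays.asList("1", "a"), Arrays.asList("3", "b"), Arrays.asList("1", "c"));
        // prints [[1, a, 1, x], [1, c, 1, x]]
        System.out.println(join(large, 0, buildReplica(small, 0)));
    }
}
```
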

Note that in POLocalRearrangeTez, I package tuples the same way for broadcast 
and scatter/gather edges, so I removed outputType (DataMovementType).


Diffs (updated)
-----

  
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFRJoin.java d7c54d8
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeJoin.java e900751
  src/org/apache/pig/backend/hadoop/executionengine/tez/POFRJoinTez.java e69de29
  src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java cda5d89
  src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java d76cfc5
  src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java e6f9be5
  src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 7a1736a
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 2584501
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 96ccdde
  test/e2e/pig/tests/tez.conf b280698 
  test/org/apache/pig/test/data/GoldenFiles/TEZC10.gld e69de29 
  test/org/apache/pig/test/data/GoldenFiles/TEZC11.gld e69de29 
  test/org/apache/pig/tez/TestTezCompiler.java 79dc94e 

Diff: https://reviews.apache.org/r/16313/diff/


Testing
-------

Added a unit test case to TestTezCompiler.
Added an e2e test case to Join.

ant test-tez passes.
e2e test passes.


Thanks,

Cheolsoo Park
