-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15219/#review28174
-----------------------------------------------------------

Ship it!


Looks great! Thank you Alex!

I'll fix the typo noted below when I commit it.


test/e2e/pig/tests/tez.conf
<https://reviews.apache.org/r/15219/#comment54810>

    This should be 2.


- Cheolsoo Park


On Nov. 5, 2013, 12:57 a.m., Alex Bain wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15219/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2013, 12:57 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park, Daniel Dai, Mark Wagner, and Rohini 
> Palaniswamy.
> 
> 
> Bugs: PIG-3536
>     https://issues.apache.org/jira/browse/PIG-3536
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Implement DISTINCT for Pig-on-Tez by providing a (very straightforward) 
> implementation in TezCompiler.java.
> 
> For the moment, this does NOT use two optimizations done in the MRCompiler. 
> We will create a separate JIRA for these optimizations:
> 1. A distinct combiner
> 2. A combiner optimizer that replaces certain uses of DISTINCT with an 
> algebraic udf
> 
> [Little code note: I changed the name of getPlainForEach to getForEachPlain. 
> That way we can have getForEachHelper1, getForEachHelper2, etc. all follow 
> alphabetically. Sorry if that's a little too OCD.]
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java d62b2a1 
>   test/e2e/pig/tests/tez.conf 24af8d3 
>   test/org/apache/pig/test/data/GoldenFiles/TEZC5.gld PRE-CREATION 
>   test/org/apache/pig/tez/TestTezCompiler.java 1209d08 
> 
> Diff: https://reviews.apache.org/r/15219/diff/
> 
> 
> Testing
> -------
> 
> This patch includes:
> - A unit test in TestTezCompiler.java
> - An e2e test
> 
> DANIEL: Can you check that my e2e test looks appropriate? I wasn't sure which 
> test data set to choose, so I just picked studenttab20m.
> 
> 
> Thanks,
> 
> Alex Bain
> 
>
