> On Dec. 26, 2013, 4:45 p.m., Rohini Palaniswamy wrote:
> > test/org/apache/pig/test/TestAccumulator.java, line 189
> > <https://reviews.apache.org/r/16463/diff/1/?file=403084#file403084line189>
> >
> >     Any reason for adding this newly within tests? tearDown() already has 
> > it.
> 
> Cheolsoo Park wrote:
>     I added this because in the same test case, if pigServer invokes 
> openIterator() multiple times, Pig generates temporary files with the same 
> name in subsequent calls. So I am cleaning them up after each invocation.
>     
>     Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 2002: Output 
> Location Validation Failed for: 
> 'hdfs://localhost:56901/tmp/temp-1456742965/tmp-1456742965 More info to 
> follow:
>     8788 Output directory 
> hdfs://localhost:56901/tmp/temp-1456742965/tmp-1456742965 already exists
>     
>     I don't see this behavior in hadoopversion=20 but in hadoopversion=23. 
> Perhaps we should fix FileLocalizer#getTemporaryPath(). I can take a closer 
> look.
> 
> Rohini Palaniswamy wrote:
>     I am sure this problem does not exist in trunk. We need to see why and 
> fix. Something must be broken in tez branch.
> 
> Cheolsoo Park wrote:
>     Random doesn't guarantee unique numbers. We haven't seen this problem 
> because we were lucky.
>     
>     After changing Random.getNextInt() to UUID.randomUUID(), I no longer see 
> failures.

Lot of MR golden files will have to be changed if you change the random number 
generation logic. Also will lead to longer HDFS path names. How does the same 
test work in MR? In fact saw that FileLocalizer.setR() is called twice in 
MRCompiler resetting the static variable. Have not traced through. I am sure we 
are missing something in Tez. 


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16463/#review30867
-----------------------------------------------------------


On Dec. 26, 2013, 6:11 a.m., Cheolsoo Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16463/
> -----------------------------------------------------------
> 
> (Updated Dec. 26, 2013, 6:11 a.m.)
> 
> 
> Review request for pig, Alex Bain, Daniel Dai, Mark Wagner, and Rohini 
> Palaniswamy.
> 
> 
> Bugs: PIG-3636
>     https://issues.apache.org/jira/browse/PIG-3636
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> The patch implements accumulator optimization in Tez. The changes include-
> * Create AccumulatorOptimizer in Tez.
> * Create AccumulatorOptimizerUtil class and factor out common functions in MR 
> and Tez.
> * Implement accumulator logic in POShuffleTezLoad.
> * Update TestAccumulator to make it run in Tez mode.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/PigConfiguration.java 0a26e8c 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java
>  7f9e15a 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java
>  9eed25c 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
>  6e04513 
>   
> src/org/apache/pig/backend/hadoop/executionengine/tez/AccumulatorOptimizer.java
>  e69de29 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java 
> 722b9f6 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java 
> 742a33a 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 
> d42ce89 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 
> c6af682 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezPlanContainer.java 
> e33a7c6 
>   
> src/org/apache/pig/backend/hadoop/executionengine/util/AccumulatorOptimizerUtil.java
>  e69de29 
>   test/org/apache/pig/test/TestAccumulator.java b979649 
>   test/org/apache/pig/test/TestCombiner.java a227d18 
>   test/tez-tests fcb573e 
> 
> Diff: https://reviews.apache.org/r/16463/diff/
> 
> 
> Testing
> -------
> 
> * TestAccumulator passes in Tez mode.
> * All unit tests pass.
> * All e2e tests pass.
> 
> Note that 3 test cases in TestAccumulator are annotated as @Ignore because 
> SecondaryKeyOptimizer in Tez is not implement yet. The test cases expect 
> accumulator optimizer is applied when order-by and distinct are present in a 
> nested foreash because sort operator is removed by SecondaryKeyOptimizer. 
> Added TODO comments accordingly.
> 
> 
> Thanks,
> 
> Cheolsoo Park
> 
>

Reply via email to