----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15881/#review37876 -----------------------------------------------------------
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java <https://reviews.apache.org/r/15881/#comment69655> useDefaultBag not used here. firstTime needs to be in attachInput and useDefaultBag used there src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java <https://reviews.apache.org/r/15881/#comment69656> InternalCachedBag if not useDefaultBag src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java <https://reviews.apache.org/r/15881/#comment69657> I did not get this part about readOnce and copying to a new bag. Also why default bag? If the bag is big won't you hit OOM with this copy? src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java <https://reviews.apache.org/r/15881/#comment69658> clone.keyInfo = new HashMap<Integer, Pair<Boolean, Map<Integer, Integer>>>(keyInfo); src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java <https://reviews.apache.org/r/15881/#comment69653> POJoinPackage also prints forEach.getFlatStr(). Probably have a pkgr.name() and override that for each Packager src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java <https://reviews.apache.org/r/15881/#comment69644> Shouldn't we have this condition (if(pkgr.isDistinct())) in POPackage instead of Packager.getNext()? src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java <https://reviews.apache.org/r/15881/#comment69649> reporter.progress will never be called src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java <https://reviews.apache.org/r/15881/#comment69648> private static src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java <https://reviews.apache.org/r/15881/#comment69650> Wouldn't the index be always same for all the elements in the bag? Why get everytime? src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java <https://reviews.apache.org/r/15881/#comment69647> Same question here on readOnce. Why do we copy the bag on readonce? src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java <https://reviews.apache.org/r/15881/#comment69651> Should be a deep clone as was in POPackage - Rohini Palaniswamy On March 4, 2014, 9:40 p.m., Mark Wagner wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/15881/ > ----------------------------------------------------------- > > (Updated March 4, 2014, 9:40 p.m.) > > > Review request for pig and Cheolsoo Park. > > > Bugs: PIG-3591 > https://issues.apache.org/jira/browse/PIG-3591 > > > Repository: pig-git > > > Description > ------- > > Separate "packaging" logic from "shuffle handling" logic. This moves the > packaging logic to a new class "Packager", which is extended by > CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager. > > This is not finished. Known problem are illustrate and streaming the last > input are not implemented. > > > Diffs > ----- > > src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java > d801f6f > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java > 3638b5c > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java > 18a382b > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java > 5e28eb6 > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java > 5dddab7 > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java > 93de6d5 > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java > eb7c428 > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java > 64f0ee1 > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java > 933363d > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java > 773a22c > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java > eea5ce3 > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java > 1578630 > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java > 47137d5 > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java > abb16ff > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java > ff82801 > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java > 892c26f > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java > PRE-CREATION > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java > PRE-CREATION > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java > PRE-CREATION > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java > PRE-CREATION > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java > 9105a0e > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java > 82f11ac > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java > d604174 > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java > 86314d9 > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java > c200715 > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java > PRE-CREATION > > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java > b860521 > src/org/apache/pig/data/ReadOnceBag.java e2b3887 > > src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java > 7112695 > src/org/apache/pig/pen/IllustratorAttacher.java db9c707 > src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 > test/org/apache/pig/test/TestJobSubmission.java cccade4 > test/org/apache/pig/test/TestPackage.java 6b197d1 > test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a > test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 > test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f > test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d > test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 > test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b > test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 > test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 > test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e > test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 > test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 > test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 > test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 > test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f > test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e > test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d > test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec > test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 > test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 > > Diff: https://reviews.apache.org/r/15881/diff/ > > > Testing > ------- > > ant test-commit passes, with the exception of TestExampleGenerator which is > caused by the illustrate work not being finished. I also ran some manual > scripts, which worked. > > > Thanks, > > Mark Wagner > >