----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/61265/ -----------------------------------------------------------
(Updated Aug. 11, 2017, 7:56 p.m.) Review request for pig, Daniel Dai and Koji Noguchi. Changes ------- Changes done 1) Fixed the unnecessary bag construction for PORelationToExprProject in case of isStar() which got missed when I removed PORelationToExprProject from isStar() handling in CodeGenerator to handle EOP cases. 2) Extended cases where Read once bags to UDFs as well which made a big improvement to performance and totally brings down spilling required in a task. By default EvalFunc.isInputBagIterationOnce is true. i.e, it is assumed that all UDFs iterator over input bag only once. Did not make default false as there would be only very few UDFs that might iterate on bag more than once. Any one turning on pig.opt.bytecode can fix that UDF. Bugs: PIG-5256 https://issues.apache.org/jira/browse/PIG-5256 Repository: pig Description (updated) ------- For each POForEach or POFilter operator in the plan, a corresponding class is created by inlining code for the input plans. Bytecode is generated directly through asm APIs and the generated class files are shipped to the tasks as a jar. On the frontend, the generated classes are dynamically added to the PigContext classloader. The explain command now decompiles the class files in the explain output directory. Refer to the golden files in the test for examples of generated code. License: The newly added fernflower.jar used for decompilation is from IntelliJ (java decompiler used by IntelliJ IDEA) https://github.com/JetBrains/intellij-community/blob/master/plugins/java-decompiler/engine/src/org/jetbrains/java/decompiler/main/Fernflower.java and the license of that is Apache 2.0 ithttps://github.com/JetBrains/intellij-community/blob/master/LICENSE.txt TODO items to be addressed in a separate jira: 1) PIG-5279 - Support for MR and Spark. Currently only done for Tez. Will also add documentation in this jira 2) Support for Accumulator 3) Support for CROSS 4) Run the optimizer on combiner plans 5) Fix for test failures - TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3 java.lang.LinkageError: loader (instance of org/apache/pig/impl/PigContext$ContextClassLoader): attempted duplicate class definition for name: "org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach_scope_6" Since NodeIdGenerator is ThreadLocal, running same script in parallel with different parameters using embedded Python causes conflict. Requires ThreadLocal classloaders in PigContext. Will address in a separate jira - PIG-5291. Other jiras fixed as part of this: 1) PIG-4515: org.apache.pig.builtin.Distinct throws ClassCastException 2) When pig.opt.bytecode=true, UDFs defined as an alias inside nested foreach are executed only once. This was actually the primary goal of doing bytecode generation PIG-3000: Optimize nested foreach PIG-1633: Using an alias withing Nested Foreach causes indeterminate behaviour Diffs (updated) ----- http://svn.apache.org/repos/asf/pig/trunk/build.xml 1804148 http://svn.apache.org/repos/asf/pig/trunk/ivy.xml 1804148 http://svn.apache.org/repos/asf/pig/trunk/ivy/libraries.properties 1804148 http://svn.apache.org/repos/asf/pig/trunk/shade/pom.xml PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/EvalFunc.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/AsmUtil.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/CodeGenerator.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/FilterCodeGenerator.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/ForEachCodeGenerator.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/BagInputOperator.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORegexp.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Subtract.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PODistinct.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFilter.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLimit.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POOptimizedForEach.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSortedDistinct.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezResourceManager.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/RuntimeCodeGenOptimizer.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/Distinct.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/InvokerGenerator.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/GetNextTupleDataBag.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/SimpleAbstractDataBag.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/PigContext.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/plan/OperatorPlan.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/ClassDecompiler.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/JarManager.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/grunt/GruntParser.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestBuiltin.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestByteCodeGen.java PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEvalPipelineLocal.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestProjectStarRangeInUdf.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/Util.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/MRC17.gld 1804148 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POFilter_scope_12.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POForEach_scope_11.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/plan.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_14.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_9.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POForEach_scope_7.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/plan.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/POForEach_scope_8.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/plan.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_20.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_32.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_9.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/plan.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/POForEach_scope_35.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/plan.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POFilter_scope_45.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_18.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_31.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_43.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_49.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/plan.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_20.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_48.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_14.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_26.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_37.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_42.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_54.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_61.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_67.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_7.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_70.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_76.java.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/plan.gld PRE-CREATION http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Distinct-2.gld 1804148 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1804148 http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezGraceParallelism.java 1804148 Diff: https://reviews.apache.org/r/61265/diff/7/ Changes: https://reviews.apache.org/r/61265/diff/6-7/ Testing ------- Ran full suite of unit and e2e tests with both pig.opt.bytecode=true. All tests except TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3 mentioned in description pass. Thanks, Rohini Palaniswamy