-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15194/#review28179
-----------------------------------------------------------

src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java
<https://reviews.apache.org/r/15194/#comment54825>

    Do we need to take care of "pig.sortOrder"?


src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java
<https://reviews.apache.org/r/15194/#comment54826>

    We don't tag the input, so no index is needed. We might be able to
    optimize this further (but I am fine with leaving that for the future).


src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
<https://reviews.apache.org/r/15194/#comment54824>

    We no longer tag the input, so there is no need to specify a group
    comparator. (The old group comparator skipped the tag when comparing,
    to make sure that records with the same key, even with different tags,
    were grouped together.)


- Daniel Dai


On Nov. 2, 2013, 1:17 a.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15194/
> -----------------------------------------------------------
>
> (Updated Nov. 2, 2013, 1:17 a.m.)
>
>
> Review request for pig, Cheolsoo Park, Daniel Dai, and Rohini Palaniswamy.
>
>
> Bugs: PIG-3527
> https://issues.apache.org/jira/browse/PIG-3527
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Adds support for multiple LogicalInputs to the PigProcessor. This is done by
> adding a new TezLoad interface which PhysicalOperators may implement. On the
> backend, any operators implementing this interface will have the LogicalInput
> attached to them. 2 implementations are included:
> * POSimpleTezLoad which consumes a single MRInput
> * POShuffleTezLoad which consumes one or more ShuffledMergedInputs.
> The POShuffleTezLoad does a k-way merge of the shuffle inputs to package for
> the operator pipeline. This required a change to the comparators used so that
> the sort order remained consistent.
> There is also a fix to POForEach where it was using the incorrect status code
> for signaling (although it produced the same end result in the MR pipeline).
>
>
> Diffs
> -----
>
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigDecimalRawComparator.java
> ddea99e
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBigIntegerRawComparator.java
> 5ea3fc7
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBooleanRawComparator.java
> dfd4ebf
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigBytesRawComparator.java
> 09397e5
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigDateTimeRawComparator.java
> a87161f
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigDoubleRawComparator.java
> cbf457f
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigFloatRawComparator.java
> 1d86e3f
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigIntRawComparator.java
> bb6c9df
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigLongRawComparator.java
> b3ded76
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigSecondaryKeyComparator.java
> 5ad334b
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTextRawComparator.java
> 022f37b
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleDefaultRawComparator.java
> 866c39d
>
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleSortComparator.java
> 9724b9f
>
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/POSimpleTezLoad.java
> PRE-CREATION
>
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/TezLoad.java
> PRE-CREATION
>
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java
> eb9f62a
>
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
> 86314d9
>
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java
> c200715
> src/org/apache/pig/backend/hadoop/executionengine/tez/FileInputHandler.java
> d29e330
> src/org/apache/pig/backend/hadoop/executionengine/tez/InputHandler.java
> d2298ca
> src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java
> PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java
> ebb3145
>
> src/org/apache/pig/backend/hadoop/executionengine/tez/ShuffledInputHandler.java
> d7b42b8
> src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
> 45e47b0
> src/org/apache/pig/data/BinInterSedes.java b3ec51e
> src/org/apache/pig/data/DefaultTuple.java 2e7ca5f
> test/e2e/pig/tests/tez.conf 24af8d3
>
> Diff: https://reviews.apache.org/r/15194/diff/
>
>
> Testing
> -------
>
> Manual testing was done, and an e2e test has been added. Because of the
> comparator change, some of the tests fail because of bag ordering.
>
>
> Thanks,
>
> Mark Wagner
>
>
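
For readers unfamiliar with the k-way merge mentioned in the quoted description, here is a minimal Java sketch of the general technique. It is not the code from this patch: the class name KWayMergeSketch, the Head helper and the merge method are hypothetical, and it merges plain in-memory iterables rather than Tez ShuffledMergedInputs. It does illustrate why the comparators matter: the merged output is only sorted if every input was sorted with the same comparator that drives the merge.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not the POShuffleTezLoad code from the patch.
class KWayMergeSketch {

    // One heap entry: the current head of an input plus the iterator it came from.
    private static final class Head<T> {
        final T value;
        final Iterator<T> source;

        Head(T value, Iterator<T> source) {
            this.value = value;
            this.source = source;
        }
    }

    // Merges inputs that are each already sorted by `order` into one sorted list.
    // If an input was sorted with a different comparator, the result is not sorted.
    static <T> List<T> merge(List<? extends Iterable<T>> inputs,
                             Comparator<? super T> order) {
        // Min-heap keyed on the head value of each input.
        PriorityQueue<Head<T>> heap =
                new PriorityQueue<>((a, b) -> order.compare(a.value, b.value));

        // Seed the heap with the first element of every non-empty input.
        for (Iterable<T> input : inputs) {
            Iterator<T> it = input.iterator();
            if (it.hasNext()) {
                heap.add(new Head<>(it.next(), it));
            }
        }

        List<T> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Emit the smallest head, then refill from the input it came from.
            Head<T> smallest = heap.poll();
            merged.add(smallest.value);
            if (smallest.source.hasNext()) {
                heap.add(new Head<>(smallest.source.next(), smallest.source));
            }
        }
        return merged;
    }
}
```

For example, calling KWayMergeSketch.merge on the three lists [1, 4, 7], [2, 5] and [3, 6, 8] with Comparator.naturalOrder() returns the values 1 through 8 in ascending order. The heap holds at most one entry per input, so each emitted element costs O(log k) comparisons for k inputs.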
