[ https://issues.apache.org/jira/browse/PIG-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13108877#comment-13108877 ]
jirapos...@reviews.apache.org commented on PIG-2286:
----------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1929/#review1974
-----------------------------------------------------------

trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java
<https://reviews.apache.org/r/1929/#comment4462>

    I think a comment will be useful -
    // The algebraic udf can have more than one input. Add the udf only once

trunk/src/org/apache/pig/builtin/COR.java
<https://reviews.apache.org/r/1929/#comment4463>

    The size of the tuple would need to be size*(size-1).
    Details - the inner loop is executed - (n-1) + (n-2) + .. (n - (n-1)) = n(n-1)/2 . Each time the inner loop is executed two columns are being added. So 2 * n(n-1)/2 = n(n-1)

trunk/src/org/apache/pig/builtin/COR.java
<https://reviews.apache.org/r/1929/#comment4464>

    I don't understand why the values are being added to a tuple as columns. That does not look right.

- Thejas

On 2011-09-16 18:11:08, Daniel Dai wrote:
bq.
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1929/
bq.  -----------------------------------------------------------
bq.
bq.  (Updated 2011-09-16 18:11:08)
bq.
bq.
bq.  Review request for pig and Thejas Nair.
bq.
bq.
bq.  Summary
bq.  -------
bq.
bq.  See PIG-2286
bq.
bq.
bq.  This addresses bug PIG-2286.
bq.      https://issues.apache.org/jira/browse/PIG-2286
bq.
bq.
bq.  Diffs
bq.  -----
bq.
bq.    trunk/src/org/apache/pig/builtin/COR.java 1171325
bq.    trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 1171325
bq.    trunk/test/e2e/pig/tests/nightly.conf 1171325
bq.
bq.  Diff: https://reviews.apache.org/r/1929/diff
bq.
bq.
bq.  Testing
bq.  -------
bq.
bq.  Unit-test:
bq.      all pass
bq.
bq.  Piggybank-test:
bq.      TestDBStorage fail for other reason, unrelated to patch
bq.
bq.  Test-patch:
bq.       [exec] +1 overall.
bq.       [exec]
bq.       [exec]     +1 @author. The patch does not contain any @author tags.
bq.       [exec]
bq.       [exec]     +1 tests included. The patch appears to include 3 new or modified tests.
bq.       [exec]
bq.       [exec]     +1 javadoc. The javadoc tool did not generate any warning messages.
bq.       [exec]
bq.       [exec]     +1 javac. The applied patch does not increase the total number of javac compiler warnings.
bq.       [exec]
bq.       [exec]     +1 findbugs. The patch does not introduce any new Findbugs warnings.
bq.       [exec]
bq.       [exec]     +1 release audit. The applied patch does not increase the total number of release audit warnings.
bq.
bq.
bq.  Thanks,
bq.
bq.  Daniel
bq.
bq.

> Using COR function in Piggybank results in ERROR 2018: Internal error. Unable to introduce the combiner for optimization
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2286
>                 URL: https://issues.apache.org/jira/browse/PIG-2286
>             Project: Pig
>          Issue Type: Bug
>          Components: impl, piggybank
>    Affects Versions: 0.9.0
>            Reporter: Viraj Bhat
>            Assignee: Daniel Dai
>         Attachments: PIG-2286-1.patch
>
>
> Usage of the COR function in a Pig script, results in an error. The "studenttab5" contains student, age and gpa separated by "tab".
> {code}
> register /home/viraj/pig-svn/trunk/contrib/piggybank/java/piggybank.jar;
> A = LOAD '/user/viraj/studenttab5' AS (name, age:double,gpa:double);
> B = group A all;
> C = foreach B generate group, COR(A.a, A.b);
> dump C;
> {code}
> {quote}
> 2011-09-14 17:03:22,001 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:9000
> 2011-09-14 17:03:22,088 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:9001
> 2011-09-14 17:03:22,960 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
> 2011-09-14 17:03:23,168 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
> 2011-09-14 17:03:23,179 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
> 2011-09-14 17:03:23,186 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2018: Internal error. Unable to introduce the combiner for optimization.
> {quote}
> Viraj

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
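
Editorial note on the COR.java review comment above: the tuple-size arithmetic, n(n-1), can be checked with a short sketch. The loop shape below is an assumption taken from the reviewer's description (an inner loop running over each later input, adding two columns per iteration); it is not the actual code from COR.java, and the class and method names are hypothetical.

```java
// Hypothetical sketch, not the COR.java implementation: it only mirrors the
// loop structure described in the review comment to check the column count.
class CorTupleSize {

    // Counts the columns added when, for every pair (i, j) with i < j,
    // the inner loop body runs once and appends two columns.
    static int countColumns(int n) {
        int columns = 0;
        for (int i = 0; i < n; i++) {
            // Inner loop runs (n - 1 - i) times: (n-1) + (n-2) + ... + 0 in total.
            for (int j = i + 1; j < n; j++) {
                columns += 2; // two columns per inner-loop iteration
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // Compare the counted columns with the closed form n(n-1).
        for (int n = 1; n <= 6; n++) {
            System.out.println(n + " inputs: counted " + countColumns(n)
                    + ", n(n-1) = " + (n * (n - 1)));
        }
    }
}
```

Under these assumptions the counted value and the closed form agree for every n, which matches the reviewer's suggestion that the tuple be sized size*(size-1).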