[jira] Commented: (PIG-1113) Diamond query optimization throws error in JOIN

2009-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784515#action_12784515
 ] 

Hadoop QA commented on PIG-1113:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12426566/PIG-1113.patch
  against trunk revision 885858.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/71/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/71/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/71/console

This message is automatically generated.

 Diamond query optimization throws error in JOIN
 ---

 Key: PIG-1113
 URL: https://issues.apache.org/jira/browse/PIG-1113
 Project: Pig
 Issue Type: Bug
 Reporter: Ankur
 Assignee: Richard Ding
 Fix For: 0.6.0

 Attachments: PIG-1113.patch


 The following script results in 1 M/R job as a result of diamond query 
 optimization but the script fails.
 set1 = LOAD 'set1' USING PigStorage as (a:chararray, b:chararray, c:chararray);
 set2 = LOAD 'set2' USING PigStorage as (a: chararray, b:chararray, c:bag{});
 set2_1 = FOREACH set2 GENERATE a as f1, b as f2, (chararray) 0 as f3;
 set2_2 = FOREACH set2 GENERATE a as f1, FLATTEN((IsEmpty(c) ? null : c)) as f2, (chararray) 1 as f3;
 all_set2 = UNION set2_1, set2_2;
 joined_sets = JOIN set1 BY (a,b), all_set2 BY (f2,f3);
 dump joined_sets;
 And here is the error:
 org.apache.pig.backend.executionengine.ExecException: ERROR 1071: Cannot convert a bag to a String
   at org.apache.pig.data.DataType.toString(DataType.java:739)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:625)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:364)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:288)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:162)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:247)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:238)
   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
   at org.apache.hadoop.mapred.Child.main(Child.java:159)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1113) Diamond query optimization throws error in JOIN

2009-11-30 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783984#action_12783984
 ] 

Richard Ding commented on PIG-1113:
---

The problem here is that the diamond query optimization didn't take into 
account that the diamond tail may also load files other than the file stored 
by the diamond head. The optimization should check the file specs (that is, 
make sure the load file of the diamond tail is the same as the store file of 
the diamond head) before removing the store/load combination.
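
The check described above can be sketched roughly as follows. This is a simplified Python model, not Pig's actual code: the FileSpec and Job classes, the removable_loads function, and the temporary path are all hypothetical names chosen for illustration. It only shows the idea that a store/load pair is a candidate for removal when the tail loads exactly the file the head stores, while any other load in the tail (such as the second JOIN input) is left alone.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileSpec:
    """Hypothetical stand-in for a file spec: a path plus a load/store function name."""
    path: str
    func: str = "PigStorage"


@dataclass
class Job:
    """Hypothetical, simplified M/R job: the files it loads and the files it stores."""
    name: str
    loads: List[FileSpec]
    stores: List[FileSpec]


def removable_loads(head: Job, tail: Job) -> List[FileSpec]:
    """Return the tail's loads whose file spec matches a store of the head.

    Only these store/load pairs may be removed when merging the two jobs;
    loads of unrelated files must be kept as they are.
    """
    stored = set(head.stores)
    return [spec for spec in tail.loads if spec in stored]


# Assumed example: the head stores a temporary file, and the tail loads
# that file plus an unrelated input ('set1').
tmp = FileSpec("/tmp/temp-1")  # hypothetical temporary path
head = Job("head", loads=[FileSpec("set2")], stores=[tmp])
tail = Job("tail", loads=[tmp, FileSpec("set1")], stores=[FileSpec("out")])

# Only the temporary file qualifies; 'set1' does not match any store of the head.
assert removable_loads(head, tail) == [tmp]
```

In this model, an optimizer that skipped the comparison would treat the load of 'set1' as if it were the head's output, which is the kind of mismatch the comment describes.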




[jira] Commented: (PIG-1113) Diamond query optimization throws error in JOIN

2009-11-26 Thread Ankur (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782877#action_12782877
 ] 

Ankur commented on PIG-1113:


The script fails even when a correct schema is specified for c:bag{}, so the 
following change does not fix the problem:

set2 = LOAD 'set2' USING PigStorage as (a: chararray, b:chararray, c:bag{T:tuple(l:chararray)});
