[jira] Commented: (PIG-1157) Successive replicated joins do not generate Map Reduce plan and fail due to OOM

2009-12-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792468#action_12792468
 ] 

Hadoop QA commented on PIG-1157:


-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12428359/PIG-1157.patch
  against trunk revision 892125.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/140/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/140/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/140/console

This message is automatically generated.

 Successive replicated joins do not generate Map Reduce plan and fail due to 
 OOM
 ---

 Key: PIG-1157
 URL: https://issues.apache.org/jira/browse/PIG-1157
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.6.0
Reporter: Viraj Bhat
Assignee: Richard Ding
 Fix For: 0.6.0

 Attachments: oomreplicatedjoin.pig, PIG-1157.patch, 
 replicatedjoinexplain.log


 Hi all,
  I have a script that performs two replicated joins in succession. Please note 
 that the inputs do not exist on HDFS.
 {code}
 A = LOAD '/tmp/abc' USING PigStorage('\u0001') AS (a:long, b, c);
 A1 = FOREACH A GENERATE a;
 B = GROUP A1 BY a;
 C = LOAD '/tmp/xyz' USING PigStorage('\u0001') AS (x:long, y);
 D = JOIN C BY x, B BY group USING "replicated";
 E = JOIN A BY a, D BY x USING "replicated";
 DUMP E;
 {code}
 2009-12-16 19:12:00,253 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 4
 2009-12-16 19:12:00,254 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 1 map-only splittees.
 2009-12-16 19:12:00,254 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 1 map-reduce splittees.
 2009-12-16 19:12:00,254 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 2 out of total 2 splittees.
 2009-12-16 19:12:00,254 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 2
 2009-12-16 19:12:00,713 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. unable to create new native thread
 Details at logfile: pig_1260990666148.log
 Looking at the log file:
 Pig Stack Trace
 ---
 ERROR 2998: Unhandled internal error. unable to create new native thread
 java.lang.OutOfMemoryError: unable to create new native thread
     at java.lang.Thread.start0(Native Method)
     at java.lang.Thread.start(Thread.java:597)
     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:131)
     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:773)
     at org.apache.pig.PigServer.store(PigServer.java:522)
     at org.apache.pig.PigServer.openIterator(PigServer.java:458)
     at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
     at org.apache.pig.Main.main(Main.java:397)
 
 The explain output shows that no Map Reduce plan is generated.
  Why is the M/R plan not generated?
 I am attaching the script and the explain output.
 Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (PIG-1157) Successive replicated joins do not generate Map Reduce plan and fail due to OOM

2009-12-18 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792593#action_12792593
 ] 

Olga Natkovich commented on PIG-1157:
-

+1. Patch looks good. Will commit once the tests pass.

 Successive replicated joins do not generate Map Reduce plan and fail due to 
 OOM
 ---

 Key: PIG-1157
 URL: https://issues.apache.org/jira/browse/PIG-1157
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.6.0
Reporter: Viraj Bhat
Assignee: Richard Ding
 Fix For: 0.6.0

 Attachments: oomreplicatedjoin.pig, PIG-1157.patch, PIG-1157.patch, 
 replicatedjoinexplain.log






[jira] Commented: (PIG-1157) Successive replicated joins do not generate Map Reduce plan and fail due to OOM

2009-12-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792633#action_12792633
 ] 

Hadoop QA commented on PIG-1157:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12428448/PIG-1157.patch
  against trunk revision 892125.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/141/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/141/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/141/console

This message is automatically generated.

 Successive replicated joins do not generate Map Reduce plan and fail due to 
 OOM
 ---

 Key: PIG-1157
 URL: https://issues.apache.org/jira/browse/PIG-1157
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.6.0
Reporter: Viraj Bhat
Assignee: Richard Ding
 Fix For: 0.6.0

 Attachments: oomreplicatedjoin.pig, PIG-1157.patch, PIG-1157.patch, 
 replicatedjoinexplain.log




[jira] Commented: (PIG-1157) Successive replicated joins do not generate Map Reduce plan and fail due to OOM

2009-12-16 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791730#action_12791730
 ] 

Richard Ding commented on PIG-1157:
---

Two quick observations:

1. The script works if multi-query optimization is disabled (-M).
2. The script also works if using regular join instead of replicated join.

I'll look into it further.
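
For reference, the second observation can be sketched as the reporter's script with both replicated joins replaced by default joins. This is only an illustration of the workaround, assuming the same paths and schemas as in the report; it is not the fix in the attached patch.

```pig
-- Sketch of observation 2: same pipeline as the reported script,
-- but both joins use the default (non-replicated) join.
A = LOAD '/tmp/abc' USING PigStorage('\u0001') AS (a:long, b, c);
A1 = FOREACH A GENERATE a;
B = GROUP A1 BY a;
C = LOAD '/tmp/xyz' USING PigStorage('\u0001') AS (x:long, y);
D = JOIN C BY x, B BY group;   -- no USING clause: regular join
E = JOIN A BY a, D BY x;       -- no USING clause: regular join
DUMP E;
-- Observation 1 needs no script change: run the original script
-- with multi-query optimization disabled (the -M option).
```

Either workaround gives up something: default joins add a reduce phase that the replicated join avoids, and -M turns off multi-query optimization for the whole script.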

 Successive replicated joins do not generate Map Reduce plan and fail due to 
 OOM
 ---

 Key: PIG-1157
 URL: https://issues.apache.org/jira/browse/PIG-1157
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.6.0
Reporter: Viraj Bhat
 Fix For: 0.6.0

 Attachments: oomreplicatedjoin.pig, replicatedjoinexplain.log


