[jira] Updated: (PIG-1157) Sucessive replicated joins do not generate Map Reduce plan and fails due to OOM

2009-12-18 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1157:


   Resolution: Fixed
Fix Version/s: (was: 0.6.0)
   0.7.0
   Status: Resolved  (was: Patch Available)

patch committed, thanks Richard

> Sucessive replicated joins do not generate Map Reduce plan and fails due to 
> OOM
> ---
>
> Key: PIG-1157
> URL: https://issues.apache.org/jira/browse/PIG-1157
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.6.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.7.0
>
> Attachments: oomreplicatedjoin.pig, PIG-1157.patch, PIG-1157.patch, 
> replicatedjoinexplain.log
>
>
> Hi all,
>  I have a script which does 2 replicated joins in succession. Please note 
> that the inputs do not exist on the HDFS.
> {code}
> A = LOAD '/tmp/abc' USING PigStorage('\u0001') AS (a:long, b, c);
> A1 = FOREACH A GENERATE a;
> B = GROUP A1 BY a;
> C = LOAD '/tmp/xyz' USING PigStorage('\u0001') AS (x:long, y);
> D = JOIN C BY x, B BY group USING "replicated";
> E = JOIN A BY a, D by x USING "replicated";
> dump E;
> {code}
> 2009-12-16 19:12:00,253 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size before optimization: 4
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-only splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-reduce splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 2 out of total 2 splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size after optimization: 2
> 2009-12-16 19:12:00,713 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2998: Unhandled internal error. unable to create new native thread
> Details at logfile: pig_1260990666148.log
> Looking at the log file:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. unable to create new native thread
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:597)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:131)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
> at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:773)
> at org.apache.pig.PigServer.store(PigServer.java:522)
> at org.apache.pig.PigServer.openIterator(PigServer.java:458)
> at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
> at org.apache.pig.Main.main(Main.java:397)
> 
> If we want to look at the explain output, we find that there is no Map Reduce 
> plan that is generated. 
>  Why is the M/R plan not generated?
> Attaching the script and explain output.
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1157) Sucessive replicated joins do not generate Map Reduce plan and fails due to OOM

2009-12-18 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1157:
--

Status: Patch Available  (was: Open)

> Sucessive replicated joins do not generate Map Reduce plan and fails due to 
> OOM
> ---
>
> Key: PIG-1157
> URL: https://issues.apache.org/jira/browse/PIG-1157
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.6.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.6.0
>
> Attachments: oomreplicatedjoin.pig, PIG-1157.patch, PIG-1157.patch, 
> replicatedjoinexplain.log
>
>
> Hi all,
>  I have a script which does 2 replicated joins in succession. Please note 
> that the inputs do not exist on the HDFS.
> {code}
> A = LOAD '/tmp/abc' USING PigStorage('\u0001') AS (a:long, b, c);
> A1 = FOREACH A GENERATE a;
> B = GROUP A1 BY a;
> C = LOAD '/tmp/xyz' USING PigStorage('\u0001') AS (x:long, y);
> D = JOIN C BY x, B BY group USING "replicated";
> E = JOIN A BY a, D by x USING "replicated";
> dump E;
> {code}
> 2009-12-16 19:12:00,253 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size before optimization: 4
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-only splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-reduce splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 2 out of total 2 splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size after optimization: 2
> 2009-12-16 19:12:00,713 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2998: Unhandled internal error. unable to create new native thread
> Details at logfile: pig_1260990666148.log
> Looking at the log file:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. unable to create new native thread
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:597)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:131)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
> at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:773)
> at org.apache.pig.PigServer.store(PigServer.java:522)
> at org.apache.pig.PigServer.openIterator(PigServer.java:458)
> at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
> at org.apache.pig.Main.main(Main.java:397)
> 
> If we want to look at the explain output, we find that there is no Map Reduce 
> plan that is generated. 
>  Why is the M/R plan not generated?
> Attaching the script and explain output.
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1157) Sucessive replicated joins do not generate Map Reduce plan and fails due to OOM

2009-12-18 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1157:
--

Attachment: PIG-1157.patch

> Sucessive replicated joins do not generate Map Reduce plan and fails due to 
> OOM
> ---
>
> Key: PIG-1157
> URL: https://issues.apache.org/jira/browse/PIG-1157
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.6.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.6.0
>
> Attachments: oomreplicatedjoin.pig, PIG-1157.patch, PIG-1157.patch, 
> replicatedjoinexplain.log
>
>
> Hi all,
>  I have a script which does 2 replicated joins in succession. Please note 
> that the inputs do not exist on the HDFS.
> {code}
> A = LOAD '/tmp/abc' USING PigStorage('\u0001') AS (a:long, b, c);
> A1 = FOREACH A GENERATE a;
> B = GROUP A1 BY a;
> C = LOAD '/tmp/xyz' USING PigStorage('\u0001') AS (x:long, y);
> D = JOIN C BY x, B BY group USING "replicated";
> E = JOIN A BY a, D by x USING "replicated";
> dump E;
> {code}
> 2009-12-16 19:12:00,253 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size before optimization: 4
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-only splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-reduce splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 2 out of total 2 splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size after optimization: 2
> 2009-12-16 19:12:00,713 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2998: Unhandled internal error. unable to create new native thread
> Details at logfile: pig_1260990666148.log
> Looking at the log file:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. unable to create new native thread
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:597)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:131)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
> at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:773)
> at org.apache.pig.PigServer.store(PigServer.java:522)
> at org.apache.pig.PigServer.openIterator(PigServer.java:458)
> at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
> at org.apache.pig.Main.main(Main.java:397)
> 
> If we want to look at the explain output, we find that there is no Map Reduce 
> plan that is generated. 
>  Why is the M/R plan not generated?
> Attaching the script and explain output.
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1157) Sucessive replicated joins do not generate Map Reduce plan and fails due to OOM

2009-12-18 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1157:
--

Status: Open  (was: Patch Available)

> Sucessive replicated joins do not generate Map Reduce plan and fails due to 
> OOM
> ---
>
> Key: PIG-1157
> URL: https://issues.apache.org/jira/browse/PIG-1157
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.6.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.6.0
>
> Attachments: oomreplicatedjoin.pig, PIG-1157.patch, PIG-1157.patch, 
> replicatedjoinexplain.log
>
>
> Hi all,
>  I have a script which does 2 replicated joins in succession. Please note 
> that the inputs do not exist on the HDFS.
> {code}
> A = LOAD '/tmp/abc' USING PigStorage('\u0001') AS (a:long, b, c);
> A1 = FOREACH A GENERATE a;
> B = GROUP A1 BY a;
> C = LOAD '/tmp/xyz' USING PigStorage('\u0001') AS (x:long, y);
> D = JOIN C BY x, B BY group USING "replicated";
> E = JOIN A BY a, D by x USING "replicated";
> dump E;
> {code}
> 2009-12-16 19:12:00,253 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size before optimization: 4
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-only splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-reduce splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 2 out of total 2 splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size after optimization: 2
> 2009-12-16 19:12:00,713 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2998: Unhandled internal error. unable to create new native thread
> Details at logfile: pig_1260990666148.log
> Looking at the log file:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. unable to create new native thread
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:597)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:131)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
> at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:773)
> at org.apache.pig.PigServer.store(PigServer.java:522)
> at org.apache.pig.PigServer.openIterator(PigServer.java:458)
> at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
> at org.apache.pig.Main.main(Main.java:397)
> 
> If we want to look at the explain output, we find that there is no Map Reduce 
> plan that is generated. 
>  Why is the M/R plan not generated?
> Attaching the script and explain output.
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1157) Sucessive replicated joins do not generate Map Reduce plan and fails due to OOM

2009-12-17 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1157:
--

Attachment: PIG-1157.patch

The problem is that, by merging a MR splittee with a FR join, the MultiQuery 
optimizer may introduce a direct cycle to the graph of the MR plan. This patch 
fixed this problem by not merging FR splitees.

This is actually stronger than necessary. A better solution would be to check 
if merging a MR splittee would form a directed cycle in the original DAG before 
merging it, and if not, allow the merge to go ahead.

> Sucessive replicated joins do not generate Map Reduce plan and fails due to 
> OOM
> ---
>
> Key: PIG-1157
> URL: https://issues.apache.org/jira/browse/PIG-1157
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.6.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.6.0
>
> Attachments: oomreplicatedjoin.pig, PIG-1157.patch, 
> replicatedjoinexplain.log
>
>
> Hi all,
>  I have a script which does 2 replicated joins in succession. Please note 
> that the inputs do not exist on the HDFS.
> {code}
> A = LOAD '/tmp/abc' USING PigStorage('\u0001') AS (a:long, b, c);
> A1 = FOREACH A GENERATE a;
> B = GROUP A1 BY a;
> C = LOAD '/tmp/xyz' USING PigStorage('\u0001') AS (x:long, y);
> D = JOIN C BY x, B BY group USING "replicated";
> E = JOIN A BY a, D by x USING "replicated";
> dump E;
> {code}
> 2009-12-16 19:12:00,253 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size before optimization: 4
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-only splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-reduce splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 2 out of total 2 splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size after optimization: 2
> 2009-12-16 19:12:00,713 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2998: Unhandled internal error. unable to create new native thread
> Details at logfile: pig_1260990666148.log
> Looking at the log file:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. unable to create new native thread
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:597)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:131)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
> at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:773)
> at org.apache.pig.PigServer.store(PigServer.java:522)
> at org.apache.pig.PigServer.openIterator(PigServer.java:458)
> at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
> at org.apache.pig.Main.main(Main.java:397)
> 
> If we want to look at the explain output, we find that there is no Map Reduce 
> plan that is generated. 
>  Why is the M/R plan not generated?
> Attaching the script and explain output.
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1157) Sucessive replicated joins do not generate Map Reduce plan and fails due to OOM

2009-12-17 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1157:
--

Status: Patch Available  (was: Open)

> Sucessive replicated joins do not generate Map Reduce plan and fails due to 
> OOM
> ---
>
> Key: PIG-1157
> URL: https://issues.apache.org/jira/browse/PIG-1157
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.6.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.6.0
>
> Attachments: oomreplicatedjoin.pig, PIG-1157.patch, 
> replicatedjoinexplain.log
>
>
> Hi all,
>  I have a script which does 2 replicated joins in succession. Please note 
> that the inputs do not exist on the HDFS.
> {code}
> A = LOAD '/tmp/abc' USING PigStorage('\u0001') AS (a:long, b, c);
> A1 = FOREACH A GENERATE a;
> B = GROUP A1 BY a;
> C = LOAD '/tmp/xyz' USING PigStorage('\u0001') AS (x:long, y);
> D = JOIN C BY x, B BY group USING "replicated";
> E = JOIN A BY a, D by x USING "replicated";
> dump E;
> {code}
> 2009-12-16 19:12:00,253 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size before optimization: 4
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-only splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-reduce splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 2 out of total 2 splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size after optimization: 2
> 2009-12-16 19:12:00,713 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2998: Unhandled internal error. unable to create new native thread
> Details at logfile: pig_1260990666148.log
> Looking at the log file:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. unable to create new native thread
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:597)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:131)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
> at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:773)
> at org.apache.pig.PigServer.store(PigServer.java:522)
> at org.apache.pig.PigServer.openIterator(PigServer.java:458)
> at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
> at org.apache.pig.Main.main(Main.java:397)
> 
> If we want to look at the explain output, we find that there is no Map Reduce 
> plan that is generated. 
>  Why is the M/R plan not generated?
> Attaching the script and explain output.
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1157) Sucessive replicated joins do not generate Map Reduce plan and fails due to OOM

2009-12-16 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat updated PIG-1157:


Attachment: oomreplicatedjoin.pig
replicatedjoinexplain.log

Explain output and Pig script.

> Sucessive replicated joins do not generate Map Reduce plan and fails due to 
> OOM
> ---
>
> Key: PIG-1157
> URL: https://issues.apache.org/jira/browse/PIG-1157
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.6.0
>Reporter: Viraj Bhat
> Fix For: 0.6.0
>
> Attachments: oomreplicatedjoin.pig, replicatedjoinexplain.log
>
>
> Hi all,
>  I have a script which does 2 replicated joins in succession. Please note 
> that the inputs do not exist on the HDFS.
> {code}
> A = LOAD '/tmp/abc' USING PigStorage('\u0001') AS (a:long, b, c);
> A1 = FOREACH A GENERATE a;
> B = GROUP A1 BY a;
> C = LOAD '/tmp/xyz' USING PigStorage('\u0001') AS (x:long, y);
> D = JOIN C BY x, B BY group USING "replicated";
> E = JOIN A BY a, D by x USING "replicated";
> dump E;
> {code}
> 2009-12-16 19:12:00,253 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size before optimization: 4
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-only splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 1 map-reduce splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - Merged 2 out of total 2 splittees.
> 2009-12-16 19:12:00,254 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
>  - MR plan size after optimization: 2
> 2009-12-16 19:12:00,713 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2998: Unhandled internal error. unable to create new native thread
> Details at logfile: pig_1260990666148.log
> Looking at the log file:
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. unable to create new native thread
> java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:597)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:131)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
> at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:773)
> at org.apache.pig.PigServer.store(PigServer.java:522)
> at org.apache.pig.PigServer.openIterator(PigServer.java:458)
> at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:532)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:190)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
> at org.apache.pig.Main.main(Main.java:397)
> 
> If we want to look at the explain output, we find that there is no Map Reduce 
> plan that is generated. 
>  Why is the M/R plan not generated?
> Attaching the script and explain output.
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.