Try increasing the heap size. If you are running through bin/pig, set
PIG_HEAPSIZE (in MB; the default is 1000). You can use the "pig -secretDebugCmd"
option to see what the command line looks like.
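As a rough sketch of the above, assuming a Bourne-style shell and that pig is on the PATH: the value 2048 and the script name myscript.pig are placeholders for illustration, not recommendations. The printed command line should contain a java -Xmx flag that reflects the new value.

```shell
# Raise the heap for the Pig client JVM; the value is in MB.
# 2048 is an illustrative figure, pick one that fits your machine.
export PIG_HEAPSIZE=2048

# Show the command line that bin/pig builds, to confirm the heap
# setting was picked up. myscript.pig is a placeholder script name.
if command -v pig >/dev/null 2>&1; then
    pig -secretDebugCmd myscript.pig
fi
```

The stack trace comes from the client-side parser (PigServer, QueryParser), so it is the heap of the local pig process that matters here, not the heap of the map-reduce tasks.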
Daniel
On 06/15/2011 10:09 AM, Shubham Chopra wrote:
Hi,
I am using Pig for number crunching on data that has a large number of
columns (~300 or so). The script has around 25 operators and all I am doing
in the script is group bys and SUMs. The script fails with the following
exception:
<code>
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.HashMap.<init>(HashMap.java:209)
at org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.<init>(Schema.java:190)
at org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.clone(Schema.java:450)
at org.apache.pig.impl.logicalLayer.schema.Schema.clone(Schema.java:1005)
at org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.clone(Schema.java:450)
at org.apache.pig.impl.logicalLayer.ExpressionOperator.clone(ExpressionOperator.java:144)
at org.apache.pig.impl.logicalLayer.LOProject.clone(LOProject.java:447)
at org.apache.pig.impl.logicalLayer.LogicalPlan.clone(LogicalPlan.java:116)
at org.apache.pig.impl.logicalLayer.LogicalPlanCloneHelper.<init>(LogicalPlanCloneHelper.java:63)
at org.apache.pig.impl.logicalLayer.LogicalPlanCloner.getClonedPlan(LogicalPlanCloner.java:45)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:3504)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1464)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:1013)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:800)
at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1601)
at org.apache.pig.PigServer$Graph.clone(PigServer.java:1645)
at org.apache.pig.PigServer.getClonedGraph(PigServer.java:527)
at org.apache.pig.PigServer.storeEx(PigServer.java:850)
at org.apache.pig.PigServer.store(PigServer.java:816)
at org.apache.pig.PigServer.store(PigServer.java:784)
</code>
The complete output I see is the following:
<code>
$run-script
11/06/15 09:19:27 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: hdfs://abcd:9000
11/06/15 09:19:28 INFO executionengine.HExecutionEngine: Connecting to map-reduce job tracker at: abcd:9001
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.HashMap.<init>(HashMap.java:209)
at org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.<init>(Schema.java:190)
at org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.clone(Schema.java:450)
at org.apache.pig.impl.logicalLayer.schema.Schema.clone(Schema.java:1005)
at org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.clone(Schema.java:450)
at org.apache.pig.impl.logicalLayer.ExpressionOperator.clone(ExpressionOperator.java:144)
at org.apache.pig.impl.logicalLayer.LOProject.clone(LOProject.java:447)
at org.apache.pig.impl.logicalLayer.LogicalPlan.clone(LogicalPlan.java:116)
at org.apache.pig.impl.logicalLayer.LogicalPlanCloneHelper.<init>(LogicalPlanCloneHelper.java:63)
at org.apache.pig.impl.logicalLayer.LogicalPlanCloner.getClonedPlan(LogicalPlanCloner.java:45)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:3504)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1464)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:1013)
at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:800)
at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1601)
at org.apache.pig.PigServer$Graph.clone(PigServer.java:1645)
at org.apache.pig.PigServer.getClonedGraph(PigServer.java:527)
at org.apache.pig.PigServer.storeEx(PigServer.java:850)
at org.apache.pig.PigServer.store(PigServer.java:816)
at org.apache.pig.PigServer.store(PigServer.java:784)
</code>
The process uses around 1.2 GB of RAM before dying with the exception
above. Has anyone else faced a similar situation? Is there any way around
this?
Thanks,
Shubham.