Hi Rohini, Can you provide some details into how big is the input dataset, data volume that reducers receive from Mappers and the number of reducers you are using?
Thanks, Prashant On Wed, Mar 21, 2012 at 12:34 PM, Rohini U <[email protected]> wrote: > Hi, > > I have a pig script which does a simple GROUPing followed by couting and I > get this error. My data is certaining not that big for it to cause this > out of memory error. Is there a chance that this is because of some bug ? > Did any one come across this kind of error before? > > I am using pig 0.9.1 with hadoop 0.20.205 > > My script: > rawItems = LOAD 'in' as (item1, item2, item3, type, count); > > grouped = GROUP rawItems BY (item1, item2, item3, type); > > counts = FOREACH grouped { > selectedFields = FILTER rawItems BY type="EMPLOYER"; > GENERATE > FLATTEN(group) as (item1, item2, item3, type) , > SUM(selectedFields.count) as count > > } > > Stack Trace: > > 2012-03-21 19:19:59,346 FATAL org.apache.hadoop.mapred.Child (main): Error > running child : java.lang.OutOfMemoryError: GC overhead limit exceeded > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:387) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:406) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:570) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PORelationToExprProject.getNext(PORelationToExprProject.java:107) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:570) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:248) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:159) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:184) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:293) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:453) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:324) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:159) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:184) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:281) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:324) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332) > at > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284) > at > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:459) > at > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:427) > at > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:407) > at > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:261) > at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176) > at > org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:662) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:425) > at org.apache.hadoop.mapred.Child$4.run(Child.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) > at org.apache.hadoop.mapred.Child.main(Child.java:249) > > Thanks > -Rohini >
