I have updated to trunk and Hadoop 0.17.0. The memory limit per task is
400 MB. An OutOfMemoryError is thrown at the first reduce. I have
noticed that this Pig script worked with 1 GB of memory per task. What are
the memory requirements for Pig?
Thanks!
Iván de Prado
www.ivanprado.es
2008-05-30 11:21:29,863 INFO
org.apache.pig.impl.util.SpillableMemoryManager: low memory handler
called init = 5439488(5312K) used = 166885368(162973K) committed =
246087680(240320K) max = 279642112(273088K)
2008-05-30 11:21:33,069 INFO org.apache.pig.impl.util.SpillableMemoryManager:
low memory handler called init = 5439488(5312K) used = 225822592(220529K)
committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:36,047 INFO org.apache.pig.impl.util.SpillableMemoryManager:
low memory handler called init = 5439488(5312K) used = 169349352(165380K)
committed = 267780096(261504K) max = 279642112(273088K)
2008-05-30 11:21:39,369 INFO org.apache.pig.impl.util.SpillableMemoryManager:
low memory handler called init = 5439488(5312K) used = 267780072(261503K)
committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:44,505 INFO org.apache.pig.impl.util.SpillableMemoryManager:
low memory handler called init = 5439488(5312K) used = 255668880(249676K)
committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:51,019 INFO org.apache.pig.impl.util.SpillableMemoryManager:
low memory handler called init = 5439488(5312K) used = 265970168(259736K)
committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:21:58,115 INFO org.apache.pig.impl.util.SpillableMemoryManager:
low memory handler called init = 5439488(5312K) used = 266914224(260658K)
committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:22:01,423 INFO org.apache.pig.impl.util.SpillableMemoryManager:
low memory handler called init = 5439488(5312K) used = 223674352(218431K)
committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:22:05,163 INFO org.apache.pig.impl.util.SpillableMemoryManager:
low memory handler called init = 5439488(5312K) used = 258252264(252199K)
committed = 279642112(273088K) max = 279642112(273088K)
2008-05-30 11:22:41,457 ERROR
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce:
java.lang.OutOfMemoryError: Java heap space
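
For anyone reading a log like the one above, a small script can make the trend easier to see. This is only a sketch: it assumes the "low memory handler called" message always prints init/used/committed/max in that order, and the function names (parse_usage, summarize) are made up for illustration. It reports the numbers exactly as logged; whether they describe the whole heap or a single memory pool is not stated in the log.

```python
import re

# Memory figures as printed by the "low memory handler called" message.
USAGE_RE = re.compile(
    r"init = (\d+)\(\d+K\) used = (\d+)\(\d+K\) "
    r"committed = (\d+)\(\d+K\) max = (\d+)\(\d+K\)"
)


def parse_usage(log_text):
    """Return a list of (used_bytes, max_bytes) pairs found in log_text."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(int(m.group(2)), int(m.group(4))) for m in USAGE_RE.finditer(flat)]


def summarize(log_text):
    """Print one line per handler call with used/max in MiB and as a percentage."""
    for used, max_bytes in parse_usage(log_text):
        print(
            f"used {used / 2**20:6.1f} MiB of {max_bytes / 2**20:6.1f} MiB "
            f"({100 * used / max_bytes:.0f}%)"
        )


# Two entries copied from the log above, with the original line wrapping.
sample = """
2008-05-30 11:21:29,863 INFO
org.apache.pig.impl.util.SpillableMemoryManager: low memory handler
called init = 5439488(5312K) used = 166885368(162973K) committed =
246087680(240320K) max = 279642112(273088K)
2008-05-30 11:21:39,369 INFO org.apache.pig.impl.util.SpillableMemoryManager:
low memory handler called init = 5439488(5312K) used = 267780072(261503K)
committed = 279642112(273088K) max = 279642112(273088K)
"""

summarize(sample)
```

Feeding it the full task log instead of the two-entry sample should show how close "used" gets to "max" before the OutOfMemoryError.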
________________________________________________________________________
Explain:
Logical Plan:
|---LOSort ( BY GENERATE {[FLATTEN PROJECT $1]} )
|---LOEval ( GENERATE {[FLATTEN PROJECT $1],[FLATTEN PROJECT $2],[FLATTEN
PROJECT $3],[FLATTEN PROJECT $4],[FLATTEN PROJECT $5]} )
|---LOCogroup ( GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT
$0],[*]}, GENERATE {[PROJECT $0],[*]}, GENERATE {[PROJECT $0],[*]}, GENERATE
{[PROJECT $0],[*]} )
|---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT
$1]})]} )
|---LOCogroup ( GENERATE {[PROJECT $2],[*]} )
|---LOEval ( [FILTER BY ([PROJECT $6] == ['1'])]
)
|---LOEval ( GENERATE {[FLATTEN PROJECT
$1],[FLATTEN PROJECT $2]} )
|---LOCogroup ( GENERATE {[PROJECT
$1],[*]}, GENERATE {[PROJECT $0],[*]} )
|---LOEval ( [FILTER BY
(([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] !=
['0']) AND ([PROJECT $6] != ['2']))] )
|---LOLoad ( file =
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS
id,wid,locid,status,proptype,country,sor )
|---LOLoad ( file =
/user/properazzi/flm/quotas.txt AS wqid )
|---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT
$1]})]} )
|---LOCogroup ( GENERATE {[PROJECT $0],[*]} )
|---LOEval ( GENERATE {[FLATTEN PROJECT $0]} )
|---LOCogroup ( GENERATE {[*],[*]} )
|---LOEval ( GENERATE {[PROJECT
$2],[PROJECT $1]} )
|---LOEval ( [FILTER BY
([PROJECT $6] == ['1'])] )
|---LOEval ( GENERATE
{[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
|---LOCogroup (
GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
|---LOEval (
[FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT
$6] != ['0']) AND ([PROJECT $6] != ['2']))] )
|---LOLoad ( file =
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS
id,wid,locid,status,proptype,country,sor )
|---LOLoad (
file = /user/properazzi/flm/quotas.txt AS wqid )
|---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT
$1]})]} )
|---LOCogroup ( GENERATE {[PROJECT $2],[*]} )
|---LOEval ( [FILTER BY (([PROJECT $6] == ['3'])
OR ([PROJECT $6] == ['4']) OR ([PROJECT $6] == ['5']))] )
|---LOEval ( GENERATE {[FLATTEN PROJECT
$1],[FLATTEN PROJECT $2]} )
|---LOCogroup ( GENERATE {[PROJECT
$1],[*]}, GENERATE {[PROJECT $0],[*]} )
|---LOEval ( [FILTER BY
(([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT $6] !=
['0']) AND ([PROJECT $6] != ['2']))] )
|---LOLoad ( file =
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS
id,wid,locid,status,proptype,country,sor )
|---LOLoad ( file =
/user/properazzi/flm/quotas.txt AS wqid )
|---LOEval ( GENERATE {[PROJECT $0],[COUNT(GENERATE {[PROJECT
$1]})]} )
|---LOCogroup ( GENERATE {[PROJECT $0],[*]} )
|---LOEval ( GENERATE {[FLATTEN PROJECT $0]} )
|---LOCogroup ( GENERATE {[*],[*]} )
|---LOEval ( GENERATE {[PROJECT
$2],[PROJECT $1]} )
|---LOEval ( [FILTER BY
(([PROJECT $6] == ['3']) OR ([PROJECT $6] == ['4']) OR ([PROJECT $6] ==
['5']))] )
|---LOEval ( GENERATE
{[FLATTEN PROJECT $1],[FLATTEN PROJECT $2]} )
|---LOCogroup (
GENERATE {[PROJECT $1],[*]}, GENERATE {[PROJECT $0],[*]} )
|---LOEval (
[FILTER BY (([PROJECT $3] == ['2']) AND ([PROJECT $4] != ['0']) AND ([PROJECT
$6] != ['0']) AND ([PROJECT $6] != ['2']))] )
|---LOLoad ( file =
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS
id,wid,locid,status,proptype,country,sor )
|---LOLoad (
file = /user/properazzi/flm/quotas.txt AS wqid )
|---LOEval ( GENERATE {[FLATTEN PROJECT $0]} )
|---LOCogroup ( GENERATE {[*],[*]} )
|---LOEval ( GENERATE {[PROJECT $2],[PROJECT $5]}
)
|---LOLoad ( file =
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump AS
id,wid,locid,status,proptype,country,sor )
-----------------------------------------------
Physical Plan:
|---POMapreduce
Partition Function:
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.SortPartitioner
Map : *
Reduce : Generate(Project(1))
Grouping : Generate(Generate(Project(1)),*)
Input File(s) : /tmp/temp1398936874/tmp-1538794351
Properties :
|---POMapreduce
Map : Composite(*,Generate(Project(1)))
Reduce :
Generate(FuncEval(org.apache.pig.impl.builtin.FindQuantiles(Generate(Const(1),Composite(Project(1),Sort(*))))))
Grouping : Generate(Const(all),*)
Input File(s) : /tmp/temp1398936874/tmp-1538794351
Properties :
|---POMapreduce
Map : *****
Reduce :
Generate(Project(1),Project(2),Project(3),Project(4),Project(5))
Grouping :
Generate(Project(0),*)Generate(Project(0),*)Generate(Project(0),*)Generate(Project(0),*)Generate(Project(0),*)
Input File(s) : /tmp/temp1398936874/tmp-585863913,
/tmp/temp1398936874/tmp-536934015, /tmp/temp1398936874/tmp23578316,
/tmp/temp1398936874/tmp662497645, /tmp/temp1398936874/tmp582570364
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Combine :
Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce :
Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(2),*)
Input File(s) : /tmp/temp1398936874/tmp-1880872512
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce :
Composite(Generate(Project(1),Project(2)),Filter: COMP )
Grouping :
Generate(Project(1),*)Generate(Project(0),*)
Input File(s) :
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump,
/user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Combine :
Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce :
Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(0),*)
Input File(s) : /tmp/temp1398936874/tmp-1242543041
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Reduce : Generate(Project(0))
Grouping : Generate(*,*)
Input File(s) : /tmp/temp1398936874/tmp2015750396
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce :
Composite(Generate(Project(1),Project(2)),Filter: COMP
,Generate(Project(2),Project(1)))
Grouping :
Generate(Project(1),*)Generate(Project(0),*)
Input File(s) :
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump,
/user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Combine :
Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce :
Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(2),*)
Input File(s) : /tmp/temp1398936874/tmp-1934972255
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce :
Composite(Generate(Project(1),Project(2)),Filter: OR )
Grouping :
Generate(Project(1),*)Generate(Project(0),*)
Input File(s) :
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump,
/user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Combine :
Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Initial(Generate(Project(1)))))
Reduce :
Generate(Project(0),FuncEval(org.apache.pig.builtin.COUNT$Final(Generate(Composite(Project(1),Project(1))))))
Grouping : Generate(Project(0),*)
Input File(s) : /tmp/temp1398936874/tmp799024189
Properties : pig.input.splittable:true
|---POMapreduce
Map : *
Reduce : Generate(Project(0))
Grouping : Generate(*,*)
Input File(s) : /tmp/temp1398936874/tmp1055965366
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Filter: AND )*
Reduce :
Composite(Generate(Project(1),Project(2)),Filter: OR
,Generate(Project(2),Project(1)))
Grouping :
Generate(Project(1),*)Generate(Project(0),*)
Input File(s) :
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump,
/user/properazzi/flm/quotas.txt
Properties : pig.input.splittable:true
|---POMapreduce
Map : Composite(*,Generate(Project(2),Project(5)))
Reduce : Generate(Project(0))
Grouping : Generate(*,*)
Input File(s) :
/user/properazzi/mc/mc_20080529000002/input/partition_B.dump
Properties : pig.input.splittable:true
On Fri, 30-05-2008 at 20:01 +1000, pi song wrote:
> We've already fixed the memory issue introduced in PIG-85. Could you please
> update to the latest version and try again?
>
> Pi
>
> On Wed, May 28, 2008 at 9:18 AM, pi song <[EMAIL PROTECTED]> wrote:
>
> > This might have nothing to do with Hadoop 0.17 but with something else that
> > we fixed right after it. I'm investigating. Sorry for the inconvenience.
> >
> > FYI,
> > Pi
> >
> >
> > On 5/28/08, Tanton Gibbs <[EMAIL PROTECTED]> wrote:
> >>
> >> I think you need to increase the amount of memory you give to java.
> >>
> >> It looks like it is currently set to 256M. I upped mine to 2G. Of
> >> course it depends on how much RAM you have available.
> >>
> >> mapred.child.java.opts is the parameter;
> >> mine is currently set to 2048M in my hadoop-site.xml file.
> >>
> >> For performance reasons, I upped the io.sort.mb parameter. However,
> >> if this is too close to 50% of the total memory, you will get the
> >> Spillable messages.
> >>
> >> HTH,
> >> Tanton
> >>
> >
> >
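
The advice quoted above (raise the heap through mapred.child.java.opts, and keep io.sort.mb well below half of it) can be turned into a quick sanity check. The following is a minimal sketch, not anything shipped with Hadoop or Pig: the helper names are hypothetical, the 0.5 threshold is just the rule of thumb from the quoted message, and it assumes the heap is given as a -Xmx option inside the mapred.child.java.opts string.

```python
import re


def heap_mb_from_opts(java_opts):
    """Return the -Xmx value of a JVM options string in MB, or None if absent."""
    m = re.search(r"-Xmx(\d+)([kKmMgG])?", java_opts)
    if m is None:
        return None
    value = int(m.group(1))
    unit = (m.group(2) or "").lower()
    # In the JVM's -Xmx syntax, a value without a suffix is in bytes.
    factor = {"": 1 / 2**20, "k": 1 / 2**10, "m": 1, "g": 2**10}[unit]
    return value * factor


def sort_buffer_too_large(java_opts, io_sort_mb, threshold=0.5):
    """True if io_sort_mb is at or above `threshold` times the configured heap.

    `threshold` defaults to the 50% rule of thumb from the thread (an assumption,
    not a documented limit).
    """
    heap_mb = heap_mb_from_opts(java_opts)
    if heap_mb is None:
        raise ValueError("no -Xmx setting found in: " + java_opts)
    return io_sort_mb >= threshold * heap_mb


# Example values only; substitute the settings from your own hadoop-site.xml.
print(heap_mb_from_opts("-Xmx2048m"))
print(sort_buffer_too_large("-Xmx256m", io_sort_mb=100))
print(sort_buffer_too_large("-Xmx256m", io_sort_mb=128))
```

A True result only means the sort buffer takes a large share of the heap; it does not by itself explain the OutOfMemoryError reported at the top of this thread.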