Have you tried increasing g (the number of groups) to 1000 or above?
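
For reference, the relevant options on the parallel FPGrowth driver look
roughly like this on the 0.4-era command line (treat this as a sketch; exact
option names can vary between versions):

  mahout fpg -i <input> -o <output> -method mapreduce -s 250 -g 1000 -k 50

where -s is minSupport, -g is the number of groups and -k is the number of
top patterns kept per feature. Raising -g puts fewer features into each
group, so each reducer builds a smaller conditional FP-tree. If the reducers
still blow the heap, you can also raise the task heap with something like
-Dmapred.child.java.opts=-Xmx3072m, assuming your nodes have the memory to
spare.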

On Tue, Nov 23, 2010 at 8:50 PM, <[email protected]> wrote:

> Hello all,
> I was able to successfully test PFPGrowth with 50M transactions. Now I am
> testing with 150M transactions, and no matter what group size I use I run
> out of memory when running the FPGrowth job. It finishes the parallel
> counting and transaction sorting jobs fine, but when it runs the FPGrowth
> job I always get an OutOfMemoryError.
>
> On the Hadoop side: the map/reduce task heap size is 2G, and there are 24
> reduce tasks across a 4-node Hadoop cluster.
> On the Mahout side: I specified minSupport as 250 and tried group sizes
> from 500 to 3000.
> The 150M transactions generate about 6500 features, so I thought a group
> size of 500 should be enough to avoid running out of memory.
>
> What parameters can I change to fix the out-of-memory issue?
> Can someone shed some light on how to choose parameter values that avoid
> such issues on a production system?
>
> Any help is appreciated.
>
> Praveen
>
> 10/11/23 10:16:52 INFO mapred.JobClient:  map 100% reduce 20%
> 10/11/23 10:17:01 INFO mapred.JobClient:  map 100% reduce 17%
> 10/11/23 10:17:03 INFO mapred.JobClient: Task Id :
> attempt_201011221932_0009_r_000013_2, Status : FAILED
> Error: Java heap space
> 10/11/23 10:17:10 INFO mapred.JobClient:  map 100% reduce 14%
> 10/11/23 10:17:12 INFO mapred.JobClient: Task Id :
> attempt_201011221932_0009_r_000018_0, Status : FAILED
> Error: Java heap space
> 10/11/23 10:17:14 INFO mapred.JobClient:  map 100% reduce 11%
> 10/11/23 10:17:16 INFO mapred.JobClient:  map 100% reduce 12%
> 10/11/23 10:17:16 INFO mapred.JobClient: Task Id :
> attempt_201011221932_0009_r_000016_1, Status : FAILED
> Error: Java heap space
> 10/11/23 10:17:19 INFO mapred.JobClient:  map 100% reduce 8%
> 10/11/23 10:17:22 INFO mapred.JobClient:  map 100% reduce 9%
> 10/11/23 10:17:25 INFO mapred.JobClient: Task Id :
> attempt_201011221932_0009_r_000019_0, Status : FAILED
> Error: Java heap space
>
>