RE: Increase timeout for running PFPGrowth

Joshi, Amit Krishna Tue, 23 Oct 2012 19:53:23 -0700

Thanks. I am still looking into how to increase the timeout for FPGrowth.

________________________________________
From: 戴清灏 [[email protected]]
Sent: Monday, October 22, 2012 10:53 PM
To: [email protected]
Subject: Re: Increase timeout for running PFPGrowth


-g is the number of groups used when running FP-Growth.
It equals the number of reduce tasks, so I suggest setting it to the
number of reducers in your cluster.

-k controls how much is kept in the cache, so it can be larger if each
node has plenty of memory.
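
On the timeout flag discussed in Matt's quoted reply below, here is a small
sketch of what the two spellings look like once a POSIX shell has split them
into arguments. It only shows the tokens; whether the driver accepts the
two-token form depends on how it parses its arguments, which this does not
test. The conversion assumes mapred.task.timeout is given in milliseconds,
which matches Hadoop's documentation and the 600-second default in the error
message.

```python
import shlex

# The two spellings from the thread, tokenized the way a POSIX shell would.
spaced = shlex.split("-D mapred.task.timeout=18000000")
joined = shlex.split("-Dmapred.task.timeout=18000000")

print(spaced)  # two separate arguments: '-D' and the key=value pair
print(joined)  # one single argument

# Assumption: the value is in milliseconds, so convert it to hours.
timeout_ms = int(joined[0].split("=", 1)[1])
timeout_hours = timeout_ms / 1000 / 3600
print(timeout_hours)
```

So the value being passed asks for a 5-hour timeout instead of the default
10 minutes.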

On Tuesday, October 23, 2012, Matt Molek wrote:

> Did you have those spaces "-D mapred.task.timeout=18000000"? That
> won't be parsed correctly. It should be:
> "-Dmapred.task.timeout=18000000"
>
> On Mon, Oct 22, 2012 at 1:08 PM, Amit Krishna Joshi 
> <[email protected]>
> wrote:
> > Hi,
> >
> > I am running PFP on several datasets and it works well for smaller ones
> > (< 5GB).
> > However, for the larger ones, I keep getting the following timeout message.
> >
> > Task attempt_201210140938_0105_r_000000_0 failed to report status for 600
> > seconds. Killing!
> >
> > Is there a way I can increase the timeout?
> >
> > I even tried passing these parameters, but in vain:
> > -D mapred.task.timeout=18000000 -D mapred.child.java.opts=-Xmx4000m
> >
> > My input params are:  -s 10000 -g 1000  -tc 8  -k 50 -method mapreduce
> >
> > Also, please suggest what the optimum values of g and k would be.
> > Number of features: more than a million.
> >
> >
> > Thanks,
> > Amit
>


--
Regards,
Q
