Are you running it on the trunk or the 0.2 release version?

Robin

On Tue, Jan 19, 2010 at 9:43 AM, sej <[email protected]> wrote:
>
> Hello all,
>
> I am running PFP on a fairly large dataset and it works well for smaller
> subsets of the data. However, once I attempt larger samples, I run into
> this error in the reducer phase:
>
> 1) 10/01/19 00:25:35 INFO mapred.JobClient: Task Id : attempt_, Status : FAILED
> Task attempt_ failed to report status for 607 seconds. Killing!
>
> I've also noticed that only one reducer is launched for the FP-Tree mining
> phase.
> I've tried passing in -D mapred options but it doesn't seem like
> PFPGrowthJob supports it. Is there any way I can increase the timeout, heap
> size, and/or number of reducers without explicitly changing the code and
> recompiling?
>

This wouldn't be the case unless you specify the number of groups as 1. Could
you give some idea about your dataset?

>
> Also, from my understanding of the algorithm, as long as the number of
> groups is higher than the number of features that are above min support,
> each tree will be able to utilize the maximum available heap resources
> because each feature will be guaranteed to be mined separately, is that
> assumption correct?
>

No. The number of groups should always be lower than the number of features;
otherwise it maxes out at the count of features, as there would be no features
left to fill the remaining groups. What is the number of features you are
working on? As a rule of thumb, if the data is too large, try to keep 10-20
features per group, and choose the number of groups accordingly.

> Thanks!
> --sej
>
> p.s.
> the last few log lines outputted for the failed reducer:
> 2010-01-18 13:06:39,418 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: Number of unique pruned items 9091
> 2010-01-18 13:06:39,530 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: FPTree Building: Read 10000 Transactions
> 2010-01-18 13:06:39,649 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: FPTree Building: Read 20000 Transactions
> 2010-01-18 13:06:39,758 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: FPTree Building: Read 30000 Transactions
> 2010-01-18 13:06:39,774 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: Number of Nodes in the FP Tree: 40904
> 2010-01-18 13:06:39,775 INFO org.apache.mahout.fpm.pfpgrowth.fpgrowth.FPGrowth: Mining FTree Tree for all patterns with 3393
>
>
> --
> View this message in context:
> http://old.nabble.com/PFP---failed-to-report-status----of-reducers-tp27220725p27220725.html
> Sent from the Mahout User List mailing list archive at Nabble.com.
>
>

--
------
Robin Anil
Blog: http://techdigger.wordpress.com
-------
Try out Swipeball for iPhone
Video: http://www.youtube.com/watch?v=3hvEbWHciwU
iTunes: http://itunes.com/apps/swipeball
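
For what it's worth, the rule of thumb in the reply (10-20 features per group) can be turned into a quick estimate of how many groups to ask for. The sketch below is only arithmetic, not Mahout code: `groups_for` and `group_range` are hypothetical helper names, and the 9091 figure is borrowed from the "Number of unique pruned items" log line purely as an illustrative feature count, which may not match the real number of features above min support.

```python
def groups_for(num_features: int, features_per_group: int) -> int:
    """Smallest group count so that no group holds more than features_per_group features."""
    if num_features <= 0 or features_per_group <= 0:
        raise ValueError("both arguments must be positive")
    # Integer ceiling division, avoids floating-point rounding.
    return (num_features + features_per_group - 1) // features_per_group


def group_range(num_features: int, min_per_group: int = 10, max_per_group: int = 20):
    """Return (fewest, most) groups implied by the 10-20 features-per-group rule of thumb."""
    fewest = groups_for(num_features, max_per_group)  # 20 features per group
    most = groups_for(num_features, min_per_group)    # 10 features per group
    return fewest, most


if __name__ == "__main__":
    # 9091 is an assumed feature count taken from the log excerpt, for illustration only.
    fewest, most = group_range(9091)
    print(f"between {fewest} and {most} groups")
```

Under that assumption the estimate comes out at somewhere between 455 and 910 groups, which stays below the feature count as the reply recommends.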
