Hi,

Something must be going completely wrong in this experiment. Please use the latest version of Mahout (Mahout 0.6) and tell us exactly at which point the job fails.
I have been able to process datasets seven times as large as Netflix (http://webscope.sandbox.yahoo.com/catalog.php?datatype=r) in a few hours on a 6-machine cluster.

--sebastian

On 14.05.2012 03:44, 许春玲 wrote:
> Hi,
>
> I ran the item recommender on the Netflix data, but it always failed
> because there was not enough local disk space. So I cut the user IDs to
> half (not the user accounts, but the user IDs) to reduce the temp data.
> Now it finishes, but it takes 40 hours. The command is as follows:
>
> hadoop jar /app/mahout-distribution-0.5/core/target/mahout-core-0.5-job.jar
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.map.tasks=196
> -Dmapred.reduce.tasks=196 -Dmapred.input.dir=NetFlix_data_new
> -Dmapred.output.dir=output_netflix8
>
> My Hadoop cluster:
>
> 28 nodes
> 16G memory per node
> 8 cores per node
> 250G local disk per node
