David's rule of thumb was to let the iterations run until the relative change in log-likelihood (LL) drops to around 10^-4.
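To make that stopping rule concrete, here is a minimal Python sketch. It is not Mahout code: the function names and the log-likelihood values are made up for illustration, and it assumes you can read one LL value per iteration from the job output.

```python
def relative_change(prev_ll: float, curr_ll: float) -> float:
    """Relative change in log-likelihood between two consecutive iterations."""
    if prev_ll == 0:
        # Avoid dividing by zero; treat this as "not converged yet".
        return float("inf")
    return abs(curr_ll - prev_ll) / abs(prev_ll)


def first_converged_iteration(log_likelihoods, tol=1e-4):
    """Return the 1-based iteration whose relative LL change is <= tol, or None."""
    for i in range(1, len(log_likelihoods)):
        if relative_change(log_likelihoods[i - 1], log_likelihoods[i]) <= tol:
            return i + 1
    return None


# Hypothetical per-iteration LL values, not taken from a real run.
example_lls = [-1000.0, -900.0, -890.0, -889.95, -889.94]
print(first_converged_iteration(example_lls))
```

With a check like this you can set a generous iteration cap and stop as soon as the relative change falls below the tolerance, instead of guessing a fixed iteration count up front.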
Robin

On Sat, May 22, 2010 at 9:12 PM, Jeff Eastman <[email protected]> wrote:

> I suggest you try running with a trunk checkout and upgrading to Hadoop
> 0.20.2. Mahout is still in motion and I've run LDA on Reuters on trunk in
> the last few days. The maxIter parameter should not be an issue; you could
> try removing it entirely and LDA will default to running to convergence
> (about 100 iterations which can take some time). I've found the Reuters
> results don't change too much after 20. Even with a clean trunk checkout
> Reuters will only use a single node and the iterations should take about 5
> mins each. If you want to run on a multi-node cluster, install the patch in
> MAHOUT-397
> (https://issues.apache.org/jira/browse/MAHOUT-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel)
> and use the same arguments as in examples/bin/build-reuters.sh. Even on a
> 3-node cluster this brings the iteration time down to about a minute and a
> half which is worth doing.
>
> Hope this helps,
> Jeff
>
> http://www.windwardsolutions.com
>
> On 5/22/10 5:40 AM, 杨杰 wrote:
>
>> Hi, everyone
>>
>> I'm trying mahout now. When running LDA on reuter corpus
>> (http://lucene.grantingersoll.com/2010/02/16/trijug-intro-to-mahout-slides-and-demo-examples/),
>> A parameter refuses to work. This parameter is "maxIter", without
>> which, i cannot decide the iteration to run~
>>
>> My CMD is:
>> bin/mahout.hadoop lda --input mahout/seq-sparse-tf/vectors --output
>> mahout/seq-sparse-tf/lda-out5 --numWords 34000 --numTopics 20
>> --maxIter 1
>>
>> But got a exception:
>> 10/05/22 20:32:11 ERROR lda.LDADriver: Exception
>> org.apache.commons.cli2.OptionException: Unexpected 2 while processing Options
>>     at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:100)
>>     at org.apache.mahout.clustering.lda.LDADriver.main(LDADriver.java:115)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)
>> ...
>>
>> What's the problem? I'm using version 0.3 & Hadoop 0.20.0.
>>
>> Thank you!
