Yes, but it takes another 80 iterations to get there and the results, on Reuters at least, don't seem to improve that much.

On 5/22/10 5:01 PM, Robin Anil wrote:
David's rule of thumb was to let the iterations go until the relative change
in log-likelihood (LL) drops to around 10^-4.

Robin
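
A minimal sketch of the stopping rule described above, assuming the
log-likelihood is recorded once per iteration. The function names and the
sample values in the usage note are hypothetical, and this is not Mahout's
actual convergence check, only an illustration of "relative change in LL
around 10^-4":

```python
from typing import Optional, Sequence


def relative_change(prev_ll: float, curr_ll: float) -> float:
    """Relative change between two successive log-likelihood values."""
    if prev_ll == 0.0:
        # Avoid dividing by zero; treat this case as "not converged".
        return float("inf")
    return abs(curr_ll - prev_ll) / abs(prev_ll)


def has_converged(prev_ll: float, curr_ll: float, tol: float = 1e-4) -> bool:
    """True when the relative change in LL is below the tolerance."""
    return relative_change(prev_ll, curr_ll) < tol


def first_converged_iteration(
    ll_values: Sequence[float], tol: float = 1e-4
) -> Optional[int]:
    """Return the 1-based iteration at which the rule first holds, or None."""
    for i in range(1, len(ll_values)):
        if has_converged(ll_values[i - 1], ll_values[i], tol):
            return i + 1
    return None
```

For example, with made-up values, first_converged_iteration([-1000.0,
-900.0, -899.99]) stops at the third iteration, since the last step changes
LL by only about 1.1e-5 in relative terms.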

On Sat, May 22, 2010 at 9:12 PM, Jeff Eastman <[email protected]> wrote:

I suggest you try running with a trunk checkout and upgrading to Hadoop
0.20.2. Mahout is still in motion and I've run LDA on Reuters on trunk in
the last few days. The maxIter parameter should not be an issue; you could
try removing it entirely and LDA will default to running to convergence
(about 100 iterations which can take some time). I've found the Reuters
results don't change too much after 20. Even with a clean trunk checkout
Reuters will only use a single node and the iterations should take about 5
mins each. If you want to run on a multi-node cluster, install the patch in
MAHOUT-397
(https://issues.apache.org/jira/browse/MAHOUT-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel)
and use the same arguments as in examples/bin/build-reuters.sh. Even on a
3-node cluster this brings the iteration time down to about a minute and a
half which is worth doing.
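
As a rough back-of-the-envelope estimate using only the figures quoted
above (about 100 iterations to convergence, about 5 minutes per iteration
on a single node, about 1.5 minutes on a 3-node cluster), the total run
times work out as follows. The function name is hypothetical and the
numbers are the approximate ones from this message, not measurements:

```python
def total_minutes(iterations: int, minutes_per_iteration: float) -> float:
    """Total wall-clock time for a run, ignoring startup overhead."""
    return iterations * minutes_per_iteration


# Single node, running to convergence (about 100 iterations).
single_node_full = total_minutes(100, 5.0)   # 500.0 minutes

# Single node, stopping after 20 iterations.
single_node_short = total_minutes(20, 5.0)   # 100.0 minutes

# 3-node cluster with the MAHOUT-397 patch, running to convergence.
cluster_full = total_minutes(100, 1.5)       # 150.0 minutes
```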

Hope this helps,
Jeff

http://www.windwardsolutions.com




On 5/22/10 5:40 AM, 杨杰 wrote:

Hi, everyone

I'm trying Mahout now. When running LDA on the Reuters corpus
(http://lucene.grantingersoll.com/2010/02/16/trijug-intro-to-mahout-slides-and-demo-examples/),
one parameter refuses to work: "maxIter". Without it, I cannot control
how many iterations are run.

My CMD is:
bin/mahout.hadoop lda --input mahout/seq-sparse-tf/vectors --output
mahout/seq-sparse-tf/lda-out5 --numWords 34000 --numTopics 20
--maxIter 1

But I got an exception:
10/05/22 20:32:11 ERROR lda.LDADriver: Exception
org.apache.commons.cli2.OptionException: Unexpected 2 while processing Options
        at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:100)
        at org.apache.mahout.clustering.lda.LDADriver.main(LDADriver.java:115)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)
...

What's the problem? I'm using Mahout 0.3 and Hadoop 0.20.0.

Thank you!





