I have been running cvb with no problems for some time.  I previously posted 
the script I use to run it (shown at the bottom of this link):
 
http://mail-archives.apache.org/mod_mbox/mahout-user/201206.mbox/%3CCACYXym_LJJgWMBQ4TEgPJjx2PUvjicA1kZ1Kmo_xQ=wkjnu...@mail.gmail.com%3E
 
CVB call:
 
$MAHOUT cvb -i ${WORK_DIR}/sparse-vectors-cvb -o ${WORK_DIR}/reuters-cvb -k 150 \
  -ow -x 10 -dict ${WORK_DIR}/reuters-out-seqdir-sparse-cvb/dictionary.file-0 \
  -mt ${WORK_DIR}/topic-model-cvb -dt ${WORK_DIR}/doc-topic-cvb
 
So one thing to check is that your input vector folder actually contains 
files, and that they are in the correct format (i.e., the keys need to be 
Integers), which is why I use rowid to convert my sparse vectors first.  Also 
make sure any other inputs you pass exist and are in the correct format 
(e.g., -dict).
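
A minimal sketch of the "does the input folder actually contain files" check, 
assuming the data sits on the local filesystem (for HDFS the same check would 
have to go through the hadoop fs commands instead).  The function name 
has_data_files is made up for illustration, and it only looks for non-empty 
files; it does not verify the key type.

```shell
# Hypothetical helper (not part of Mahout): succeeds if DIR exists and
# contains at least one non-empty regular file.
# Assumption: local filesystem paths, not HDFS.
has_data_files() {
  dir="$1"
  # Fail early if the directory is missing.
  [ -d "$dir" ] || return 1
  # Skip hidden files (e.g. checksum files) and underscore-prefixed markers;
  # -size +0 keeps only files with at least one byte.
  [ -n "$(find "$dir" -type f ! -name '.*' ! -name '_*' -size +0 2>/dev/null | head -n 1)" ]
}

# Possible usage before calling cvb (path taken from the call above):
# has_data_files "${WORK_DIR}/sparse-vectors-cvb" || echo "input folder is empty or missing" >&2
```

The same function could be pointed at the folder holding the -dict file.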
 
In my case I also specify -mt; I seem to recall having issues when I did not.  
Before each run I also delete the -mt output, as I had trouble when a prior 
run had ended in an error.
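
A rough sketch of that cleanup step, again assuming local filesystem paths.  
The function name clean_cvb_state is hypothetical.  Note that the log quoted 
below reports "Found previous state: temp/topicModelState/model-1", so that 
directory may be the same kind of leftover from an earlier failed run.

```shell
# Hypothetical helper (not part of Mahout): remove leftover model state
# so the next cvb run does not try to resume from a broken earlier run.
# Assumption: local filesystem paths; on HDFS the hadoop fs removal
# command would be needed instead of rm.
clean_cvb_state() {
  for target in "$@"; do
    if [ -e "$target" ]; then
      rm -rf -- "$target"
      echo "removed $target"
    fi
  done
}

# Possible usage before calling cvb (the first path is the -mt value from
# the call above; the second is the path reported in the quoted log):
# clean_cvb_state "${WORK_DIR}/topic-model-cvb" temp/topicModelState
```

Paths that do not exist are skipped silently, so it is safe to call this 
before a first run as well.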
 
Not sure what your "-a" parameter does.
 
Dan
  

________________________________
 From: seth <[email protected]>
To: [email protected] 
Sent: Thursday, July 12, 2012 7:18 PM
Subject: help with cvb
  
I'm trying to run the cvb lda algorithm like so:


$MAHOUT_HOME/mahout cvb -i ./mahout_data/vectors/vectors/vectors\ for\ cvb/ \
  -ow -x 20 -o ./mahout_data/clusters/ -k 140 -dt dist.txt \
  -dict ./mahout_data/vectors/vectors/dictionary.file-0 -a 3

but I get this error

12/07/12 16:06:06 INFO cvb.CVB0Driver: Will run Collapsed Variational Bayes
(0th-derivative approximation) learning for LDA on
mahout_data/vectors/vectors/vectors for cvb (numTerms: 20165), finding
140-topics, with document/topic prior 3.0, topic/term prior 1.0E-4.  Maximum
iterations to run will be 20, unless the change in perplexity is less than
0.0.  Topic model output (p(term|topic) for each topic) will be stored
mahout_data/clusters.  Random initialization seed is 9411, holding out 0.0
of the data for perplexity check

12/07/12 16:06:06 INFO cvb.CVB0Driver: Dictionary to be used located
mahout_data/vectors/vectors/dictionary.file-0
p(topic|docId) will be stored dist.txt

12/07/12 16:06:06 INFO cvb.CVB0Driver: Found previous state:
temp/topicModelState/model-1
12/07/12 16:06:06 INFO cvb.CVB0Driver: Current iteration number: 1
12/07/12 16:06:06 WARN cvb.CVB0Driver: Perplexity path
temp/topicModelState/perplexity-1 does not exist, returning NaN
12/07/12 16:06:06 INFO cvb.CVB0Driver: About to run iteration 2 of 20
12/07/12 16:06:06 INFO cvb.CVB0Driver: About to run: Iteration 2 of 20,
input path: temp/topicModelState/model-1
Exception in thread "main" java.lang.IllegalStateException: No part files found in model path 'temp/topicModelState/model-1'
    at com.google.common.base.Preconditions.checkState(Preconditions.java:172)
    at org.apache.mahout.clustering.lda.cvb.CVB0Driver.setModelPaths(CVB0Driver.java:529)
    at org.apache.mahout.clustering.lda.cvb.CVB0Driver.runIteration(CVB0Driver.java:515)
    at org.apache.mahout.clustering.lda.cvb.CVB0Driver.run(CVB0Driver.java:304)
    at org.apache.mahout.clustering.lda.cvb.CVB0Driver.run(CVB0Driver.java:187)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.clustering.lda.cvb.CVB0Driver.main(CVB0Driver.java:550)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)


Can anyone help me understand what I'm missing?

Thanks,

Seth

--
View this message in context: 
http://lucene.472066.n3.nabble.com/help-with-cvb-tp3994763.html
Sent from the Mahout User List mailing list archive at Nabble.com.
