Hi all,
I have a question about parameter settings for the LDA model. When I set
setMaxIterations to a large number such as 500, the program always fails. There
is a very straightforward LDA tutorial that uses an example data set from the
mllib package:
http://stackoverflow.com/questions/36631991/latent-dirichlet-allocation-lda-algorithm-not-printing-results-in-spark-scala
The code is here:
import org.apache.spark.mllib.clustering.LDA
import org.apache.spark.mllib.linalg.Vectors

// Load and parse the data
val data = sc.textFile("/data/mllib/sample_lda_data.txt") // you might need to change the path for the data set
val parsedData = data.map(s => Vectors.dense(s.trim.split(' ').map(_.toDouble)))

// Index documents with unique IDs
val corpus = parsedData.zipWithIndex.map(_.swap).cache()

// Cluster the documents into three topics using LDA
val ldaModel = new LDA().setK(3).run(corpus)

But if I change the last line to

val ldaModel = new LDA().setK(3).setMaxIterations(500).run(corpus)

the program fails.

I greatly appreciate your help!

Best,
Frank




   
