Hi Frank,

Which version of Spark are you using? Also, can you share more information about
the exception, ideally the full stack trace?

If it’s not confidential, you can also send a data sample to me
(yuhao.y...@intel.com) and I can try to investigate.
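
In the meantime, one guess: with the default EM optimizer, a large number of
iterations builds a very long RDD lineage, which can end in a
StackOverflowError. If that is what you are seeing, enabling checkpointing may
help. A minimal sketch (the checkpoint directory below is just a placeholder;
use any path writable by your cluster):

import org.apache.spark.mllib.clustering.LDA

sc.setCheckpointDir("/tmp/spark-checkpoints") // placeholder path, change as needed
val ldaModel = new LDA()
  .setK(3)
  .setMaxIterations(500)
  .setCheckpointInterval(10) // checkpoint every 10 iterations to truncate the lineage
  .run(corpus)

Alternatively, the online optimizer (setOptimizer("online")) processes the
corpus in mini-batches rather than building one deep lineage, so it may also be
worth a try.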

Regards,
Yuhao

From: Frank Zhang [mailto:dataminin...@yahoo.com.INVALID]
Sent: Monday, September 19, 2016 9:20 PM
To: user@spark.apache.org
Subject: LDA and Maximum Iterations

Hi all,

   I have a question about parameter setting for the LDA model. When I try to
set a large number like 500 for setMaxIterations, the program always fails.
There is a very straightforward LDA tutorial using an example data set in the
mllib package:
http://stackoverflow.com/questions/36631991/latent-dirichlet-allocation-lda-algorithm-not-printing-results-in-spark-scala
The code is here:

import org.apache.spark.mllib.clustering.LDA
import org.apache.spark.mllib.linalg.Vectors
// Load and parse the data; you might need to change the path to the data set
val data = sc.textFile("/data/mllib/sample_lda_data.txt")
val parsedData = data.map(s => Vectors.dense(s.trim.split(' ').map(_.toDouble)))
// Index documents with unique IDs (LDA expects an RDD[(Long, Vector)])
val corpus = parsedData.zipWithIndex.map(_.swap).cache()
// Cluster the documents into three topics using LDA
val ldaModel = new LDA().setK(3).run(corpus)

But if I change the last line to

val ldaModel = new LDA().setK(3).setMaxIterations(500).run(corpus)

the program fails.

    I greatly appreciate your help!

Best,

    Frank


