Hi Frank,

Which version of Spark are you using? Also, can you share more information about
the exception, ideally the full stack trace?

If it’s not confidential, you can also send a data sample to me
(yuhao.y...@intel.com) and I can try to investigate.
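
In the meantime, one guess: with the default EM optimizer, a large number of
iterations builds a very long RDD lineage, which can end in a
StackOverflowError. If that is what you are seeing, enabling checkpointing may
help. A minimal sketch (the checkpoint directory below is just a placeholder;
use any path writable by your cluster):

import org.apache.spark.mllib.clustering.LDA

sc.setCheckpointDir("/tmp/spark-checkpoints") // placeholder path, change as needed
val ldaModel = new LDA()
  .setK(3)
  .setMaxIterations(500)
  .setCheckpointInterval(10) // checkpoint every 10 iterations to truncate the lineage
  .run(corpus)

Alternatively, the online optimizer (setOptimizer("online")) processes the
corpus in mini-batches rather than building one deep lineage, so it may also be
worth a try.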

Regards,
Yuhao

From: Frank Zhang [mailto:dataminin...@yahoo.com.INVALID]
Sent: Monday, September 19, 2016 9:20 PM
To: user@spark.apache.org
Subject: LDA and Maximum Iterations

Hi all,

   I have a question about parameter setting for the LDA model. When I try to
set a large number like 500 for setMaxIterations, the program always fails.
There is a very straightforward LDA tutorial using an example data set in the
mllib package:
http://stackoverflow.com/questions/36631991/latent-dirichlet-allocation-lda-algorithm-not-printing-results-in-spark-scala
The code is here:

import org.apache.spark.mllib.clustering.LDA
import org.apache.spark.mllib.linalg.Vectors
// Load and parse the data; you might need to change the path to the data set
val data = sc.textFile("/data/mllib/sample_lda_data.txt")
val parsedData = data.map(s => Vectors.dense(s.trim.split(' ').map(_.toDouble)))
// Index documents with unique IDs (LDA expects an RDD[(Long, Vector)])
val corpus = parsedData.zipWithIndex.map(_.swap).cache()
// Cluster the documents into three topics using LDA
val ldaModel = new LDA().setK(3).run(corpus)

But if I change the last line to

val ldaModel = new LDA().setK(3).setMaxIterations(500).run(corpus)

the program fails.

    I greatly appreciate your help!

Best,

    Frank


