Hi,

I am trying to visualize an LDA model built with Spark ML 2.0 (Scala) in
LDAvis.

Are there any links or examples showing how to convert the Spark model
parameters into the following five inputs that LDAvis requires?

1. φ, the K × W matrix containing the estimated probability mass function
over the W terms in the vocabulary for each of the K topics in the model.
Note that φ_kw > 0 for all k ∈ 1...K and all w ∈ 1...W, because of the
priors (although our software allows values of zero due to rounding). Each
of the K rows of φ must sum to one.
2. θ, the D × K matrix containing the estimated probability mass function
over the K topics in the model for each of the D documents in the corpus.
Note that θ_dk > 0 for all d ∈ 1...D and all k ∈ 1...K, because of the
priors (although, as above, our software accepts zeroes due to rounding).
Each of the D rows of θ must sum to one.
3. n_d, the number of tokens observed in document d, where n_d is required to
be an integer greater than zero, for documents d = 1...D. Denoted
doc.length in our code.
4. vocab, the length-W character vector containing the terms in the
vocabulary (listed in the same order as the columns of φ).
5. M_w, the frequency of term w across the entire corpus, where M_w is
required to be an integer greater than zero for each term w = 1...W.
Denoted term.frequency in our code.
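
To make the question concrete, here is a rough sketch of the mapping I would
expect, written in plain Scala so it runs without a cluster. The object
LdaVisInputs, the case class Inputs and the method build are hypothetical
names, not part of Spark or LDAvis. It assumes the topic-term weights arrive
in the layout of LDAModel.topicsMatrix (vocabSize x k, one column per topic,
not necessarily normalized), that the document-topic rows come from the
topicDistribution column added by LDAModel.transform, and that the vocabulary
and per-document term counts come from a CountVectorizerModel. Copying those
values out of Spark vectors and matrices into arrays is left out.

```scala
// Hypothetical helper: builds the five LDAvis inputs from plain arrays.
object LdaVisInputs {

  final case class Inputs(
      phi: Array[Array[Double]],   // K x W, each row sums to one
      theta: Array[Array[Double]], // D x K, each row sums to one
      docLength: Array[Int],       // n_d, tokens per document
      vocab: Array[String],        // W terms, same order as phi's columns
      termFrequency: Array[Int])   // M_w, corpus-wide count per term

  // Rescale a row of non-negative weights so that it sums to one.
  private def normalize(row: Array[Double]): Array[Double] = {
    val total = row.sum
    require(total > 0.0, "cannot normalize a row that sums to zero")
    row.map(_ / total)
  }

  /** Assumed inputs (all hypothetical parameter names):
    *  - topicsByTerm: W x K weights, the layout of Spark's topicsMatrix
    *  - docTopics:    D x K topic weights, one row per document
    *  - counts:       D x W term counts, one row per document
    *  - vocab:        W terms, in the same order as the count columns
    */
  def build(
      topicsByTerm: Array[Array[Double]],
      docTopics: Array[Array[Double]],
      counts: Array[Array[Int]],
      vocab: Array[String]): Inputs = {
    val w = vocab.length
    require(topicsByTerm.length == w, "topicsByTerm needs one row per term")
    require(counts.nonEmpty && counts.forall(_.length == w),
      "counts needs at least one document and W columns")
    require(docTopics.length == counts.length,
      "docTopics and counts need one row per document")

    // Transpose W x K to K x W, then normalize each topic's row.
    val phi = topicsByTerm.transpose.map(normalize)
    // Renormalize each document's topic weights to guard against rounding.
    val theta = docTopics.map(normalize)
    // n_d: total tokens in each document.
    val docLength = counts.map(_.sum)
    // M_w: total occurrences of each term across all documents.
    val termFrequency = counts.transpose.map(_.sum)

    Inputs(phi, theta, docLength, vocab, termFrequency)
  }
}
```

Since LDAvis requires every M_w and every n_d to be greater than zero, terms
or documents with a zero total would presumably have to be dropped (together
with the matching columns of φ or rows of θ) before the result is handed to
LDAvis. The sketch does not do that filtering.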
