GitHub user mpjlu opened a pull request:
https://github.com/apache/spark/pull/19337
[SPARK-22114][ML][MLLIB] add epsilon for LDA
## What changes were proposed in this pull request?
The current convergence condition of OnlineLDAOptimizer is:
while(meanGammaChange > 1e-3)
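This threshold governs both the running time and the accuracy of LDA: a looser value ends the inner loop sooner, while a tighter value iterates longer before declaring convergence.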
The threshold should be configurable, as it is in Vowpal Wabbit (line 638 of lda_core.cc):
https://github.com/JohnLangford/vowpal_wabbit/blob/430f69453bc4876a39351fba1f18771bdbdb7122/vowpalwabbit/lda_core.cc
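
To illustrate why the threshold matters, here is a minimal Python sketch of a convergence loop with a configurable epsilon. It is not the code in this patch: `iterate_until_converged`, `toy_update`, and the `max_iter` safety cap are hypothetical names introduced for illustration, and the toy update merely stands in for the real gamma update in OnlineLDAOptimizer.

```python
def iterate_until_converged(gamma, update, epsilon=1e-3, max_iter=1000):
    """Apply `update` until the mean absolute change in gamma is <= epsilon.

    `epsilon` plays the role of the hard-coded 1e-3 in the condition above.
    `max_iter` is an assumed safety cap, not part of the original condition.
    """
    mean_change = float("inf")
    iterations = 0
    while mean_change > epsilon and iterations < max_iter:
        new_gamma = update(gamma)
        # Mean absolute element-wise change between successive iterates.
        mean_change = sum(abs(a - b) for a, b in zip(new_gamma, gamma)) / len(gamma)
        gamma = new_gamma
        iterations += 1
    return gamma, iterations


def toy_update(gamma):
    # Hypothetical stand-in for the LDA gamma update:
    # each step halves the distance of every element to 1.0.
    return [0.5 * (g + 1.0) for g in gamma]


_, loose_iters = iterate_until_converged([0.0, 0.0], toy_update, epsilon=1e-3)
_, tight_iters = iterate_until_converged([0.0, 0.0], toy_update, epsilon=1e-6)
print(loose_iters, tight_iters)
```

With this toy update, tightening epsilon from 1e-3 to 1e-6 doubles the number of iterations, which is the cost/accuracy trade-off a configurable threshold would expose.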
## How was this patch tested?
Existing unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mpjlu/spark setLDA
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19337.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19337
----
commit a6c3c79efe4d77ef4beba3f431fd9c2527735875
Author: Peng Meng <[email protected]>
Date: 2017-09-25T09:12:47Z
add epsilon for LDA
----