[ https://issues.apache.org/jira/browse/SPARK-20082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022379#comment-16022379 ]
yuhao yang commented on SPARK-20082:
------------------------------------
Refer to https://issues.apache.org/jira/browse/SPARK-20767 for some insights
shared by [~cezden]:
{quote}
Technical aspects:
1. The implementation of LDA fitting does not currently allow presetting the
coefficients (the setter is private), as noted by a comment in the source code
of OnlineLDAOptimizer.setLambda: "This is only used for testing now. In the
future, it can help support training stop/resume".
2. The lambda matrix is always randomly initialized by the optimizer, which
would need to change to support a preset lambda matrix.
{quote}
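To make point 2 concrete, here is a rough, language-neutral sketch in plain Python of what "use a preset lambda if given, otherwise initialize randomly" could look like. This is not Spark code: the function name init_lambda and the parameter initial_lambda are hypothetical, and the Gamma(100, 1/100) random start is an assumption modelled on the usual online LDA initialization.

```python
import random


def init_lambda(k, vocab_size, initial_lambda=None, seed=0):
    """Return a k x vocab_size topic-word matrix (list of lists).

    Hypothetical helper, not part of Spark. If initial_lambda is given,
    its shape is checked and a copy is returned (warm start). Otherwise
    every entry is drawn from a Gamma distribution (random start).
    """
    if initial_lambda is not None:
        rows_ok = len(initial_lambda) == k
        cols_ok = all(len(row) == vocab_size for row in initial_lambda)
        if not (rows_ok and cols_ok):
            raise ValueError("initial_lambda must have shape k x vocab_size")
        # Copy so that later updates do not mutate the caller's matrix.
        return [list(row) for row in initial_lambda]

    rng = random.Random(seed)
    # Assumed random start: shape 100, scale 1/100, so entries lie near 1.0.
    return [
        [rng.gammavariate(100.0, 0.01) for _ in range(vocab_size)]
        for _ in range(k)
    ]
```

With something along these lines, resuming training would mean passing the lambda matrix of the previous model as initial_lambda, while the default call keeps today's random behaviour.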
> Incremental update of LDA model, by adding initialModel as start point
> ----------------------------------------------------------------------
>
> Key: SPARK-20082
> URL: https://issues.apache.org/jira/browse/SPARK-20082
> Project: Spark
> Issue Type: New Feature
> Components: ML
> Affects Versions: 2.1.0
> Reporter: Mathieu D
>
> Some mllib models support an initialModel to start from and update it
> incrementally with new data.
> From what I understand of OnlineLDAOptimizer, it is possible to incrementally
> update an existing model with batches of new documents.
> I suggest adding an initialModel as a starting point for LDA.