Cezary Dendek created SPARK-20767:
-------------------------------------

             Summary: The LDA model update
                 Key: SPARK-20767
                 URL: https://issues.apache.org/jira/browse/SPARK-20767
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 2.1.1
            Reporter: Cezary Dendek


The current online implementation of LDA model fitting (OnlineLDAOptimizer) 
does not support updating an existing model (i.e. to account for population or 
covariate drift), nor does it support continuing the fit when the number of 
iterations turned out to be insufficient.

Technical aspects:

1. The LDA fitting implementation does not currently allow the coefficients to 
be preset, because the setter is private. A comment in the source code of 
OnlineLDAOptimizer.setLambda notes: "This is only used for testing now. In the 
future, it can help support training stop/resume".

2. The lambda matrix is always randomly initialized by the optimizer; this 
needs to change so that a preset lambda matrix is used when one is supplied.

Users cannot adapt the classes themselves, because the setters are not public 
and the classes are sealed or final.
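
The two requested changes can be illustrated with a minimal sketch. It is 
written in plain Python rather than against Spark, and every name in it 
(OnlineOptimizerSketch, set_lambda, initialize) is hypothetical, not part of 
the Spark API. The Gamma-distributed random draw is likewise only an 
assumption made for illustration. The sketch shows a public setter for the 
lambda matrix and an initialization step that falls back to random values 
only when no matrix was preset.

```python
import random


class OnlineOptimizerSketch:
    """Hypothetical stand-in for an online LDA optimizer (not Spark's API)."""

    def __init__(self, k, vocab_size, seed=0):
        self.k = k                    # number of topics
        self.vocab_size = vocab_size  # number of terms in the vocabulary
        self.seed = seed
        self._preset_lambda = None    # k x vocab_size matrix, if supplied
        self.lam = None               # matrix used by the fit after initialize()

    def set_lambda(self, lam):
        """Public setter: supply a previously fitted topic-term matrix."""
        if len(lam) != self.k or any(len(row) != self.vocab_size for row in lam):
            raise ValueError("lambda must have shape k x vocab_size")
        # Copy so later changes by the caller do not leak into the optimizer.
        self._preset_lambda = [list(row) for row in lam]
        return self

    def initialize(self):
        """Use the preset matrix if there is one; otherwise draw a random one."""
        if self._preset_lambda is not None:
            self.lam = [list(row) for row in self._preset_lambda]
        else:
            rng = random.Random(self.seed)
            # Illustrative choice: positive Gamma draws with mean 1.
            self.lam = [
                [rng.gammavariate(100.0, 0.01) for _ in range(self.vocab_size)]
                for _ in range(self.k)
            ]
        return self


# Resume from an earlier fit: the preset matrix is kept, not overwritten.
previous = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
resumed = OnlineOptimizerSketch(k=2, vocab_size=3).set_lambda(previous).initialize()

# Fresh fit: no preset matrix, so the optimizer initializes randomly.
fresh = OnlineOptimizerSketch(k=2, vocab_size=3, seed=1).initialize()
```

With such a setter, a model fitted on older data could be passed back in as 
the starting point when new documents arrive, which covers both the update 
and the continuation case described above.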




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
