[ https://issues.apache.org/jira/browse/MADLIB-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank McQuillan updated MADLIB-1352:
------------------------------------
    Fix Version/s:     (was: v1.18.0)
                   v2.0

> Add warm start to LDA
> ----------------------
>
>                 Key: MADLIB-1352
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1352
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Parallel Latent Dirichlet Allocation
>            Reporter: Frank McQuillan
>            Assignee: Himanshu Pandey
>            Priority: Major
>             Fix For: v2.0
>
>
> In LDA
> http://madlib.apache.org/docs/latest/group__grp__lda.html
> implement warm start so that training can pick up where the previous
> run left off.
> I would suggest we model this on the warm start implemented in MLP
> http://madlib.apache.org/docs/latest/group__grp__nn.html
> since it will be the same general idea for LDA.
> The LDA interface will be:
> {code}
> lda_train( data_table,
>            model_table,
>            output_data_table,
>            voc_size,
>            topic_num,
>            iter_num,
>            alpha,
>            beta,
>            evaluate_every,
>            perplexity_tol,
>            warm_start  -- new param
>          )
> warm_start (optional)
> BOOLEAN, default: FALSE. Initialize weights with the coefficients from the
> last call of the training function. If set to true, weights will be
> initialized from the model_table generated by the previous run. Note that
> parameters voc_size and topic_num must remain constant between calls when
> warm_start is used. Other parameters can be changed for the warm start run.
> {code}
> Open questions
> 1) Validate this statement:
> {code}
> Note that parameters voc_size and topic_num must remain constant between
> calls when warm_start is used. Other parameters can be changed for the warm
> start run.
> {code}
> Notes
> 1) Depending on open question #1 above, do validation checks on user input to
> ensure that the user does not change any parameter that they are not allowed
> to change from the previous run.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
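
The validation check described in the Notes could look roughly like the sketch below. It assumes open question #1 is resolved as written, i.e. that voc_size and topic_num are the only parameters that must stay fixed between runs. The helper name check_warm_start_params, the FIXED_PARAMS tuple, and the use of plain dicts for the parameters are hypothetical and are not part of the MADlib API.

```python
# Hypothetical helper for illustration only; not part of MADlib.
# Assumption: voc_size and topic_num are the only parameters that must
# remain constant across warm-start runs (open question #1 in the issue).
FIXED_PARAMS = ("voc_size", "topic_num")


def check_warm_start_params(previous, current, warm_start=False):
    """Raise ValueError if a fixed parameter changed in a warm-start run.

    `previous` holds the parameters of the earlier training run and
    `current` those of the new call, both as plain dicts.
    """
    if not warm_start:
        # A cold start re-initializes the model, so nothing is constrained.
        return
    changed = [name for name in FIXED_PARAMS
               if previous.get(name) != current.get(name)]
    if changed:
        raise ValueError(
            "warm_start requires unchanged values for: " + ", ".join(changed)
        )
```

With this sketch, changing iter_num or alpha between runs passes the check, while changing topic_num with warm_start=True raises a ValueError naming the offending parameter.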