[ https://issues.apache.org/jira/browse/MADLIB-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Frank McQuillan updated MADLIB-1352:
------------------------------------
Description:

In LDA
http://madlib.apache.org/docs/latest/group__grp__lda.html
implement warm start so that training can pick up where the last run left off.

I suggest we model this on the warm start implemented in MLP
http://madlib.apache.org/docs/latest/group__grp__nn.html
since it is the same general idea for LDA.

The LDA interface will be:
{code}
lda_train( data_table,
           model_table,
           output_data_table,
           voc_size,
           topic_num,
           iter_num,
           alpha,
           beta,
           evaluate_every,
           perplexity_tol,
           warm_start  -- new param
         )

warm_start (optional)
BOOLEAN, default: FALSE. Initialize weights with the coefficients from the
last call of the training function. If set to TRUE, weights will be
initialized from the model_table generated by the previous run. Note that
the parameters voc_size and topic_num must remain constant between calls
when warm_start is used. Other parameters can be changed for the warm-start
run.
{code}

Open questions
1) Validate this statement:
{code}
Note that the parameters voc_size and topic_num must remain constant between
calls when warm_start is used. Other parameters can be changed for the
warm-start run.
{code}

Notes
1) Depending on open question #1 above, add validation checks on user input
to ensure that the user does not change any parameter that they are not
allowed to change from the previous run.

was:
In LDA
http://madlib.apache.org/docs/latest/group__grp__lda.html
implement warm start so that training can pick up where the last run left off.
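The validation check in note #1 could be sketched in the Python driver layer roughly as follows. This is a minimal sketch under the assumption that only voc_size and topic_num must stay constant; the function name and the dict-based parameter passing are hypothetical (a real MADlib implementation would read the previous run's parameters from the model summary table via plpy):

```python
def validate_warm_start(new_params, prev_params):
    """Hypothetical check: reject a warm-start run that changes a
    parameter that must remain constant between calls.

    new_params / prev_params are dicts of lda_train arguments; in a
    real implementation prev_params would come from the summary of
    the previous model_table.
    """
    # Per the proposed docs, only voc_size and topic_num must match;
    # everything else (iter_num, alpha, beta, ...) may change.
    immutable = ("voc_size", "topic_num")
    for name in immutable:
        if new_params[name] != prev_params[name]:
            raise ValueError(
                "warm_start=TRUE requires {0} to match the previous "
                "run ({1} != {2})".format(
                    name, new_params[name], prev_params[name]))


# Example: changing alpha between warm-start calls is allowed,
# while changing topic_num is rejected.
prev = {"voc_size": 5000, "topic_num": 10, "alpha": 5.0}
validate_warm_start({"voc_size": 5000, "topic_num": 10, "alpha": 2.0}, prev)
```

If open question #1 turns out differently (i.e. more parameters must stay fixed), only the immutable tuple would need to change.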
> Add warm start to LDA
> ----------------------
>
>                 Key: MADLIB-1352
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1352
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Parallel Latent Dirichlet Allocation
>            Reporter: Frank McQuillan
>            Assignee: Himanshu Pandey
>            Priority: Major
>             Fix For: v2.0
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)