Frank McQuillan created MADLIB-1446: ---------------------------------------
Summary: DL: Hyperband phase 2 - generate MST table
Key: MADLIB-1446
URL: https://issues.apache.org/jira/browse/MADLIB-1446
Project: Apache MADlib
Issue Type: New Feature
Components: Deep Learning
Reporter: Frank McQuillan
Fix For: v1.18.0

Python code that does some version of this is in
https://github.com/apache/madlib-site/blob/asf-site/community-artifacts/Deep-learning/automl/hyperband-diag-cifar10-v1.ipynb
in the methods `setup_full_schedule()` and `create_mst_superset()`. Combine it with the random search function from
https://www.pivotaltracker.com/story/show/173692930

**Story**

Generate the MST table and validate the input params (to the extent possible without implementing the whole method). This story does not implement the whole hyperband method.

The proposed interface:

{code}
madlib_keras_automl(
    source_table,               -- input
    model_output_table,         -- output
    model_selection_table,      -- output
    model_arch_table,           -- input
    model_id_list,
    compile_params_grid,
    fit_params_grid,
    automl_method,              -- new params vvv
    automl_params,
    random_state,               -- optional
    -- from generate model configs vvv
    object_table,               -- optional
    use_gpus,                   -- optional
    -- from fit multiple vvv
    validation_table,           -- optional
    metrics_compute_frequency,  -- optional
    name,                       -- optional
    description                 -- optional
)
{code}

Here are the output tables:

(1) <model_output_table>
Same as the model output table in
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 rows

(2) <model_output_table>_summary
Same as the model output summary table in
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
It will have 1 row
+ add the following columns at the bottom, i.e., right side of the table:

{code}
use_gpus             BOOLEAN  e.g., TRUE  -- this is missing from summary table from before
automl_method        TEXT     e.g., 'hyperband'
automl_params_names  TEXT[]   e.g., {'R', 'eta', 'skip_last'}
automl_params_vals   TEXT[]   e.g., {'81', '3', 'TRUE'}  -- note this needs to be text array since mixed types of autoML params
{code}

(3) <model_output_table>_info
Same as the model output info table in
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 rows
+ add the following columns at the bottom, i.e., right side of the table:

{code}
s  INTEGER  "Bracket number"                     e.g., 4
i  INTEGER  "Depth in bracket model trained to"  e.g., 3
{code}

(4) <model_selection_table>
Same as the model selection table in
https://madlib.apache.org/docs/latest/group__grp__keras__setup__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 rows

(5) <model_selection_table>_summary
Same as the model selection summary table in
https://madlib.apache.org/docs/latest/group__grp__keras__setup__model__selection.html

**Acceptance**

1) For `R=81, eta=3` check that it creates the correct MST tables <model_selection_table> and <model_selection_table>_summary
2) Set `skip_last=1` and check that it creates the correct MST tables
3) Try multiple other values to see if it produces the correct schedule
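
For checking the expected row counts during acceptance, the sketch below computes the number of starting configurations per bracket. It is only an illustration under stated assumptions, not the notebook's `setup_full_schedule()`: the function name `hyperband_brackets` is hypothetical, the integer-floor form of the bracket-size formula is assumed because it reproduces the 81+27+9+6+5 figure quoted above, and `skip_last` is not modeled.

```python
def hyperband_brackets(R, eta):
    """Return one (s, n, r) tuple per bracket, highest bracket number s first.

    Hypothetical helper, assuming:
      s_max = floor(log_eta(R))
      n     = floor((s_max + 1) / (s + 1)) * eta**s   # starting configurations
      r     = floor(R / eta**s)                       # starting iterations per configuration
    """
    if R < 1 or eta < 2:
        raise ValueError("expected R >= 1 and eta >= 2")

    # Compute s_max with integers to avoid floating-point log rounding.
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1

    brackets = []
    for s in range(s_max, -1, -1):
        n = ((s_max + 1) // (s + 1)) * eta ** s
        r = R // eta ** s
        brackets.append((s, n, r))
    return brackets


brackets = hyperband_brackets(81, 3)
initial_counts = [n for _, n, _ in brackets]
total_rows = sum(initial_counts)

for s, n, r in brackets:
    print("s=%d  n=%d  r=%d" % (s, n, r))
print("total rows:", total_rows)
```

Under these assumptions, `R=81, eta=3` gives five brackets (s=4 down to s=0) whose starting counts add up to the row count expected in the MST table. Other `(R, eta)` pairs for acceptance item 3 can be checked the same way.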