Frank McQuillan created MADLIB-1446:
---------------------------------------

             Summary: DL:  Hyperband phase 2 - generate MST table
                 Key: MADLIB-1446
                 URL: https://issues.apache.org/jira/browse/MADLIB-1446
             Project: Apache MADlib
          Issue Type: New Feature
          Components: Deep Learning
            Reporter: Frank McQuillan
             Fix For: v1.18.0


Python code that does a version of this is in 
https://github.com/apache/madlib-site/blob/asf-site/community-artifacts/Deep-learning/automl/hyperband-diag-cifar10-v1.ipynb
in the methods `setup_full_schedule()` and `create_mst_superset()`. Combine it 
with the random search function from 
https://www.pivotaltracker.com/story/show/173692930


**Story**

Generate the MST table and validate the input params (to the extent possible 
without implementing the whole method).  This story does not implement the 
full hyperband method.  The proposed interface:
{code}
madlib_keras_automl(
    source_table,                  -- input
    model_output_table,            -- output
    model_selection_table,         -- output
    model_arch_table,              -- input
    model_id_list,
    compile_params_grid,
    fit_params_grid,

    automl_method,                 -- new params vvv
    automl_params,

    random_state,                  -- optional -- from generate model configs vvv
    object_table,                  -- optional

    use_gpus,                      -- optional -- from fit multiple vvv
    validation_table,              -- optional
    metrics_compute_frequency,     -- optional
    name,                          -- optional
    description                    -- optional
    )
{code}
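
The story asks for input validation but does not list the checks. A minimal Python sketch of what validating `automl_method` and `automl_params` could look like follows. The function name `validate_automl_params`, the dict form of `automl_params`, the default `skip_last=0`, and every rule it enforces are illustrative assumptions, not part of the proposed interface.

```python
def validate_automl_params(automl_method, automl_params):
    """Hypothetical helper: return (R, eta, skip_last) or raise ValueError.

    All rules below are assumptions made for illustration only.
    """
    if str(automl_method).lower() != "hyperband":
        raise ValueError("unsupported automl_method: %r" % (automl_method,))

    R = automl_params.get("R")
    eta = automl_params.get("eta")
    # Assumption: skip_last counts rounds to drop and defaults to 0.
    skip_last = int(automl_params.get("skip_last", 0))

    if not isinstance(R, int) or R < 1:
        raise ValueError("R must be a positive integer")
    if not isinstance(eta, int) or eta < 2:
        raise ValueError("eta must be an integer >= 2")

    # Largest bracket index: the biggest s with eta**s <= R.
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1

    # Assumption: at least one round must remain in the largest bracket.
    if not 0 <= skip_last <= s_max:
        raise ValueError("skip_last must be between 0 and %d" % s_max)

    return R, eta, skip_last
```

For example, `validate_automl_params("hyperband", {"R": 81, "eta": 3})` passes, while `eta=1` is rejected because the schedule would never shrink.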

Here are the output tables:

(1)
<model_output_table>

Same as the model output table in 
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 = 128 rows

(2)
<model_output_table>_summary

Same as the model output table summary in 
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
It will have 1 row.  Add the following columns at the end, i.e., the right 
side of the table:
{code}
use_gpus               BOOLEAN    e.g., TRUE                         -- this is missing from the summary table from before
automl_method          TEXT       e.g., 'hyperband'
automl_params_names    TEXT[]     e.g., {'R', 'eta', 'skip_last'}
automl_params_vals     TEXT[]     e.g., {'81', '3', 'TRUE'}          -- note this needs to be a text array since autoML params have mixed types
{code}
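
As an illustration of the two parallel text arrays above, here is a small Python sketch that splits a params mapping into names and stringified values. The helper name `split_automl_params` and the dict input are assumptions for this example; the ticket does not say how `automl_params` is passed in.

```python
def split_automl_params(params):
    """Hypothetical helper: split a dict into parallel name/value lists.

    Values are converted to text so that mixed types (integers, booleans)
    can share a single TEXT[] column.
    """
    names = list(params.keys())
    vals = []
    for value in params.values():
        if isinstance(value, bool):
            # Render booleans in SQL style, as in the example above.
            vals.append("TRUE" if value else "FALSE")
        else:
            vals.append(str(value))
    return names, vals
```

Keeping names and values in the same order is what lets a reader pair entry k of `automl_params_names` with entry k of `automl_params_vals`.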

(3)
<model_output_table>_info

Same as the model output table info in 
https://madlib.apache.org/docs/latest/group__grp__keras__run__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 = 128 rows.  Add the 
following columns at the end, i.e., the right side of the table:
{code}
s    INTEGER    "Bracket number"                       e.g., 4
i    INTEGER    "Depth in bracket model trained to"    e.g., 3
{code}

(4)
<model_selection_table>

Same as the model selection table in 
https://madlib.apache.org/docs/latest/group__grp__keras__setup__model__selection.html
e.g., for R=81 and eta=3 it will have 81+27+9+6+5 = 128 rows

(5)
<model_selection_table>_summary

Same as the model selection summary table in 
https://madlib.apache.org/docs/latest/group__grp__keras__setup__model__selection.html

**Acceptance**

1) For `R=81, eta=3` check that it creates the correct MST tables 
<model_selection_table> and <model_selection_table>_summary
2) Set `skip_last=1` and check that it creates the correct MST tables
3) Try multiple other values to see if it produces the correct schedule
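
To make the acceptance checks concrete, here is a Python sketch that computes a hyperband schedule of (number of configs, iterations per config) for each bracket. It is not the code from the notebook referenced above: the function name `hyperband_schedule` is hypothetical, the rounding follows the commonly used integer variant that reproduces the 81, 27, 9, 6, 5 starting counts for `R=81, eta=3`, and `skip_last` is assumed to mean the number of final rounds dropped from every bracket.

```python
def hyperband_schedule(R, eta, skip_last=0):
    """Return {s: [(n_i, r_i), ...]} for brackets s = s_max .. 0.

    n_i is the number of configs in round i of bracket s and r_i is the
    number of iterations each of them is trained for. R and eta are
    assumed to be positive integers with eta >= 2.
    """
    # Largest bracket index: the biggest s with eta**s <= R.
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1

    schedule = {}
    for s in range(s_max, -1, -1):
        # Starting number of configs and iterations for this bracket.
        n = ((s_max + 1) // (s + 1)) * eta ** s
        r = R // eta ** s
        rounds = []
        # Assumption: skip_last drops that many trailing rounds per bracket.
        for i in range(max(s + 1 - skip_last, 0)):
            rounds.append((n // eta ** i, r * eta ** i))
        schedule[s] = rounds
    return schedule


if __name__ == "__main__":
    for s, rounds in hyperband_schedule(81, 3).items():
        print(s, rounds)
```

Summing the first `n_i` of every bracket for `R=81, eta=3` gives 81+27+9+6+5 = 128, which is the row count quoted for the MST table. How many rows remain when `skip_last=1` empties the smallest bracket is not stated in this ticket, so the sketch makes no claim about it.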



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
