[jira] [Commented] (MADLIB-1223) MLP regression predict fails if input table does not exist

Frank McQuillan (JIRA) Tue, 10 Apr 2018 11:50:19 -0700

    [ https://issues.apache.org/jira/browse/MADLIB-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16432762#comment-16432762 ]


Frank McQuillan commented on MADLIB-1223:
-----------------------------------------

This works for me:

{code}
Load the MNIST data into the mnist_train table and run the preprocessor to
create the mnist_train_packed table.
{code}

{code}
madlib=# drop table mnist_train;
DROP TABLE
{code}

{code}
madlib=# DROP TABLE IF EXISTS mnist_result, mnist_result_summary, 
mnist_result_standardization;
DROP TABLE
madlib=# 
madlib=# SELECT madlib.mlp_classification(
madlib(#     'mnist_train_packed',        -- Packed table from preprocessor
madlib(#     'mnist_result',              -- Destination table
madlib(#     'independent_varname',       -- Independent
madlib(#     'dependent_varname',         -- Dependent
madlib(#     ARRAY[5],                    -- Hidden layer sizes
madlib(#     'learning_rate_init=0.1,
madlib'#     n_iterations=1,
madlib'#     learning_rate_policy=const,
madlib'#     lambda=0.0001,               -- Regularization
madlib'#     tolerance=0',
madlib(#     'tanh',                      -- Activation function
madlib(#     '',                          -- No weights
madlib(#     FALSE,                       -- No warmstart
madlib(#     FALSE);                       -- Verbose
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s) named 
'coeff' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make 
sure column(s) chosen are the optimal data distribution key to minimize skew.
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 
'source_table' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make 
sure column(s) chosen are the optimal data distribution key to minimize skew.
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause. Creating a NULL policy 
entry.
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause. Creating a NULL policy 
entry.
 mlp_classification 
--------------------
(1 row)
{code}

{code}
madlib=# DROP TABLE IF EXISTS mnist_test_prediction;
DROP TABLE
madlib=# SELECT madlib.mlp_predict(
madlib(#     'mnist_result',
madlib(#     'mnist_test',
madlib(#     'id',
madlib(#     'mnist_test_prediction',
madlib(#     'response');
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause. Creating a NULL policy 
entry.
 mlp_predict 
-------------
(1 row)
{code}


So I am accepting this JIRA.


> MLP regression predict fails if input table does not exist
> ----------------------------------------------------------
>
>                 Key: MADLIB-1223
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1223
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Neural Networks
>            Reporter: Nikhil
>            Priority: Major
>             Fix For: v1.14
>
>
> If a model is trained with MLP regression and the input table is then
> dropped, MLP predict fails for that model.
> Ideally, the predict function should not depend on the existence of the
> training data.
> The predict code for regression only needs to know whether the dependent
> variable's type is an array. This information could be stored in the
> model's summary table.
> We also need to make sure that the predict function remains backwards
> compatible.
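
The idea in the description above (record the dependent variable's type in the summary table, but keep old models working) could be sketched roughly as follows. This is only an illustration in Python, not MADlib's actual code: the `dependent_vartype` summary field, the function name, and the two helper callables are hypothetical, and the `source_table` and `dependent_varname` summary fields are assumed to be available for the fallback path.

```python
from typing import Callable, Mapping

# Hypothetical summary-table field; an assumption, not a confirmed MADlib column.
SUMMARY_TYPE_KEY = "dependent_vartype"


def dependent_is_array(
    summary: Mapping[str, object],
    table_exists: Callable[[str], bool],
    lookup_column_type: Callable[[str, str], str],
) -> bool:
    """Decide whether the dependent variable is an array type.

    Prefers the type recorded in the model summary so that prediction does
    not need the training table. Falls back to inspecting the training table
    only for models whose summary lacks the recorded type.
    """
    stored = summary.get(SUMMARY_TYPE_KEY)
    if stored is not None:
        # Newer model: the type was saved at training time.
        return str(stored).endswith("[]")

    # Older model: no recorded type, so the training table is still required.
    source = str(summary["source_table"])
    if not table_exists(source):
        raise ValueError(
            "summary has no recorded dependent type and training table "
            "'%s' does not exist" % source
        )
    column = str(summary["dependent_varname"])
    return lookup_column_type(source, column).endswith("[]")
```

With a summary that carries the recorded type, the two callables are never invoked, so the training table can be dropped safely. A summary without it behaves as before and fails with an explicit error when the table is missing.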



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
