Frank McQuillan commented on MADLIB-1227:

Looks OK. As expected, I got this error when I ran MLP classification with 
mini-batch on an integer dependent variable that had not been 1-hot encoded:

InternalError: (psycopg2.InternalError) plpy.Error: MiniBatch expects the 
variable dependent_varname to be one hot encoded. You might need to re run the 
minibatch_preprocessor function and make sure that the variable is encoded 
CONTEXT:  Traceback (most recent call last):
  PL/Python function "mlp_classification", line 36, in <module>
  PL/Python function "mlp_classification", line 103, in mlp
  PL/Python function "mlp_classification", line 734, in 
  PL/Python function "mlp_classification", line 685, in _validate_dependent_var
  PL/Python function "mlp_classification", line 28, in 
PL/Python function "mlp_classification"
 [SQL: "SELECT madlib.mlp_classification(\n    'mnist_train_packed',        -- 
Packed table from preprocessor\n    'mnist_result',              -- Destination 
table\n    'independent_varname',       -- Independent\n    
'dependent_varname',         -- Dependent\n    ARRAY[20,20],                    
-- Hidden layer sizes\n    'learning_rate_init=0.1,\n    n_iterations=25,\n    
learning_rate_policy=const,\n    lambda=0.0001,               -- 
Regularization\n    tolerance=0',\n    'tanh',                      -- 
Activation function\n    '',                          -- No weights\n    FALSE, 
                      -- No warmstart\n    FALSE);                       -- 

> In MLP classification with mini-batch, check for 1-hot encoding of dependent 
> variable
> -------------------------------------------------------------------------------------
>                 Key: MADLIB-1227
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1227
>             Project: Apache MADlib
>          Issue Type: Improvement
>            Reporter: Frank McQuillan
>            Priority: Minor
>             Fix For: v1.14
> Related to
> https://issues.apache.org/jira/browse/MADLIB-1226
> Add a check to the MLP classification code that the dependent var has been 
> 1-hot encoded, and error out if that is not the case.
> This is to avoid the case of passing INTs as dep var to MLP that have not 
> been 1-hot encoded and having it run classification and not converge, or give 
> erroneous results, and giving no notification to the user about the problem.
> OK to pick the 1st row and check that its y array value looks 1-hot encoded, 
> like:
> {code}
> [0.0, 1.0, 0.0 ...]
> {code}
> i.e., all 0's and just one 1
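
The row check described in the issue (all 0's and just one 1) can be sketched in plain Python as below. This is only an illustration, not the code in MADlib's mlp_classification; the function name is_one_hot and the tolerance parameter tol are hypothetical.

```python
def is_one_hot(row, tol=1e-9):
    """Return True if `row` contains exactly one 1 and otherwise only 0's.

    Hypothetical helper for illustration; `tol` absorbs floating-point noise.
    """
    ones = 0
    for value in row:
        if abs(value - 1.0) <= tol:
            ones += 1            # found a 1
        elif abs(value) > tol:
            return False         # neither 0 nor 1, e.g. a raw integer label
    return ones == 1             # an empty or all-zero row is rejected


# A raw integer label such as [3] fails the check, which is the case
# the issue wants to report to the user instead of training silently.
print(is_one_hot([0.0, 1.0, 0.0]))
print(is_one_hot([3]))
```

As the issue suggests, applying such a check to the first row only keeps validation cheap, at the cost of not catching a table where later rows are malformed.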

This message was sent by Atlassian JIRA