[ https://issues.apache.org/jira/browse/MADLIB-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380827#comment-16380827 ]
Nandish Jayaram commented on MADLIB-1206:
-----------------------------------------

Based on the output of the preprocess step in https://issues.apache.org/jira/browse/MADLIB-1200, MLP should decide whether or not to use mini-batch, with some basic testing:
Check for <preprocessed_table_name>_summary and <preprocessed_table_name>_standardization, and the column names in them, to verify whether the data is pre-processed. If it is pre-processed, use mini-batch; otherwise use regular IGD.

Other information we should get from the pre-process step:
# The mean and standard deviation for the independent variable.
# Figure out whether the data is pre-processed for classification or regression by looking at a column named `classes` in <preprocessed_table_name>_summary.
# Get the original input table name, independent/dependent variable names, and grouping columns from <preprocessed_table_name>_summary.
# Use the buffer size from <preprocessed_table_name>_summary to validate the batch_size to be used in MLP mini-batch.

> Add mini batch based gradient descent support to MLP
> ----------------------------------------------------
>
>                 Key: MADLIB-1206
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1206
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Neural Networks
>            Reporter: Nandish Jayaram
>            Assignee: Nandish Jayaram
>            Priority: Major
>             Fix For: v1.14
>
>
> Mini-batch gradient descent is typically the algorithm of choice when
> training a neural network.
> MADlib currently supports IGD; we may have to add extensions to include
> mini-batch as a solver for MLP. Other modules will continue to use the
> existing IGD that does not support mini-batching. Later JIRAs will move other
> modules over one at a time to use the new mini-batch GD.
> Related JIRA that will pre-process the input data to be consumed by
> mini-batch is https://issues.apache.org/jira/browse/MADLIB-1200

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
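
The solver decision described in the comment above could be sketched roughly as follows. This is only an illustration, not MADlib code. The `catalog` dictionary stands in for a real database catalog lookup. Every column name except `classes` is a hypothetical placeholder. The rule that a non-null `classes` value means classification and the rule that batch_size must not exceed the buffer size are both assumptions, since the comment does not spell out either check.

```python
# Hypothetical column names: only `classes` is named in the comment above.
SUMMARY_COLS = {
    "source_table", "independent_varname", "dependent_varname",
    "grouping_cols", "buffer_size", "classes",
}
STANDARDIZATION_COLS = {"mean", "std"}


def choose_solver(table_name, catalog):
    """Return "minibatch" if table_name looks pre-processed, else "igd".

    `catalog` maps a table name to the set of its column names. It is a
    stand-in for querying the database catalog.
    """
    summary = catalog.get(table_name + "_summary")
    standardization = catalog.get(table_name + "_standardization")
    if summary is None or standardization is None:
        return "igd"  # companion tables missing: not pre-processed
    if SUMMARY_COLS <= summary and STANDARDIZATION_COLS <= standardization:
        return "minibatch"
    return "igd"  # tables exist but lack the expected columns


def is_classification(summary_row):
    """Assumed rule: a non-null `classes` value marks classification data."""
    return summary_row.get("classes") is not None


def validate_batch_size(batch_size, buffer_size):
    """Assumed rule: batch_size must lie in the range [1, buffer_size]."""
    if not 1 <= batch_size <= buffer_size:
        raise ValueError(
            "batch_size %d must be between 1 and buffer size %d"
            % (batch_size, buffer_size)
        )
    return batch_size


if __name__ == "__main__":
    catalog = {
        "iris_packed_summary": set(SUMMARY_COLS),
        "iris_packed_standardization": set(STANDARDIZATION_COLS),
    }
    print(choose_solver("iris_packed", catalog))
    print(choose_solver("iris_raw", catalog))
```

Falling back to IGD whenever either companion table or an expected column is missing keeps the existing behaviour for inputs that were never run through the preprocess step.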