[ https://issues.apache.org/jira/browse/MADLIB-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388482#comment-16388482 ]

Nandish Jayaram commented on MADLIB-1206:
-----------------------------------------

We can follow Python's approach for setting the default value for batch_size:
{code}
default_batch_size = min(200, buffer_size)
{code}
Here buffer_size is the number of rows that the preprocessor step
(https://issues.apache.org/jira/browse/MADLIB-1200) packs into one row. Note
that Python uses the total number of input data points instead of buffer_size.
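
For illustration, here is a minimal sketch of how that default could be
resolved. This is a sketch only: the resolve_batch_size helper and the
comparison to scikit-learn's rule (batch_size = min(200, n_samples)) are
assumptions for illustration, not MADlib code:
{code}
# Sketch only, not MADlib code. scikit-learn's MLP defaults to
# batch_size = min(200, n_samples); here the cap is buffer_size instead,
# since the MADLIB-1200 preprocessor packs buffer_size rows into one tuple.
def resolve_batch_size(batch_size, buffer_size):
    if batch_size is None:
        return min(200, buffer_size)
    # Assumption: batches are drawn from within one packed buffer,
    # so a batch cannot be larger than buffer_size.
    return min(batch_size, buffer_size)
{code}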

We may have to revisit this default batch size after some experiments to see
how it affects performance/accuracy.

> Add mini batch based gradient descent support to MLP
> ----------------------------------------------------
>
>                 Key: MADLIB-1206
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1206
>             Project: Apache MADlib
>          Issue Type: New Feature
>          Components: Module: Neural Networks
>            Reporter: Nandish Jayaram
>            Assignee: Nandish Jayaram
>            Priority: Major
>             Fix For: v1.14
>
>
> Mini-batch gradient descent is typically the algorithm of choice when 
> training a neural network.
> MADlib currently supports IGD; we may have to add extensions to include 
> mini-batch as a solver for MLP. Other modules will continue to use the 
> existing IGD, which does not support mini-batching. Later JIRAs will move 
> other modules over, one at a time, to use the new mini-batch GD.
> The related JIRA that will pre-process the input data to be consumed by 
> mini-batch is https://issues.apache.org/jira/browse/MADLIB-1200
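
For context on the solver itself, below is a minimal NumPy sketch of
mini-batch gradient descent on a single linear layer with squared-error loss.
It illustrates the technique only, not MADlib's implementation; in an MLP the
same loop is applied per layer via backpropagation:
{code}
import numpy as np

def minibatch_gd(X, y, batch_size, lr=0.01, n_epochs=10, seed=0):
    """Mini-batch GD sketch: one weight update per batch, unlike IGD,
    which updates once per row."""
    rng = np.random.RandomState(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_epochs):
        order = rng.permutation(n)                  # reshuffle rows each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            grad = Xb.T @ (Xb @ w - yb) / len(idx)  # average gradient over the batch
            w -= lr * grad                          # single update for the whole batch
    return w
{code}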


