[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user asfgit closed the pull request at:

    https://github.com/apache/madlib/pull/334

---
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user kaknikhil commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/334#discussion_r234051398

    --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in ---
    @@ -487,10 +487,16 @@ class MiniBatchDocumentation:
     SUMMARY
    -MiniBatch Preprocessor is a utility function to pre process the input
    -data for use with models that support mini-batching as an optimization
    +The mini-batch preprocessor is a utility that prepares input data for
    +use by models that support mini-batch as an optimization option. (This
    +is currently only the case for Neural Networks.) It is effectively a
    --- End diff --

    /s/Neural Networks/Neural Network

---
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user kaknikhil commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/334#discussion_r234051252

    --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in ---
    @@ -487,10 +487,16 @@ class MiniBatchDocumentation:
     SUMMARY
    -MiniBatch Preprocessor is a utility function to pre process the input
    -data for use with models that support mini-batching as an optimization
    +The mini-batch preprocessor is a utility that prepares input data for
    +use by models that support mini-batch as an optimization option. (This
    +is currently only the case for Neural Networks.) It is effectively a
    +packing operation that builds arrays of dependent and independent
    +variables from the source data table.

    -#TODO add more here
    +The advantage of using mini-batching is that it can perform better than
    +stochastic gradient descent (default MADlib optimizer) because it uses
    +more than one training example at a time, typically resulting faster
    --- End diff --

    The word "in" is missing: `resulting in faster .`

---
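For readers following the thread, the doc text's claim about using "more than one training example at a time" can be illustrated with a small sketch. This is a generic mini-batch gradient step for a linear model with squared error, written for illustration only; the function names `minibatch_gradient` and `minibatch_step` and the learning rate are assumptions and do not come from MADlib.

```python
def minibatch_gradient(weights, batch):
    """Average squared-error gradient of a linear model over one mini-batch.

    `batch` is a list of (features, target) pairs. Hypothetical helper,
    not part of MADlib.
    """
    grad = [0.0] * len(weights)
    for features, target in batch:
        # Prediction error for this single training example.
        error = sum(w * x for w, x in zip(weights, features)) - target
        for j, x in enumerate(features):
            grad[j] += error * x
    # Averaging over the batch is what distinguishes this from a
    # one-example-at-a-time update.
    return [g / len(batch) for g in grad]


def minibatch_step(weights, batch, learning_rate=0.1):
    """Apply one gradient-descent update computed from the whole mini-batch."""
    grad = minibatch_gradient(weights, batch)
    return [w - learning_rate * g for w, g in zip(weights, grad)]
```

Calling `minibatch_step` with a batch of size one reduces to a single-example update, so the batch size is the only thing that changes between the two regimes in this sketch.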
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user kaknikhil commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/334#discussion_r234051503

    --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in ---
    @@ -487,10 +487,16 @@ class MiniBatchDocumentation:
     SUMMARY
    -MiniBatch Preprocessor is a utility function to pre process the input
    -data for use with models that support mini-batching as an optimization
    +The mini-batch preprocessor is a utility that prepares input data for
    +use by models that support mini-batch as an optimization option. (This
    +is currently only the case for Neural Networks.) It is effectively a
    +packing operation that builds arrays of dependent and independent
    --- End diff --

    Should we instead say `build matrix of independent variable(s) and arrays of dependent variable`?

---
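The "packing operation" under discussion can be pictured with a minimal sketch: consecutive source rows are collected into buffers, each holding a matrix of independent variables and an array of dependent variables. The function `pack_rows` and its row layout are hypothetical and only illustrate the idea; they do not reproduce the MADlib implementation, which may perform additional steps.

```python
def pack_rows(rows, buffer_size):
    """Pack (independent, dependent) rows into buffers.

    Each buffer is a pair (independent_matrix, dependent_array) built from
    at most `buffer_size` consecutive rows. Hypothetical helper for
    illustration only.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    buffers = []
    for start in range(0, len(rows), buffer_size):
        chunk = rows[start:start + buffer_size]
        independent_matrix = [list(x) for x, _ in chunk]  # one row per example
        dependent_array = [y for _, y in chunk]           # one entry per example
        buffers.append((independent_matrix, dependent_array))
    return buffers
```

With three source rows and a buffer size of two, this yields one full buffer and one partial buffer holding the remaining row.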
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user kaknikhil commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/334#discussion_r234051175

    --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in ---
    @@ -508,8 +514,13 @@ class MiniBatchDocumentation:
     dependent_varname,   -- TEXT. Name of the dependent variable column
     independent_varname, -- TEXT. Name of the independent variable column
    -buffer_size          -- INTEGER. Number of source input rows to
    -                        pack into batch
    +grouping_col         -- TEXT. Default NULL. An expression list used
    +                        to group the input dataset into discrete groups
    +buffer_size          -- INTEGER. Default computed automatically.
    +                        Number of source input rows to pack into batch
    --- End diff --

    /s/batch/buffer

---
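How `grouping_col` and `buffer_size` might interact can be sketched as follows: rows are first split into discrete groups, and each group is then packed into buffers of at most `buffer_size` rows. The function `pack_by_group` and the `(group_key, independent, dependent)` row layout are assumptions made for illustration; the actual behaviour of the MADlib utility (including how the default buffer size is computed) is not shown in this thread.

```python
from collections import defaultdict


def pack_by_group(rows, buffer_size):
    """Group rows by key, then pack each group into buffers.

    `rows` is a list of (group_key, independent, dependent) tuples.
    Returns {group_key: [(independent_matrix, dependent_array), ...]}.
    Hypothetical helper, not the MADlib implementation.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    grouped = defaultdict(list)
    for key, x, y in rows:
        grouped[key].append((x, y))

    packed = {}
    for key, members in grouped.items():
        buffers = []
        for start in range(0, len(members), buffer_size):
            chunk = members[start:start + buffer_size]
            # Buffers never mix rows from different groups.
            buffers.append(([list(x) for x, _ in chunk], [y for _, y in chunk]))
        packed[key] = buffers
    return packed
```

The point of the sketch is that a buffer only ever contains rows from a single group, so a small group produces a partial buffer rather than borrowing rows from another group.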
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
GitHub user njayaram2 opened a pull request:

    https://github.com/apache/madlib/pull/334

    Minibatch Preprocessor: Update online doc

    The online doc is outdated. This commit adds two new
    parameters that have been introduced since the last
    time the doc was edited.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/madlib/madlib doc/minibatch-preprocessor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/madlib/pull/334.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #334

commit 7e95fc7d936f25e74ceceb74dfa7473c4eda45c8
Author: Nandish Jayaram
Date:   2018-10-23T17:35:02Z

    Minibatch Preprocessor: Update online doc

    The online doc is outdated. This commit adds two new
    parameters that have been introduced since the last
    time the doc was edited.

---