[GitHub] madlib issue #338: Install/Dev check: Add new test cases for some modules
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/338 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/711/ ---
[GitHub] madlib issue #334: Minibatch Preprocessor: Update online doc
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/334 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/710/ ---
[GitHub] madlib pull request #338: Install/Dev check: Add new test cases for some mod...
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/338 ---
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/334 ---
[GitHub] madlib pull request #338: Install/Dev check: Add new test cases for some mod...
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/338#discussion_r234052747 --- Diff: src/ports/postgres/modules/pmml/test/pmml.ic.sql_in --- @@ -0,0 +1,119 @@ +/* --- *//** + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + * + *//* --- */ +DROP TABLE IF EXISTS abalone CASCADE; + +CREATE TABLE abalone ( +id integer, +sex text, +length double precision, +diameter double precision, +height double precision, +whole double precision, +shucked double precision, +viscera double precision, +shell double precision, +rings integer +); + +INSERT INTO abalone VALUES +(3151, 'F', 0.655027, 0.505004, 0.165008, 1.36699, 0.583519, 0.351479, 0.396019, 10), --- End diff -- Since this is an install check test, can we cut down on the dataset size and call the `glm` and the `pmml` functions only once? ---
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/334#discussion_r234051398 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -487,10 +487,16 @@ class MiniBatchDocumentation: SUMMARY -MiniBatch Preprocessor is a utility function to pre process the input -data for use with models that support mini-batching as an optimization +The mini-batch preprocessor is a utility that prepares input data for +use by models that support mini-batch as an optimization option. (This +is currently only the case for Neural Networks.) It is effectively a --- End diff -- /s/Neural Networks/Neural Network ---
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/334#discussion_r234051252 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -487,10 +487,16 @@ class MiniBatchDocumentation: SUMMARY -MiniBatch Preprocessor is a utility function to pre process the input -data for use with models that support mini-batching as an optimization +The mini-batch preprocessor is a utility that prepares input data for +use by models that support mini-batch as an optimization option. (This +is currently only the case for Neural Networks.) It is effectively a +packing operation that builds arrays of dependent and independent +variables from the source data table. -#TODO add more here +The advantage of using mini-batching is that it can perform better than +stochastic gradient descent (default MADlib optimizer) because it uses +more than one training example at a time, typically resulting faster --- End diff -- Missing the word `in`; this should read `resulting in faster`. ---
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/334#discussion_r234051503 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -487,10 +487,16 @@ class MiniBatchDocumentation: SUMMARY -MiniBatch Preprocessor is a utility function to pre process the input -data for use with models that support mini-batching as an optimization +The mini-batch preprocessor is a utility that prepares input data for +use by models that support mini-batch as an optimization option. (This +is currently only the case for Neural Networks.) It is effectively a +packing operation that builds arrays of dependent and independent --- End diff -- Should we instead say `build matrix of independent variable(s) and arrays of dependent variable`? ---
[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/334#discussion_r234051175 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -508,8 +514,13 @@ class MiniBatchDocumentation: dependent_varname, -- TEXT. Name of the dependent variable column independent_varname, -- TEXT. Name of the independent variable column -buffer_size-- INTEGER. Number of source input rows to - pack into batch +grouping_col -- TEXT. Default NULL. An expression list used + to group the input dataset into discrete groups +buffer_size-- INTEGER. Default computed automatically. + Number of source input rows to pack into batch --- End diff -- /s/batch/buffer ---
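Editorial sketch of the packing operation discussed in the review comments on #334: the doc text describes the preprocessor as building arrays of dependent and independent variables from the source table, with `buffer_size` source rows packed into each buffer. The Python below is only an illustration of that idea under stated assumptions. It is not MADlib's implementation, the names `pack_into_buffers` and `Row` are hypothetical, and it ignores `grouping_col`, the automatic default for `buffer_size`, and anything else the real utility does inside the database.

```python
from typing import List, Sequence, Tuple

# Hypothetical row shape: (independent variables, dependent variable).
Row = Tuple[Sequence[float], float]


def pack_into_buffers(
    rows: Sequence[Row], buffer_size: int
) -> List[Tuple[List[List[float]], List[float]]]:
    """Pack source rows into buffers of at most `buffer_size` rows each.

    Each buffer is a pair: a matrix (list of lists) of independent
    variables and an array (list) of dependent variables.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be a positive integer")

    buffers = []
    for start in range(0, len(rows), buffer_size):
        chunk = rows[start:start + buffer_size]
        independent = [list(x) for x, _ in chunk]  # one matrix row per source row
        dependent = [y for _, y in chunk]          # one entry per source row
        buffers.append((independent, dependent))
    return buffers


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    sample = [
        ([0.1, 0.2], 1.0),
        ([0.3, 0.4], 0.0),
        ([0.5, 0.6], 1.0),
    ]
    for independent, dependent in pack_into_buffers(sample, buffer_size=2):
        print(independent, dependent)
```

With three sample rows and `buffer_size=2`, the sketch yields two buffers: the first holds two rows and the last holds the remaining one.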
[GitHub] madlib issue #338: Install/Dev check: Add new test cases for some modules
Github user asfgit commented on the issue: https://github.com/apache/madlib/pull/338 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/madlib-pr-build/709/ ---
[GitHub] madlib pull request #338: Install/Dev check: Add new test cases for some mod...
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/338 Install/Dev check: Add new test cases for some modules Some modules such as array_ops and pmml did not have any install check files, while stemmer did not have any test files. This commit adds some basic test cases for these modules. You can merge this pull request into a Git repository by running: $ git pull https://github.com/madlib/madlib ic-pmml-stemmer Alternatively you can review and apply these changes as the patch at: https://github.com/apache/madlib/pull/338.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #338 commit c351f176b305fb44bd87bc6a4f79c099a3d6fbe3 Author: Nandish Jayaram Date: 2018-09-29T00:15:40Z Install/Dev check: Add new test cases for some modules Some modules such as array_ops and pmml did not have any install check files, while stemmer did not have any test files. This commit adds some basic test cases for these modules. ---