[GitHub] madlib issue #342: Minibatch Preprocessor for Deep learning

2018-12-22 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/342 @reductionista thank you for the comments. The existing `minibatch_preprocessor` module outputs new columns called `dependent_varname` and `independent_varname` instead of the column names from

[GitHub] madlib pull request #342: Minibatch Preprocessor for Deep learning

2018-12-19 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/342 Minibatch Preprocessor for Deep learning The minibatch preprocessor we currently have in MADlib is bloated for DL tasks. This feature adds a simplified version of creating buffers

[GitHub] madlib pull request #341: Minibatch Preprocessor for Deep learning

2018-12-19 Thread njayaram2
Github user njayaram2 closed the pull request at: https://github.com/apache/madlib/pull/341

[GitHub] madlib pull request #341: Minibatch Preprocessor for Deep learning

2018-12-19 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/341 Minibatch Preprocessor for Deep learning The minibatch preprocessor we currently have in MADlib is bloated for DL tasks. This feature adds a simplified version of creating buffers

[GitHub] madlib pull request #338: Install/Dev check: Add new test cases for some mod...

2018-11-15 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/338 Install/Dev check: Add new test cases for some modules Some modules such as array_ops and pmml did not have any install check files, while stemmer did not have any test files. This commit adds

[GitHub] madlib pull request #334: Minibatch Preprocessor: Update online doc

2018-10-23 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/334 Minibatch Preprocessor: Update online doc The online doc is outdated. This commit adds two new parameters that have been introduced since the last time the doc was edited. You can merge

[GitHub] madlib pull request #326: Install/Dev check: Add new test cases for some mod...

2018-10-02 Thread njayaram2
Github user njayaram2 closed the pull request at: https://github.com/apache/madlib/pull/326

[GitHub] madlib pull request #327: Upgrade: Fix issue with upgrading RPM to 1.15.1

2018-10-01 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/327 Upgrade: Fix issue with upgrading RPM to 1.15.1 JIRA: MADLIB-1278 During RPM upgrade, rpm_post.sh is run first, followed by rpm_post_uninstall.sh. So we must do all the uninstallation

[GitHub] madlib pull request #326: Install/Dev check: Add new test cases for some mod...

2018-10-01 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/326 Install/Dev check: Add new test cases for some modules Some modules such as array_ops and pmml did not have any install check files, while stemmer did not have any test files. This commit adds

[GitHub] madlib issue #315: JIRA:1060 - Modified KNN to accept expressions in point_c...

2018-09-07 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/315 @fmcquillan99 thanks for testing this out. I can have a look at this issue.

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-09-05 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r215464110 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -53,22 +55,12 @@ def knn_validate_src(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-08-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r214485090 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -53,22 +55,12 @@ def knn_validate_src(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-08-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r214487281 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -53,22 +55,12 @@ def knn_validate_src(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-08-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r214484071 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -53,22 +55,12 @@ def knn_validate_src(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-08-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r214485621 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -53,22 +55,12 @@ def knn_validate_src(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-08-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r214487426 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -53,22 +55,12 @@ def knn_validate_src(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-08-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r214482318 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -264,12 +260,17 @@ def knn(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-08-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r213806554 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -53,22 +55,10 @@ def knn_validate_src(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-08-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r213791885 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -53,22 +55,10 @@ def knn_validate_src(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib pull request #315: JIRA:1060 - Modified KNN to accept expressions in ...

2018-08-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/315#discussion_r213792935 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -264,12 +275,14 @@ def knn(schema_madlib, point_source, point_column_name, point_id

[GitHub] madlib issue #314: Ubuntu support: Enable creation of gppkg on Ubuntu

2018-08-27 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/314 Thank you for the comments @reductionista, I have updated the comment. Please do have a look at it.

[GitHub] madlib pull request #314: Ubuntu support: Enable creation of gppkg on Ubuntu

2018-08-22 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/314 Ubuntu support: Enable creation of gppkg on Ubuntu This commit makes necessary changes to create a gppkg on Ubuntu. The default behavior when MADlib is built on Ubuntu is to create a .deb

[GitHub] madlib pull request #295: Recursive Partitioning: Add function to report imp...

2018-07-23 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/295#discussion_r204593622 --- Diff: src/ports/postgres/modules/recursive_partitioning/random_forest.py_in --- @@ -1564,69 +1578,69 @@ def get_var_importance(schema_madlib

[GitHub] madlib pull request #291: Feature: Vector-Column Transformations

2018-07-23 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/291#discussion_r204589559 --- Diff: src/ports/postgres/modules/utilities/transform_vec_cols.py_in --- @@ -0,0 +1,513 @@ +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] madlib pull request #297: Madpack: Fix various schema related bugs

2018-07-23 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/297#discussion_r204572428 --- Diff: src/madpack/madpack.py --- @@ -995,19 +996,20 @@ def run_unit_tests(args, testcase): Run unit tests

[GitHub] madlib issue #295: Recursive Partitioning: Add function to report importance...

2018-07-19 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/295 @fmcquillan only impurity, I don't think we scale oob to 100.

[GitHub] madlib pull request #291: Feature: Vector to Columns

2018-07-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/291#discussion_r203890181 --- Diff: src/ports/postgres/modules/utilities/transform_vec_cols.py_in --- @@ -0,0 +1,492 @@ +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] madlib issue #294: Pagerank: Remove duplicate entries from grouping output

2018-07-16 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/294 Thank you for the comments @jingyimei, I have pushed a commit with a new dev-check test.

[GitHub] madlib pull request #290: madpack: Add madpack option to run unit tests.

2018-07-16 Thread njayaram2
Github user njayaram2 closed the pull request at: https://github.com/apache/madlib/pull/290

[GitHub] madlib pull request #294: Pagerank: Remove duplicate entries from grouping o...

2018-07-16 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/294 Pagerank: Remove duplicate entries from grouping output JIRA: MADLIB-1229 JIRA: MADLIB-1253 Also fixes the bug where output was missing for complete graphs. Co-authored-by: Nandish

[GitHub] madlib issue #290: madpack: Add madpack option to run unit tests.

2018-07-12 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/290 Thank you for the comments @iyerr3, will do the needful.

[GitHub] madlib pull request #290: madpack: Add madpack option to run unit tests.

2018-07-11 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/290 madpack: Add madpack option to run unit tests. JIRA: MADLIB-1252 Unit tests in MADlib are written in Python files that are located in the .../test/unit_tests/ folders, whose names

[GitHub] madlib pull request #288: Jira:1239: Converts features from multiple columns...

2018-07-05 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/288#discussion_r200444598 --- Diff: src/ports/postgres/modules/cols_vec/cols2vec.py_in --- @@ -0,0 +1,110 @@ +""" +@file cols2vec.py_in + +@brief Uti

[GitHub] madlib issue #284: SVM: Fix flaky dev-check failure

2018-06-27 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/284 Thank you for the comment @iyerr3, will relax the constraint as suggested.

[GitHub] madlib pull request #284: SVM: Fix flaky dev-check failure

2018-06-27 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/284 SVM: Fix flaky dev-check failure JIRA: MADLIB-1232 SVM has a dev-check query that is flaky on a large cluster. This commit relaxes the assert condition for that query. Closes

[GitHub] madlib issue #283: Bugfix: Fix failing dev check in CRF

2018-06-27 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/283 Thank you for the comments @kaknikhil. I moved the Jenkins build script out to a different commit.

[GitHub] madlib pull request #283: Bugfix: Fix failing dev check in CRF

2018-06-27 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/283 Bugfix: Fix failing dev check in CRF This commit has the following changes: - A couple of dev check files in CRF did not have the label table creation in them. But the label table

[GitHub] madlib pull request #277: DT: Add impurity importance metric

2018-06-18 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/277#discussion_r196150565 --- Diff: src/ports/postgres/modules/recursive_partitioning/decision_tree.py_in --- @@ -1097,28 +1121,21 @@ def _one_step(schema_madlib

[GitHub] madlib issue #276: Feature/dev check

2018-06-15 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/276 Thank you for the comments @iyerr3, will make the changes you have requested. Having one IC file for each module makes sense, but on Greenplum, for some modules the IC run time is still

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-06-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r194151705 --- Diff: doc/design/modules/neural-network.tex --- @@ -117,6 +122,26 @@ \subsubsection{Backpropagation} \[\boxed{\delta_{k}^j = \sum_{t=1}^{n_{k+1

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-06-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r194151826 --- Diff: doc/design/modules/neural-network.tex --- @@ -196,17 +221,28 @@ \subsubsection{The $\mathit{Gradient}$ Function} \end{algorithmic} \end

[GitHub] madlib pull request #275: Madpack: Fix error with dropping user after IC fai...

2018-06-05 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/275 Madpack: Fix error with dropping user after IC failure. JIRA: MADLIB-1182 Previously, when install check did not fail gracefully, the user created by madpack hung around and disturbed

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r192246910 --- Diff: doc/design/modules/neural-network.tex --- @@ -117,6 +117,24 @@ \subsubsection{Backpropagation} \[\boxed{\delta_{k}^j = \sum_{t=1}^{n_{k+1

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r192246463 --- Diff: doc/design/modules/neural-network.tex --- @@ -117,6 +117,24 @@ \subsubsection{Backpropagation} \[\boxed{\delta_{k}^j = \sum_{t=1}^{n_{k+1

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r192251198 --- Diff: doc/design/modules/neural-network.tex --- @@ -117,6 +117,24 @@ \subsubsection{Backpropagation} \[\boxed{\delta_{k}^j = \sum_{t=1}^{n_{k+1

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r192245605 --- Diff: doc/design/modules/neural-network.tex --- @@ -117,6 +117,24 @@ \subsubsection{Backpropagation} \[\boxed{\delta_{k}^j = \sum_{t=1}^{n_{k+1

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r192248589 --- Diff: doc/design/modules/neural-network.tex --- @@ -117,6 +117,24 @@ \subsubsection{Backpropagation} \[\boxed{\delta_{k}^j = \sum_{t=1}^{n_{k+1

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192186350 --- Diff: src/madpack/madpack.py --- @@ -131,10 +141,73 @@ def _get_relative_maddir(maddir, port): return maddir

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192174800 --- Diff: src/madpack/upgrade_util.py --- @@ -1299,18 +1303,19 @@ def _clean_function(self): pattern = re.compile(r"""CREA

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192193089 --- Diff: src/madpack/madpack.py --- @@ -559,71 +650,59 @@ def _db_rename_schema(from_schema, to_schema

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192205268 --- Diff: src/madpack/madpack.py --- @@ -987,275 +1276,42 @@ def main(argv): error_(this, "Missing -p/--platform parameter."

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192175486 --- Diff: src/madpack/upgrade_util.py --- @@ -1299,18 +1303,19 @@ def _clean_function(self): pattern = re.compile(r"""CREA

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192204168 --- Diff: src/madpack/madpack.py --- @@ -824,6 +873,246 @@ def parse_arguments(): # Get the arguments return parser.parse_args

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192182069 --- Diff: src/madpack/madpack.py --- @@ -95,6 +95,16 @@ def _internal_run_query(sql, show_error): return run_query(sql, con_args, show_error

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192193755 --- Diff: src/madpack/madpack.py --- @@ -824,6 +873,246 @@ def parse_arguments(): # Get the arguments return parser.parse_args

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192181634 --- Diff: src/madpack/madpack.py --- @@ -95,6 +95,16 @@ def _internal_run_query(sql, show_error): return run_query(sql, con_args, show_error

[GitHub] madlib pull request #271: Madpack: Make install, reinstall and upgrade atomi...

2018-05-31 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/271#discussion_r192178673 --- Diff: src/madpack/utilities.py --- @@ -33,6 +33,23 @@ this = os.path.basename(sys.argv[0])# name of this script +class

[GitHub] madlib issue #271: Madpack: Make install, reinstall and upgrade atomic

2018-05-31 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/271 Thank you for the comments @kaknikhil, will address them.

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r191575057 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -1781,7 +1799,7 @@ class MLPMinibatchPreProcessor: summary_table_columns

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r191533727 --- Diff: src/modules/convex/type/model.hpp --- @@ -126,45 +129,96 @@ struct MLPModel { for (k = 0; k < N; k ++) { s

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r191574947 --- Diff: src/ports/postgres/modules/convex/mlp.sql_in --- @@ -1474,13 +1480,15 @@ CREATE AGGREGATE MADLIB_SCHEMA.mlp_minibatch_step

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r191539168 --- Diff: src/modules/convex/task/mlp.hpp --- @@ -197,6 +244,7 @@ MLP::loss( const model_type, const

[GitHub] madlib pull request #272: MLP: Add momentum and nesterov to gradient updates...

2018-05-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/272#discussion_r191537654 --- Diff: src/modules/convex/task/mlp.hpp --- @@ -126,68 +157,84 @@ MLP::getLossAndUpdateModel( const Matrix _true_batch

[GitHub] madlib pull request #260: minibatch preprocessor improvements

2018-04-10 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/260#discussion_r180603956 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -397,8 +408,9 @@ class MiniBatchStandardizer

[GitHub] madlib pull request #259: Minibatch: Add one-hot encoding option for int

2018-04-10 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/259#discussion_r180576675 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.sql_in --- @@ -91,6 +92,22 @@ minibatch_preprocessor( When this value

[GitHub] madlib pull request #258: RF: Comment out assert in flaky install check quer...

2018-04-06 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/258 RF: Comment out assert in flaky install check query JIRA: MADLIB-1225 The variable importance computation involves randomization inherently. So it is hard to reproduce this error

[GitHub] madlib pull request #254: Enable grouping for minibatch preprocessing

2018-04-03 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/254#discussion_r178684532 --- Diff: src/ports/postgres/modules/utilities/mean_std_dev_calculator.py_in --- @@ -40,15 +41,27 @@ class MeanStdDevCalculator

[GitHub] madlib pull request #254: Enable grouping for minibatch preprocessing

2018-04-03 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/254#discussion_r178684358 --- Diff: src/ports/postgres/modules/convex/utils_regularization.py_in --- @@ -85,6 +86,8 @@ def utils_ind_var_scales_grouping(tbl_data, col_ind_var

[GitHub] madlib pull request #251: MLP: Simplify initialization of model coefficients

2018-03-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/251#discussion_r178204067 --- Diff: src/modules/convex/mlp_igd.cpp --- @@ -98,23 +98,28 @@ mlp_igd_transition::run(AnyType ) { double is_classification_double

[GitHub] madlib pull request #250: MLP: Allow one-hot encoded dependent var for class...

2018-03-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/250#discussion_r178167691 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -667,7 +678,8 @@ def _validate_dependent_var(source_table, dependent_varname

[GitHub] madlib pull request #250: MLP: Allow one-hot encoded dependent var for class...

2018-03-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/250#discussion_r178168389 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -856,8 +868,16 @@ def mlp_predict(schema_madlib, model_table, data_table, id_col_name

[GitHub] madlib pull request #250: MLP: Allow one-hot encoded dependent var for class...

2018-03-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/250#discussion_r178168247 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -667,7 +678,8 @@ def _validate_dependent_var(source_table, dependent_varname

[GitHub] madlib pull request #250: MLP: Allow one-hot encoded dependent var for class...

2018-03-29 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/250#discussion_r178168263 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -667,7 +678,8 @@ def _validate_dependent_var(source_table, dependent_varname

[GitHub] madlib pull request #248: DT: Ensure proper quoting in grouping coalesce

2018-03-27 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/248#discussion_r177495864 --- Diff: src/ports/postgres/modules/recursive_partitioning/decision_tree.py_in --- @@ -970,16 +970,35 @@ def _get_bins_grps

[GitHub] madlib pull request #250: MLP: Allow one-hot encoded dependent var for class...

2018-03-26 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/250 MLP: Allow one-hot encoded dependent var for classification JIRA: MADLIB-1222 MLP currently automatically encodes categorical variables for classification but does not allow already

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-21 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r176218740 --- Diff: src/modules/convex/mlp_igd.cpp --- @@ -130,6 +145,90 @@ mlp_igd_transition::run(AnyType ) { return state

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175949965 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -222,67 +243,83 @@ def mlp(schema_madlib, source_table, output_table, independent_varname

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175950079 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -292,26 +329,33 @@ def mlp(schema_madlib, source_table, output_table, independent_varname

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175947887 --- Diff: src/modules/convex/algo/igd.hpp --- @@ -90,20 +90,27 @@ IGD<State, ConstState, Task>::transition(state_type , for (int curr

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175947857 --- Diff: src/modules/convex/algo/igd.hpp --- @@ -90,20 +90,27 @@ IGD<State, ConstState, Task>::transition(state_type , for (int curr

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175950139 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -491,10 +571,28 @@ def _update_temp_model_table(args, iteration, temp_output_table

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175948252 --- Diff: src/modules/convex/mlp_igd.cpp --- @@ -130,6 +145,90 @@ mlp_igd_transition::run(AnyType ) { return state

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175949682 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -222,67 +243,83 @@ def mlp(schema_madlib, source_table, output_table, independent_varname

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175948750 --- Diff: src/modules/convex/task/mlp.hpp --- @@ -111,6 +117,57 @@ class MLP { template double MLP<Model, Tuple>::lamb

[GitHub] madlib issue #241: MiniBatch Pre-Processor: Add new module minibatch_preproc...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/241 Another issue I found but forgot to mention in the review: The `__id__` column has double values instead of integers. For instance, I found values such as `0.2000

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175548350 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175588969 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175593796 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175531202 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175522378 --- Diff: src/ports/postgres/modules/utilities/utilities.py_in --- @@ -794,6 +794,41 @@ def collate_plpy_result(plpy_result_rows

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175585050 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-16 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/243 MLP: Add minibatch gradient descent solver JIRA: MADLIB-1206 This commit adds support for mini-batch based gradient descent for MLP. If the input table contains a 2D matrix

[GitHub] madlib pull request #240: MLP: Fix step size initialization based on learnin...

2018-03-09 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/240 MLP: Fix step size initialization based on learning rate policy JIRA: MADLIB-1212 The step_size is supposed to be updated based on the learning rate. The formulae for different

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r173254181 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +545,95 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25 rows

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172954235 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -468,81 +544,107 @@ def balance_sample(schema_madlib, source_table, output_table

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172953687 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -468,81 +544,107 @@ def balance_sample(schema_madlib, source_table, output_table

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172958587 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -58,28 +60,64 @@ NOSAMPLE = 'nosample' NEW_ID_COLUMN = '__madlib_id__

[GitHub] madlib issue #237: Bugfix: MLP predict using 1.12 model fails on later versi...

2018-02-22 Thread njayaram2
Github user njayaram2 commented on the issue: https://github.com/apache/madlib/pull/237 Thank you for the comments @iyerr3, will push another commit.

[GitHub] madlib pull request #237: Bugfix: MLP predict using 1.12 model fails on late...

2018-02-21 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/237 Bugfix: MLP predict using 1.12 model fails on later versions JIRA: MADLIB-1207 MADlib 1.12 did not support grouping in MLP. The summary table created used to have the mean and std

[GitHub] madlib pull request #230: Balanced sets final

2018-02-05 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r166056096 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software
