[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175923655 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -33,11 +34,12 @@ from convex.utils_regularization import __utils_normalize_data_grouping

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175894372 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -1457,3 +1660,85 @@ def mlp_predict_help(schema_madlib, message): return """

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175923217 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -292,26 +329,33 @@ def mlp(schema_madlib, source_table, output_table,

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175889157 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -590,51 +664,103 @@ def _validate_warm_start(output_table, summary_table,

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175921215 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -222,67 +243,83 @@ def mlp(schema_madlib, source_table, output_table,

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175628144 --- Diff: src/modules/convex/mlp_igd.cpp --- @@ -130,6 +145,90 @@ mlp_igd_transition::run(AnyType ) { return state; } +/** + *

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175877947 --- Diff: src/modules/convex/task/mlp.hpp --- @@ -111,6 +117,57 @@ class MLP { template double MLP::lambda = 0;

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175929883 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -292,26 +329,33 @@ def mlp(schema_madlib, source_table, output_table,

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175620624 --- Diff: src/modules/convex/algo/igd.hpp --- @@ -90,20 +90,27 @@ IGD::transition(state_type , for (int

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175871655 --- Diff: src/modules/convex/mlp_igd.cpp --- @@ -170,6 +289,24 @@ mlp_igd_final::run(AnyType ) { return state; } + +/** + *

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175917822 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -222,67 +243,83 @@ def mlp(schema_madlib, source_table, output_table,

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175895832 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -72,107 +73,127 @@ def mlp(schema_madlib, source_table, output_table,

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175873168 --- Diff: src/modules/convex/task/mlp.hpp --- @@ -111,6 +117,57 @@ class MLP { template double MLP::lambda = 0;

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175626817 --- Diff: src/modules/convex/mlp_igd.cpp --- @@ -130,6 +145,90 @@ mlp_igd_transition::run(AnyType ) { return state; } +/** + *

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175625333 --- Diff: src/modules/convex/algo/igd.hpp --- @@ -90,20 +90,27 @@ IGD::transition(state_type , for (int

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-20 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/243#discussion_r175621915 --- Diff: src/modules/convex/algo/igd.hpp --- @@ -90,20 +90,27 @@ IGD::transition(state_type , for (int

[GitHub] madlib pull request #246: DT user doc updates

2018-03-20 Thread fmcquillan99
GitHub user fmcquillan99 opened a pull request: https://github.com/apache/madlib/pull/246 DT user doc updates @rahiyer please review DT user doc updates Will start working on RF in parallel. You can merge this pull request into a Git repository by running: $ git pull

[GitHub] madlib pull request #244: Changes for Personalized Page Rank : Jira:1084

2018-03-20 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/244#discussion_r175663431 --- Diff: src/ports/postgres/modules/graph/pagerank.py_in --- @@ -149,25 +164,39 @@ def pagerank(schema_madlib, vertex_table, vertex_id, edge_table,

[GitHub] madlib pull request #244: Changes for Personalized Page Rank : Jira:1084

2018-03-20 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/244#discussion_r175664342 --- Diff: src/ports/postgres/modules/graph/test/pagerank.sql_in --- @@ -84,7 +89,8 @@ SELECT pagerank( NULL, NULL,

[GitHub] madlib pull request #244: Changes for Personalized Page Rank : Jira:1084

2018-03-20 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/244#discussion_r175627510 --- Diff: src/ports/postgres/modules/graph/pagerank.py_in --- @@ -527,14 +562,63 @@ def pagerank(schema_madlib, vertex_table, vertex_id, edge_table,

[GitHub] madlib pull request #244: Changes for Personalized Page Rank : Jira:1084

2018-03-20 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/244#discussion_r175665727 --- Diff: src/ports/postgres/modules/graph/pagerank.sql_in --- @@ -120,6 +121,10 @@ distribution per group. When this value is NULL, no grouping is used

[GitHub] madlib pull request #244: Changes for Personalized Page Rank : Jira:1084

2018-03-20 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/244#discussion_r175631615 --- Diff: src/ports/postgres/modules/graph/pagerank.sql_in --- @@ -273,6 +278,48 @@ SELECT * FROM pagerank_out_summary ORDER BY user_id; (2 rows)

[GitHub] madlib pull request #240: MLP: Fix step size initialization based on learnin...

2018-03-19 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/240 ---

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175548350 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed to the

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175588969 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed to the

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175593796 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed to the

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175531202 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed to the

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175522378 --- Diff: src/ports/postgres/modules/utilities/utilities.py_in --- @@ -794,6 +794,41 @@ def collate_plpy_result(plpy_result_rows): #

[GitHub] madlib pull request #241: MiniBatch Pre-Processor: Add new module minibatch_...

2018-03-19 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/241#discussion_r175585050 --- Diff: src/ports/postgres/modules/utilities/minibatch_preprocessing.py_in --- @@ -0,0 +1,559 @@ +# coding=utf-8 +# +# Licensed to the

[GitHub] madlib pull request #245: Reduce Install Check run time

2018-03-19 Thread jingyimei
GitHub user jingyimei opened a pull request: https://github.com/apache/madlib/pull/245 Reduce Install Check run time To reduce the total run time of install check, we looked at the top 5 modules that take longest and modified install check test cases. See each commit for details.

[GitHub] madlib pull request #244: Changes for Personalized Page Rank : Jira:1084

2018-03-16 Thread hpandeycodeit
GitHub user hpandeycodeit opened a pull request: https://github.com/apache/madlib/pull/244 Changes for Personalized Page Rank : Jira:1084 Jira : 1084 This PR contains changes for Personalized Page Rank. - Added extra parameter, nodes_of_interest in main pagerank

[GitHub] madlib pull request #240: MLP: Fix step size initialization based on learnin...

2018-03-16 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/240#discussion_r175224785 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -112,6 +112,7 @@ def mlp(schema_madlib, source_table, output_table, independent_varname,

[GitHub] madlib pull request #243: MLP: Add minibatch gradient descent solver

2018-03-16 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/243 MLP: Add minibatch gradient descent solver JIRA: MADLIB-1206 This commit adds support for mini-batch based gradient descent for MLP. If the input table contains a 2D matrix for

[GitHub] madlib pull request #240: MLP: Fix step size initialization based on learnin...

2018-03-09 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/240 MLP: Fix step size initialization based on learning rate policy JIRA: MADLIB-1212 The step_size is supposed to be updated based on the learning rate. The formulae for different

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-08 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r173254469 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +545,95 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-08 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r173254181 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +545,95 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-08 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r173239594 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +545,95 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-08 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r173238804 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +545,95 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r173080726 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -468,81 +544,107 @@ def balance_sample(schema_madlib, source_table, output_table,

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r173080410 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -58,28 +60,64 @@ NOSAMPLE = 'nosample' NEW_ID_COLUMN = '__madlib_id__'

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172954235 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -468,81 +544,107 @@ def balance_sample(schema_madlib, source_table, output_table,

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172953687 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -468,81 +544,107 @@ def balance_sample(schema_madlib, source_table, output_table,

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172958587 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -58,28 +60,64 @@ NOSAMPLE = 'nosample' NEW_ID_COLUMN = '__madlib_id__'

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172943311 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +544,90 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25 rows)

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172922334 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +544,90 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172921714 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +544,90 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172921328 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +544,90 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172920935 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +544,90 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172920825 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -543,6 +544,90 @@ SELECT * FROM output_table ORDER BY mainhue, name; (25

[GitHub] madlib pull request #239: Balance Sample: Add support for grouping

2018-03-07 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/239#discussion_r172920581 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -149,8 +149,10 @@ non-stratified, that is, the whole table is treated as a

[GitHub] madlib pull request #234: Encode categorical variables: create lower case co...

2018-02-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/234 ---

[GitHub] madlib pull request #236: DT: Ensure n_folds and null_proxy are set correctl...

2018-02-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/236 ---

[GitHub] madlib pull request #233: Install git on postgres centos 7 docker images.

2018-02-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/233 ---

[GitHub] madlib pull request #235: update KNN, DT and RF docs to match recent commits

2018-02-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/235 ---

[GitHub] madlib pull request #237: Bugfix: MLP predict using 1.12 model fails on late...

2018-02-23 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/237 ---

[GitHub] madlib pull request #238: MLP: Use array_upper to get the last array element

2018-02-22 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/238 ---

[GitHub] madlib pull request #237: Bugfix: MLP predict using 1.12 model fails on late...

2018-02-22 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/237#discussion_r169845958 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -749,8 +749,18 @@ def mlp_predict(schema_madlib, summary['layer_sizes'],

[GitHub] madlib pull request #237: Bugfix: MLP predict using 1.12 model fails on late...

2018-02-22 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/237#discussion_r169846145 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -796,14 +807,34 @@ def mlp_predict(schema_madlib, else: # if not

[GitHub] madlib pull request #237: Bugfix: MLP predict using 1.12 model fails on late...

2018-02-22 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/237#discussion_r169846321 --- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in --- @@ -796,14 +807,34 @@ def mlp_predict(schema_madlib, else: # if not

[GitHub] madlib pull request #238: MLP: Use array_upper to get the last array element

2018-02-21 Thread iyerr3
GitHub user iyerr3 opened a pull request: https://github.com/apache/madlib/pull/238 MLP: Use array_upper to get the last array element JIRA: MADLIB-1209 Postgresql arrays can be indexed in an arbitrary range. Hence, array_length is not necessarily the last element of

[GitHub] madlib pull request #237: Bugfix: MLP predict using 1.12 model fails on late...

2018-02-21 Thread njayaram2
GitHub user njayaram2 opened a pull request: https://github.com/apache/madlib/pull/237 Bugfix: MLP predict using 1.12 model fails on later versions JIRA: MADLIB-1207 MADlib 1.12 did not support grouping in MLP. The summary table created used to have the mean and std

[GitHub] madlib pull request #232: Multiple LDA improvements and fixes

2018-02-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/232 ---

[GitHub] madlib pull request #:

2018-02-16 Thread iyerr3
Github user iyerr3 commented on the pull request: https://github.com/apache/madlib/commit/b3d528c44c01f507cd18e1676d65698a46366b10#commitcomment-27615498 In src/ports/postgres/modules/utilities/encode_categorical.py_in: In

[GitHub] madlib pull request #235: update KNN, DT and RF docs to match recent commits

2018-02-15 Thread fmcquillan99
Github user fmcquillan99 commented on a diff in the pull request: https://github.com/apache/madlib/pull/235#discussion_r168557191 --- Diff: src/ports/postgres/modules/recursive_partitioning/random_forest.sql_in --- @@ -208,13 +208,26 @@ forest_train(training_table_name,

[GitHub] madlib pull request #234: Create lower case column name in encode_categorica...

2018-02-15 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/234#discussion_r168525141 --- Diff: src/ports/postgres/modules/utilities/encode_categorical.py_in --- @@ -317,7 +317,19 @@ class CategoricalEncoder(object): if

[GitHub] madlib pull request #236: DT: Ensure n_folds and null_proxy are set correctl...

2018-02-15 Thread iyerr3
GitHub user iyerr3 opened a pull request: https://github.com/apache/madlib/pull/236 DT: Ensure n_folds and null_proxy are set correctly The summary table in Decision Tree included two entries: k and null_proxy. The 'k' value is supposed to reflect the 'n_folds' value but was

[GitHub] madlib pull request #235: update KNN, DT and RF docs to match recent commits

2018-02-15 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/235#discussion_r168523757 --- Diff: src/ports/postgres/modules/recursive_partitioning/random_forest.sql_in --- @@ -208,13 +208,26 @@ forest_train(training_table_name,

[GitHub] madlib pull request #235: update KNN, DT and RF docs to match recent commits

2018-02-13 Thread fmcquillan99
GitHub user fmcquillan99 opened a pull request: https://github.com/apache/madlib/pull/235 update KNN, DT and RF docs to match recent commits KNN * describe weighted average in more detail DT & RF * correct some doc errors and omissions * update example to show

[GitHub] madlib pull request #234: Create lower case column name in encode_categorica...

2018-02-13 Thread jingyimei
GitHub user jingyimei opened a pull request: https://github.com/apache/madlib/pull/234 Create lower case column name in encode_categorical_variables() JIRA:MADLIB-1202 The previous madlib.encode_categorical_variables() function generates column name with some capital

[GitHub] madlib pull request #232: Multiple LDA improvements and fixes

2018-02-12 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/232#discussion_r167717958 --- Diff: src/ports/postgres/modules/utilities/text_utilities.sql_in --- @@ -74,175 +81,231 @@ tasks related to text. Flag to indicate if a

[GitHub] madlib pull request #232: Multiple LDA improvements and fixes

2018-02-12 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/232#discussion_r167715245 --- Diff: src/ports/postgres/modules/utilities/text_utilities.sql_in --- @@ -74,175 +81,231 @@ tasks related to text. Flag to indicate if a

[GitHub] madlib pull request #232: Multiple LDA improvements and fixes

2018-02-12 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/232#discussion_r167708065 --- Diff: src/ports/postgres/modules/lda/lda.sql_in --- @@ -182,324 +105,789 @@ lda_train( data_table, \b Arguments data_table -

[GitHub] madlib pull request #232: Multiple LDA improvements and fixes

2018-02-12 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/232#discussion_r167708360 --- Diff: src/ports/postgres/modules/lda/lda.sql_in --- @@ -182,324 +105,789 @@ lda_train( data_table, \b Arguments data_table -

[GitHub] madlib pull request #232: Multiple LDA improvements and fixes

2018-02-12 Thread jingyimei
Github user jingyimei commented on a diff in the pull request: https://github.com/apache/madlib/pull/232#discussion_r167709835 --- Diff: src/ports/postgres/modules/lda/lda.sql_in --- @@ -182,324 +105,789 @@ lda_train( data_table, \b Arguments data_table -

[GitHub] madlib pull request #231: RF: Output non-negative importance values

2018-02-08 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/231 ---

[GitHub] madlib pull request #230: Balanced sets final

2018-02-08 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/madlib/pull/230 ---

[GitHub] madlib pull request #231: RF: Output non-negative importance values

2018-02-06 Thread iyerr3
GitHub user iyerr3 opened a pull request: https://github.com/apache/madlib/pull/231 RF: Output non-negative importance values Variable importance is computed in RF as the difference in prediction accuracy between original data and permuted data from out-of-bag samples (OOB).

[GitHub] madlib pull request #230: Balanced sets final

2018-02-05 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r166056096 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165529067 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165523942 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165737045 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165736819 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165526039 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165527448 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165479832 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165505215 --- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in --- @@ -0,0 +1,355 @@ +/*

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165530126 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165733978 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165516241 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165537394 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #230: Balanced sets final

2018-02-02 Thread kaknikhil
Github user kaknikhil commented on a diff in the pull request: https://github.com/apache/madlib/pull/230#discussion_r165734791 --- Diff: src/ports/postgres/modules/sample/balance_sample.py_in --- @@ -0,0 +1,748 @@ +# coding=utf-8 +# +# Licensed to the Apache Software

[GitHub] madlib pull request #223: Balance datasets : re-sampling technique

2018-01-31 Thread Swatisoni
Github user Swatisoni closed the pull request at: https://github.com/apache/madlib/pull/223 ---

[GitHub] madlib pull request #225: Added option for weighted average for both classif...

2018-01-29 Thread hpandeycodeit
Github user hpandeycodeit commented on a diff in the pull request: https://github.com/apache/madlib/pull/225#discussion_r164594150 --- Diff: src/ports/postgres/modules/knn/test/knn.sql_in --- @@ -72,43 +72,55 @@ copy knn_test_data (id, data) from stdin delimiter '|'; \.

[GitHub] madlib pull request #225: Added option for weighted average for both classif...

2018-01-29 Thread hpandeycodeit
Github user hpandeycodeit commented on a diff in the pull request: https://github.com/apache/madlib/pull/225#discussion_r164593774 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -212,22 +244,27 @@ def knn(schema_madlib, point_source, point_column_name, point_id,

[GitHub] madlib pull request #229: SVM: Add minibatch as a new solver

2018-01-24 Thread orhankislal
Github user orhankislal commented on a diff in the pull request: https://github.com/apache/madlib/pull/229#discussion_r163690557 --- Diff: src/modules/convex/linear_svm_igd.cpp --- @@ -120,6 +124,98 @@ linear_svm_igd_transition::run(AnyType ) { return state; }

[GitHub] madlib pull request #229: SVM: Add minibatch as a new solver

2018-01-24 Thread orhankislal
Github user orhankislal commented on a diff in the pull request: https://github.com/apache/madlib/pull/229#discussion_r163120094 --- Diff: src/modules/convex/linear_svm_igd.cpp --- @@ -120,6 +124,100 @@ linear_svm_igd_transition::run(AnyType ) { return state; }

[GitHub] madlib pull request #225: Added option for weighted average for both classif...

2018-01-24 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/225#discussion_r163653414 --- Diff: src/ports/postgres/modules/knn/knn.py_in --- @@ -212,22 +244,27 @@ def knn(schema_madlib, point_source, point_column_name, point_id,

[GitHub] madlib pull request #225: Added option for weighted average for both classif...

2018-01-24 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/225#discussion_r163657952 --- Diff: src/ports/postgres/modules/knn/test/knn.sql_in --- @@ -72,43 +72,55 @@ copy knn_test_data (id, data) from stdin delimiter '|'; \.

[GitHub] madlib pull request #229: SVM: Add minibatch as a new solver

2018-01-22 Thread njayaram2
Github user njayaram2 commented on a diff in the pull request: https://github.com/apache/madlib/pull/229#discussion_r163048733 --- Diff: src/modules/convex/algo/igd.hpp --- @@ -56,6 +59,62 @@ IGD::transition(state_type ,

[GitHub] madlib pull request #229: SVM: Add minibatch as a new solver

2018-01-22 Thread iyerr3
Github user iyerr3 commented on a diff in the pull request: https://github.com/apache/madlib/pull/229#discussion_r163022926 --- Diff: src/modules/convex/algo/igd.hpp --- @@ -34,7 +34,10 @@ class IGD { typedef typename Task::model_type model_type; static void
