Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/189
There is a minor risk in doing this. GPDB does not support all features in
the current Postgres and I believe the documentation is trying to redirect both
GPDB and Postgres users.
---
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/189
Looks great.
On Fri, Oct 6, 2017 at 1:03 PM, Ed Espino wrote:
> @rahiyer <https://github.com/rahiyer> - Collaborating with Lisa Owen (
> @lisakowen <https://githu
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/191#discussion_r146646897
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -135,13 +135,17 @@ def knn(schema_madlib, point_source,
point_column_name, point_id, label_column_n
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/191#discussion_r146646995
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -135,13 +135,17 @@ def knn(schema_madlib, point_source,
point_column_name, point_id, label_column_n
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/191#discussion_r146703574
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -215,7 +222,8 @@ def knn(schema_madlib, point_source, point_column_name,
point_id, label_column_n
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/191#discussion_r146702725
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -135,13 +135,17 @@ def knn(schema_madlib, point_source,
point_column_name, point_id, label_column_n
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/191#discussion_r146702329
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -135,13 +135,17 @@ def knn(schema_madlib, point_source,
point_column_name, point_id, label_column_n
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/191#discussion_r146702427
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -135,13 +135,17 @@ def knn(schema_madlib, point_source,
point_column_name, point_id, label_column_n
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/191
jenkins ok to retest
---
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/192#discussion_r147543845
--- Diff: src/ports/postgres/modules/convex/lmf_igd.py_in ---
@@ -33,40 +34,45 @@ def compute_lmf_igd(schema_madlib, rel_args, rel_state,
rel_source
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/195#discussion_r150637914
--- Diff: src/ports/postgres/modules/utilities/utilities.py_in ---
@@ -709,6 +709,17 @@ def _check_groups(tbl1, tbl2, grp_list):
return ' AND &
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/195#discussion_r150638553
--- Diff: src/ports/postgres/modules/utilities/validate_args.py_in ---
@@ -262,6 +262,13 @@ def get_first_schema(table_name):
return None
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/195
Not sure what the `*_for_centrality_measures` names mean - are those
functions only used in centrality measures?
---
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/197#discussion_r150640177
--- Diff: src/madpack/upgrade_util.py ---
@@ -142,11 +142,11 @@ def _load(self):
"""
# _mad_dbrev = 1.9.1
-
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/197#discussion_r150640686
--- Diff: src/madpack/upgrade_util.py ---
@@ -142,11 +142,11 @@ def _load(self):
"""
# _mad_dbrev = 1.9.1
-
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/197#discussion_r150684741
--- Diff: src/madpack/upgrade_util.py ---
@@ -142,11 +142,11 @@ def _load(self):
"""
# _mad_dbrev = 1.9.1
-
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/197#discussion_r150697403
--- Diff: src/madpack/upgrade_util.py ---
@@ -142,11 +142,11 @@ def _load(self):
"""
# _mad_dbrev = 1.9.1
-
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/200
Madpack: Move unit tests + refactor minor code
1. Unit tests for get_rev_num and is_rev_gte moved to the correct
location
2. GPDB 5/6 version extraction made more general
3. Bare except
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/201
Allow array feature with more than 1664 entries
JIRA: MADLIB-1173
The tree_predict function concatenates cat_feature_str and
con_feature_str in summary table to obtain the feature string
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/202
Multiple: Add casting to allow compilation in GCC 6+
JIRA: MADLIB-1025
GCC 6+ introduced stricter rules for implicit casting where loss of
information is possible.
Closes #202
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/195#discussion_r151526779
--- Diff: src/ports/postgres/modules/graph/hits.py_in ---
@@ -95,234 +109,391 @@ def hits(schema_madlib, vertex_table, vertex_id,
edge_table, edge_args
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/203
Build: Create single binary for all PG10 versions
JIRA: MADLIB-1179
PostgreSQL, starting with 10.0, is switching to semantic versioning (see
https://www.postgresql.org/support/versioning
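To illustrate the versioning change mentioned above: before PostgreSQL 10 the major version was the first two numbers (e.g. 9.6), while from 10 onward it is the first number alone, so 10.0 and 10.1 share one major version. The sketch below is only an illustration of that rule; the function name `pg_major_version` is hypothetical and is not taken from the MADlib build scripts.

```python
def pg_major_version(version):
    """Return the major-version string for a dotted PostgreSQL version.

    Hypothetical helper for illustration only. It handles plain numeric
    versions such as "9.6.3" or "10.1", not pre-release tags.
    """
    parts = version.split(".")
    if int(parts[0]) >= 10:
        # From PostgreSQL 10 onward the first number alone is the major version.
        return parts[0]
    # Before 10, the first two numbers together form the major version.
    return ".".join(parts[:2])


if __name__ == "__main__":
    for v in ("9.6.3", "10.0", "10.1"):
        print(v, "->", pg_major_version(v))
```

Under this rule a single build keyed on the major version would cover every 10.x release.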
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/204#discussion_r152129858
--- Diff: src/ports/postgres/modules/knn/test/knn.sql_in ---
@@ -73,23 +73,23 @@ copy knn_test_data (id, data) from stdin delimiter
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/204#discussion_r152123514
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -88,12 +88,28 @@ def knn_validate_src(schema_madlib, point_source,
point_column_name, point_id
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/204#discussion_r152122999
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -88,12 +88,28 @@ def knn_validate_src(schema_madlib, point_source,
point_column_name, point_id
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/204#discussion_r152122803
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -88,12 +88,28 @@ def knn_validate_src(schema_madlib, point_source,
point_column_name, point_id
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/195#discussion_r152136163
--- Diff: src/ports/postgres/modules/utilities/utilities.py_in ---
@@ -709,16 +709,35 @@ def _check_groups(tbl1, tbl2, grp_list):
return ' AND &
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/204#discussion_r152419848
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -89,20 +89,20 @@ def knn_validate_src(schema_madlib, point_source,
point_column_name, point_id
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/204#discussion_r152420416
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -155,6 +155,9 @@ def knn(schema_madlib, point_source, point_column_name,
point_id, label_column_n
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/204#discussion_r152420629
--- Diff: src/ports/postgres/modules/knn/test/knn.sql_in ---
@@ -73,23 +73,23 @@ copy knn_test_data (id, data) from stdin delimiter
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/204#discussion_r152420323
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -89,20 +89,20 @@ def knn_validate_src(schema_madlib, point_source,
point_column_name, point_id
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/205#discussion_r152463529
--- Diff: src/ports/postgres/modules/graph/hits.sql_in ---
@@ -103,18 +102,18 @@ It is named by adding the suffix '_summary' to the
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/205#discussion_r152463581
--- Diff: src/ports/postgres/modules/graph/hits.sql_in ---
@@ -103,18 +102,18 @@ It is named by adding the suffix '_summary' to the
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/205#discussion_r152463751
--- Diff: src/ports/postgres/modules/regress/linear.sql_in ---
@@ -183,16 +183,15 @@ FROM (
@par Prediction Function
The prediction function is as
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/205
Commit daf67f81b merges this PR. This can be closed.
---
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/204
Changes look good. I'll merge this.
---
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/206#discussion_r154242803
--- Diff: src/ports/postgres/modules/stats/correlation.sql_in ---
@@ -207,8 +203,9 @@ Result:
@par Notes
-Current implementation ignores
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/206#discussion_r154241686
--- Diff: src/ports/postgres/modules/stats/correlation.py_in ---
@@ -180,31 +180,29 @@ def _populate_output_table(schema_madlib,
source_table, output_table
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/206#discussion_r154242141
--- Diff: src/ports/postgres/modules/stats/correlation.py_in ---
@@ -180,31 +180,29 @@ def _populate_output_table(schema_madlib,
source_table, output_table
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/206
Also please change commit message to better indicate the intention of the
commit.
Something along the lines of
"Correlation: Impute NULL values with mean"
---
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/206#discussion_r154422395
--- Diff: src/ports/postgres/modules/stats/correlation.py_in ---
@@ -165,8 +165,8 @@ def _populate_output_table(schema_madlib, source_table,
output_table
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/206#discussion_r154422562
--- Diff: src/ports/postgres/modules/stats/correlation.sql_in ---
@@ -207,8 +204,17 @@ Result:
@par Notes
-Current implementation ignores
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/208
Looks good, will merge this.
---
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/211#discussion_r155598556
--- Diff: deploy/CMakeLists.txt ---
@@ -82,4 +82,4 @@ cpack_add_component_group(ports
file(GLOB PORT_COMPONENTS "${CMAKE_CURRENT_BINARY_DIR}/Compo
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/211#discussion_r155598729
--- Diff: deploy/gppkg/CMakeLists.txt ---
@@ -2,8 +2,11 @@
# Packaging for Greenplum's
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/211#discussion_r155598502
--- Diff: cmake/LinuxUtils.cmake ---
@@ -9,3 +9,14 @@ macro(rh_version OUT_VERSION)
set(${OUT_VERSION} "${OUT_VERSION}-NOTFOUND")
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/211#discussion_r155598453
--- Diff: CMakeLists.txt ---
@@ -275,4 +275,3 @@ install(CODE "
${CMAKE_MADLIB_ROOT}/doc
)
")
-
--
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/211#discussion_r155599007
--- Diff: src/ports/greenplum/cmake/GreenplumUtils.cmake ---
@@ -17,6 +17,9 @@ function(add_gppkg GPDB_VERSION GPDB_VARIANT
GPDB_VARIANT_SHORT
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/211#discussion_r155598266
--- Diff: cmake/LinuxUtils.cmake ---
@@ -9,3 +9,14 @@ macro(rh_version OUT_VERSION)
set(${OUT_VERSION} "${OUT_VERSION}-NOTFOUND")
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/211#discussion_r155597507
--- Diff: CMakeLists.txt ---
@@ -275,4 +275,3 @@ install(CODE "
${CMAKE_MADLIB_ROOT}/doc
)
")
--
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/214#discussion_r156472963
--- Diff: src/ports/postgres/modules/stats/correlation.py_in ---
@@ -179,9 +179,11 @@ def _populate_output_table(schema_madlib,
source_table, output_table
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/215
Is there overlap between this and #213? Maybe consolidate to a single PR?
---
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/216
Release: Upgrade to v1.13
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/iyerr3/incubator-madlib
feature/upgrade_to_1.13
Alternatively you
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/216
Upgrade has been tested with Postgres 9.6 and Greenplum 4.3 and 5.0. Merging
this PR.
---
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/217
Release: Update RELEASE_NOTES for v1.13
JIRA: MADLIB-1189
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/iyerr3/incubator-madlib
feature
Github user iyerr3 closed the pull request at:
https://github.com/apache/madlib/pull/217
---
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/217
Closed this in d0ad93d261337661e40312caa9168eb9d6dc761f
---
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/219
Multiple: Hard-wire values for construct_array calls
JIRA: MADLIB-1185
Original investigation and RCA performed by
Nikhil Kak and
Orhan Kislal
Multiple modules called
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/221
Multiple: Hard-wire values for construct_array calls
JIRA: MADLIB-1185
Original investigation and RCA performed by
Nikhil Kak and
Orhan Kislal
Multiple modules called
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/220
This can be closed since merged by
d025bb4609baeb7c7a1d136590780a8fafdee208.
---
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/222
+1
---
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/225#discussion_r162371982
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -211,23 +222,43 @@ def knn(schema_madlib, point_source,
point_column_name, point_id
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/225#discussion_r162369682
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -167,22 +169,31 @@ def knn(schema_madlib, point_source,
point_column_name, point_id
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/225#discussion_r162369645
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -167,22 +169,31 @@ def knn(schema_madlib, point_source,
point_column_name, point_id
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/225#discussion_r162369486
--- Diff: src/ports/postgres/modules/knn/knn.py_in ---
@@ -167,22 +169,31 @@ def knn(schema_madlib, point_source,
point_column_name, point_id
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/229
SVM: Add minibatch as a new solver
Additional author: Nikhil Kak
This work is based on the original work by Xiaocheng Tang
in #75.
This PR adds two main features:
1. A
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/229#discussion_r163022926
--- Diff: src/modules/convex/algo/igd.hpp ---
@@ -34,7 +34,10 @@ class IGD {
typedef typename Task::model_type model_type;
static void
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/231
RF: Output non-negative importance values
Variable importance is computed in RF as the difference in prediction
accuracy between original data and permuted data from out-of-bag
samples (OOB
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/231
This change ensures that all variable importance values are positive. The
remaining properties remain as is: i.e. the feature with max value is most
important and the values are not normalized
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/235#discussion_r168523662
--- Diff:
src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in ---
@@ -355,6 +355,19 @@ tree_train(
independent_var_types
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/235#discussion_r168523757
--- Diff:
src/ports/postgres/modules/recursive_partitioning/random_forest.sql_in ---
@@ -208,13 +208,26 @@ forest_train(training_table_name
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/236
DT: Ensure n_folds and null_proxy are set correctly
The summary table in Decision Tree included two entries: k and
null_proxy. The 'k' value is supposed to reflect the 'n_folds
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/234#discussion_r168525141
--- Diff: src/ports/postgres/modules/utilities/encode_categorical.py_in ---
@@ -317,7 +317,19 @@ class CategoricalEncoder(object):
if
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/232#discussion_r168526426
--- Diff: src/ports/postgres/modules/lda/lda.py_in ---
@@ -120,14 +120,22 @@ class LDATrainer:
# etime = time.time()
# plpy.notice
Github user iyerr3 commented on the pull request:
https://github.com/apache/madlib/commit/b3d528c44c01f507cd18e1676d65698a46366b10#commitcomment-27615498
In src/ports/postgres/modules/utilities/encode_categorical.py_in:
In src/ports/postgres/modules/utilities
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/234
LGTM
---
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/234
I've pushed a commit (912a4d629) to `madlib/madlib` repo, branch
`feature/encode_categorial_column_name_change` to address above issues. That
commit can be cherry-picked here to continue with this PR.
---
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/234
Please note that in the above example `height>.10_false` and `height>.10_true`
will have to be double quoted when referred to. The lower case `false` and
`true` do not eliminate the quotes i
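As a rough sketch of why such column names need quoting: in PostgreSQL an unquoted identifier may contain only letters, digits, underscores and dollar signs, and is folded to lower case, so a name containing `>` or `.` must be wrapped in double quotes. The helper below is hypothetical (it is not MADlib code) and deliberately ignores reserved keywords.

```python
import re

# Names that are safe to use unquoted: lower-case letters, digits,
# underscores and dollar signs, not starting with a digit or dollar sign.
# Assumption: reserved keywords are not checked in this sketch.
_SAFE_IDENT = re.compile(r"^[a-z_][a-z0-9_$]*$")


def quote_ident_sketch(name):
    """Double-quote a column name unless it is safe to use unquoted."""
    if _SAFE_IDENT.match(name):
        return name
    # Embedded double quotes are escaped by doubling them.
    return '"' + name.replace('"', '""') + '"'


if __name__ == "__main__":
    for col in ("height", "height>.10_false", "Height"):
        print(col, "->", quote_ident_sketch(col))
```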
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/238
MLP: Use array_upper to get the last array element
JIRA: MADLIB-1209
PostgreSQL arrays can be indexed in an arbitrary range. Hence,
array_length is not necessarily the last element of
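The distinction can be sketched with a toy model. PostgreSQL arrays default to a lower bound of 1 but may be declared with another one, and reading a subscript outside the bounds yields NULL. The `PgArray` class below is a hypothetical Python stand-in used only to show the arithmetic; it is not how MADlib or PostgreSQL implements arrays.

```python
class PgArray:
    """Toy model of a one-dimensional PostgreSQL array with a lower bound."""

    def __init__(self, values, lower=1):
        self.values = list(values)
        self.lower = lower  # subscript of the first element

    def array_length(self):
        # Number of elements, independent of the lower bound.
        return len(self.values)

    def array_upper(self):
        # Subscript of the last element.
        return self.lower + len(self.values) - 1

    def get(self, index):
        offset = index - self.lower
        if 0 <= offset < len(self.values):
            return self.values[offset]
        return None  # models NULL for an out-of-range subscript


if __name__ == "__main__":
    arr = PgArray([10, 20, 30], lower=0)
    print(arr.get(arr.array_length()))  # subscript 3 is out of range
    print(arr.get(arr.array_upper()))   # subscript 2 is the last element
```

With the default lower bound of 1 the two subscripts coincide, which is why the difference only shows up for arrays with a non-default lower bound.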
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/237#discussion_r169846145
--- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in ---
@@ -796,14 +807,34 @@ def mlp_predict(schema_madlib,
else:
# if not
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/237#discussion_r169846321
--- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in ---
@@ -796,14 +807,34 @@ def mlp_predict(schema_madlib,
else:
# if not
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/237#discussion_r169845958
--- Diff: src/ports/postgres/modules/convex/mlp_igd.py_in ---
@@ -749,8 +749,18 @@ def mlp_predict(schema_madlib,
summary['layer_
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/239
Balance Sample: Add support for grouping
JIRA: MADLIB-1168
This commit adds grouping support for balanced sampling.
Grouping is implemented as a loop over the existing logic,
with
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/239#discussion_r172941914
--- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in ---
@@ -543,6 +544,90 @@ SELECT * FROM output_table ORDER BY mainhue, name;
(25 rows
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/239#discussion_r172943311
--- Diff: src/ports/postgres/modules/sample/balance_sample.sql_in ---
@@ -543,6 +544,90 @@ SELECT * FROM output_table ORDER BY mainhue, name;
(25 rows
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/239#discussion_r173080410
--- Diff: src/ports/postgres/modules/sample/balance_sample.py_in ---
@@ -58,28 +60,64 @@ NOSAMPLE = 'nosample'
NEW_ID_COLUMN =
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/239#discussion_r173080726
--- Diff: src/ports/postgres/modules/sample/balance_sample.py_in ---
@@ -468,81 +544,107 @@ def balance_sample(schema_madlib, source_table,
output_table
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/242
PCA: Fix issue with text grouping col input
JIRA: MADLIB-1215
PCA fails when the grouping column is a text column (a common use case).
This is because the column is compared to its
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/246#discussion_r175924018
--- Diff:
src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in ---
@@ -127,7 +132,11 @@ tree_train(
weights (optional
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/246#discussion_r175927937
--- Diff:
src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in ---
@@ -418,7 +468,10 @@ tree_predict(tree_model,
new_data_table
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/247
SVM: Revert minibatch-related work
This commit is a partial revert of a8bbe08.
Minibatch was not found to be useful for SVM and is broken due to recent
changes. We're disablin
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/247
@kaknikhil
Yes, both those changes are related to MLP and ideally should have been in
a different commit. We can still make it that way by reverting them here and
then reintroducing these
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/248
DT: Ensure proper quoting in grouping coalesce
JIRA: MADLIB-1217
Grouping column value is coalesced with null_proxy to get the right null
identifier when null_as_category is True. The
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/248
I've combined two independent commits into a single PR, since the changes
in the commits are close by and would benefit from the same reviewer looking at
both changes.
---
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/248
Commits 97064e2 and 2ad0bf7 will eventually be combined in ffd6355 before
merging.
---
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/246#discussion_r176661267
--- Diff:
src/ports/postgres/modules/recursive_partitioning/random_forest.sql_in ---
@@ -97,327 +264,220 @@ forest_train(training_table_name
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/246#discussion_r176660561
--- Diff:
src/ports/postgres/modules/recursive_partitioning/decision_tree.sql_in ---
@@ -127,18 +128,20 @@ tree_train(
grouping_cols (optional
GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/249
RF: Use NULL::integer[] when no continuous features
JIRA: MADLIB-1219
When variable importance is enabled, to compute importance score,
distribution of the categorical and continuous
Github user iyerr3 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/248#discussion_r177503935
--- Diff:
src/ports/postgres/modules/recursive_partitioning/decision_tree.py_in ---
@@ -970,16 +970,35 @@ def _get_bins_grps
Github user iyerr3 commented on the issue:
https://github.com/apache/madlib/pull/249
@kaknikhil Thanks for the suggestion. It makes sense to use that function
with integer types excluded (since DT does not treat integer as continuous).
I'll add that as a separate commit since