GitHub user iyerr3 opened a pull request:
https://github.com/apache/madlib/pull/201
Allow array feature with more than 1664 entries
JIRA: MADLIB-1173
The tree_predict function concatenates cat_feature_str and
con_feature_str from the summary table to obtain the feature string.
For an array feature, this string lists each indexed element
individually. The concatenated string is used in a SELECT operation for
validation, and a SELECT is limited to 1664 target entries. To allow
arrays with more than 1664 entries, the validation has been updated to
use the original feature string, which contains the array feature name
instead of its indexed elements.
Closes #201
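
As a rough illustration of the problem described above (not the actual
MADlib code), the sketch below compares the two ways of building the
validation column list. All function names, the table name, and the
feature name "x" are hypothetical; the 1664 figure is the target-entry
limit quoted in the description.

```python
# Hypothetical sketch: these helpers are NOT part of MADlib.
# They only illustrate why an expanded array feature can exceed
# the SELECT target-entry limit quoted in the description.

MAX_TARGET_ENTRIES = 1664  # limit on entries in a SELECT target list


def expanded_feature_list(array_name, n_elements):
    """Return one column expression per array element, e.g. x[1], x[2], ..."""
    return ["{0}[{1}]".format(array_name, i) for i in range(1, n_elements + 1)]


def fits_in_select(features):
    """True if the feature list stays within the target-entry limit."""
    return len(features) <= MAX_TARGET_ENTRIES


def validation_query(features, table):
    """Build a query that only checks that the expressions are valid."""
    return "SELECT {0} FROM {1} LIMIT 0".format(", ".join(features), table)


# An array feature "x" with 2000 elements (assumed size for illustration).
expanded = expanded_feature_list("x", 2000)   # 2000 target entries
original = ["x"]                              # 1 target entry

print(fits_in_select(expanded))               # expanded form exceeds the limit
print(fits_in_select(original))               # array name alone does not
print(validation_query(original, "test_table"))
```

Validating against the array name keeps the number of target entries
independent of the array length, which is the effect this change aims
for.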
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/iyerr3/incubator-madlib bugfix/dt_array_features
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/madlib/pull/201.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #201
----
commit fc374cc854d957923a713901900aacd306527a77
Author: Rahul Iyer <[email protected]>
Date: 2017-11-13T18:49:29Z
DT: Consolidate tree_rmse and tree_misclassified
commit feb9781aa1c6fee52eb94e6863998f8aadd77217
Author: Rahul Iyer <[email protected]>
Date: 2017-11-13T21:42:10Z
DT: Validate original feature string in tree_predict
JIRA: MADLIB-1173
The tree_predict function concatenates cat_feature_str and
con_feature_str from the summary table to obtain the feature string.
For an array feature, this string lists each indexed element
individually. The concatenated string is used in a SELECT operation for
validation, and a SELECT is limited to 1664 target entries. To allow
arrays with more than 1664 entries, the validation has been updated to
use the original feature string, which contains the array feature name
instead of its indexed elements.
Closes #201
----