GitHub user njayaram2 opened a pull request:
https://github.com/apache/incubator-madlib/pull/35
Elastic Net: Skip arrays with NULL values in train
Jira: MADLIB-978
NULL values in an input array of the training data led
to an unhandled exception. This fix catches the exception
and skips such input arrays.
The fix also modifies the code that normalizes the input
data (the independent variable) so that arrays containing
NULLs are ignored during normalization.
Normalization uses the number of rows in the input table.
The query that obtains this count now counts only those
rows whose array has no NULL values.
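As a rough illustration of the row filtering, counting, and normalization described above, here is a plain-Python sketch. It is not the MADlib code: the function names (`has_null`, `count_valid_rows`, `normalize_features`) are hypothetical, NULL is modeled as `None`, and the use of the population standard deviation is an assumption made only for this example.

```python
import math


def has_null(features):
    """Return True if the array is missing or contains a NULL (None) entry."""
    return features is None or any(v is None for v in features)


def count_valid_rows(rows):
    """Count only the rows whose feature array has no NULL values."""
    return sum(1 for features in rows if not has_null(features))


def normalize_features(rows):
    """Standardize each column using only rows without NULLs.

    Rows containing NULLs are skipped and do not appear in the output.
    Assumption: population standard deviation; constant columns map to 0.0.
    """
    valid = [features for features in rows if not has_null(features)]
    n = len(valid)
    if n == 0:
        return []
    dim = len(valid[0])
    means = [sum(x[j] for x in valid) / n for j in range(dim)]
    stds = [
        math.sqrt(sum((x[j] - means[j]) ** 2 for x in valid) / n)
        for j in range(dim)
    ]
    return [
        [(x[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(dim)]
        for x in valid
    ]
```

The point of the sketch is that the row count and the column statistics are both taken over the same filtered set of rows, so a skipped array cannot skew the normalization.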
The mean of the dependent variable was still computed using
an SQL command that did not ignore rows whose independent
variable (array) contained NULLs. Such rows are now ignored
when computing the mean.
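A minimal Python sketch of the same idea for the dependent variable follows. The function name `dependent_mean` and the `(features, y)` row layout are hypothetical, and the sketch assumes the dependent value itself is never NULL; the actual fix is done in SQL inside MADlib.

```python
def dependent_mean(rows):
    """Mean of y over rows whose feature array has no NULL (None) values.

    rows: iterable of (features, y) pairs. Returns None if no row qualifies.
    """
    kept = [
        y
        for features, y in rows
        if features is not None and all(v is not None for v in features)
    ]
    if not kept:
        return None
    return sum(kept) / len(kept)
```

Filtering here with the same predicate used for normalization keeps the mean consistent with the rows that are actually used in training.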
@mktal
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/njayaram2/incubator-madlib elasticnet_train
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-madlib/pull/35.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #35