GitHub user iyerr3 opened a pull request: https://github.com/apache/madlib/pull/259
Minibatch: Add one-hot encoding option for int

JIRA: MADLIB-1226

Integer dependent variables can be used in either regression or
classification. To be used in classification, they need to be one-hot
encoded. This commit adds an option that lets users choose whether an
integer dependent input is one-hot encoded. The flag is ignored if the
variable is not of integer type.

Other changes include adding an appropriate test in install-check,
code cleanup, and PEP8 conformance.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/madlib/madlib feature/minibatch_one_hot_encode

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/madlib/pull/259.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #259

----
commit 4729973d4e477cfef42cb21f8b8a3778171a5a3d
Author: Rahul Iyer <riyer@...>
Date:   2018-04-10T19:34:23Z

    Minibatch: Add one-hot encoding option for int

    JIRA: MADLIB-1226

    Integer dependent variables can be used in either regression or
    classification. To be used in classification, they need to be one-hot
    encoded. This commit adds an option that lets users choose whether an
    integer dependent input is one-hot encoded. The flag is ignored if the
    variable is not of integer type.

    Other changes include adding an appropriate test in install-check,
    code cleanup, and PEP8 conformance.

----
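
For readers unfamiliar with the behavior described in the commit message, the following is a minimal sketch of the idea, not the code in this pull request. The function names `one_hot` and `encode_dependent` and the parameter `one_hot_encode_int` are hypothetical. The handling of non-integer inputs (floats passed through, other types always encoded) is an assumption made for illustration; the commit message states only that the flag is ignored for them.

```python
def one_hot(values):
    """Map each value to a 0/1 indicator vector over the sorted distinct levels."""
    levels = sorted(set(values))
    index = {level: i for i, level in enumerate(levels)}
    return [[1 if index[v] == j else 0 for j in range(len(levels))]
            for v in values]


def _is_int(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool)


def encode_dependent(values, one_hot_encode_int=False):
    """Hypothetical helper: encode a dependent variable column.

    Integer columns are one-hot encoded only when the flag is set.
    For every other type the flag is ignored.
    """
    if all(_is_int(v) for v in values):
        # Integer input: the flag decides between classification
        # (one-hot) and regression (values kept as they are).
        return one_hot(values) if one_hot_encode_int else list(values)
    if all(isinstance(v, float) or _is_int(v) for v in values):
        # Assumption: other numeric input is treated as a regression target.
        return list(values)
    # Assumption: non-numeric input (e.g. text) is always one-hot encoded.
    return one_hot(values)
```

With the flag set, `encode_dependent([2, 0, 1, 2], one_hot_encode_int=True)` yields one indicator vector per row over the levels 0, 1, 2; with the flag left at its default, the same integer column is returned unchanged for use as a regression target.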