Nandish Jayaram created MADLIB-1203:
Summary: k-NN Interface changes for classification and regression
Project: Apache MADlib
Issue Type: Improvement
Reporter: Nandish Jayaram
Fix For: v1.14
k-NN currently has a single function for both classification and regression. To be
consistent with other modules such as MLP and SVM, can we instead have two
functions, knn_classification and knn_regression?
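To illustrate the proposed split, here is a minimal Python sketch (not MADlib code) in which both entry points share one neighbor search and differ only in how they combine the neighbors' labels. The helper `_nearest_labels`, the argument names, and the in-memory list of rows are assumptions made for illustration; the actual SQL signatures are still open.

```python
import math
from collections import Counter


def _nearest_labels(train, test_point, k):
    # Hypothetical shared helper: train is a list of (features, label) rows.
    # Returns the labels of the k rows closest to test_point (Euclidean distance).
    ranked = sorted(train, key=lambda row: math.dist(row[0], test_point))
    return [label for _, label in ranked[:k]]


def knn_classification(train, test_point, k):
    # Classification: majority vote over the k nearest labels.
    return Counter(_nearest_labels(train, test_point, k)).most_common(1)[0][0]


def knn_regression(train, test_point, k):
    # Regression: mean of the k nearest labels.
    labels = _nearest_labels(train, test_point, k)
    return sum(labels) / len(labels)


train = [((0, 0), 0), ((0, 1), 0), ((5, 5), 1), ((6, 5), 1)]
print(knn_classification(train, (0.2, 0.1), 3))
print(knn_regression(train, (0.2, 0.1), 3))
```

With two functions, the caller no longer needs a flag to choose the prediction type, which matches how MLP and SVM are exposed.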
A couple of other usability changes:
1) The first 7 parameters for the current knn implementation deal with
providing some details about the training table and test table. Can we instead
have two parameters, one for training table and the other for test table
instead of those 7 params? Each could take a comma-separated list of key-value pairs,
like the optimization params for elastic net.
2) The output table currently has the `id` and `point` columns among others.
The `point` column is redundant since the `id` is a unique identifier of a row.
We could remove the `point` column from the output table.
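As a rough sketch of item 1, a comma-separated key-value string for one table could be parsed as below. This is illustrative Python, not MADlib code: the function name `parse_table_params` and the keys `table`, `point_column`, and `label_column` are hypothetical, since the actual keys have not been decided here.

```python
def parse_table_params(spec):
    # Parse a string such as "table=knn_train, point_column=data"
    # into a dict. Keys and values are stripped of surrounding whitespace.
    params = {}
    for pair in spec.split(","):
        if not pair.strip():
            continue  # tolerate a trailing comma or empty segment
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError("expected key=value, got: %r" % pair.strip())
        params[key.strip()] = value.strip()
    return params


# Hypothetical keys, for illustration only.
print(parse_table_params("table=knn_train, point_column=data, label_column=label"))
```

One string per table keeps the signature short, at the cost of validating the keys at run time rather than through the function's argument list.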
This message was sent by Atlassian JIRA