[
https://issues.apache.org/jira/browse/MADLIB-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832143#comment-15832143
]
ASF GitHub Bot commented on MADLIB-927:
---------------------------------------
Github user orhankislal commented on the issue:
https://github.com/apache/incubator-madlib/pull/81
Hi Auon,
My suggestion is to give them a try and, if you agree with the content,
merge them.
Here is a small list of validations (I know you covered some of them in the
code):
- Every input should be checked for NULL
- Every string should be checked for the empty string ''
- Columns should exist in their respective tables
- Input tables should not be empty
- Output tables should not already exist
Thanks
Orhan
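
The validation checklist in the comment above can be sketched roughly as follows. This is only an illustration, not the code in the pull request: the `catalog` dictionary is a hypothetical stand-in for database catalog lookups, and `validate_args`, `ValidationError`, and the argument names are all assumed for the example.

```python
class ValidationError(ValueError):
    """Raised when an argument fails one of the checks (hypothetical name)."""


def _require_text(name, value):
    # Checks 1 and 2: reject NULL (None) and empty or whitespace-only strings.
    if value is None:
        raise ValidationError(f"{name} must not be NULL")
    if not isinstance(value, str) or value.strip() == "":
        raise ValidationError(f"{name} must be a non-empty string")


def validate_args(catalog, source_table, column_names, output_table):
    """Apply the five checks to one input table and one output table.

    `catalog` is an assumed stand-in for the database catalog:
    a dict mapping table name -> {"columns": set of str, "row_count": int}.
    """
    _require_text("source_table", source_table)
    _require_text("output_table", output_table)
    if column_names is None:
        raise ValidationError("column_names must not be NULL")
    for col in column_names:
        _require_text("column name", col)

    # The input table has to exist before its columns can be checked.
    if source_table not in catalog:
        raise ValidationError(f"input table {source_table!r} does not exist")
    table = catalog[source_table]

    # Check 3: every named column exists in its table.
    for col in column_names:
        if col not in table["columns"]:
            raise ValidationError(
                f"column {col!r} does not exist in {source_table!r}")

    # Check 4: the input table is not empty.
    if table["row_count"] == 0:
        raise ValidationError(f"input table {source_table!r} is empty")

    # Check 5: the output table does not already exist.
    if output_table in catalog:
        raise ValidationError(f"output table {output_table!r} already exists")


# Usage with made-up table and column names; returns silently when valid.
example_catalog = {
    "knn_train": {"columns": {"id", "features", "label"}, "row_count": 3},
}
validate_args(example_catalog, "knn_train", ["features", "label"], "knn_out")
```

Running the cheap NULL and empty-string checks before any catalog lookup means a bad argument is reported before the database is consulted.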
> Initial implementation of k-NN
> ------------------------------
>
> Key: MADLIB-927
> URL: https://issues.apache.org/jira/browse/MADLIB-927
> Project: Apache MADlib
> Issue Type: New Feature
> Reporter: Rahul Iyer
> Labels: gsoc2016, starter
>
> k-Nearest Neighbors is a simple algorithm based on finding nearest neighbors
> of data points in a metric feature space according to a specified distance
> function. It is considered one of the canonical algorithms of data science.
> It is a nonparametric method, which makes it applicable to many
> real-world problems where the data doesn’t satisfy particular distribution
> assumptions. It can also be implemented as a lazy algorithm, which means
> there is no training phase where information in the data is condensed into
> coefficients, but there is a costly testing phase where all data (or some
> subset) is used to make predictions.
> This JIRA involves implementing the naïve approach, i.e., computing the k
> nearest neighbors by going through all points.
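
For illustration, the naïve approach described in the issue could look roughly like the sketch below. It is plain in-memory Python, not the MADlib implementation. The function names, the Euclidean default, and the majority-vote classification step are assumptions made for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line distance between two equal-length numeric sequences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k, distance=euclidean):
    """Classify `query` by majority vote among its k nearest training points.

    There is no training phase: every call scans all training points,
    which is the costly testing phase the issue description mentions.
    """
    if len(train_points) != len(train_labels):
        raise ValueError("train_points and train_labels differ in length")
    if k < 1 or k > len(train_points):
        raise ValueError("k must be between 1 and the number of points")

    # Brute force: compute the distance from the query to every point.
    scored = [(distance(point, query), label)
              for point, label in zip(train_points, train_labels)]
    scored.sort(key=lambda pair: pair[0])

    # Majority vote among the k closest; on a tied count, the label seen
    # first among the nearest neighbors wins.
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Usage with made-up data: two small clusters labeled "a" and "b".
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "b", "b"]
print(knn_predict(points, labels, (0.2, 0.1), k=3))  # prints "a"
```

Each prediction costs one distance computation per training point plus a sort, which is why this is only a baseline.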