[ https://issues.apache.org/jira/browse/MADLIB-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819825#comment-15819825 ]
ASF GitHub Bot commented on MADLIB-927:
---------------------------------------
GitHub user auonhaidar commented on the issue:
https://github.com/apache/incubator-madlib/pull/81
I ran this command inside the build directory:
$ du -h doc/
4.0K doc/design/figures
4.0K doc/design/modules
20K doc/design/CMakeFiles/auxclean.dir
44K doc/design/CMakeFiles/design_ps.dir
20K doc/design/CMakeFiles/html.dir
20K doc/design/CMakeFiles/design_html.dir
20K doc/design/CMakeFiles/design.dir
28K doc/design/CMakeFiles/design_auxclean.dir
40K doc/design/CMakeFiles/design_dvi.dir
20K doc/design/CMakeFiles/pdf.dir
20K doc/design/CMakeFiles/safepdf.dir
20K doc/design/CMakeFiles/ps.dir
20K doc/design/CMakeFiles/design_safepdf.dir
40K doc/design/CMakeFiles/design_pdf.dir
20K doc/design/CMakeFiles/dvi.dir
344K doc/design/CMakeFiles
4.0K doc/design/other-chapters
380K doc/design
12K doc/bin/CMakeFiles
36K doc/bin
8.0K doc/imgs
20K doc/CMakeFiles/update_mathjax.dir
40K doc/CMakeFiles/doxysql.dir
20K doc/CMakeFiles/devdoc.dir
20K doc/CMakeFiles/doc.dir
112K doc/CMakeFiles
12K doc/etc/CMakeFiles
152K doc/etc
720K doc/
> Initial implementation of k-NN
> ------------------------------
>
> Key: MADLIB-927
> URL: https://issues.apache.org/jira/browse/MADLIB-927
> Project: Apache MADlib
> Issue Type: New Feature
> Reporter: Rahul Iyer
> Labels: gsoc2016, starter
>
> k-Nearest Neighbors is a simple algorithm based on finding nearest neighbors
> of data points in a metric feature space according to a specified distance
> function. It is considered one of the canonical algorithms of data science.
> It is a nonparametric method, which makes it applicable to a lot of
> real-world problems where the data doesn’t satisfy particular distribution
> assumptions. It can also be implemented as a lazy algorithm, which means
> there is no training phase where information in the data is condensed into
> coefficients, but there is a costly testing phase where all data (or some
> subset) is used to make predictions.
> This JIRA involves implementing the naïve approach - i.e. compute the k
> nearest neighbors by going through all points.
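
For illustration, the naïve approach described above can be sketched in a few lines of Python. This is not the code from the pull request, and it is not MADlib's API; the function names (euclidean, knn_predict), the choice of Euclidean distance, and the majority-vote classification step are all assumptions made for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k, dist=euclidean):
    """Brute-force k-NN classification (hypothetical helper, not MADlib's).

    There is no training phase: every call scans all training points,
    which is the costly "testing phase" the issue description mentions.
    """
    if len(train_points) != len(train_labels):
        raise ValueError("train_points and train_labels differ in length")
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of points")

    # Compute the distance from the query to every training point.
    scored = sorted(
        ((dist(p, query), label) for p, label in zip(train_points, train_labels)),
        key=lambda pair: pair[0],
    )

    # Majority vote among the k closest points.
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

Each prediction costs one distance computation per training point plus a sort, so the work grows with the size of the training set; that is the trade-off of skipping a training phase.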
--
This message was sent by Atlassian JIRA (v6.3.4#6332)