[ 
https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184308#comment-15184308
 ] 

ASF GitHub Bot commented on FLINK-1745:
---------------------------------------

Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1220#issuecomment-193580086
  
    Hi @danielblazevski, thanks for the update, and sorry for the late reply. I 
tested your implementation and found a few things to address before merging.
    
    First, the test cases. I think we should add a new test case for KNN 
with a quad-tree rather than modifying the existing test case that runs without 
one. We also need test cases for configurations that cannot be executed, such 
as KNN with a quad-tree and an incompatible distance metric. The ScalaTest 
documentation describes how to write a test that expects an exception 
(**Intercepted exceptions** section in 
http://www.scalatest.org/user_guide/using_assertions).
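    
    As a rough sketch of what such a test could look like: the `KnnConfig` 
object and the `Metric` types below are hypothetical stand-ins, not the real 
FlinkML API, and treating only the Euclidean metric as quad-tree compatible is 
an assumption made for illustration. Only `intercept` and the matchers come 
from ScalaTest (2.x/3.0-style imports assumed).

```scala
import org.scalatest.{FlatSpec, Matchers}

// Hypothetical stand-ins for the real distance metrics (not the FlinkML classes).
sealed trait Metric
case object Euclidean extends Metric
case object Cosine extends Metric

// Hypothetical validation step; the real KNN would perform a similar check
// somewhere before execution. Assumption: only Euclidean works with a quad-tree.
object KnnConfig {
  def validate(useQuadTree: Boolean, metric: Metric): Unit =
    if (useQuadTree && metric != Euclidean) {
      throw new IllegalArgumentException(
        s"Quad-tree cannot be used with metric $metric")
    }
}

class KnnConfigSuite extends FlatSpec with Matchers {
  behavior of "KNN configuration (hypothetical stand-in)"

  it should "reject a quad-tree with an incompatible metric" in {
    // intercept fails the test unless the block throws the given exception
    // type, and returns the caught exception for further checks.
    val e = intercept[IllegalArgumentException] {
      KnnConfig.validate(useQuadTree = true, metric = Cosine)
    }
    e.getMessage should include ("Quad-tree")
  }

  it should "accept a quad-tree with a compatible metric" in {
    // No exception expected here; the test passes if validate returns normally.
    KnnConfig.validate(useQuadTree = true, metric = Euclidean)
  }
}
```

    In the real suite, the body of `intercept` would presumably be the call 
that fits or runs KNN with the invalid configuration, and the exception type 
would be whatever the implementation actually throws.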
    
    Second, the package declarations of `QuadTree` and `QuadTreeSuite` do not 
match the directory structure.
    
    Finally, I think the documentation needs a more detailed description, with 
some mathematical background on KNN and quad-trees (including links to your 
slides and the papers you referred to). We also need examples and a 
description of the parameters with their default values.
    
    About rebasing: if you set `apache/flink` as the remote `apache`, you can use 
the commands I suggested, replacing `upstream` with `apache`. You don't need to 
worry while rebasing. I also have a copy of your `FLINK-1745` branch on 
my local machine. If you run into problems with rebasing, I'll rebase it onto 
`apache/master`.


> Add exact k-nearest-neighbours algorithm to machine learning library
> --------------------------------------------------------------------
>
>                 Key: FLINK-1745
>                 URL: https://issues.apache.org/jira/browse/FLINK-1745
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Daniel Blazevski
>              Labels: ML, Starter
>
> Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite trivial, 
> it is still used as a means to classify data and to do regression. This issue 
> focuses on the implementation of an exact kNN (H-BNLJ, H-BRJ) algorithm as 
> proposed in [2].
> Could be a starter task.
> Resources:
> [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm]
> [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
