[GitHub] flink pull request: Flink 1745

danielblazevski Sun, 04 Oct 2015 09:33:03 -0700

Github user danielblazevski commented on the pull request:

    https://github.com/apache/flink/pull/1220#issuecomment-145364401
  
    Thanks @chiwanpark for the very useful comments.  I have made changes in 
response to the comments, which can be found here:
    
https://github.com/danielblazevski/flink/tree/FLINK-1745/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/nn
    
    I also changed the testing of KNN + QuadTree, which can be found here:
    
https://github.com/danielblazevski/flink/tree/FLINK-1745/flink-staging/flink-ml/src/test/scala/org/apache/flink/ml/nn
    
    Since useQuadTree is now a parameter, I did not need KNNQuadTreeSuite 
anymore and I removed it.
    
    I have not addressed comment 6 yet.  I need the training set before I can 
choose useQuadTree when the user does not specify it, so the main 
`if (useQuadTree)` check should come inside 
`val crossed = trainingSet.cross(inputSplit).mapPartition {`
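
    To illustrate why the decision has to sit inside the partition function, 
here is a minimal sketch in plain Scala with no Flink dependency.  The names 
`chooseUseQuadTree` and `processPartition`, and the size heuristic itself, are 
hypothetical and not taken from the pull request; the only point is that the 
default can be computed once the block of training points is in hand.

```scala
object UseQuadTreeSketch {
  type Point = Vector[Double]

  // Hypothetical heuristic (an assumption, not the PR's rule): a quadtree node
  // has 2^dim children, so only default to the tree when the block of training
  // points is large compared with that fan-out.
  def chooseUseQuadTree(userChoice: Option[Boolean], training: Seq[Point]): Boolean =
    userChoice.getOrElse {
      training.nonEmpty && {
        val dim = training.head.size
        dim <= 10 && training.size > (1L << dim) * 4
      }
    }

  // Stand-in for the body of the mapPartition function: the training block is
  // only available here, so this is the earliest place the default can be set.
  def processPartition(userChoice: Option[Boolean], training: Seq[Point]): String = {
    val useQuadTree = chooseUseQuadTree(userChoice, training)
    if (useQuadTree) "quadtree" else "brute-force"
  }
}
```

    An explicit user setting always wins in this sketch; the heuristic is 
consulted only when `userChoice` is `None`.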
    
    About your last "P.S." comment: creating the quadtree after the cross 
operation is likely more efficient -- each CPU/node will form its own 
quadtree, which is what is suggested for the R-tree here:
    https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf
    
    This will result in less communication overhead than creating a more global 
quadtree, if that is what you were referring to.
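
    As a rough sketch of the block-local approach, in plain Scala without 
Flink: each block of training points answers a query on its own, and only k 
candidates per block are merged afterwards.  The names `localKnn` and 
`mergeKnn` are hypothetical, and a linear scan stands in for the per-block 
quadtree, so this shows the data flow rather than the actual index.

```scala
object LocalKnnSketch {
  type Point = Vector[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Local step: runs independently on each block of training points.
  // A quadtree built from `block` would replace this linear scan.
  def localKnn(block: Seq[Point], query: Point, k: Int): Seq[(Double, Point)] =
    block.map(p => (sqDist(p, query), p)).sortBy(_._1).take(k)

  // Merge step: only k (distance, point) candidates per block are combined,
  // so no index structure has to be shared between blocks.
  def mergeKnn(partials: Seq[Seq[(Double, Point)]], k: Int): Seq[Point] =
    partials.flatten.sortBy(_._1).take(k).map(_._2)
}
```

    The merged result equals the global k nearest neighbours, because any 
global neighbour is also among the k nearest within its own block.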



