[GitHub] spark issue #15874: [Spark-18408][ML] API Improvements for LSH

jkbradley Fri, 18 Nov 2016 12:34:07 -0800

Github user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/15874
  
    @Yunni Thanks for the updates!  I don't think we should include 
AND-amplification for 2.1 since we're already in QA.  But it'd be nice to get 
it in 2.2.  Also, 2.2 will give us plenty of time to discuss distributed 
approxNearestNeighbors.
    
    FYI: I asked around about the managed memory leak warning/failure.  It is 
usually just a warning, but some test suites are set to fail upon seeing it.  
That setting was apparently useful for debugging some memory leak bugs, but the 
warning itself is not cause to worry.  I recommend we make tests small enough to 
avoid triggering it for now.  If the warning becomes an issue, we could 
configure ML suites to ignore it, or we could even downgrade it to a 
lower-priority log message for all of Spark.
    
    This LGTM.  What does everyone think?
    
    For 2.1, the main thing I'd still like to do is to send a PR to clarify 
terminology.  That could be done in https://github.com/apache/spark/pull/15795



