Github user sethah commented on the pull request:

    https://github.com/apache/spark/pull/12577#issuecomment-213582197
  
    @MLnick Good points. In my mind there are two scenarios here:
    
    1. This is a quick fix for users wanting to do cross validation with 
ALS/recommenders in Spark, until we can fully address the issue in the ALS 
algorithm.
    2. This is an improvement to Evaluators in general, giving users 
flexibility to ignore NaNs in whatever scenario produces them.
    
    If (1) is true then, as you said, we can consider deprecating this later, 
since there may be no remaining use case once (if?) ALS stops predicting NaNs 
on new data. If (2) is true, perhaps we should consider adding this to _all_ 
evaluators? Again, I'd be interested to hear other use cases. One I thought of 
is a Naive Bayes classifier with no smoothing predicting on unseen words in 
text classification, but I wasn't able to reproduce a similar failure in the 
time I spent on it. Either way, I think this is an improvement; I just wanted 
to be more explicit about the _why_ and how it might affect scope.
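    To make the "drop NaN" behavior concrete, here is a minimal, Spark-free 
Python sketch of what an evaluator with NaN-dropping enabled would effectively 
compute for RMSE (the function name and inputs are illustrative, not Spark's 
actual API):

```python
import math

def rmse_dropping_nan(predictions, labels):
    """Compute RMSE over (prediction, label) pairs, skipping NaN predictions.

    Mirrors the proposed "drop NaN" evaluator option: rows where the model
    could not produce a prediction (e.g. ALS on an unseen user or item)
    are excluded from the metric instead of making the whole metric NaN.
    """
    pairs = [(p, l) for p, l in zip(predictions, labels) if not math.isnan(p)]
    if not pairs:
        raise ValueError("no non-NaN predictions to evaluate")
    return math.sqrt(sum((p - l) ** 2 for p, l in pairs) / len(pairs))

# A single cold-start NaN would otherwise propagate and make RMSE NaN:
preds = [2.0, float("nan"), 4.0]
labels = [1.0, 3.0, 5.0]
print(rmse_dropping_nan(preds, labels))  # 1.0
```

    Without the filter, the NaN row poisons the sum and cross validation 
cannot compare fold scores, which is exactly the failure mode this PR works 
around.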

