[GitHub] spark pull request: ROC area under the curve for binary classifica...

schmit Tue, 18 Mar 2014 16:58:17 -0700

Github user schmit commented on the pull request:

    https://github.com/apache/spark/pull/160#issuecomment-38003806
  
    On your more general remarks @srowen:
    
    I think those are valid concerns. Here is my reasoning for doing it this 
way:
    The predict function returns the label, but I need the predicted "score" (or 
predicted probability, in the case of LR) of the test samples in order to sort them.
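
    To illustrate the distinction, here is a minimal Python sketch, not the code in this pull request. The names `score`, `predict_probability` and `predict_label`, the 0.5 threshold, and the example weights are all assumptions made for illustration; they are not the Spark API.

```python
import math

def score(weights, intercept, features):
    """Raw linear margin w.x + b (hypothetical helper, not the Spark API)."""
    return sum(w * x for w, x in zip(weights, features)) + intercept

def predict_probability(weights, intercept, features):
    """Logistic transform of the margin, as in logistic regression."""
    return 1.0 / (1.0 + math.exp(-score(weights, intercept, features)))

def predict_label(weights, intercept, features, threshold=0.5):
    """Hard 0/1 label; the 0.5 threshold is an assumption for this sketch."""
    return 1 if predict_probability(weights, intercept, features) >= threshold else 0

# Made-up model and samples, for illustration only.
weights, intercept = [1.0, -2.0], 0.5
sample_a = [3.0, 0.5]
sample_b = [1.0, 0.5]

# Both samples receive the same label, so labels alone cannot order them,
# while the scores still separate them.
label_a = predict_label(weights, intercept, sample_a)
label_b = predict_label(weights, intercept, sample_b)
score_a = score(weights, intercept, sample_a)
score_b = score(weights, intercept, sample_b)
```

    Since the logistic function is monotone, sorting by the margin or by the probability gives the same order, so either works for ranking the test samples.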
    
    Also, more generally, this seems like a useful function to have. I do 
not want to change the predict function, since that is probably what is most 
used and wanted, and it would be annoying to have to turn the score into a label by 
hand, and only in the binary classification setting.
    
    However, this score function only makes sense in the binary classification 
setting, and so does ROC AUC. Later I hope to add the PR AUC as well, and that 
can be added to the same class, but first things first.
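
    For reference, one common way to compute ROC AUC from such scores is the rank-based (Mann-Whitney) formulation, sketched below in Python. This is an illustrative sketch only and may differ from the implementation in this pull request; the function name `roc_auc` and the 0/1 label encoding are assumptions.

```python
def roc_auc(scores, labels):
    """ROC AUC via the rank-sum (Mann-Whitney U) identity.

    Hypothetical helper: labels are assumed to be 0 or 1, and tied
    scores share the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs at least one positive and one negative")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run [i, j) of samples with identical scores.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # The run occupies 1-based ranks i+1 .. j; use their average.
        avg_rank = (i + 1 + j) / 2.0
        positives_in_run = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_run
        i = j

    # Subtract the smallest possible positive rank sum, then normalise
    # by the number of positive/negative pairs.
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)
```

    The result is the fraction of positive/negative pairs in which the positive sample gets the higher score (ties count as half), which is why scores rather than labels are required.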
    
    The alternative is to define this function for both LR and SVM separately, 
but I don't like that either.
    
    However, I do agree it is not the cleanest code, so your suggestions are 
very welcome.



