GitHub user vlad17 commented on the issue:
https://github.com/apache/spark/pull/14547
@sethah Thanks for the FYI. I'm fairly confident it will help, since we're now
directly optimizing the loss function. However, it would be nice to demonstrate
this. Unfortunately, the example I linked above uses a skewed dataset.
The only estimator whose behavior changed is GBTClassifier (the Bernoulli
predictions now use a Newton-Raphson step rather than guessing the mean). And
since the raw prediction column is unavailable for GBTClassifier, I can't
really compare the classifiers sensibly on skewed datasets: AUC is out of the
question.
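For concreteness, here's a minimal sketch of that constraint (the `train`/`test`
DataFrames and the parameter choices are mine, not anything from this PR):
`BinaryClassificationEvaluator` computes AUC from `rawPrediction`, which
GBTClassifier doesn't emit, so only metrics over the hard `prediction` column,
such as F1, are available.
```scala
import org.apache.spark.ml.classification.GBTClassifier
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator

// train/test: hypothetical DataFrames with "label" and "features" columns.
val gbt = new GBTClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")
  .setMaxIter(50)

val predictions = gbt.fit(train).transform(test)
// predictions has a "prediction" column but no "rawPrediction", so
// BinaryClassificationEvaluator (areaUnderROC) can't be applied here;
// we can only fall back to a threshold-based metric, which is weak on
// skewed data.
val f1 = new MulticlassClassificationEvaluator()
  .setLabelCol("label")
  .setPredictionCol("prediction")
  .setMetricName("f1")
  .evaluate(predictions)
```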
I'm going to have to spend some time finding a "real" dataset that's not
skewed but is large enough to be meaningful, or just make an artificial one
(see the sketch below). spark-perf will also need to be re-run.
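For the artificial option, a rough sketch of what I have in mind (the sizes,
seed, and linear separator are placeholders of mine, nothing settled in this
PR): sample Gaussian features and label each point by a fixed random linear
separator through the origin, which gives a roughly 50/50 class balance.
```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("gbt-comparison").getOrCreate()
import spark.implicits._

val numFeatures = 20
val seedRng = new scala.util.Random(0)
val weights = Array.fill(numFeatures)(seedRng.nextGaussian())

// Roughly balanced synthetic binary data: label = side of a fixed random
// linear separator through the origin.
val rows: Seq[(Double, Vector)] = (0 until 100000).map { i =>
  val rnd = new scala.util.Random(i)
  val x = Array.fill(numFeatures)(rnd.nextGaussian())
  val margin = weights.zip(x).map { case (w, v) => w * v }.sum
  (if (margin > 0) 1.0 else 0.0, Vectors.dense(x))
}
val data = rows.toDF("label", "features")
val Array(train, test) = data.randomSplit(Array(0.8, 0.2), seed = 42)
```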
Also, regarding the binary incompatibility failure: part of it was my fault,
and part was due to an incompatibility with a package-private method. I added
an exception for the package-private method's binary incompatibility - is that
OK?
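For reference, the exception goes into `project/MimaExcludes.scala` in the
usual way; roughly the pattern below, though the problem type and the fully
qualified name here are placeholders rather than the actual entry.
```scala
import com.typesafe.tools.mima.core._

// Sketch of an entry in project/MimaExcludes.scala; the problem type and the
// fully qualified name are placeholders for the package-private method that
// the MiMa check flagged on this PR.
lazy val gbtExcludes = Seq(
  ProblemFilters.exclude[DirectMissingMethodProblem](
    "org.apache.spark.ml.classification.GBTClassifier.somePackagePrivateMethod")
)
```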