[GitHub] spark issue #14547: [SPARK-16718][MLlib] gbm-style treeboost

jkbradley Mon, 26 Sep 2016 14:10:48 -0700

GitHub user jkbradley commented on the issue:

    https://github.com/apache/spark/pull/14547
  
    @sethah AFAIK, the original gradient boosting algorithm was generic, not 
specific to trees.  That's Algorithm 1 from 
[https://statweb.stanford.edu/~jhf/ftp/trebst.pdf] and is what MLlib has 
currently.
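
    For readers without the paper at hand, here is a minimal sketch of that generic procedure: fit any base learner to the negative gradient of the loss, then line-search a step size. This is an illustration only, not MLlib's implementation. The names `gradient_boost`, `fit_stump`, and `golden_section_min`, the 1-D feature, and the step-size bracket [0, 10] are all assumptions made for the example.

```python
from typing import Callable, List, Sequence

Model = Callable[[float], float]


def golden_section_min(f: Callable[[float], float], lo: float, hi: float,
                       iters: int = 60) -> float:
    """Approximately minimize a unimodal 1-D function on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2


def fit_stump(xs: Sequence[float], targets: Sequence[float]) -> Model:
    """Least-squares decision stump on one feature (hypothetical base learner)."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        if not right:
            continue  # a split must leave points on both sides
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: fall back to a constant
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def gradient_boost(xs: Sequence[float], ys: Sequence[float],
                   loss: Callable[[float, float], float],
                   neg_grad: Callable[[float, float], float],
                   fit_base: Callable[[Sequence[float], Sequence[float]], Model],
                   n_iter: int = 20) -> Model:
    """Generic gradient boosting: nothing here is specific to trees."""
    def total(preds: List[float]) -> float:
        return sum(loss(y, p) for y, p in zip(ys, preds))

    # Initial model: the constant that minimizes the total loss.
    f0 = golden_section_min(lambda c: total([c] * len(ys)), min(ys), max(ys))
    preds = [f0] * len(xs)
    learners = []
    for _ in range(n_iter):
        # Pseudo-residuals: negative gradient of the loss at the current model.
        residuals = [neg_grad(y, p) for y, p in zip(ys, preds)]
        # Fit the base learner to the pseudo-residuals by least squares.
        h = fit_base(xs, residuals)
        hx = [h(x) for x in xs]
        # Line search for the step size (bracket [0, 10] is an assumption).
        rho = golden_section_min(
            lambda r: total([p + r * v for p, v in zip(preds, hx)]), 0.0, 10.0)
        learners.append((rho, h))
        preds = [p + rho * v for p, v in zip(preds, hx)]

    return lambda x: f0 + sum(rho * h(x) for rho, h in learners)
```

    With squared-error loss, `loss = lambda y, p: 0.5 * (y - p) ** 2` and `neg_grad = lambda y, p: y - p`, calling `gradient_boost(xs, ys, loss, neg_grad, fit_stump)` fits the stumps to ordinary residuals. Swapping in a different loss changes only those two functions, which is the sense in which the algorithm is generic.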
    
    I agree with your intuition about options 3 > 2 > 1 and encouraging users 
to use option 3 via our API.  I'd be OK with disallowing option 1.  As a 
software engineer, I'd want to allow 1 for backwards API compatibility, where 
behavior and algorithms are part of the API.  But as an ML person, I'd be OK 
with not even allowing 1 in the future to prevent users from doing the wrong 
thing.  Combining these, I'd recommend:
    * For now, we make 2 the default behavior but still allow 1.  (as in this 
PR)
    * In the future, we make 3 the default behavior, maybe allow 2, and do not 
allow 1.
    
    > "loss-based"  What exactly does that mean to the user?
    
    If this is unclear, then let's make the documentation for that Param 
clearer and/or use a more intuitive name such as "auto."



