Interesting. For feature sub-sampling, is it per-node or per-tree? Do you think you could implement a generic GBM and have it merged into the Spark codebase?
Sincerely,

DB Tsai
----------------------------------------------------------
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D

On Mon, Oct 26, 2015 at 11:42 AM, Meihua Wu <[email protected]> wrote:
> Hi Spark User/Dev,
>
> Inspired by the success of XGBoost, I have created a Spark package for
> gradient boosted trees with a 2nd-order approximation of arbitrary
> user-defined loss functions.
>
> https://github.com/rotationsymmetry/SparkXGBoost
>
> Currently linear (normal) regression, binary classification, and Poisson
> regression are supported. You can extend it with other loss functions as
> well.
>
> L1, L2, bagging, and feature sub-sampling are also employed to avoid
> overfitting.
>
> Thank you for testing. I am looking forward to your comments and
> suggestions. Bugs or improvements can be reported through GitHub.
>
> Many thanks!
>
> Meihua
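
[Editor's note: for context on the "2nd-order approximation of arbitrary user-defined loss functions" mentioned above: XGBoost-style boosting fits each tree to the per-example first and second derivatives (gradient g_i and hessian h_i) of the loss at the current prediction, and the optimal weight of a leaf then comes out to -G/(H + lambda), where G and H are the sums of gradients and hessians of the examples in that leaf. Below is a minimal Scala sketch of what a pluggable second-order loss could look like; the trait name and method signatures are illustrative assumptions, not SparkXGBoost's actual API.]

  // Hypothetical sketch of a pluggable second-order loss.
  // Names and signatures are illustrative, NOT SparkXGBoost's real API.
  trait SecondOrderLoss extends Serializable {
    // First derivative of the loss w.r.t. the current prediction.
    def gradient(prediction: Double, label: Double): Double
    // Second derivative of the loss w.r.t. the current prediction.
    def hessian(prediction: Double, label: Double): Double
  }

  // Squared-error loss for linear (normal) regression:
  // L = 0.5 * (prediction - label)^2, so g = prediction - label, h = 1.
  object SquaredErrorLoss extends SecondOrderLoss {
    def gradient(prediction: Double, label: Double): Double = prediction - label
    def hessian(prediction: Double, label: Double): Double = 1.0
  }

  // Log loss for binary classification, with `prediction` on the margin
  // (log-odds) scale and labels in {0, 1}: g = p - y, h = p * (1 - p).
  object LogisticLoss extends SecondOrderLoss {
    private def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))
    def gradient(prediction: Double, label: Double): Double =
      sigmoid(prediction) - label
    def hessian(prediction: Double, label: Double): Double = {
      val p = sigmoid(prediction)
      p * (1.0 - p)
    }
  }

[Supporting Poisson regression, as the package does, would amount to supplying one more such pair of derivatives for the Poisson negative log-likelihood.]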
