Interesting. For feature sub-sampling, is it per-node or per-tree? (A quick sketch of the difference follows the quoted message below.) Do you think you can implement a generic GBM and have it merged into the Spark codebase?
Sincerely,

DB Tsai
----------------------------------------------------------
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D


On Mon, Oct 26, 2015 at 11:42 AM, Meihua Wu <rotationsymmetr...@gmail.com> wrote:
> Hi Spark User/Dev,
>
> Inspired by the success of XGBoost, I have created a Spark package for
> gradient boosting trees with a 2nd-order approximation of arbitrary
> user-defined loss functions.
>
> https://github.com/rotationsymmetry/SparkXGBoost
>
> Currently linear (normal) regression, binary classification, and Poisson
> regression are supported. You can extend it with other loss functions as
> well.
>
> L1 and L2 regularization, bagging, and feature sub-sampling are also
> employed to avoid overfitting.
>
> Thank you for testing. I am looking forward to your comments and
> suggestions. Bugs and improvements can be reported through GitHub.
>
> Many thanks!
>
> Meihua
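
To make the per-node vs. per-tree question concrete: per-tree sub-sampling
draws one feature subset before growing each tree, while per-node
sub-sampling redraws the subset at every candidate split, as random forests
do. A minimal sketch in Scala; the names (sampleFeatures, numFeatures,
subsampleRate) are illustrative assumptions, not SparkXGBoost's actual API:

    import scala.util.Random

    // Draw a random subset of feature indices without replacement.
    // Per-tree sampling calls this once before growing a tree; per-node
    // sampling calls it again at every candidate split, which decorrelates
    // the trees further at some extra bookkeeping cost.
    def sampleFeatures(numFeatures: Int, subsampleRate: Double, rng: Random): Array[Int] = {
      val k = math.max(1, (numFeatures * subsampleRate).round.toInt)
      rng.shuffle((0 until numFeatures).toList).take(k).toArray.sorted
    }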
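
As for what a "2nd-order approximation of a user-defined loss" amounts to:
per training instance, the booster only needs the first and second
derivatives of the loss with respect to the current prediction. A sketch of
such an interface in Scala; the trait and method names (Loss, grad, hess)
are hypothetical, not necessarily what SparkXGBoost exposes:

    // Hypothetical interface for a user-defined loss. Second-order
    // boosting only needs, per instance, the gradient and hessian of the
    // loss w.r.t. the current prediction p.
    trait Loss extends Serializable {
      def grad(p: Double, y: Double): Double   // dL/dp
      def hess(p: Double, y: Double): Double   // d2L/dp2
    }

    // Squared error: L = (p - y)^2 / 2
    object SquaredErrorLoss extends Loss {
      def grad(p: Double, y: Double): Double = p - y
      def hess(p: Double, y: Double): Double = 1.0
    }

    // Logistic loss for binary labels in {0, 1}:
    // L = -y * log(s) - (1 - y) * log(1 - s), where s = sigmoid(p)
    object LogisticLoss extends Loss {
      private def sigmoid(p: Double): Double = 1.0 / (1.0 + math.exp(-p))
      def grad(p: Double, y: Double): Double = sigmoid(p) - y
      def hess(p: Double, y: Double): Double = {
        val s = sigmoid(p)
        s * (1.0 - s)
      }
    }

In the XGBoost formulation, a leaf's optimal weight is then -G / (H + lambda),
where G and H are the sums of grad and hess over the instances falling in
that leaf and lambda is the L2 regularization strength.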