On Friday, 25 March 2016, Sebastian Raschka <se.rasc...@gmail.com> wrote:

> > wondering what changes are needed to make
> > RandomForestClassifier competitive with xgboost and H2O at
>
> Do you mean in terms of predictive performance (not computational
> efficiency)? Not sure what others think, but I wouldn't change the core
> algorithm since otherwise it's not really a "Random forest" anymore as it
> is described in the literature -- and that would be very confusing for users
> and researchers.
>
>

I really meant just to ask the question: what is preventing the
scikit-learn random forest implementation from a) scaling as well as
xgboost and H2O, and b) achieving as good an AUC?
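
To make the two parts of the question concrete, here is a minimal sketch
that measures fit time and test AUC for a bagged forest and a boosted
ensemble. It rests on assumptions that are not from the benchmark:
the data is synthetic (make_classification), GradientBoostingClassifier
is only scikit-learn's own boosting model standing in for xgboost and H2O,
and compare_auc is a hypothetical helper name.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def compare_auc(n_samples=2000, n_features=20, seed=0):
    """Return {model name: (test AUC, fit time in seconds)}.

    Assumption: synthetic data and default-ish hyperparameters, so the
    numbers say nothing about the linked benchmark itself.
    """
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=8,
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )

    models = {
        # Bagging: trees are grown independently on bootstrap samples.
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
        # Boosting: each tree is fit to correct the ensemble built so far.
        "gradient_boosting": GradientBoostingClassifier(
            n_estimators=100, random_state=seed
        ),
    }

    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        elapsed = time.perf_counter() - start
        # AUC needs a ranking score, so use the positive-class probability.
        scores = model.predict_proba(X_test)[:, 1]
        results[name] = (roc_auc_score(y_test, scores), elapsed)
    return results


if __name__ == "__main__":
    for name, (auc, seconds) in compare_auc().items():
        print(f"{name}: AUC={auc:.3f}, fit time={seconds:.2f}s")
```

Increasing n_samples gives a rough feel for how each model's fit time
grows, though a toy run like this is no substitute for the benchmark.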

If the answer is that this is fundamentally the limit of bagged random
forests (and that xgboost and H2O both implement boosting or something
else that scales and performs better), then that is already very interesting.

Raphael

> > On Mar 22, 2016, at 7:52 AM, Raphael C <drr...@gmail.com> wrote:
> >
> >>
> >> - In tree-based models, not handling categorical variables as such
> >>  hurts us a lot.
> >>  There's a PR to fix that, it still needs a bit of love:
> >>  https://github.com/scikit-learn/scikit-learn/pull/4899
> >>
> >
> > This is a conversation moved from
> > https://github.com/scikit-learn/scikit-learn/pull/4899 .
> >
> > In the light of the comment above and comments in the PR, I was
> > wondering what changes are needed to make
> > RandomForestClassifier competitive with xgboost and H2O at
> > http://datascience.la/benchmarking-random-forest-implementations/ .
> >
> > Raphael
> >
> >
> > ------------------------------------------------------------------------------
> > Transform Data into Opportunity.
> > Accelerate data analysis in your applications with
> > Intel Data Analytics Acceleration Library.
> > Click to learn more.
> > http://pubads.g.doubleclick.net/gampad/clk?id=278785351&iu=/4140
> > _______________________________________________
> > Scikit-learn-general mailing list
> > Scikit-learn-general@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general