I think you can find something more rigorous here:
http://orbi.ulg.ac.be/handle/2268/170309
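
For reference, the two suggestions from my earlier message (quoted below) could be tried roughly like this. It is only a minimal sketch on made-up toy data: the variables x0, x1, the noise columns, n_estimators=200 and PCA(n_components=2) are my own assumptions for illustration, not tested recommendations.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.RandomState(0)
n_samples = 500

# Hypothetical toy data: x0 drives the label, x1 is a near-copy of x0
# (highly correlated), and the last two columns are pure noise.
x0 = rng.normal(size=n_samples)
x1 = x0 + 0.05 * rng.normal(size=n_samples)
noise = rng.normal(size=(n_samples, 2))
X = np.column_stack([x0, x1, noise])
y = (x0 > 0).astype(int)

# 1) Prediction only: decorrelate the features with PCA before the forest.
pca_forest = make_pipeline(
    PCA(n_components=2),
    ExtraTreesClassifier(n_estimators=200, random_state=0),
)
pca_forest.fit(X, y)

# 2) Variable importance: shallow extra-trees, comparing max_features=1
#    with max_features=None (None means all n_features are considered).
importances = {}
for max_features in (1, None):
    forest = ExtraTreesClassifier(
        n_estimators=200,
        max_depth=3,
        max_features=max_features,
        random_state=0,
    )
    forest.fit(X, y)
    importances[max_features] = forest.feature_importances_

for max_features, values in importances.items():
    print("max_features =", max_features, np.round(values, 3))
```

Comparing how the importance is shared between x0 and x1 under the two max_features settings gives a feel for the effect being discussed, but a toy example like this is not a substitute for the analysis in the paper.
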
On Mon, Apr 27, 2015 at 11:20 PM, Daniel Homola <
daniel.homol...@imperial.ac.uk> wrote:
> Hi Luca,
>
> The reason I asked is because I'm interested in the second problem. Thanks
> a lot for the paper and the suggested params, I'll read it and try them!
>
> Has anyone tested these assumptions/parameters rigorously on simulated
> data, or is this more of a feeling?
>
> Thanks again for the quick and informative response!
> Best,
> Daniel
>
>
> On 27/04/15 20:43, Luca Puggini wrote:
>
> Hey,
> I spent quite some time on this problem.
>
> 1) If you are interested only in prediction, this is not a big problem.
> You can preprocess the data with PCA.
>
> 2) If you want to understand which variables are important, I suggest
> you read the paper "Understanding variable importances in forests of
> randomized trees".
> In general I suggest using ExtraTreesClassifier with max_depth=3 or 5.
> There is some debate about whether it is better to use max_features=1
> or max_features=n_features (I would go for the latter).
>
> I ran into some problems with the R package that you mention, so I
> would not use it.
>
> I hope this helps.
> Best,
> Luca
>
> On Mon, Apr 27, 2015 at 4:48 PM, Daniel Homola <
> daniel.homol...@imperial.ac.uk> wrote:
>
>> Dear all,
>>
>> I've found several articles expressing concerns about using Random
>> Forest with highly correlated features (e.g.
>> http://www.biomedcentral.com/1471-2105/9/307).
>>
>> I was wondering if this drawback of the RF algorithm could be somehow
>> remedied using scikit-learn methods? The above linked paper has an R
>> package but it's known to offer a super-slow solution to the problem.
>> When I thought about this problem (quite naively, as I'm at best an
>> enthusiastic beginner in ML) I thought maybe further randomisation in
>> the tree building might help with this. So would using
>> ExtraTreesClassifier provide some protection against this issue?
>>
>> Thanks a lot for any suggestions in advance!
>>
>> Cheers,
>> Daniel
>>
>>
>> ------------------------------------------------------------------------------
>> One dashboard for servers and applications across Physical-Virtual-Cloud
>> Widest out-of-the-box monitoring support with 50+ applications
>> Performance metrics, stats and reports that give you Actionable Insights
>> Deep dive visibility with transaction tracing using APM Insight.
>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general