thanks paolo, will give all of this a try.
i'll also send a pr with a section on patterns for sklearn. although this
pattern might be specific to my problem domain, having more real-world
scripts/examples that reflect such considerations might be useful to the
community.
cheers,
satra
On Sun, Mar 25, 2012 at 9:32 AM, Paolo Losi <[email protected]> wrote:
> Hi Satrajit,
>
> On Sun, Mar 25, 2012 at 3:02 PM, Satrajit Ghosh <[email protected]> wrote:
> > hi giles,
> >
> > when dealing with skinny matrices (few samples x many features), what
> > are the recommendations for extra trees in terms of max_features and
> > the number of estimators?
>
> as far as number of estimators (trees) is concerned ... the higher the
> better.
>
> 100 is a reasonable default but if you are in a n << p setting it may
> be too low.
>
> for max_features I would suggest performing a hyperparameter search over
> 1, 2, 4, 8, ..., p
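A minimal sketch of that search, assuming an ExtraTreesClassifier and a grid of powers of two up to p; the dataset, sizes, and parameter values below are illustrative only:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV

# toy n << p problem: 60 samples, 256 features
X, y = make_classification(n_samples=60, n_features=256, n_informative=8,
                           random_state=0)
p = X.shape[1]

# candidate max_features values: 1, 2, 4, 8, ..., p
candidates = [2 ** k for k in range(int(np.log2(p)) + 1)]

# n_estimators kept small here for speed; as noted above, in an
# n << p setting you would typically want considerably more trees
search = GridSearchCV(
    ExtraTreesClassifier(n_estimators=100, random_state=0),
    param_grid={"max_features": candidates},
    cv=3,
)
search.fit(X, y)
print(search.best_params_)
```

Each candidate is evaluated by cross-validation, so the selected value reflects generalization rather than training fit.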
>
> > also, if many of the features are nuisance variables and most are
> > noisy, are there any recommendations for feature reduction using extra
> > trees themselves?
>
> You could rank features by feature importance and perform recursive
> feature elimination (at each iteration, drop the 10% least important
> features)
>
> Ciao
> Paolo
>
>
> ------------------------------------------------------------------------------
> This SF email is sponsored by:
> Try Windows Azure free for 90 days Click Here
> http://p.sf.net/sfu/sfd2d-msazure
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>