Re: [Scikit-learn-general] Question and comments on RandomForests

Gilles Louppe Tue, 10 Jan 2012 06:22:41 -0800

> The current code works great for me (thanks for contributing!!!!),
> still it would mean a lot if I could make it even faster. At the moment
> it takes me about 8 hours to grow a tree with only a subset of the
> features that I actually want to use.... I have a 128 core cluster here
> but then building a forest with 1000 trees would still take roughly
> 6 days....


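Did you stick to random forests? They are much slower to build than
extra-trees (because they search for the best splits, while in
extra-trees the split thresholds are drawn at random). The two methods
are also comparable in terms of accuracy. In addition, in my experience,
bootstrap doesn't help on large datasets. You can turn it off (with RFs,
as long as max_features << n_features). Without bootstrap, the random
subset of features considered at each split is the only thing that makes
the trees of an RF differ from one another, hence that condition.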

Gilles
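
A rough sketch of the two suggestions above, for illustration only. It
fits extra-trees and a random forest with bootstrap turned off on a small
synthetic dataset and prints the test accuracy of each. The dataset, its
size, n_estimators and random_state are placeholders, not values from this
thread, and the class and parameter names are those of recent scikit-learn
releases, which may differ from the version discussed here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the sizes are arbitrary placeholders.
X, y = make_classification(
    n_samples=2000, n_features=50, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Settings shared by both ensembles. max_features="sqrt" keeps
# max_features well below n_features, as the advice above requires.
common = dict(
    n_estimators=50,
    max_features="sqrt",
    bootstrap=False,   # every tree is grown on the full training set
    n_jobs=-1,         # use all local cores
    random_state=0,
)

models = {
    "extra-trees": ExtraTreesClassifier(**common),
    "random forest, no bootstrap": RandomForestClassifier(**common),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: {scores[name]:.3f}")
```

Timing the two fit calls on your own data (for example with
time.perf_counter) would show whether the speed difference matters in
your case; nothing here measures it.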

