On 01/10/2012 03:21 PM, Gilles Louppe wrote:
>> The current code works great for me (thanks for contributing!), but it
>> would mean a lot if I could make it even faster. At the moment it takes
>> me about 8 hours to grow a tree with only a subset of the features that
>> I actually want to use. I have a 128-core cluster here, but building a
>> forest with 1000 trees would still take roughly 6 days.
>>      
> Did you stick to random forests? They are much slower than extra-trees
> (because they search for the best splits, while in extra-trees the
> splits are drawn at random). The two are also comparable in terms of
> accuracy. In addition, in my experience, bootstrapping doesn't help on
> large datasets. You can turn it off (as long as
> max_features << n_features, with RFs).
>    
Up to now I have used RandomForests.
Thanks for the tips. I'll give it a try.
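
To make the speed argument above concrete, here is a rough pure-Python sketch of the two split strategies on a single feature. It is not scikit-learn's implementation: the helper names (gini, split_impurity, best_split, random_split) and the toy data are made up for illustration. As far as I understand it, real extra-trees draw one random threshold per candidate feature and then keep the best of those draws; the sketch only shows the single-feature case.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def split_impurity(values, labels, threshold):
    """Weighted Gini impurity of the two children after splitting at threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)


def best_split(values, labels):
    """Random-forest style: score every midpoint between sorted unique values."""
    uniq = sorted(set(values))
    candidates = [(a + b) / 2.0 for a, b in zip(uniq, uniq[1:])]
    scored = [(split_impurity(values, labels, t), t) for t in candidates]
    # Return the lowest-impurity threshold and how many candidates were scored.
    return min(scored)[1], len(candidates)


def random_split(values, rng):
    """Extra-trees style: draw a single threshold uniformly in [min, max]."""
    return rng.uniform(min(values), max(values)), 1


# Hypothetical toy data: one feature, label is 1 when the value exceeds 0.5.
rng = random.Random(0)
values = [rng.random() for _ in range(200)]
labels = [1 if v > 0.5 else 0 for v in values]

best_t, n_scored_best = best_split(values, labels)
rand_t, n_scored_rand = random_split(values, rng)

print("best split  :", best_t, "after scoring", n_scored_best, "candidates")
print("random split:", rand_t, "after scoring", n_scored_rand, "candidate")
```

The exhaustive search scores one candidate per gap between distinct feature values at every node, while the random draw scores a single candidate, which is where the difference in training time comes from.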

