On 01/10/2012 03:21 PM, Gilles Louppe wrote:
>> The current code works great for me (thanks for contributing!!!!),
>> still it would mean a lot if I could make it even faster. At the moment
>> it takes me about 8 hours to grow a tree with only a subset of the
>> features that I actually want to use.... I have a 128 core cluster here
>> but then building a forest with 1000 trees would still take roughly
>> 6 days....
>>
> Did you stick to random forests? They are much slower than extra-trees
> (because they look for the best splits, while in extra-trees splits
> are drawn at random). They also compare to each other in terms of
> accuracy. In addition, from experience, on large to big datasets,
> bootstrap doesn't help. You can turn it off (as long as max_features
> << n_features, with RFs).
>

Up to now I used RandomForests. Thanks for the tips. I'll give it a try.
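
For anyone following along, here is a minimal sketch of what the tips above might look like in code. It is only an illustration: the synthetic data from make_classification, the dataset size, and n_estimators=20 are assumptions standing in for the real (much larger) problem, and the parameter names follow the current scikit-learn API, which may differ from the version discussed in this thread.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

# Synthetic stand-in data (assumption): the dataset from the thread is not available.
X, y = make_classification(
    n_samples=500, n_features=40, n_informative=10, random_state=0
)

# Random forest: searches for the best threshold on each candidate feature.
# bootstrap=False follows the tip above; max_features="sqrt" keeps
# max_features well below n_features so the trees still differ.
rf = RandomForestClassifier(
    n_estimators=20, max_features="sqrt", bootstrap=False,
    n_jobs=-1, random_state=0,
)

# Extra-trees: thresholds are drawn at random instead of optimised,
# which is why fitting is usually cheaper per tree.
et = ExtraTreesClassifier(
    n_estimators=20, max_features="sqrt", bootstrap=False,
    n_jobs=-1, random_state=0,
)

rf.fit(X, y)
et.fit(X, y)

# Predictions from both ensembles, one label per sample.
rf_pred = rf.predict(X)
et_pred = et.predict(X)
```

n_jobs=-1 spreads the trees over the cores of a single machine; spreading a forest over several cluster nodes is a separate problem that this sketch does not address. Whether extra-trees match the accuracy of random forests on a given dataset should be checked with held-out data.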
