Hey everybody. While looking at the pruning PR, I wondered why we don't have regularization path features in the tree module for the other regularization methods, i.e. depth, min_samples_split and min_samples_leaf. For all of these, the more strongly regularized models are submodels of the less regularized ones, so computing just the least regularized one could basically give you all of them.
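
To make the nesting concrete for the depth case, here is a rough sketch, not a proposal for the actual implementation. It assumes the fitted-tree arrays exposed by current scikit-learn releases (tree_.children_left, tree_.children_right, tree_.feature, tree_.threshold, tree_.value); predict_truncated is a hypothetical helper name, not an existing function. It grows one unrestricted tree and then predicts as if the tree had been cut at a given depth, simply by stopping the descent early.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def predict_truncated(clf, X, max_depth):
    """Hypothetical helper: predict with a fitted single-output tree
    classifier as if the tree had been cut off at `max_depth`."""
    tree = clf.tree_
    # Fitted trees compare float32 inputs against their thresholds.
    X = np.asarray(X, dtype=np.float32)
    best = np.empty(X.shape[0], dtype=np.intp)
    for i, x in enumerate(X):
        node, depth = 0, 0
        # children_left[node] == -1 marks a leaf; also stop once the
        # requested depth is reached, treating that node as a leaf.
        while tree.children_left[node] != -1 and depth < max_depth:
            if x[tree.feature[node]] <= tree.threshold[node]:
                node = tree.children_left[node]
            else:
                node = tree.children_right[node]
            depth += 1
        # value[node][0] holds the per-class weights for the single output.
        best[i] = np.argmax(tree.value[node][0])
    return clf.classes_[best]


X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# One fitted tree, evaluated at every depth from the root down to the leaves.
for d in range(clf.tree_.max_depth + 1):
    acc = np.mean(predict_truncated(clf, X, d) == y)
    print(f"depth={d}: training accuracy={acc:.3f}")
```

Evaluating every depth this way costs one fit plus one cheap pass per depth. The same stopping test could in principle check the number of samples at a node for min_samples_split, though whether the truncated tree matches a separately trained one exactly is something I have not verified here.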
This would make model selection on trees much faster and could also be used in the ensemble module. Together with the OOB estimate, this would let us do model selection more or less in a single pass. What do you think?

I'm not sure what the easiest way to accomplish this would be, but I think it is worth investigating. One possible way would be to build a normal model and then have "apply_tree" or "predict_tree" check the regularization conditions, e.g. stop descending at a certain depth (or rather, have separate methods that do that). Does that make sense?

Gilles, as you are currently working on this part, what do you think?

Cheers,
Andy

_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
