Hey everybody.
While looking at the pruning PR, I wondered why we don't have
regularization path features in the tree module for the other regularization
methods, i.e. depth, min_samples_split and min_samples_leaf.
For all of these, the more strongly regularized models are submodels of the
less regularized ones, so computing just the least regularized one
could basically give you all of them.

This would make model selection on trees much faster and could also be
used in the ensemble module. Together with the out-of-bag (OOB) estimate,
this would enable us to do model selection more or less in a single go.

What do you think? I'm not sure what would be the easiest way to accomplish
this but I think it is worth investigating.

One possible way would be to build a normal model and then have "apply_tree"
or "predict_tree" check the regularization conditions, i.e. make them stop
at a certain depth (or rather have separate methods that do that).
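
To make the idea concrete, here is a rough sketch in plain Python. It does
not use the actual tree module: the array layout and the names
(children_left, n_node_samples, predict_truncated, ...) are made up for
illustration, and it assumes every node, not only the leaves, stores a
prediction value. The tree is grown once without constraints and the
regularization conditions are only checked while descending.

```python
# Hypothetical array-based tree (NOT the real tree module's data structure).
# Node 0 is the root; LEAF marks "no child / no split feature".
LEAF = -1
children_left = [1, 3, LEAF, LEAF, LEAF]
children_right = [2, 4, LEAF, LEAF, LEAF]
feature = [0, 1, LEAF, LEAF, LEAF]
threshold = [0.5, 0.3, 0.0, 0.0, 0.0]
# Assumption: the mean target is stored at EVERY node, so any node can
# act as a leaf when the descent is cut short.
value = [0.625, 0.25, 1.0, 0.0, 0.5]
n_node_samples = [8, 4, 4, 2, 2]


def predict_truncated(x, max_depth=None, min_samples_split=2):
    """Descend the fully grown tree, stopping early where a more
    regularized tree would not have split."""
    node, depth = 0, 0
    while children_left[node] != LEAF:
        # Depth limit: treat the current node as a leaf.
        if max_depth is not None and depth >= max_depth:
            break
        # Too few training samples to have been split under this setting.
        if n_node_samples[node] < min_samples_split:
            break
        if x[feature[node]] <= threshold[node]:
            node = children_left[node]
        else:
            node = children_right[node]
        depth += 1
    return value[node]


# One fitted tree, a whole path of depth settings for the same sample.
x = [0.2, 0.9]
depth_path = [predict_truncated(x, max_depth=d) for d in range(3)]
```

Stopping on depth or min_samples_split only cuts branches off, so the
truncated tree is a subtree of the full one. Whether min_samples_leaf can
be handled by the same kind of check is something I have not worked
through here.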

Does that make sense?
Gilles, as you are just working on this part, what do you think?

Cheers,
Andy
