Hi all,

I was checking the mailing list archive to see whether there had been any past attempts to incorporate Conditional Inference Trees into the Ensemble module. I found a mail from Theo Strinopoulos (07-07-2013) asking whether this would be welcome as a contribution from him. Gilles Louppe replied that it would be, very much so, but that the Tree module was being rewritten and Theo should wait a bit longer.
Does anyone know what happened with this initiative?

I've been working on RF-based feature selection methods for the past few months, and realized that what several people have pointed out earlier might be true :) Namely, that information-based decision criteria like Gini and entropy favor variables with larger cardinality, and that RF isn't terribly good at dealing with correlated predictors. This is what they found here:
http://www.biomedcentral.com/1471-2105/8/25
and I think this is what Gilles' thesis concludes as well (please correct me if I've misunderstood your work):
http://www.montefiore.ulg.ac.be/~glouppe/pdf/phd-thesis.pdf

Gilles proposed that limiting the max_depth of the trees might help; however, neither this nor using ExtraTrees made a substantial difference in my experiments. The paper above shows with simulation studies that using Conditional Inference Trees as base learners in the ensemble might ameliorate these issues, if it's coupled with subsampling without replacement instead of the traditional bootstrapping.

So I was wondering whether either of these two things is available in some bleeding-edge form, or maybe in someone's private branch? When I naively checked the Ensemble and Tree code on GitHub, hoping I could contribute and implement these, I must admit I shied away from it quite quickly due to my lack of C and Cython knowledge.

Thanks in advance for any help!

Cheers,
Daniel

PS: I know R has party, which has ctrees, but it's non-parallel and really slow, so it would be amazing if scikit had this, I think.