Would this be difficult for a moderately experienced user to implement in sklearn by modifying the existing code base?

Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
342 citations in less than a year (Google Scholar):
https://amstat.tandfonline.com/doi/full/10.1080/01621459.2017.1319839

"In this article, we develop a nonparametric *causal forest* for estimating heterogeneous treatment effects that extends Breiman’s widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates."

--
*Randall J. Ellis*
PhD Student, Hurd lab <http://labs.neuroscience.mssm.edu/project/hurd-lab/>, Mount Sinai School of Medicine
Special Volunteer, Michaelides lab <http://www.michaelideslab.org/>, NIDA IRP
Phone: +1-954-260-9891
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn