Causal forests are very nice work. However, they address causal
inference rather than prediction, so I am not sure how we could fit
them into the scikit-learn API. Do you have a suggestion?
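
To make the API question concrete, here is a minimal sketch, not a proposal. It is a plain T-learner (one regression forest per treatment arm), which is a much simpler technique than the causal forest of the paper and carries none of its inference guarantees. The class name `TLearnerForest`, the `treatment` argument, and the `predict_effect` method are all hypothetical names chosen for illustration; only `RandomForestRegressor` is existing scikit-learn API. The point is that the treatment indicator and the effect estimate have no standard slot in `fit(X, y)` / `predict(X)`.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


class TLearnerForest:
    """Hypothetical T-learner: one regression forest per treatment arm.

    NOT the causal forest from the paper; it only shows that a treatment
    indicator has to be passed alongside X and y.
    """

    def __init__(self, n_estimators=100, random_state=None):
        self.n_estimators = n_estimators
        self.random_state = random_state

    def fit(self, X, y, treatment):
        # `treatment` is a 0/1 array; this third argument is an assumption,
        # not part of the usual fit(X, y) signature.
        X, y = np.asarray(X), np.asarray(y)
        treated = np.asarray(treatment).astype(bool)
        self.model_treated_ = RandomForestRegressor(
            n_estimators=self.n_estimators, random_state=self.random_state
        ).fit(X[treated], y[treated])
        self.model_control_ = RandomForestRegressor(
            n_estimators=self.n_estimators, random_state=self.random_state
        ).fit(X[~treated], y[~treated])
        return self

    def predict_effect(self, X):
        # Estimated effect = predicted outcome if treated minus if untreated.
        X = np.asarray(X)
        return self.model_treated_.predict(X) - self.model_control_.predict(X)


# Synthetic data: the treatment only has an effect when the first feature > 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
treatment = rng.integers(0, 2, size=2000)
true_effect = np.where(X[:, 0] > 0, 2.0, 0.0)
y = X[:, 1] + true_effect * treatment + rng.normal(scale=0.1, size=2000)

model = TLearnerForest(n_estimators=50, random_state=0).fit(X, y, treatment)
effect = model.predict_effect(X)
```

Here `effect` is a per-sample estimate of a quantity that is never observed directly, so the usual scoring against `y` does not apply as is.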
Cheers,
Gaël
On Fri, May 24, 2019 at 05:21:50PM -0400, Randy Ellis wrote:
> Would this be difficult for a moderate user to implement in sklearn by
> modifying the existing code base?
> Estimation and Inference of Heterogeneous Treatment Effects using Random
> Forests
> 342 citations in less than a year (Google Scholar):
> https://amstat.tandfonline.com/doi/full/10.1080/01621459.2017.1319839
> "In this article, we develop a nonparametric causal forest for estimating
> heterogeneous treatment effects that extends Breiman’s widely used random
> forest algorithm. In the potential outcomes framework with unconfoundedness,
> we show that causal forests are pointwise consistent for the true treatment
> effect and have an asymptotically Gaussian and centered sampling
> distribution. We also discuss a practical method for constructing asymptotic
> confidence intervals for the true treatment effect that are centered at the
> causal forest estimates. Our theoretical results rely on a generic Gaussian
> theory for a large family of random forest algorithms. To our knowledge,
> this is the first set of results that allows any type of random forest,
> including classification and regression forests, to be used for provably
> valid statistical inference. In experiments, we find causal forests to be
> substantially more powerful than classical methods based on nearest-neighbor
> matching, especially in the presence of irrelevant covariates."
--
Gael Varoquaux
Senior Researcher, INRIA
http://gael-varoquaux.info http://twitter.com/GaelVaroquaux