Hi,

Scikit-learn does not cover this problem.
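
For illustration only, here is a minimal sketch of how a truncated regression could be fitted by hand with NumPy and SciPy. It is not a scikit-learn API. It assumes a linear model with Gaussian noise and a known lower threshold below which samples are missing entirely. The function name `fit_truncated_regression`, the threshold and the synthetic data are all made up for this example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def fit_truncated_regression(X, y, lower):
    """Maximum-likelihood fit of y = intercept + X @ coef + Gaussian noise,
    assuming only samples with y > lower were recorded (left truncation).
    Hypothetical helper written for this example."""
    X = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column

    def neg_log_likelihood(params):
        beta, log_sigma = params[:-1], params[-1]
        sigma = np.exp(log_sigma)  # optimise log(sigma) so sigma stays positive
        mu = X @ beta
        # Gaussian log-density of each observed y, renormalised by the
        # probability P(y > lower) of the sample being observed at all.
        ll = norm.logpdf(y, loc=mu, scale=sigma) - norm.logsf(lower, loc=mu, scale=sigma)
        return -ll.sum()

    # Start the optimiser from the ordinary least-squares solution.
    beta0, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma0 = np.std(y - X @ beta0)
    x0 = np.append(beta0, np.log(sigma0))
    result = minimize(neg_log_likelihood, x0, method="BFGS")
    return result.x[:-1], np.exp(result.x[-1])


# Synthetic data (assumed for the example): true intercept 1, slope 2, sigma 1.
rng = np.random.default_rng(0)
x_all = rng.normal(size=5000)
y_all = 1.0 + 2.0 * x_all + rng.normal(size=5000)

# Simulate the detection limit: samples with y <= 0 are never recorded.
keep = y_all > 0.0
beta_hat, sigma_hat = fit_truncated_regression(x_all[keep, None], y_all[keep], lower=0.0)
print("intercept, slope:", beta_hat, "sigma:", sigma_hat)
```

The only difference from ordinary least squares is the `norm.logsf` term, which accounts for the samples that could never be observed. An upper threshold would use `norm.logcdf` instead.
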
I think that it relates to what is called survival analysis. You'll find a
survival analysis package in Python at
https://lifelines.readthedocs.io/en/latest/

Best,

Gaël

On Tue, Jun 08, 2021 at 04:22:14PM +0900, Francois Berenger wrote:
> Hello,
>
> https://en.wikipedia.org/wiki/Truncated_regression_model
>
> Sometimes, data have missing samples when the target variable
> is above or below a threshold value.
> This is very often the case for biochemical data (e.g. target
> variable outside detection range of some lab equipment).
>
> I highly suspect some specific models could handle such datasets
> better than generic methods (i.e. train better models).
>
> Some points of entry, if that might help:
> - R has a truncreg package
>   https://cran.r-project.org/web/packages/truncreg/index.html
> - a related paper from the wikipedia page:
>   "Local likelihood estimation of truncated regression and
>   its partial derivatives: Theory and application"
>   https://hal.archives-ouvertes.fr/hal-00520650/file/PEER_stage2_10.1016%252Fj.jeconom.2008.08.007.pdf
>
> I can provide a cleaned public regression dataset, if someone is
> interested, for tests
> (there are many such datasets in ChEMBL and PubChem by the way, but you
> need to know how to "featurize"/encode molecules).
>
> Regards,
> F.
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn

-- 
Gael Varoquaux
Research Director, INRIA
Visiting professor, McGill
http://gael-varoquaux.info
http://twitter.com/GaelVaroquaux