Greetings,

I am working on a problem that involves predicting the binding affinity of small molecules to a receptor structure (this is a regression problem, not classification). I have multiple small datasets of molecules with measured binding affinities for the receptor, but each dataset was measured under different experimental conditions, so I cannot simply pool them into one training set. Rather than using them individually, I was wondering whether there is a method to combine them all into a single, larger training set.

The first approach I could think of is to convert the binding affinities to Z-scores within each dataset and then combine all the small datasets. But this would be inaccurate because, firstly, the datasets are very small (10-50 molecules each), so the mean and standard deviation of each one are poorly estimated, and secondly, the range of binding affinities differs between experiments (some datasets contain really strong binders, while others do not, etc.).

Is there any other approach to combine datasets whose values come from different sources? If someone could point me to the right reference, I could read it and work out whether it is applicable to my case.
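To make the Z-score idea concrete, here is a minimal sketch of what I mean, using only NumPy. The function name `zscore_per_dataset`, the dataset labels, and the affinity values are all made up for illustration; they are not real measurements.

```python
import numpy as np

def zscore_per_dataset(datasets):
    """Standardize affinities within each dataset, then pool them.

    `datasets` maps a (hypothetical) dataset label to a list of affinities.
    Returns the pooled Z-scores and the source label of each value.
    """
    pooled_values, pooled_labels = [], []
    for label, affinities in datasets.items():
        y = np.asarray(affinities, dtype=float)
        std = y.std(ddof=1)  # sample standard deviation
        if std == 0:
            raise ValueError(f"dataset {label!r} has zero variance")
        pooled_values.append((y - y.mean()) / std)
        pooled_labels.extend([label] * len(y))
    return np.concatenate(pooled_values), np.array(pooled_labels)

# Made-up example: one assay with weaker binders, one with stronger binders.
datasets = {
    "assay_A": [5.0, 6.0, 7.0],
    "assay_B": [8.0, 8.5, 9.0, 9.5],
}
z, source = zscore_per_dataset(datasets)
```

After this transformation both assays are centered at zero, so the fact that assay_B contains the stronger binders is lost, which is exactly the second problem I describe above.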
Thanks,
Thomas

--
======================================================================
Dr Thomas Evangelidis
Post-doctoral Researcher
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/2S049,
62500 Brno, Czech Republic

email: tev...@pharm.uoa.gr
       teva...@gmail.com
website: https://sites.google.com/site/thomasevangelidishomepage/