Dear Yu,

On Wed, Dec 16, 2015 at 1:19 PM, 杜宇 <y...@student.ecnu.edu.cn> wrote:
> I just achieved a ligand-based model with ChEMBL data. Is there any standard
> data set to evaluate models against other models?
> By comparison, I could depict a ROC curve to validate my model and convince
> others. But as far as I know, although ChEMBL is a large and superb positive
> data set, there is no negative data set for me to calculate the real true
> and false negative (i.e. proteins really validated by experiments don't
> interact with some ligands under some affinity value). All we used to
> simulate the negative data is random selection.

ChEMBL actually does have negative data: it reports binding affinities,
both low and high. But that is indeed not quite the same as testing against
targets ChEMBL does not report data for.

> Maybe I miss some references but I look forward to your advice and
> suggestions. And I would be appreciated if you can share your experience of
> evaluating chemoinformatic models.

Besides actually measuring new data, using decoy data sets, etc., I would
also focus on the interpretation of your model. What does your model say
about the chemistry that describes your interactions? Some CDK descriptors
are better suited for that than others, and there is also a bit to gain by
using certain modelling methods over others. Neural networks and deep
learning can give some insight via the inner layers (not always), but you
can also think about coefficient vectors in PLS, variable selection, and
grouping of compounds with supervised self-organizing maps.

Also, there is no need to look only at the ROC; at the very least, also
include bootstrapping to show that your model does more than random fitting.

Egon

-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: 0000-0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen

_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user
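
As an illustration of the bootstrapping suggestion in the message above, here is a minimal sketch in plain Python. It is not the method used by either correspondent: the percentile bootstrap is just one common choice, the function names `roc_auc` and `bootstrap_auc` are made up for this example, and the labels and scores are hypothetical toy values (1 = active, 0 = inactive). The idea is to report a confidence interval for the ROC AUC and check whether it stays clear of 0.5, the value expected from random ranking.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.

    Compounds are resampled with replacement; resamples that contain only
    one class are skipped because the AUC is undefined for them.
    """
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_resamples)]
    upper = aucs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical toy data: model scores for five actives and five inactives.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1, 0.05]

print("AUC:", roc_auc(labels, scores))  # 0.96 for this toy data
print("95% bootstrap interval:", bootstrap_auc(labels, scores))
```

If the lower bound of the interval sits close to or below 0.5, the ranking is hard to distinguish from chance on that data set. A related check, shuffling the activity labels and refitting the model, tests the fitting procedure itself rather than the ranking, and is a different technique from the one sketched here.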