Hi Greg,

The recent updates to the way explicit hydrogens are handled in the RDKit nodes for KNIME (http://goo.gl/DK0FS) have dramatically increased the number of correct matches that we observe when using the PAINS filters workflow (http://goo.gl/T9mT2).
Against the reference set from WEHI, we're now seeing 652 matches (up from 329), but we also now get 231 false positives where we were getting none before.
Attached is a tab-separated file containing the mismatches, with columns regID, smiles, smarts, smartsID.
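In case it helps with triage, here is a minimal sketch of how the mismatches could be tallied per SMARTS pattern to see which patterns account for most of the false positives. It assumes the file has exactly the four columns listed above, in that order, with no header row; the function name count_mismatches_by_smarts is hypothetical and the sample rows are made up for illustration, not taken from the attachment.

```python
import csv
import io
from collections import Counter


def count_mismatches_by_smarts(handle):
    """Tally rows per smartsID from a tab-separated stream.

    Assumed column order (no header): regID, smiles, smarts, smartsID.
    """
    counts = Counter()
    # QUOTE_NONE so that characters inside SMILES/SMARTS are never
    # interpreted as CSV quoting.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed lines
        reg_id, smiles, smarts, smarts_id = row
        counts[smarts_id] += 1
    return counts


# Made-up rows for illustration only.
sample = "R1\tCCO\t[OX2H]\tp1\nR2\tCCN\t[NX3]\tp2\nR3\tOCC\t[OX2H]\tp1\n"
counts = count_mismatches_by_smarts(io.StringIO(sample))
for smarts_id, n in counts.most_common():
    print(smarts_id, n)
```

To run it on the attachment, the StringIO object could be swapped for open("RDKIT2-231.txt", newline=""). Sorting by count should show quickly whether the false positives come from a handful of patterns or are spread across many.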
The SMARTS strings come from Raj's blog: http://blog.rguha.net/?p=850

Let us know if you need additional info to diagnose what's going on.

--
Cheers,
Simon
Attachment: RDKIT2-231.txt
_______________________________________________ Rdkit-discuss mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

