Dear RDKit Team,

Firstly, thank you for making such a great tool available to the community and 
for continuing to develop it.

I have taken the SMARTS patterns for defining acidic and basic groups, 
documented here 
[http://www.rdkit.org/docs/GettingStartedInPython.html#feature-definitions-used-in-the-morgan-fingerprints].
(I appreciate that the SMARTS assign atom types.)

However, assuming a "basic" group means "protonated at a typical pH of 7 to 
7.4", I believe there is a mistake in the basic group SMARTS pattern.

The current pattern identifies the nitrogen atoms in the following compounds 
as basic: (1) aniline [c1ccccc1N]; (2) N-methylthiazolium [C[n+]1cscc1].

To the best of my knowledge, N-methylthiazolium has no atom that can accept a 
proton: its nitrogen is already quaternized and carries the positive charge.

Protonated aniline has a pKa of 4.6 in H2O 
[http://evans.rc.fas.harvard.edu/pdf/evans_pKa_table.pdf], so aniline is 
almost entirely unprotonated at pH 7 to 7.4.
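
To put a rough number on the aniline case, here is a minimal sketch using the 
Henderson-Hasselbalch relation. It assumes ideal dilute aqueous behaviour and 
uses only the pKa quoted above; the function name fraction_protonated is my 
own and is not part of RDKit.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a base present as its conjugate acid BH+ at a given pH.

    Henderson-Hasselbalch: [B]/[BH+] = 10 ** (pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


ANILINIUM_PKA = 4.6  # value quoted above for protonated aniline in water

for ph in (7.0, 7.4):
    print(f"pH {ph}: {fraction_protonated(ANILINIUM_PKA, ph):.2%} protonated")
```

Under these assumptions well under 1% of aniline is protonated across the 
pH 7 to 7.4 range.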

Hence, although I do not claim to have comprehensively incorporated all the 
information in the Evans pKa table, I suggest adapting the SMARTS as follows:
[$([N;H2&+0][$([C]);!$([C,a](=O))]),$([N;H1&+0]([$([C]);!$([C,a](=O))])[$([C]);!$([C,a](=O))]),$([N;H0&+0]([C;!$(C(=O))])([C;!$(C(=O))])[C;!$(C(=O))])]
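
A quick way to check the suggested pattern against the two problem compounds 
is sketched below. It assumes RDKit is installed; methylamine and 
triethylamine are extra illustrative cases I have added as simple aliphatic 
amines that I would expect to stay flagged as basic.

```python
from rdkit import Chem

# The suggested basic-group SMARTS, split into its three alternatives
# (primary, secondary, tertiary amine) purely for readability.
PRIMARY = "$([N;H2&+0][$([C]);!$([C,a](=O))])"
SECONDARY = "$([N;H1&+0]([$([C]);!$([C,a](=O))])[$([C]);!$([C,a](=O))])"
TERTIARY = "$([N;H0&+0]([C;!$(C(=O))])([C;!$(C(=O))])[C;!$(C(=O))])"
SUGGESTED_BASIC = "[" + ",".join([PRIMARY, SECONDARY, TERTIARY]) + "]"

pattern = Chem.MolFromSmarts(SUGGESTED_BASIC)
assert pattern is not None, "SMARTS failed to parse"

compounds = {
    "aniline": "c1ccccc1N",
    "N-methylthiazolium": "C[n+]1cscc1",
    "methylamine": "CN",            # illustrative addition
    "triethylamine": "CCN(CC)CC",   # illustrative addition
}

# True means at least one atom in the molecule matches the basic pattern.
matches = {
    name: Chem.MolFromSmiles(smiles).HasSubstructMatch(pattern)
    for name, smiles in compounds.items()
}

for name, is_basic in matches.items():
    print(f"{name}: {is_basic}")
```

With this pattern I would expect aniline and N-methylthiazolium to come out 
False and the two aliphatic amines True.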

Does that sound reasonable?

Best regards,

Richard


_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss