Hi Greg,
Digging around a bit more, I noticed there are at least two published SMARTS
definitions of a hydrogen bond acceptor. The first is by Gillet et al. (1998,
see below) and is also found on the Daylight web site; the second is by Gobbi
et al. (1998). Both versions appear to be deficient, for different reasons.
Gillet et al. exclude pyrrole nitrogen atoms but not amide nitrogen atoms,
whereas Gobbi et al. do the reverse. I don’t think either omission was
intentional, but rather an oversight.
Both amide and pyrrole nitrogen atoms should be excluded from the definitions
for precisely the same reason: the nitrogen lone pair is delocalized into the
adjacent pi system and is not available for hydrogen bonding. I also agree with
Gillet et al. that halogens and aromatic oxygen and sulfur atoms should be
excluded, since these are exceedingly weak hydrogen bond acceptors.
I also noticed that the HAcceptorSmarts descriptor from Chem.Lipinski uses the
Gobbi definition, but implements it slightly differently. In this version, a
pyrrole with a hydrogen attached to the nitrogen atom is excluded (as it
should be), but an N-alkyl pyrrole is not (this IMHO is incorrect).
What I think is needed is a hybrid definition that corrects both deficiencies.
I am not sure what to call it. Perhaps Gillet/Gobbi?
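
As a rough illustration of what such a hybrid could look like (a sketch only,
not a vetted definition): take the Gillet/Daylight pattern quoted below and add
the Gobbi-style N-*=[O,N,P,S] test as a second exclusion. The names GILLET,
HYBRID and acceptor_atoms are made up for this example.

```python
from rdkit import Chem

# Gillet/Daylight acceptor pattern, copied from the Daylight page quoted below.
GILLET = '[!$([#6,F,Cl,Br,I,o,s,nX3,#7v5,#15v5,#16v4,#16v6,*+1,*+2,*+3])]'

# Hypothetical hybrid (an assumption, not a published definition): the same
# atom expression plus an exclusion for nitrogen attached to an atom that
# carries a non-ring double bond to O, N, P or S (the amide-type test).
HYBRID = ('[!$([#6,F,Cl,Br,I,o,s,nX3,#7v5,#15v5,#16v4,#16v6,*+1,*+2,*+3]);'
          '!$(N-*=!@[O,N,P,S])]')


def acceptor_atoms(smiles, smarts):
    """Return the sorted atom indices matched by a one-atom acceptor SMARTS."""
    mol = Chem.MolFromSmiles(smiles)
    patt = Chem.MolFromSmarts(smarts)
    # Each match is a 1-tuple because the pattern describes a single atom.
    return sorted(idx for (idx,) in mol.GetSubstructMatches(patt))


if __name__ == '__main__':
    examples = [
        ('n1ccccc1', 'pyridine'),
        ('[nH]1cccc1', 'pyrrole'),
        ('Cn1cccc1', 'N-methylpyrrole'),
        ('C(=O)N', 'amide'),
    ]
    for smiles, name in examples:
        print(name, acceptor_atoms(smiles, GILLET), acceptor_atoms(smiles, HYBRID))
```

Note that the N-*= test is broader than amides alone: it would also drop, for
example, sulfonamide and amidine-type nitrogen atoms, which may or may not be
what is wanted.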
Cheers,
Konrad
-------------------------------------------------------------------------------------------------------------
From:
http://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html#H_BOND
Hydrogen-bond acceptor
[!$([#6,F,Cl,Br,I,o,s,nX3,#7v5,#15v5,#16v4,#16v6,*+1,*+2,*+3])]
A H-bond acceptor is a heteroatom with no positive charge, note that negatively
charged oxygen or sulphur are included. Excluded are halogens, including F,
heteroaromatic oxygen, sulphur and pyrrole N.
Which in turn is taken from:
Identification of biological activity profiles using substructural analysis and
genetic algorithms.
Gillet VJ, Willett P, Bradshaw J.
J Chem Inf Comput Sci. 1998 Mar-Apr;38(2):165-79.
Quote: HBA is defined as a heteroatom with no positive charge, excluding the
halogens, aromatic oxygen, sulfur, and pyrrole nitrogen and the higher
oxidation levels of nitrogen, phosphorus, and sulfur.
Table 1. SMARTS Definitions for Substructural Features
feature
HBD      [!#6;!H0]
HBA      [$([!#6;+0]);!$([F,Cl,Br,I]);!$([o,s,nX3]);!$([Nv5,Pv5,Sv4,Sv6])]
RB       [! $([NH]!@C()O))&!D1&!(*#*)]&!@[!$([NH]!@C()O))!D1&!(*#*)]
>>> from rdkit import Chem
>>> p = Chem.MolFromSmarts('[!$([#6,F,Cl,Br,I,o,s,nX3,#7v5,#15v5,#16v4,#16v6,*+1,*+2,*+3])]')
>>> m = Chem.MolFromSmiles('c1ccccc1')  # benzene
>>> m.HasSubstructMatch(p)  # correct
False
>>> m = Chem.MolFromSmiles('n1ccccc1')  # pyridine
>>> m.HasSubstructMatch(p)  # correct
True
>>> m = Chem.MolFromSmiles('[nH]1cccc1')  # pyrrole
>>> m.HasSubstructMatch(p)  # correct
False
>>> m = Chem.MolFromSmiles('C(=O)N')  # amide
>>> # correctly matches the oxygen atom (index 1) but incorrectly matches
>>> # the nitrogen atom (index 2)
>>> m.GetSubstructMatches(p)
((1,), (2,))
-------------------------------------------------------------------------------------------------------------
From: http://www.rdkit.org/Python_Docs/rdkit.Chem.Lipinski-pysrc.html
HAcceptorSmarts = Chem.MolFromSmarts('[$([O,S;H1;v2]-[!$(*=[O,N,P,S])]),\
$([O,S;H0;v2]),$([O,S;-]),\
$([N;v3;!$(N-*=!@[O,N,P,S])]),\
$([nH0,o,s;+0])\
]')
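
The pyrrole behaviour described at the top of this message can be checked with
a short script. This is a minimal sketch that uses the SMARTS string exactly as
quoted above rather than whatever HAcceptorSmarts the installed RDKit ships
(that may differ); LIPINSKI_HBA and n_acceptors are names made up here.

```python
from rdkit import Chem

# SMARTS copied from the Chem.Lipinski source quoted above. The installed
# RDKit may ship a different version of HAcceptorSmarts.
LIPINSKI_HBA = Chem.MolFromSmarts(
    '[$([O,S;H1;v2]-[!$(*=[O,N,P,S])]),'
    '$([O,S;H0;v2]),$([O,S;-]),'
    '$([N;v3;!$(N-*=!@[O,N,P,S])]),'
    '$([nH0,o,s;+0])]'
)


def n_acceptors(smiles):
    """Count the atoms matched by the quoted acceptor pattern."""
    mol = Chem.MolFromSmiles(smiles)
    return len(mol.GetSubstructMatches(LIPINSKI_HBA))


if __name__ == '__main__':
    for smiles, name in [('[nH]1cccc1', 'pyrrole'),
                         ('Cn1cccc1', 'N-methylpyrrole'),
                         ('C(=O)N', 'amide')]:
        print(name, n_acceptors(smiles))
```

The nH0 term is what lets the N-alkyl pyrrole nitrogen through: it only asks
for an aromatic nitrogen with no attached hydrogen, which is true of pyridine
and N-methylpyrrole alike.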