If the intention is to follow Lipinski's definition of H-bond
acceptors, then it should be a simple N+O count (look back at the
original paper; that is how he defined it "for simplicity").
However, if the descriptor is intended to match a more
intuitive/realistic definition of HBA, then N-H shouldn't be a part of it.
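
For reference, a minimal sketch of the simple N+O count, assuming a plain SMARTS match over nitrogen and oxygen atoms. The helper name `lipinski_style_hba` is made up for illustration and is not an RDKit function.

```python
from rdkit import Chem

# Matches any nitrogen or oxygen atom, whatever its environment.
N_OR_O = Chem.MolFromSmarts('[#7,#8]')


def lipinski_style_hba(mol):
    """Count N and O atoms (illustrative helper, not part of RDKit)."""
    return len(mol.GetSubstructMatches(N_OR_O))


# The simpler test molecule used elsewhere in this thread.
m2 = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')
n_acceptors = lipinski_style_hba(m2)  # amide N, carbonyl O, ring N
print(n_acceptors)
```

Under this counting rule every N and O contributes, so N-H groups are included by construction.
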
Hans
On Oct 15, 2008, at 11:50 AM, Greg Landrum wrote:
[heh, worse than sending a message without an attachment is hitting
send before the message is done and sending a message without text...
sorry]
On Wed, Oct 15, 2008 at 7:59 PM, Robert DeLisle
<[email protected]> wrote:
As you know, I've been working with descriptors in RDKit, and I think I've
found a bug in the calculation of H-bond acceptors. Attached is an example
structure, N-methyl-1H-indole-6-carboxamide. When I calculate NumHAcceptors
for this structure, I get 3. I've looked at numerous other structures and it
seems that nitrogens are always counted. I went into the code and found the
definitions used for HAcceptors:
Here's a simpler case showing the same behavior:
[15] >>> m2 = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')
[16] >>> Lipinski.NumHAcceptors(m2)
Out[16]: 3
So that confirms the wrong count of 3.
$([O,S;H1;v2]-[!$(*=[O,N,P,S])])
$([O,S;H0;v2])
$([O,S;-])
$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])
$([N;v3;H0])
$([n,o,s;+0])
F
Unless I'm misinterpreting the SMARTS (a very good possibility), both NH
groups are being counted as acceptors due to matching
$([N&v3;H1,H2]-[!$(*=[O,N,P,S])]), but shouldn't the amide NH be excluded
according to this same definition?
[20] >>> m2.GetSubstructMatches(Chem.MolFromSmarts('[$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])]'))
Out[20]: ((1,),)
That only matches one nitrogen: the amide nitrogen. The aromatic N
matches the second-to-last definition:
[29] >>> m2.GetSubstructMatches(Chem.MolFromSmarts('[$([n,o,s;+0])]'))
Out[29]: ((6,),)
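
The same check can be run over every alternative at once. This is a rough sketch, assuming the seven alternatives quoted above are the complete definition; `matches_per_definition` is a hypothetical helper, not an RDKit function.

```python
from rdkit import Chem

# The seven alternatives from the acceptor definition quoted in this thread.
DEFINITIONS = [
    '$([O,S;H1;v2]-[!$(*=[O,N,P,S])])',
    '$([O,S;H0;v2])',
    '$([O,S;-])',
    '$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])',
    '$([N;v3;H0])',
    '$([n,o,s;+0])',
    'F',
]


def matches_per_definition(mol):
    """Map each alternative to the atom indices it matches."""
    result = {}
    for definition in DEFINITIONS:
        # Wrap each alternative in brackets so it is a single-atom query.
        patt = Chem.MolFromSmarts('[' + definition + ']')
        result[definition] = [idx for (idx,) in mol.GetSubstructMatches(patt)]
    return result


m2 = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')
hits = matches_per_definition(m2)
for definition, atoms in hits.items():
    print(definition, atoms)
```

For m2 this should show the carbonyl O, the amide N, and the ring N each picked up by a different alternative, which is where the total of 3 comes from.
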
The problem is that the N-H definition matches an N that is singly
bonded to any atom that isn't doubly bonded to O, N, P, or S. It does not
exclude Ns that are also singly bonded to an atom that is doubly bonded to
O, N, P, or S. So your amide with a secondary N matches through its
methyl group. The problem isn't the matcher, it's the definition.
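
To make the difference concrete, here is a sketch comparing the quoted N-H alternative with a stricter variant. The stricter SMARTS is only an assumption about what a corrected definition might look like, not the fix that went into RDKit, and `n_hits` is a hypothetical helper.

```python
from rdkit import Chem

# N-H alternative exactly as quoted in this thread.
ORIGINAL = Chem.MolFromSmarts('[$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])]')

# Hypothetical stricter variant (an assumption, not the actual RDKit fix):
# reject the N if ANY neighbor is doubly bonded to O, N, P or S.
STRICTER = Chem.MolFromSmarts('[N&v3;H1,H2;!$(N-*=[O,N,P,S])]')


def n_hits(patt, smiles):
    """Return the atom indices in `smiles` matched by `patt`."""
    mol = Chem.MolFromSmiles(smiles)
    return [idx for (idx,) in mol.GetSubstructMatches(patt)]


sec_amide = 'CNC(=O)c1c[nH]cc1'  # secondary amide from the thread
prim_amide = 'CC(N)=O'           # acetamide: N has only the carbonyl neighbor
amine = 'CNC'                    # dimethylamine, a plain secondary amine

for name, smi in [('sec_amide', sec_amide), ('prim_amide', prim_amide),
                  ('amine', amine)]:
    print(name, n_hits(ORIGINAL, smi), n_hits(STRICTER, smi))
```

The quoted pattern already skips the primary amide, because its only neighbor is the carbonyl carbon, but it accepts the secondary amide via the methyl carbon. The stricter variant rejects both amides and still matches the plain amine.
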
Is that clear?
I agree that this is a bug in the definition and will fix it. Would
you mind entering the bug at sf.net or should I do it?
-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss