If the intention is to follow Lipinski's definition of H-bond acceptors, then it should be a simple N+O count (look back at the original paper; that is how he defined it, "for simplicity").

However, if the descriptor is intended to match a more intuitive/realistic definition of HBA, then N-H shouldn't be a part of it.
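
For comparison, here is a minimal sketch of the simple N+O count on the example structure discussed below. It assumes an RDKit install in which Lipinski.NOCount is available; the variable names are only illustrative.

```python
from rdkit import Chem
from rdkit.Chem import Lipinski

# Same simplified structure used later in the thread.
mol = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')

# Lipinski-style acceptor count: every N and O atom, regardless of environment.
no_count = Lipinski.NOCount(mol)
print(no_count)
```

Under this definition both nitrogens count, which is exactly why it differs from a more realistic HBA definition.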

Hans

On Oct 15, 2008, at 11:50 AM, Greg Landrum wrote:

[heh, worse than sending a message without an attachment is hitting
send before the message is done and sending a message without text...
sorry]

On Wed, Oct 15, 2008 at 7:59 PM, Robert DeLisle <[email protected]> wrote:

As you know, I've been working with descriptors in RDKit, and I think I've found a bug in the calculation of H-bond acceptors. Attached is an example structure, N-methyl-1H-indole-6-carboxamide. When I calculate NumHAcceptors for this structure, I get 3. I've looked at numerous other structures and it seems that nitrogens are always counted. I went into the code and found the
definitions used for HAcceptors:

Here's a simpler case showing the same behavior:
[15] >>> m2 = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')

[16] >>> Lipinski.NumHAcceptors(m2)
Out[16]: 3

so that confirms the wrong count


$([O,S;H1;v2]-[!$(*=[O,N,P,S])])
$([O,S;H0;v2])
$([O,S;-])
$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])
$([N;v3;H0])
$([n,o,s;+0])
F

Unless I'm misinterpreting the SMARTS (a very good possibility), both NH
groups are being counted as an acceptor due to matching
$([N&v3;H1,H2]-[!$(*=[O,N,P,S])]), but shouldn't the amide NH be excluded
according to this same definition?

[20] >>> m2.GetSubstructMatches(Chem.MolFromSmarts('[$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])]'))
Out[20]: ((1,),)

That matches only one nitrogen: the amide nitrogen. The aromatic N
matches the second-to-last definition:
[29] >>> m2.GetSubstructMatches(Chem.MolFromSmarts('[$([n,o,s;+0])]'))
Out[29]: ((6,),)

The problem is that this definition matches an N that is single
bonded to at least one atom that isn't doubly bonded to O, N, P, or S.
It does not exclude Ns that are also single bonded to an atom that is
doubly bonded to O, N, P, or S. So your amide with a secondary N
matches through its methyl carbon. The problem isn't the matcher, it's
the definition.
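
The difference between "has some neighbor that is fine" and "has no neighbor that is a problem" can be sketched in plain Python on a hand-built graph of the amide fragment C-N-C(=O). This is only an illustration: the data structures and function names are hypothetical, it is not RDKit code, and it ignores the valence and hydrogen-count parts of the SMARTS.

```python
# Hypothetical hand-built graph for the fragment CH3-NH-C(=O): atom index -> element.
ELEMENTS = {0: "C", 1: "N", 2: "C", 3: "O"}
# (atom_a, atom_b) -> bond order
BONDS = {(0, 1): 1, (1, 2): 1, (2, 3): 2}
HETERO = {"O", "N", "P", "S"}


def neighbors(atom):
    """Yield (neighbor_index, bond_order) for every bond touching atom."""
    for (a, b), order in BONDS.items():
        if a == atom:
            yield b, order
        elif b == atom:
            yield a, order


def has_double_bond_to_hetero(atom):
    """True if atom is doubly bonded to O, N, P, or S."""
    return any(order == 2 and ELEMENTS[nbr] in HETERO
               for nbr, order in neighbors(atom))


def matches_as_written(n_atom):
    """Existential reading: SOME singly bonded neighbor lacks a =O/=N/=P/=S."""
    return any(order == 1 and not has_double_bond_to_hetero(nbr)
               for nbr, order in neighbors(n_atom))


def matches_as_presumably_intended(n_atom):
    """Stricter reading: NO singly bonded neighbor carries a =O/=N/=P/=S."""
    return not any(order == 1 and has_double_bond_to_hetero(nbr)
                   for nbr, order in neighbors(n_atom))


written = matches_as_written(1)
intended = matches_as_presumably_intended(1)
print(written, intended)
```

The amide N passes the first test because of its methyl neighbor, but fails the second because of the carbonyl carbon.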

Is that clear?

I agree that this is a bug in the definition and will fix it. Would
you mind entering the bug at sf.net or should I do it?

-greg

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss