[heh, worse than sending a message without an attachment is hitting
send before the message is done and sending a message without text...
sorry]

On Wed, Oct 15, 2008 at 7:59 PM, Robert DeLisle <[email protected]> wrote:
>
> As you know, I've been working with descriptors in RDKit, and I think I've
> found a bug in the calculation of H-bond Acceptors.  Attached is an example
> structure, N-methyl-1H-indole-6-carboxamide.  When I calculate NumHAcceptors
> for this structure, I get 3.  I've looked at numerous other structures and it
> seems that nitrogens are always counted.  I went into the code and found the
> definitions used for HAcceptors:

Here's a simpler case showing the same behavior:
[15] >>> m2 = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')

[16] >>> Lipinski.NumHAcceptors(m2)
Out[16]: 3

So that confirms the unexpected count of 3.

>
> $([O,S;H1;v2]-[!$(*=[O,N,P,S])])
> $([O,S;H0;v2])
> $([O,S;-])
> $([N&v3;H1,H2]-[!$(*=[O,N,P,S])])
> $([N;v3;H0])
> $([n,o,s;+0])
> F
>
> Unless I'm misinterpreting the SMARTS (a very good possibility), both NH
> groups are being counted as an acceptor due to matching
> $([N&v3;H1,H2]-[!$(*=[O,N,P,S])]), but shouldn't the amide NH be excluded
> according to this same definition?

[20] >>> m2.GetSubstructMatches(Chem.MolFromSmarts('[$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])]'))
Out[20]: ((1,),)

That only matches one nitrogen: the amide nitrogen (atom 1). The
aromatic N matches the second-to-last definition:
[29] >>> m2.GetSubstructMatches(Chem.MolFromSmarts('[$([n,o,s;+0])]'))
Out[29]: ((6,),)
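
To see where each of the three counted atoms comes from, the same check
can be run over every definition in the quoted list. This is only a
sketch: the definitions are copied from the message above, and the names
definitions and matches_by_definition are made up for illustration, not
part of RDKit.

```python
from rdkit import Chem

# Acceptor definitions copied from the quoted message.
definitions = [
    '$([O,S;H1;v2]-[!$(*=[O,N,P,S])])',
    '$([O,S;H0;v2])',
    '$([O,S;-])',
    '$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])',
    '$([N;v3;H0])',
    '$([n,o,s;+0])',
    'F',
]

m2 = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')

# Map each definition to the atom indices it matches in m2.
matches_by_definition = {}
for d in definitions:
    patt = Chem.MolFromSmarts('[%s]' % d)
    hits = [idx for (idx,) in m2.GetSubstructMatches(patt)]
    matches_by_definition[d] = hits
    print(d, hits)
```

The carbonyl O, the amide N and the aromatic N should each be picked up
by a different definition, which accounts for the total of 3.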

The problem is that the [N&v3;H1,H2] definition matches any N that is
single bonded to at least one atom that isn't doubly bonded to O, N, P,
or S. It does not exclude Ns that are also single bonded to an atom
that is doubly bonded to O, N, P, or S. So your secondary amide N
matches through its methyl neighbor. The problem isn't the matcher,
it's the definition.
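
One way to illustrate the difference is to move the exclusion onto the N
itself, so that any neighbor doubly bonded to O, N, P, or S disqualifies
it. The tightened pattern below is a hypothetical sketch for comparison
only, not necessarily the definition that will end up in RDKit:

```python
from rdkit import Chem

amide = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')
amine = Chem.MolFromSmiles('CNC')  # plain secondary amine for comparison

# Definition as quoted: satisfied by ANY one acceptable neighbor.
original = Chem.MolFromSmarts('[$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])]')

# Hypothetical alternative: reject the N if ANY neighbor is
# doubly bonded to O, N, P, or S.
tightened = Chem.MolFromSmarts('[N&v3;H1,H2;!$(N-*=[O,N,P,S])]')

original_amide = amide.GetSubstructMatches(original)
tightened_amide = amide.GetSubstructMatches(tightened)
original_amine = amine.GetSubstructMatches(original)
tightened_amine = amine.GetSubstructMatches(tightened)

print('amide:', original_amide, tightened_amide)
print('amine:', original_amine, tightened_amine)
```

With this form the amide N should no longer match, while the amine N
still does.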

Is that clear?

I agree that this is a bug in the definition and will fix it. Would
you mind entering the bug at sf.net or should I do it?

-greg
