Good point, Hans. I see that among the available descriptors there are NHOHCount and NOCount, which I assume correspond to Lipinski's donors and acceptors. There are also NumHAcceptors and NumHDonors, which I would expect to differ from the Lipinski versions in some way.
-Kirk

On Wed, Oct 15, 2008 at 1:19 PM, Hans Purkey <[email protected]> wrote:

> If the intention is to follow Lipinski's definitions of Hbond acceptors,
> then it should be a simple N+O count (look back at the original paper and
> that is how he defined it "for simplicity").
>
> However, if the descriptor is intended to match a more intuitive/realistic
> definition of HBA, then N-H shouldn't be a part of it.
>
> Hans
>
>
> On Oct 15, 2008, at 11:50 AM, Greg Landrum wrote:
>
>> [heh, worse than sending a message without an attachment is hitting
>> send before the message is done and sending a message without text...
>> sorry]
>>
>> On Wed, Oct 15, 2008 at 7:59 PM, Robert DeLisle <[email protected]>
>> wrote:
>>
>>> As you know, I've been working with descriptors in RDKit, and I think
>>> I've found a bug in the calculation of H-bond Acceptors. Attached is an
>>> example structure, N-methyl-1H-indole-6-carboxamide. When I calculate
>>> NumHAcceptors for this structure, I get 3. I've looked at numerous other
>>> structures and it seems that nitrogens are always counted. I went into
>>> the code and found the definitions used for HAcceptors:
>>>
>>
>> Here's a simpler case showing the same behavior:
>> [15] >>> m2 = Chem.MolFromSmiles('CNC(=O)c1c[nH]cc1')
>>
>> [16] >>> Lipinski.NumHAcceptors(m2)
>> Out[16]: 3
>>
>> so that confirms the wrong count
>>
>>
>>> $([O,S;H1;v2]-[!$(*=[O,N,P,S])])
>>> $([O,S;H0;v2])
>>> $([O,S;-])
>>> $([N&v3;H1,H2]-[!$(*=[O,N,P,S])])
>>> $([N;v3;H0])
>>> $([n,o,s;+0])
>>> F
>>>
>>> Unless I'm misinterpreting the SMARTS (a very good possibility), both NH
>>> groups are being counted as an acceptor due to matching
>>> $([N&v3;H1,H2]-[!$(*=[O,N,P,S])]), but shouldn't the amide NH be excluded
>>> according to this same definition?
>>>
>>
>> [20] >>> m2.GetSubstructMatches(Chem.MolFromSmarts('[$([N&v3;H1,H2]-[!$(*=[O,N,P,S])])]'))
>> Out[20]: ((1,),)
>>
>> Only matches one nitrogen... the amide nitrogen. The aromatic N
>> matches the second but last definition:
>> [29] >>> m2.GetSubstructMatches(Chem.MolFromSmarts('[$([n,o,s;+0])]'))
>> Out[29]: ((6,),)
>>
>> The problem is that the first definition matches an N that is single
>> bonded to an atom that isn't doubly bonded to O,N,P, or S. It does not
>> exclude Ns that are single bonded to an atom that is doubly bonded to
>> O,N,P, or S. So your amide with a secondary N matches. The problem
>> isn't the matcher, it's the definition.
>>
>> Is that clear?
>>
>> I agree that this is a bug in the definition and will fix it. Would
>> you mind entering the bug at sf.net or should I do it?
>>
>> -greg
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
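
The "any neighbor" versus "no neighbor" distinction Greg describes can be checked without RDKit. What follows is only a sketch under stated assumptions: the molecule is a hand-written toy graph of CNC(=O)c1c[nH]cc1 (atom indices in SMILES order), and every name in it (ATOMS, BONDS, matches_original, matches_stricter) is made up for illustration. The stricter predicate is one possible reading of what a corrected definition should do, not the fix that was actually applied in RDKit.

```python
# Toy graph for CNC(=O)c1c[nH]cc1; indices follow the SMILES atom order.
# Each atom: (element, is_aromatic, attached_hydrogens). Hypothetical data layout.
ATOMS = {
    0: ("C", False, 3),  # methyl carbon
    1: ("N", False, 1),  # amide nitrogen
    2: ("C", False, 0),  # carbonyl carbon
    3: ("O", False, 0),  # carbonyl oxygen
    4: ("C", True, 0),
    5: ("C", True, 1),
    6: ("N", True, 1),   # aromatic [nH]
    7: ("C", True, 1),
    8: ("C", True, 1),
}
# Bonds as (atom_a, atom_b, order); 1.5 marks an aromatic bond.
BONDS = [
    (0, 1, 1), (1, 2, 1), (2, 3, 2), (2, 4, 1),
    (4, 5, 1.5), (5, 6, 1.5), (6, 7, 1.5), (7, 8, 1.5), (8, 4, 1.5),
]
HETERO = {"O", "N", "P", "S"}


def neighbors(idx):
    """Yield (neighbor_index, bond_order) for every bond on atom idx."""
    for a, b, order in BONDS:
        if a == idx:
            yield b, order
        elif b == idx:
            yield a, order


def double_bonded_to_hetero(idx):
    """True if atom idx has a double bond to an aliphatic O, N, P or S."""
    return any(
        order == 2 and ATOMS[n][0] in HETERO and not ATOMS[n][1]
        for n, order in neighbors(idx)
    )


def is_candidate_n(idx):
    """Mimics [N&v3;H1,H2]: aliphatic N, total valence 3, one or two hydrogens."""
    element, aromatic, h_count = ATOMS[idx]
    if element != "N" or aromatic or h_count not in (1, 2):
        return False
    return h_count + sum(order for _, order in neighbors(idx)) == 3


def matches_original(idx):
    """Definition as posted: SOME single-bonded neighbor lacks a =O/N/P/S."""
    return is_candidate_n(idx) and any(
        order == 1 and not double_bonded_to_hetero(n)
        for n, order in neighbors(idx)
    )


def matches_stricter(idx):
    """Assumed stricter reading: NO single-bonded neighbor carries a =O/N/P/S."""
    return is_candidate_n(idx) and not any(
        order == 1 and double_bonded_to_hetero(n)
        for n, order in neighbors(idx)
    )


original_matches = [i for i in ATOMS if matches_original(i)]
stricter_matches = [i for i in ATOMS if matches_stricter(i)]
print(original_matches, stricter_matches)
```

In this toy model the amide nitrogen (index 1) satisfies the posted definition through its methyl neighbor, which agrees with Out[20] above, and it is rejected once the carbonyl neighbor is allowed to veto the match.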

