Hi Greg,

A few comments on your suggestions.

> 1) Should the renaming mentioned above (i.e. the NumHAcceptor and
> NumHDonor descriptors start returning the "official" Lipinski values
> and the existing functions are renamed to NumHAcceptorAlt and
> NumHDonorAlt) be done?

Personally, I would guess that most people would not expect to receive an N/O count if they are asking for H-donors and acceptors. Hence, I would probably use a different naming convention that includes the Lipinski specification (e.g. LipNumHAcc or similar). That way people will not get confused by very high counts for those values.

> 2) Is the above SMARTS reasonable for the more detailed HAcceptor definition?

As you say, they are very basic, but to me they look reasonable. If you actually want to tune them at a low level, then I would probably change the F definition to fluorines attached to aromatic rings only (I know there are a lot of papers out there that discuss this issue), but that's only me, and I would guess that over time people will fine-tune these definitions to their own liking anyway.

My 2 pence
Nik

"Greg Landrum" <[email protected]>
28.10.2008 06:55
To [email protected]
cc
Subject Re: [Rdkit-discuss] H-bond Acceptor problem

I wanted to make one more post on this topic, ask a couple of questions (at the bottom of the post), and give people a few days to comment before I regenerate the regression test data and commit a change for this bug.

On Wed, Oct 15, 2008 at 8:19 PM, Hans Purkey <[email protected]> wrote:
> If the intention is to follow Lipinski's definitions of Hbond acceptors,
> then it should be a simple N+O count (look back at the original paper and
> that is how he defined it "for simplicity").

For those who are coming to this late, this is the NOCount() descriptor, which is already present in the RDKit.

> However, if the descriptor is intended to match a more intuitive/realistic
> definition of HBA, then N-H shouldn't be a part of it.

I don't think I agree with this.
There are plenty of cases of nitrogens with attached Hs that act as H-bond acceptors (I did a CCD search yesterday to be sure), but that's a side topic.

Back to the main topic: since these descriptors are all defined in a module named "Lipinski", and since this is all qualitative anyway, I'd propose the following change:

The existing NumHDonors and NumHAcceptors (with fixes, discussed below) be renamed to NumHDonorsAlt and NumHAcceptorsAlt, and NOCount and NHOHCount be aliased to NumHAcceptors and NumHDonors. I'd then deprecate NOCount and NHOHCount (they will generate warnings when used in the next release and then be completely removed in the release after that).

For the purposes of fixing the more complex HAcceptor descriptor I propose the following SMARTS:

HAcceptorSmarts = Chem.MolFromSmarts('[$([O,S;H1;v2]-[!$(*=[O,N,P,S])]),\
$([O,S;H0;v2]),$([O,S;-]),\
$([N;v3;!$(n-...@[o,N,P,S])]),\
$([nH0,o,s;+0]),\
$([F;!$(F-*-F)])]')

There are two changes here: the third line and the last one. The third line includes nitrogens that have three neighbors and that are not connected to another atom that has a non-ring double bond to O, N, P, or S. The last line includes Fs that are not connected to another atom that has more than one F attached (to exclude CF3 and CF2).

I realize these are not highly tuned, very detailed definitions like those in the fdef file discussed elsewhere on this thread, but are they acceptable for a qualitative descriptor?

So, the two questions:

1) Should the renaming mentioned above (i.e. the NumHAcceptor and NumHDonor descriptors start returning the "official" Lipinski values and the existing functions are renamed to NumHAcceptorAlt and NumHDonorAlt) be done?

2) Is the above SMARTS reasonable for the more detailed HAcceptor definition?
Thanks for any feedback,
-greg
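
The deprecation step proposed in the message above (alias NOCount to NumHAcceptors, warn on use of the old name for one release, then remove it) could look roughly like the following. This is only a minimal sketch in plain Python, not the RDKit implementation: deprecated_alias is a hypothetical helper, and the NumHAcceptors body here is a toy stand-in that counts N and O in a list of element symbols rather than operating on a molecule.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Return a wrapper that calls new_func and warns that old_name is deprecated.

    Hypothetical helper for illustration only.
    """
    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return new_func(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


def NumHAcceptors(atom_symbols):
    """Toy stand-in (assumption): Lipinski-style acceptor count as a plain N+O count."""
    return sum(1 for symbol in atom_symbols if symbol in ("N", "O"))


# The old name keeps working for one release but emits a DeprecationWarning.
NOCount = deprecated_alias(NumHAcceptors, "NOCount")
```

With this pattern the old name and the new name always return the same value, so existing scripts keep producing identical numbers during the transition and only see the warning.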

