On Dec 27, 2010, at 5:41 AM, Greg Landrum wrote:
> Heh, I was wondering if you were going to take that one up. Knowing
> how much you enjoy (ab)using dot disconnects it seemed likely. :-)

And take it up I did. Here's the essay I just wrote about the technique.

  http://dalkescientific.com/writings/diary/archive/2010/12/28/reordering_smiles.html

I managed to work around the bug for the first and second versions of
the algorithm, but the workaround didn't work for the third. I instead
went over to OpenBabel for it.

One of the features I would like in a toolkit is the ability to say:

  format_atom(atom)
  format_bond(bond)

and get back the appropriate SMILES for that atom or bond. This would
include the logic for representing "[CH4]" vs "C", and if there's a
single bond between two aromatic atoms then it would return "-" instead
of "".

I ended up writing those myself, and found out that reporting the
isotope number is hard. As far as I can tell, the closest solution is:

  mass = atom.GetMass()
  if mass == int(mass):
      print("isotope is %d" % int(mass))
  else:
      print("isotope not specified")

but it isn't perfect, since this test also passes when no isotope was
given (presumably because the default mass for these elements is a
whole number) for:

  [Tc] [Pm] [Po] [At] [Rn] [Fr] [Ra] [Ac] [Np] [Pu]
  [Am] [Cm] [Bk] [Cf] [Es] [Fm] [Md] [No] [Lr]

Those aren't common in drugs, but it would still be nice to know if
there was a user-specified isotope number or not.

    Andrew
    [email protected]

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
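
P.S. To make the bond rule and the mass-based isotope test above concrete, here is a minimal sketch in plain Python. It does not use any toolkit. The names format_bond and guess_isotope, and their parameters, are hypothetical and exist only for illustration. The value 98.0 for [Tc] in the usage note is an assumption about how a toolkit might store the default mass, not something taken from RDKit or OpenBabel.

```python
# Hypothetical helpers for illustration only; these are not RDKit or
# OpenBabel functions, and the signatures are assumptions.

def format_bond(order, is_aromatic_bond, both_atoms_aromatic):
    """Return the SMILES bond symbol, or "" when it can be left implicit."""
    if is_aromatic_bond:
        # An aromatic bond between two aromatic atoms is implied.
        return ""
    if order == 1:
        # A single bond between two aromatic atoms (as in biphenyl) must
        # be written out, otherwise it would be read as aromatic.
        return "-" if both_atoms_aromatic else ""
    # Only double and triple bonds are handled in this sketch.
    return {2: "=", 3: "#"}[order]


def guess_isotope(mass):
    """Mass-based heuristic: a whole-number mass is taken to be an
    explicit isotope; anything else is treated as unspecified (None)."""
    if mass == int(mass):
        return int(mass)
    return None
```

With these definitions, format_bond(1, False, True) gives "-" and format_bond(1, False, False) gives "". guess_isotope(12.011) gives None and guess_isotope(13.0) gives 13, but guess_isotope(98.0) gives 98 even if the input was a plain [Tc], which is the false positive described above.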

