Just for the record, one or two other changes I've been making: (1) Sort fragments by number of atoms, then number of bonds, then length of SMILES string. Previously they were sorted by string length alone. This makes sure that more complicated fragments are matched first, e.g. a 5+3 fused ring system will come before cyclohexane (I decided not to use SSSR in the end - the number of bonds is simpler).
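The ordering in (1) can be sketched as follows. This is only an illustration in plain Python, not the actual fragment-generation code: the Fragment class, its field names and the hand-entered atom/bond counts are assumptions made for the example, and a real version would presumably take the counts from the molecule object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical record for one ring fragment (names are illustrative)."""
    smiles: str
    num_atoms: int   # heavy-atom count, entered by hand here
    num_bonds: int   # bond count between heavy atoms, entered by hand here


def sort_fragments(fragments):
    """Order fragments so that the most complicated ones come first.

    Sort key, most significant first: number of atoms, number of bonds,
    length of the SMILES string. All three are sorted in descending order.
    """
    return sorted(
        fragments,
        key=lambda f: (f.num_atoms, f.num_bonds, len(f.smiles)),
        reverse=True,
    )


fragments = [
    Fragment("C1CC1", 3, 3),        # cyclopropane
    Fragment("C1CCCCC1", 6, 6),     # cyclohexane
    Fragment("c1cc[nH]c1", 5, 5),   # pyrrole: long string, but only 5 atoms
    Fragment("C1CC2CC2C1", 6, 7),   # 5+3 fused rings: same atoms, one more bond
]

ordered = [f.smiles for f in sort_fragments(fragments)]
by_length_only = [f.smiles for f in sorted(fragments, key=lambda f: len(f.smiles), reverse=True)]

print(ordered)
print(by_length_only)
```

With these example counts the 5+3 fused system sorts ahead of cyclohexane because it has one more bond for the same number of atoms. Sorting on string length alone puts pyrrole ahead of cyclohexane just because the bracketed [nH] makes its string longer.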
(2) Remove restrictions on Has3D(). This was eliminating, for example, cyclopropane, and indeed any perfectly planar fragment. I'm not sure why this restriction was present - maybe because some molecules were messed up?

(3) Don't consider any molecules that have spiro centres when creating the fragments. We don't want fragments with spiro centres; the regular fragments work fine for such molecules (the builder has code to deal with this situation).

- Noel

2009/12/15 Noel O'Boyle <[email protected]>:
> 2009/12/14 Geoffrey Hutchison <[email protected]>:
>>> So I thought about changing the fragment to be composed of C atoms and
>>> using that to derive the canonical representation.
>>
>> The problem is that if you change the fragment to C atoms, you have lots of
>> un-filled valence. Consider, changing the "N" in pyrrole (or S in thiophene)
>> to carbon. It's no longer aromatic. So with that approach (making fragments,
>> changing atoms to carbon, writing output), the resulting SMARTS patterns
>> weren't
>
> Got you.
>
>> I see your point. Perhaps the "best" solution is to look for similar SMARTS
>> in the post-processing step and merge them?
>
> Aha, finally a use for canonical SMARTS strings! Which I'm afraid we
> don't have. But I might be able to identify at least the simpler
> duplicates.
>
>> Craig, do you have another solution? Is there a wild-card or dummy atom for
>> the canonical SMILES?
>>
>> -Geoff
