2009/12/16 Craig A. James <[email protected]>:
> Noel O'Boyle wrote:
>>
>> Actually just thought of a way to create canonical SMARTS string.
>> Craig, do you have code for generating all possible SMILES strings for
>> a particular molecule? If so, then I could do this for each fragment,
>> do string replacement to create the corresponding SMARTS, and pick the
>> 'canonical SMARTS' with the lowest alphabetical order.
>
> "All possible SMILES strings" is a virtually uncountable number for any
> moderately large molecule, so nobody has ever generated an algorithm for
> that (that I know of). It would only work for fairly small fragments, one
> or two rings, or fragments with low branching.

I should have said the set of all possible SMILES strings that OpenBabel will create when presented with atoms with all possible canonical labels (which is quite a bit smaller). In any case, it works well enough for our purposes: when it works, it eliminates duplicates; when it doesn't, it leaves us no worse off.

To put things in context, duplicates are not a major problem, but for a test set I find that about 10% are duplicates (or at least 10% are removed by this procedure). These 10% don't explicitly cause poorer performance, but it would be better to replace them with additional new fragments.

> Craig

_______________________________________________
OpenBabel-Devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openbabel-devel
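
For illustration, here is a minimal Python sketch of the duplicate-removal idea discussed in this thread. It assumes the SMILES variants for each fragment have already been generated elsewhere (for example by writing the fragment out under different atom labellings), and it does not call OpenBabel at all. The names `canonical_smarts`, `unique_fragments` and `to_smarts` are hypothetical, and the default `to_smarts` is an identity placeholder because the actual string-replacement rules are not spelled out here. "Lowest alphabetical order" is taken to mean Python's default string comparison (by code point), which is also an assumption.

```python
from typing import Callable, Dict, Iterable, List


def canonical_smarts(
    smiles_variants: Iterable[str],
    to_smarts: Callable[[str], str] = lambda s: s,  # placeholder: real rules not given
) -> str:
    """Return the lexicographically smallest SMARTS among the variants."""
    candidates = [to_smarts(s) for s in smiles_variants]
    if not candidates:
        raise ValueError("need at least one SMILES variant")
    # Python compares strings by code point; this stands in for
    # "lowest alphabetical order".
    return min(candidates)


def unique_fragments(
    fragments: Dict[str, List[str]],
    to_smarts: Callable[[str], str] = lambda s: s,
) -> Dict[str, str]:
    """Map each canonical SMARTS to the first fragment name that produced it.

    Fragments whose variant lists yield the same canonical SMARTS are
    treated as duplicates; later ones are dropped.
    """
    kept: Dict[str, str] = {}
    for name, variants in fragments.items():
        key = canonical_smarts(variants, to_smarts)
        kept.setdefault(key, name)  # keep only the first occurrence
    return kept


if __name__ == "__main__":
    # Hypothetical input: fragment name -> SMILES variants for that fragment.
    example = {
        "frag_a": ["CCO", "OCC"],
        "frag_b": ["OCC", "CCO"],  # same variants as frag_a, different order
        "frag_c": ["CCN", "NCC"],
    }
    print(unique_fragments(example))
```

If the variant lists for two equivalent fragments are incomplete and do not share their smallest string, the two fragments get different keys and both are kept, so a missed duplicate costs nothing beyond the status quo.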
