Some minor typos in my commands. They should be:

obabel failures.sdf -ocan -O sdf_to_can.txt
obabel failures.sdf -osmi -O sdf_to_smi.txt
obabel -ismi sdf_to_smi.txt -ocan -O smi_to_can.txt
diff sdf_to_can.txt smi_to_can.txt

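The final diff step can also be done per molecule rather than per line, which keeps the report readable if the two files ever get out of step. What follows is only a minimal sketch in plain Python, not part of Open Babel: the function names `parse_smiles_lines` and `roundtrip_failures` are made up for illustration. It assumes each output line is a SMILES string followed by whitespace and then the molecule title, and that titles are unique within a file.

```python
def parse_smiles_lines(lines):
    """Map each molecule title to its SMILES string.

    Assumes lines of the form "<SMILES><whitespace><title>"; blank lines
    are skipped. Titles are assumed to be unique within one file.
    """
    records = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first run of whitespace: SMILES first, title after.
        parts = line.split(None, 1)
        smiles = parts[0]
        title = parts[1] if len(parts) > 1 else ""
        records[title] = smiles
    return records


def roundtrip_failures(direct_lines, roundtrip_lines):
    """Return (title, direct, roundtrip) for every molecule whose two
    canonical SMILES strings differ.

    direct_lines    -- lines from the sdf -> can output
    roundtrip_lines -- lines from the sdf -> smi -> can output
    A molecule missing from the round-trip output is reported with None.
    """
    direct = parse_smiles_lines(direct_lines)
    roundtrip = parse_smiles_lines(roundtrip_lines)
    failures = []
    for title, smiles in direct.items():
        other = roundtrip.get(title)
        if other != smiles:
            failures.append((title, smiles, other))
    return failures
```

To use it, read sdf_to_can.txt and smi_to_can.txt into lists of lines and pass them in that order; the length of the returned list is the failure count.
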
- Noel

On 4 October 2010 22:10, Noel O'Boyle <baoille...@gmail.com> wrote:
> Hello all,
>
> Back on the 19/03/2009 I emailed to this list with the subject
> "Canonical SMILES performance" about a test set of around 18000
> PubChem 3D structures. I did the following analysis:
> (1) sdf -> can
> (2) sdf -> smi -> can
> (3) diff of (1) and (2)
>
> At that time, we had 1424 failures (8%), which wasn't great. According
> to a later email, the 22x branch finished with 190 failures.
>
> I've just redone the analysis - the download from PubChem has changed,
> but still has 18000 or so molecules
> (ftp://ftp.ncbi.nlm.nih.gov/pubchem/Compound_3D/SDF/Conformers_00000001_00025000.sdf.gz)
>
> Now we have only 5 failures. Pretty good by any measure.
>
> (There were two canonicalisation timeouts...I think we should add an
> option either to obabel, or to the canonical format, to set the
> timeout.)
>
> obabel failures.sdf -ocan -O sdf_to_can.txt
> obabel failures.sdf -osmi -O sdf_to_smi.txt
> obabel -ismi sdf_to_can.txt -ocan smi_to_can.txt
> diff sdf_to_can.txt sdf_to_smi.txt
>
> < c12=NCCN=c1ncnc2 167
> < N12CC[C@@H](CC1)CC2 7527
> < c12c3c(cc4c1c1c(nn2)c2c(cc1cc4)cccc2)cccc3 9107
> < c12c(c(c[nH]1)C[C@@h]1n3c...@h](C1)CC3)cccc2 21918
> < c\1(=c/2\[n+](=O)cccc2)/n(cccc1)[O-] 23699
> ---
>> C12=NCCN=C1NCNC2 167
>> n12c...@h](CC1)CC2 7527
>> c12c3c(cc4c1c1c([nH][nH]2)c2c(cc1cc4)cccc2)cccc3 9107
>> c12c(c(c[nH]1)C[C@@H]1N3CC[C@@H](C1)CC3)cccc2 21918
>> C1(C2[N+](=O)CCCC2)N(CCCC1)[O-] 23699
>
> I make it two kekulization problems and two canonicalisation problems
> (both the same substructure). The fifth structure (23699) is a tough
> one.
>
> failures.sdf attached.
>
> - Noel
>

_______________________________________________
OpenBabel-Devel mailing list
OpenBabel-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/openbabel-devel