Hi Thomas,

On Sun, Feb 21, 2010 at 5:20 PM, Thomas G. Kristensen <t...@cs.au.dk> wrote:
> Here's the output
> Cannot percieve atom type for the 13th atom: N
> Molecule(136998678, ... omitted for brevity ...)

SMILES parsing requires atom typing to be performed. In this process, the CDK tries to make sense of the chemistry represented in the SMILES. There are two reasons this can fail:

1. your SMILES is wrong
2. you hit a bug in the atom typing algorithm

> As you can see, the molecule is still parsed. The smiles strings were
> generated using openbabel, the molecules were taken from the DUD sets
> SDF files.
>
> Any ideas why this error occurs? Is the input still valid?

I cannot say without the SMILES itself. Assuming the SMILES from the example code is a failing one, I briefly ran it through Daylight's Depict [0], and that confirms that this SMILES is indeed invalid:

C[C@@H](C(=O)n1cc...@h]1c(=O)O)[NH2][C@@H](CCc1ccccc1)C(=O)O

gives

WARNING: Atom has unusual valence 4 (normal 5) (dy_rmbord)
WARNING: ...1cc...@h]1c(=O)O)[NH2][C@@H](CCc1ccccc1)... (dy_rmbord)
WARNING: ^^^^^ (dy_rmbord)

Egon

0. http://www.daylight.com/daycgi/depict (the de facto SMILES format checker)

-- 
Post-doc @ Uppsala University
Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
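
P.S. A rough illustration of the valence problem Depict is flagging, presumably at the [NH2] atom: in the visible part of the string, that nitrogen sits between two carbons and carries two explicit hydrogens, which adds up to valence 4 on an uncharged N. The sketch below is plain Python, not CDK or Daylight code; `NORMAL_VALENCES` and `has_unusual_valence` are made-up names, and the valence table is a deliberately simplified assumption.

```python
# Simplified table of normal valences, keyed by (element, formal charge).
# This is an assumption for illustration; real toolkits such as the CDK
# use much richer atom-type tables.
NORMAL_VALENCES = {
    ("C", 0): {4},
    ("N", 0): {3, 5},
    ("N", 1): {4},   # ammonium-like nitrogen
    ("O", 0): {2},
}


def has_unusual_valence(element, charge, bond_orders, explicit_h):
    """Return True if the atom's valence is not in the simplified table.

    bond_orders: orders of the bonds to non-hydrogen neighbours.
    explicit_h:  hydrogen count written inside the bracket, e.g. 2 for [NH2].
    """
    valence = sum(bond_orders) + explicit_h
    return valence not in NORMAL_VALENCES[(element, charge)]


# [NH2] with two single bonds to carbon: 2 + 2 = 4 on a neutral N.
print(has_unusual_valence("N", 0, [1, 1], 2))  # True

# The same atom written as [NH2+] or as [NH] passes this check.
print(has_unusual_valence("N", 1, [1, 1], 2))  # False
print(has_unusual_valence("N", 0, [1, 1], 1))  # False
```

Whether [NH2+] or [NH] is the chemically right fix depends on the structure in the original SDF file; the check above only shows why the neutral [NH2] form is rejected.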