Hi Thomas,

On Sun, Feb 21, 2010 at 5:20 PM, Thomas G. Kristensen <t...@cs.au.dk> wrote:
> Here's the output
> Cannot percieve atom type for the 13th atom: N
> Molecule(136998678, ... omitted for brevity ...)

SMILES parsing requires atom typing to be performed. In this step,
the CDK tries to make sense of the chemistry represented in the
SMILES. There are two reasons this can fail:

1. your SMILES is wrong
2. you hit a bug in the atom typing algorithm
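
Either way, it helps to know which token the "13th atom" in the error message refers to. Here is a minimal sketch in plain Python (not CDK code) that numbers the atoms in a SMILES string from left to right. It assumes the message counts atoms from 1 in the order they are written, and it only recognises bracket atoms and the organic subset. The names `ATOM_TOKEN`, `atom_tokens` and `nth_atom` are made up for this illustration.

```python
import re

# Bracket atoms first, then two-letter organic-subset symbols, then
# single-letter aliphatic and aromatic symbols. Bonds, branches and
# ring-closure digits are simply skipped.
ATOM_TOKEN = re.compile(r"\[[^\]]*\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def atom_tokens(smiles):
    """Return (atom_number, token, string_offset) tuples, numbered from 1."""
    return [
        (number, match.group(0), match.start())
        for number, match in enumerate(ATOM_TOKEN.finditer(smiles), start=1)
    ]


def nth_atom(smiles, n):
    """Return (token, string_offset) for the n-th atom, counting from 1."""
    for number, token, offset in atom_tokens(smiles):
        if number == n:
            return token, offset
    raise IndexError(f"SMILES has fewer than {n} atoms")
```

For example, `nth_atom("C[NH2]C", 2)` picks out the `[NH2]` token at string offset 1. The string `"C[NH2]C"` is only a toy input, not taken from the DUD data.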

> As you can see, the molecule is still parsed. The smiles strings were
> generated using openbabel, the molecules were taken from the DUD sets
> SDF files.
>
> Any ideas why this error occurs? Is the input still valid?

I cannot say without the SMILES itself...

Assuming the SMILES from your example code is the one that fails, I
ran it through Daylight's Depict [0], which confirms that this SMILES
is indeed invalid:

C[C@@H](C(=O)n1cc...@h]1c(=O)O)[NH2][C@@H](CCc1ccccc1)C(=O)O

gives

 WARNING: Atom has unusual valence 4 (normal 5) (dy_rmbord)
 WARNING: ...1cc...@h]1c(=O)O)[NH2][C@@H](CCc1ccccc1)... (dy_rmbord)
 WARNING:                     ^^^^^                      (dy_rmbord)
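
The "valence 4" in that warning can be reproduced by hand. Assuming the flagged atom is the neutral `[NH2]`, it carries two explicit hydrogens and, as far as the readable part of the SMILES shows, single bonds to the two neighbouring `[C@@H]` atoms. The sketch below is plain Python, not Daylight's or the CDK's algorithm. The tiny `NORMAL_VALENCES` table is an assumption for illustration: it lists 3 and 5 for neutral nitrogen, as in the SMILES organic-subset rules, and includes a charged entry only for contrast, since the thread does not say what charge state was intended.

```python
# Allowed total valences per (element, formal charge).
# Illustrative table covering nitrogen only.
NORMAL_VALENCES = {
    ("N", 0): {3, 5},
    ("N", 1): {4},
}


def total_valence(explicit_h, bond_orders):
    """Sum the explicit hydrogen count and the orders of all other bonds."""
    return explicit_h + sum(bond_orders)


def is_unusual(element, charge, valence):
    """True if the valence is not in the table for this element and charge."""
    return valence not in NORMAL_VALENCES[(element, charge)]


# Neutral [NH2] with two single bonds to heavy-atom neighbours.
valence = total_valence(explicit_h=2, bond_orders=[1, 1])
print(valence, is_unusual("N", 0, valence))
```

With these inputs the neutral nitrogen ends up with valence 4, which is neither 3 nor 5, so the check flags it. The same valence on a nitrogen with formal charge +1 would pass.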

Egon

[0] http://www.daylight.com/daycgi/depict (the de facto SMILES format checker)

-- 
Post-doc @ Uppsala University
Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
