Hi there RDkitters,

I am using RDKit 2011.09.1 on 64-bit Ubuntu Linux 11.10 (with a noisy fan).

I am trying to read a MOL2 file (which I think is in line with the Tripos
spec http://tripos.com/data/support/mol2.pdf -- your favorite molecular
format, I know).

The structure is a simple indole.  If the atom types in the MOL2 atom block
are C.ar or N.ar, sanitization fails (but I think these types should be
allowed, especially since the bonds are also defined as aromatic).  If I
change the atom types to C.2 and N.2 respectively, everything works fine and
the aromatic parts of the molecule are still correct (because of the
aromatic bond definitions).

Here is an example that you can just copy and paste:

#!/usr/bin/env python

from rdkit import Chem

# the following is a valid molecule - why does it break?
indole_broken = """@<TRIPOS>MOLECULE
MVSketch_Indole
10 11 1
SMALL
NO_CHARGES
@<TRIPOS>ATOM
1 C1    38.6029   -19.6265     0.0000 C.ar 1 noname
2 C2    38.6029   -21.1665     0.0000 C.ar 1 noname
3 C3    37.2692   -21.9365     0.0000 C.ar 1 noname
4 C4    35.9356   -21.1665     0.0000 C.ar 1 noname
5 C5    35.9356   -19.6265     0.0000 C.ar 1 noname
6 C6    37.2692   -18.8565     0.0000 C.ar 1 noname
7 C7    34.4709   -21.6424     0.0000 C.ar 1 noname
8 C8    33.5657   -20.3965     0.0000 C.ar 1 noname
9 N1    34.4709   -19.1506     0.0000 N.ar 1 noname
10 H1    33.9950   -17.6860     0.0000 H 1 noname
@<TRIPOS>BOND
1 1 2 ar
2 2 3 ar
3 3 4 ar
4 5 6 ar
5 1 6 ar
6 4 7 ar
7 5 4 ar
8 5 9 ar
9 7 8 ar
10 8 9 ar
11 9 10 1
@<TRIPOS>SUBSTRUCTURE
1 noname 1"""

indole_fixed = """@<TRIPOS>MOLECULE
MVSketch_Indole
10 11 1
SMALL
NO_CHARGES
@<TRIPOS>ATOM
1 C1    38.6029   -19.6265     0.0000 C.2 1 noname
2 C2    38.6029   -21.1665     0.0000 C.2 1 noname
3 C3    37.2692   -21.9365     0.0000 C.2 1 noname
4 C4    35.9356   -21.1665     0.0000 C.2 1 noname
5 C5    35.9356   -19.6265     0.0000 C.2 1 noname
6 C6    37.2692   -18.8565     0.0000 C.2 1 noname
7 C7    34.4709   -21.6424     0.0000 C.2 1 noname
8 C8    33.5657   -20.3965     0.0000 C.2 1 noname
9 N1    34.4709   -19.1506     0.0000 N.2 1 noname
10 H1    33.9950   -17.6860     0.0000 H 1 noname
@<TRIPOS>BOND
1 1 2 ar
2 2 3 ar
3 3 4 ar
4 5 6 ar
5 1 6 ar
6 4 7 ar
7 5 4 ar
8 5 9 ar
9 7 8 ar
10 8 9 ar
11 9 10 1
@<TRIPOS>SUBSTRUCTURE
1 noname 1"""

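# C.ar / N.ar atom types: sanitization fails for this one
print(Chem.MolFromMol2Block(indole_broken))
# C.2 / N.2 atom types: this one is read without problems
print(Chem.MolFromMol2Block(indole_fixed))
print(Chem.MolToSmiles(Chem.MolFromMol2Block(indole_fixed)))
# [c]1[nH]c2c([c]1)[c][c][c][c]2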
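
For anyone hitting the same problem, the manual workaround described above (swapping C.ar and N.ar for C.2 and N.2) could be automated with something like the sketch below. It is only a stopgap under stated assumptions: replace_aromatic_atom_types and AROMATIC_TO_SP2 are hypothetical names (not part of RDKit), the atom type is assumed to be the sixth whitespace-separated field of each ATOM record as in the Tripos layout, and only the two types from this post are mapped.

```python
import re

# Hypothetical mapping, covering only the two atom types discussed in this post.
AROMATIC_TO_SP2 = {"C.ar": "C.2", "N.ar": "N.2"}


def replace_aromatic_atom_types(mol2_block):
    """Return a copy of a MOL2 block with C.ar/N.ar atom types swapped for C.2/N.2.

    Only records inside the @<TRIPOS>ATOM section are touched; bond records
    (including the 'ar' bond types) are left exactly as they are.
    """
    out = []
    in_atoms = False
    for line in mol2_block.splitlines():
        if line.startswith("@<TRIPOS>"):
            # Track whether the following records belong to the ATOM section.
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
        elif in_atoms:
            # Each token keeps its leading whitespace so column spacing survives
            # (trailing whitespace at the end of a record is dropped).
            tokens = re.findall(r"\s*\S+", line)
            if len(tokens) > 5:
                atom_type = tokens[5].strip()  # assumed: sixth field is the atom type
                if atom_type in AROMATIC_TO_SP2:
                    tokens[5] = tokens[5].replace(atom_type, AROMATIC_TO_SP2[atom_type])
                line = "".join(tokens)
        out.append(line)
    return "\n".join(out)
```

Applied to the script above, replace_aromatic_atom_types(indole_broken) should give the same text as indole_fixed, which can then be passed to Chem.MolFromMol2Block as before. This sidesteps the question rather than answering it: it does not explain why the C.ar/N.ar version fails to sanitize.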


Any comments?
# Please

Many thanks,


-
Jean-Paul Ebejer
Early Stage Researcher
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
