Thanks for the explanation, Stiefl.  File formats - what a pain.
So Corina does not make use of C.ar or N.ar?

This is a "Won't Fix" then ... right?  Maybe a note in the documentation
listing the atom types from the spec (pg 53 in
http://tripos.com/data/support/mol2.pdf) which are not supported would be
useful (as people like me have never used Corina)?
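
The workaround described in the quoted message below (replacing C.ar and
N.ar with C.2 and N.2 in the atom block while leaving the aromatic bond
records alone) could be scripted roughly as follows. This is only a sketch:
the function name retype_aromatic_atoms and its mapping argument are made up
for illustration and are not part of RDKit. It assumes whitespace-delimited
atom records with the atom type in the sixth field, and it has only been
reasoned about against the indole example in this thread.

```python
def retype_aromatic_atoms(mol2_block, mapping=None):
    """Return a copy of a Mol2 block with selected atom types replaced.

    Only records inside the @<TRIPOS>ATOM section are touched; bond records
    (including their 'ar' bond types) are left exactly as they were.
    """
    if mapping is None:
        # Assumption taken from this thread: C.ar -> C.2 and N.ar -> N.2.
        mapping = {"C.ar": "C.2", "N.ar": "N.2"}

    out = []
    in_atoms = False
    for line in mol2_block.splitlines():
        if line.startswith("@<TRIPOS>"):
            # A new record type starts; remember whether it is the atom block.
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            out.append(line)
            continue
        if in_atoms:
            # Fields: atom_id, name, x, y, z, atom_type, then optional extras.
            fields = line.split()
            if len(fields) >= 6 and fields[5] in mapping:
                fields[5] = mapping[fields[5]]
                # Re-joining with single spaces drops the column alignment.
                line = " ".join(fields)
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    demo = "\n".join([
        "@<TRIPOS>ATOM",
        "1 C1 38.6029 -19.6265 0.0000 C.ar 1 noname",
        "9 N1 34.4709 -19.1506 0.0000 N.ar 1 noname",
        "@<TRIPOS>BOND",
        "1 1 2 ar",
    ])
    print(retype_aromatic_atoms(demo))
```

The rewritten block can then be passed to Chem.MolFromMol2Block in place of
the original string. Whether the resulting molecule is chemically what you
want still needs checking by eye.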

Many thanks,

-
Jean-Paul Ebejer
Early Stage Researcher


On 12 January 2012 14:13, Stiefl, Nikolaus <[email protected]> wrote:

> Dear JP,
>
> When the Mol2 parser was implemented we had to take a decision at some
> point about which format to use. Given the “unspecific” Tripos specs this
> was actually quite tricky. If you write the same molecule using Sybyl,
> Tripos’ db tools or other software like Corina you will get a different
> result from each (note that Tripos does not even give the same results
> when using their own tools).
>
> Hence, we decided on Corina since this is one of the most widely used
> tools and also seems to give the most consistent results when evaluating a
> largish set I converted and reviewed. As you can see, there is a Note when
> checking the Mol2 parser (e.g. MolFromMol2File) that will tell you that it
> is optimized for the atom-typing scheme used by Corina.
>
> Sorry I can’t be of more help
>
> Nik
>
>
> *From:* JP [mailto:[email protected]]
> *Sent:* Thursday, January 12, 2012 2:57 PM
> *To:* [email protected]
> *Subject:* [Rdkit-discuss] Mol2 Format problem ? Can't kekulize mol --
> but with a twist.
>
> Hi there RDkitters,
>
> Using RDKit 2011.09.1 on Ubuntu Linux 11.10 64 bit with a noisy fan.
>
> I am trying to read a MOL2 file (which I think is in line with the Tripos
> spec http://tripos.com/data/support/mol2.pdf -- your favorite molecular
> format, I know).
>
> The structure is a simple indole.  If the atom types in the mol atom block
> are C.ar or N.ar the sanitization fails (but I think this should be allowed
> - especially since the bonds are also defined as aromatic).  If I change
> the atom types to C.2 and N.2 respectively then everything works fine and
> the aromatic parts of the molecule are still correct (because of the
> aromatic bond definitions).
>
> An example of this so you can just copy and paste it:
>
> #!/usr/bin/env python
>
> from rdkit import Chem
>
> # the following is a valid molecule - why does it break?
> indole_broken = """@<TRIPOS>MOLECULE
> MVSketch_Indole
> 10 11 1
> SMALL
> NO_CHARGES
> @<TRIPOS>ATOM
> 1          C1    38.6029   -19.6265     0.0000    C.ar     1 noname
> 2          C2    38.6029   -21.1665     0.0000    C.ar     1 noname
> 3          C3    37.2692   -21.9365     0.0000    C.ar     1 noname
> 4          C4    35.9356   -21.1665     0.0000    C.ar     1 noname
> 5          C5    35.9356   -19.6265     0.0000    C.ar     1 noname
> 6          C6    37.2692   -18.8565     0.0000    C.ar     1 noname
> 7          C7    34.4709   -21.6424     0.0000    C.ar     1 noname
> 8          C8    33.5657   -20.3965     0.0000    C.ar     1 noname
> 9          N1    34.4709   -19.1506     0.0000    N.ar     1 noname
> 10        H1    33.9950   -17.6860     0.0000    H         1 noname
> @<TRIPOS>BOND
> 1          1          2          ar
> 2          2          3          ar
> 3          3          4          ar
> 4          5          6          ar
> 5          1          6          ar
> 6          4          7          ar
> 7          5          4          ar
> 8          5          9          ar
> 9          7          8          ar
> 10        8          9          ar
> 11        9          10        1
> @<TRIPOS>SUBSTRUCTURE
> 1          noname           1"""
>
> # same molecule, with C.ar/N.ar replaced by C.2/N.2 - this one reads fine
> indole_fixed = """@<TRIPOS>MOLECULE
> MVSketch_Indole
> 10 11 1
> SMALL
> NO_CHARGES
> @<TRIPOS>ATOM
> 1          C1    38.6029   -19.6265     0.0000    C.2      1 noname
> 2          C2    38.6029   -21.1665     0.0000    C.2      1 noname
> 3          C3    37.2692   -21.9365     0.0000    C.2      1 noname
> 4          C4    35.9356   -21.1665     0.0000    C.2      1 noname
> 5          C5    35.9356   -19.6265     0.0000    C.2      1 noname
> 6          C6    37.2692   -18.8565     0.0000    C.2      1 noname
> 7          C7    34.4709   -21.6424     0.0000    C.2      1 noname
> 8          C8    33.5657   -20.3965     0.0000    C.2      1 noname
> 9          N1    34.4709   -19.1506     0.0000    N.2      1 noname
> 10        H1    33.9950   -17.6860     0.0000    H         1 noname
> @<TRIPOS>BOND
> 1          1          2          ar
> 2          2          3          ar
> 3          3          4          ar
> 4          5          6          ar
> 5          1          6          ar
> 6          4          7          ar
> 7          5          4          ar
> 8          5          9          ar
> 9          7          8          ar
> 10        8          9          ar
> 11        9          10        1
> @<TRIPOS>SUBSTRUCTURE
> 1          noname           1"""
>
> print(Chem.MolFromMol2Block(indole_broken))
> print(Chem.MolFromMol2Block(indole_fixed))
> print(Chem.MolToSmiles(Chem.MolFromMol2Block(indole_fixed)))
> # [c]1[nH]c2c([c]1)[c][c][c][c]2
>
> Any comments?
>
> # Please
>
> Many thanks,
>
> -
> Jean-Paul Ebejer
> Early Stage Researcher
>
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
