Dear all,

I am having trouble kekulizing a molecule with RDKit. Here is an example.

The example.mol2 file looks like this:

@<TRIPOS>MOLECULE
example
46 49 0 0 0
SMALL
GASTEIGER

@<TRIPOS>ATOM
1 C -4.5556 -0.2844 1.1718 C.3 1 LIG1 -0.0109
2 C -6.0291 -0.7271 1.2334 C.3 1 LIG1 0.0493
3 C -6.4413 -0.5958 -1.0493 C.3 1 LIG1 0.0493
4 C -5.1977 0.3130 -1.1927 C.3 1 LIG1 -0.0109
5 C 5.5992 -2.5640 -0.8780 C.ar 1 LIG1 -0.0253
6 O -6.3822 -1.4588 0.0764 O.3 1 LIG1 -0.3796
7 C 2.8943 1.6722 0.9911 C.ar 1 LIG1 0.2664
8 C 5.1745 -2.0407 0.3480 C.ar 1 LIG1 0.1371
9 C -1.6179 0.4017 0.1577 C.ar 1 LIG1 0.2173
10 C -4.0573 -0.1702 -0.2838 C.3 1 LIG1 0.0275
11 C 0.8767 -0.2307 1.1489 C.ar 1 LIG1 0.0370
12 C 2.1438 -0.5325 1.6439 C.ar 1 LIG1 -0.0306
13 C 6.1958 -1.7294 -1.8279 C.ar 1 LIG1 -0.0590
14 C 6.3717 -0.3702 -1.5525 C.ar 1 LIG1 -0.0605
15 C 5.9487 0.1564 -0.3282 C.ar 1 LIG1 -0.0452
16 C 0.6358 1.0320 0.5744 C.ar 1 LIG1 0.1483
17 C -0.1716 -1.1537 1.2042 C.ar 1 LIG1 0.0418
18 C 3.1618 0.4153 1.5592 C.ar 1 LIG1 0.0780
19 C 5.3424 -0.6749 0.6231 C.ar 1 LIG1 0.0480
20 C 1.3530 3.2786 -0.1013 C.3 1 LIG1 0.0167
21 F 4.6032 -2.8623 1.2640 F 1 LIG1 -0.2043
22 S 4.7969 0.0115 2.1898 S.3 1 LIG1 -0.0812
23 N -1.3906 -0.8211 0.7091 N.ar 1 LIG1 -0.2222
24 O 3.8206 2.5277 0.9363 O.2 1 LIG1 -0.2664
25 N 1.6412 1.9659 0.5033 N.ar 1 LIG1 -0.2949
26 N -0.6088 1.3106 0.0937 N.ar 1 LIG1 -0.1964
27 N -2.9091 0.7394 -0.3655 N.pl3 1 LIG1 -0.3104
28 H -3.9262 -1.0225 1.7144 H 1 LIG1 0.0305
29 H -4.4544 0.6942 1.6907 H 1 LIG1 0.0305
30 H -6.1785 -1.3738 2.1237 H 1 LIG1 0.0560
31 H -6.6965 0.1565 1.3647 H 1 LIG1 0.0560
32 H -7.3658 0.0220 -1.0063 H 1 LIG1 0.0560
33 H -6.5227 -1.2302 -1.9574 H 1 LIG1 0.0560
34 H -4.8575 0.3261 -2.2513 H 1 LIG1 0.0305
35 H -5.4753 1.3532 -0.9112 H 1 LIG1 0.0305
36 H 5.4676 -3.6168 -1.0922 H 1 LIG1 0.0646
37 H -3.7461 -1.1771 -0.6436 H 1 LIG1 0.0500
38 H 2.3428 -1.4998 2.0895 H 1 LIG1 0.0638
39 H 6.5237 -2.1362 -2.7758 H 1 LIG1 0.0618
40 H 6.8363 0.2748 -2.2870 H 1 LIG1 0.0618
41 H 6.0904 1.2094 -0.1219 H 1 LIG1 0.0630
42 H -0.0243 -2.1352 1.6372 H 1 LIG1 0.0838
43 H 2.2342 3.9528 -0.1073 H 1 LIG1 0.0457
44 H 0.5450 3.7853 0.4685 H 1 LIG1 0.0457
45 H 1.0258 3.1432 -1.1544 H 1 LIG1 0.0457
46 H -3.0166 1.6655 -0.8392 H 1 LIG1 0.1492
@<TRIPOS>BOND
1 1 2 1
2 1 10 1
3 2 6 1
4 3 4 1
5 3 6 1
6 4 10 1
7 5 8 ar
8 5 13 ar
9 7 18 ar
10 7 24 2
11 7 25 ar
12 8 19 ar
13 8 21 1
14 9 23 ar
15 9 26 ar
16 9 27 1
17 10 27 1
18 11 12 ar
19 11 16 ar
20 11 17 ar
21 12 18 ar
22 13 14 ar
23 14 15 ar
24 15 19 ar
25 16 25 ar
26 16 26 ar
27 17 23 ar
28 18 22 1
29 19 22 1
30 20 25 1
31 1 28 1
32 1 29 1
33 2 30 1
34 2 31 1
35 3 32 1
36 3 33 1
37 4 34 1
38 4 35 1
39 5 36 1
40 10 37 1
41 12 38 1
42 13 39 1
43 14 40 1
44 15 41 1
45 17 42 1
46 20 43 1
47 20 44 1
48 20 45 1
49 27 46 1

The example.py script is:

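from rdkit import Chem
from rdkit.Chem import AllChem

# Read without sanitization and keep the explicit hydrogens from the file.
rdkit_mol = Chem.MolFromMol2File("example.mol2", sanitize=False, removeHs=False)
# RemoveHs sanitizes its result by default; this is the call that fails.
mol = AllChem.RemoveHs(rdkit_mol)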

Running example.py raises the following error:

ValueError: Sanitization error: Can't kekulize mol. Unkekulized atoms: 8 10 11 15 16 17 22 24 25
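
The indices in this message are RDKit atom indices, which start at 0, whereas mol2 atom IDs start at 1. A minimal sketch for mapping them back to the ATOM records follows. It assumes that RDKit keeps the atom order of the file and that atom IDs are numbered consecutively from 1; read_mol2_atoms and mol2_ids_for are hypothetical helper names, and three records copied from example.mol2 stand in for the full file.

```python
def read_mol2_atoms(mol2_text):
    """Return {atom_id: (atom_name, sybyl_type)} from the ATOM section."""
    atoms = {}
    in_atoms = False
    for line in mol2_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@"):
            # Accept both "@<TRIPOS>ATOM" and a bare "@ATOM" header.
            in_atoms = stripped.upper().endswith("ATOM")
            continue
        fields = stripped.split()
        if in_atoms and len(fields) >= 6:
            # Columns: id, name, x, y, z, type, ...
            atoms[int(fields[0])] = (fields[1], fields[5])
    return atoms


def mol2_ids_for(rdkit_indices):
    """Convert zero-based RDKit atom indices to one-based mol2 atom IDs."""
    return [index + 1 for index in rdkit_indices]


# Three ATOM records copied from example.mol2 keep the demo short.
excerpt = """@<TRIPOS>ATOM
9 C -1.6179 0.4017 0.1577 C.ar 1 LIG1 0.2173
23 N -1.3906 -0.8211 0.7091 N.ar 1 LIG1 -0.2222
26 N -0.6088 1.3106 0.0937 N.ar 1 LIG1 -0.1964
@<TRIPOS>BOND
14 9 23 ar
"""

atoms = read_mol2_atoms(excerpt)
for atom_id in mol2_ids_for([8, 22, 25]):
    name, sybyl_type = atoms[atom_id]
    print(atom_id, name, sybyl_type)
```

Under those assumptions, the indices in the error correspond to mol2 atoms 9, 11, 12, 16, 17, 18, 23, 25 and 26, all of which are typed C.ar or N.ar in the file.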

It seems that RDKit cannot interpret the molecule when it tries to remove
the hydrogens, which is probably related to the format of the mol2 file. I
generated the mol2 file from an SDF file with Open Babel. Is there a plan
for RDKit to parse mol2 files like this one, or do I need to preprocess the
file further? I would appreciate any advice!
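
One way to inspect such a failure without triggering the exception is to ask RDKit for its list of problems on the unsanitized molecule. The sketch below assumes an RDKit release that provides Chem.DetectChemistryProblems (to my knowledge, 2020.03 or later). The name kekulization_problems is a hypothetical helper, and a deliberately broken SMILES (a pyrrole-like ring written without its N-H) stands in for example.mol2.

```python
from rdkit import Chem


def kekulization_problems(mol):
    """Return the sorted atom indices that RDKit reports as unkekulizable.

    DetectChemistryProblems collects problems instead of raising,
    so it can be called on a molecule read with sanitize=False.
    """
    indices = []
    for problem in Chem.DetectChemistryProblems(mol):
        if problem.GetType() == "KekulizeException":
            indices.extend(problem.GetAtomIndices())
    return sorted(indices)


# Stand-in input (assumption): aromatic five-membered ring with a bare N.
demo = Chem.MolFromSmiles("c1ccnc1", sanitize=False)
print("Unkekulized atom indices:", kekulization_problems(demo))

# A ring that kekulizes cleanly yields an empty list.
benzene = Chem.MolFromSmiles("c1ccccc1", sanitize=False)
print("Benzene:", kekulization_problems(benzene))
```

For the file in question, the same helper could be applied to the molecule returned by Chem.MolFromMol2File("example.mol2", sanitize=False, removeHs=False). It only reports the problem; it does not repair the aromaticity assignment.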


Thanks,

Shuai