Dear list,
This code:
---
#!/usr/bin/env python3
import argparse, sys
from rdkit import Chem
def debug_mol(m):
for a in m.GetAtoms():
i = a.GetIdx()
anum = a.GetAtomicNum()
numHs = a.GetTotalNumHs()
print('%d %d %d' % (i, anum, numHs))
if __name__ == '__main__':
# CLI options parsing
parser = argparse.ArgumentParser(description = "test strange rdkit
behavior")
parser.add_argument("-i", metavar = "input.sdf", dest = "input_fn",
help = "3D conformer input file ")
# parse CLI
---------------------------------------------------------------
if len(sys.argv) == 1:
# user has no clue of what to do -> usage
parser.print_help(sys.stderr)
sys.exit(1)
args = parser.parse_args()
input_fn = args.input_fn
# parse CLI end
-----------------------------------------------------------
for mol in Chem.SDMolSupplier(input_fn, removeHs=False):
debug_mol(mol)
---
On this file:
---
caffeine
OpenBabel10171811233D
24 25 0 0 0 0 0 0 0 0999 V2000
-1.4537 2.7848 0.2699 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.0108 1.4083 0.1062 N 0 0 0 0 0 0 0 0 0 0 0 0
0.3015 1.1323 0.0489 C 0 0 0 0 0 0 0 0 0 0 0 0
1.1081 2.0920 0.1407 O 0 0 0 0 0 0 0 0 0 0 0 0
0.8161 -0.1286 -0.1033 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.0929 -1.1771 -0.2031 C 0 0 0 0 0 0 0 0 0 0 0 0
0.6111 -2.3242 -0.3462 N 0 0 0 0 0 0 0 0 0 0 0 0
1.9386 -2.0269 -0.3392 C 0 0 0 0 0 0 0 0 0 0 0 0
2.0299 -0.6962 -0.1913 N 0 0 0 0 0 0 0 0 0 0 0 0
3.2729 0.0261 -0.1349 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.4004 -0.8770 -0.1432 N 0 0 0 0 0 0 0 0 0 0 0 0
-2.3540 -1.9596 -0.2459 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.8697 0.3771 0.0073 C 0 0 0 0 0 0 0 0 0 0 0 0
-3.0974 0.6510 0.0627 O 0 0 0 0 0 0 0 0 0 0 0 0
-0.6884 3.3191 0.8569 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.5024 3.2204 -0.7549 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.4690 2.8350 0.7286 H 0 0 0 0 0 0 0 0 0 0 0 0
2.7299 -2.7636 -0.4379 H 0 0 0 0 0 0 0 0 0 0 0 0
3.4783 0.4186 0.8888 H 0 0 0 0 0 0 0 0 0 0 0 0
4.1200 -0.5981 -0.4606 H 0 0 0 0 0 0 0 0 0 0 0 0
3.2700 0.9110 -0.8337 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.8812 -2.8834 0.1466 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.6277 -2.0396 -1.3222 H 0 0 0 0 0 0 0 0 0 0 0 0
-3.2286 -1.7014 0.3855 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
1 15 1 0 0 0 0
1 16 1 0 0 0 0
1 17 1 0 0 0 0
2 3 1 0 0 0 0
3 4 2 0 0 0 0
3 5 1 0 0 0 0
5 6 2 0 0 0 0
6 7 1 0 0 0 0
6 11 1 0 0 0 0
7 8 2 0 0 0 0
8 9 1 0 0 0 0
8 18 1 0 0 0 0
9 10 1 0 0 0 0
9 5 1 0 0 0 0
10 19 1 0 0 0 0
10 20 1 0 0 0 0
10 21 1 0 0 0 0
11 12 1 0 0 0 0
11 13 1 0 0 0 0
12 22 1 0 0 0 0
12 23 1 0 0 0 0
12 24 1 0 0 0 0
13 14 2 0 0 0 0
13 2 1 0 0 0 0
M END
$$$$
---
Tells me that a.GetTotalNumHs() is always 0:
---
0 6 0
1 7 0
2 6 0
3 8 0
4 6 0
5 6 0
6 7 0
7 6 0
8 7 0
9 6 0
10 7 0
11 6 0
12 6 0
13 8 0
14 1 0
15 1 0
16 1 0
17 1 0
18 1 0
19 1 0
20 1 0
21 1 0
22 1 0
23 1 0
---
This is wrong: e.g. atom at index 0 (Carbon) should have 3 hydrogens.
The involved bonds are 1 15, 1 16 and 1 17 in the sdf file.
The total of Hs attached to heavy atoms should be 10.
The rdkit I am using:
---
# pip3 list rdkit | grep rdkit
rdkit 2022.9.5
rdkit-pypi 2022.9.3
---
Should I feel in a bug on github, or am I doing something stupid?
If I leave the removeHs flag to its default value (of False), then the
result
becomes correct !
Regards,
F.
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss