the sdf doesnt parse so well for me when pasted from the mail(and seems to contain an unusual conformation) but have you tried to turn includeNeighbors=True in this line: numHs = a.GetTotalNumHs(includeNeighbors=True)
in case this fixes your issue, discussion why this flag is needed can be found in this gtihub issue: https://github.com/rdkit/rdkit/issues/1357 best wishes wim On Mon, Apr 3, 2023 at 9:41 AM Francois Berenger <mli...@ligand.eu> wrote: > Dear list, > > This code: > --- > #!/usr/bin/env python3 > > import argparse, sys > from rdkit import Chem > > def debug_mol(m): > for a in m.GetAtoms(): > i = a.GetIdx() > anum = a.GetAtomicNum() > numHs = a.GetTotalNumHs() > print('%d %d %d' % (i, anum, numHs)) > > if __name__ == '__main__': > # CLI options parsing > parser = argparse.ArgumentParser(description = "test strange rdkit > behavior") > parser.add_argument("-i", metavar = "input.sdf", dest = "input_fn", > help = "3D conformer input file ") > # parse CLI > --------------------------------------------------------------- > if len(sys.argv) == 1: > # user has no clue of what to do -> usage > parser.print_help(sys.stderr) > sys.exit(1) > args = parser.parse_args() > input_fn = args.input_fn > # parse CLI end > ----------------------------------------------------------- > for mol in Chem.SDMolSupplier(input_fn, removeHs=False): > debug_mol(mol) > --- > > On this file: > --- > caffeine > OpenBabel10171811233D > > 24 25 0 0 0 0 0 0 0 0999 V2000 > -1.4537 2.7848 0.2699 C 0 0 0 0 0 0 0 0 0 0 0 0 > -1.0108 1.4083 0.1062 N 0 0 0 0 0 0 0 0 0 0 0 0 > 0.3015 1.1323 0.0489 C 0 0 0 0 0 0 0 0 0 0 0 0 > 1.1081 2.0920 0.1407 O 0 0 0 0 0 0 0 0 0 0 0 0 > 0.8161 -0.1286 -0.1033 C 0 0 0 0 0 0 0 0 0 0 0 0 > -0.0929 -1.1771 -0.2031 C 0 0 0 0 0 0 0 0 0 0 0 0 > 0.6111 -2.3242 -0.3462 N 0 0 0 0 0 0 0 0 0 0 0 0 > 1.9386 -2.0269 -0.3392 C 0 0 0 0 0 0 0 0 0 0 0 0 > 2.0299 -0.6962 -0.1913 N 0 0 0 0 0 0 0 0 0 0 0 0 > 3.2729 0.0261 -0.1349 C 0 0 0 0 0 0 0 0 0 0 0 0 > -1.4004 -0.8770 -0.1432 N 0 0 0 0 0 0 0 0 0 0 0 0 > -2.3540 -1.9596 -0.2459 C 0 0 0 0 0 0 0 0 0 0 0 0 > -1.8697 0.3771 0.0073 C 0 0 0 0 0 0 0 0 0 0 0 0 > -3.0974 0.6510 0.0627 O 0 0 0 0 0 0 0 0 0 0 0 0 > -0.6884 3.3191 0.8569 H 0 0 0 0 0 0 0 0 0 0 0 0 > -1.5024 3.2204 -0.7549 H 0 0 0 0 0 0 0 0 0 0 0 0 > -2.4690 2.8350 0.7286 H 0 0 0 0 0 0 0 0 0 0 0 0 > 2.7299 -2.7636 -0.4379 H 0 0 0 0 0 0 0 0 0 0 0 0 > 3.4783 0.4186 0.8888 H 0 0 0 0 0 0 0 0 0 0 0 0 > 4.1200 -0.5981 -0.4606 H 0 0 0 0 0 0 0 0 0 0 0 0 > 3.2700 0.9110 -0.8337 H 0 0 0 0 0 0 0 0 0 0 0 0 > -1.8812 -2.8834 0.1466 H 0 0 0 0 0 0 0 0 0 0 0 0 > -2.6277 -2.0396 -1.3222 H 0 0 0 0 0 0 0 0 0 0 0 0 > -3.2286 -1.7014 0.3855 H 0 0 0 0 0 0 0 0 0 0 0 0 > 1 2 1 0 0 0 0 > 1 15 1 0 0 0 0 > 1 16 1 0 0 0 0 > 1 17 1 0 0 0 0 > 2 3 1 0 0 0 0 > 3 4 2 0 0 0 0 > 3 5 1 0 0 0 0 > 5 6 2 0 0 0 0 > 6 7 1 0 0 0 0 > 6 11 1 0 0 0 0 > 7 8 2 0 0 0 0 > 8 9 1 0 0 0 0 > 8 18 1 0 0 0 0 > 9 10 1 0 0 0 0 > 9 5 1 0 0 0 0 > 10 19 1 0 0 0 0 > 10 20 1 0 0 0 0 > 10 21 1 0 0 0 0 > 11 12 1 0 0 0 0 > 11 13 1 0 0 0 0 > 12 22 1 0 0 0 0 > 12 23 1 0 0 0 0 > 12 24 1 0 0 0 0 > 13 14 2 0 0 0 0 > 13 2 1 0 0 0 0 > M END > $$$$ > --- > > Tells me that a.GetTotalNumHs() is always 0: > --- > 0 6 0 > 1 7 0 > 2 6 0 > 3 8 0 > 4 6 0 > 5 6 0 > 6 7 0 > 7 6 0 > 8 7 0 > 9 6 0 > 10 7 0 > 11 6 0 > 12 6 0 > 13 8 0 > 14 1 0 > 15 1 0 > 16 1 0 > 17 1 0 > 18 1 0 > 19 1 0 > 20 1 0 > 21 1 0 > 22 1 0 > 23 1 0 > --- > > This is wrong: e.g. atom at index 0 (Carbon) should have 3 hydrogens. > The involved bonds are 1 15, 1 16 and 1 17 in the sdf file. > The total of Hs attached to heavy atoms should be 10. > > The rdkit I am using: > --- > # pip3 list rdkit | grep rdkit > rdkit 2022.9.5 > rdkit-pypi 2022.9.3 > --- > > Should I feel in a bug on github, or am I doing something stupid? > > If I leave the removeHs flag to its default value (of False), then the > result > becomes correct ! > > Regards, > F. > > > _______________________________________________ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss