The SDF doesn't parse well for me when pasted from the mail (and it seems to
contain an unusual conformation), but have you tried setting
includeNeighbors=True in this line?
numHs = a.GetTotalNumHs(includeNeighbors=True)

If this fixes your issue, a discussion of why this flag is needed can be
found in this GitHub issue: https://github.com/rdkit/rdkit/issues/1357
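
To illustrate the difference, here is a minimal sketch. It assumes a caffeine
SMILES plus Chem.AddHs() as a stand-in for your SD file read with
removeHs=False (in both cases the hydrogens are real atoms in the graph), and
the helper name h_counts is only made up for this example:

```python
from rdkit import Chem

def h_counts(mol):
    """For each heavy atom, return (idx, atomic_num, Hs_default, Hs_with_neighbors)."""
    rows = []
    for a in mol.GetAtoms():
        if a.GetAtomicNum() == 1:
            continue  # skip the hydrogen atoms themselves
        rows.append((a.GetIdx(),
                     a.GetAtomicNum(),
                     a.GetTotalNumHs(),                        # implicit/explicit H counts only
                     a.GetTotalNumHs(includeNeighbors=True)))  # also counts H atoms in the graph
    return rows

# Assumption: this SMILES stands in for the caffeine SD file from the original mail.
mol = Chem.AddHs(Chem.MolFromSmiles("CN1C=NC2=C1C(=O)N(C(=O)N2C)C"))
rows = h_counts(mol)
for row in rows:
    print("%d %d %d %d" % row)

total_default = sum(r[2] for r in rows)
total_with_neighbors = sum(r[3] for r in rows)
print(total_default, total_with_neighbors)
```

Once the hydrogens are atoms in the molecular graph, they are no longer
counted as implicit Hs on their heavy-atom neighbors, so the default call
should report 0 everywhere, while includeNeighbors=True should recover the
expected total of 10.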

best wishes
wim

On Mon, Apr 3, 2023 at 9:41 AM Francois Berenger <mli...@ligand.eu> wrote:

> Dear list,
>
> This code:
> ---
> #!/usr/bin/env python3
>
> import argparse, sys
> from rdkit import Chem
>
> def debug_mol(m):
>      for a in m.GetAtoms():
>          i = a.GetIdx()
>          anum = a.GetAtomicNum()
>          numHs = a.GetTotalNumHs()
>          print('%d %d %d' % (i, anum, numHs))
>
> if __name__ == '__main__':
>      # CLI options parsing
>      parser = argparse.ArgumentParser(
>          description = "test strange rdkit behavior")
>      parser.add_argument("-i", metavar = "input.sdf", dest = "input_fn",
>                          help = "3D conformer input file ")
>      # parse CLI ---------------------------------------------------------------
>      if len(sys.argv) == 1:
>          # user has no clue of what to do -> usage
>          parser.print_help(sys.stderr)
>          sys.exit(1)
>      args = parser.parse_args()
>      input_fn = args.input_fn
>      # parse CLI end -----------------------------------------------------------
>      for mol in Chem.SDMolSupplier(input_fn, removeHs=False):
>          debug_mol(mol)
> ---
>
> On this file:
> ---
> caffeine
>   OpenBabel10171811233D
>
>   24 25  0  0  0  0  0  0  0  0999 V2000
>     -1.4537    2.7848    0.2699 C   0  0  0  0  0  0  0  0  0  0  0  0
>     -1.0108    1.4083    0.1062 N   0  0  0  0  0  0  0  0  0  0  0  0
>      0.3015    1.1323    0.0489 C   0  0  0  0  0  0  0  0  0  0  0  0
>      1.1081    2.0920    0.1407 O   0  0  0  0  0  0  0  0  0  0  0  0
>      0.8161   -0.1286   -0.1033 C   0  0  0  0  0  0  0  0  0  0  0  0
>     -0.0929   -1.1771   -0.2031 C   0  0  0  0  0  0  0  0  0  0  0  0
>      0.6111   -2.3242   -0.3462 N   0  0  0  0  0  0  0  0  0  0  0  0
>      1.9386   -2.0269   -0.3392 C   0  0  0  0  0  0  0  0  0  0  0  0
>      2.0299   -0.6962   -0.1913 N   0  0  0  0  0  0  0  0  0  0  0  0
>      3.2729    0.0261   -0.1349 C   0  0  0  0  0  0  0  0  0  0  0  0
>     -1.4004   -0.8770   -0.1432 N   0  0  0  0  0  0  0  0  0  0  0  0
>     -2.3540   -1.9596   -0.2459 C   0  0  0  0  0  0  0  0  0  0  0  0
>     -1.8697    0.3771    0.0073 C   0  0  0  0  0  0  0  0  0  0  0  0
>     -3.0974    0.6510    0.0627 O   0  0  0  0  0  0  0  0  0  0  0  0
>     -0.6884    3.3191    0.8569 H   0  0  0  0  0  0  0  0  0  0  0  0
>     -1.5024    3.2204   -0.7549 H   0  0  0  0  0  0  0  0  0  0  0  0
>     -2.4690    2.8350    0.7286 H   0  0  0  0  0  0  0  0  0  0  0  0
>      2.7299   -2.7636   -0.4379 H   0  0  0  0  0  0  0  0  0  0  0  0
>      3.4783    0.4186    0.8888 H   0  0  0  0  0  0  0  0  0  0  0  0
>      4.1200   -0.5981   -0.4606 H   0  0  0  0  0  0  0  0  0  0  0  0
>      3.2700    0.9110   -0.8337 H   0  0  0  0  0  0  0  0  0  0  0  0
>     -1.8812   -2.8834    0.1466 H   0  0  0  0  0  0  0  0  0  0  0  0
>     -2.6277   -2.0396   -1.3222 H   0  0  0  0  0  0  0  0  0  0  0  0
>     -3.2286   -1.7014    0.3855 H   0  0  0  0  0  0  0  0  0  0  0  0
>    1  2  1  0  0  0  0
>    1 15  1  0  0  0  0
>    1 16  1  0  0  0  0
>    1 17  1  0  0  0  0
>    2  3  1  0  0  0  0
>    3  4  2  0  0  0  0
>    3  5  1  0  0  0  0
>    5  6  2  0  0  0  0
>    6  7  1  0  0  0  0
>    6 11  1  0  0  0  0
>    7  8  2  0  0  0  0
>    8  9  1  0  0  0  0
>    8 18  1  0  0  0  0
>    9 10  1  0  0  0  0
>    9  5  1  0  0  0  0
>   10 19  1  0  0  0  0
>   10 20  1  0  0  0  0
>   10 21  1  0  0  0  0
>   11 12  1  0  0  0  0
>   11 13  1  0  0  0  0
>   12 22  1  0  0  0  0
>   12 23  1  0  0  0  0
>   12 24  1  0  0  0  0
>   13 14  2  0  0  0  0
>   13  2  1  0  0  0  0
> M  END
> $$$$
> ---
>
> Tells me that a.GetTotalNumHs() is always 0:
> ---
> 0 6 0
> 1 7 0
> 2 6 0
> 3 8 0
> 4 6 0
> 5 6 0
> 6 7 0
> 7 6 0
> 8 7 0
> 9 6 0
> 10 7 0
> 11 6 0
> 12 6 0
> 13 8 0
> 14 1 0
> 15 1 0
> 16 1 0
> 17 1 0
> 18 1 0
> 19 1 0
> 20 1 0
> 21 1 0
> 22 1 0
> 23 1 0
> ---
>
> This is wrong: e.g. atom at index 0 (Carbon) should have 3 hydrogens.
> The involved bonds are 1 15, 1 16 and 1 17 in the sdf file.
> The total of Hs attached to heavy atoms should be 10.
>
> The rdkit I am using:
> ---
> # pip3 list rdkit | grep rdkit
> rdkit                        2022.9.5
> rdkit-pypi                   2022.9.3
> ---
>
> Should I file a bug on GitHub, or am I doing something stupid?
>
> If I leave the removeHs flag at its default value (True), then the result
> becomes correct!
>
> Regards,
> F.
>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>