Dear Steven,

yes, it is the expected behaviour.

Molecule mol encodes acetone with all hydrogens made explicit, such that the RDKit does not need to use its valence model to compute the number of implicit Hs each heavy atom is connected to.

However, as you can see in your mol.GetAtoms() loop, the molecule does not contain any real hydrogen atoms in its graph; when you loop over the molecule's atoms you only retrieve the heavy atoms.

If you omit one of the explicit hydrogens on the first carbon in your SMILES string, the RDKit will replace it with an implicit hydrogen as it will use its valence model to fill in the missing explicit hydrogen in the assumption that the molecule is not a radical:

mol2  =  Chem.MolFromSmiles('C(C(=O)C([H])([H])[H])([H])[H]')

for  atom  in  mol2.GetAtoms():
    print(atom.GetAtomicNum(),  atom.GetNumImplicitHs(),
          atom.GetNumExplicitHs(),  atom.GetNumRadicalElectrons())

6 1 2 0
6 0 0 0
8 0 0 0
6 0 3 0

However, if you explicitly set that you'd rather not want that the RDKit adds implicit hydrogens, you can set the NoImplicit flag on the parent heavy atoms. If you then sanitize the molecule, you'll see that it has indeed become a radical:

for  atom  in  mol2.GetAtoms():
    atom.SetNoImplicit(True)

Chem.SanitizeMol(mol2)

rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

for  atom  in  mol2.GetAtoms():
    print(atom.GetAtomicNum(),  atom.GetNumImplicitHs(),
          atom.GetNumExplicitHs(),  atom.GetNumRadicalElectrons())

6 0 2 1
6 0 0 0
8 0 0 0
6 0 3 0

If you now want to add real hydrogen atoms to the molecule graph you can do it with Chem.AddHs(); this zeroes all explicit/implicit H counts on heavy atoms and replaces explicit/implicit Hs with real hydrogen atoms:

mol2_h  =  Chem.AddHs(mol2,  explicitOnly=True)

for  atom  in  mol2_h.GetAtoms():
    print(atom.GetAtomicNum(),  atom.GetNumImplicitHs(),
          atom.GetNumExplicitHs(),  atom.GetNumRadicalElectrons())

6 0 0 1
6 0 0 0
8 0 0 0
6 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0

I hope the above helps. I suggest to also have a look at this blog post by Roger Sayle:

https://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/

It is a quick read and it explains very clearly the difference between implicit Hs, explicit Hs, and explicit Hs included in the molecule graph.

Cheers,
p.

On 17/06/2019 20:40, Steven Kearnes via Rdkit-discuss wrote:
mol = Chem.MolFromSmiles('C(C(=O)C([H])([H])[H])([H])([H])[H]')
for atom in mol.GetAtoms():
print(atom.GetNumImplicitHs(), atom.GetNumExplicitHs())

>>>
0 3
0 0
0 0
0 3

mol_noh = Chem.RemoveHs(mol, updateExplicitCount=True)
for atom in mol_noh.GetAtoms():
print(atom.GetNumImplicitHs(), atom.GetNumExplicitHs())

>>>
0 3
0 0
0 0
0 3

Note that if I add the hydrogens first, everything works as expected:

mol_h = Chem.AddHs(mol, explicitOnly=True)
mol_noh = Chem.RemoveHs(mol_h)
for atom in mol_noh.GetAtoms():
print(atom.GetNumImplicitHs(), atom.GetNumExplicitHs())

>>>
3 0
0 0
0 0
3 0

Perhaps this is expected behavior?


_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to