Dear Steven,
yes, it is the expected behaviour.
Molecule mol encodes acetone with all hydrogens made explicit, such that
the RDKit does not need to use its valence model to compute the number
of implicit Hs each heavy atom is connected to.
However, as you can see in your mol.GetAtoms() loop, the molecule does
not contain any real hydrogen atoms in its graph; when you loop over the
molecule's atoms you only retrieve the heavy atoms.
If you omit one of the explicit hydrogens on the first carbon in your
SMILES string, the RDKit will replace it with an implicit hydrogen as it
will use its valence model to fill in the missing explicit hydrogen in
the assumption that the molecule is not a radical:
mol2 = Chem.MolFromSmiles('C(C(=O)C([H])([H])[H])([H])[H]')
for atom in mol2.GetAtoms():
print(atom.GetAtomicNum(), atom.GetNumImplicitHs(),
atom.GetNumExplicitHs(), atom.GetNumRadicalElectrons())
6 1 2 0
6 0 0 0
8 0 0 0
6 0 3 0
However, if you explicitly set that you'd rather not want that the RDKit
adds implicit hydrogens, you can set the NoImplicit flag on the parent
heavy atoms. If you then sanitize the molecule, you'll see that it has
indeed become a radical:
for atom in mol2.GetAtoms():
atom.SetNoImplicit(True)
Chem.SanitizeMol(mol2)
rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
for atom in mol2.GetAtoms():
print(atom.GetAtomicNum(), atom.GetNumImplicitHs(),
atom.GetNumExplicitHs(), atom.GetNumRadicalElectrons())
6 0 2 1
6 0 0 0
8 0 0 0
6 0 3 0
If you now want to add real hydrogen atoms to the molecule graph you can
do it with Chem.AddHs(); this zeroes all explicit/implicit H counts on
heavy atoms and replaces explicit/implicit Hs with real hydrogen atoms:
mol2_h = Chem.AddHs(mol2, explicitOnly=True)
for atom in mol2_h.GetAtoms():
print(atom.GetAtomicNum(), atom.GetNumImplicitHs(),
atom.GetNumExplicitHs(), atom.GetNumRadicalElectrons())
6 0 0 1
6 0 0 0
8 0 0 0
6 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
1 0 0 0
I hope the above helps. I suggest to also have a look at this blog post
by Roger Sayle:
https://nextmovesoftware.com/blog/2013/02/27/explicit-and-implicit-hydrogens-taking-liberties-with-valence/
It is a quick read and it explains very clearly the difference between
implicit Hs, explicit Hs, and explicit Hs included in the molecule graph.
Cheers,
p.
On 17/06/2019 20:40, Steven Kearnes via Rdkit-discuss wrote:
mol = Chem.MolFromSmiles('C(C(=O)C([H])([H])[H])([H])([H])[H]')
for atom in mol.GetAtoms():
print(atom.GetNumImplicitHs(), atom.GetNumExplicitHs())
>>>
0 3
0 0
0 0
0 3
mol_noh = Chem.RemoveHs(mol, updateExplicitCount=True)
for atom in mol_noh.GetAtoms():
print(atom.GetNumImplicitHs(), atom.GetNumExplicitHs())
>>>
0 3
0 0
0 0
0 3
Note that if I add the hydrogens first, everything works as expected:
mol_h = Chem.AddHs(mol, explicitOnly=True)
mol_noh = Chem.RemoveHs(mol_h)
for atom in mol_noh.GetAtoms():
print(atom.GetNumImplicitHs(), atom.GetNumExplicitHs())
>>>
3 0
0 0
0 0
3 0
Perhaps this is expected behavior?
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss