[Rdkit-discuss] adding custom number of explicit H to specified non-hydrogen atoms

Janusz Petkowski Fri, 20 Jan 2017 14:22:58 -0800

Dear RDKit Community,

By default H atoms are not explicit in the molecular graph and because of that 
the substructure matching is ignoring them when searching for substructures. It 
is possible to use Chem.AddHs(mol) to add explicit hydrogens to all atoms in 
the molecule and then perform substructure matching but is it possible, in 
RDkit, to add explicit hydrogens specifically to atoms of choice instead to all 
of them?


So let's say if I do:

m1 = Chem.MolFromSmiles('C=C')
m1_H = Chem.AddHs(m1)
print m1_H.GetNumAtoms()
print Chem.MolToSmiles(m1_H)

The result is:

>>> 6
>>> [H]C([H])=C([H])[H]

What if I would like to add only one (1)  explicit hydrogen atom to a specific 
non-hydrogen atom (let's say m1.GetAtomWithIdx(0). In that case I would want to 
have:

print m1_H.GetNumAtoms()
print Chem.MolToSmiles(m1_H)

>>> 3
>>> [H]C=C

I tried to use the following method: m1.GetAtomWithIdx(0).SetNumExplicitHs(1) 
which correctly adds an explicit H to C=C molecule but somehow I cannot convert 
it to smiles with this one additional explicit H added or subsequently use  for 
substructure matching.

At the end I would like to do a substructure matching where the following query 
structures:


[H]C=C or [H]C=CC match the following molecule: 
[H]C(=C([H])C([H])([H])[H])C([H])([H])[H]

but at the same time those query structures: [H]C=C([H])[H] or [H]C([H])=CC do 
not match [H]C(=C([H])C([H])([H])[H])C([H])([H])[H]

PS. Of course, the structure [H]C([H])=C([H])[H] converted from C=C using 
Chem.AddHs(mol) will not be matched onto 
[H]C(=C([H])C([H])([H])[H])C([H])([H])[H] which is correct.

Thank you very much for your help,

Best regards,

Janusz Petkowski

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

[Rdkit-discuss] adding custom number of explicit H to specified non-hydrogen atoms

Reply via email to