Dear RDKit Community,
By default, H atoms are not explicit in the molecular graph, so substructure
matching ignores them when searching for substructures. It is possible to use
Chem.AddHs(mol) to add explicit hydrogens to all atoms in the molecule and
then perform substructure matching. Is it also possible, in RDKit, to add
explicit hydrogens only to selected atoms instead of to all of them?
For example, if I do:
from rdkit import Chem
m1 = Chem.MolFromSmiles('C=C')
m1_H = Chem.AddHs(m1)
print(m1_H.GetNumAtoms())
print(Chem.MolToSmiles(m1_H))
The result is:
>>> 6
>>> [H]C([H])=C([H])[H]
What if I would like to add only one (1) explicit hydrogen atom to a specific
non-hydrogen atom, let's say m1.GetAtomWithIdx(0)? In that case I would want
to have:
print(m1_H.GetNumAtoms())
print(Chem.MolToSmiles(m1_H))
>>> 3
>>> [H]C=C
I tried the following method: m1.GetAtomWithIdx(0).SetNumExplicitHs(1).
It sets one explicit H on the first atom of C=C, but I cannot convert the
result to SMILES with this additional explicit H shown, and I cannot use it
afterwards for substructure matching.
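To make the goal concrete, here is a minimal sketch of one way this might be done. As far as I understand, SetNumExplicitHs() only changes a hydrogen count stored on the atom and does not create a new atom in the graph, so the sketch edits a Chem.RWMol copy and bonds a real H atom to the chosen index. The helper name add_single_h is hypothetical (it is not an RDKit function). The last lines assume an RDKit version in which Chem.AddHs accepts the onlyOnAtoms argument; that call adds all hydrogens of the listed atoms, not just one.

```python
from rdkit import Chem


def add_single_h(mol, atom_idx):
    """Hypothetical helper: return a copy of mol with ONE hydrogen attached
    to atom_idx as a real atom in the molecular graph."""
    rw = Chem.RWMol(mol)                      # editable copy, input is untouched
    h_idx = rw.AddAtom(Chem.Atom(1))          # new H atom, returns its index
    rw.AddBond(atom_idx, h_idx, Chem.BondType.SINGLE)
    Chem.SanitizeMol(rw)                      # recompute valences / implicit Hs
    return rw.GetMol()


m1 = Chem.MolFromSmiles('C=C')

# One H atom as a graph node on atom 0; the other hydrogens stay implicit.
m1_one_h = add_single_h(m1, 0)
print(m1_one_h.GetNumAtoms())
print(Chem.MolToSmiles(m1_one_h))

# Assumption: this RDKit version supports onlyOnAtoms. This adds every
# hydrogen of atom 0 (two for this carbon), not a single one.
m1_atom0_hs = Chem.AddHs(m1, onlyOnAtoms=(0,))
print(m1_atom0_hs.GetNumAtoms())
```

The sketch leaves m1 itself unchanged, so the original molecule can still be used for comparisons.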
In the end I would like to do substructure matching in which the query
structures
[H]C=C or [H]C=CC match the following molecule:
[H]C(=C([H])C([H])([H])[H])C([H])([H])[H]
but at the same time the query structures [H]C=C([H])[H] or [H]C([H])=CC do
not match [H]C(=C([H])C([H])([H])[H])C([H])([H])[H]
PS. Of course, the structure [H]C([H])=C([H])[H], obtained from C=C with
Chem.AddHs(mol), is not matched onto
[H]C(=C([H])C([H])([H])[H])C([H])([H])[H], which is correct.
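The matching behaviour described above can be sketched as a small check. This is only an illustration under some assumptions: the queries are written as SMARTS with [#1] for the hydrogen atoms (my choice here, to state unambiguously that a real H atom is meant), and the target is 2-butene built from CC=CC with Chem.AddHs so that all of its hydrogens are atoms in the graph. The names target, matches and results are hypothetical.

```python
from rdkit import Chem

# Target: 2-butene with every hydrogen present as a real atom in the graph.
target = Chem.AddHs(Chem.MolFromSmiles('CC=CC'))


def matches(smarts):
    """Return True if the SMARTS query is found in the target molecule."""
    query = Chem.MolFromSmarts(smarts)
    return target.HasSubstructMatch(query)


# Queries with a single H on the double-bond carbon: wanted to match.
should_match = ['[#1]C=C', '[#1]C=CC']

# Queries requiring two H atoms on one double-bond carbon: wanted not to match.
should_not_match = ['[#1]C=C([#1])[#1]', '[#1]C([#1])=CC']

results = {s: matches(s) for s in should_match + should_not_match}
for smarts, found in results.items():
    print(smarts, found)
```

Each sp2 carbon of 2-butene carries exactly one hydrogen, so a query that demands two hydrogen neighbours on the same double-bond carbon has nothing to map onto.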
Thank you very much for your help,
Best regards,
Janusz Petkowski