Hi Jose Manuel,
the problem is just that the scaffold returned by
MurckoScaffold.GetScaffoldForMol() has no explicit hydrogens on the imino N:
# columns: index, atomic number, explicit Hs, implicit Hs, aromatic
for atom in ms.GetAtoms():
    print(atom.GetIdx(), atom.GetAtomicNum(), atom.GetNumExplicitHs(),
          atom.GetNumImplicitHs(), atom.GetIsAromatic())
0 7 1 0 True
1 6 0 1 True
2 7 0 0 True
3 6 0 0 True
4 6 0 0 True
5 7 0 0 False <--
6 7 1 0 True
7 6 0 1 True
8 7 0 0 True
9 6 0 0 True
Therefore, after sanitizing, that nitrogen is set to be a radical:
ms_all.GetAtomWithIdx(5).GetNumRadicalElectrons()
1
and the Unicode bullet operator used to represent the radical cannot be
encoded by the latin-1 codec, hence the UnicodeEncodeError.
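To see the encoding failure in isolation, here is a minimal sketch using only the Python standard library. It assumes the radical marker is U+2219 (the code point named BULLET OPERATOR); the helper can_encode is hypothetical and only there for illustration.

```python
# U+2219 BULLET OPERATOR (assumed to be the radical marker mentioned above).
radical_marker = "\u2219"


def can_encode(text, codec):
    """Return True if text can be encoded with the given codec (illustrative helper)."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# latin-1 only covers code points U+0000 to U+00FF, so this reports False.
print("latin-1:", can_encode(radical_marker, "latin-1"))
# UTF-8 can represent any Unicode code point, so this reports True.
print("utf-8:", can_encode(radical_marker, "utf-8"))
```

So any output path that falls back to latin-1 will fail as soon as a radical label has to be written.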
If you do a
ms_all.GetAtomWithIdx(5).SetNumExplicitHs(1)
before sanitizing, your problem will disappear.
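A small sketch of that workaround on a toy molecule, not on your actual scaffold: the SMILES CC(C)=[N] is just an assumed stand-in for an imino N without hydrogens, and the helper radicals_on_nitrogen is hypothetical.

```python
from rdkit import Chem


def radicals_on_nitrogen(smiles, add_explicit_h):
    """Sanitize a molecule and report the radical count on its first nitrogen.

    Illustrative helper; the toy SMILES used below is an assumption.
    """
    # Parse without sanitizing so the H count can be fixed beforehand.
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    n_idx = next(a.GetIdx() for a in mol.GetAtoms() if a.GetAtomicNum() == 7)
    if add_explicit_h:
        # The fix suggested above: give the imino N one explicit hydrogen.
        mol.GetAtomWithIdx(n_idx).SetNumExplicitHs(1)
    Chem.SanitizeMol(mol)
    return mol.GetAtomWithIdx(n_idx).GetNumRadicalElectrons()


toy_smiles = "CC(C)=[N]"  # bracketed N: no hydrogens, only the double bond
print("without fix:", radicals_on_nitrogen(toy_smiles, add_explicit_h=False))
print("with fix:   ", radicals_on_nitrogen(toy_smiles, add_explicit_h=True))
```

Without the explicit H the nitrogen should come out of sanitization with one radical electron; with it, none.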
Cheers,
p.
On 28/08/2019 13:22, Jose-Manuel Gally wrote:
Dear all,
I noticed a strange behavior when extracting Murcko scaffolds from
molecules preprocessed with an in-house standardization protocol.
I made a gist to illustrate the problem:
https://gist.github.com/jose-manuel/04d69dd3ac52cca74449e73d614df42e
This leaves me with several questions:
1. When working with the standardized molecule, I get a drawing of
the Murcko scaffold without Hs on the terminal nitrogen.
Why is that? I would expect either a radical (so with '.') or an
additional hydrogen. The smiles does not indicate the molecule is
a radical either.
2. When sanitizing the molecule to update the SMILES, I get a radical
by default, instead of an H bound to the nitrogen. Why isn't an H
added instead? If I switch off the FINDRADICALS sanitization flag,
I do not get an extra hydrogen either...
3. When I apply the default sanitization to the Murcko scaffold and
try to display it, I get a UnicodeEncodeError.
If I manually replace [N] by N in the smiles and create a new
molecule from it, I don't get an error anymore. Is there a
workaround? Interestingly, the function Draw.MolsToGridImage works
just fine but I could not find how to change the atom label size
and bond width.
Am I missing something obvious?
Many thanks in advance as any feedback would be much appreciated!
Cheers,
Jose Manuel
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss