Hi Gianmarco,

this issue has been discussed before.
Removing bonds with RWMol.RemoveBond() will not adjust the implicit H count
of the atoms at the two ends of the bond.
While this is not important for the atom that is going to be removed, the
count on the atom that stays needs to be adjusted. In particular, you need
to add as many Hs as the order of the bond that you have removed; the code
below does this by increasing the explicit H count of the atom that stays:

with Chem.RWMol(mol) as rwmol:
    for b in rwmol.GetBonds():
        for a in (b.GetBeginAtom(), b.GetEndAtom()):
            # a is a terminal atom if it has exactly one bond
            if a.GetDegree() == 1:
                oa = b.GetOtherAtom(a)
                if oa.GetDegree() > 1:
                    # add one H per unit of bond order to the atom that stays
                    oa.SetNumExplicitHs(
                        oa.GetNumExplicitHs() + int(b.GetBondTypeAsDouble())
                    )
                    rwmol.RemoveBond(a.GetIdx(), oa.GetIdx())
                    rwmol.RemoveAtom(a.GetIdx())
                    break

Chem.SanitizeMol(rwmol)
# returns rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE,
# i.e. no sanitization step failed

rwmol

[image: image.png]

Using Chem.RWMol as a context manager will allow you to commit the changes
only when you are done, so you won't invalidate atom and bond iterators
along the way.
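
Since the original question was about repeating the removal until only the
scaffold is left, here is a minimal sketch of how the loop above could be
wrapped in a function and applied until nothing changes. The names
strip_terminal_atoms and strip_to_scaffold are made up for this example and
are not RDKit functions. It iterates over atoms rather than bonds, and I have
only reasoned through it for simple cases like the one shown, so treat it as
a starting point rather than a tested recipe.

```python
from rdkit import Chem


def strip_terminal_atoms(mol):
    """One pass: remove every atom that has exactly one bond.

    Hypothetical helper, not part of RDKit.
    """
    # Inside the "with" block removals are only recorded; they are
    # committed together on exit, so atom indices stay valid while looping.
    with Chem.RWMol(mol) as rwmol:
        for a in rwmol.GetAtoms():
            if a.GetDegree() != 1:
                continue
            b = a.GetBonds()[0]
            oa = b.GetOtherAtom(a)
            # Leave two-atom fragments alone so something always remains.
            if oa.GetDegree() > 1:
                # Add one H per unit of bond order to the atom that stays.
                oa.SetNumExplicitHs(
                    oa.GetNumExplicitHs() + int(b.GetBondTypeAsDouble())
                )
                rwmol.RemoveBond(a.GetIdx(), oa.GetIdx())
                rwmol.RemoveAtom(a.GetIdx())
    Chem.SanitizeMol(rwmol)
    return rwmol.GetMol()


def strip_to_scaffold(mol):
    """Repeat the pass until the atom count stops changing (hypothetical)."""
    while True:
        stripped = strip_terminal_atoms(mol)
        if stripped.GetNumAtoms() == mol.GetNumAtoms():
            return stripped
        mol = stripped


# Small example: the CF3 group and the ring fluorine should be stripped,
# leaving only the six-membered aromatic ring.
mol = Chem.MolFromSmiles("FC(F)(F)c1ccc(F)cc1")
scaffold = strip_to_scaffold(mol)
print(Chem.MolToSmiles(scaffold))
```

Sanitizing after every pass means a problem shows up in the pass that caused
it, rather than at the very end.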

Cheers,
p.

On Thu, Apr 7, 2022 at 10:51 AM Gianmarco Ghiandoni <ghiandon...@gmail.com>
wrote:

> Hi all,
>
> I am writing a function that removes atoms with only one bond and can be
> applied recursively in order to find the scaffold of a molecule. The
> function works in most cases but I have observed that, when aromatic rings
> are involved, it produces a loss of information. This is an example:
>
> from rdkit import Chem
> smiles = "CCN1CCC(N(CCCCCCc2ccc(C(F)(F)F)cc2)C(=O)Cn2c(CCc3cccc(F)c3F)cc(=O)c3ccccc32)CC1"
> Chem.MolFromSmiles(smiles.replace('@',''))
>
> [image: image.png]
> mol = Chem.MolFromSmiles(smiles.replace('@', ''))
> rwmol = Chem.RWMol(mol)
>
> # Note that in this case, I am iterating through the atoms only once
> # but the idea is to repeat this iteration until all single-bonded
> # atoms have been removed
> for a in list(rwmol.GetAtoms()):
>     if len(a.GetBonds()) == 1:
>         nbr = a.GetNeighbors()[0]
>         rwmol.RemoveBond(a.GetIdx(), nbr.GetIdx())
>         rwmol.RemoveAtom(a.GetIdx())
> rwmol
>
> [image: image.png]
> If you check the top aromatic ring in the starting molecule and the result
> after the first iteration, you can already see that the removal messes up
> the aromatic configuration of the ring after removing the two fluorines. If
> I try to Chem.SanitizeMol, at the end of the loop, it will not fix the
> problem (KekulizeException: Can't kekulize mol.  Unkekulized atoms: 21 30
> 31 32 33 34 35 36 37). I have also tried using different sanitizeOps and
> increasing the number of hydrogens on the atoms attached to those I
> remove, but I could not figure out a solution - and clearly, if I
> carry on removing atoms recursively (i.e., I reapply the removal again),
> then the molecule gets messed up completely:
>
> [image: image.png]
>
> Can anyone help me with understanding what's missing?
>
> Thanks,
>
> Giammy
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>