Hi all,

I am writing a function that removes atoms with only one bond and that can
be applied repeatedly to find the scaffold of a molecule. The function
works in most cases, but I have observed that it loses information when
aromatic rings are involved. Here is an example:

from rdkit import Chem
smiles = (
    "CCN1CCC(N(CCCCCCc2ccc(C(F)(F)F)cc2)C(=O)Cn2c(CCc3cccc(F)c3F)cc(=O)c3ccccc32)CC1"
)
Chem.MolFromSmiles(smiles.replace('@',''))

[image: image.png]
mol = Chem.MolFromSmiles(smiles.replace('@', ''))
rwmol = Chem.RWMol(mol)

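# Note that in this case I am iterating through the atoms only once,
# but the idea is to repeat this pass until all single-bonded atoms
# have been removed.
# Collect the indices first, then remove from the highest index down,
# because RemoveAtom renumbers the atoms that follow the removed one.
to_remove = [a.GetIdx() for a in rwmol.GetAtoms() if a.GetDegree() == 1]
for idx in sorted(to_remove, reverse=True):
    rwmol.RemoveAtom(idx)  # also removes the bonds attached to the atom
rwmol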

[image: image.png]
If you compare the top aromatic ring in the starting molecule with the
result after the first iteration, you can already see that removing the two
fluorines breaks the aromatic representation of the ring. Calling
Chem.SanitizeMol at the end of the loop does not fix the problem; it raises
an exception (KekulizeException: Can't kekulize mol.  Unkekulized atoms: 21
30 31 32 33 34 35 36 37). I have also tried different sanitizeOps and tried
increasing the number of hydrogens on the atoms attached to those I remove,
but I could not find a solution. And clearly, if I carry on removing atoms
(i.e., I apply the removal again), the molecule gets messed up completely:

[image: image.png]
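
To make the intended repetition concrete, here is a minimal sketch of the
pruning loop on a plain adjacency dictionary instead of an RDKit molecule.
The function name prune_terminal_nodes and the toy graph are made up for
illustration. The sketch covers only the graph side of the idea (repeat the
pass until no single-bonded node is left) and does nothing about
aromaticity or hydrogen counts, which is the part that fails in RDKit.

```python
def prune_terminal_nodes(adjacency):
    """Repeatedly drop nodes that have exactly one neighbour.

    `adjacency` maps each node to the set of its neighbours.
    Hypothetical helper for illustration; not part of RDKit.
    """
    # Work on a copy so the caller's graph is left untouched.
    graph = {node: set(nbrs) for node, nbrs in adjacency.items()}
    while True:
        # Snapshot the terminal nodes before removing anything in this pass.
        terminal = [node for node, nbrs in graph.items() if len(nbrs) == 1]
        if not terminal:
            return graph
        for node in terminal:
            # Remove the node and detach it from any neighbour still present.
            for nbr in graph.pop(node):
                if nbr in graph:
                    graph[nbr].discard(node)


# Toy graph: a three-membered ring (0, 1, 2) with a two-atom chain (3, 4)
# on node 2 and a single substituent (5) on node 0.
toy = {0: {1, 2, 5}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}, 5: {0}}
scaffold = prune_terminal_nodes(toy)
print(sorted(scaffold))  # [0, 1, 2]
```

On this toy graph the first pass removes nodes 4 and 5, the second pass
removes node 3, and the third pass finds nothing left to remove, so only
the ring survives. A fully acyclic graph shrinks to a single isolated node
or to nothing.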

Can anyone help me understand what is missing?

Thanks,

Giammy