Hi Paolo,

This is exactly what I tried to do, but without success, because I was only increasing the number of explicit Hs by one with nbr.SetNumExplicitHs(nbr.GetNumExplicitHs() + 1). My logic was to increase the number of Hs by 1 for each atom removed, and I am still puzzled as to why it should be increased by 2 for aromatic bonds (int(GetBondTypeAsDouble) == 2). Could you elaborate on that for me?

Giammy

On Thu, 7 Apr 2022 at 10:38, Paolo Tosco <paolo.tosco.m...@gmail.com> wrote:

> Hi Gianmarco,
>
> this issue has been discussed before.
> Removing bonds with RWMol.RemoveBond() will not adjust the implicit H
> count of the atoms at the two ends of the bond.
> While this is not important for the atom that is going to be removed, the
> count on the atom that stays needs to be adjusted. In particular, you need
> to add as many implicit Hs as the order of the bond that you have removed.
> See below for such an algorithm:
>
> with Chem.RWMol(mol) as rwmol:
>     for b in rwmol.GetBonds():
>         for a in (b.GetBeginAtom(), b.GetEndAtom()):
>             if a.GetDegree() == 1:
>                 oa = b.GetOtherAtom(a)
>                 if oa.GetDegree() > 1:
>                     oa.SetNumExplicitHs(oa.GetNumExplicitHs() +
>                                         int(b.GetBondTypeAsDouble()))
>                     rwmol.RemoveBond(a.GetIdx(), oa.GetIdx())
>                     rwmol.RemoveAtom(a.GetIdx())
>                     break
>
> Chem.SanitizeMol(rwmol)
>
> rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE
>
> rwmol
>
> [image: image.png]
>
> Using Chem.RWMol as a context manager will allow you to commit the changes
> only when you are done, so you won't invalidate atom and bond iterators
> along the way.
>
> Cheers,
> p.
>
> On Thu, Apr 7, 2022 at 10:51 AM Gianmarco Ghiandoni <ghiandon...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I am writing a function that removes atoms with only one bond and can be
>> applied recursively in order to find the scaffold of a molecule. The
>> function works in most cases but I have observed that, when aromatic rings
>> are involved, it produces a loss of information.
>> This is an example:
>>
>> from rdkit import Chem
>> smiles = "CCN1CCC(N(CCCCCCc2ccc(C(F)(F)F)cc2)C(=O)Cn2c(CCc3cccc(F)c3F)cc(=O)c3ccccc32)CC1"
>> Chem.MolFromSmiles(smiles.replace('@',''))
>>
>> [image: image.png]
>>
>> mol = Chem.MolFromSmiles(smiles.replace('@', ''))
>> rwmol = Chem.RWMol(mol)
>>
>> # Note that in this case, I am iterating through the atoms only once
>> # but the idea is to repeat this iteration until all single-bonded
>> # atoms have been removed
>> for a in list(rwmol.GetAtoms()):
>>     if len(a.GetBonds()) == 1:
>>         nbr = a.GetNeighbors()[0]
>>         rwmol.RemoveBond(a.GetIdx(), nbr.GetIdx())
>>         rwmol.RemoveAtom(a.GetIdx())
>> rwmol
>>
>> [image: image.png]
>>
>> If you check the top aromatic ring in the starting molecule and the
>> result after the first iteration, you can already see that the removal
>> messes up the aromatic configuration of the ring after removing the two
>> fluorines. If I try Chem.SanitizeMol at the end of the loop, it does
>> not fix the problem (KekulizeException: Can't kekulize mol. Unkekulized
>> atoms: 21 30 31 32 33 34 35 36 37). I have also made some attempts
>> using different sanitizeOps or increasing the number of hydrogens on the
>> atoms attached to those I remove, but I could not figure out a solution -
>> and clearly, if I carry on removing atoms recursively (i.e., I reapply the
>> removal again), then the molecule gets messed up completely:
>>
>> [image: image.png]
>>
>> Can anyone help me with understanding what's missing?
>>
>> Thanks,
>>
>> Giammy
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>

--
*Gianmarco*
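
On the question of what int(b.GetBondTypeAsDouble()) contributes for each bond type, a quick check may help. The snippet below is only a sketch: the helper name bond_order_increment and the four test SMILES (ethane, ethene, ethyne, benzene) are assumptions chosen for illustration and do not come from the thread. It reports the bond order as a float together with the truncated integer that would be added to the explicit H count.

```python
from rdkit import Chem

# Hypothetical test molecules (not from the thread), one per bond type.
EXAMPLES = {
    "single (ethane)": "CC",
    "double (ethene)": "C=C",
    "triple (ethyne)": "C#C",
    "aromatic (benzene)": "c1ccccc1",
}


def bond_order_increment(smiles):
    """Return (bond order as float, int-truncated order) for the first bond."""
    mol = Chem.MolFromSmiles(smiles)
    order = mol.GetBondWithIdx(0).GetBondTypeAsDouble()
    # int() truncates, so a fractional aromatic order is rounded down.
    return order, int(order)


for label, smi in EXAMPLES.items():
    print(label, bond_order_increment(smi))
```

GetBondTypeAsDouble() returns 1.5 for an aromatic bond, so int() truncates it to 1. An increment of 2 corresponds to a double bond, for instance when a terminal =O is removed.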
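
The idea of repeating the removal until no singly bonded atoms remain could be sketched as follows, using the same context-manager pattern as in Paolo's reply. The function names prune_terminal_atoms_once and prune_to_scaffold, the max_rounds cap, and the choice to leave two-atom fragments untouched are all assumptions of this sketch, and the molecules in the usage lines are illustrative rather than the one from the thread. Sanitization may still fail for some inputs, so treat this as a starting point rather than a general solution.

```python
from rdkit import Chem


def prune_terminal_atoms_once(mol):
    """Remove all degree-1 atoms in one pass; return (new_mol, n_removed)."""
    removed = 0
    # As a context manager, RWMol defers removals until the block exits,
    # so atom indices and iterators stay valid inside the loop.
    with Chem.RWMol(mol) as rwmol:
        for atom in rwmol.GetAtoms():
            if atom.GetDegree() != 1:
                continue
            bond = atom.GetBonds()[0]
            nbr = bond.GetOtherAtom(atom)
            if nbr.GetDegree() < 2:
                # Assumption of this sketch: keep two-atom fragments as they are.
                continue
            # Credit the surviving atom with one H per unit of bond order.
            nbr.SetNumExplicitHs(
                nbr.GetNumExplicitHs() + int(bond.GetBondTypeAsDouble())
            )
            rwmol.RemoveBond(atom.GetIdx(), nbr.GetIdx())
            rwmol.RemoveAtom(atom.GetIdx())
            removed += 1
    Chem.SanitizeMol(rwmol)
    return rwmol.GetMol(), removed


def prune_to_scaffold(mol, max_rounds=100):
    """Repeat the pruning pass until a pass removes nothing (or the cap is hit)."""
    current = Chem.Mol(mol)
    for _ in range(max_rounds):
        current, removed = prune_terminal_atoms_once(current)
        if removed == 0:
            break
    return current


# Illustrative inputs: a substituted benzene and cyclohexanone.
for smi in ("CCc1ccc(F)cc1", "O=C1CCCCC1"):
    scaffold = prune_to_scaffold(Chem.MolFromSmiles(smi))
    print(smi, "->", Chem.MolToSmiles(scaffold))
```

Each pass works on a fresh RWMol and sanitizes before the next pass, so the degree of every atom is recomputed from the committed structure rather than from a partially edited one.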