Re: [Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?
I'm not sure why you'd want to reimplement something that's already there, but if this works better for you...

The easiest way to get a single function you could call would be to do something like:

In [18]: def MolToGenericScaffold(mol):
    ...:     return MurckoScaffold.MakeScaffoldGeneric(MurckoScaffold.GetScaffoldForMol(mol))
    ...:

In [19]: Chem.MolToSmiles(MolToGenericScaffold(Chem.MolFromSmiles('CCc1ccc(O)cc1C(=O)C1CC1')))
Out[19]: 'CC(C1CCCCC1)C1CC1'

-greg

On Tue, Apr 27, 2021 at 4:32 AM Francois Berenger wrote:
> I end up with this:
> [...]

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
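The thread motivates generic scaffolds as a filter for molecular diversity. A minimal sketch of that use, assuming the two-step route from the reply above, could group input SMILES by their generic scaffold. The helper names `generic_scaffold_smiles` and `group_by_generic_scaffold` and the example molecules are hypothetical and not part of RDKit or of the original messages.

```python
from collections import defaultdict

from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold


def generic_scaffold_smiles(smiles):
    """Return the canonical SMILES of the generic Murcko scaffold, or None."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # unparsable input
        return None
    # Step 1: strip side chains; step 2: all atoms -> C, all bonds -> single.
    scaffold = MurckoScaffold.GetScaffoldForMol(mol)
    return Chem.MolToSmiles(MurckoScaffold.MakeScaffoldGeneric(scaffold))


def group_by_generic_scaffold(smiles_list):
    """Map each generic scaffold SMILES to the input SMILES that share it."""
    groups = defaultdict(list)
    for smi in smiles_list:
        key = generic_scaffold_smiles(smi)
        if key is not None:
            groups[key].append(smi)
    return dict(groups)


# Hypothetical inputs: ethylbenzene and 4-hydroxypyridine both reduce to a
# plain six-membered ring; the third molecule is the example from the thread.
examples = ['CCc1ccccc1', 'Oc1ccncc1', 'CCc1ccc(O)cc1C(=O)C1CC1']
groups = group_by_generic_scaffold(examples)
for scaffold, members in groups.items():
    print(scaffold, members)
```

Keeping one representative per key would then give a scaffold-diverse subset; how a representative is chosen is left open here.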
Re: [Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?
On 27/04/2021 10:12, Francois Berenger wrote:
> On 26/04/2021 23:35, Greg Landrum wrote:
>> [...] first you need to find the Murcko Scaffold, then you can
>> convert that scaffold to the generic form:
>> [...]
>
> Ok, maybe this two-step process is a little bit better, but still
> not exactly what I would expect in some cases.
>
> I'll say if I program something which I prefer.

Hello,

I ended up with this:
---
from rdkit import Chem

def find_terminal_atoms(mol):
    # indices of atoms with exactly one bond
    # (an isolated atom with no bonds is not terminal by this test)
    return [a.GetIdx() for a in mol.GetAtoms() if a.GetDegree() == 1]

# Bemis, G. W., & Murcko, M. A. (1996).
# "The properties of known drugs. 1. Molecular frameworks."
# Journal of medicinal chemistry, 39(15), 2887-2893.
def BemisMurckoFramework(mol):
    # keep only Heavy Atoms (HA)
    only_HA = Chem.RemoveHs(mol)
    # switch all HA to Carbon
    rw_mol = Chem.RWMol(only_HA)
    for i in range(rw_mol.GetNumAtoms()):
        rw_mol.ReplaceAtom(i, Chem.Atom(6))
    # switch all non single bonds to single
    for b in rw_mol.GetBonds():
        if b.GetBondType() != Chem.BondType.SINGLE:
            b.SetBondType(Chem.BondType.SINGLE)
            b.SetIsAromatic(False)
    # as long as there are terminal atoms, remove them;
    # go from the highest index down, because RemoveAtom
    # renumbers the atoms that come after the removed one
    terminal_atoms = find_terminal_atoms(rw_mol)
    while terminal_atoms:
        for idx in sorted(terminal_atoms, reverse=True):
            rw_mol.RemoveAtom(idx)  # also removes the bonds to this atom
        terminal_atoms = find_terminal_atoms(rw_mol)
    return rw_mol.GetMol()
---

I don't claim this is very efficient Python code. I am not very good at snake charming.

Regards,
F.
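The practical difference between the hand-written framework and the RDKit two-step route is that `GetScaffoldForMol` keeps atoms attached to the scaffold by a double bond (the carbonyl oxygen in the thread's example), which then survive as a one-carbon stub in the generic form. A possible alternative, sketched here under that assumption, is to start from the RDKit generic scaffold and prune whatever terminal atoms remain. The name `strict_framework` is hypothetical and this is not an RDKit function.

```python
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold


def strict_framework(mol):
    """Generic Murcko scaffold with leftover exocyclic atoms pruned (sketch)."""
    scaffold = MurckoScaffold.GetScaffoldForMol(mol)
    generic = MurckoScaffold.MakeScaffoldGeneric(scaffold)
    rw = Chem.RWMol(generic)
    while True:
        # Atoms with at most one neighbour are chain ends or isolated atoms.
        terminal = [a.GetIdx() for a in rw.GetAtoms() if a.GetDegree() <= 1]
        if not terminal:
            break
        # Remove from the highest index down: RemoveAtom renumbers the
        # atoms that come after the removed one.
        for idx in sorted(terminal, reverse=True):
            rw.RemoveAtom(idx)
    result = rw.GetMol()
    # Recompute valences and ring information after the edits.
    Chem.SanitizeMol(result)
    return result


example = Chem.MolFromSmiles('CCc1ccc(O)cc1C(=O)C1CC1')
rdkit_generic = MurckoScaffold.MakeScaffoldGeneric(
    MurckoScaffold.GetScaffoldForMol(example))
pruned = strict_framework(example)
print(Chem.MolToSmiles(rdkit_generic), rdkit_generic.GetNumAtoms())
print(Chem.MolToSmiles(pruned), pruned.GetNumAtoms())
```

For the example molecule the pruned result has one atom fewer than the RDKit generic scaffold: the carbon that replaced the carbonyl oxygen is gone, leaving only the two rings and the one-atom linker.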
Re: [Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?
Hi Francois,

The implementation which is there does, I believe, the right thing. However... first you need to find the Murcko Scaffold, then you can convert that scaffold to the generic form:

In [5]: m = Chem.MolFromSmiles('CCc1ccc(O)cc1C(=O)C1CC1')
In [6]: scaff = MurckoScaffold.GetScaffoldForMol(m)
In [7]: Chem.MolToSmiles(scaff)
Out[7]: 'O=C(c1ccccc1)C1CC1'
In [8]: framework = MurckoScaffold.MakeScaffoldGeneric(scaff)
In [9]: print(Chem.MolToSmiles(framework))
CC(C1CCCCC1)C1CC1

Best,
-greg

On Mon, Apr 26, 2021 at 11:15 AM Francois Berenger wrote:
> I am trying MurckoScaffold.MakeScaffoldGeneric(mol),
> but this keeps the side chains.
> [...]
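To make the point of the reply concrete, the following sketch compares calling `MakeScaffoldGeneric` directly on the molecule (the original question) with the two-step route described above, using heavy-atom counts on the thread's example. The variable names are illustrative only.

```python
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

mol = Chem.MolFromSmiles('CCc1ccc(O)cc1C(=O)C1CC1')

# One step: only atom and bond types are made generic, nothing is removed,
# so the ethyl and hydroxyl side chains are still present.
one_step = MurckoScaffold.MakeScaffoldGeneric(mol)

# Two steps: strip side chains first, then make the scaffold generic.
scaffold = MurckoScaffold.GetScaffoldForMol(mol)
two_step = MurckoScaffold.MakeScaffoldGeneric(scaffold)

print('input atoms:   ', mol.GetNumAtoms())
print('one-step atoms:', one_step.GetNumAtoms())
print('two-step atoms:', two_step.GetNumAtoms())
```

The one-step result has as many heavy atoms as the input, while the two-step result is smaller because the side chains were removed before the atom types were generalized.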
[Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?
Hello,

I am trying MurckoScaffold.MakeScaffoldGeneric(mol), but this keeps the side chains, while my understanding of BM scaffolds is that only rings and ring linkers should be kept.

The fact that the rdkit implementation keeps the side chains makes Murcko scaffolds a much less powerful filter to enforce molecular diversity.

And I don't even see any option to force the standard/vanilla behavior. Or am I missing something?

Regards,
F.