Re: [Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?

2021-04-27 Thread Greg Landrum
I'm not sure why you'd want to reimplement something that's already there,
but if this works better for you...

the easiest way to get a single function you could call would be to do
something like:

In [18]: def MolToGenericScaffold(mol):
    ...:     return MurckoScaffold.MakeScaffoldGeneric(MurckoScaffold.GetScaffoldForMol(mol))
    ...:
In [19]: Chem.MolToSmiles(MolToGenericScaffold(Chem.MolFromSmiles('CCc1ccc(O)cc1C(=O)C1CC1')))
Out[19]: 'CC(C1CCCCC1)C1CC1'


-greg


On Tue, Apr 27, 2021 at 4:32 AM Francois Berenger  wrote:

> On 27/04/2021 10:12, Francois Berenger wrote:
> > On 26/04/2021 23:35, Greg Landrum wrote:
> >> Hi Francois,
> >>
> >> The implementation which is there does, I believe, the right thing.
> >> However... first you need to find the Murcko Scaffold, then you can
> >> convert that scaffold to the generic form:
> >>
> >> In [5]: m = Chem.MolFromSmiles('CCc1ccc(O)cc1C(=O)C1CC1')
> >> In [6]: scaff = MurckoScaffold.GetScaffoldForMol(m)
> >> In [7]: Chem.MolToSmiles(scaff)
> >> Out[7]: 'O=C(c1ccccc1)C1CC1'
> >> In [8]: framework = MurckoScaffold.MakeScaffoldGeneric(scaff)
> >> In [9]: print(Chem.MolToSmiles(framework))
> >> CC(C1CCCCC1)C1CC1
> >
> > Ok, maybe this two-step process is a little bit better, but it is still
> > not exactly what I would expect in some cases.
> >
> > I'll report back if I program something I prefer.
>
> Hello,
>
> I ended up with this:
> ---
> from rdkit import Chem
>
> def find_terminal_atoms(mol):
>     # terminal atoms have exactly one bond
>     res = []
>     for a in mol.GetAtoms():
>         if len(a.GetBonds()) == 1:
>             res.append(a)
>     return res
>
> # Bemis, G. W., & Murcko, M. A. (1996).
> # "The properties of known drugs. 1. Molecular frameworks."
> # Journal of medicinal chemistry, 39(15), 2887-2893.
> def BemisMurckoFramework(mol):
>     # keep only Heavy Atoms (HA)
>     only_HA = Chem.RemoveHs(mol)
>     # switch all HA to Carbon
>     rw_mol = Chem.RWMol(only_HA)
>     for i in range(rw_mol.GetNumAtoms()):
>         rw_mol.ReplaceAtom(i, Chem.Atom(6))
>     # switch all non single bonds to single;
>     # collect the atom index pairs first so that no bond is
>     # modified while iterating over the bonds
>     non_single_bonds = []
>     for b in rw_mol.GetBonds():
>         if b.GetBondType() != Chem.BondType.SINGLE:
>             non_single_bonds.append((b.GetBeginAtomIdx(), b.GetEndAtomIdx()))
>     for (j, k) in non_single_bonds:
>         rw_mol.RemoveBond(j, k)
>         rw_mol.AddBond(j, k, Chem.BondType.SINGLE)
>     # as long as there are terminal atoms, remove them
>     terminal_atoms = find_terminal_atoms(rw_mol)
>     while terminal_atoms != []:
>         # remove the highest indexes first: RemoveAtom renumbers the
>         # atoms which come after the removed one
>         for i in sorted([a.GetIdx() for a in terminal_atoms], reverse=True):
>             # RemoveAtom also removes the bonds of that atom
>             rw_mol.RemoveAtom(i)
>         terminal_atoms = find_terminal_atoms(rw_mol)
>     return rw_mol.GetMol()
> ---
>
> I don't claim this is very efficient Python code. I am not very good at
> snake charming.
>
> Regards,
> F.
>
> >> Best,
> >> -greg
> >>
> >> On Mon, Apr 26, 2021 at 11:15 AM Francois Berenger 
> >> wrote:
> >>
> >>> Hello,
> >>>
> >>> I am trying MurckoScaffold.MakeScaffoldGeneric(mol),
> >>> but this keeps the side chains.
> >>>
> >>> My understanding of BM scaffolds, however, is that only rings
> >>> and ring linkers should be kept.
> >>>
> >>> The fact that the rdkit implementation keeps the
> >>> side chains makes Murcko scaffolds a much less powerful filter
> >>> to enforce molecular diversity.
> >>>
> >>> And I don't even see any option to force the standard/vanilla
> >>> behavior.
> >>> Or, am I missing something?
> >>>
> >>> Regards,
> >>> F.
> >>>
> >>> ___
> >>> Rdkit-discuss mailing list
> >>> Rdkit-discuss@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> >
> >
>


Re: [Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?

2021-04-26 Thread Francois Berenger

On 27/04/2021 10:12, Francois Berenger wrote:
> On 26/04/2021 23:35, Greg Landrum wrote:
>> Hi Francois,
>>
>> The implementation which is there does, I believe, the right thing.
>> However... first you need to find the Murcko Scaffold, then you can
>> convert that scaffold to the generic form:
>>
>> In [5]: m = Chem.MolFromSmiles('CCc1ccc(O)cc1C(=O)C1CC1')
>> In [6]: scaff = MurckoScaffold.GetScaffoldForMol(m)
>> In [7]: Chem.MolToSmiles(scaff)
>> Out[7]: 'O=C(c1ccccc1)C1CC1'
>> In [8]: framework = MurckoScaffold.MakeScaffoldGeneric(scaff)
>> In [9]: print(Chem.MolToSmiles(framework))
>> CC(C1CCCCC1)C1CC1
>
> Ok, maybe this two-step process is a little bit better, but it is still
> not exactly what I would expect in some cases.
>
> I'll report back if I program something I prefer.


Hello,

I ended up with this:
---
from rdkit import Chem

def find_terminal_atoms(mol):
    # terminal atoms have exactly one bond
    res = []
    for a in mol.GetAtoms():
        if len(a.GetBonds()) == 1:
            res.append(a)
    return res

# Bemis, G. W., & Murcko, M. A. (1996).
# "The properties of known drugs. 1. Molecular frameworks."
# Journal of medicinal chemistry, 39(15), 2887-2893.
def BemisMurckoFramework(mol):
    # keep only Heavy Atoms (HA)
    only_HA = Chem.RemoveHs(mol)
    # switch all HA to Carbon
    rw_mol = Chem.RWMol(only_HA)
    for i in range(rw_mol.GetNumAtoms()):
        rw_mol.ReplaceAtom(i, Chem.Atom(6))
    # switch all non single bonds to single;
    # collect the atom index pairs first so that no bond is
    # modified while iterating over the bonds
    non_single_bonds = []
    for b in rw_mol.GetBonds():
        if b.GetBondType() != Chem.BondType.SINGLE:
            non_single_bonds.append((b.GetBeginAtomIdx(), b.GetEndAtomIdx()))
    for (j, k) in non_single_bonds:
        rw_mol.RemoveBond(j, k)
        rw_mol.AddBond(j, k, Chem.BondType.SINGLE)
    # as long as there are terminal atoms, remove them
    terminal_atoms = find_terminal_atoms(rw_mol)
    while terminal_atoms != []:
        # remove the highest indexes first: RemoveAtom renumbers the
        # atoms which come after the removed one
        for i in sorted([a.GetIdx() for a in terminal_atoms], reverse=True):
            # RemoveAtom also removes the bonds of that atom
            rw_mol.RemoveAtom(i)
        terminal_atoms = find_terminal_atoms(rw_mol)
    return rw_mol.GetMol()
---

I don't claim this is very efficient Python code. I am not very good at
snake charming.


Regards,
F.

>> Best,
>> -greg
>>
>> On Mon, Apr 26, 2021 at 11:15 AM Francois Berenger wrote:
>>
>>> Hello,
>>>
>>> I am trying MurckoScaffold.MakeScaffoldGeneric(mol),
>>> but this keeps the side chains.
>>>
>>> My understanding of BM scaffolds, however, is that only rings
>>> and ring linkers should be kept.
>>>
>>> The fact that the rdkit implementation keeps the
>>> side chains makes Murcko scaffolds a much less powerful filter
>>> to enforce molecular diversity.
>>>
>>> And I don't even see any option to force the standard/vanilla
>>> behavior.
>>> Or, am I missing something?
>>>
>>> Regards,
>>> F.



Re: [Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?

2021-04-26 Thread Greg Landrum
Hi Francois,

The implementation which is there does, I believe, the right thing.
However... first you need to find the Murcko Scaffold, then you can convert
that scaffold to the generic form:

In [5]: m = Chem.MolFromSmiles('CCc1ccc(O)cc1C(=O)C1CC1')
In [6]: scaff = MurckoScaffold.GetScaffoldForMol(m)
In [7]: Chem.MolToSmiles(scaff)
Out[7]: 'O=C(c1ccccc1)C1CC1'
In [8]: framework = MurckoScaffold.MakeScaffoldGeneric(scaff)
In [9]: print(Chem.MolToSmiles(framework))
CC(C1CCCCC1)C1CC1



Best,
-greg


On Mon, Apr 26, 2021 at 11:15 AM Francois Berenger  wrote:

> Hello,
>
> I am trying MurckoScaffold.MakeScaffoldGeneric(mol),
> but this keeps the side chains.
>
> My understanding of BM scaffolds, however, is that only rings
> and ring linkers should be kept.
>
> The fact that the rdkit implementation keeps the
> side chains makes Murcko scaffolds a much less powerful filter
> to enforce molecular diversity.
>
> And I don't even see any option to force the standard/vanilla behavior.
> Or, am I missing something?
>
> Regards,
> F.
>
>


[Rdkit-discuss] Do we have an exact implementation of Bemis-Murcko scaffolds in rdkit?

2021-04-26 Thread Francois Berenger

Hello,

I am trying MurckoScaffold.MakeScaffoldGeneric(mol),
but this keeps the side chains.

My understanding of BM scaffolds, however, is that only rings
and ring linkers should be kept.

The fact that the rdkit implementation keeps the
side chains makes Murcko scaffolds a much less powerful filter
to enforce molecular diversity.

And I don't even see any option to force the standard/vanilla behavior.
Or, am I missing something?

Regards,
F.
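
[Editor's sketch] The diversity filter mentioned above usually means keeping one molecule per distinct scaffold. A minimal illustration, assuming the caller supplies a function that maps a molecule to a hashable scaffold key (for example a canonical scaffold SMILES); the names first_per_scaffold, scaffold_key and TOY_MOLECULES, and the toy labels "F1" and "F2", are hypothetical placeholders rather than real scaffolds.

```python
def first_per_scaffold(molecules, scaffold_key):
    """Keep the first molecule seen for each distinct scaffold key."""
    seen = set()
    kept = []
    for mol in molecules:
        key = scaffold_key(mol)
        if key not in seen:
            seen.add(key)
            kept.append(mol)
    return kept


# Toy stand-in data: each "molecule" is a (name, scaffold label) pair.
TOY_MOLECULES = [("mol_a", "F1"), ("mol_b", "F1"), ("mol_c", "F2")]

kept = first_per_scaffold(TOY_MOLECULES, scaffold_key=lambda m: m[1])
print(kept)  # prints [('mol_a', 'F1'), ('mol_c', 'F2')]
```

The coarser the key, the more molecules collapse onto one representative. A scaffold definition that keeps extra atoms yields more distinct keys and therefore filters less.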

