Hi Jenke,

The reason why the resulting molecule contains no atoms is that the 3,5-dichlorophenyl substituent matches twice the 3-chlorophenyl substructure. If you look at the deleteSubstructs() code in ChemTransforms.cpp you may see that the atoms deleted from the main structures are the union of all possible matches (two in this case). Therefore, both the 3- and the 5-chloro substituents are deleted, and as all other atoms are part of the MCS, you are left with an empty molecule.


If you wish to delete matches one at a time, rather than deleting the union of all matches, you may use the DeleteSubstructs() Python implementation that I drafted in this gist:

https://gist.github.com/ptosco/1b6bc727ddee32b4d411cf5c2aea7291

(disclaimer: I wrote it very quickly and tested it only with your example, so I don't guarantee it is bug-free)

This DeleteSubstructs() implementation returns a list of all structures obtained by deleting the matched substructures one at a time. As I see you are using 3D structures, the code deletes all orphaned hydrogens and caps the fragments remaining after deleting substructures with hydrogens. If you don't care about 3D coordinates you may want to uniquify results based on canonical SMILES and, for example in the case of lig12, keep the remaining HCl fragment only once.

Another observation is that the PDB reader implemented in the RDKit autoconnects atoms based on proximity, but does not guess bond orders, so your structures are not chemically correct as they are all single-bonded even if you have carbonyls, aromatic rings, etc. In my gist I used Flare to perceive bond orders as I was running the code in the Flare Jupyter Notebook; you may also use other cheminformatics toolkit for that purpose.

I hope the above helps; please feel free to get back to me off-list if something is not clear.

Cheers,
p.

On 01/21/19 15:51, SCHEEN Jenke wrote:

Hi all,


I'm trying to remove the MCS between two molecules (attached, 02.pdb and 12.pdb) using rdFMCS.FindMCS and AllChem.DeleteSubstructs using the following code:



#########################################

from rdkit import Chem
from rdkit.Chem import AllChem, rdmolfiles, rdFMCS

#########################################

# load molecules:
lig02_pdb = open("02.pdb", 'r').read()
lig12_pdb = open("12.pdb", 'r').read()

lig02_mol = rdmolfiles.MolFromPDBBlock(lig02_pdb)
lig12_mol = rdmolfiles.MolFromPDBBlock(lig12_pdb)

# make list of molecules to map the MCS to:
perturbation_pair = []

perturbation_pair.append(lig02_mol)
perturbation_pair.append(lig12_mol)

MCS_object = rdFMCS.FindMCS(perturbation_pair, completeRingsOnly=True)
MCS_SMARTS = Chem.MolFromSmarts(MCS_object.smartsString)

# remove MCS from each molecule:
lig02_stripped = AllChem.DeleteSubstructs(lig02_mol, MCS_SMARTS)
lig12_stripped = AllChem.DeleteSubstructs(lig12_mol, MCS_SMARTS)

# print SMILES of each stripped molecule:
print("lig02: " + str(Chem.MolToSmiles(lig02_stripped)))
print("lig12: " + str(Chem.MolToSmiles(lig12_stripped)))

#print(Chem.MolToMolBlock(MCS_SMARTS),file=open('./MCS.mol','w+'))

#########################################


I attached the MCS.mol file as well. The lig12.pdb contains an extra Cl atom, and lig12_stripped should thus contain a single Cl atom after deletion of the MCS substructure (the MCS substructure is equal to lig02.pdb). When running the script it actually contains 0 atoms.


I haven't been able to locate the source of the issue in the AllChem.DeleteSubstructs documentation, does anyone have a suggestion?


I'm using a conda install of rdkit (2018.09.1.0).



Best,

Jenke
--------------------------------------------------------------
University of Edinburgh
David Brewster road
Edinburgh, EH9 3FJ
United Kingdom
-------------------------------------------------------------

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.




_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to