RDKitters,

I'm working on a project in which I want to align a collection of
structures with their most similar structures and display the results in
PyMOL.  To accomplish this, I've built a Python script similar to the one
attached here in which I start with pairs of structures, find the MCS of
those structures, create a template based on the MCS and a 3D conformation
of the structure of interest, and then generate a constrained conformation
of a query structure.  I tried to comment the attached code enough to lead
you through the process.

What I find is that quite often, the ConstrainedEmbed() function fails with
the error "molecule doesn't match the core" which seems very odd since the
pairs for which it fails are very similar.  The attached .png shows one
such pair and their MCS.

What I've found is that when I generate a 3D conformation for the first
structure and optimize it with MMFF (MMFFOptimize), this often causes
GetSubstructMatch to fail finding the MCS within the structure.  If instead
I used UFFOptimize, everything seems to work OK most of the time.

In my code, I've noted where the error occurs and flanked it with some
print statements to show what happens.  Specficially, at like 36 I have the
MMFFOptimize line, and at 37 the UFFOptimize line.   I've also attached a
set of structures for which MMFF fails.

While using UFFOptimize produces great results, I'm curious regarding why
MMFFOptimize creates a problem.  And, whether this is a bug which should be
fixed, or just a glitch related to atom typing and other parameterizations
that occur with MMFF.

Thanks for any explanation or ideas.

-Kirk
Struct1 Cc1cc(NC(=O)CSc2ccc3nnc(CCNC(=O)c4ccccc4)n3n2)no1
Struct2 CCOc1ccccc1NC(=O)CSc1ccc2nnc(CCNC(=O)c3ccc(C)cc3)n2n1
from copy import deepcopy

from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Draw
from rdkit.Chem import rdFMCS

from rdkit.Chem import PyMol
pymol=PyMol.MolViewer()

#get the structures
fin = open('Substruct.txt', 'r')
mols = []
for l in fin:
    arr = l.strip().split('\t')
    mols.append(Chem.MolFromSmiles(arr[1]))
    
#find the maximum common substructure
mcs = rdFMCS.FindMCS( mols, completeRingsOnly=True, 
                         ringMatchesRingOnly=True )
mcs_mol = Chem.MolFromSmarts(mcs.smartsString)

#check the mcs - looks reasonable
z = [ AllChem.Compute2DCoords(m) for m in mols + [mcs_mol] ]
img = Draw.MolsToGridImage( mols + [mcs_mol], subImgSize=(300,300),
                          legends = ['Struct1', 'Struct2', 'MCS'] )
img.save('Substruct.png')


#here's where the error occurs
#before MMFF optimization, GetSubstructMatch is correct
print mols[0].GetSubstructMatch(mcs_mol)

#create a 3D structure for the first 
AllChem.EmbedMolecule(mols[0])
AllChem.MMFFOptimizeMolecule(mols[0])
#AllChem.UFFOptimizeMolecule(mols[0]) #UFF works!

#after MMFF optimization, substruct match no longer correct
print mols[0].GetSubstructMatch(mcs_mol)

#create a template from the mcs and structure 1
mcs_match = mols[0].GetSubstructMatch(mcs_mol)
template = deepcopy(mols[0])
for i,a in enumerate( template.GetAtoms() ):
    if (i not in mcs_match):
        template.GetAtomWithIdx(i).SetAtomicNum(0)
        
template = Chem.DeleteSubstructs(template, Chem.MolFromSmarts('[#0]'))

#create a 3d structure for the second constrained to the mcs
mols[1] = AllChem.ConstrainedEmbed(mols[1], template)

#show the results in PyMOL
pymol.ShowMol(mols[0], name='Struct1')
pymol.Zoom('Struct1')
pymol.SetDisplayStyle('Struct1', 'sticks')
pymol.ShowMol(mols[1], name='Struct2', showOnly=False)



------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to