Hi Sereina,

This was definitely just about bond lengths and angles.  Setting them to
the MMFF values would seem to be a better option though I haven't made a
systematic study.  It would be more convenient for me if I didn't have to
minimise with MMFF but I realise I am but one voice.  Perhaps I'll find
time to do a better check on how they agree or differ with the CSD
generally rather than just on the few random structures I've looked at so
far.

I agree about the torsions. They're a whole different issue that I haven't
looked at yet.  It's certainly more nuanced than the z-score you get from
CSD about bond lengths and angles.

Thanks for the effort with the conformation generator generally - that
really is a valuable tool in the RDKit box.

Regards,
Dave


On Wed, Jun 22, 2016 at 9:20 AM, Sereina <sereina.rini...@gmail.com> wrote:

> Hi David,
>
> The bond distances and angles come in ETKDG from the UFF, so these may
> differ from the CSD (I have never checked this). It might be worth thinking
> about using the MMFF bond and angle terms instead.
>
> The torsion terms in ETKDG on the other hand come directly from the CSD
> (i.e. we fitted the torsion potentials to distributions obtained from the
> CSD), so there shouldn’t be systematic deviations, but of course there are
> some molecules in the CSD which exhibit torsion angles very different from
> the rest and these are difficult to capture with a torsion potential. If
> you have cases where ETKDG and CSD deviate a lot, I would be very
> interested to know because we are currently working on a version 2 of ETKDG.
>
> Minimization with MMFF is of course always an option. Just keep in mind
> though that MMFF and CSD do not always agree on the preferred torsion
> angles. We have never published this, but we have done a systematic
> analysis using the compounds in our ETKDG paper and MMFF minimization of
> the crystal structures could result in non-negligible differences (RMSD up
> to 3 Angstroem, median RMSD = 0.45 Angstroem).
>
> Best,
> Sereina
>
>
> On 22 Jun 2016, at 09:46, David Cosgrove <davidacosgrov...@gmail.com>
> wrote:
>
> Hi Sereina,
>
> I beg to differ on the advisability of minimisation, even after using the
> parameters you suggest to generate the conformation. I've recently been
> using the CCDC's excellent Python API to analyse the results of the
> generated conformations. This lets you very quickly assess whether any of
> the bond distances or angles in a molecule is "unusual" with reference to
> the data in the CSD. I've not done a systematic examination but my
> anecdotal result is that you get far fewer such unusual geometries if you
> give it a quick minimisation afterwards. Using the default optimiser
> settings is quick, painless and, in my limited number of tests, worth the
> effort.
>
> I'd be interested to know if anyone else has different experience.
>
> Cheers,
> Dave
>
>
> On Wednesday, 22 June 2016, Sereina <sereina.rini...@gmail.com> wrote:
>
>> Based on the code snippets, Paolo has not used the basic-knowledge terms
>> whereas Tim did.
>>
>> When setting useExpTorsionAnglePrefs=True and useBasicKnowledge=True, a
>> minimization is in principle not necessary anymore (unless there are
>> aliphatic rings, because we currently don’t have torsion rules for them).
>>
>> Best,
>> Sereina
>>
>>
>> On 22 Jun 2016, at 05:02, Greg Landrum <greg.land...@gmail.com> wrote:
>>
>> I don't have anything to add to the pieces about the alignment (Paolo is
>> the expert there!), but one comment on the conformation generation: If you
>> used the background knowledge terms in the embedding, I don't think you
>> should be getting the really distorted aromatic rings. Since in this case
>> that does happen, for at least some conformations, I suspect there may be
>> something wrong in the code.
>>
>> I'll take a look at that (and ask Sereina too).
>>
>> Best,
>> -greg
>>
>>
>> On Tue, Jun 21, 2016 at 10:30 PM, Paolo Tosco <paolo.to...@unito.it>
>> wrote:
>>
>>> Dear Tim,
>>>
>>> the Align() method returns an RMSD value, which however is computed only
>>> on a limited number of atom pairs, namely those that the algorithm was able
>>> to match between the two molecules, so a low value is not particularly
>>> informative of the overall goodness of the alignment, as it only indicates
>>> that the matched atoms were matched nicely, but there might only be a few
>>> of those in the core, while side chains are scattered all over.
>>> The Score() method instead returns the O3AScore for the alignment, which
>>> is a better way to assess the quality of the superimposition.
>>>
>>> The other problem in your script is that the i index is incremented
>>> before recording it in the lowest/highest variables, so the confIds are
>>> shifted by 1, as the conformation index in the RDKit is 0-based.
>>>
>>> I also noticed that without minimizing the conformations the aromatic
>>> rings look quite distorted, so I added a MMFF minimization, and I increased
>>> the number of generated conformations and the pruneRmsThreshold. Setting to
>>> False the experimental torsion angle preferences and basic knowledge about
>>> rings seems to yield a larger variety of geometries which helps reproducing
>>> this quite peculiar x-ray geometry which is probably not so commonly found.
>>> Please find the modified script below.
>>>
>>> Hope this helps, kind regards
>>> Paolo
>>>
>>>
>>> #!/usr/bin/env python
>>>
>>>
>>> from rdkit import Chem, RDConfig
>>> from rdkit.Chem import AllChem, rdMolAlign
>>>
>>> ref = Chem.MolFromSmiles('NC(=[NH2+])c1ccc(C[C@
>>> @H](NC(=O)CNS(=O)(=O)c2ccc3ccccc3c2)C(=O)N2CCCCC2)cc1')
>>> mol1 =
>>> Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1DWD_ligand.pdb')
>>> mol1 = AllChem.AssignBondOrdersFromTemplate(ref, mol1)
>>> mol2 =
>>> Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1PPC_ligand.pdb')
>>> mol2 = AllChem.AssignBondOrdersFromTemplate(ref, mol2)
>>>
>>> pyO3A = rdMolAlign.GetO3A(mol1, mol2)
>>> rmsd = pyO3A.Align()
>>> score = pyO3A.Score()
>>> print "Orig",score
>>> Chem.MolToMolFile(mol1, "orig.mol")
>>>
>>> cids = AllChem.EmbedMultipleConfs(mol1, numConfs=250, maxAttempts=100,
>>>     pruneRmsThresh=0.5, useExpTorsionAnglePrefs=False,
>>>     useBasicKnowledge=False)
>>> AllChem.MMFFOptimizeMoleculeConfs(mol1, mmffVariant='MMFF94s')
>>> pyO3As = rdMolAlign.GetO3AForProbeConfs(mol1, mol2, numThreads=0)
>>> i = 0
>>> lowest = 999999999.9
>>> highest = 0.0
>>> for pyO3A in pyO3As:
>>>     rmsd = pyO3A.Align()
>>>     score = pyO3A.Score()
>>>     if score < lowest:
>>>         lowest = score
>>>         lowestConfId = i
>>>     if score > highest:
>>>         highest = score
>>>         highestConfId = i
>>>     i +=1
>>>
>>> print "Lowest:", lowest, lowestConfId
>>> print "Highest:", highest, highestConfId
>>>
>>> Chem.MolToMolFile(mol1, "lowest.mol", confId=lowestConfId)
>>> Chem.MolToMolFile(mol1, "highest.mol", confId=highestConfId)
>>>
>>>
>>> On 06/21/16 15:41, Tim Dudgeon wrote:
>>>
>>> Hi All,
>>>
>>> I'm trying to get to grips with using Open3D Align in RDKit, but hitting
>>> problems.
>>>
>>> My approach is to generate random conformers of the probe molecule and
>>> align it to the reference molecule.  My example is cobbled together from
>>> the examples in the cookbook.
>>>
>>>
>>> from rdkit import Chem, RDConfig
>>> from rdkit.Chem import AllChem, rdMolAlign
>>>
>>> ref = Chem.MolFromSmiles('NC(=[NH2+])c1ccc(C[C@
>>> @H](NC(=O)CNS(=O)(=O)c2ccc3ccccc3c2)C(=O)N2CCCCC2)cc1')
>>> mol1 =
>>> Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1DWD_ligand.pdb')
>>> mol1 = AllChem.AssignBondOrdersFromTemplate(ref, mol1)
>>> mol2 =
>>> Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1PPC_ligand.pdb')
>>> mol2 = AllChem.AssignBondOrdersFromTemplate(ref, mol2)
>>>
>>> pyO3A = rdMolAlign.GetO3A(mol1, mol2)
>>> score = pyO3A.Align()
>>> print "Orig",score
>>> Chem.MolToMolFile(mol1, "orig.mol")
>>>
>>> cids = AllChem.EmbedMultipleConfs(mol1, numConfs=100, maxAttempts=100,
>>> pruneRmsThresh=0.1, useExpTorsionAnglePrefs=True, useBasicKnowledge=True)
>>>
>>> pyO3As = rdMolAlign.GetO3AForProbeConfs(mol1, mol2, numThreads=0)
>>> i = 0
>>> lowest = 999999999.9
>>> highest = 0.0
>>> for pyO3A in pyO3As:
>>>     i +=1
>>>     score = pyO3A.Align()
>>>     if score < lowest:
>>>         lowest = score
>>>         lowestConfId = i
>>>     if score > highest:
>>>         highest = score
>>>         highestConfId = i
>>>
>>> print "Lowest:", lowest, lowestConfId
>>> print "Highest:", highest, highestConfId
>>>
>>> Chem.MolToMolFile(mol1, "lowest.mol", confId=lowestConfId)
>>> Chem.MolToMolFile(mol1, "highest.mol", confId=highestConfId)
>>>
>>> What I'm finding is that the alignments with the lowest and highest O3A
>>> scores are much worse alignments (visually) than the original structure
>>> (the structure from 1DWD). Typical scores are:
>>>
>>> Original 1DWD structure: 0.38
>>> Lowest scoring conformer: 0.186
>>> Highest scoring conformer: 0.78
>>>
>>> Now I'm assuming that lower O3A align scores are better (though that's
>>> not specifically stated), so as my lowest score is lower than the original
>>> alignment I would have expected it to be a better alignment, but its
>>> clearly much worse. And if I'm wrong and higher scores are better then the
>>> same applies.
>>>
>>> Clearly I've not understood something correctly!
>>>
>>> Tim
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
>>> Francisco, CA to explore cutting-edge tech and listen to tech luminaries
>>> present their vision of the future. This family event has something for
>>> everyone, including kids. Get more information and register 
>>> today.http://sdm.link/attshape
>>>
>>>
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing 
>>> listRdkit-discuss@lists.sourceforge.nethttps://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
>>> Francisco, CA to explore cutting-edge tech and listen to tech luminaries
>>> present their vision of the future. This family event has something for
>>> everyone, including kids. Get more information and register today.
>>> http://sdm.link/attshape
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>>
>>
>> ------------------------------------------------------------------------------
>> Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
>> Francisco, CA to explore cutting-edge tech and listen to tech luminaries
>> present their vision of the future. This family event has something for
>> everyone, including kids. Get more information and register today.
>> http://sdm.link/attshape_______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>>
>
------------------------------------------------------------------------------
Attend Shape: An AT&T Tech Expo July 15-16. Meet us at AT&T Park in San
Francisco, CA to explore cutting-edge tech and listen to tech luminaries
present their vision of the future. This family event has something for
everyone, including kids. Get more information and register today.
http://sdm.link/attshape
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to