Re: [Rdkit-discuss] Code efficiency improvement

Michal Krompiec Thu, 19 Dec 2019 04:51:32 -0800

For mid- to low-energy conformers, MMFF relative energies are
essentially a fancy random-number generator. Still, it all depends on
what you need this for. If you just want to filter out (very)
high-energy conformers, your approach might work. But if you also want
to perform Boltzmann averaging over a conformational ensemble (of
lower-energy conformers), you will be disappointed.
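
To make the distinction concrete, here is a minimal sketch in plain
Python (no RDKit calls) of the two uses: a coarse energy-window filter
and Boltzmann populations. The function names and the example energies
are hypothetical, chosen only for illustration; in practice the
energies would come from whatever force field or QM method you trust.
It assumes energies in kcal/mol and T = 298.15 K.

```python
import math

# Gas constant in kcal/(mol*K); energies below are assumed to be in kcal/mol.
R_KCAL_PER_MOL_K = 0.0019872041


def relative_energies(energies):
    """Shift energies so that the lowest conformer sits at 0."""
    e_min = min(energies)
    return [e - e_min for e in energies]


def filter_by_window(energies, window):
    """Return indices of conformers within `window` of the minimum energy."""
    rel = relative_energies(energies)
    return [i for i, e in enumerate(rel) if e <= window]


def boltzmann_weights(energies, temperature=298.15):
    """Return normalized Boltzmann populations for the given energies."""
    rt = R_KCAL_PER_MOL_K * temperature
    factors = [math.exp(-e / rt) for e in relative_energies(energies)]
    total = sum(factors)
    return [f / total for f in factors]


# Hypothetical energies for four conformers (not real MMFF output).
energies = [0.0, 0.5, 1.0, 25.0]

# Coarse filtering: a generous window only removes the obvious outlier.
kept = filter_by_window(energies, window=10.0)
weights = boltzmann_weights([energies[i] for i in kept])

# Same ensemble, but with a 1 kcal/mol error on the second conformer.
perturbed = boltzmann_weights([0.0, 1.5, 1.0])
```

The window filter gives the same answer as long as the error is small
compared to the window, whereas a 1 kcal/mol error (larger than RT at
room temperature, about 0.6 kcal/mol) changes the populations
substantially. That is why coarse filtering can tolerate MMFF energies
while Boltzmann averaging cannot.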
BTW, a conformational analysis of your molecule with CREST (20 OMP
threads, -quick -norotmd) took 219 seconds and yielded 28 conformers
with energies up to 3.5 kcal/mol above the lowest-energy structure. So
it is ~2 orders of magnitude slower than MMFF.
Best,
Michal


On Thu, 19 Dec 2019 at 03:53, topgunhaides . <[email protected]> wrote:
>
> Hi Michal,
>
> Many thanks for the help! I am looking for an ensemble of conformers.
> My priority is to use RDKit to generate a large ensemble of conformers for 
> each molecule.
> For large and flexible molecules, I will need a lot more than 10K (like 100K) 
> to try to cover the entire conformational space.
>
> I do not have to use MMFF to optimize all conformers, but I do want to use 
> MMFF or UFF to get at least the energies of all conformers (which is also 
> quite time-consuming, even without optimization).
> With the conformer energies, I can call some energy_filtering function to 
> filter out conformers with high energies, etc.
> I suspect that storing and processing a huge number of conformers is what 
> slows things down, but I am not sure.
> Any suggestions are very welcome!
>
> Best,
> Leon
>
> On Wed, Dec 18, 2019 at 7:08 PM Michal Krompiec <[email protected]> 
> wrote:
>>
>> Are you looking for the global minimum or an ensemble of conformers? Either 
>> way, this is already very fast. Bear in mind, however, that MMFF’s accuracy 
>> isn’t great for this type of task (see for example
>> https://arxiv.org/pdf/1705.04308.pdf ). In other words, I don’t see a use 
>> case for generation of 10k or more conformers with MMFF. And super-fast 
>> generation of large conformational ensembles for arbitrary molecules just 
>> isn’t realistic.
>> Best,
>> Michal
>>
>> On Wed, 18 Dec 2019 at 22:40, topgunhaides . <[email protected]> wrote:
>>>
>>> Hi guys,
>>>
>>> Can anyone give me some advice on improving the efficiency of the embedding 
>>> code? See the example below:
>>>
>>>
>>> import time
>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>>
>>> # medium size molecule (10 heavy atoms)
>>> suppl = Chem.SDMolSupplier('cid831548.sdf')
>>>
>>> for mol in suppl:
>>>     mh = Chem.AddHs(mol, addCoords=True)
>>>
>>>     # embedding
>>>     start = time.time()
>>>     AllChem.EmbedMultipleConfs(mh, numConfs=5000, maxAttempts=100,
>>>                                pruneRmsThresh=0.5, randomSeed=1,
>>>                                numThreads=0, enforceChirality=True,
>>>                                useExpTorsionAnglePrefs=True,
>>>                                useBasicKnowledge=True)
>>>     cids = [conf.GetId() for conf in mh.GetConformers()]
>>>     end = time.time()
>>>     print("time elapsed: ", end - start)
>>>
>>>
>>> The results:
>>> numConfs=1000,   time elapsed: 10 seconds
>>> numConfs=5000,   time elapsed: 66 seconds
>>> numConfs=10000,  time elapsed: 176 seconds
>>>
>>> I need to request a lot more than 10000 conformers per molecule and have a 
>>> lot of molecules to process.
>>> I also wish to compute conformer energies and, hopefully, run optimization 
>>> (both are time-consuming). So I need to make my code as efficient as 
>>> possible. Thank you!
>>>
>>> Best,
>>> Leon
>>>
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


