Re: [Rdkit-discuss] Code efficiency improvement

topgunhaides . Wed, 18 Dec 2019 19:55:33 -0800

Hi Michal,

Many thanks for the help! I am looking for an ensemble of conformers.
My priority is to use RDKit to generate a large ensemble of conformers for
each molecule.
For large and flexible molecules, I will need far more than 10K (around
100K) to try to cover the entire conformational space.


I do not have to use MMFF to optimize all conformers, but I do want to use
MMFF or UFF to get at least the energies of all conformers (which is also
quite time-consuming, even without optimization).
With the conformer energies, I can call some energy_filtering function to
filter out conformers with high energies, etc.
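
A minimal sketch of what I mean by computing energies and filtering, not a tested pipeline. The function names, the 10 kcal/mol window, and the choice to build one force field per conformer are my own assumptions for illustration. The molecule is assumed to have explicit hydrogens and embedded conformers already.

```python
def mmff_energies(mol):
    """Return {conformer id: MMFF energy in kcal/mol}, with no optimization.

    `mol` is assumed to be an RDKit Mol with explicit Hs and embedded conformers.
    """
    # Imported here so that energy_filter() below can be used without RDKit.
    from rdkit.Chem import AllChem

    props = AllChem.MMFFGetMoleculeProperties(mol)
    if props is None:
        # MMFF does not have parameters for every atom type.
        raise ValueError("MMFF parameters unavailable for this molecule")

    energies = {}
    for conf in mol.GetConformers():
        cid = conf.GetId()
        # One force field per conformer; single-point energy only.
        ff = AllChem.MMFFGetMoleculeForceField(mol, props, confId=cid)
        energies[cid] = ff.CalcEnergy()
    return energies


def energy_filter(energies, window=10.0):
    """Return sorted conformer ids within `window` of the lowest energy.

    `window` is a hypothetical cutoff in the same units as the energies.
    """
    if not energies:
        return []
    e_min = min(energies.values())
    return sorted(cid for cid, e in energies.items() if e - e_min <= window)
```

Usage would be something like `keep = energy_filter(mmff_energies(mh))`, after which the remaining conformer ids could be dropped with `mh.RemoveConformer(cid)`.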
I suspect that storing and processing a huge number of conformers is what
slows things down, but I am not sure.
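
As a quick check on that suspicion, here is the cost per requested conformer, using only the three timings from my original message quoted further down. The helper name is just for illustration.

```python
# Timings from the original message: requested numConfs -> wall time in seconds.
timings = {1000: 10.0, 5000: 66.0, 10000: 176.0}


def ms_per_conformer(timings):
    """Return {numConfs: milliseconds of wall time per requested conformer}."""
    return {n: 1000.0 * seconds / n for n, seconds in timings.items()}


for n, ms in ms_per_conformer(timings).items():
    print(f"numConfs={n}: {ms:.1f} ms per requested conformer")
```

The per-conformer cost goes up as numConfs goes up, so the run time is growing faster than linearly. That would be consistent with some step whose work depends on how many conformers are already stored (the RMS pruning from pruneRmsThresh is one candidate), but these three numbers alone cannot say which step is responsible.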
Any suggestions are very welcome!

Best,
Leon

On Wed, Dec 18, 2019 at 7:08 PM Michal Krompiec <michal.kromp...@gmail.com>
wrote:

> Are you looking for the global minimum or an ensemble of conformers?
> Either way, this is already very fast. Bear in mind, however, that MMFF’s
> accuracy isn’t great for this type of task (see for example
> https://arxiv.org/pdf/1705.04308.pdf ). In other words, I don’t see a use
> case for generation of 10k or more conformers with MMFF. And super-fast
> generation of large conformational ensembles for arbitrary molecules just
> isn’t realistic.
> Best,
> Michal
>
> On Wed, 18 Dec 2019 at 22:40, topgunhaides . <sunzhi....@gmail.com> wrote:
>
>> Hi guys,
>>
>> Can anyone give me some advice on improving the efficiency of the
>> embedding code? See the example below:
>>
>>
>> import time
>> from rdkit import Chem
>> from rdkit.Chem import AllChem
>>
>> # medium-size molecule (10 heavy atoms)
>> suppl = Chem.SDMolSupplier('cid831548.sdf')
>>
>> for mol in suppl:
>>     mh = Chem.AddHs(mol, addCoords=True)
>>
>>     # embedding
>>     start = time.time()
>>     AllChem.EmbedMultipleConfs(
>>         mh, numConfs=5000, maxAttempts=100, pruneRmsThresh=0.5,
>>         randomSeed=1, numThreads=0, enforceChirality=True,
>>         useExpTorsionAnglePrefs=True, useBasicKnowledge=True)
>>     cids = [conf.GetId() for conf in mh.GetConformers()]
>>     end = time.time()
>>     print("time elapsed: ", end - start)
>>
>>
>> The results:
>> numConfs=1000,  time elapsed: 10 seconds
>> numConfs=5000,  time elapsed: 66 seconds
>> numConfs=10000, time elapsed: 176 seconds
>>
>> I need to request a lot more than 10000 conformers per molecule and have
>> a lot of molecules to process.
>> I also wish to compute conformer energies and hopefully can do
>> optimization (both are time consuming). So need to make my code as
>> efficient as possible. Thank you!
>>
>> Best,
>> Leon
>>
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>

