Re: [Rdkit-discuss] Code efficiency improvement

2019-12-18 Thread topgunhaides .
Hi Michal,

Many thanks for the help! I am looking for an ensemble of conformers.
My priority is to use RDKit to generate a large ensemble of conformers for
each molecule.
For large and flexible molecules, I will need a lot more than 10K conformers
(around 100K) to try to cover the entire conformational space.

I do not have to use MMFF to optimize all conformers, but I do want to use
MMFF or UFF to get at least the energies of all conformers (which is also
quite time-consuming, even without optimization).
With the conformer energies, I can call some energy_filtering function to
filter out conformers with high energies, etc.
I suspect that storing and processing a huge number of conformers is what
slows things down, but I am not sure.
Any suggestions are very welcome!
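
A minimal sketch of the energy step described above, assuming MMFF94 parameters exist for the molecule. The function names mmff_single_point_energies, filter_by_energy_window, and remove_conformers_except, and the default window of 10 kcal/mol, are hypothetical and chosen only for illustration; the last two stand in for the energy_filtering function mentioned above. Each conformer gets a single-point energy from an MMFF force field, with no minimization.

```python
def mmff_single_point_energies(mol):
    """Return {conformer id: MMFF94 energy in kcal/mol}, without optimizing."""
    # Imported here so the pure-Python filter below also works without RDKit.
    from rdkit.Chem import AllChem

    props = AllChem.MMFFGetMoleculeProperties(mol)
    if props is None:
        raise ValueError("MMFF parameters are not available for this molecule")

    energies = {}
    for conf in mol.GetConformers():
        cid = conf.GetId()
        # One force field per conformer; CalcEnergy() evaluates it as-is.
        ff = AllChem.MMFFGetMoleculeForceField(mol, props, confId=cid)
        energies[cid] = ff.CalcEnergy()
    return energies


def filter_by_energy_window(energies, window=10.0):
    """Return sorted IDs of conformers within `window` of the lowest energy."""
    if not energies:
        return []
    e_min = min(energies.values())
    return sorted(cid for cid, e in energies.items() if e - e_min <= window)


def remove_conformers_except(mol, keep_ids):
    """Delete every conformer of `mol` whose ID is not in `keep_ids`."""
    keep = set(keep_ids)
    # Collect IDs first so conformers are not removed while iterating.
    for cid in [conf.GetId() for conf in mol.GetConformers()]:
        if cid not in keep:
            mol.RemoveConformer(cid)
```

For a molecule mh that already carries embedded conformers, the intended use is energies = mmff_single_point_energies(mh), followed by remove_conformers_except(mh, filter_by_energy_window(energies)). UFF could be substituted by building each force field with AllChem.UFFGetMoleculeForceField(mh, confId=cid) instead.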

Best,
Leon

On Wed, Dec 18, 2019 at 7:08 PM Michal Krompiec wrote:

> Are you looking for the global minimum or an ensemble of conformers?
> Either way, this is already very fast. Bear in mind, however, that MMFF’s
> accuracy isn’t great for this type of task (see, for example,
> https://arxiv.org/pdf/1705.04308.pdf ). In other words, I don’t see a use
> case for generation of 10k or more conformers with MMFF. And super-fast
> generation of large conformational ensembles for arbitrary molecules just
> isn’t realistic.
> Best,
> Michal
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] Code efficiency improvement

2019-12-18 Thread Michal Krompiec
Are you looking for the global minimum or an ensemble of conformers? Either
way, this is already very fast. Bear in mind, however, that MMFF’s accuracy
isn’t great for this type of task (see, for example,
https://arxiv.org/pdf/1705.04308.pdf ). In other words, I don’t see a use
case for generation of 10k or more conformers with MMFF. And super-fast
generation of large conformational ensembles for arbitrary molecules just
isn’t realistic.
Best,
Michal

On Wed, 18 Dec 2019 at 22:40, topgunhaides .  wrote:

> Hi guys,
>
> Can anyone give me some advice on improving the efficiency of the embedding
> code? See example below:
>
>
> import time
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> # medium-size molecule (10 heavy atoms)
> suppl = Chem.SDMolSupplier('cid831548.sdf')
>
> for mol in suppl:
>     if mol is None:  # skip records that RDKit could not parse
>         continue
>     mh = Chem.AddHs(mol, addCoords=True)
>
>     # embedding
>     start = time.time()
>     AllChem.EmbedMultipleConfs(mh, numConfs=5000, maxAttempts=100,
>                                pruneRmsThresh=0.5,
>                                randomSeed=1, numThreads=0,
>                                enforceChirality=True,
>                                useExpTorsionAnglePrefs=True,
>                                useBasicKnowledge=True)
>     cids = [conf.GetId() for conf in mh.GetConformers()]
>     end = time.time()
>     print("time elapsed: ", end - start)
>
>
> The results:
> numConfs=1000,   time elapsed: 10 seconds
> numConfs=5000,   time elapsed: 66 seconds
> numConfs=10000,  time elapsed: 176 seconds
>
> I need to request a lot more than 10000 conformers per molecule and have a
> lot of molecules to process.
> I also wish to compute conformer energies and, hopefully, do optimization
> (both are time-consuming), so I need to make my code as efficient as
> possible. Thank you!
>
> Best,
> Leon


[Rdkit-discuss] Code efficiency improvement

2019-12-18 Thread topgunhaides .
Hi guys,

Can anyone give me some advice on improving the efficiency of the embedding
code? See example below:


import time
from rdkit import Chem
from rdkit.Chem import AllChem

# medium-size molecule (10 heavy atoms)
suppl = Chem.SDMolSupplier('cid831548.sdf')

for mol in suppl:
    if mol is None:  # skip records that RDKit could not parse
        continue
    mh = Chem.AddHs(mol, addCoords=True)

    # embedding
    start = time.time()
    AllChem.EmbedMultipleConfs(mh, numConfs=5000, maxAttempts=100,
                               pruneRmsThresh=0.5,
                               randomSeed=1, numThreads=0,
                               enforceChirality=True,
                               useExpTorsionAnglePrefs=True,
                               useBasicKnowledge=True)
    cids = [conf.GetId() for conf in mh.GetConformers()]
    end = time.time()
    print("time elapsed: ", end - start)


The results:
numConfs=1000,   time elapsed: 10 seconds
numConfs=5000,   time elapsed: 66 seconds
numConfs=10000,  time elapsed: 176 seconds

I need to request a lot more than 10000 conformers per molecule and have a
lot of molecules to process.
I also wish to compute conformer energies and, hopefully, do optimization
(both are time-consuming), so I need to make my code as efficient as
possible. Thank you!
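
As a rough check on how the timings above scale, the sketch below computes the exponent p in time ~ numConfs**p between consecutive runs. It takes the largest run to be numConfs=10000, which is an assumption, and scaling_exponent is a hypothetical helper name used only for this illustration.

```python
import math

# Timings reported in the thread as (numConfs, seconds).
# The third entry assumes the largest run requested 10000 conformers.
timings = [(1000, 10.0), (5000, 66.0), (10000, 176.0)]


def scaling_exponent(n1, t1, n2, t2):
    """Exponent p in t ~ n**p implied by two (numConfs, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


# Exponent between each pair of consecutive measurements.
exponents = [
    scaling_exponent(n1, t1, n2, t2)
    for (n1, t1), (n2, t2) in zip(timings, timings[1:])
]

for (n1, _), (n2, _), p in zip(timings, timings[1:], exponents):
    print(f"{n1} -> {n2} conformers: time grows roughly as n^{p:.2f}")
```

Both exponents come out above 1 (about 1.2 and 1.4), so under these assumptions the cost per conformer grows with the size of the request. One possible cause, not verified in this thread, is that pruneRmsThresh compares each new conformer against those already kept.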

Best,
Leon