Re: [Rdkit-discuss] Code efficiency improvement

topgunhaides . Thu, 19 Dec 2019 10:41:55 -0800

Hi Michal,

Many thanks for the help!


MMFF will mainly be used to remove only the (very) high-energy
conformers, which is good news here.
There is one dilemma, though:
Without optimization, some potentially "good" conformers could be
filtered out, because small unreasonable atomic displacements
can increase the energy by a lot.
With optimization, however, many different conformers will just converge to
the same structure.
I will probably focus on just energies, without optimization. I even tried
UFF, which I found much faster than MMFF94s.
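
To make the filtering step concrete, here is a minimal sketch of an energy-window filter. It assumes the single-point energies have already been collected into a dict keyed by conformer ID (for example, one force-field energy per conformer). The function name filter_by_energy_window, the 20 kcal/mol window, and the sample energies are all made up for illustration; they are not from RDKit or from my actual run.

```python
def filter_by_energy_window(energies, window=20.0):
    """Return the conformer IDs whose energy lies within `window`
    (same units as the energies, e.g. kcal/mol) of the lowest energy.

    `energies` is a dict mapping conformer ID -> energy. Both this
    helper and the default window are hypothetical, for illustration.
    """
    if not energies:
        return []
    e_min = min(energies.values())
    # Keep a conformer only if its energy relative to the minimum is small enough.
    return [cid for cid, e in energies.items() if e - e_min <= window]


# Made-up single-point energies (kcal/mol) for four conformers.
energies = {0: 12.4, 1: 15.0, 2: 95.7, 3: 13.1}
kept = filter_by_energy_window(energies, window=20.0)
print(kept)  # [0, 1, 3]
```

Since unoptimized geometries can have inflated energies, a generous window is probably the safer choice: it only discards the clearly unreasonable conformers.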

Best,
Leon





On Thu, Dec 19, 2019 at 7:49 AM Michal Krompiec <michal.kromp...@gmail.com>
wrote:

> For mid-to-lower-energy conformers, MMFF relative energies are
> essentially a fancy random-number generator. Still, it all depends on
> what you need this for. If you just want to filter out (very) high
> energy conformers, your approach might work. But if you also want to
> perform Boltzmann averaging over a conformational ensemble (of lower
> energy conformers), you will be disappointed.
> BTW, conformational analysis of your molecule with CREST (20 OMP
> threads, -quick -norotmd) took 219 seconds and yielded 28 conformers
> with energies up to 3.5 kcal/mol above the lowest-energy structure. So
> it is ~2 orders of magnitude slower than MMFF.
> Best,
> Michal
>
> On Thu, 19 Dec 2019 at 03:53, topgunhaides . <sunzhi....@gmail.com> wrote:
> >
> > Hi Michal,
> >
> > Many thanks for the help! I am looking for an ensemble of conformers.
> > My priority is to use RDKit to generate a large ensemble of conformers
> > for each molecule.
> > For large and flexible molecules, I will need a lot more than 10K (like
> > 100K) to try to cover the entire conformational space.
> >
> > I do not have to use MMFF to optimize all conformers, but I do want to
> > use MMFF or UFF to get at least the energies of all conformers (which is
> > also quite time-consuming, even without optimization).
> > With the conformer energies, I can call some energy_filtering function
> > to filter out conformers with high energies, etc.
> > I am thinking that storing and processing a huge number of conformers
> > could be what slows things down, but I am not quite sure.
> > Any suggestions are very welcome!
> >
> > Best,
> > Leon
> >
> > On Wed, Dec 18, 2019 at 7:08 PM Michal Krompiec <michal.kromp...@gmail.com> wrote:
> >>
> >> Are you looking for the global minimum or an ensemble of conformers?
> >> Either way, this is already very fast. Bear in mind, however, that MMFF’s
> >> accuracy isn’t great for this type of task (see for example
> >> https://arxiv.org/pdf/1705.04308.pdf ). In other words, I don’t see a
> >> use case for generating 10k or more conformers with MMFF. And super-fast
> >> generation of large conformational ensembles for arbitrary molecules just
> >> isn’t realistic.
> >> Best,
> >> Michal
> >>
> >> On Wed, 18 Dec 2019 at 22:40, topgunhaides . <sunzhi....@gmail.com> wrote:
> >>>
> >>> Hi guys,
> >>>
> >>> Can anyone give me some advice on improving the efficiency of the
> >>> embedding code? See the example below:
> >>>
> >>>
> >>> import time
> >>> from rdkit import Chem
> >>> from rdkit.Chem import AllChem
> >>>
> >>> # medium size molecule (10 heavy atoms)
> >>> suppl = Chem.SDMolSupplier('cid831548.sdf')
> >>>
> >>> for mol in suppl:
> >>>     mh = Chem.AddHs(mol, addCoords=True)
> >>>
> >>>     # embedding
> >>>     start = time.time()
> >>>     AllChem.EmbedMultipleConfs(mh, numConfs=5000, maxAttempts=100,
> >>>                                pruneRmsThresh=0.5, randomSeed=1,
> >>>                                numThreads=0, enforceChirality=True,
> >>>                                useExpTorsionAnglePrefs=True,
> >>>                                useBasicKnowledge=True)
> >>>     cids = [conf.GetId() for conf in mh.GetConformers()]
> >>>     end = time.time()
> >>>     print("time elapsed: ", end - start)
> >>>
> >>>
> >>> The results:
> >>> numConfs=1000,   time elapsed: 10 seconds
> >>> numConfs=5000,   time elapsed: 66 seconds
> >>> numConfs=10000, time elapsed: 176 seconds
> >>>
> >>> I need to request a lot more than 10000 conformers per molecule and
> >>> have a lot of molecules to process.
> >>> I also wish to compute conformer energies and hopefully do
> >>> optimization (both are time-consuming), so I need to make my code as
> >>> efficient as possible. Thank you!
> >>>
> >>> Best,
> >>> Leon
> >>>
> >>>
> >>> _______________________________________________
> >>> Rdkit-discuss mailing list
> >>> Rdkit-discuss@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>

