Re: [Rdkit-discuss] Code efficiency improvement

2019-12-20 Thread Dimitri Maziuk via Rdkit-discuss
On 12/19/19 7:27 PM, Francois Berenger wrote: > > You should parallelize the processing of molecules, since each can be > worked at independently. > Well, for "a lot" of conformers on "a lot" of molecules that'll work if you have access to a compute cluster and/or are willing to pay for

Re: [Rdkit-discuss] Code efficiency improvement

2019-12-19 Thread topgunhaides .
Hi Rafal, Thank you for this suggestion. I will try these to see the changes. Best, Leon On Thu, Dec 19, 2019 at 4:37 AM Rafal Roszak wrote: > On Wed, 18 Dec 2019 22:54:04 -0500 > "topgunhaides ." wrote: > > > For large and flexible molecules, will need a lot more than 10K (like > > 100K)

Re: [Rdkit-discuss] Code efficiency improvement

2019-12-19 Thread topgunhaides .
Hi Michal, Many thanks for the help! The MMFF will be mainly used to remove only the (very) high energy conformers, which is good news here. There is one dilemma here: Without optimization, some potentially "good" conformers could be filtered out, due to the fact that small unreasonable atomic
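
The filtering step described above (dropping only the very high energy conformers) can be sketched as an energy-window filter. This is a minimal illustration, not the poster's code: the function name, the 20 kcal/mol window, and the energy values are all hypothetical, and the energies are assumed to have been computed beforehand (for example with MMFF).

```python
def filter_by_energy_window(energies, window=20.0):
    """Return ids of conformers within `window` of the lowest energy.

    `energies` maps conformer id -> energy; the unit only has to match
    the unit of `window` (kcal/mol is assumed here).
    """
    e_min = min(energies.values())
    return sorted(cid for cid, e in energies.items() if e - e_min <= window)


# Made-up energies for four conformers, purely for illustration.
energies = {0: 12.3, 1: 15.1, 2: 48.9, 3: 13.0}
kept = filter_by_energy_window(energies, window=20.0)
print(kept)  # conformer 2 lies 36.6 above the minimum and is dropped
```

A deliberately generous window is one way to make the filter tolerant of unoptimized geometries; whether that resolves the dilemma raised in the message is an assumption to check on real data.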

Re: [Rdkit-discuss] Code efficiency improvement

2019-12-19 Thread topgunhaides .
Hi Greg, Many thanks for the help! The main reason I am "trying to generate a huge number of conformers for a bunch of molecules" is to reproduce experimentally determined structures. To increase accuracy, I want to try the following at least: - cover the entire (not quite possible for

Re: [Rdkit-discuss] Code efficiency improvement

2019-12-19 Thread Michal Krompiec
For mid-to-lower-energy conformers, MMFF relative energies are essentially a fancy random-number generator. Still, all depends on what you need this for. If you just want to filter out (very) high energy conformers, your approach might work. But if you also want to perform Boltzmann averaging over
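
To make the Boltzmann-averaging concern concrete, here is a minimal sketch of how conformer populations follow from relative energies. The function names and the two example energies are hypothetical; energies are assumed to be in kcal/mol and the temperature 298.15 K.

```python
import math

# Gas constant in kcal/(mol*K); energies are assumed to be in kcal/mol.
R_KCAL = 0.0019872041


def boltzmann_weights(energies_kcal, temperature_k=298.15):
    """Return normalized Boltzmann populations for a list of energies."""
    e_min = min(energies_kcal)  # shift by the minimum for numerical stability
    rt = R_KCAL * temperature_k
    factors = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(values, energies_kcal, temperature_k=298.15):
    """Population-weighted average of a per-conformer property."""
    weights = boltzmann_weights(energies_kcal, temperature_k)
    return sum(w * v for w, v in zip(weights, values))


# Two conformers 1 kcal/mol apart (illustrative numbers only).
weights = boltzmann_weights([0.0, 1.0])
print(weights)
```

At room temperature a gap of 1 kcal/mol already shifts the populations to roughly 0.84 and 0.16, so errors of that size in the relative energies change the averaged result substantially.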

Re: [Rdkit-discuss] Code efficiency improvement

2019-12-19 Thread Rafal Roszak
On Wed, 18 Dec 2019 22:54:04 -0500 "topgunhaides ." wrote: > For large and flexible molecules, will need a lot more than 10K (like > 100K) to try to cover the entire conformational space. In such case useExpTorsionAnglePrefs=True, useBasicKnowledge=True can make your conformational set less
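
The two options named above are keyword arguments of RDKit's `AllChem.EmbedMultipleConfs`. The sketch below only shows how they can be toggled so the resulting sets can be compared; the helper name, the SMILES string, the conformer count, and the seed are illustrative assumptions, not values from the thread.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_conformers(smiles, n_confs, use_knowledge_terms, seed=42):
    """Embed conformers with the knowledge-based terms switched on or off."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    cids = AllChem.EmbedMultipleConfs(
        mol,
        numConfs=n_confs,
        randomSeed=seed,  # fixed seed so runs are reproducible
        useExpTorsionAnglePrefs=use_knowledge_terms,
        useBasicKnowledge=use_knowledge_terms,
    )
    return mol, list(cids)


smiles = "CCCCCC(=O)NCCO"  # small flexible molecule, chosen only as an example
mol_on, cids_on = embed_conformers(smiles, 20, True)
mol_off, cids_off = embed_conformers(smiles, 20, False)
print(len(cids_on), len(cids_off))
```

Comparing the two sets (for instance by pairwise RMSD) is left out here; the sketch only demonstrates where the flags go.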

Re: [Rdkit-discuss] Code efficiency improvement

2019-12-19 Thread Greg Landrum
Hi Leon, If you want to be able to work efficiently on a problem like this, it's important to first take a step back and think about what you're doing. In this particular case you are asking the RDKit to generate 1 conformers for a molecule and requiring that the RMSD between each of those

Re: [Rdkit-discuss] Code efficiency improvement

2019-12-18 Thread topgunhaides .
Hi Michal, Many thanks for the help! I am looking for an ensemble of conformers. My priority is to use RDKit to generate a large ensemble of conformers for each molecule. For large and flexible molecules, will need a lot more than 10K (like 100K) to try to cover the entire conformational space.

Re: [Rdkit-discuss] Code efficiency improvement

2019-12-18 Thread Michal Krompiec
Are you looking for the global minimum or an ensemble of conformers? Either way, this is already very fast. Bear in mind, however, that MMFF’s accuracy isn’t great for this type of task (see for example https://arxiv.org/pdf/1705.04308.pdf ). In other words, I don’t see a use case for generation