Re: [Rdkit-discuss] Memory management during conformer generation

2015-06-29 Thread az
Many thanks for your replies. Triggering gc doesn't seem to help (though the object count goes down), but reducing how much is processed at a time does. I actually didn't have to go down to a molecule-at-a-time level but made do with inputs half of the previous size. The RAM still fills up
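
A minimal sketch of the "process less at a time" idea, assuming the script submits work through `concurrent.futures` (the thread only mentions "futures code", so this is a guess). The names `batched`, `work` and `run_in_batches` are hypothetical, and a thread pool is used only to keep the example self-contained; `work` stands in for the actual conformer generation.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def work(item):
    """Hypothetical placeholder for per-molecule conformer generation."""
    return item * item


def run_in_batches(items, batch_size, max_workers=2):
    """Submit one batch at a time so only `batch_size` futures exist at once."""
    total = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for chunk in batched(items, batch_size):
            jobs = [pool.submit(work, x) for x in chunk]
            for job in jobs:
                total += job.result()
            # `jobs` is rebound on the next iteration, so the previous
            # batch of futures (and their results) becomes unreferenced.
    return total
```

Halving `batch_size` halves the number of futures and results held at any moment, which matches the observation that smaller inputs lower the peak, though it would not by itself explain memory that keeps growing across batches.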

Re: [Rdkit-discuss] Memory management during conformer generation

2015-06-27 Thread Dmitri Maziuk
On 6/26/2015 9:48 AM, az wrote: Thanks Jean-Paul. You're right that I eat up a lot of memory with large files, but I think it's not the whole story. If it were, my memory should come back each time a new file is being read (jobs=[]), no? No. It's a feature of garbage collection: your memory

Re: [Rdkit-discuss] Memory management during conformer generation

2015-06-27 Thread David Hall
On Jun 27, 2015, at 6:05 AM, Dmitri Maziuk dmaz...@bmrb.wisc.edu wrote: On 6/26/2015 9:48 AM, az wrote: Thanks Jean-Paul. You're right that I eat up a lot of memory with large files, but I think it's not the whole story. If it were, my memory should come back each time a new file is being

Re: [Rdkit-discuss] Memory management during conformer generation

2015-06-27 Thread Greg Landrum
I apologize that I haven't had a chance to look at this in detail yet, but I can at least give a quick answer to the below: Python uses a deterministic scheme for doing garbage collection based on reference counting, so memory should be freed as soon as you do jobs=[]. That's assuming that the
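
A small sketch of the reference-counting point, assuming CPython (other interpreters may defer collection). `Payload` and `demo` are hypothetical names; `Payload` stands in for a molecule or job result, and weak references are used to observe whether the objects are still alive without keeping them alive.

```python
import weakref


class Payload:
    """Hypothetical stand-in for a molecule or a finished job."""

    def __init__(self, n):
        self.data = bytearray(n)


def demo():
    """Return (objects alive before, objects alive after) rebinding `jobs`."""
    jobs = [Payload(1024), Payload(1024), Payload(1024)]
    # Weak references let us check liveness without adding to the refcount.
    probes = list(map(weakref.ref, jobs))
    alive_before = sum(r() is not None for r in probes)
    jobs = []  # drops the only strong references to the payloads
    alive_after = sum(r() is not None for r in probes)
    return alive_before, alive_after
```

If anything else still holds a reference (a future, a cache, a reference cycle), the objects survive the `jobs = []` rebinding. Even when they are freed, the process's resident memory as reported by the operating system may not shrink right away.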

Re: [Rdkit-discuss] Memory management during conformer generation

2015-06-27 Thread Dmitri Maziuk
On 6/27/2015 5:45 AM, Greg Landrum wrote: ... That's assuming that the futures code (which I don't know) isn't doing anything odd behind the scenes to hold onto references. Or every mol in the supplier holds a pointer to a C++ DLL that the Python VM doesn't quite know how to garbage-collect, which keeps

Re: [Rdkit-discuss] Memory management during conformer generation

2015-06-26 Thread az
Thanks Jean-Paul. You're right that I eat up a lot of memory with large files, but I think it's not the whole story. If it were, my memory should come back each time a new file is being read (jobs=[]), no? Instead I hit my limit after 8-10 very similar input files, even though the usage after

[Rdkit-discuss] Memory management during conformer generation

2015-06-24 Thread az
Hi, Using the cookbook code as a basis (apologies if I should have posted in the corresponding topic), I've put together a script to generate conformers for my SMILES library. Works like a charm too, aside from the fact that after 10-20 hours, I'm out of RAM and swap (the memory consumption

Re: [Rdkit-discuss] Memory management during conformer generation

2015-06-24 Thread JP
Isn't the problem here that you are keeping an array (jobs) and you keep adding molecules to it, never letting the garbage collector collect/clear any memory? If your file has a million molecules, you will have an array of a million molecules in memory... Why don't you process each single
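
One way to read the suggestion above is to stream the input instead of accumulating it in a list. The sketch below assumes one SMILES string per line (optionally followed by a name); `iter_smiles`, `process_one` and `run_streaming` are hypothetical names, and `process_one` is only a placeholder for the real conformer generation.

```python
import io


def iter_smiles(handle):
    """Lazily yield the first whitespace-separated token of each non-empty line."""
    for line in handle:
        line = line.strip()
        if line:
            yield line.split()[0]


def process_one(smiles):
    """Hypothetical placeholder for embedding and optimising conformers."""
    return len(smiles)


def run_streaming(handle, sink):
    """Process one molecule at a time and hand each result to `sink` immediately."""
    count = 0
    for smi in iter_smiles(handle):
        sink(smi, process_one(smi))
        count += 1
    return count
```

Usage, with an in-memory file standing in for the real input:

```python
results = []
n = run_streaming(io.StringIO("CCO\nc1ccccc1 benzene\n"),
                  lambda smi, res: results.append((smi, res)))
```

In a real run `sink` would write each result to disk rather than append to a list, so that nothing from earlier molecules stays referenced.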