For some reason, my earlier reply didn't seem to make it to the mailing
list. Here it is in its entirety:
"""
If you assign each figure to a new number, pyplot will keep all of those
figures around in memory (because it thinks you may want to use them
again). The best route is to call close('all') or close(fig) with each
loop iteration.
40MB per image doesn't sound way out of reason to me. How big are your
images?
"""
On 10/05/2009 03:46 AM, Leo Trottier wrote:
Hi,
I think I've figured out what's going on. It's a combination of things:
1) iPython is ignorant of the problems associated with caching massive
data output
2) iPython doesn't seem to have a good way to clear data from memory
reliably (https://bugs.launchpad.net/ipython/+bug/412350)
iPython is designed for interactive use, and stores a lot of values so
they can be conveniently reused later. For long-running "batch"
scripts, you can use "regular" Python, or run the code in iPython such
that the output isn't displayed at the console (by using "import" or "%run").
A fix for bug 2) may help, but it looks like it would still require some
manual intervention to be useful. You're still using a tool designed for
fine-grained interactive use (e.g. a pen) where one designed for
automation may be more appropriate (e.g. a laser printer) :)
3) matplotlib/Python seems to be insufficiently aggressive in its
garbage collection (??)
Is that still true after forcibly closing the figures on each loop
iteration as I suggested? Many hours have been spent squashing memory
leaks in matplotlib, and I am not aware of any in at least 0.98 and
later (other than some unavoidable small leaks in certain GUI
backends). Do you have a standalone example that illustrates this on a
recent version of matplotlib?
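For reference, closing each figure inside the loop might look like the sketch below. It is only an illustration under a few assumptions: the Agg backend is chosen so it runs without a display, small random arrays stand in for the JPEGs read with imread(), and the helper name render_to_png is made up for this example.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def render_to_png(images):
    """Draw each image in its own figure, keep the PNG bytes, then close the figure."""
    rendered = []
    for im in images:
        fig = plt.figure()
        plt.imshow(im)
        buffer = io.BytesIO()
        fig.savefig(buffer, format="png")
        rendered.append(buffer.getvalue())
        plt.close(fig)  # drop pyplot's reference so the figure can be freed
    return rendered


# Stand-in data: three small random RGB arrays instead of real JPEG files.
fake_images = [np.random.rand(32, 32, 3) for _ in range(3)]
pngs = render_to_png(fake_images)
open_figures = plt.get_fignums()  # figure numbers pyplot is still tracking
```

After the loop, pyplot should no longer be tracking any of the figures, which is the point of closing them one by one rather than numbering them and leaving them open.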
4) For obvious reasons, JPGs are much bigger when stored as arrays
(though they still seem to take up more memory than they should)
It's pretty easy to estimate the memory requirements for an image. If
the image is true-color (by this, I mean not color-mapped), you'll need
4 bytes per pixel for the original image, plus a cached scaled copy (the
size of which depends on the output dpi), again with 4 bytes per pixel.
For color-mapped images, you'll have 4-byte floats for each pixel,
4-byte rgba for the color-mapped image, and again a cached scaled copy
of that. Not knowing the size of your input images, it's impossible to
say if 40MB per image is way too big or not, but it's not unheard of by
any means.
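The accounting above can be written down as a small back-of-the-envelope calculator. This is a rough sketch that only mirrors the byte counts described in the preceding paragraph; the function name and the example dimensions (a 3000x2000 source drawn at 800x600) are hypothetical, not taken from the report.

```python
def estimate_image_bytes(width, height, scaled_width, scaled_height, color_mapped=False):
    """Estimate memory for one displayed image, following the email's accounting.

    True-color: 4 bytes per pixel for the original, plus 4 bytes per pixel
    for the cached scaled copy.
    Color-mapped: 4-byte float per pixel, plus 4-byte RGBA per pixel, plus
    the cached scaled copy at 4 bytes per pixel.
    """
    pixels = width * height
    scaled_pixels = scaled_width * scaled_height
    original = 4 * pixels
    if color_mapped:
        original += 4 * pixels  # float data and its RGBA color-mapped version
    return original + 4 * scaled_pixels


# Hypothetical example: a 3000x2000 image shown at 800x600 on screen.
true_color_bytes = estimate_image_bytes(3000, 2000, 800, 600)
color_mapped_bytes = estimate_image_bytes(3000, 2000, 800, 600, color_mapped=True)
print(true_color_bytes / 2**20, "MiB true-color")
print(color_mapped_bytes / 2**20, "MiB color-mapped")
```

With these made-up dimensions the estimate lands in the tens of megabytes per image, which is the same order of magnitude as the 40MB figure being discussed.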
Problems 1-3 seem serious enough that they will get fixed eventually.
... but (4) is a design issue. Assuming it's possible, it looks like
there could be benefits to making an array-like wrapper around PIL
image objects (perhaps similar in principle to a sparse matrix).
Given PIL.ImageMath, ImagePath, etc., it seems actually fairly
doable. Wouldn't something like this be of major benefit to people
using SciPy for anything image-related?
Are you suggesting decompressing the JPEG on-the-fly with each redraw?
I'm not certain that would be fast enough for interactive use. It may
be worth experimenting with, but it would require a lot of changes to
how matplotlib works. It's also very tricky to get right -- I'm not
aware of any image processing applications that don't ultimately store a
dense matrix of uncompressed image data in memory, except for something
like compressed OpenGL textures on a graphics card. PIL certainly
doesn't retain the compressed JPEG in memory. So, I'm not sure the
cost/benefit tradeoff is right here -- the problems it solves can be
solved much more easily without sacrificing speed in other ways. That
is, if the image data is simply too large, it can be scaled before
feeding it to imshow(). And generating multiple figures in batch is not
a problem if the figure is explicitly closed.
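One simple way to scale the data before handing it to imshow() is plain decimation with NumPy slicing. This is a sketch, not the only or best way to do it: there is no anti-aliasing, the helper name downsample and the max_side parameter are invented for this example, and a zero-filled array stands in for a real photo.

```python
import numpy as np


def downsample(im, max_side=1024):
    """Return a decimated copy of `im` whose longest side is at most `max_side` pixels."""
    longest = max(im.shape[0], im.shape[1])
    step = max(1, -(-longest // max_side))  # ceiling division, at least 1
    # .copy() matters: a bare slice is a view that keeps the full array alive.
    return im[::step, ::step].copy()


# Stand-in for a large photo: 4000x3000 RGB, one byte per channel.
big = np.zeros((4000, 3000, 3), dtype=np.uint8)
small = downsample(big, max_side=1000)
print(big.nbytes, "->", small.nbytes)
```

The copy is deliberate. Without it the small array would still reference the large one, so dropping the original would free nothing.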
Hope this helps. I would like to get to the bottom of any memory leaks,
so if you can provide a standalone script that leaks, despite calling
close(fig) in each iteration, please let me know.
Cheers,
Mike
Leo
On Fri, Oct 2, 2009 at 7:45 AM, Michael Droettboom <md...@stsci.edu
<mailto:md...@stsci.edu>> wrote:
If you assign each figure to a new number, pyplot will keep all of
those figures around in memory (because it thinks you may want
to use them again). The best route is to call close('all') or
close(fig) with each loop iteration.
40MB per image doesn't sound way out of reason to me. How big are
your images?
Mike
On 10/01/2009 10:25 PM, Leo Trottier wrote:
I have a friend who's having strange memory issues when opening
and displaying images (using Matplotlib).
Here's what he says:
#######################################
pylab seems really inefficient: Opening a few images and
displaying them eats up tons of memory, and the memory doesn't
get freed.
Start python and run
In [5]: from glob import *;
In [6]: from pylab import *
python has 33MB of memory.
Run
In [7]: i = 1
In [8]: for imname in glob("*.JPG"):
   ...:     im = imread(imname)
   ...:     figure(i); i = i+1
   ...:     imshow(im)
   ...:
This opens 10 figures and displays them. Python takes 480MB of
memory. This is crazy, for 10 images -- 40+MB of memory for each!
In [14]: close("all")
In [15]: i = 1
In [16]: for imname in glob("*.JPG"):
   ....:     im = imread(imname)
   ....:     figure(i); i = i+1
   ....:     imshow(im)
   ....:
   ....:
This closes all figures and opens them again. Python takes up
837MB of memory.
and so on... Something is really wrong with memory management.
##### System info: ##############
(using macosx backend)
2.4GHz MacBook Pro Intel Core 2 Duo
4GB 667MHz DDR2 SDRAM
In [5]: sys.version
Out[5]: '2.6.2 (r262:71600, Oct 1 2009, 16:44:23) \n[GCC 4.2.1
(Apple Inc. build 5646)]'
In [6]: numpy.__version__
Out[6]: '1.3.0'
In [7]: matplotlib.__version__
Out[7]: '0.99.1.1'
In [8]: scipy.__version__
Out[8]: '0.7.1'
In [9]:
------------------------------------------------------------------------
------------------------------------------------------------------------------
Come build with us! The BlackBerry® Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9-12, 2009. Register now!
http://p.sf.net/sfu/devconf
------------------------------------------------------------------------
_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
<mailto:Matplotlib-users@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/matplotlib-users