For some reason, my earlier reply didn't seem to make it to the mailing list. Here it is in its entirety:

"""
If you assign each figure to a new number, it will keep all of those figures around in memory (because pyplot thinks you may want to use it again.) The best route is to call close('all') or close(fig) with each loop iteration.

40MB per image doesn't sound way out of reason to me. How big are your images?
"""

On 10/05/2009 03:46 AM, Leo Trottier wrote:
Hi,

I think I've figured out what's going on.  It's a combination of things:

1) iPython is ignorant of the problems associated with caching massive data output
2) iPython doesn't seem to have a good way to clear data from memory reliably (https://bugs.launchpad.net/ipython/+bug/412350)
iPython is designed for interactive use, and stores a lot of values so they can be conveniently reused later. For long-running "batch" scripts, you can use "regular" Python, or run the code in iPython such that the output isn't displayed at the console (by using "import" or "%run"). A fix for the bug in 2) may help, but it looks like it would still require some manual intervention to be useful. You're still using a tool designed for fine-grained interactive use (e.g. a pen) where one designed for automation may be more appropriate (e.g. a laser printer) :)
3) matplotlib/Python seems to be insufficiently aggressive in its garbage collection (??)
Is that still true after forcibly closing the figures on each loop iteration as I suggested? Many hours have been spent squashing memory leaks in matplotlib, and I am not aware of any in at least 0.98 and later (other than some unavoidable small leaks in certain GUI backends). Do you have a standalone example that illustrates this on a recent version of matplotlib?
4) For obvious reasons, JPGs are much bigger when stored as arrays (though they still seem to take up more memory than they should)
It's pretty easy to estimate the memory requirements for an image. If the image is true-color (by this, I mean not color-mapped), you'll need 4-bytes-per-pixel for the original image, plus a cached scaled copy (the size of which depends on the output dpi), again with 4 bytes per pixel. For color-mapped images, you'll have 4-byte floats for each pixel, 4-byte rgba for the color-mapped image, and again a cached scaled copy of that. Not knowing the size of your input images, it's impossible to say if 40MB per image is way too big or not, but it's not unheard of by any means.
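The arithmetic above can be sketched as a small calculator. This is only a back-of-the-envelope model built from the per-pixel costs described in the paragraph; the function name, the image size, and the canvas size are made-up illustrations, not the original poster's data.

```python
# Rough estimate following the per-pixel costs described above.
# All sizes below are hypothetical examples.

BYTES_PER_PIXEL = 4  # 4-byte RGBA, or one 4-byte float per pixel


def estimate_image_bytes(width, height, out_width, out_height, color_mapped=False):
    """Approximate bytes held for one displayed image."""
    source_pixels = width * height
    scaled_pixels = out_width * out_height  # depends on figure size and dpi
    total = source_pixels * BYTES_PER_PIXEL       # original (RGBA or float data)
    if color_mapped:
        total += source_pixels * BYTES_PER_PIXEL  # RGBA result of color mapping
    total += scaled_pixels * BYTES_PER_PIXEL      # cached scaled copy
    return total


# Hypothetical 3648x2736 JPEG drawn on an 800x600 pixel canvas.
true_color = estimate_image_bytes(3648, 2736, 800, 600)
mapped = estimate_image_bytes(3648, 2736, 800, 600, color_mapped=True)
print("true-color:   %.1f MB" % (true_color / 1e6))
print("color-mapped: %.1f MB" % (mapped / 1e6))
```

Under these assumptions a true-color image of about 10 megapixels already lands near 40MB, which is why the input size matters so much for the question.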

Problems 1-3 seem problematic enough that they will get fixed eventually.

... but (4) is a design issue. Assuming it's possible, it looks like there could be benefits to making an array-like wrapper around PIL image objects (perhaps similar in principle to a sparse matrix). Given PIL.ImageMath, ImagePath, etc., it seems actually fairly doable. Wouldn't something like this be of major benefit to people using SciPy for anything image-related?
Are you suggesting decompressing the JPEG on-the-fly with each redraw? I'm not certain that would be fast enough for interactive use. It may be worth experimenting with, but it would require a lot of changes to how matplotlib works. It's also very tricky to get right -- I'm not aware of any image processing applications that don't ultimately store a dense matrix of uncompressed image data in memory, except for something like compressed OpenGL textures on a graphics card. PIL certainly doesn't retain the compressed JPEG in memory. So, I'm not sure the cost/benefit tradeoff is right here -- the problems it solves can be solved much more easily without sacrificing speed in other ways. That is, if the image data is simply too large, it can be scaled before feeding it to imshow(). And generating multiple figures in batch is not a problem if the figure is explicitly closed.
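As a rough illustration of scaling the data before handing it to imshow(), here is a sketch that uses plain NumPy slicing. The helper name downsample and the max_side limit are hypothetical, and a zero-filled array stands in for a real image. Slicing like this is crude decimation with no anti-aliasing, so a proper resize (for example with PIL) may look better.

```python
import numpy as np


def downsample(im, max_side=1024):
    """Return a reduced copy of im whose longest side is at most max_side."""
    longest = max(im.shape[0], im.shape[1])
    step = max(1, -(-longest // max_side))  # ceiling division
    # copy() so the result does not keep the full-size array alive as a view
    return im[::step, ::step].copy()


# A zero-filled array stands in for a large image returned by imread().
big = np.zeros((3000, 4000, 3), dtype=np.uint8)
small = downsample(big)
print(big.shape, "->", small.shape)
```

The reduced array can then be passed to imshow() in place of the original, and the full-size array can be dropped once it is no longer referenced.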

Hope this helps. I would like to get to the bottom of any memory leaks, so if you can provide a standalone script that leaks, despite calling close(fig) in each iteration, please let me know.

Cheers,
Mike

Leo

On Fri, Oct 2, 2009 at 7:45 AM, Michael Droettboom <md...@stsci.edu> wrote:

    If you assign each figure to a new number, it will keep all of
    those figures around in memory (because pyplot thinks you may want
    to use it again.)  The best route is to call close('all') or
    close(fig) with each loop iteration.

    40MB per image doesn't sound way out of reason to me.  How big are
    your images?

    Mike


    On 10/01/2009 10:25 PM, Leo Trottier wrote:
    I have a friend who's having strange memory issues when opening
    and displaying images (using Matplotlib).

    Here's what he says:
    #######################################

    pylab seems really inefficient: Opening a few images and
    displaying them eats up tons of memory, and the memory doesn't
    get freed.

    Start python and run

    In [5]: from glob import *;

    In [6]: from pylab import *

    python has 33MB of memory.


    Run

    In [7]: i = 1

    In [8]: for imname in glob("*.JPG"):
      ...:     im = imread(imname)
      ...:     figure(i); i = i+1
      ...:     imshow(im)
      ...:

    This opens 10 figures and displays them. Python takes 480MB of
    memory. This is crazy, for 10 images -- 40+MB of memory for each!

    In [14]: close("all")

    In [15]: i = 1

    In [16]: for imname in glob("*.JPG"):
       ....:     im = imread(imname)
       ....:     figure(i); i = i+1
       ....:     imshow(im)
       ....:

    This closes all figures and opens them again. Python takes up
    837MB of memory.

    and so on... Something is really wrong with memory management.

    ##### System info: ##############

    (using macosx backend)

    2.4GHz MacBook Pro Intel Core 2 Duo

    4GB 667MHz DDR2 SDRAM

    In [5]: sys.version
    Out[5]: '2.6.2 (r262:71600, Oct  1 2009, 16:44:23) \n[GCC 4.2.1
    (Apple Inc. build 5646)]'

    In [6]: numpy.__version__
    Out[6]: '1.3.0'

    In [7]: matplotlib.__version__
    Out[7]: '0.99.1.1'

    In [8]: scipy.__version__
    Out[8]: '0.7.1'

    In [9]:


    ------------------------------------------------------------------------------
    Come build with us! The BlackBerry® Developer Conference in SF, CA
    is the only developer event you need to attend this year. Jumpstart your
    developing skills, take BlackBerry mobile applications to market and stay
    ahead of the curve. Join us from November 9-12, 2009. Register now!
    http://p.sf.net/sfu/devconf
    ------------------------------------------------------------------------

    _______________________________________________
    Matplotlib-users mailing list
    Matplotlib-users@lists.sourceforge.net
    https://lists.sourceforge.net/lists/listinfo/matplotlib-users


