Hello,

I have written a small script that, I think, demonstrates a memory leak in savefig. A search of the mailing list shows a thread started by Ralf Gommers <ralf.gomm...@googlemail.com> around 2009-07-01 that seems to cover a very similar issue. I have appended the demonstration script at the end of this e-mail.
The demonstration script sits in a relatively tight loop, creating figures and then saving them while monitoring memory usage. A plot of VmRSS vs. number of loop iterations, as generated on my system, is attached as "data.png" (you can create your own plots with the sample script). Although I have only tested this on Fedora 12, I expect that most Linux users should be able to run the script for themselves. Users should be able to comment out the "savefig" line and watch memory usage go from unbounded to (relatively) bounded.
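To put a number on the trend in such a plot, the growth per iteration can be estimated with a least-squares slope over the recorded samples. The following is only a minimal sketch and not part of the original script: `rss_slope` and `load_samples` are hypothetical helper names, and it assumes a file with one integer VmRSS value (in kB) per line, which is the format the demonstration script writes to data.txt.

```python
def rss_slope(samples):
    """Least-squares slope of the samples against their index.

    With VmRSS samples in kB, the result is in kB per loop iteration.
    """
    n = len(samples)
    if n < 2:
        raise ValueError('need at least two samples')
    mean_x = (n - 1) / 2.0
    mean_y = sum(samples) / float(n)
    # Covariance of (index, sample) divided by the variance of the index.
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def load_samples(path):
    """Read one integer value per line, skipping blank lines."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]
```

Calling `rss_slope(load_samples('data.txt'))` after a run gives a single figure that is easy to report. A slope near zero would suggest bounded usage, while a clearly positive slope would be consistent with the linear growth described here.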
Can anybody see a cause for this leak hidden in my code? Has anybody seen this issue and solved it? I would also appreciate it if other people would run this script and report their findings, so that we get some indication of how often the problem shows up.
Sincerely,
Keegan Callin

************************************************************************
'''Script to demonstrate memory leakage in savefig call.

Requirements:
Tested in Fedora 12.  It should work on other systems where
/proc/{PID}/status files exist and those files contain a 'VmRSS' entry
(this is how the script monitors its memory usage).

System Details on Original Test System:

[kee...@grizzly test]$ uname -a
Linux grizzly 2.6.32.9-70.fc12.x86_64 #1 SMP Wed Mar 3 04:40:41 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

[kee...@grizzly ~]$ gcc --version
gcc (GCC) 4.4.3 20100127 (Red Hat 4.4.3-4)
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[kee...@grizzly ~]$ cd ~/src/matplotlib-0.99.1.1
[kee...@grizzly matplotlib-0.99.1.1]$ rm -rf build
[kee...@grizzly matplotlib-0.99.1.1]$ python setup.py build &> out.log
[kee...@grizzly matplotlib-0.99.1.1]$ head -38 out.log
============================================================================
BUILDING MATPLOTLIB
            matplotlib: 0.99.1.1
                python: 2.6.4 (r264:75706, Jan 20 2010, 12:34:05)  [GCC
                        4.4.2 20091222 (Red Hat 4.4.2-20)]
              platform: linux2

REQUIRED DEPENDENCIES
                 numpy: 1.4.0
             freetype2: 9.22.3

OPTIONAL BACKEND DEPENDENCIES
                libpng: 1.2.43
               Tkinter: no
                        * TKAgg requires Tkinter
              wxPython: no
                        * wxPython not found
                  Gtk+: no
                        * Building for Gtk+ requires pygtk; you must be able
                        * to "import gtk" in your build/install environment
       Mac OS X native: no
                    Qt: no
                   Qt4: no
                 Cairo: no

OPTIONAL DATE/TIMEZONE DEPENDENCIES
              datetime: present, version unknown
              dateutil: matplotlib will provide
                  pytz: 2010b

OPTIONAL USETEX DEPENDENCIES
                dvipng: no
           ghostscript: 8.71
                 latex: no
               pdftops: 0.12.4

[Edit setup.cfg to suppress the above messages]
============================================================================
[kee...@grizzly matplotlib-0.99.1.1]$ bzip2 out.log
# out.log.bz2 is attached to the message containing this program.

[kee...@grizzly ~]$ python2.6
Python 2.6.4 (r264:75706, Jan 20 2010, 12:34:05)
[GCC 4.4.2 20091222 (Red Hat 4.4.2-20)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import matplotlib
>>> matplotlib.__version__
'0.99.1.1'
'''
# Import standard python modules
import sys
import os
from ConfigParser import SafeConfigParser as ConfigParser
from cStringIO import StringIO

# import numpy
import numpy
from numpy import zeros

# Import matplotlib
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas


def build_figure(a):
    '''Returns a new figure containing array a.'''

    # Create figure and setup graph
    fig = Figure()
    FigureCanvas(fig)
    ax = fig.add_subplot(1, 1, 1)
    ax.plot(a)

    return fig


_proc_status = '/proc/%d/status' % os.getpid()
def load_status():
    '''Returns a dict of process statistics from /proc/{PID}/status.'''
    status = {}

    with open(_proc_status) as f:
        for line in f:
            key, value = line.split(':', 1)
            key = key.strip()
            value = value.strip()
            status[key] = value

    return status


def main():
    data_file = 'data.txt'
    image_file = 'data.png'
    num_iterations = 1000

    with open(data_file, 'w') as f:
        # Tried running without matplotlib or numpy such that the
        # only thing happening in the process is the dumping of process
        # status information to `data_file` from the loop.  Memory
        # usage reaches a bound _very_ quickly.
        status = load_status()
        rss, unit = status['VmRSS'].split()
        print >>f, rss

        print 'Executing', num_iterations, 'iterations.'
        a = zeros(10000)
        for i in xrange(0, num_iterations):
            # Random data is shifted into a numpy array.
            # With numpy and the process status dump enabled, memory
            # usage reaches a bound very quickly.
            a[0:-1] = a[1:]
            a[-1] = numpy.random.rand(1)[0]

            # When figures of the array are generated in each loop,
            # memory reaches a bound more slowly (~50 iterations) than
            # without matplotlib; nevertheless, memory usage still
            # appears to be bounded.
            fig = build_figure(a)

            # Savefig alone causes memory usage to become unbounded.
            # Memory usage increase seems to be linear with the number
            # of iterations.
            sink = StringIO()
            fig.savefig(sink, format='png', dpi=80,
                transparent=False, bbox_inches="tight", pad_inches=0.15)
            # This line below can be used to demonstrate that StringIO
            # does not leak without the savefig call.
            #sink.write(1000*'hello')
            sink.close()

            # Load process statistics and save them to a file.
            status = load_status()
            rss, unit = status['VmRSS'].split()
            print >>f, rss

            sys.stdout.write('#')
            sys.stdout.flush()

    print
    print 'Graphing memory usage data from', data_file, 'to', image_file
    with open(data_file) as f:
        rss = [int(r) for r in f]

    fig = build_figure(rss)
    with open(image_file, 'wb') as f:
        fig = build_figure(rss)
        fig.savefig(f, format='png', dpi=80,
            transparent=False, bbox_inches="tight", pad_inches=0.15)

    return 0


if __name__ == '__main__':
    sys.exit(main())
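
One way to narrow down where the growth comes from is to compare counts of live Python objects before and after a batch of iterations. If the counts stay flat while VmRSS keeps climbing, that would suggest the growth is not in garbage-collector-tracked Python objects. What follows is only a minimal sketch using the standard `gc` module. It is not part of the script above, and `count_objects_by_type` and `growth` are hypothetical helper names.

```python
import gc
from collections import defaultdict


def count_objects_by_type():
    """Return {type name: number of live objects tracked by the gc}."""
    counts = defaultdict(int)
    for obj in gc.get_objects():
        counts[type(obj).__name__] += 1
    return dict(counts)


def growth(before, after):
    """Return {type name: increase} for every type whose count went up."""
    result = {}
    for name, n in after.items():
        delta = n - before.get(name, 0)
        if delta > 0:
            result[name] = delta
    return result
```

A possible use is to take one snapshot before the loop and another after it, then inspect `growth(before, after)` for types whose count rises with the number of iterations.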
<<attachment: data.png>>
_______________________________________________ Matplotlib-users mailing list Matplotlib-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/matplotlib-users