Chris Withers wrote:
Hi All,

I thought this was fixed back in Python 2.5, but I guess not?

So, I'm playing in an interactive session:

>>> from xlrd import open_workbook
>>> b = open_workbook('some.xls',pickleable=0,formatting_info=1)

At this point, top shows the process usage for python to be about 500Mb.
That's okay, I'd expect that, b is big ;-)

>>> del b

However, top still shows about 500Mb now. Maybe the garbage collector needs a kick?

>>> import gc
>>> gc.collect()
702614

Nope, still 500Mb. What gives? How can I make Python give the memory it's no longer using back to the OS?

Okay, so maybe this is something to do with it being an interactive session? So I wrote this script:

from xlrd import open_workbook
import gc
b = open_workbook('some.xls',pickleable=0,formatting_info=1)
print 'opened'
raw_input()
del b
print 'deleted'
raw_input()
gc.collect()
print 'gc'
raw_input()

The raw inputs are there so I can check the memory usage in top.
Even after the gc, Python still hasn't given the memory back to the OS :-(

What am I doing wrong?

Chris

You're not doing anything wrong. I don't know of any other environment that "gives the memory back" to the OS.

I don't know Unix/Linux memory management, but I do know Windows, and I suspect the others are quite similar. There are a few memory allocators within Windows itself, and some more within the MSC runtime library. They work similarly enough that I can safely just choose one to explain. I'll pick on malloc().

When malloc() is called for the first time (long before your module is loaded), it asks the operating system's low-level mapping allocator for a multiple of 64k. The 64k will always be aligned on a 64k boundary, and is in turn divided into 4k pages. The 64k could come from one of three places - the swapfile, an executable (or DLL), or a data file, but there's not much real difference between those. malloc() itself will always use the swapfile. Anyway, at this point my memory is a little bit fuzzy. I think only 4k of the swapfile is actually mapped in, the rest being reserved. But malloc() will then build some data structures for that 64k block, and as memory is requested, get more and more pieces of that 64k, till the whole thing is mapped in. Then, additional multiples of 64k are allocated in the same way, and of course the data structures are chained together. If an application "frees" a block, the data structure is updated, but the memory is not unmapped. Theoretically, if all the blocks within one 64k were freed, malloc() could release the 64k block to the OS, but to the best of my knowledge, malloc() never does. Incidentally, there's a different scheme for large blocks, but that's changed several times, and I have no idea how it's done now.
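To make the bookkeeping described above concrete, here is a toy model written in Python 3. It is only a sketch, not how any real malloc() is implemented: the names ToyHeap, reserved and free_list are made up for illustration, page-level mapping is ignored, and a chunk is allowed to straddle two 64k blocks for simplicity. All it tries to show is that free() updates the data structures while the amount obtained from the OS never goes down.

```python
BLOCK_SIZE = 64 * 1024  # the 64k granularity described above


class ToyHeap:
    """Toy model: tracks bytes obtained from the 'OS' and chunks freed by the app."""

    def __init__(self):
        self.reserved = 0    # bytes obtained from the OS (only ever grows)
        self.top = 0         # next never-used offset
        self.free_list = []  # (offset, size) chunks the application has freed

    def malloc(self, size):
        # First fit: reuse a freed chunk if one is big enough (no splitting).
        for i, (offset, chunk_size) in enumerate(self.free_list):
            if chunk_size >= size:
                return self.free_list.pop(i)
        # Otherwise carve from never-used space, growing by whole 64k blocks.
        while self.top + size > self.reserved:
            self.reserved += BLOCK_SIZE
        chunk = (self.top, size)
        self.top += size
        return chunk

    def free(self, chunk):
        # Only the bookkeeping changes; `reserved` is never reduced.
        self.free_list.append(chunk)


heap = ToyHeap()
chunks = [heap.malloc(1000) for _ in range(200)]
print(heap.reserved)   # 262144 (four 64k blocks)
for chunk in chunks:
    heap.free(chunk)
print(heap.reserved)   # still 262144: nothing went back to the OS
heap.malloc(1000)      # served from the free list
print(heap.reserved)   # still 262144: freed space was reused instead
```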

Now, C programmers sometimes write a custom allocator, and in C++ it's not hard to have a custom allocator manage all instances of a particular class. This can be convenient for applications that know how their memory usage patterns are likely to work. Photoshop, for example, can be configured to use "user swap space" (I forget what they call it) from files that Photoshop explicitly allocates. Space from that allocator does not come from the swapfile, so it's not constrained by other running applications, and it wouldn't be counted by the Windows equivalent of 'top' (e.g. the Windows Task Manager).

A custom allocator can also be designed to know when a particular set of allocations are all freed, and release the memory entirely back to the system. For instance, if all temp data for a particular transaction is put into an appropriate custom allocator, then at the end of the transaction, it can safely be released.
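As a rough illustration of the per-transaction idea, here is a minimal Python 3 sketch. The class name TransactionArena and its store/load/release methods are hypothetical, invented for this example. It assumes that closing an anonymous mapping created with the standard mmap module is how the region gets handed back; how promptly the OS reclaims it is platform-dependent.

```python
import mmap


class TransactionArena:
    """Hypothetical bump allocator over one anonymous memory mapping."""

    def __init__(self, size):
        # A fileno of -1 asks for an anonymous mapping (no named file behind it).
        self._map = mmap.mmap(-1, size)
        self._size = size
        self._used = 0

    def store(self, data):
        """Copy `data` (bytes) into the arena and return (offset, length)."""
        end = self._used + len(data)
        if end > self._size:
            raise MemoryError("arena exhausted")
        offset = self._used
        self._map[offset:end] = data
        self._used = end
        return offset, len(data)

    def load(self, offset, length):
        """Return a bytes copy of a previously stored region."""
        return self._map[offset:offset + length]

    def release(self):
        # End of transaction: drop the whole mapping in one step.
        self._map.close()

    @property
    def released(self):
        return self._map.closed


arena = TransactionArena(1 << 20)    # 1 MiB of scratch space for one transaction
ref = arena.store(b"temp row 1")
print(arena.load(*ref))
arena.release()                      # everything stored in the arena goes at once
```

The point of the design is that nothing is freed piece by piece: individual items are never released, so there is no free list to maintain, and the whole region can be unmapped when the transaction ends.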

I would guess that Python doesn't do any custom allocators, and therefore never releases the memory back to the system. It will however reuse it when you allocate more stuff.
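One way to check this without switching to top between steps is to have the script report its own resident size. The sketch below is Python 3 and assumes Linux, where /proc/self/statm is available; the helpers rss_bytes and report are made up for this example, and on other systems rss_bytes simply returns None. What the numbers look like after the del will depend on the Python version and on the platform's allocator, so treat it as a measuring aid and not as proof of either outcome.

```python
import gc
import os


def rss_bytes():
    """Resident set size of this process in bytes, or None if /proc is unavailable."""
    try:
        with open("/proc/self/statm") as f:
            fields = f.read().split()
    except OSError:
        return None
    # The second field of statm is the number of resident pages.
    return int(fields[1]) * os.sysconf("SC_PAGE_SIZE")


def report(label):
    rss = rss_bytes()
    print(label, "unknown" if rss is None else f"{rss / 2**20:.1f} MiB")


report("start:")
data = [str(i) * 10 for i in range(500000)]   # stand-in for a big workbook
report("allocated:")
del data
gc.collect()
report("deleted:")
data = [str(i) * 10 for i in range(500000)]   # allocate the same amount again
report("reallocated:")
```

If freed memory is being reused, the "reallocated" figure should end up close to the "allocated" one and not roughly double it.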


DaveA  (author of the memory tracking subsystem of NuMega's BoundsChecker)

--
http://mail.python.org/mailman/listinfo/python-list
