I started writing this post asking for assistance on using Heapy for
studying my Sage application's memory usage. But I think that a more
important question relates to my use of @parallel('multiprocessing').
So here goes.

Section 1.
On an 8-core machine, I kick off a function (which has been decorated
with @parallel('multiprocessing')) with 16 arguments. (BTW, Sage
outrocks all---the sirens of multicore have given Matlab-addicted
colleagues weeks of impotent rage; me, only a couple of hours of
searching and finding "parallel?". The creators of Sage will have a
special place in heaven.)

Anyway, when 16 arguments are passed to this parallel function (which
creates large vectors and reduces them to a scalar), I see 8 or so
sage processes in "top", but one of them stands out: its memory
footprint steadily increases in small 1-3% increments to about 12% of
memory. It stays at 12% after the sage script finishes. If I kick
off 100 parallel runs, I get MemoryErrors due to excessive RAM use
(and Sage doesn't recover: it hangs, and Control-C raises
KeyboardInterrupts but never brings it down cleanly, so I have to kill
the screen).

But if I decorate my function with @parallel('reference'), that is,
force serial execution, the Sage process at the end of many runs
takes only 1.5% of memory. (It just takes a lot more time.)
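
To put numbers on what "top" shows, something like the following
sketch could log peak resident memory for the parent process and for
any child processes it has already waited on. It assumes a Unix-like
system where Python's standard "resource" module is available;
peak_rss_mb is a name I made up, and the list is only a stand-in for
one of my large vectors. Note that this reports the peak, not the
current footprint that "top" displays.

```python
import resource
import sys


def peak_rss_mb(who=resource.RUSAGE_SELF):
    """Peak resident set size, in MB, for this process or its reaped children."""
    peak = resource.getrusage(who).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / float(divisor)


before = peak_rss_mb()
scratch = [float(i) for i in range(200000)]  # stand-in for a large vector
total = sum(scratch)                         # reduce it to a scalar
del scratch
after = peak_rss_mb()

print("parent peak RSS: %.1f MB -> %.1f MB" % (before, after))
print("children peak RSS: %.1f MB" % peak_rss_mb(resource.RUSAGE_CHILDREN))
```

Printing these figures before and after a batch of runs, once with
'multiprocessing' and once with 'reference', would at least show
whether the growth is in the parent or in the workers.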

Is there an issue with @parallel with multiprocessing and garbage
collection? Why would Sage sop up so much memory after parallel runs
of a function that solely returns a scalar? (The function does take as
its input a small compound object, deepcopy's it, modifies the copy,
generates large vectors from the original and modified objects and
reduces them to a scalar, which it returns. I haven't tried making
Python explicitly garbage-collect the intermediate variables it
generates because that sounds risky.)
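
For reference, here is a minimal sketch of the pattern I am
describing, with explicit cleanup added. Everything in it is a
hypothetical stand-in: reduce_to_scalar, the "scale" key, and the
plain lists are not my real code. As far as I understand,
gc.collect() only reclaims objects that are already unreachable, so
it should not be dangerous in itself; whether it changes the
footprint I see in "top" is exactly what I don't know.

```python
import copy
import gc


def reduce_to_scalar(params, n=100000):
    """Hypothetical stand-in for the function decorated with @parallel."""
    modified = copy.deepcopy(params)   # leave the caller's object untouched
    modified["scale"] *= 2.0

    # Build one large vector from the original and one from the modified copy.
    original_vec = [params["scale"] * i for i in range(n)]
    modified_vec = [modified["scale"] * i for i in range(n)]

    # Reduce both vectors to a single scalar.
    result = sum(m - o for m, o in zip(modified_vec, original_vec))

    # Drop the only references to the intermediates, then ask the cyclic
    # garbage collector to run before returning.
    del original_vec, modified_vec, modified
    gc.collect()
    return result


params = {"scale": 1.5}
value = reduce_to_scalar(params, n=1000)
print(value)
```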

Section 2.
In a bid to figure out what Python objects are taking so much RAM, I
installed Guppy and am trying to use Heapy
(http://guppy-pe.sourceforge.net/heapy_tutorial.html). I cannot
understand its output, and "from guppy import hpy; hp=hpy(); hp.heap()"
tells me that the process showing 12% memory usage (of 8 GB RAM) holds
just 40 MB of Python objects. If anyone knows how to interpret its
output, I'd be much obliged:

sage: hp.heap()

Partition of a set of 390118 objects. Total size = 42345660 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0 168567  43 24633284  58  24633284  58 str
     1  93441  24  3740796   9  28374080  67 tuple
     2   1593   0  2051016   5  30425096  72 dict of module
     3  14869   4  2022184   5  32447280  77 dict of numpy.core.defmatrix.matrix
     4  25218   6  1714824   4  34162104  81 types.CodeType
     5  24449   6  1369144   3  35531248  84 function
     6   2518   1  1281712   3  36812960  87 dict of type
     7   2301   1  1167912   3  37980872  90 dict (no owner)
     8   2520   1  1094184   3  39075056  92 type
     9  14869   4   832664   2  39907720  94 numpy.core.defmatrix.matrix
<839 more rows. Type e.g. '_.more' to view.>
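
To show how I am reading that table (count and total size per kind of
object), here is a rough standard-library sketch that builds a similar
partition and redoes the arithmetic for the mismatch. It is not how
Heapy works internally: partition_by_type is a made-up helper,
gc.get_objects() only sees container objects the collector tracks (so
plain strings are missing), sys.getsizeof() is shallow, and I am
assuming "8 GB" means 8 * 1024**3 bytes. My guess is that memory
allocated outside Python objects, or held by other processes, would
not appear in a count like this either.

```python
import gc
import sys
from collections import defaultdict


def partition_by_type(limit=5):
    """Rough per-type count and shallow size of objects the GC tracks."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for obj in gc.get_objects():
        name = type(obj).__name__
        counts[name] += 1
        sizes[name] += sys.getsizeof(obj, 0)  # 0 if the size is unavailable
    rows = [(name, counts[name], sizes[name]) for name in counts]
    rows.sort(key=lambda row: row[2], reverse=True)  # largest total first
    return rows[:limit]


for name, count, size in partition_by_type():
    print("%-20s %8d %12d" % (name, count, size))

# Arithmetic for the mismatch between the heap total and "top".
heap_bytes = 42345660              # "Total size" from the table above
top_bytes = 0.12 * 8 * 1024 ** 3   # 12% of 8 GB, assuming binary gigabytes
print("heap total:  %.1f MiB" % (heap_bytes / 1024.0 ** 2))
print("12%% of 8 GB: %.1f MiB" % (top_bytes / 1024.0 ** 2))
print("ratio:       about %.0fx" % (top_bytes / heap_bytes))
```

So the objects Heapy can see account for only a small fraction of what
"top" attributes to the process, which is the part I can't explain.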

Thanks very much!