Re: Python's memory hogging

Vlad K. Tue, 15 May 2012 06:54:16 -0700

Okay, followup on this problem.

I've now replaced one large lxml.etree root with chunked writing into a tempfile. The XML view basically does this (debug heapy output included):

    hp = guppy.hpy()
    print "================ BEFORE TEMPFILE"
    print hp.heap()

    xml_model = XMLModel()
    xmlfile = tempfile.TemporaryFile()
    xmlfile.write(xml_model.begin())  # write XML "doctype" and start root tag

    for row in session.query(DataModel).filter(....).all():
        # turn single row into its XML representation using lxml.etree.Element
        # to construct it, and return as string with
        # lxml.etree.tostring(element), here to be written to tempfile
        xmlfile.write(xml_model.process_row(row))

    xmlfile.write(xml_model.end())  # write closing root tag

    print "================ BEFORE ITERATOR RETURN"
    print hp.heap()

    # Return file as app iterator
    xmlfile.seek(0)
    return Response(app_iter=xmlfile, content_type="application/xml")
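For anyone who wants to try the chunked-writing pattern outside the app, here is a minimal, self-contained sketch. It is not the actual XMLModel: it assumes rows are plain dicts, uses the standard library's xml.etree.ElementTree in place of lxml, and the names write_rows_as_xml and root_tag are made up for illustration.

```python
import tempfile
import xml.etree.ElementTree as ET


def write_rows_as_xml(rows, root_tag="rows"):
    """Write rows to a temp file one element at a time and return the file.

    Hypothetical stand-in for XMLModel.begin()/process_row()/end().
    """
    xmlfile = tempfile.TemporaryFile()
    # Header and opening root tag are written by hand, as in the view above.
    xmlfile.write(b'<?xml version="1.0" encoding="UTF-8"?>\n')
    xmlfile.write(("<%s>" % root_tag).encode("utf-8"))
    for row in rows:
        element = ET.Element("row")
        for key, value in sorted(row.items()):
            ET.SubElement(element, key).text = str(value)
        # Serialize only this element, so no large tree is kept in memory.
        xmlfile.write(ET.tostring(element))
    xmlfile.write(("</%s>" % root_tag).encode("utf-8"))
    # Rewind so the caller can iterate over the file from the start.
    xmlfile.seek(0)
    return xmlfile
```

The returned file object can be read or iterated from the beginning, which is the same role xmlfile plays when passed as app_iter.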

The problems I've encountered:

1. If I use SpooledTemporaryFile, it eats up memory that apparently stays with Python and is thus not returned to the OS, so I'll just use tmpfs to minimize disk writes for tens of thousands of rows.
2. The whole application apparently uses 121M of memory as shown by top, while guppy/heapy says 18M of heap is used. What follows is the output from uwsgi.log. I restarted the app, and there was no request to it except this single request for a large XML "file". The resulting output "file" is 7M.
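On point 1, a small sketch of SpooledTemporaryFile with an explicit cap may be useful for comparison. The post does not say which max_size was used; as far as I know, with the default max_size=0 the data is never rolled over to disk by size alone, so everything written stays in memory. The helper name spool_chunks and the 1 MiB threshold are assumptions for illustration only.

```python
import tempfile

# Hypothetical threshold: keep at most 1 MiB in memory before spilling to disk.
MAX_IN_MEMORY = 1024 * 1024


def spool_chunks(chunks, max_size=MAX_IN_MEMORY):
    """Write byte chunks to a spooled temp file and return it rewound."""
    spooled = tempfile.SpooledTemporaryFile(max_size=max_size)
    for chunk in chunks:
        # Once the total size exceeds max_size, the data moves to a real file.
        spooled.write(chunk)
    spooled.seek(0)
    return spooled
```

Whether a capped spool would behave better than a plain TemporaryFile on tmpfs for this workload is something that would have to be measured.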

================ BEFORE TEMPFILE
Partition of a set of 192674 objects. Total size = 16970368 bytes.
Index   Count    %      Size    %  Cumulative    %  Kind (class / dict of class)
    0   80163   42   6728540   40     6728540   40  str
    1   44155   23   1673300   10     8401840   50  tuple
    2    7068    4   1554336    9     9956176   59  dict (no owner)
    3   10675    6    725900    4    10682076   63  types.CodeType
    4    1539    1    687936    4    11370012   67  type
    5     475    0    654424    4    12024436   71  dict of module
    6   11259    6    630504    4    12654940   75  function
    7    1539    1    614808    4    13269748   78  dict of type
    8    3544    2    262048    2    13531796   80  list
    9    1817    1    235916    1    13767712   81  unicode
<593 more rows. Type e.g. '_.more' to view.>

================ BEFORE ITERATOR RETURN
Partition of a set of 194294 objects. Total size = 18709172 bytes.
Index   Count    %      Size    %  Cumulative    %  Kind (class / dict of class)
    0   80558   41   6739812   36     6739812   36  str
    1   44243   23   1676316    9     8416128   45  tuple
    2    7214    4   1596080    9    10012208   54  dict (no owner)
    3       1    0   1573008    8    11585216   62  sqlalchemy.orm.identity.WeakInstanceDict
    4   10675    5    725900    4    12311116   66  types.CodeType
    5    1539    1    687936    4    12999052   69  type
    6     475    0    654424    3    13653476   73  dict of module
    7   11264    6    630784    3    14284260   76  function
    8    1539    1    614808    3    14899068   80  dict of type
    9    3679    2    268960    1    15168028   81  list
<628 more rows. Type e.g. '_.more' to view.>

The idle, just started application is reported by top to use:

VIRT: 36216
RES: 23m
SHR: 5276

After a single request to the above XML view:

VIRT: 135m
RES: 121m
SHR: 3692

I'm reporting the worker uwsgi process because the master parent process is unchanged (23m).

I am obviously doing something wrong or not understanding the processes involved: heapy shows little change before and after processing, with a total heap of 18M, yet top says 120M is eaten by that process, and the resulting XML "file" is only 7M.
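To put the two numbers side by side in the same log, one option is to record the process's resident set size next to each hp.heap() dump. The sketch below assumes a Linux /proc filesystem; read_rss_kib is a hypothetical helper, not part of guppy or uwsgi. Note that heapy only accounts for Python objects it can reach, while RES in top covers everything the process holds, which may include memory allocated by C libraries; whether that explains the gap here is not established.

```python
import os


def read_rss_kib(pid=None):
    """Return the resident set size in KiB from /proc/<pid>/status.

    Returns None when the file or the VmRSS field is unavailable
    (for example on non-Linux systems).
    """
    if pid is None:
        pid = os.getpid()
    path = "/proc/%d/status" % pid
    try:
        with open(path) as status:
            for line in status:
                # The line looks like: "VmRSS:     23456 kB"
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except (IOError, OSError):
        return None
    return None
```

Calling read_rss_kib() right before and after the row loop, alongside the existing heapy prints, would show at which step the resident size grows.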

.oO V Oo.



