Okay, followup on this problem.

I've now replaced one large lxml.etree root with chunked writing into a tempfile. The XML view basically does this (debug heapy output included):



hp = guppy.hpy()
print "================ BEFORE TEMPFILE"
print hp.heap()

xml_model = XMLModel()
xmlfile = tempfile.TemporaryFile()
xmlfile.write(xml_model.begin())        # write XML "doctype" and start root tag

for row in session.query(DataModel).filter(....).all():
    # turn single row into its XML representation using lxml.element to construct
    # it, and return as string with lxml.tostring(element), here to be written to tempfile

    xmlfile.write(xml_model.process_row(row))

xmlfile.write(xml_model.end())          # write closing root tag

print "================ BEFORE ITERATOR RETURN"
print hp.heap()

# Return file as app iterator
xmlfile.seek(0)
return Resopnse(app_itel=xmlfile, content_type="application/xml")




The problems I've encountered:

1. If I used SpooledTemporaryFile it eats up memory that is apparently Python's and thus not returned to OS, so I'll just use tmpfs to minimize disk writes on tens of thousands of rows
2. The whole application apparently uses 121M of memory as shown by top, while guppy/heapy says 18M of heap is used. What follows is the output of uwsgi.log, I've restarted the app and there is no
request to it except this single request for a large XML "file". The resulting output "file" is 7M.


================ BEFORE TEMPFILE
Partition of a set of 192674 objects. Total size = 16970368 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  80163  42  6728540  40   6728540  40 str
     1  44155  23  1673300  10   8401840  50 tuple
     2   7068   4  1554336   9   9956176  59 dict (no owner)
     3  10675   6   725900   4  10682076  63 types.CodeType
     4   1539   1   687936   4  11370012  67 type
     5    475   0   654424   4  12024436  71 dict of module
     6  11259   6   630504   4  12654940  75 function
     7   1539   1   614808   4  13269748  78 dict of type
     8   3544   2   262048   2  13531796  80 list
     9   1817   1   235916   1  13767712  81 unicode
<593 more rows. Type e.g. '_.more' to view.>
================ BEFORE ITERATOR RETURN
Partition of a set of 194294 objects. Total size = 18709172 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  80558  41  6739812  36   6739812  36 str
     1  44243  23  1676316   9   8416128  45 tuple
     2   7214   4  1596080   9  10012208  54 dict (no owner)
     3      1   0  1573008   8  11585216  62 sqlalchemy.orm.identity.WeakInstanceDict
     4  10675   5   725900   4  12311116  66 types.CodeType
     5   1539   1   687936   4  12999052  69 type
     6    475   0   654424   3  13653476  73 dict of module
     7  11264   6   630784   3  14284260  76 function
     8   1539   1   614808   3  14899068  80 dict of type
     9   3679   2   268960   1  15168028  81 list
<628 more rows. Type e.g. '_.more' to view.>



The idle, just started application is reported by top to use:

VIRT: 36216
RES: 23m
SHR: 5276


After single request to the above XML view:

VIRT: 135m
RES: 121m
SHR: 3692

I'm reporting the worker uwsgi process because the master parent process is unchanged (23m).


I am obviously doing something wrong or not understanding processes involved, because the heapy is showing little change before and after processing, total heap of 18M and top says 120M is eaten by that process, and the resulting XML "file" is only 7M large.


.oO V Oo.


--
You received this message because you are subscribed to the Google Groups "pylons-discuss" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/pylons-discuss?hl=en.

Reply via email to