I don't think it's a leak, because the consumed memory is constant, i.e. the maximum requested at some point in the life of the process, and it stays there for the same data set regardless of how often it is requested.

I'm not constructing the XML from strings directly but with lxml (not the built-in xml.etree, because lxml's C extensions are much faster and I use it for parsing and XSD validation as well), which is perhaps overkill. Yes, I can see several ways to chop the generation up into smaller pieces, append them to a temp file on disk, and then return that with a generator or even X-Sendfile, but I was hoping for a more "straightforward" solution not requiring a rewrite.
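(To sketch what I mean by that, with made-up helper names: the view would write serialized fragments to a temp file and return a generator that streams the file and cleans it up afterwards.)

```python
import os
import tempfile

def build_xml_to_tempfile(fragments):
    """Write serialized XML fragments to a temp file instead of
    holding the whole document in memory at once."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    with os.fdopen(fd, "wb") as f:
        for frag in fragments:
            f.write(frag)
    return path

def stream_file(path, chunk_size=64 * 1024):
    """WSGI-friendly generator: yield the file in chunks and
    delete it once iteration finishes."""
    try:
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                yield chunk
    finally:
        os.unlink(path)
```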

I've got about 20 XML file formats to construct, each produced by its own (pluggable) module because each defines its own format, etc. Switching to file-based partial generation would mean a massive rewrite. I guess this is one of those situations where "premature optimization is the root of all evil" bit me, because I _was_ concerned with performance when I started building the system. Admittedly, at that point I wasn't aware of Python's memory-retention "policy".

I was hoping there's a way to safely kill a wsgi worker from within it; I could do that only when such largish XML files are requested. Or maybe there's something else not obvious to me; it doesn't have to be uwsgi, though. Another approach would be to move XML generation out of the wsgi application altogether into a separate callable and do some subprocess.Popen magic with minimal rewrite required. Is that even wise/advisable under (u)wsgi, spawning external processes?
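(Something along these lines, using multiprocessing rather than raw Popen; all names here are illustrative, and the POSIX "fork" start method is assumed. The point is that everything the child allocates goes back to the OS when it exits.)

```python
import multiprocessing

# Use the fork start method (POSIX only) so the target callable
# does not need to be importable/picklable in a fresh interpreter.
_ctx = multiprocessing.get_context("fork")

def _write_xml(build_xml, path, args):
    # Runs in the child process; all memory it allocates is
    # returned to the OS when the child exits.
    data = build_xml(*args)
    with open(path, "wb") as f:
        f.write(data)

def generate_in_child(build_xml, path, *args):
    """Run build_xml in a short-lived child process, leaving the
    serialized document on disk at `path`."""
    p = _ctx.Process(target=_write_xml, args=(build_xml, path, args))
    p.start()
    p.join()
    return p.exitcode == 0
```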


Thanks,

.oO V Oo.


On 05/06/2012 02:37 AM, Chris McDonough wrote:
On 05/05/2012 07:42 PM, Vlad K. wrote:

Hi all.


As I understand it, Python won't release internally freed memory back
to the OS. So in my Pyramid app, which occasionally has to
process/construct rather largish XML files, the memory consumption
jumps to several hundred MB and stays there for good, or until I
recycle the processes (uwsgi reload). This is pretty bad in my case because I'm
forced to have a rather large memory headroom on the server just to
handle those peaks which comprise less than 1% of total requests
throughout the day. Hell, it's way less than 1%. A single request for
such XML file takes anywhere from 1 to up to 10 seconds on this
particular VPS "hardware", and there are maybe a few dozen such requests
throughout the day (out of several thousand daily requests). So what
should really be a transient memory consumption peak lasting up to 10
seconds becomes permanent memory requirement (or until the uwsgi
processes are reloaded).

Now I know I can set max-requests for the uwsgi processes and they will
recycle automatically, but is there another solution? I find this
reluctance of Python to return memory to the OS very annoying, right
next to the GIL.
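(For reference, the relevant uwsgi ini options look roughly like this; the values are just illustrative:)

```ini
[uwsgi]
; recycle a worker after it has handled this many requests
max-requests = 1000
; or recycle a worker as soon as its resident memory exceeds this (MB)
reload-on-rss = 256
```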

Commenting on how Python does or doesn't return memory to the OS is above my current pay grade. Because I don't understand it, and have no competence to try to "fix" it, I have to work around it. I usually do that by writing code that doesn't ask the OS for hundreds of megabytes all in one shot.

For example, maybe you can construct the XML in portions instead of constructing a huge collection of strings that all reside in memory at once.
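(A rough illustration of that idea, with made-up element names: the document is produced as a stream of serialized fragments, so only one fragment is in memory at a time.)

```python
from xml.sax.saxutils import escape

def iter_items_xml(items):
    """Yield the document piece by piece rather than building
    one huge tree or string in memory."""
    yield '<?xml version="1.0" encoding="UTF-8"?>\n<items>'
    for name, value in items:
        # escape() handles &, < and > in text content
        yield '<item name="%s">%s</item>' % (escape(name), escape(value))
    yield "</items>"
```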

If there's a memory leak somewhere outside of my control, what I usually do to work around it is to use supervisor + memmon to automatically restart my processes when they use more than some amount of memory:

http://www.plope.com/Members/chrism/memmon_sample
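The supervisor side of that looks something like this (the 200MB threshold is just an example):

```ini
[eventlistener:memmon]
; restart any supervised process whose RSS exceeds 200 MB
command = memmon -a 200MB
events = TICK_60
```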

This isn't immediately useful if your processes are spawned by uwsgi, but might be useful if you choose to use a different frontend server like Apache or nginx to proxy to a number of backend supervisor-managed Pyramid processes.

- C


--
You received this message because you are subscribed to the Google Groups "pylons-discuss" group.