> On 20 Mar 2016, at 1:10 AM, Kent Bower <[email protected]> wrote: > > Thanks Graham, few more items inline... > > On Sat, Mar 19, 2016 at 1:24 AM, Graham Dumpleton <[email protected] > <mailto:[email protected]>> wrote: > >> On 17 Mar 2016, at 11:28 PM, Kent Bower <[email protected] >> <mailto:[email protected]>> wrote: >> >> My answers are below, but before you peek, Graham, note that you and I have >> been through this memory discussion before & I've read the vertical >> partitioning article and use inactivity-timeout, "WSGIRestrictEmbedded On", >> considered maximum-requests, etc. >> >> After years of this, I'm resigned to the fact that python is memory hungry, >> especially built on many of these web-stack and database libraries, etc. >> I'm Ok with that. I'm fine with a high-water RAM mark imposed by running >> under Apache, mostly. But, dang, it sure would be great if the 1 or 2% of >> requests that really (and legitimately) hog a ton of RAM, like, say 500MB >> extra, didn't keep it when done. I may revisit vertical partitioning again, >> but last time I did I think I found that the 1 or 2% in my case generally >> won't be divisible by url. In most cases I wouldn't know whether the >> particular request is going to need lots of RAM until after the database >> queries return (which is far too late for vertical partitioning to be >> useful). >> >> So I was mostly just curious about the status of nginx running wsgi, which >> doesn't solve python's memory piggishness, but would at least relinquish the >> extra RAM once python garbage collected. > > Where have you got the idea that using nginx would result in memory being > released back to the OS once garbage collected? It isn’t able to do that. > > The situations are very narrow as to when a process is able to give back > memory to the operating system. It can only be done when the now free memory > was at top of allocated memory. 
> This generally only happens for large block allocations and not in normal
> circumstances for a running Python application.
>
>
> At this point I'm not sure where I got that idea, but I'm surprised at this.
> For example, my previous observations of paster running wsgi were that it is
> quite faithful at returning free memory to the OS. Was I just getting lucky,
> or would paster be different for some reason?
>
> In any case, if nginx won't solve that, then I can't see any reason to even
> consider it over apache/mod_wsgi. Thank you for answering that.
>
>
>> (Have you considered a max-memory parameter to mod_wsgi that would
>> gracefully stop taking requests and shutdown after the threshold is reached
>> for platforms that would support it? I recall -- maybe incorrectly -- you
>> saying on Windows or certain platforms you wouldn't be able to support that.
>> What about the platforms that could support it? It seems to me to be the
>> very best way mod_wsgi could approach this Apache RAM nuance, so seems like
>> it would be tremendously useful for the platforms that could support it.)
>
> You can do this yourself rather easily with a more recent mod_wsgi version.
>
> If you create a background thread from a WSGI script file, in a similar way
> to what the monitor for code changes does in:
>
>   http://modwsgi.readthedocs.org/en/develop/user-guides/debugging-techniques.html#extracting-python-stack-traces
>
> but instead of looking for code changes, inside the main loop of the
> background thread do:
>
>     import os
>     import signal
>     import mod_wsgi
>
>     # MYMEMORYTHRESHOLD is a limit you define yourself.
>     metrics = mod_wsgi.process_metrics()
>
>     if metrics['memory_rss'] > MYMEMORYTHRESHOLD:
>         # Ask this daemon process to restart gracefully.
>         os.kill(os.getpid(), signal.SIGUSR1)
>
> So mod_wsgi provides the way of determining the amount of memory without
> resorting to importing psutil, which is quite fat in itself, but how you use
> it is up to you.
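A fuller sketch of the background-thread recipe above, for illustration only. It assumes mod_wsgi.process_metrics() reports 'memory_rss' in bytes, and the names MEMORY_THRESHOLD, check_memory, monitor and start_monitor are hypothetical, not part of mod_wsgi. The check takes its memory reader and restart action as parameters so it can be exercised outside Apache.

```python
import os
import signal
import threading

try:
    import mod_wsgi  # only importable inside a mod_wsgi process
except ImportError:
    mod_wsgi = None

# Hypothetical limit, assumed to be in bytes; tune for your deployment.
MEMORY_THRESHOLD = 1024 * 1024 * 1024


def current_rss():
    """Return resident memory as reported by mod_wsgi."""
    return mod_wsgi.process_metrics()['memory_rss']


def request_graceful_restart():
    """Send SIGUSR1 to this process, as in the recipe above."""
    os.kill(os.getpid(), signal.SIGUSR1)


def check_memory(read_rss=current_rss, restart=request_graceful_restart,
                 threshold=MEMORY_THRESHOLD):
    """Run one check; return True if a restart was requested."""
    if read_rss() > threshold:
        restart()
        return True
    return False


def monitor(interval=10.0, stop=None, **kwargs):
    """Poll until a restart has been requested or stop is set."""
    stop = stop or threading.Event()
    while not stop.is_set():
        if check_memory(**kwargs):
            break  # signal only once, then let the process drain
        stop.wait(interval)


def start_monitor(interval=10.0):
    """Start the watchdog as a daemon thread; call once from the WSGI script."""
    thread = threading.Thread(target=monitor, args=(interval,), daemon=True)
    thread.start()
    return thread
```

Calling start_monitor() once at import time of the WSGI script file would start the polling loop. The loop exits after the first signal so that the process is not signalled repeatedly while it waits out graceful-timeout.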
> > > Right, that's an idea; (could even be a shell script that takes this > approach, I suppose, but I like your recipe.) > > Unfortunately, I don't want to automate bits that can feasibly clobber > blocked sessions. SIGUSR1, after graceful-timeout & shutdown-timeout, can > result in ungraceful killing. Our application shares a database with an old > legacy application which was poorly written to hold transactions while > waiting on user input (this was apparently common two decades ago). So, > unfortunately, it isn't terribly uncommon that our application is blocked at > the database level waiting for someone using the legacy application who has a > record(s) locked and may not even be at their desk or may have gone to lunch. > Sometimes our client's IT staff has to hunt down these people or decide to > kill their database session. In any case, from a professional point of view, > our application should be the responsible one and wait patiently, allowing > our client's IT staff the choice of how to handle those cases. So, while the > likelihood is pretty low, even with graceful-timeout & shutdown-timeout set > at a very high value like 5 minutes, I still run the risk of killing > legitimate sessions with SIGUSR1. (I've brought this up before and you > didn't agree with my gripe and I do understand why, but in my use case, I > don't feel I can automate that route responsibly.... we do use SIGUSR1 > manually sometimes, when we can monitor and react to cases where a session is > blocked at the database level.)
If we have discussed it previously, then I may not have anything more to add.

Did I previously suggest offloading these memory-consuming tasks behind a job
queue run under Celery or something else? That way they are out of the web
server processes at least.

> inactivity-timeout doesn't present this concern: it won't ever kill anything,
> just silently restarts like a good boy when inactive. I've recently
> reconsidered dropping that way down from 30 minutes. (When I first
> implemented this, it was just to reclaim RAM at the end of the day, so that's
> why it is 30 minutes. I didn't like the idea of churning new processes
> during busy periods, but I've been thinking 1 or 2 minutes may be quite
> reasonable.)
>
> If I could signal processes to shutdown at their next opportunity (meaning
> the next time they are handling no requests, like inactivity-timeout), that
> would solve many issues in this regard for me because I could signal these
> processes when their RAM consumption is high and let them restart when
> "convenient," being the ultimate in gracefulness. SIGUSR2 could mean "the
> next time you are completely idle," while SIGUSR1 continues to mean
> "initiate shutdown now."

That is what SIGUSR1 does if you set graceful-timeout large enough. It is
SIGINT or SIGTERM which is effectively "initiate shutdown now".

So there shouldn’t be a need to have a SIGUSR2, as SIGUSR1 should already do
what you are hoping for with a reasonable setting of graceful-timeout.

>
> Do note that if using SIGUSR1 to restart the current process (which should
> only be done for daemon mode), you should also set the graceful-timeout option
> to WSGIDaemonProcess if you have long running requests. It is the maximum time
> the process will wait to shutdown while still waiting for requests when doing a
> SIGUSR1 graceful shutdown of the process, before going into forced shutdown
> mode where no requests will be accepted and requests can be interrupted.
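For the case discussed above of signalling a daemon process from outside once its RAM use is high, here is a minimal sketch. It is Linux-only, since it reads VmRSS from /proc/<pid>/status, and the helper names parse_vm_rss_kb, read_status and signal_if_over are hypothetical. How long the process then waits before shutting down is governed by graceful-timeout, as described above; the sketch only decides whether to send SIGUSR1.

```python
import os
import signal


def parse_vm_rss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith('VmRSS:'):
            return int(line.split()[1])
    return None  # no VmRSS line, e.g. for kernel threads


def read_status(pid):
    """Read /proc/<pid>/status (Linux only)."""
    with open('/proc/%d/status' % pid) as f:
        return f.read()


def signal_if_over(pid, threshold_kb, read=read_status, send=os.kill):
    """Send SIGUSR1 to pid if its resident memory exceeds threshold_kb."""
    rss_kb = parse_vm_rss_kb(read(pid))
    if rss_kb is not None and rss_kb > threshold_kb:
        send(pid, signal.SIGUSR1)
        return True
    return False
```

The read and send parameters exist so the decision logic can be tried against canned input before it is pointed at real daemon process IDs.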
>
>> Here
>> (http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html)
>> you discuss nginx's tendency to block requests that may otherwise be
>> executing in a different process, depending on timing, etc. Is this issue
>> still the same? (I thought I read a hint somewhere that there may be a
>> workaround for that, so I ask.)
>
> That was related to someone's attempt to embed a Python interpreter inside
> of nginx processes themselves. That project died a long time ago. No one
> embeds Python interpreters inside of nginx processes. It was a flawed design.
>
> I don’t know what you are reading to get all these strange ideas. :-)
>
>
> Google, I suppose ;) That's why I finally asked you when I couldn't find
> anything more about it via Google.
>
>
>> And so I wanted your opinion on nginx...
>>
>> ====
>> Here is what you asked for if it can still be useful.
>>
>> I'm on mod_wsgi-4.4.6 and the particular server that prompted me this time
>> is running Apache 2.4 (prefork), though some of our clients use 2.2
>> (prefork).
>>
>> Our typical wsgi conf setting is something like this, though threads and
>> processes vary depending on server size:
>>
>> LoadModule wsgi_module modules/mod_wsgi.so
>> WSGIPythonHome /home/rarch/tg2env
>> # see http://code.google.com/p/modwsgi/issues/detail?id=196#c10 concerning timeouts
>> WSGIDaemonProcess rarch processes=20 threads=14 inactivity-timeout=1800 display-name=%{GROUP} graceful-timeout=5 python-eggs=/home/rarch/tg2env/lib/python-egg-cache
>
> Is your web server really going to be idle for 30 minutes? I can’t see how
> that would have been doing anything.
>
> Also, in mod_wsgi 4.x the point at which inactivity-timeout kicks in has
> changed.
>
> It used to apply when there were active requests and they were blocked, as
> well as when no requests were running.
>
> Now it only applies to the case where there are no requests.
>
> The case for running but blocked requests is now handled by request-timeout.
>
> You may be better off setting request-timeout now to be a more reasonable
> value for your expected longest request, but set inactivity-timeout to
> something much shorter.
>
> So I suggest you play with that.
>
> Also, are your request handlers I/O or CPU intensive, and how many requests
> do you handle?
>
> Such a high number of processes and threads always screams to me that half
> the performance problems are due to setting these too high, invoking
> pathological OS process swapping issues and Python GIL issues.
>
>
>
> Yes, the requests are I/O intensive (that is, database intensive, which adds
> a huge overhead to our typical request). Often requests finish in under a
> second or two, but they also can take many seconds (not terrible for the
> user, but sometimes they do a lot of processing with many trips to the
> database).
> We have several clients (companies), so the number of requests varies widely,
> but can get pretty heavy on busy days (like Black Friday, since they are in
> retail). We've played with those numbers quite a bit and without high
> numbers like that, responsiveness suffers because we backlog due to requests
> often taking several seconds.
>
> Thanks for all your input, you've been tremendously helpful!
> Kent
>
>
>
>> WSGIProcessGroup rarch
>> WSGISocketPrefix run/wsgi
>> WSGIRestrictStdout Off
>> WSGIRestrictStdin On
>> # Memory tweak.
>> # http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html
>> WSGIRestrictEmbedded On
>> WSGIPassAuthorization On
>>
>> # we'll make the /tg/ directory resolve as the wsgi script
>> WSGIScriptAlias /tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py process-group=rarch application-group=%{GLOBAL}
>> WSGIScriptAlias /debug/tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py process-group=rarch application-group=%{GLOBAL}
>>
>> MaxRequestsPerChild 0
>> <IfModule prefork.c>
>> MaxClients 308
>> ServerLimit 308
>> </IfModule>
>> <IfModule worker.c>
>> ThreadsPerChild 25
>> MaxClients 400
>> ServerLimit 16
>> </IfModule>
>>
>>
>> Thanks for all your help and for excellent software!
>> Kent
>>
>>
>> On Wed, Mar 16, 2016 at 7:27 PM, Graham Dumpleton <[email protected]> wrote:
>> On the question of whether nginx will solve this problem, I can’t see how.
>>
>> When one talks about nginx and Python web applications, it is only as a
>> proxy for HTTP requests to some backend WSGI server. The Python web
>> application doesn’t run in nginx itself. So memory issues and how to deal
>> with them are the province of the WSGI server used, whatever that is, and
>> not nginx.
>>
>> Anyway, answer the questions below and we can start with that.
>>
>> You really want to be using a recent mod_wsgi version and not Apache 2.2.
>>
>> Apache 2.2 design has various issues and bad configuration defaults which
>> means it can gobble up more memory than you want. Recent mod_wsgi versions
>> have workarounds for Apache 2.2 issues and are much better at eliminating
>> those Apache 2.2 issues. Recent mod_wsgi versions also have fixes for memory
>> usage problems in some corner cases. As far as what I mean by recent, I
>> recommend 4.4.12 or later. The most recent version is 4.4.21. If you are
>> stuck with 3.4 or 3.5 from your Linux distro, that is not good and may
>> increase problems.
>>
>> So long as you have a recent mod_wsgi version, you can look at using
>> vertical partitioning to farm out memory hungry request handlers to their
>> own daemon process group, and better configure those to handle that and
>> recycle processes based on activity or memory usage. A blog post related to
>> that is:
>>
>> * http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html
>>
>> Graham
>>
>>> On 17 Mar 2016, at 7:15 AM, Graham Dumpleton <[email protected]> wrote:
>>>
>>> What version of mod_wsgi and Apache are you using?
>>>
>>> Are you stuck with old versions of both?
>>>
>>> For memory tracking there are API calls mod_wsgi provides in recent
>>> versions for getting memory usage which can be used as part of a scheme to
>>> trigger a process restart. You can’t use sys.exit(), but can use signals to
>>> trigger a clean shutdown of a process. Again, it is better to have a recent
>>> mod_wsgi version, as you can then also set up some graceful timeout options
>>> for a signal induced restart.
>>>
>>> Also, what is your mod_wsgi configuration, so I can make sure you are doing
>>> all the typical things one would do to limit memory usage, or quarantine
>>> particular handlers which are memory hungry?
>>>
>>> Graham
>>>
>>>> On 17 Mar 2016, at 4:29 AM, Kent Bower <[email protected]> wrote:
>>>>
>>>> Interesting idea.. yes, we are using multiple threads and also other
>>>> stack frameworks, so that's not straightforward, but worth thinking
>>>> about... not sure how to approach that with the other threads. Thank you
>>>> Bill.
>>>> >>>> On Wed, Mar 16, 2016 at 1:11 PM, Bill Freeman <[email protected] >>>> <mailto:[email protected]>> wrote: >>>> I don't know about nginx, but one possibility, if the large memory >>>> requests are infrequent, is to detect when you have completed one and >>>> trigger the exit/reload of the daemon process (calling sys.exit() is not >>>> the way, since there could be other threads in the middle of something, >>>> unless you run one thread per process). >>>> >>>> On Wed, Mar 16, 2016 at 7:50 AM, Kent <[email protected] >>>> <mailto:[email protected]>> wrote: >>>> I'm looking for a very brief high-level pros vs. cons of wsgi under apache >>>> vs. under nginx and then to be pointed to more details I can study myself >>>> (or at least the latter). >>>> >>>> Our application occasionally allows requests that consume a large amount >>>> of RAM (no obvious way around that, they are valid requests) and >>>> occasionally this causes problems since we can't reclaim the RAM readily >>>> from apache. (We already have tweaked with and do use >>>> "inactivity-timeout". This helps, but still now and then we hit problems >>>> where we run into swapping to disk.) >>>> >>>> I'm wondering if nginx may solve this problem. I've read much of what you >>>> (Graham) have had to say about the memory strategies with apache and >>>> mod_wsgi, but wonder what your opinion of nginx is and where you've >>>> already discussed this. I've read articles I could find you've written on >>>> nginx, such as "Blocking requests and nginx version of mod_wsgi," but >>>> wonder if the same weaknesses are still applicable today, 7 years later? >>>> >>>> >>>> Thank you very much in advance! >>>> Kent >>>> >>>> -- >>>> You received this message because you are subscribed to the Google Groups >>>> "modwsgi" group. >>>> To unsubscribe from this group and stop receiving emails from it, send an >>>> email to [email protected] >>>> <mailto:[email protected]>. 
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/modwsgi.
>>>> For more options, visit https://groups.google.com/d/optout.
