> On 22 Mar 2016, at 4:01 AM, Kent Bower <[email protected]> wrote:
>
> In your recipe for a background monitoring thread watching memory
> consumption, after issuing the SIGUSR1, I'd probably just want the thread to
> exit instead of sleeping... do I just do "sys.exit()" to safely accomplish
> that?
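For readers following along, here is a minimal sketch of the kind of monitor thread being asked about, written so that it also runs outside Apache. It is not the exact recipe from the mod_wsgi documentation. The names monitor, start_monitor, mod_wsgi_rss and send_sigusr1, the polling interval, and the injectable request_restart callback are assumptions made for illustration; only mod_wsgi.process_metrics()['memory_rss'] and the use of SIGUSR1 come from the thread itself.

```python
import atexit
import os
import queue
import signal
import threading


def mod_wsgi_rss():
    """Resident memory as reported by mod_wsgi (call quoted in the thread)."""
    import mod_wsgi  # only importable inside a mod_wsgi process
    return mod_wsgi.process_metrics()["memory_rss"]


def send_sigusr1():
    """Ask for a graceful restart of the current daemon process."""
    os.kill(os.getpid(), signal.SIGUSR1)


def monitor(get_rss, threshold, interval, shutdown_queue, request_restart):
    """Poll memory use until something arrives on the shutdown queue."""
    signalled = False
    while True:
        try:
            # This is the "sleep": it returns early as soon as the atexit
            # callback puts something on the queue.
            shutdown_queue.get(timeout=interval)
            return
        except queue.Empty:
            pass
        if not signalled and get_rss() > threshold:
            request_restart()
            signalled = True  # signal once, then just wait for shutdown


def start_monitor(get_rss, threshold, interval=1.0,
                  request_restart=send_sigusr1):
    """Start the monitor thread; returns (thread, stop). Hypothetical helper."""
    shutdown_queue = queue.Queue()
    thread = threading.Thread(
        target=monitor,
        args=(get_rss, threshold, interval, shutdown_queue, request_restart),
        daemon=True,
    )
    thread.start()

    def stop():
        # Wake the thread and wait for it, so it cannot still be running
        # while the interpreter is being torn down.
        shutdown_queue.put(True)
        thread.join()

    atexit.register(stop)
    return thread, stop
```

In a WSGI script file one would presumably call something like start_monitor(mod_wsgi_rss, threshold=MYMEMORYTHRESHOLD) once at import time. The threshold must be in whatever units the memory function returns.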

The code isn’t just sleeping. It waits on a queue object, which has something placed on it by an atexit callback when mod_wsgi is shutting down the process. When the thread receives that item it exits cleanly, and the main thread waits for it to exit to ensure it is no longer running.

If you just call sys.exit(), that raises a SystemExit exception, which causes the thread to exit but leaves an exception in the error logs.

Using the queue is better because it ensures that threads are shut down properly when the process is shutting down. Otherwise you risk the thread trying to run while the interpreter is being destroyed, causing Python to crash the process.

> Also, regarding my observations of paster returning garbage-collected memory
> to the OS, was I just getting lucky while monitoring (the memory was at the
> very top of the allocated memory)? This is a universal python issue?

It is a universal issue with any program running on a UNIX system. You may want to search for some articles on how memory allocation works in UNIX as well as in Python.

>
> Again, thanks for all your help!
>
> On Sat, Mar 19, 2016 at 11:22 PM, Graham Dumpleton
> <[email protected]> wrote:
>
>> On 20 Mar 2016, at 1:10 AM, Kent Bower <[email protected]> wrote:
>>
>> Thanks Graham, few more items inline...
>>
>> On Sat, Mar 19, 2016 at 1:24 AM, Graham Dumpleton
>> <[email protected]> wrote:
>>
>>> On 17 Mar 2016, at 11:28 PM, Kent Bower <[email protected]> wrote:
>>>
>>> My answers are below, but before you peek, Graham, note that you and I have
>>> been through this memory discussion before & I've read the vertical
>>> partitioning article and use inactivity-timeout, "WSGIRestrictEmbedded On",
>>> considered maximum-requests, etc.
>>>
>>> After years of this, I'm resigned to the fact that python is memory hungry,
>>> especially built on many of these web-stack and database libraries, etc.
>>> I'm Ok with that. I'm fine with a high-water RAM mark imposed by running
>>> under Apache, mostly. But, dang, it sure would be great if the 1 or 2% of
>>> requests that really (and legitimately) hog a ton of RAM, like, say 500MB
>>> extra, didn't keep it when done. I may revisit vertical partitioning
>>> again, but last time I did I think I found that the 1 or 2% in my case
>>> generally won't be divisible by url. In most cases I wouldn't know whether
>>> the particular request is going to need lots of RAM until after the
>>> database queries return (which is far too late for vertical partitioning to
>>> be useful).
>>>
>>> So I was mostly just curious about the status of nginx running wsgi, which
>>> doesn't solve python's memory piggishness, but would at least relinquish
>>> the extra RAM once python garbage collected.
>>
>> Where have you got the idea that using nginx would result in memory being
>> released back to the OS once garbage collected? It isn’t able to do that.
>>
>> The situations are very narrow as to when a process is able to give back
>> memory to the operating system. It can only be done when the now free memory
>> was at top of allocated memory. This generally only happens for large block
>> allocations and not in normal circumstances for a running Python application.
>>
>>
>> At this point I'm not sure where I got that idea, but I'm surprised at this.
>> For example, my previous observations of paster running wsgi were that it
>> is quite faithful at returning free memory to the OS. Was I just getting
>> lucky, or would paster be different for some reason?
>>
>> In any case, if nginx won't solve that, then I can't see any reason to even
>> consider it over apache/mod_wsgi. Thank you for answering that.
>>
>>
>>> (Have you considered a max-memory parameter to mod_wsgi that would
>>> gracefully stop taking requests and shutdown after the threshold is reached
>>> for platforms that would support it? I recall -- maybe incorrectly -- you
>>> saying on Windows or certain platforms you wouldn't be able to support
>>> that. What about the platforms that could support it? It seems to me to
>>> be the very best way mod_wsgi could approach this Apache RAM nuance, so
>>> seems like it would be tremendously useful for the platforms that could
>>> support it.)
>>
>> You can do this yourself rather easily with a more recent mod_wsgi version.
>>
>> If you create a background thread from a WSGI script file, in a similar way
>> to what the monitor for code changes does in:
>>
>>
>> http://modwsgi.readthedocs.org/en/develop/user-guides/debugging-techniques.html#extracting-python-stack-traces
>>
>> but instead of looking for code changes, inside the main loop of the
>> background thread do:
>>
>>     import os
>>     import signal
>>     import mod_wsgi
>>
>>     metrics = mod_wsgi.process_metrics()
>>
>>     if metrics['memory_rss'] > MYMEMORYTHRESHOLD:
>>         os.kill(os.getpid(), signal.SIGUSR1)
>>
>> So mod_wsgi provides the way of determining the amount of memory without
>> resorting to importing psutil, which is quite fat in itself, but how you use
>> it is up to you.
>>
>>
>> Right, that's an idea; (could even be a shell script that takes this
>> approach, I suppose, but I like your recipe.)
>>
>> Unfortunately, I don't want to automate bits that can feasibly clobber
>> blocked sessions. SIGUSR1, after graceful-timeout & shutdown-timeout, can
>> result in ungraceful killing. Our application shares a database with an old
>> legacy application which was poorly written to hold transactions while
>> waiting on user input (this was apparently common two decades ago).
>> So, unfortunately, it isn't terribly uncommon that our application is
>> blocked at the database level waiting for someone using the legacy
>> application who has a record(s) locked and may not even be at their desk or
>> may have gone to lunch. Sometimes our client's IT staff has to hunt down
>> these people or decide to kill their database session. In any case, from a
>> professional point of view, our application should be the responsible one
>> and wait patiently, allowing our client's IT staff the choice of how to
>> handle those cases. So, while the likelihood is pretty low, even with
>> graceful-timeout & shutdown-timeout set at a very high value like 5 minutes,
>> I still run the risk of killing legitimate sessions with SIGUSR1. (I've
>> brought this up before and you didn't agree with my gripe and I do
>> understand why, but in my use case, I don't feel I can automate that route
>> responsibly.... we do use SIGUSR1 manually sometimes, when we can monitor
>> and react to cases where a session is blocked at the database level.)
>
> If we have discussed it previously, then I may not have anything more to add.
>
> Did I previously suggest offloading these memory-consuming tasks behind a job
> queue run under Celery or something else? That way they are out of the web
> server processes at least.
>
>> inactivity-timeout doesn't present this concern: it won't ever kill
>> anything, just silently restarts like a good boy when inactive. I've
>> recently reconsidered dropping that way down from 30 minutes. (When I first
>> implemented this, it was just to reclaim RAM at the end of the day, so
>> that's why it is 30 minutes. I didn't like the idea of churning new
>> processes during busy periods, but I've been thinking 1 or 2 minutes may be
>> quite reasonable.)
>>
>> If I could signal processes to shutdown at their next opportunity (meaning
>> the next time they are handling no requests, like inactivity-timeout), that
>> would solve many issues in this regard for me because I could signal these
>> processes when their RAM consumption is high and let them restart when
>> "convenient," being the ultimate in gracefulness. SIGUSR2 could mean "the
>> next time you are completely idle," while SIGUSR1 continues to mean
>> "initiate shutdown now."
>
> That is what SIGUSR1 does if you set graceful-timeout large enough. It is
> SIGINT or SIGTERM which is effectively "initiate shutdown now". So there
> shouldn’t be a need to have a SIGUSR2, as SIGUSR1 should already do what you
> are hoping for with a reasonable setting of graceful-timeout.
>
>>
>> Do note that if using SIGUSR1 to restart the current process (which should
>> only be done for daemon mode), you should also set the graceful-timeout
>> option to WSGIDaemonProcess if you have long running requests. It is the
>> maximum time the process will wait to shut down while still waiting for
>> requests when doing a SIGUSR1 graceful shutdown of the process, before going
>> into forced shutdown mode where no requests will be accepted and requests
>> can be interrupted.
>>
>>> Here
>>> (http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html)
>>> you discuss nginx's tendency to block requests that may otherwise be
>>> executing in a different process, depending on timing, etc. Is this issue
>>> still the same (I thought I read a hint somewhere that there may be a
>>> workaround for that), so I ask.
>>
>> That was related to someone's attempt to embed a Python interpreter inside
>> of the nginx processes themselves. That project died a long time ago. No one
>> embeds Python interpreters inside of nginx processes. It was a flawed design.
>>
>> I don’t know what you are reading to get all these strange ideas. :-)
>>
>>
>> Google, I suppose ;) That's why I finally asked you when I couldn't find
>> anything more about it via Google.
>>
>>
>>
>>> And so I wanted your opinion on nginx...
>>>
>>> ====
>>> Here is what you asked for if it can still be useful.
>>>
>>> I'm on mod_wsgi-4.4.6 and the particular server that prompted me this time
>>> is running Apache 2.4 (prefork), though some of our clients use 2.2
>>> (prefork).
>>>
>>> Our typical wsgi conf setting is something like this, though threads and
>>> processes varies depending on server size:
>>>
>>> LoadModule wsgi_module modules/mod_wsgi.so
>>> WSGIPythonHome /home/rarch/tg2env
>>> # see http://code.google.com/p/modwsgi/issues/detail?id=196#c10 concerning timeouts
>>> WSGIDaemonProcess rarch processes=20 threads=14 inactivity-timeout=1800 \
>>>     display-name=%{GROUP} graceful-timeout=5 \
>>>     python-eggs=/home/rarch/tg2env/lib/python-egg-cache
>>
>> Is your web server really going to be idle for 30 minutes? I can’t see how
>> that would have been doing anything.
>>
>> Also, in mod_wsgi 4.x, when inactivity-timeout kicks in has changed.
>>
>> It used to apply when there were active requests and they were blocked, as
>> well as when no requests were running.
>>
>> Now it only applies to the case where there are no requests.
>>
>> The case for running but blocked requests is now handled by request-timeout.
>>
>> You may be better off setting request-timeout now to be a more reasonable
>> value for your expected longest request, but set inactivity-timeout to
>> something much shorter.
>>
>> So suggest you play with that.
>>
>> Also, are your request handlers I/O or CPU intensive, and how many requests?
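
As a sketch of the suggestion above (a request-timeout sized for the longest expected request, a much shorter inactivity-timeout, and a generous graceful-timeout for SIGUSR1 restarts), the daemon process line might look something like the following. All three timeout values are assumptions chosen for illustration, loosely based on the "1 or 2 minutes" and "5 minutes" figures mentioned elsewhere in the thread; they are not recommendations made by anyone in it.

```apache
# Illustrative values only; tune them to the longest request you expect.
WSGIDaemonProcess rarch processes=20 threads=14 display-name=%{GROUP} \
    inactivity-timeout=120 \
    request-timeout=300 \
    graceful-timeout=300
```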
>>
>> Such a high number of processes and threads always screams to me that half
>> the performance problems are due to setting these too high, invoking
>> pathological OS process swapping issues and Python GIL issues.
>>
>>
>>
>> Yes, the requests are I/O intensive (that is, database intensive, which adds
>> a huge overhead to our typical request). Often requests finish in under a
>> second or two, but they also can take many seconds (not terrible for the
>> user, but sometimes they do a lot of processing with many trips to the
>> database).
>> We have several clients (companies), so the number of requests varies
>> widely, but can get pretty heavy on busy days (like Black Friday, since they
>> are in retail). We've played with those numbers quite a bit and without
>> high numbers like that, responsiveness suffers because we backlog due to
>> requests often taking several seconds.
>>
>> Thanks for all your input, you've been tremendously helpful!
>> Kent
>>
>>
>>
>>> WSGIProcessGroup rarch
>>> WSGISocketPrefix run/wsgi
>>> WSGIRestrictStdout Off
>>> WSGIRestrictStdin On
>>> # Memory tweak. http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html
>>> WSGIRestrictEmbedded On
>>> WSGIPassAuthorization On
>>>
>>> # we'll make the /tg/ directory resolve as the wsgi script
>>> WSGIScriptAlias /tg \
>>>     /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py \
>>>     process-group=rarch application-group=%{GLOBAL}
>>> WSGIScriptAlias /debug/tg \
>>>     /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py \
>>>     process-group=rarch application-group=%{GLOBAL}
>>>
>>> MaxRequestsPerChild 0
>>> <IfModule prefork.c>
>>> MaxClients 308
>>> ServerLimit 308
>>> </IfModule>
>>> <IfModule worker.c>
>>> ThreadsPerChild 25
>>> MaxClients 400
>>> ServerLimit 16
>>> </IfModule>
>>>
>>>
>>> Thanks for all your help and for excellent software!
>>> Kent
>>>
>>>
>>> On Wed, Mar 16, 2016 at 7:27 PM, Graham Dumpleton
>>> <[email protected]> wrote:
>>> On the question of whether nginx will solve this problem, I can’t see how.
>>>
>>> When one talks about nginx and Python web applications, it is only as a
>>> proxy for HTTP requests to some backend WSGI server. The Python web
>>> application doesn’t run in nginx itself. So memory issues and how to deal
>>> with them are the province of the WSGI server used, whatever that is, and
>>> not nginx.
>>>
>>> Anyway, answer the questions below and we can start with that.
>>>
>>> You really want to be using a recent mod_wsgi version and not Apache 2.2.
>>>
>>> The Apache 2.2 design has various issues and bad configuration defaults,
>>> which means it can gobble up more memory than you want. Recent mod_wsgi
>>> versions have workarounds for Apache 2.2 issues and are much better at
>>> eliminating those Apache 2.2 issues. Recent mod_wsgi versions also have
>>> fixes for memory usage problems in some corner cases. As far as what I mean
>>> by recent, I recommend 4.4.12 or later. The most recent version is 4.4.21.
>>> If you are stuck with 3.4 or 3.5 from your Linux distro that is not good
>>> and that may increase problems.
>>>
>>> So long as you have a recent mod_wsgi version, you can look at using
>>> vertical partitioning to farm out memory hungry request handlers to their
>>> own daemon process group and better configure those to handle that and
>>> recycle processes based on activity or memory usage. A blog post related to
>>> that is:
>>>
>>> * http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html
>>>
>>> Graham
>>>
>>>> On 17 Mar 2016, at 7:15 AM, Graham Dumpleton <[email protected]> wrote:
>>>>
>>>> What version of mod_wsgi and Apache are you using?
>>>>
>>>> Are you stuck with old versions of both?
>>>>
>>>> For memory tracking there are API calls mod_wsgi provides in recent
>>>> versions for getting memory usage which can be used as part of scheme to
>>>> trigger a process restart. You can’t use sys.exit(), but can use signals
>>>> to trigger a clean shutdown of a process. Again better to have recent
>>>> mod_wsgi versions as can then also set up some graceful timeout options
>>>> for signal induced restart.
>>>>
>>>> Also, what is your mod_wsgi configuration so can make sure doing all the
>>>> typical things one would do to limit memory usage, or quarantine
>>>> particular handlers which are memory hungry?
>>>>
>>>> Graham
>>>>
>>>>> On 17 Mar 2016, at 4:29 AM, Kent Bower <[email protected]> wrote:
>>>>>
>>>>> Interesting idea.. yes, we are using multiple threads and also other
>>>>> stack frameworks, so that's not straightforward, but worth thinking
>>>>> about... not sure how to approach that with the other threads. Thank you
>>>>> Bill.
>>>>>
>>>>> On Wed, Mar 16, 2016 at 1:11 PM, Bill Freeman <[email protected]> wrote:
>>>>> I don't know about nginx, but one possibility, if the large memory
>>>>> requests are infrequent, is to detect when you have completed one and
>>>>> trigger the exit/reload of the daemon process (calling sys.exit() is not
>>>>> the way, since there could be other threads in the middle of something,
>>>>> unless you run one thread per process).
>>>>>
>>>>> On Wed, Mar 16, 2016 at 7:50 AM, Kent <[email protected]> wrote:
>>>>> I'm looking for a very brief high-level pros vs. cons of wsgi under
>>>>> apache vs. under nginx and then to be pointed to more details I can study
>>>>> myself (or at least the latter).
>>>>>
>>>>> Our application occasionally allows requests that consume a large amount
>>>>> of RAM (no obvious way around that, they are valid requests) and
>>>>> occasionally this causes problems since we can't reclaim the RAM readily
>>>>> from apache. (We already have tweaked with and do use
>>>>> "inactivity-timeout". This helps, but still now and then we hit
>>>>> problems where we run into swapping to disk.)
>>>>>
>>>>> I'm wondering if nginx may solve this problem. I've read much of what
>>>>> you (Graham) have had to say about the memory strategies with apache and
>>>>> mod_wsgi, but wonder what your opinion of nginx is and where you've
>>>>> already discussed this. I've read articles I could find you've written
>>>>> on nginx, such as "Blocking requests and nginx version of mod_wsgi," but
>>>>> wonder if the same weaknesses are still applicable today, 7 years later?
>>>>>
>>>>>
>>>>> Thank you very much in advance!
>>>>> Kent
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google Groups
>>>>> "modwsgi" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an
>>>>> email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at https://groups.google.com/group/modwsgi.
>>>>> For more options, visit https://groups.google.com/d/optout.
