Re: [modwsgi] nginx vs apache

Graham Dumpleton Sat, 19 Mar 2016 20:22:59 -0700

> On 20 Mar 2016, at 1:10 AM, Kent Bower <[email protected]> wrote:
> 
> Thanks Graham, few more items inline...
> 
> On Sat, Mar 19, 2016 at 1:24 AM, Graham Dumpleton <[email protected] 
> <mailto:[email protected]>> wrote:
> 
>> On 17 Mar 2016, at 11:28 PM, Kent Bower <[email protected] 
>> <mailto:[email protected]>> wrote:
>> 
>> My answers are below, but before you peek, Graham, note that you and I have 
>> been through this memory discussion before & I've read the vertical 
>> partitioning article and use inactivity-timeout, "WSGIRestrictEmbedded On", 
>> considered maximum-requests, etc.
>> 
>> After years of this, I'm resigned to the fact that python is memory hungry, 
>> especially built on many of these web-stack and database libraries, etc.  
>> I'm Ok with that.   I'm fine with a high-water RAM mark imposed by running 
>> under Apache, mostly.  But, dang, it sure would be great if the 1 or 2% of 
>> requests that really (and legitimately) hog a ton of RAM, like, say 500MB 
>> extra, didn't keep it when done.  I may revisit vertical partitioning again, 
>> but last time I did I think I found that the 1 or 2% in my case generally 
>> won't be divisible by url.  In most cases I wouldn't know whether the 
>> particular request is going to need lots of RAM until after the database 
>> queries return (which is far too late for vertical partitioning to be 
>> useful).
>> 
>> So I was mostly just curious about the status of nginx running wsgi, which 
>> doesn't solve python's memory piggishness, but would at least relinquish the 
>> extra RAM once python garbage collected.  
> 
> Where have you got the idea that using nginx would result in memory being 
> released back to the OS once garbage collected? It isn’t able to do that.
> 
> The situations are very narrow as to when a process is able to give back 
> memory to the operating system. It can only be done when the now free memory 
> was at top of allocated memory. This generally only happens for large block 
> allocations and not in normal circumstances for a running Python application.
> 
> 
> At this point I'm not sure where I got that idea, but I'm surprised at this.  
> For example, my previous observations of paster running wsgi were that it is 
> quite faithful at returning free memory to the OS.  Was I just getting lucky, 
> or would paster be different for some reason?
> 
> In any case, if nginx won't solve that, then I can't see any reason to even 
> consider it over apache/mod_wsgi.  Thank you for answering that.
>  
> 
>> (Have you considered a max-memory parameter to mod_wsgi that would 
>> gracefully stop taking requests and shutdown after the threshold is reached 
>> for platforms that would support it?  I recall -- maybe incorrectly -- you 
>> saying on Windows or certain platforms you wouldn't be able to support that. 
>>  What about the platforms that could support it?  It seems to me to be the 
>> very best way mod_wsgi could approach this Apache RAM nuance, so seems like 
>> it would be tremendously useful for the platforms that could support it.) 
> 
> You can do this yourself rather easily with more recent mod_wsgi version.
> 
> If you create a background thread from a WSGI script file, in similar way as 
> monitor for code changes does in:
> 
>     
> http://modwsgi.readthedocs.org/en/develop/user-guides/debugging-techniques.html#extracting-python-stack-traces
> 
> but instead of looking for code changes, inside the main loop of the 
> background thread do:
> 
>     import os
>     import signal
>     import mod_wsgi
> 
>     metrics = mod_wsgi.process_metrics()
> 
>     if metrics['memory_rss'] > MYMEMORYTHRESHOLD:
>         os.kill(os.getpid(), signal.SIGUSR1)
> 
> So mod_wsgi provides the way of determining the amount of memory without 
> resorting to importing psutil, which is quite fat in itself, but how you use 
> it is up to you.
> 
> 
> Right, that's an idea; (could even be a shell script that takes this 
> approach, I suppose, but I like your recipe.)
> 
> Unfortunately, I don't want to automate bits that can feasibly clobber 
> blocked sessions.  SIGUSR1, after graceful-timeout & shutdown-timeout, can 
> result in ungraceful killing.  Our application shares a database with an old 
> legacy application which was poorly written to hold transactions while 
> waiting on user input (this was apparently common two decades ago).  So, 
> unfortunately, it isn't terribly uncommon that our application is blocked at 
> the database level waiting for someone using the legacy application who has a 
> record(s) locked and may not even be at their desk or may have gone to lunch. 
>  Sometimes our client's IT staff has to hunt down these people or decide to 
> kill their database session.  In any case, from a professional point of view, 
> our application should be the responsible one and wait patiently, allowing 
> our client's IT staff the choice of how to handle those cases.  So, while the 
> likelihood is pretty low, even with graceful-timeout & shutdown-timeout set 
> at a very high value like 5 minutes, I still run the risk of killing 
> legitimate sessions with SIGUSR1.  (I've brought this up before and you 
> didn't agree with my gripe and I do understand why, but in my use case, I 
> don't feel I can automate that route responsibly.... we do use SIGUSR1 
> manually sometimes, when we can monitor and react to cases where a session is 
> blocked at the database level.)


If we have discussed it previously, then I may not have anything more to add.

Did I previously suggest offloading these memory-consuming tasks to a job 
queue run under Celery or something similar? That way they are out of the web 
server processes at least.
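
As a rough illustration of the idea only, and not of Celery itself: the 
sketch below uses the standard library subprocess module as a stand-in, so 
the memory-hungry work happens in a short-lived child interpreter and the 
memory goes back to the OS when that child exits. The names WORKER_SOURCE 
and run_in_child, and the dummy workload, are made up for the example.

```python
import json
import subprocess
import sys

# Hypothetical worker: stands in for whatever memory-hungry job you have.
# It reads parameters as JSON on stdin and writes a JSON result to stdout.
WORKER_SOURCE = """
import json, sys
params = json.load(sys.stdin)
rows = list(range(params["n"]))  # pretend this is the large result set
json.dump({"count": len(rows), "total": sum(rows)}, sys.stdout)
"""


def run_in_child(params, timeout=60):
    """Run the worker in a separate Python process and return its result.

    All memory the worker allocates is released when the child exits,
    so the calling (web server) process never grows.
    """
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps(params),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the worker fails
    )
    return json.loads(completed.stdout)
```

A real job queue adds retries, scheduling and result storage on top of 
this, but the memory behaviour is the point here: only the small JSON 
result ever lands in the web process.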

> inactivity-timeout doesn't present this concern: it won't ever kill anything, 
> just silently restarts like a good boy when inactive.  I've recently 
> reconsidered dropping that way down from 30 minutes.  (When I first 
> implemented this, it was just to reclaim RAM at the end of the day, so that's 
> why it is 30 minutes.  I didn't like the idea of churning new processes 
> during busy periods, but I've been thinking 1 or 2 minutes may be quite 
> reasonable.)
> 
> If I could signal processes to shutdown at their next opportunity (meaning 
> the next time they are handling no requests, like inactivity-timeout), that 
> would solve many issues in this regard for me because I could signal these 
> processes when their RAM consumption is high and let them restart when 
> "convenient," being the ultimate in gracefulness.  SIGUSR2 could mean "the 
> next time you are completely idle," while SIGUSR1 continues to mean 
> "initiate shutdown now."  

That is what SIGUSR1 does if you set graceful-timeout large enough. It is 
SIGINT or SIGTERM that effectively means initiate shutdown now. So there 
shouldn’t be a need for a SIGUSR2, as SIGUSR1 should already do what you are 
hoping for with a reasonable setting of graceful-timeout.
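
To make that concrete, here is a minimal sketch of the monitoring thread 
from the recipe quoted earlier. It relies on mod_wsgi.process_metrics() and 
its 'memory_rss' entry as used in that recipe; everything else 
(watch_memory, start_monitor, the threshold and interval parameters) is a 
hypothetical name chosen for the example. The RSS reader and the restart 
action are passed in as arguments so the loop can be tried outside of 
mod_wsgi.

```python
import os
import signal
import threading


def request_graceful_restart():
    # SIGUSR1 asks the mod_wsgi daemon process to shut down gracefully.
    os.kill(os.getpid(), signal.SIGUSR1)


def read_rss_from_mod_wsgi():
    # mod_wsgi is only importable inside a mod_wsgi process, so import lazily.
    import mod_wsgi
    return mod_wsgi.process_metrics()['memory_rss']


def watch_memory(threshold, interval=5.0,
                 read_rss=read_rss_from_mod_wsgi,
                 restart=request_graceful_restart,
                 stop_event=None):
    """Poll RSS until it exceeds the threshold, then trigger one restart.

    Returns True if a restart was requested, False if stopped first.
    """
    if stop_event is None:
        stop_event = threading.Event()
    while not stop_event.is_set():
        if read_rss() > threshold:
            restart()
            return True
        # Sleep for the interval, waking early if asked to stop.
        stop_event.wait(interval)
    return False


def start_monitor(threshold, **kwargs):
    # Daemon thread, so it never blocks interpreter shutdown.
    thread = threading.Thread(target=watch_memory,
                              args=(threshold,), kwargs=kwargs)
    thread.daemon = True
    thread.start()
    return thread
```

From the WSGI script file you would call something like 
start_monitor(MYMEMORYTHRESHOLD) once at import time, with the threshold in 
whatever unit memory_rss is reported in (I am assuming bytes). How long the 
process then waits for an idle moment is governed by graceful-timeout on 
WSGIDaemonProcess.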

> 
> Do note that if using SIGUSR1 to restart the current process (which should 
> only be done for daemon mode), you should also set graceful-timeout option to 
> WSGIDaemonProcess if you have long running requests. It is the maximum time 
> process will wait to shutdown while still waiting for requests when doing a 
> SIGUSR1 graceful shutdown of process, before going into forced shutdown mode 
> where no requests will be accepted and requests can be interrupted.
> 
>> Here 
>> (http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html)
>>  you discuss nginx's tendency to block requests that may otherwise be 
>> executing in a different process, depending on timing, etc.  Is this issue 
>> still the same (I thought I read a hint somewhere that there may be a 
>> workaround for that), so I ask.
> 
> That was related to someone's attempt to embed a Python interpreter inside 
> of nginx processes themselves. That project died a long time ago. No one 
> embeds Python interpreters inside of nginx processes. It was a flawed design.
> 
> I don’t know what you are reading to get all these strange ideas. :-)
> 
> 
> Google, I suppose ;)   That's why I finally asked you when I couldn't find 
> anything more about it via Google.
> 
>  
> 
>> And so I wanted your opinion on nginx...
>> 
>> ====
>> Here is what you asked for if it can still be useful.
>> 
>> I'm on mod_wsgi-4.4.6 and the particular server that prompted me this time 
>> is running Apache 2.4 (prefork), though some of our clients use 2.2 
>> (prefork).
>> 
>> Our typical wsgi conf setting is something like this, though threads and 
>> processes varies depending on server size:
>> 
>> LoadModule wsgi_module modules/mod_wsgi.so
>> WSGIPythonHome /home/rarch/tg2env
>> # see http://code.google.com/p/modwsgi/issues/detail?id=196#c10 concerning timeouts
>> WSGIDaemonProcess rarch processes=20 threads=14 inactivity-timeout=1800 display-name=%{GROUP} graceful-timeout=5 python-eggs=/home/rarch/tg2env/lib/python-egg-cache
> 
> Is your web server really going to be idle for 30 minutes? I can’t see how 
> that would have been doing anything.
> 
> Also, in mod_wsgi 4.x when inactivity-timeout kicks in has changed.
> 
> It used to apply when there were active requests and they were blocked, as 
> well as when no requests were running.
> 
> Now it only applies to case where there are no requests.
> 
> The case for running but blocked requests is now handled by request-timeout.
> 
> You may be better off setting request-timeout now to be a more reasonable 
> value for your expected longest request, but set inactivity-timeout to 
> something much shorter.
> 
> So suggest you play with that.
> 
> Also, are your request handlers I/O or CPU intensive and how many requests?
> 
> Such a high number of processes and threads always screams to me that half 
> the performance problems are due to setting these too high, invoking 
> pathological OS process swapping issues and Python GIL issues.
> 
> 
> 
> Yes, the requests are I/O intensive (that is, database intensive, which adds 
> a huge overhead to our typical request).  Often requests finish in under a 
> second or two, but they also can take many seconds (not terrible for the 
> user, but sometimes they do a lot of processing with many trips to the 
> database).  
> We have several clients (companies), so the number of requests varies widely, 
> but can get pretty heavy on busy days (like black friday, since they are in 
> retail).   We've played with those numbers quite a bit and without high 
> numbers like that, responsiveness suffers because we backlog due to requests 
> often taking several seconds.
> 
> Thanks for all your input, you've been tremendously helpful!
> Kent
> 
> 
>  
>> WSGIProcessGroup rarch
>> WSGISocketPrefix run/wsgi
>> WSGIRestrictStdout Off
>> WSGIRestrictStdin On
>> # Memory tweak. http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html
>> WSGIRestrictEmbedded On
>> WSGIPassAuthorization On
>> 
>> # we'll make the /tg/ directory resolve as the wsgi script
>> WSGIScriptAlias /tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py process-group=rarch application-group=%{GLOBAL}
>> WSGIScriptAlias /debug/tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py process-group=rarch application-group=%{GLOBAL}
>> 
>> MaxRequestsPerChild  0
>> <IfModule prefork.c>
>> MaxClients       308
>> ServerLimit      308
>> </IfModule>
>> <IfModule worker.c>
>> ThreadsPerChild  25
>> MaxClients       400
>> ServerLimit      16
>> </IfModule>
>> 
>> 
>> Thanks for all your help and for excellent software!
>> Kent
>> 
>> 
>> On Wed, Mar 16, 2016 at 7:27 PM, Graham Dumpleton 
>> <[email protected] <mailto:[email protected]>> wrote:
>> On the question of whether nginx will solve this problem, I can’t see how.
>> 
>> When one talks about nginx and Python web applications, it is only as a 
>> proxy for HTTP requests to some backend WSGI server. The Python web 
>> application doesn’t run in nginx itself. So memory issues and how to deal 
>> with them are the province of the WSGI server used, whatever that is, and not 
>> nginx.
>> 
>> Anyway, answer the questions below and can start with that.
>> 
>> You really want to be using recent mod_wsgi version and not Apache 2.2.
>> 
>> Apache 2.2 design has various issues and bad configuration defaults which 
>> means it can gobble up more memory than you want. Recent mod_wsgi versions 
>> have workarounds for Apache 2.2 issues and are much better at eliminating 
>> those Apache 2.2 issues. Recent mod_wsgi versions also have fixes for memory 
>> usage problems in some corner cases. As far as what I mean by recent, I 
>> recommend 4.4.12 or later. The most recent version is 4.4.21. If you are 
>> stuck with 3.4 or 3.5 from your Linux distro that is not good and that may 
>> increase problems.
>> 
>> So long as got recent mod_wsgi version then can look at using vertical 
>> partitioning to farm out memory hungry request handlers to their own daemon 
>> process group and better configure those to handle that and recycle 
>> processes based on activity or memory usage. A blog post related to that is:
>> 
>> * http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html 
>> <http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html>
>> 
>> Graham
>> 
>>> On 17 Mar 2016, at 7:15 AM, Graham Dumpleton <[email protected] 
>>> <mailto:[email protected]>> wrote:
>>> 
>>> What version of mod_wsgi and Apache are you using?
>>> 
>>> Are you stuck with old versions of both?
>>> 
>>> For memory tracking there are API calls mod_wsgi provides in recent 
>>> versions for getting memory usage which can be used as part of scheme to 
>>> trigger a process restart. You can’t use sys.exit(), but can use signals to 
>>> trigger a clean shutdown of a process. Again better to have recent mod_wsgi 
>>> versions as can then also set up some graceful timeout options for signal 
>>> induced restart.
>>> 
>>> Also, what is your mod_wsgi configuration so can make sure doing all the 
>>> typical things one would do to limit memory usage, or quarantine particular 
>>> handlers which are memory hungry?
>>> 
>>> Graham
>>> 
>>>> On 17 Mar 2016, at 4:29 AM, Kent Bower <[email protected] 
>>>> <mailto:[email protected]>> wrote:
>>>> 
>>>> Interesting idea..  yes, we are using multiple threads and also other 
>>>> stack frameworks, so that's not straightforward, but worth thinking 
>>>> about... not sure how to approach that with the other threads.  Thank you 
>>>> Bill.
>>>> 
>>>> On Wed, Mar 16, 2016 at 1:11 PM, Bill Freeman <[email protected] 
>>>> <mailto:[email protected]>> wrote:
>>>> I don't know about nginx, but one possibility, if the large memory 
>>>> requests are infrequent, is to detect when you have completed one and 
>>>> trigger the exit/reload of the daemon process (calling sys.exit() is not 
>>>> the way, since there could be other threads in the middle of something, 
>>>> unless you run one thread per process).
>>>> 
>>>> On Wed, Mar 16, 2016 at 7:50 AM, Kent <[email protected] 
>>>> <mailto:[email protected]>> wrote:
>>>> I'm looking for a very brief high-level pros vs. cons of wsgi under apache 
>>>> vs. under nginx and then to be pointed to more details I can study myself 
>>>> (or at least the latter).
>>>> 
>>>> Our application occasionally allows requests that consume a large amount 
>>>> of RAM (no obvious way around that, they are valid requests) and 
>>>> occasionally this causes problems since we can't reclaim the RAM readily 
>>>> from apache.  (We already have tweaked with and do use 
>>>> "inactivity-timeout".   This helps, but still now and then we hit problems 
>>>> where we run into swapping to disk.)
>>>> 
>>>> I'm wondering if nginx may solve this problem.  I've read much of what you 
>>>> (Graham) have had to say about the memory strategies with apache and 
>>>> mod_wsgi, but wonder what your opinion of nginx is and where you've 
>>>> already discussed this.  I've read articles I could find you've written on 
>>>> nginx, such as "Blocking requests and nginx version of mod_wsgi,"  but 
>>>> wonder if the same weaknesses are still applicable today, 7 years later?
>>>> 
>>>> 
>>>> Thank you very much in advance!
>>>> Kent
>>>> 
>>>> -- 
>>>> You received this message because you are subscribed to the Google Groups 
>>>> "modwsgi" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>> email to [email protected] 
>>>> <mailto:[email protected]>.
>>>> To post to this group, send email to [email protected] 
>>>> <mailto:[email protected]>.
>>>> Visit this group at https://groups.google.com/group/modwsgi 
>>>> <https://groups.google.com/group/modwsgi>.
>>>> For more options, visit https://groups.google.com/d/optout 
>>>> <https://groups.google.com/d/optout>.
>>>> 
>>>> 
>>>> -- 
>>>> You received this message because you are subscribed to a topic in the 
>>>> Google Groups "modwsgi" group.
>>>> To unsubscribe from this topic, visit 
>>>> https://groups.google.com/d/topic/modwsgi/wyo2bJP0Cfc/unsubscribe 
>>>> <https://groups.google.com/d/topic/modwsgi/wyo2bJP0Cfc/unsubscribe>.
>>>> To unsubscribe from this group and all its topics, send an email to 
>>>> [email protected] 
>>>> <mailto:[email protected]>.
>>>> To post to this group, send email to [email protected] 
>>>> <mailto:[email protected]>.
>>>> Visit this group at https://groups.google.com/group/modwsgi 
>>>> <https://groups.google.com/group/modwsgi>.
>>>> For more options, visit https://groups.google.com/d/optout 
>>>> <https://groups.google.com/d/optout>.
>>>> 
>>>> 
>>> 
>> 
>> 
> 
> 

