In your recipe for a background monitoring thread watching memory consumption, after issuing the SIGUSR1 I'd probably just want the thread to exit instead of going back to sleep... is calling "sys.exit()" the safe way to accomplish that?
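For reference, here is a minimal sketch of that recipe with the thread simply returning once it has sent SIGUSR1, rather than calling sys.exit() (in a non-main thread, sys.exit() only raises SystemExit inside that thread anyway, so returning from the loop is the cleaner way to stop it). The threshold, poll interval and thread setup are illustrative assumptions, not part of the original recipe, and it assumes process_metrics() returns a dict with a memory_rss entry as in the example further down the thread:

    import os
    import signal
    import threading
    import time

    import mod_wsgi

    # Hypothetical threshold; assumes memory_rss is reported in bytes.
    MEMORY_THRESHOLD = 800 * 1024 * 1024

    def _memory_monitor():
        while True:
            metrics = mod_wsgi.process_metrics()
            if metrics['memory_rss'] > MEMORY_THRESHOLD:
                # Ask mod_wsgi for a graceful restart of this daemon process.
                os.kill(os.getpid(), signal.SIGUSR1)
                # Returning ends the monitoring thread; no sys.exit() needed.
                return
            time.sleep(5.0)

    _monitor = threading.Thread(target=_memory_monitor)
    _monitor.daemon = True
    _monitor.start()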
Also, regarding my observations of paster returning garbage-collected memory to the OS, was I just getting lucky while monitoring (the memory was at the very top of the allocated memory)? Is this a universal Python issue? Again, thanks for all your help!

On Sat, Mar 19, 2016 at 11:22 PM, Graham Dumpleton <[email protected]> wrote:
>
> On 20 Mar 2016, at 1:10 AM, Kent Bower <[email protected]> wrote:
>
> Thanks Graham, few more items inline...
>
> On Sat, Mar 19, 2016 at 1:24 AM, Graham Dumpleton <[email protected]> wrote:
>
>> On 17 Mar 2016, at 11:28 PM, Kent Bower <[email protected]> wrote:
>>
>> My answers are below, but before you peek, Graham, note that you and I have been through this memory discussion before & I've read the vertical partitioning article and use inactivity-timeout, "WSGIRestrictEmbedded On", considered maximum-requests, etc.
>>
>> After years of this, I'm resigned to the fact that Python is memory hungry, especially built on many of these web-stack and database libraries, etc. I'm OK with that. I'm fine with a high-water RAM mark imposed by running under Apache, mostly. But, dang, it sure would be great if the 1 or 2% of requests that really (and legitimately) hog a ton of RAM, like, say, 500MB extra, didn't keep it when done. I may revisit vertical partitioning again, but last time I did I think I found that the 1 or 2% in my case generally won't be divisible by URL. In most cases I wouldn't know whether the particular request is going to need lots of RAM until *after* the database queries return (which is far too late for vertical partitioning to be useful).
>>
>> So I was mostly just curious about the status of nginx running WSGI, which doesn't solve Python's memory piggishness, but would at least relinquish the extra RAM once Python garbage collected.
>>
>> Where have you got the idea that using nginx would result in memory being released back to the OS once garbage collected? It isn't able to do that.
>>
>> The situations are very narrow as to when a process is able to give back memory to the operating system. It can only be done when the now free memory was at the top of allocated memory. This generally only happens for large block allocations and not in normal circumstances for a running Python application.
>
> At this point I'm not sure where I got that idea, but I'm surprised at this. For example, my previous observations of paster running WSGI were that it is quite faithful at returning free memory to the OS. Was I just getting lucky, or would paster be different for some reason?
>
> In any case, if nginx won't solve that, then I can't see any reason to even consider it over apache/mod_wsgi. Thank you for answering that.
>
>> (Have you considered a max-memory parameter to mod_wsgi that would gracefully stop taking requests and shut down after the threshold is reached, for platforms that would support it? I recall -- maybe incorrectly -- you saying on Windows or certain platforms you wouldn't be able to support that. What about the platforms that *could* support it? It seems to me to be the very best way mod_wsgi could approach this Apache RAM nuance, so it seems like it would be tremendously useful for the platforms that could support it.)
>>
>> You can do this yourself rather easily with a more recent mod_wsgi version.
>>
>> If you create a background thread from a WSGI script file, in a similar way as the monitor for code changes does in:
>>
>> http://modwsgi.readthedocs.org/en/develop/user-guides/debugging-techniques.html#extracting-python-stack-traces
>>
>> but instead of looking for code changes, inside the main loop of the background thread do:
>>
>> import os
>> import signal
>> import mod_wsgi
>>
>> metrics = mod_wsgi.process_metrics()
>>
>> if metrics['memory_rss'] > MYMEMORYTHRESHOLD:
>>     os.kill(os.getpid(), signal.SIGUSR1)
>>
>> So mod_wsgi provides a way of determining the amount of memory without resorting to importing psutil, which is quite fat in itself, but how you use it is up to you.
>
> Right, that's an idea; (could even be a shell script that takes this approach, I suppose, but I like your recipe.)
>
> Unfortunately, I don't want to *automate* bits that can feasibly clobber blocked sessions. SIGUSR1, after graceful-timeout & shutdown-timeout, can result in ungraceful killing. Our application shares a database with an old legacy application which was poorly written to hold transactions while waiting on user input (this was apparently common two decades ago). So, unfortunately, it isn't terribly uncommon that our application is blocked at the database level waiting for someone using the legacy application who has a record(s) locked and may not even be at their desk or may have gone to lunch. Sometimes our client's IT staff has to hunt down these people or decide to kill their database session. In any case, from a professional point of view, our application should be the responsible one and wait patiently, allowing our client's IT staff the choice of how to handle those cases. So, while the likelihood is pretty low, even with graceful-timeout & shutdown-timeout set at a very high value like 5 minutes, *I still run the risk of killing legitimate sessions with SIGUSR1*. (I've brought this up before and you didn't agree with my gripe and I do understand why, but in my use case, I don't feel I can automate that route responsibly.... we do use SIGUSR1 manually sometimes, when we can monitor and react to cases where a session is blocked at the database level.)
>
> If we have discussed it previously, then I may not have anything more to add.
>
> Did I previously suggest offloading these memory-consuming tasks behind a job queue run under Celery or something else? That way they are out of the web server processes at least.
>
> inactivity-timeout doesn't present this concern: it won't ever kill anything, just silently restarts like a good boy when inactive. I've recently reconsidered dropping that way down from 30 minutes. (When I first implemented this, it was just to reclaim RAM at the end of the day, so that's why it is 30 minutes. I didn't like the idea of churning new processes during busy periods, but I've been thinking 1 or 2 minutes may be quite reasonable.)
>
> If I could signal processes to shut down at their next opportunity (meaning the next time they are handling no requests, like inactivity-timeout), that would solve many issues in this regard for me because I could signal these processes when their RAM consumption is high and let them restart when "convenient," being the ultimate in gracefulness. SIGUSR2 could mean "the next time you are completely idle," while SIGUSR1 continues to mean "initiate shutdown now."
>
> That is what SIGUSR1 does if you set graceful-timeout large enough.
> It is SIGINT or SIGTERM which is effectively "initiate shutdown now." So there shouldn't be a need to have a SIGUSR2, as SIGUSR1 should already do what you are hoping for with a reasonable setting of graceful-timeout.
>
>> Do note that if using SIGUSR1 to restart the current process (which should only be done for daemon mode), you should also set the graceful-timeout option on WSGIDaemonProcess if you have long running requests. It is the maximum time the process will wait to shut down while still waiting for requests when doing a SIGUSR1 graceful shutdown of the process, before going into forced shutdown mode where no requests will be accepted and requests can be interrupted.
>>
>> Here (http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html) you discuss nginx's tendency to block requests that may otherwise be executing in a different process, depending on timing, etc. Is this issue still the same? (I thought I read a hint somewhere that there may be a workaround for that, so I ask.)
>>
>> That was related to someone's attempt to embed a Python interpreter inside of nginx processes themselves. That project died a long time ago. No one embeds Python interpreters inside of nginx processes. It was a flawed design.
>>
>> I don't know what you are reading to get all these strange ideas. :-)
>
> Google, I suppose ;) That's why I finally asked you when I couldn't find anything more about it via Google.
>
>> And so I wanted your opinion on nginx...
>>
>> ====
>> Here is what you asked for if it can still be useful.
>>
>> I'm on mod_wsgi-4.4.6 and the particular server that prompted me this time is running Apache 2.4 (prefork), though some of our clients use 2.2 (prefork).
>>
>> Our typical wsgi conf setting is something like this, though threads and processes vary depending on server size:
>>
>> LoadModule wsgi_module modules/mod_wsgi.so
>> WSGIPythonHome /home/rarch/tg2env
>> # see http://code.google.com/p/modwsgi/issues/detail?id=196#c10 concerning timeouts
>> WSGIDaemonProcess rarch processes=20 threads=14 inactivity-timeout=1800 display-name=%{GROUP} graceful-timeout=5 python-eggs=/home/rarch/tg2env/lib/python-egg-cache
>>
>> Is your web server really going to be idle for 30 minutes? I can't see how that would have been doing anything.
>>
>> Also, in mod_wsgi 4.x, when inactivity-timeout kicks in has changed.
>>
>> It used to apply when there were active requests and they were blocked, as well as when no requests were running.
>>
>> Now it only applies to the case where there are no requests.
>>
>> The case for running but blocked requests is now handled by request-timeout.
>>
>> You may be better off setting request-timeout now to be a more reasonable value for your expected longest request, but set inactivity-timeout to something much shorter.
>>
>> So suggest you play with that.
>>
>> Also, are your request handlers I/O or CPU intensive, and how many requests?
>>
>> Such a high number of processes and threads always screams to me that half the performance problems are due to setting these too high, invoking pathological OS process swapping issues and Python GIL issues.
>
> Yes, the requests are I/O intensive (that is, database intensive, which adds a huge overhead to our typical request). Often requests finish in under a second or two, but they also can take many seconds (not *terrible* for the user, but sometimes they do a lot of processing with many trips to the database).
> We have several clients (companies), so the number of requests varies widely, but can get pretty heavy on busy days (like Black Friday, since they are in retail). We've played with those numbers quite a bit, and without high numbers like that responsiveness suffers because we backlog due to requests often taking several seconds.
>
> Thanks for all your input, you've been tremendously helpful!
> Kent
>
>> WSGIProcessGroup rarch
>> WSGISocketPrefix run/wsgi
>> WSGIRestrictStdout Off
>> WSGIRestrictStdin On
>> # Memory tweak. http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html
>> WSGIRestrictEmbedded On
>> WSGIPassAuthorization On
>>
>> # we'll make the /tg/ directory resolve as the wsgi script
>> WSGIScriptAlias /tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py process-group=rarch application-group=%{GLOBAL}
>> WSGIScriptAlias /debug/tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py process-group=rarch application-group=%{GLOBAL}
>>
>> MaxRequestsPerChild 0
>> <IfModule prefork.c>
>>     MaxClients 308
>>     ServerLimit 308
>> </IfModule>
>> <IfModule worker.c>
>>     ThreadsPerChild 25
>>     MaxClients 400
>>     ServerLimit 16
>> </IfModule>
>>
>> Thanks for all your help and for excellent software!
>> Kent
>>
>> On Wed, Mar 16, 2016 at 7:27 PM, Graham Dumpleton <[email protected]> wrote:
>>
>>> On the question of whether nginx will solve this problem, I can't see how.
>>>
>>> When one talks about nginx and Python web applications, it is only as a proxy for HTTP requests to some backend WSGI server. The Python web application doesn't run in nginx itself. So memory issues and how to deal with them are the province of the WSGI server used, whatever that is, and not nginx.
>>>
>>> Anyway, answer the questions below and can start with that.
>>>
>>> You really want to be using a recent mod_wsgi version and not Apache 2.2.
>>>
>>> The Apache 2.2 design has various issues and bad configuration defaults which means it can gobble up more memory than you want. Recent mod_wsgi versions have workarounds for Apache 2.2 issues and are much better at eliminating those Apache 2.2 issues. Recent mod_wsgi versions also have fixes for memory usage problems in some corner cases. As far as what I mean by recent, I recommend 4.4.12 or later. The most recent version is 4.4.21. If you are stuck with 3.4 or 3.5 from your Linux distro that is not good and that may increase problems.
>>>
>>> So long as you have a recent mod_wsgi version, you can then look at using vertical partitioning to farm out memory-hungry request handlers to their own daemon process group, better configure those to handle that, and recycle processes based on activity or memory usage. A blog post related to that is:
>>>
>>> * http://blog.dscpl.com.au/2014/02/vertically-partitioning-python-web.html
>>>
>>> Graham
>>>
>>> On 17 Mar 2016, at 7:15 AM, Graham Dumpleton <[email protected]> wrote:
>>>
>>> What version of mod_wsgi and Apache are you using?
>>>
>>> Are you stuck with old versions of both?
>>>
>>> For memory tracking there are API calls mod_wsgi provides in recent versions for getting memory usage which can be used as part of a scheme to trigger a process restart. You can't use sys.exit(), but you can use signals to trigger a clean shutdown of a process. Again, better to have a recent mod_wsgi version as you can then also set up some graceful timeout options for a signal-induced restart.
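As a concrete illustration of the request-timeout / inactivity-timeout / graceful-timeout advice above, here is a hedged sketch of how the existing WSGIDaemonProcess line might be adjusted. The directive names are the real mod_wsgi options already discussed in this thread; the numeric values are placeholders to be tuned against your own longest requests and traffic, not recommendations:

    # Placeholder values: request-timeout roughly matching the longest expected
    # request, inactivity-timeout much shorter than the old 1800 seconds, and
    # graceful-timeout large enough that SIGUSR1 restarts stay graceful.
    WSGIDaemonProcess rarch processes=20 threads=14 \
        display-name=%{GROUP} \
        python-eggs=/home/rarch/tg2env/lib/python-egg-cache \
        request-timeout=60 \
        inactivity-timeout=120 \
        graceful-timeout=300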
>>> Also, what is your mod_wsgi configuration, so I can make sure you are doing all the typical things one would do to limit memory usage, or quarantine particular handlers which are memory hungry?
>>>
>>> Graham
>>>
>>> On 17 Mar 2016, at 4:29 AM, Kent Bower <[email protected]> wrote:
>>>
>>> Interesting idea... yes, we are using multiple threads and also other stack frameworks, so that's not straightforward, but worth thinking about... not sure how to approach that with the other threads. Thank you, Bill.
>>>
>>> On Wed, Mar 16, 2016 at 1:11 PM, Bill Freeman <[email protected]> wrote:
>>>
>>>> I don't know about nginx, but one possibility, if the large memory requests are infrequent, is to detect when you have completed one and trigger the exit/reload of the daemon process (calling sys.exit() is not the way, since there could be other threads in the middle of something, unless you run one thread per process).
>>>>
>>>> On Wed, Mar 16, 2016 at 7:50 AM, Kent <[email protected]> wrote:
>>>>
>>>>> I'm looking for a very brief high-level pros vs. cons of WSGI under *apache* vs. under *nginx*, and then to be pointed to more details I can study myself (or at least the latter).
>>>>>
>>>>> Our application occasionally allows requests that consume a large amount of RAM (no obvious way around that, they are valid requests) and occasionally this causes problems since we can't reclaim the RAM readily from Apache. (We have already tweaked and do use "inactivity-timeout". This helps, but still now and then we hit problems where we run into swapping to disk.)
>>>>>
>>>>> I'm wondering if nginx may solve this problem. I've read much of what you (Graham) have had to say about the memory strategies with Apache and mod_wsgi, but wonder what your opinion of nginx is and where you've already discussed this. I've read the articles I could find that you've written on nginx, such as "Blocking requests and nginx version of mod_wsgi," but wonder if the same weaknesses are still applicable today, 7 years later?
>>>>>
>>>>> Thank you very much in advance!
>>>>> Kent
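Following up on Bill's suggestion above (detect when a large-memory request has completed and trigger the exit/reload of the daemon process), here is a minimal, hypothetical sketch of a WSGI middleware that checks mod_wsgi's process metrics after each request and asks for a graceful restart via SIGUSR1 instead of calling sys.exit(). The class name and threshold are invented for illustration, and it assumes process_metrics() exposes memory_rss as in Graham's recipe:

    import os
    import signal

    import mod_wsgi

    # Hypothetical limit; assumes memory_rss is reported in bytes.
    MEMORY_THRESHOLD = 800 * 1024 * 1024

    class MemoryLimitMiddleware(object):
        """Ask for a graceful daemon restart once RSS grows past the threshold."""

        def __init__(self, application):
            self.application = application

        def __call__(self, environ, start_response):
            try:
                return self.application(environ, start_response)
            finally:
                metrics = mod_wsgi.process_metrics()
                if metrics['memory_rss'] > MEMORY_THRESHOLD:
                    os.kill(os.getpid(), signal.SIGUSR1)

    # For example, in the WSGI script file:
    # application = MemoryLimitMiddleware(application)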
