Your Flash client doesn't need to know about Celery; your web application
accepts requests as normal, and it is your server-side Python code that
queues the job with Celery.
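To make the queue-then-return shape concrete, here is a minimal sketch of
the pattern using only the standard library. The worker thread stands in
for Celery's worker processes, and the names `jobs`, `results` and
`handle_request` are invented for illustration; with Celery you would
instead define the task with `@app.task` and enqueue it with
`task.delay(...)`.

```python
import queue
import threading

jobs = queue.Queue()
results = {}

def worker():
    """Stand-in for a Celery worker: pulls jobs and runs them."""
    while True:
        job_id, params = jobs.get()
        if job_id is None:          # sentinel to stop the worker
            jobs.task_done()
            break
        # ... long-running report generation would happen here ...
        results[job_id] = "report for %r" % (params,)
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_request(job_id, params):
    """Web handler: queue the job and return immediately."""
    jobs.put((job_id, params))
    return {"status": "queued", "job_id": job_id}

response = handle_request("job-1", {"year": 2015})
jobs.join()   # wait only so this demo can show the completed result
print(response["status"], results["job-1"])
```

In a real deployment the queue would be a durable broker such as RabbitMQ
or Redis, so queued tasks survive web server restarts.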
Looking back, the only configuration I can find (though I don't know
whether it is your actual production configuration) is:
WSGIDaemonProcess rarch processes=3 threads=2 inactivity-timeout=1800
display-name=%{GROUP} graceful-timeout=140 eviction-timeout=60
python-eggs=/home/rarch/tg2env/lib/python-egg-cache
Provided that you don't then start to have overall host memory issues, the
simplest way around this whole issue is not to use a multithreaded process.
What you would do is vertically partition your URL namespace so that just
the URLs which do the long-running report generation are delegated to
single-threaded processes. Everything else would keep going to the
multithreaded processes.
WSGIDaemonProcess rarch processes=3 threads=2
WSGIDaemonProcess rarch-long-running processes=6 threads=1
maximum-requests=20
WSGIProcessGroup rarch
<Location /suburl/of/long/running/report/generator>
WSGIProcessGroup rarch-long-running
</Location>
You wouldn't even have to worry about the graceful-timeout on
rarch-long-running, as that is only relevant for maximum-requests when the
process is multithreaded.
So what would happen is that when the request has finished, if maximum-requests
is reached, the process would be restarted even before any new request was
accepted by the process, so there is no chance of a new request being
interrupted.
You could still set an eviction-timeout of some suitably large value to allow
you to use SIGUSR1 to be sent to processes in that daemon process group to shut
them down.
In this case, having eviction-timeout being able to be set independent of
graceful-timeout (for maximum-requests), is probably useful and so I will
retain the option.
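For reference, the signal plumbing that this eviction mechanism relies on
can be demonstrated in a few lines of Python. This is just the generic
POSIX mechanism, not mod_wsgi's actual handler, and for the sake of a
self-contained demo the process signals itself rather than a daemon
process group:

```python
import os
import signal

received = []

def on_usr1(signum, frame):
    # mod_wsgi's own handler would stop accepting new requests here
    # and begin its graceful shutdown; we just record the signal.
    received.append(signum)

signal.signal(signal.SIGUSR1, on_usr1)
os.kill(os.getpid(), signal.SIGUSR1)

print("got SIGUSR1:", received == [signal.SIGUSR1])
```

Against a real daemon process group you would instead send SIGUSR1 to the
daemon process pids, for example by matching on the display-name set in
WSGIDaemonProcess.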
So is there any reason you couldn't use a daemon process group with many
single-threaded processes instead?
Note that since only a subset of URLs would go to that daemon process
group, the memory usage profile will change: you aren't potentially loading
the complete application code into those processes, only the code needed
for those URLs and that report. So it could use up less memory than the
application as a whole, allowing you to have multiple single-threaded
processes with no issue.
Graham
On 31/01/2015, at 12:31 AM, Kent <[email protected]> wrote:
> Thanks for your reply and recommendations. We're aware of the issues, but I
> didn't give the full picture for brevity's sake. The reports are user
> generated reports. Ultimately, the users know whether the reports should
> return quickly (which many, many will), or whether they are long-running.
> There is no way for the application to know that, so to avoid some sort of
> polling (which we've done in the past and was a pain in the rear to users),
> the design is to allow the user to decide whether to run the report in the
> background or "foreground" via a check box. Since most reports will return
> in a matter of a minute or so, we wanted to avoid the pain of making them
> poll, but I need to look at Celery. However, I'm not comfortable punishing
> users for accidentally choosing foreground on a long-running report. That
> is, not for an automatic turn-over mechanism like maximum-requests or
> inactivity-timeout. In my mind, those are inherently different than
> something like a SIGUSR1 mechanism because the former are automatic.
>
> So, while admitting there are edge cases we are using that don't have a
> perfect solution (or even admitting we need a better mechanism in that case),
> it still seems to me mod_wsgi should be somewhat agnostic of design choices.
> In other words, when it comes to automatic turning over of processes, it
> seems mod_wsgi shouldn't be involved with length of time considerations,
> except to allow the user to specify timeouts. See, the long running reports
> are only one of my concerns: we also fight with database locks sometimes,
> held by another application attached to the same database and wholly out of
> our control. Sometimes those locks can be held for many minutes on a request
> that normally should complete within seconds. There too, it seems mod_wsgi
> should be very gentle in the automatic turnover cases.
>
> Thanks for pointing to Celery. I really wonder whether I can get a message
> broker to work with Adobe Flash, our current client, but I haven't looked
> into this much yet.
>
> Also, my apologies if you believe this to have been a waste of time on your
> part. You've been extremely helpful, though and I'm quite thankful for your
> time! I understand you not wanting to redesign the shutdown-timeout thing
> and mess with what otherwise isn't broken. Would you still like me to post
> the apache debug logs regarding 'eviction-timeout' or have you changed your
> mind about releasing that? (In which case, extra apologies.)
>
> Kent
>
>
>
>
> On Friday, January 30, 2015 at 6:34:28 AM UTC-5, Graham Dumpleton wrote:
> If you have web requests generating reports which take 40 minutes to run, you
> are going the wrong way about it.
>
> What would be regarded as best practice for long running requests is to use a
> task queuing system to queue up the task to be run and run it in a distinct
> set of processes to the web server. Your web request can then return
> immediately, with some sort of polling system used as necessary to check the
> progress of the task and allow the result to be downloaded when complete. By
> using a separate system to run the tasks, it doesn't matter whether the web
> server is restarted as the tasks will still run and after the web server is
> restarted, a user can still check on progress of the tasks and get back his
> response.
>
> The most common such task execution system for doing this sort of thing is
> Celery.
>
> So it is because you aren't using the correct tool for the job here that you
> are fighting against things like timeouts in the web server. No web server is
> really a suitable environment to be used as an in-process task execution
> system. The web server should handle requests quickly and offload longer
> processing tasks to a separate task system which is purpose built for
> handling the management of long running tasks.
>
> I am not inclined to keep fiddling with how the timeouts work now that I
> understand what you are trying to do. I am even questioning now whether I
> should have introduced the separate eviction timeout I already did, given
> that it is turning out to be a questionable use case.
>
> I would really recommend you look at re-architecting how you do things. I
> don't think I would have any trouble finding others on the list who would
> advise the same thing and who could also give you further advice on using
> something like Celery instead for task execution.
>
> Graham
>
> On 29/01/2015, at 7:30 AM, Kent <[email protected]> wrote:
>
> Ok, I plan to run those tests with debug and post, but please, in the
> meantime:
>
> For our app, not interrupting existing requests is a higher priority than
> being able to accept new requests, particularly since we typically run many
> wsgi processes, each with a handful of threads. So, I'm not really concerned
> about maintaining always available threads (statistically, I will be fine...
> that isn't the issue for me).
>
> In these circumstances, it would be much better for all these triggering
> events (SIGUSR1, maximum-requests, or inactivity-timeout, etc.) to
> immediately stop accepting new requests and "concentrate" on shutting down.
> (Unless that means requests waiting in apache are terminated because they
> were queued for this particular process, but I doubt apache has already
> determined the request's process if none are available, has it?) With high
> graceful-timeout/eviction-timeout and low shutdown-timeout, I run a pretty
> high risk of accepting a new request at the tail end of graceful-timeout or
> eviction-timeout, only to have it basically doomed to ungraceful death
> because many of our requests are long running (very often well over 5 or 10
> sec).
>
> I guess that's why, through experimentation with SIGUSR1 a few years back, I
> ended up "graceful-timeout=5 shutdown-timeout=300" ... the opposite of how it
> would default, because this works well when trying to signal these to recycle
> themselves: they basically immediately stop accepting new requests so your
> "guaranteed" graceful timeout is 300. It seems I have no way to "guarantee"
> a very large graceful timeout for each and every request, even if affected by
> maximum-requests or inactivity-timeout, and specify a different (lower) one
> for SIGUSR1 because the only truly guaranteed lifetime in seconds is
> "shutdown-timeout," is that accurate?
>
> The ideal for our app, which may accept certain requests that run for
> several minutes, is this:
>
> - If maximum-requests or inactivity-timeout is hit, stop taking new
> requests immediately and shut down as soon as possible, but give existing
> requests basically all the time they need to finish (say, up to 40 minutes,
> for long-running db reports).
>
> - If SIGUSR1, stop taking new requests immediately and shut down as soon as
> possible, but give existing requests a really good chance to complete,
> maybe 3-5 minutes, but not the 40 minutes, because this is slightly more
> urgent (it was triggered manually and a user is monitoring/waiting for
> turnover and wants new code in place).
> I don't think I can accomplish the above if I understand the design correctly
> because a request may have been accepted at the tail end of
> graceful-timeout/eviction-timeout and so is only guaranteed a lifetime of
> shutdown-timeout, regardless of what the trigger was (SIGUSR1 vs. automatic).
>
> Is my understanding of this accurate?
>
>
>
> On Tuesday, January 27, 2015 at 9:48:01 PM UTC-5, Graham Dumpleton wrote:
> Can you ensure that LogLevel is set to at least info and provide what
> messages are in the Apache error log file?
>
> If I use:
>
> $ mod_wsgi-express start-server hack/sleep.wsgi --log-level=debug
> --verbose-debugging --eviction-timeout 30 --graceful-timeout 60
>
> which is equivalent to:
>
> WSGIDaemonProcess … graceful-timeout=60 eviction-timeout=30
>
> and fire a request against application that sleeps a long time I see in the
> Apache error logs at the time of the signal:
>
> [Wed Jan 28 13:34:34 2015] [info] mod_wsgi (pid=29639): Process eviction
> requested, waiting for requests to complete 'localhost:8000'.
>
> At the end of the 30 seconds given by the eviction timeout I see:
>
> [Wed Jan 28 13:35:05 2015] [info] mod_wsgi (pid=29639): Daemon process
> graceful timer expired 'localhost:8000'.
> [Wed Jan 28 13:35:05 2015] [info] mod_wsgi (pid=29639): Shutdown requested
> 'localhost:8000'.
>
> Up till that point the process would still have been accepting new requests
> and was waiting for the point where there were no active requests to allow
> it to shut down.
>
> As the timeout tripped at 30 seconds, it then instead goes into the more
> brutal shutdown process. No new requests are accepted from this point.
>
> For my setup the shutdown-timeout defaults to 5 seconds, and because the
> request still hadn't completed within 5 seconds, the process is exited
> anyway and allowed to shut down.
>
> [Wed Jan 28 13:35:10 2015] [info] mod_wsgi (pid=29639): Aborting process
> 'localhost:8000'.
> [Wed Jan 28 13:35:10 2015] [info] mod_wsgi (pid=29639): Exiting process
> 'localhost:8000'.
>
> Because the application never returned a response, that results in the Apache
> child worker who was trying to talk to the daemon process seeing a truncated
> response.
>
> [Wed Jan 28 13:35:10 2015] [error] [client 127.0.0.1] Truncated or oversized
> response headers received from daemon process 'localhost:8000':
> /tmp/mod_wsgi-localhost:8000:502/htdocs/
>
> When the Apache parent process notices the daemon process has died, it cleans
> up and starts a new one.
>
> [Wed Jan 28 13:35:11 2015] [info] mod_wsgi (pid=29639): Process
> 'localhost:8000' has died, deregister and restart it.
> [Wed Jan 28 13:35:11 2015] [info] mod_wsgi (pid=29639): Process
> 'localhost:8000' has been deregistered and will no longer be monitored.
> [Wed Jan 28 13:35:11 2015] [info] mod_wsgi (pid=29764): Starting process
> 'localhost:8000' with threads=5.
>
> So the shutdown phase specified by shutdown-timeout is subsequent to
> eviction-timeout. It is one last chance to shut down, during which no new
> requests are accepted, in case it is a constant flow of requests that is
> preventing shutdown rather than one long-running request.
>
> The shutdown-timeout should always be kept quite short because no new
> requests will be accepted during that time. So changing it from the default
> isn't something one would normally do.
>
> Graham
>
> On 28/01/2015, at 3:02 AM, Kent <[email protected]> wrote:
>
> Let me be more specific. I'm having a hard time getting this to test as I
> expected. Here is my WSGIDaemonProcess directive:
>
> WSGIDaemonProcess rarch processes=3 threads=2 inactivity-timeout=1800
> display-name=%{GROUP} graceful-timeout=140 eviction-timeout=60
> python-eggs=/home/rarch/tg2env/lib/python-egg-cache
>
> I put a 120 sec sleep in one of the processes' requests and then SIGUSR1
> (Linux) all three processes. The two inactive ones immediately restart, as I
> expect. However, the 3rd (sleeping) one is allowed to run past the 60 second
> eviction-timeout and runs straight to the graceful-timeout before it is
> terminated. Shouldn't it have been killed at 60 sec?
>
> (And then, as my previous question, how does shutdown-timeout factor into all
> this?)
>
> Thanks again!
> Kent
>
>
>
> On Tuesday, January 27, 2015 at 9:34:12 AM UTC-5, Kent wrote:
> I think I might understand the difference between 'graceful-timeout' and
> 'shutdown-timeout', but can you please just clarify the difference? Are they
> additive?
>
> Also, will 'eviction-timeout' interact with either of those, or simply
> override them?
>
> Thanks,
> Kent
>
> On Monday, January 26, 2015 at 12:44:13 AM UTC-5, Graham Dumpleton wrote:
> Want to give:
>
> https://github.com/GrahamDumpleton/mod_wsgi/archive/develop.tar.gz
>
> a go?
>
> The WSGIDaemonProcess directive is 'eviction-timeout'. For mod_wsgi-express
> the command line option is '--eviction-timeout'.
>
> So the terminology I am using around this is that sending a signal is like
> forcibly evicting the WSGI application, allowing the process to be
> restarted. At least this way I can have an option name that is distinct
> enough from a generic 'restart' so as not to be confusing.
>
> Graham
>
> On 21/01/2015, at 11:15 PM, Kent <[email protected]> wrote:
>
>
> On Tuesday, January 20, 2015 at 5:53:26 PM UTC-5, Graham Dumpleton wrote:
>
> On 20/01/2015, at 11:50 PM, Kent <[email protected]> wrote:
>
> On Sunday, January 18, 2015 at 12:43:08 AM UTC-5, Graham Dumpleton wrote:
> There are a few possibilities here of how this could be enhanced/changed.
>
> The problem with maximum-requests is that it can be dangerous. People can set
> it too low and when their site gets a big spike of traffic then the processes
> can be restarted too quickly only adding to the load of the site and causing
> things to slow down and hamper their ability to handle the spike. This is
> where setting a longer amount of time for graceful-timeout helps because you
> can set it to be quite large. The use of maximum-requests can still be like
> using a hammer though, and one which can be applied unpredictably.
>
> Yes, I can see that. (It may be overkill, but you could default a separate
> minimum-lifetime parameter so only users who specifically mess with that as
> well as maximum-requests shoot themselves in the foot, but it is starting to
> get confusing with all the different timeouts, I'll agree there...)
>
>
> The minimum-lifetime option is an interesting idea. It may have to do nothing
> by default to avoid conflicts with existing expected behaviour.
>
>
> The maximum-requests option also doesn't help in the case where you are
> running background threads which do stuff and it is them and not the number
> of requests coming in that dictate things like memory growth that you want to
> counter.
>
>
> True, but solving with maximum lifetime... well, actually, solving memory
> problems with any of these mechanisms isn't measuring the heart of the
> problem, which is RAM. I imagine there isn't a good way to measure RAM or
> you would have added that option by now. It seems what we are truly after
> in the majority of these cases isn't how many requests or how long it's
> been up, etc., but how much RAM it is taking (or perhaps, optionally,
> average RAM per thread, instead). If my process exceeds consuming 1.5GB,
> then trigger a graceful restart at the next appropriate convenience, being
> gentle to existing requests. That may arguably be the most useful parameter.
>
>
> The problem with calculating memory is that there isn't one cross platform
> portable way of doing it. On Linux you have to dive into the /proc file
> system. On MacOS X you can use C API calls. On Solaris I think you again need
> to dive into a /proc file system but it obviously has a different file
> structure for getting details out compared to Linux. Adding such cross
> platform stuff in gets a bit messy.
>
> What I was moving towards, as an extension of the monitoring stuff I am
> doing for mod_wsgi, was to have a special daemon process you can set up
> which has access to some sort of management API. You could then create your
> own Python script that runs in that process and, using the management API,
> gets the daemon process pids, uses Python psutil to check memory usage on a
> periodic basis, and then decides whether a process should be restarted and
> sends it a signal to stop. Or a management API could be provided which
> allows you to notify the daemon process in some way, maybe by signal, or
> maybe using a shared memory flag, that it should shut down.
>
>
> I figured there was something making that a pain...
>
> So the other option I have contemplated adding a number of times is one to
> periodically restart the process. The way this would work is that a process
> restart would be done periodically based on what time was specified. You
> could therefore say the restart interval was 3600 and it would restart the
> process once an hour.
>
> The start of the time period for this would either be when the process was
> created, if any Python code or a WSGI script was preloaded at process start
> time, or from when the first request arrived if the WSGI application was
> lazily loaded. This restart-interval could be tied to the graceful-timeout
> option so that you can set an extended period if you want to try and ensure
> that requests are not interrupted.
>
> We just wouldn't want it to die having never even served a single request, so
> my vote would be against the birth of the process as the beginning point
> (and, rather, at first request).
>
>
> It would effectively be from the first request if lazily loaded. If
> preloaded though, as background threads could be created which do stuff and
> consume memory over time, it would then be from when the process started,
> i.e., when the Python code was preloaded.
>
>
> But then for preloaded, processes life-cycle themselves for no reason
> throughout inactive periods like maybe overnight. That's not the end of the
> world, but I wonder if we're catering to the wrong design. (These are, after
> all, webserver processes, so it seems a fair assumption that they exist
> primarily to handle requests, else why even run under apache?) My vote, for
> what it's worth, would still be timed from first request, but I probably
> won't use that particular option. Either way would be useful for some I'm
> sure.
>
>
> Now we have the ability to send the process a graceful restart signal
> (usually SIGUSR1), to force an individual process to restart.
>
> Right now this is tied to the graceful-timeout duration as well, which as you
> point out, would perhaps be better off having its own time duration for the
> notional grace period.
>
> Using the name restart-timeout for this could be confusing if I have a
> restart interval option.
>
>
> In my opinion, SIGUSR1 is different from the automatic parameters because it
> was (most likely) triggered by user intervention, so that one should ideally
> have its own parameter. If that is the case and this parameter becomes
> dedicated to SIGUSR1, then the least ambiguous name I can think of is
> sigusr1-timeout.
>
>
> Except that it isn't guaranteed to be called SIGUSR1. Technically it could
> be a different signal depending on the platform Apache runs on. But then, as
> far as I know all UNIX systems do use SIGUSR1.
>
>
> In any case, they are "signals": do you like signal-timeout? (That also
> could be taken ambiguously, but maybe less so than restart-timeout?)
>
> I also have another type of process restart I am trying to work out how to
> accommodate and the naming of options again complicates the problem. In this
> case we want to introduce an artificial restart delay.
>
> This would be an option to combat the problem which is being caused by
> Django 1.7, in that WSGI script file loading for Django isn't stateless. If
> a transient problem occurs, such as the database not being ready, the
> loading of the WSGI script file can fail. On the next request an attempt is
> made to load it again, but now Django kicks up a stink because it was
> halfway through setting things up when it failed the last time, and the
> setup code cannot be run a second time. The result is that the process then
> keeps failing.
>
> The idea of the restart delay option therefore is to allow you to set it to
> number of seconds, normally just 1. If set like that, if a WSGI script file
> import fails, it will effectively block for the delay specified and when over
> it will kill the process so the whole process is thrown away and the WSGI
> script file can be reloaded in a fresh process. This gets rid of the problem
> of Django initialisation not being able to be retried.
>
>
> (We are using turbogears... I don't think I've seen something like that
> happen, but rarely have seen start up anomalies.)
>
> A delay is needed to avoid an effective fork bomb, where a WSGI script file
> not loading under high request throughput would cause a constant cycle of
> processes dying and being replaced. It is possible it wouldn't be as bad as
> I think, as Apache only checks for dead processes to replace once a second,
> but I still prefer my own failsafe in case that changes.
>
> I am therefore totally fine with a separate graceful time period for when
> SIGUSR1 is used, I just need to juggle these different features and come up
> with an option naming scheme that makes sense.
>
> How about then that I add the following new options:
>
> maximum-lifetime - Similar to maximum-requests in that it will cause the
> processes to be shut down and restarted, but in this case it will occur
> based on the time period given as argument, measured from the first request,
> or from when the WSGI script file or any other Python code was preloaded,
> that is, in the latter case, when the process was started.
>
> restart-timeout - Specifies a separate grace period for when the process
> is being forcibly restarted using the graceful restart signal. If
> restart-timeout is not specified and graceful-timeout is specified, then the
> value of graceful-timeout is used. If neither is specified, then the restart
> signal will behave similarly to the process being sent a SIGINT.
>
> linger-timeout - When a WSGI script file, or other Python code, is being
> imported by mod_wsgi directly and that fails, the default is that the error
> is ignored. For a WSGI script file, reloading will be attempted on the next
> request; but if preloading code, it will fail and merely be logged. If
> linger-timeout is set to a non-zero value, with the value being seconds,
> then the daemon process will instead be shut down and restarted to try and
> allow a successful reloading of the code to occur if it was a transient
> issue. To avoid a fork bomb if the issue is persistent, a delay will be
> introduced based on the value of the linger-timeout option.
>
> How does that all sound, if it makes sense that is. :-)
>
>
>
> That sounds absolutely great! How would I get on the notification cc: of the
> ticket or whatever so I'd be informed of progress on that?
>
> These days my turnaround time is pretty quick so long as I am happy and
> know what to change and how. So I just need to think a bit more about it
> and get some day job stuff out of the way before I can do something.
>
> So don't be surprised if you simply get a reply to this email within a week
> pointing at a development version to try.
>
>
> Well tons of thanks again.
>
> Graham
>
>
>
> On 17/01/2015, at 12:27 AM, Kent <[email protected]> wrote:
>
> Thanks again. Yes, I did take our current version from the repo because you
> hadn't released the SIGUSR1 bit yet... I should upgrade now.
>
> As for the very long graceful-timeout, I was skirting around that solution
> because I like where the setting is currently for SIGUSR1. I would like to
> ask, "Is there a way to indicate a different graceful-timeout for handling
> SIGUSR1 vs. maximum-requests?" but I already have the answer from the release
> notes: "No."
>
> I don't know if you can see the value in distinguishing the two, but
> maximum-requests is sort of a "standard operating mode," so it might make
> sense for a mod_wsgi user to want a higher, very gentle mode of operation
> there, whereas SIGUSR1, while beautifully more graceful than SIGKILL, still
> "means business," so the same user may want a shorter responsive timeout
> there (while still allowing a decent chunk of time for being graceful to
> running requests). That is the case for me at least. Any chance you'd
> entertain that as a feature request?
>
> Even if not, you've been extremely helpful, thank you! And thanks for
> pointing me to the correct version of documentation: I thought I was reading
> current version.
> Kent
>
> P.S. I also have ideas for possible vertical URL partitioning, but
> unfortunately, our app has much cross-over by URL, so that's why I'm down
> this maximum-requests path...
>
>
> On Friday, January 16, 2015 at 4:54:50 AM UTC-5, Graham Dumpleton wrote:
>
> On 16/01/2015, at 7:28 AM, Kent <[email protected]> wrote:
>
> I'm running 4 (a very early version of it, possibly before you officially
> released it). We upgraded to take advantage of the amazingly-helpful
> SIGUSR1 signaling for graceful process restarting, which we use somewhat
> regularly to gracefully deploy software changes (minor ones which won't
> matter if 2 processes have different versions loaded) without disrupting
> users. Thanks a ton for that!
>
> SIGUSR1 support was released in version 4.1.0.
>
> http://modwsgi.readthedocs.org/en/master/release-notes/version-4.1.0.html
>
> That same version has all the other stuff which was changed, so long as the
> actual 4.1.0 release is being used and you aren't still using the repo from
> before the official release.
>
> If not sure, best just upgrading to latest version if you can.
>
> We are also multi-threading our processes (plural processes, plural threads).
>
> Some requests could be (validly) running for very long periods of time
> (database reporting, maybe even half hour, though that would be very extreme).
>
> Some processes (especially those generating .pdfs, for example), hog tons of
> RAM, as you know, so I'd like these to eventually check their RAM back in, so
> to speak, by utilizing either inactivity-timeout or maximum-requests, but
> always in a very gentle way, since, as I mentioned, some requests might be
> properly running, even though for many minutes. maximum-requests seems too
> brutal for my use-case since the threshold request sends the process down
> the graceful-timeout/shutdown-timeout path, even if there are valid
> requests running, and then SIGKILLs. My ideal vision of "maximum-requests," since it is
> primarily for memory management, is to be very gentle, sort of a "ok, now
> that I've hit my threshold, at my next earliest convenience, I should die,
> but only once all my current requests have ended of their own accord."
>
> That is where, if you vertically partition those URLs out to a separate
> daemon process group, you can simply set a very high graceful-timeout value.
>
> So relying on the feature:
>
> """
> 2. Add a graceful-timeout option to WSGIDaemonProcess. This option is applied
> in a number of circumstances.
>
> When maximum-requests and this option are used together, when maximum
> requests is reached, rather than immediately shutting down, potentially
> interrupting active requests if they don't finish within the shutdown
> timeout, you can specify a separate graceful shutdown period. If all
> requests are completed within this time frame then it will shutdown
> immediately, otherwise the normal forced shutdown kicks in. In some respects
> this is just allowing a separate shutdown timeout in cases where requests
> could be interrupted, to avoid that if possible.
> """
>
> You could set:
>
> maximum-requests=20 graceful-timeout=600
>
> So as soon as it hits 20 requests, it switches to a mode where it will
> restart as soon as there are no active requests. You can set that timeout as
> high as you want, even hours, and it will not care.
>
> "inactivity-timeout" seems to function exactly as I want in that it seems
> like it won't ever kill a process with a thread with an active request (at
> least, I can't get it to, even by adding a long
> import time; time.sleep(longtime) ... it doesn't seem to die until the
> request is finished). But that's why the documentation made me nervous,
> because it implies that it could, in fact, kill a proc with an active
> request: "For the purposes of this option, being idle means no new requests
> being received, or no attempts by current requests to read request content
> or generate response content for the defined period."
>
> The release notes for 4.1.0 say:
>
> """
> 4. The inactivity-timeout option to WSGIDaemonProcess now only results in the
> daemon process being restarted after the idle timeout period where there are
> no active requests. Previously it would also interrupt a long running
> request. See the new request-timeout option for a way of interrupting long
> running, potentially blocked requests and restarting the process.
> """
>
> I'd rather have a more gentle "maximum-requests" than "inactivity-timeout"
> because then, even on very heavy days (when RAM is most likely to choke), I
> could gracefully turn over these processes a couple times a day, which I
> couldn't do with "inactivity-timeout" on an extremely heavy day.
>
> Hope this makes sense. I'm really asking:
>
> 1. Whether inactivity-timeout triggering will ever SIGKILL a process with
> an active request, as the docs intimate.
>
> No, from 4.1.0 onwards.
>
> 2. Whether there is any way to get maximum-requests to behave more gently
> under all circumstances.
>
> By setting a very, very long graceful-timeout.
>
> 3. For your ideas/best advice.
>
> Have a good read through the release notes for 4.1.0.
>
> Not necessarily useful in your case, but also look at request-timeout. It can
> act as a final fail safe for when things are stuck, but since it is more of a
> fail safe, it doesn't make use of graceful-timeout.
>
> Graham
>
>
> Thanks for your help!
>
>
>
> On Wednesday, January 14, 2015 at 9:48:02 PM UTC-5, Graham Dumpleton wrote:
>
> On 15/01/2015, at 8:32 AM, Kent <[email protected]> wrote:
>
> > Graham, the docs state: "For the purposes of this option, being idle means
> > no new requests being received, or no attempts by current requests to read
> > request content or generate response content for the defined period."
> >
> > This implies to me that a running request that is taking a long time could
> > actually be killed as if it were idle (suppose it were fetching a very slow
> > database query). Is this the case?
>
> This is the case for mod_wsgi prior to version 4.0.
>
> Things have changed in mod_wsgi 4.X.
>
> How long are your long running requests though? The inactivity-timeout was
> more about restarting infrequently used applications so that memory can be
> taken back.
>
>
> > Also, I'm looking for an ultra-conservative and graceful method of
> > recycling memory. I've read your article on url partitioning, which was
> > useful, but sooner or later, one must rely on either inactivity-timeout or
> > maximum-requests, is that accurate? But both these will eventually, after
> > graceful timeout/shutdown timeout, potentially kill active requests. It is
> > valid for our app to handle long-running reports, so I was hoping for an
> > ultra-safe mechanism.
> > Do you have any advice here?
>
> The options available in mod_wsgi 4.X are much better in this area than 3.X.
> The changes in 4.X aren't covered in main documentation though and are only
> described in the release notes where change was made.
>
> In 4.X the concept of an inactivity-timeout is now separate to the idea of a
> request-timeout. There is also a graceful-timeout that can be applied to
> maximum-requests and some other situations as well to allow requests to
> finish out properly before being more brutal. One can also signal the daemon
> processes to do a more graceful restart as well.
>
> You cannot totally avoid having to be brutal though and kill things, else
> you don't have a failsafe for a stuck process where all request threads were
> blocked on backend services and were never going to recover. Use of
> multithreading in a process also complicates the implementation of
> request-timeout.
>
> Anyway, the big question is what version are you using?
>
> Graham
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/modwsgi.
> For more options, visit https://groups.google.com/d/optout.
>
>