My assumption is that all the C modules shipping with Python would have to support the "with timeout" feature.
At least on Windows Vista and beyond, this wouldn't be so hard: a call to CancelSynchronousIo, plus a per-thread timeout object, and changing *WaitFor* calls to wait on that timeout object (changing WaitForSingleObject to WaitForMultipleObjects as needed). For OSes before Vista this is speculation, but I would guess that CancelSynchronousIo could be implemented on top of ntdll.dll APIs. I'm not enough of a Unix expert to say what would be involved on various Unix systems. Anyway, obviously this is not going to happen in the foreseeable future.

> One has to gauge what are the things that one would
> need to protect against blocking.

Right now I'm having a lot of difficulty coming from the fact that query X is blocked waiting for query Y to release a mutex -- and that mutex, in turn, won't be released until query Y finishes doing a bunch of slow network queries to server Z (which could be out on the public Internet in some cases). The general solution is to never hold a mutex while waiting for a network query that could take a potentially unbounded amount of time -- release the mutex first and reacquire it after the network query finishes -- but this requires a fairly substantial rearchitecture of a lot of code, some of which is written in Python, some in C/C++, and a good chunk of which is a file system driver.

I certainly have little to no control over how fast server Z (third-party, often proprietary and closed-source, often not on the LAN) chooses to respond to my queries. I've seen cases where server Z is bogged down responding to some other random, unrelated query, holding a write lock on its DB, and even the simplest read-only query to server Z hangs for 10 minutes waiting for that huge unrelated DB query to complete. Like I said, it's a big, complex distributed software system.

On Wed, Feb 11, 2009 at 7:57 PM, Graham Dumpleton <[email protected]> wrote:
>
> 2009/2/12 Matt Craighead <[email protected]>:
> > Agree with your terminology corrections.
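To make the "release the mutex first, reacquire after" pattern concrete, here is a minimal sketch. All the names are hypothetical stand-ins: `cache` plays the role of the shared state the mutex protects, and `slow_network_query()` the potentially unbounded call to server Z.

```python
import threading

cache_lock = threading.Lock()
cache = {}

def slow_network_query(key):
    # Placeholder for a network round trip that may block for minutes.
    return key.upper()

def lookup(key):
    # Check the shared state under the lock, but never hold the lock
    # across the network round trip.
    with cache_lock:
        if key in cache:
            return cache[key]
    result = slow_network_query(key)   # lock NOT held here
    with cache_lock:
        # Another thread may have filled the entry while we were waiting,
        # so only install our result if the slot is still empty.
        cache.setdefault(key, result)
        return cache[key]
```

The cost of the pattern is exactly the race handled by `setdefault`: two threads may both issue the slow query for the same key, and one result is discarded.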
> >
> > As for Apache on Windows, I have no love for it -- but I have to offer my users *some* sort of Windows-based server solution. I have my own miniature WSGI web server for users who want it to "just work" without having to install anything separate, but some people want more advanced features like SSL.
> >
> > As for CGI, it's not a full solution, but you can always kill a process manually, whereas there's no way to kill a specific hung worker thread buried inside a process -- you have to kill the whole process. Putting on a sysadmin hat, I might like to be able to kill a single bad request without killing the whole server. Obviously even better would be to never have to kill a request at all, but I'm not sure I see that as 100% realistic in a distributed software system with a lot of diverse components, some of which I didn't write and/or which don't have network protocol specs and which I therefore have to access through closed-source third-party API libraries.
> >
> > Writing to the output in a loop is a valid solution for some classes of applications: a gigantic HTML table comes to mind. I don't see it working very well for my particular application, though I could probably make it work for a subset of my queries.
> >
> > Timeouts? Assuming I can catch every single potentially-blocking API call and make it time out (fairly challenging when we're talking about a WSGI module that accesses a file system driver that in turn may make network requests to service certain file system calls -- I'm planning to rearchitect this slightly, but I don't think this part will go away entirely)... I'd better make those timeouts fairly long to avoid false positives. I think this kind of approach is probably good enough to prevent myself from running out of threads and/or memory... *if* I can put a timeout in every single potentially-blocking code path.
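The closest thing I know of to a blanket timeout in stock Python is a SIGALRM-based context manager -- a rough sketch only, and it illustrates the limitations as much as the idea: Unix-only, usable only in the main thread, and it can't interrupt a call hung inside C code that blocks signals. The names `timeout` and `TimeoutException` are my own.

```python
import signal
from contextlib import contextmanager

class TimeoutException(Exception):
    pass

@contextmanager
def timeout(seconds):
    # SIGALRM fires after `seconds` and the handler raises into whatever
    # the block was doing -- Unix-only, main thread only.
    def handler(signum, frame):
        raise TimeoutException('block exceeded %d seconds' % seconds)
    old_handler = signal.signal(signal.SIGALRM, handler)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)                        # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
```

Usage would look like `with timeout(30): do_stuff()`, with cleanup happening as the stack unwinds.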
> >
> > Last time I looked into this, I think I saw some web debates over whether it would be possible to have a language construct along the lines of:
> >
> > with timeout(seconds):
> >     do_stuff()
> >
> > ...where if the contents of the "with" code block took more than "seconds" seconds to execute, some sort of TimeoutException would be asynchronously thrown that would allow for a full and orderly cleanup. This would be great for my purposes. I doubt something like this will find its way into the language any time soon, seeing as I've already found bugs in Python where certain simple blocking socket API calls can't be Ctrl-C'd under Windows.
>
> Except that, like trying to inject a Python exception into a running thread, it will not work if the code is actually inside of C code and that is where it is hanging.
>
> > Overall, I'm getting the impression that the answer to my dilemma is: "Yes, this is hard. Deal with it." Would that be a correct assessment?
>
> More or less. One has to gauge what are the things that one would need to protect against blocking. If it is your own internal network or systems, probably safe. If going outside of your network, then protect yourself.
>
> As a failsafe, use inactivity-timeout on the daemon process group. This is in part what it was added for. So even if the process hangs for the period of the timeout, at least it will automatically recover rather than having to wait for a human to kill it.
>
> Graham
>
> > On Wed, Feb 11, 2009 at 5:47 PM, Graham Dumpleton
> > <[email protected]> wrote:
> >>
> >> 2009/2/12 Matt Craighead <[email protected]>:
> >> > Suppose the client making an HTTP request to my WSGI app closes its socket, say, because the user hit Escape in their web browser. What happens to my Python interpreter executing the WSGI code in question? It keeps running, right?
> >>
> >> First off, you don't mean 'Python interpreter', you mean 'request thread'.
> >>
> >> Python interpreter instances, once created within a process, survive for the life of the process. Separate interpreter instances are NOT created for each request.
> >>
> >> You may well understand this, but it does seem to be a misconception that some do have, so am clarifying this point.
> >>
> >> As to whether the 'request thread' keeps running, it depends on what it is doing and how you yield data from the WSGI application.
> >>
> >> In the simplest case, where a WSGI application forms the complete response as a single string, or list of strings, and returns it, the WSGI application will complete the request regardless. This is because the data is only being written at the end of the request, so there is potentially no earlier point at which it could be detected that the client had closed the connection.
> >>
> >> If that request thread was performing an operation that took some time, whether it be computational or whether it needs to block on the result from some external process, then whatever it is doing is not interrupted.
> >>
> >> In the more complicated case, where the WSGI application has returned a generator which yields data in blocks, there is an attempt to write data back to the client as the request progresses. Provided that blocks of data are only generated when asked for, if writing a prior block resulted in it being detected that the client connection had closed, then mod_wsgi will skip asking for more blocks of data and move straight on to closing the generator and finalising the request.
> >>
> >> Thus, use of generators, and only generating data as it can be sent, does provide an option to interrupt a long running process.
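For anyone following along, the generator style described above looks something like this toy WSGI application (the content it yields is made up):

```python
def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])

    def generate():
        # Each block is produced lazily; if writing an earlier block
        # reveals the client has gone away, mod_wsgi closes the generator
        # instead of requesting the next block.
        for i in range(1000):
            yield ('row %d\n' % i).encode('ascii')

    return generate()
```

The key property is that no work for block N+1 happens until the server has asked for it, which is what gives the server a chance to abandon the request between blocks.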
> >> This still doesn't help in situations where a specific request for a block of data resulted in the application making some blocking call.
> >>
> >> Another case is where the write() function from start_response() is used. In this case, writing data back to the client is driven by the WSGI application rather than by a loop within mod_wsgi requesting data from a generator. Thus, when closure of the connection is detected, a Python exception will be seen by the application, and it is up to the application what to do. Most applications seem not to handle it, and a 500 error related to the unhandled exception results.
> >>
> >> The only other option for detecting that a client has closed the connection is when the application is reading wsgi.input. This again generates a Python exception which the application would deal with as appropriate.
> >>
> >> > This seems pretty unfortunate. Suppose that the implementation of my HTTP request needs to go out on the network to talk to some other server. (Which, in my case, some of them do.) connect(), send(), recv() can all take potentially unlimited amounts of time to complete. They may not consume any CPU time while they're blocking, but a thread is just sitting there doing nothing; and what, if anything, will cause that thread to die short of killing the Apache process or the WSGI daemon process (if any)? Leak enough threads and you could run out of memory, deadlock, or whatnot.
> >>
> >> If the backend process you are communicating with never returns, that is a separate issue to the client closing the connection. For detecting a backend process as never returning, you should be implementing non-blocking operations in conjunction with a timeout, to ensure that any processing is completed in the time you expect it to be.
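One way to put the "non-blocking operations in conjunction with a timeout" advice into practice at the socket level -- a sketch only, with a made-up function name:

```python
import select

def recv_with_deadline(sock, nbytes, deadline):
    # Put the socket into non-blocking mode and wait at most `deadline`
    # seconds for the backend to become readable, instead of sitting in
    # recv() indefinitely.
    sock.setblocking(False)
    readable, _, _ = select.select([sock], [], [], deadline)
    if not readable:
        raise TimeoutError('backend did not respond within %s seconds'
                           % deadline)
    return sock.recv(nbytes)
```

The same shape (wait with a deadline, then fail loudly) applies to connect() and send() as well; the hard part, as noted above, is doing it in *every* potentially-blocking code path.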
> >>
> >> Whether loss of the client connection should cause the connection to the backend process to be closed really depends on what the application does. It may be the case that you still need the backend task to be completed regardless; closing the connection to the backend process, depending on how the service is implemented, may be bad, as it may cause the backend task to be interrupted and not complete. But then, if you really need that, you should be using a persistent message/task queuing system to ensure requests aren't lost. Overall though, what should be done can depend on the individual connections to backend systems, and thus should be handled at the application level.
> >>
> >> When using daemon mode of mod_wsgi, the only option you really have is to set inactivity-timeout as a fail-safe for all threads in the process getting into a locked-up state because of code which blocks and never returns. What would happen is that even though all threads in a process may be handling a request, if none of them actually read any request input or generate any request output in the specified time, then the daemon process would be forcibly shut down.
> >>
> >> > The behavior I'd think I'd want would be that a closed client socket would result in a Python KeyboardInterrupt being raised asynchronously inside my WSGI Python interpreter, exactly like Ctrl-C in a normal Python app. Then my code would nicely release any DB locks/roll back any pending DB transactions as the stack unrolled, and blocking IOs (socket or otherwise) could be interrupted via a signal (Unix)/IO cancellation (Windows)/some other mechanism (???).
> >>
> >> My understanding is that this wouldn't necessarily be a safe thing to do, as it would involve injecting an exception into a distinct thread.
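For what it's worth, CPython does expose a way to inject an exception into another thread, via ctypes and PyThreadState_SetAsyncExc. But it is best-effort and CPython-specific, and the exception is delivered only between Python bytecodes -- so a thread hung inside a C call (exactly the case under discussion) won't see it until that call returns. A hedged sketch, with a made-up helper name:

```python
import ctypes

def inject_exception(thread, exc_type):
    # CPython-specific: schedules exc_type to be raised in the target
    # thread the next time it executes Python bytecode. It cannot
    # interrupt a call that is hung inside C code.
    ident = ctypes.c_long(thread.ident)
    affected = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ident, ctypes.py_object(exc_type))
    if affected > 1:
        # Passing NULL clears the pending exception again.
        ctypes.pythonapi.PyThreadState_SetAsyncExc(ident, None)
        raise SystemError('injected into %d thread states' % affected)
    return affected == 1
```

Even when it works, the target thread can be interrupted at an arbitrary bytecode boundary, which is presumably the safety concern alluded to above.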
> >> I remember seeing some warnings about this at one point, but things could have changed. Either way, I have looked at it before and wasn't convinced it was a good idea.
> >>
> >> > mod_wsgi daemon mode seems like a partial solution at best:
> >>
> >> In what respect are you saying that?
> >>
> >> > - daemon mode is not supported on Windows, right?
> >>
> >> And never will be. First, because fork() is not supported on Windows, and second, because I don't really regard Windows as a good deployment platform for Apache.
> >>
> >> > - killing the daemon process (potentially?) kills other requests, not just the hung request
> >>
> >> Yes, although in the case of inactivity-timeout, all threads would effectively need to have stalled before it kicked in and killed the process.
> >>
> >> > And any solution that involves one process per request, well, then we might as well be back to using CGI rather than WSGI...
> >>
> >> But CGI will not help you with this either. Well, that's not completely true: CGI will allow more and more processes to be created, but keep doing that and it will consume all resources on your machine. You still need something that is going to kill off stuck processes.
> >>
> >> No other web hosting mechanism for WSGI applications I have seen really provides a solution either. Some others provide timeouts on individual requests and will kill processes, but none that I know of will interject some sort of signal indicating that the client connection has closed. As partly explained above, in Apache at least, you can only know a client connection has closed when you attempt to read data from it or write data to it. Apache is not event driven, so there is no select/poll on a client connection such that you could be notified immediately anyway.
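For the record, the inactivity-timeout fail-safe discussed above is configured on the mod_wsgi daemon process group. A hypothetical example (the process-group name, paths, and sizing are invented; note the semantics: the whole daemon process is restarted, not a single request):

```apache
# Restart the daemon process if no request input is read and no response
# output is generated by any thread for 300 seconds.
WSGIDaemonProcess myapp processes=2 threads=15 inactivity-timeout=300
WSGIProcessGroup myapp
WSGIScriptAlias / /var/www/myapp/app.wsgi
```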
> >>
> >> All you can really do, for any system, is, at your application level, try to implement timeouts on potentially blocking operations to backend processes, and otherwise simply ensure you have allowed enough processes/threads to handle the expected load, with some additional capacity to cope with requests stalling for a while until timeouts kick in.
> >>
> >> Graham

--
Matt Craighead
Founder/CEO, Conifer Systems LLC
http://www.conifersystems.com
512-772-1834

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "modwsgi" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/modwsgi?hl=en
-~----------~----~----~----~------~----~------~--~---
