Passing request_data wasn't a good idea for certain reasons. I probably just rediscovered why I didn't do it that way in the first place. :-)
Anyway, provided you use the latest from 'develop' branch of GitHub repo, play around with the following code. import os import sys import time import threading import traceback import mod_wsgi def event_handler(name, **kwargs): if name == 'request_started': request_data = mod_wsgi.request_data() request_data.update(kwargs) request_data['python_thread_id'] = threading.get_ident() elif name == 'process_stopping': if kwargs['shutdown_reason'] == 'request_timeout': print('SHUTDOWN') current_time = time.time() stacks = dict(sys._current_frames().items()) active = dict(mod_wsgi.active_requests.items()) for request_id, request_data in active.items(): python_thread_id = request_data['python_thread_id'] application_start = request_data['application_start'] request_environ = request_data['request_environ'] request_uri = request_environ['REQUEST_URI'] running_time = current_time - application_start print() print('THREAD_ID', python_thread_id) print('REQUEST_ID', request_id) print('REQUEST_URI', request_uri) print('RUNNING_TIME', running_time) if python_thread_id in stacks: print('STACK-TRACE') traceback.print_stack(stacks[python_thread_id]) mod_wsgi.subscribe_events(event_handler) def application(environ, start_response): sleep_duration = environ.get('HTTP_X_SLEEP_DURATION', 0) sleep_duration = float(sleep_duration or 0) status = '200 OK' output = b'Hello World!' response_headers = [('Content-type', 'text/plain'), ('Content-Length', str(len(output)))] start_response(status, response_headers) if sleep_duration: time.sleep(sleep_duration) yield output I am running that with: mod_wsgi-express start-server tests/request-timeout.wsgi --log-to-terminal --request-timeout 1 --threads 1 --log-level info and running curl as: curl -H X-Sleep-Duration:60 http://localhost:8000/ <http://localhost:8000/> That gets me the following. One stack trace I generate myself just for that sub interpreter out of event mechanism. The other is process wide generated from C code level. [Thu Feb 08 22:52:41.635819 2018] [wsgi:info] [pid 21486] mod_wsgi (pid=21486): Daemon process request time limit exceeded, stopping process 'localhost:8000'. [Thu Feb 08 22:52:41.636053 2018] [wsgi:info] [pid 21486] mod_wsgi (pid=21486): Shutdown requested 'localhost:8000'. [Thu Feb 08 22:52:41.636126 2018] [wsgi:error] [pid 21486] SHUTDOWN [Thu Feb 08 22:52:41.636149 2018] [wsgi:error] [pid 21486] [Thu Feb 08 22:52:41.636156 2018] [wsgi:error] [pid 21486] THREAD_ID 123145557110784 [Thu Feb 08 22:52:41.636162 2018] [wsgi:error] [pid 21486] REQUEST_ID YSp5DBLFVJs [Thu Feb 08 22:52:41.636166 2018] [wsgi:error] [pid 21486] REQUEST_URI / [Thu Feb 08 22:52:41.636179 2018] [wsgi:error] [pid 21486] RUNNING_TIME 1.85565185546875 [Thu Feb 08 22:52:41.636187 2018] [wsgi:error] [pid 21486] STACK-TRACE [Thu Feb 08 22:52:41.819058 2018] [wsgi:error] [pid 21486] File "/Volumes/graham/Projects/mod_wsgi/tests/request-timeout.wsgi", line 56, in application [Thu Feb 08 22:52:41.819071 2018] [wsgi:error] [pid 21486] time.sleep(sleep_duration) [Thu Feb 08 22:52:41.819080 2018] [wsgi:info] [pid 21486] mod_wsgi (pid=21486): Dumping stack trace for active Python threads. [Thu Feb 08 22:52:41.819083 2018] [wsgi:info] [pid 21486] mod_wsgi (pid=21486): Thread 123145557110784 executing file "/Volumes/graham/Projects/mod_wsgi/tests/request-timeout.wsgi", line 56, in application [Thu Feb 08 22:52:46.636172 2018] [wsgi:info] [pid 21486] mod_wsgi (pid=21486): Aborting process 'localhost:8000'. [Thu Feb 08 22:52:46.636375 2018] [wsgi:info] [pid 21486] mod_wsgi (pid=21486): Exiting process 'localhost:8000'. Graham > On 8 Feb 2018, at 2:44 pm, Graham Dumpleton <graham.dumple...@gmail.com> > wrote: > > FWIW, the other current events are. > > * response_started > * request_exception > * process_stopping > > Passing the per request data as "request_data" argument to event handler for > request/response events was easy enough to do. > > Getting access to the thread object in order to work out thread.name when > dumping stack traces is definitely not easy. This is because you only have > access to the integer thread ID and Python frame object. You would have to > use the thread ID to look up threading._active and hope it is the same value > used for that. The problem is though that you don't know what Python > interpreter the thread was active in and which you need to switch to to do > that lookup. When you have multiple sub interpreters, the thread could also > have been used in more than one sub interpreter and associated with multiple > high level Python thread objects. This is the sort of mess that just gives > one headaches. I therefore don't think that is practical. > > So I don't think it can be done from where that current code gets triggered. > The point where the "process_stopping" event is generated could instead be > used. This is different because that is generated for each sub interpreter > just before destroying it, so you are already executing code in the right > context. > > Right now at least, using the per request events, you could track current > requests yourself in a global dictionary, with link to thread as part of the > request data. The "process_stopping" could access that cache to dump things > out. What you will not know is why the process was being stopped, but that is > possibly something that can pass as an argument to "process_stopping" so > could decide what to do based on reason for process shutdown. > > What mod_wsgi could do to make it easier is provide a way of getting back a > dictionary of references, by request ID, to all request_data objects for > active requests. What you stash in them from the event handlers would be up > to you. I need to think about the best way to expose that giving access to > those request data objects. > > Graham > >> On 8 Feb 2018, at 1:41 pm, Graham Dumpleton <graham.dumple...@gmail.com >> <mailto:graham.dumple...@gmail.com>> wrote: >> >> >> >>> On 8 Feb 2018, at 1:08 pm, Jesus Cea <j...@jcea.es <mailto:j...@jcea.es>> >>> wrote: >>> >>> On 08/02/18 02:39, Graham Dumpleton wrote: >>>> What are you setting the thread name too? >>> >>> Beside the mod_wsgi threads running "application()", my code creates >>> tons of long term threads like "cache_cleanup", >>> "periodic_cache_flush_to_disk", "map generation workers", "audio >>> transcoding", etc. >>> >>>> https://github.com/GrahamDumpleton/mod_wsgi/issues/160 >>>> <https://github.com/GrahamDumpleton/mod_wsgi/issues/160> >>> >>> Uhm, first thing in "application()" could be a "thread.name=URI", and >>> the "finally" statement could be "thread.name='idle'", or in the >>> "close()" code of the iterator returned. >>> >>> This looks like a pattern for a near trivial middleware. >> >> Using a WSGI middleware for that is a bad idea because of the complexity of >> implementing a WSGI middleware that properly bounds the full execution of >> the code involved in all parts of handling the request. The better way is to >> use the event system in mod_wsgi to be notified of the start and end of the >> request. This is more efficient and you don't need to wrap the WSGI >> application entry point. >> >> import mod_wsgi >> import threading >> >> def event_handler(name, **kwargs): >> cache = mod_wsgi.request_data() >> thread = threading.current_thread() >> >> if name == 'request_started': >> cache['original_thread_name'] = thread.name >> environ = kwargs['request_environ'] >> thread.name = environ['REQUEST_URI'] >> >> elif name == 'request_finished': >> thread.name = cache['original_thread_name'] >> >> mod_wsgi.subscribe_events(event_handler) >> >> I don't think overriding the thread name with a request URI is a good idea >> here though. I think it would be better to have mod_wsgi set it based on the >> existing request ID that Apache generates as that then matches what you can >> log for the request in the access log. >> >> Overall what is a probably a better approach is for me to extend the event >> mechanism in a couple of ways. >> >> The first is to add a new event type of 'request_active'. In mod_wsgi I >> could have a default reporting interval to pick up on long running requests >> and generate a 'request_active' event every 15 seconds (configurable), so >> long as the request is running. >> >> A second event could be of type 'request_timeout'. This could be triggered >> for each request, specifically for the case where there are active requests >> when the process is being shutdown due to request-timeout expiring for the >> process. >> >> From the event handler for either of these you could log any information you >> want to. The only hard bit for me is that currently the >> mod_wsgi.request_data() call which provides access to a per request >> dictionary where you can stash data, is based on using thread locals, so >> calling that in these two events wouldn't work as it isn't being called from >> the request thread, but a separate thread. I had refrained from passing it >> as an explicit argument to the event handler for reasons I couldn't >> remember. Otherwise need to know when calling mod_wsgi.request_data() that >> doing it for these special cases and calculate the cache of request data >> another way based on knowing what request am going working with when >> triggering the event. >> >>>> If you are logging thread ID in access log, then setting thread ID to >>>> request ID and attaching it to the traceback sounds reasonable. >>> >>> I am looking for being able to easily identify my threads in a >>> "request-timeout" traceback dump when I have like 130 threads running. >>> They are nicely labeled in my code, but the mod_wsgi traceback dump >>> doesn't show the "name" field, but opaque and uninformative "thread.ident". >>> >>> I have a "futures._WorkItem" overloaded to accept an extra "thread_name" >>> parameter in the "futures.executor.submit()" so I can annotate threads >>> doing background work. >>> >>> Now that I am using "request-timeout" traceback dumps, I would love to >>> have all that information available. Just dump "thread.name" if >>> available, instead of "thread.ident" :-). >> >> >> Graham > -- You received this message because you are subscribed to the Google Groups "modwsgi" group. To unsubscribe from this group and stop receiving emails from it, send an email to modwsgi+unsubscr...@googlegroups.com. To post to this group, send email to email@example.com. Visit this group at https://groups.google.com/group/modwsgi. For more options, visit https://groups.google.com/d/optout.