> On 8 Feb 2018, at 1:08 pm, Jesus Cea <[email protected]> wrote:
>
> On 08/02/18 02:39, Graham Dumpleton wrote:
>> What are you setting the thread name to?
>
> Beside the mod_wsgi threads running "application()", my code creates
> tons of long term threads like "cache_cleanup",
> "periodic_cache_flush_to_disk", "map generation workers", "audio
> transcoding", etc.
>
>> https://github.com/GrahamDumpleton/mod_wsgi/issues/160
>
> Uhm, first thing in "application()" could be a "thread.name=URI", and
> the "finally" statement could be "thread.name='idle'", or in the
> "close()" code of the iterator returned.
>
> This looks like a pattern for a near trivial middleware.

Using a WSGI middleware for that is a bad idea, because it is hard to
implement one that properly bounds the full execution of the code involved in
all parts of handling the request. A better way is to use the event system in
mod_wsgi to be notified of the start and end of the request. That is more
efficient, and you don't need to wrap the WSGI application entry point.

    import mod_wsgi
    import threading

    def event_handler(name, **kwargs):
        # Per-request dictionary for stashing data between events.
        cache = mod_wsgi.request_data()
        thread = threading.current_thread()
        if name == 'request_started':
            # Remember the original name so it can be restored later.
            cache['original_thread_name'] = thread.name
            environ = kwargs['request_environ']
            thread.name = environ['REQUEST_URI']
        elif name == 'request_finished':
            thread.name = cache['original_thread_name']

    mod_wsgi.subscribe_events(event_handler)
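
To give a feel for why the middleware route gets complicated, here is a
minimal sketch of what it has to do just to restore the name at the right
time. The names thread_name_middleware and _NamedIterable are made up for
illustration, and it ignores details a real middleware would have to handle,
so treat it as an outline rather than something to deploy.

```python
import threading


class _NamedIterable:
    """Wrap the response iterable so the thread name is restored in close()."""

    def __init__(self, iterable, thread, original_name):
        self._iterable = iterable
        self._thread = thread
        self._original_name = original_name

    def __iter__(self):
        return iter(self._iterable)

    def close(self):
        try:
            # Pass close() through to the wrapped iterable if it has one.
            close = getattr(self._iterable, 'close', None)
            if close is not None:
                close()
        finally:
            self._thread.name = self._original_name


def thread_name_middleware(application):
    """Hypothetical middleware: name the thread after the request URI."""

    def wrapper(environ, start_response):
        thread = threading.current_thread()
        original_name = thread.name
        thread.name = environ.get('REQUEST_URI', original_name)
        try:
            result = application(environ, start_response)
        except BaseException:
            # No iterable was returned, so close() will never be called.
            thread.name = original_name
            raise
        return _NamedIterable(result, thread, original_name)

    return wrapper
```

The name can only be restored in close(), because the response body may still
be generated after application() has returned. Wrapping the result like this
also hides any wsgi.file_wrapper object from the server, which is one of the
costs of doing it as middleware.
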

I don't think overriding the thread name with the request URI is a good idea
here, though. It would be better to have mod_wsgi set it from the existing
request ID that Apache generates, as that then matches what you can log for
the request in the access log.

Overall, what is probably a better approach is for me to extend the event
mechanism in a couple of ways.

The first is to add a new event type of 'request_active'. In mod_wsgi I could
have a default reporting interval to pick up on long running requests and
generate a 'request_active' event every 15 seconds (configurable), for as long
as the request is running.

A second event could be of type 'request_timeout'. This could be triggered for
each request that is still active when the process is being shut down due to
request-timeout expiring for the process.

From the event handler for either of these you could log any information you
want to. The only hard bit for me is that the mod_wsgi.request_data() call,
which provides access to a per-request dictionary where you can stash data, is
currently based on thread locals. Calling it from these two events wouldn't
work, because the handler isn't called from the request thread but from a
separate thread. I had refrained from passing it as an explicit argument to
the event handler for reasons I can no longer remember. Otherwise
mod_wsgi.request_data() would need to know it is being called for one of these
special cases and work out the cache of request data another way, based on
knowing which request I am working with when triggering the event.
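
The thread-local problem can be shown in plain Python. This is only an
illustration and not how mod_wsgi is implemented: request_data() here is a
stand-in built on threading.local, and begin_request(), end_request() and
snapshot_active() are hypothetical helpers showing one possible alternative,
a registry keyed by thread ident that a separate monitoring thread can read.

```python
import threading

_local = threading.local()       # stand-in for thread-local request data
_active = {}                     # alternative: thread ident -> request dict
_active_lock = threading.Lock()


def request_data():
    """Return the per-request dict for the calling thread only."""
    data = getattr(_local, 'data', None)
    if data is None:
        data = _local.data = {}
    return data


def begin_request(uri):
    """Record request details and register them for other threads to see."""
    data = request_data()
    data['uri'] = uri
    with _active_lock:
        _active[threading.get_ident()] = data


def end_request():
    """Forget the calling thread's request."""
    with _active_lock:
        _active.pop(threading.get_ident(), None)
    if hasattr(_local, 'data'):
        del _local.data


def snapshot_active():
    """Copy the data of all active requests; safe to call from any thread."""
    with _active_lock:
        return {ident: dict(data) for ident, data in _active.items()}
```

Called from a monitoring thread, request_data() returns an empty dictionary,
since thread locals belong to the calling thread, whereas snapshot_active()
still sees what the request thread stored.
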
>> If you are logging thread ID in access log, then setting thread ID to
>> request ID and attaching it to the traceback sounds reasonable.
>
> I am looking for being able to easily identify my threads in a
> "request-timeout" traceback dump when I have like 130 threads running.
> They are nicely labeled in my code, but the mod_wsgi traceback dump
> doesn't show the "name" field, but opaque and uninformative "thread.ident".
>
> I have a "futures._WorkItem" overloaded to accept an extra "thread_name"
> parameter in the "futures.executor.submit()" so I can annotate threads
> doing background work.
>
> Now that I am using "request-timeout" traceback dumps, I would love to
> have all that information available. Just dump "thread.name" if
> available, instead of "thread.ident" :-).
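
As a rough sketch of the mapping involved (plain Python, not what mod_wsgi
does internally), thread idents from sys._current_frames() can be matched to
names through threading.enumerate(). The function name format_thread_stacks
is made up for this example.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a stack dump that labels each thread by name as well as ident."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for ident, frame in sys._current_frames().items():
        # Threads unknown to the threading module fall back to a placeholder.
        name = names.get(ident, '<unknown>')
        lines.append('Thread %s (ident %d):' % (name, ident))
        lines.extend(l.rstrip('\n') for l in traceback.format_stack(frame))
    return '\n'.join(lines)
```

Threads that were not created through the threading module only show up in
threading.enumerate() once Python code running in them has called
threading.current_thread(), so some entries may have no name to show.
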
Graham