> On 8 Feb 2018, at 1:08 pm, Jesus Cea <[email protected]> wrote:
> 
> On 08/02/18 02:39, Graham Dumpleton wrote:
>> What are you setting the thread name to?
> 
> Beside the mod_wsgi threads running "application()", my code creates
> tons of long term threads like "cache_cleanup",
> "periodic_cache_flush_to_disk", "map generation workers", "audio
> transcoding", etc.
> 
>> https://github.com/GrahamDumpleton/mod_wsgi/issues/160
> 
> Uhm, first thing in "application()" could be a "thread.name=URI", and
> the "finally" statement could be "thread.name='idle'", or in the
> "close()" code of the iterator returned.
> 
> This looks like a pattern for a near trivial middleware.

Using a WSGI middleware for that is a bad idea because of the complexity of 
implementing a WSGI middleware that properly bounds the full execution of the 
code involved in all parts of handling the request. The better way is to use 
the event system in mod_wsgi to be notified of the start and end of the 
request. This is more efficient and you don't need to wrap the WSGI application 
entry point.

import mod_wsgi
import threading

def event_handler(name, **kwargs):
    cache = mod_wsgi.request_data()
    thread = threading.current_thread()

    if name == 'request_started':
        cache['original_thread_name'] = thread.name
        environ = kwargs['request_environ']
        thread.name = environ['REQUEST_URI']

    elif name == 'request_finished':
        thread.name = cache['original_thread_name']

mod_wsgi.subscribe_events(event_handler)

I don't think overriding the thread name with a request URI is a good idea here 
though. I think it would be better to have mod_wsgi set it based on the 
existing request ID that Apache generates as that then matches what you can log 
for the request in the access log.
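
As a rough sketch of that variation, and only under some assumptions: this relies on Apache's mod_unique_id being loaded so that the WSGI environ carries a UNIQUE_ID key, and it falls back to REQUEST_URI when it isn't. Stashing the old name in a threading.local (the `_saved` object) and the guarded import are choices made just for this illustration so the handler can also be exercised outside of Apache; they are not something mod_wsgi requires.

```python
import threading

try:
    import mod_wsgi  # only fully functional inside a mod_wsgi process
except ImportError:
    mod_wsgi = None

# Illustration only: per-thread storage for the name we replace.
_saved = threading.local()

def event_handler(name, **kwargs):
    thread = threading.current_thread()

    if name == 'request_started':
        environ = kwargs['request_environ']
        _saved.original_thread_name = thread.name
        # UNIQUE_ID is set by mod_unique_id (assumed to be loaded);
        # otherwise fall back to the request URI, then the old name.
        thread.name = (environ.get('UNIQUE_ID')
                       or environ.get('REQUEST_URI', thread.name))

    elif name == 'request_finished':
        original = getattr(_saved, 'original_thread_name', None)
        if original is not None:
            thread.name = original
            del _saved.original_thread_name

# Subscribe only when the mod_wsgi event API is actually available.
subscribe = getattr(mod_wsgi, 'subscribe_events', None)
if subscribe is not None:
    subscribe(event_handler)
```

If you also log %{UNIQUE_ID}e in the access log format, the thread name seen in a dump can then be matched back to the access log entry.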

Overall what is probably a better approach is for me to extend the event 
mechanism in a couple of ways.

The first is to add a new event type of 'request_active'. In mod_wsgi I could 
have a default reporting interval to pick up on long running requests and 
generate a 'request_active' event every 15 seconds (configurable), so long as 
the request is running.

A second event could be of type 'request_timeout'. This could be triggered for 
each request, specifically for the case where there are active requests when 
the process is being shut down due to request-timeout expiring for the process.

From the event handler for either of these you could log any information you 
want to. The only hard bit for me is that the mod_wsgi.request_data() call, 
which provides access to a per request dictionary where you can stash data, is 
currently based on thread locals. Calling it in these two events wouldn't work, 
as the handler isn't being called from the request thread, but from a separate 
thread. I had refrained from passing the dictionary as an explicit argument to 
the event handler for reasons I can't now remember. Otherwise 
mod_wsgi.request_data() would need to know it is being called for these special 
cases and look up the cache of request data another way, based on knowing which 
request I am working with when triggering the event.
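
To illustrate the thread local problem in plain Python, here is a small sketch. The request_data() function below is a hypothetical stand-in written for this example, not the mod_wsgi implementation; it only mimics the idea of a per request dictionary kept in a thread local.

```python
import threading

_local = threading.local()

def request_data():
    # Hypothetical stand-in: one dictionary per calling thread.
    if not hasattr(_local, 'data'):
        _local.data = {}
    return _local.data

def seen_from_other_thread():
    # Call request_data() from a separate thread, the way a monitor
    # thread generating events would, and report what it sees.
    result = {}

    def monitor():
        result['data'] = dict(request_data())

    worker = threading.Thread(target=monitor, name='monitor')
    worker.start()
    worker.join()
    return result['data']

# The "request thread" stashes something ...
request_data()['request_id'] = 'abc123'

print(request_data())            # {'request_id': 'abc123'}
print(seen_from_other_thread())  # {}
```

The second call comes back empty because the monitor thread gets its own thread local storage, which is why the dictionary would have to be passed in explicitly or looked up by request some other way.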

>> If you are logging thread ID in access log, then setting thread ID to
>> request ID and attaching it to the traceback sounds reasonable.
> 
> I am looking for being able to easily identify my threads in a
> "request-timeout" traceback dump when I have like 130 threads running.
> They are nicely labeled in my code, but the mod_wsgi traceback dump
> doesn't show the "name" field, but opaque and uninformative "thread.ident".
> 
> I have a "futures._WorkItem" overloaded to accept an extra "thread_name"
> parameter in the "futures.executor.submit()" so I can annotate threads
> doing background work.
> 
> Now that I am using "request-timeout" traceback dumps, I would love to
> have all that information available. Just dump "thread.name" if
> available, instead of "thread.ident" :-).
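
Until the dump itself shows names, something along these lines can be run from your own code to get a labelled dump. It is only a sketch built on the standard library (sys._current_frames() and threading.enumerate()), not what mod_wsgi does internally, and format_thread_stacks is a name made up for the example.

```python
import sys
import threading
import traceback

def format_thread_stacks():
    # Map thread idents to the names set through the threading module.
    names = {t.ident: t.name for t in threading.enumerate()}

    lines = []
    for ident, frame in sys._current_frames().items():
        # Threads not created via the threading module have no name.
        name = names.get(ident, '<unknown>')
        lines.append('Thread %s (name=%r):' % (ident, name))
        lines.extend(entry.rstrip('\n')
                     for entry in traceback.format_stack(frame))
    return '\n'.join(lines)
```

Calling print(format_thread_stacks()) from a debugging hook or a signal handler would then list each thread with both its ident and its name.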


Graham
