On 2013-06-12 10:46, Alex Bligh wrote:
I think I've finally figured out what's going wrong in my module, but I am unsure what to do about it.

The module runs on Apache 2.2.22 with the prefork MPM. Occasionally I am seeing corruption of the output bucket brigade, primarily the ring pointers (link->next and link->prev) ending up with strange values.

Having spent some time reading the source, I believe that Apache provides no protection for these pointers, and that it is inherently unsafe for a bucket brigade to be used by more than one thread (even if you are careful with allocators) unless all callers provide their own mutex protection. As Apache itself uses the output bucket brigade without mutex protection, the output bucket brigade can never be written to by other threads, and therefore ap_fwrite (to this brigade) can never be safely called by any thread other than the main thread.

First question: is this correct?

My module is currently structured as follows. The main thread creates another thread for each request (the requests are long-running websocket connections).

The main thread does the following:

    while (!done) {
        /* Blocking read */
        apr_brigade_create;
        ap_get_brigade;
        apr_brigade_flatten;
        /* Do stuff with the data */
        blocking_socket_write;
    }

The spawned thread does the following:

    while (!done) {
        blocking_socket_read;
        /* do stuff with the data */
        ap_fwrite(output_bucket_brigade);
    }

Now, what I believe is happening is as follows. The blocking read in the main thread at some point calls select(), and does not only do a read, but also a write of the data in the output bucket brigade. This removes a bucket from the ring. If this happens at the same time as the ap_fwrite in the spawned thread adds something to the output ring, two threads will be accessing the ring pointers at once.

What I can't figure out is how to fix this. I can't put in a mutex to protect the ring pointers, because the access to the ring pointers by Apache is outside of my module. I can't hold a mutex across the blocking read in the main thread, because then my module would be unable to write data to the output bucket brigade while there is no input from the Apache client; as the client may itself be waiting for data to be sent to it, this could cause deadlock. And I can't see an obvious way to do the read in a non-blocking way.

Any ideas?

If I understand correctly, the main thread belongs to your module; that is, the loop you show is your own code and not a condensed pseudo-code of the request processing in Apache's code.
I don't see where the output brigade appears in the main thread. I think this is critical, as the output_bucket_brigade is the data item shared between the two threads. ap_get_brigade triggers the execution of the chain of input filters. Does one of these input filters write to the output brigade?
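
For comparison, here is a minimal sketch of the alternative structure, in which the spawned thread never touches the brigade and instead hands its data to the main thread through a small mutex-protected queue. It uses plain pthreads rather than APR so that it stands alone. All names in it (handoff_queue, handoff_init, handoff_push, handoff_pop, MSG_MAX, QUEUE_CAP) are hypothetical and are not part of Apache or APR. It is only an illustration of the single-owner idea, and it does not solve the remaining problem of waking the main thread out of its blocking read.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MSG_MAX   256  /* hypothetical per-message size limit */
#define QUEUE_CAP 16   /* hypothetical queue depth */

/* Hypothetical hand-off queue shared by the two threads.
 * The mutex protects only this queue, never the brigade itself. */
typedef struct {
    pthread_mutex_t lock;
    char   data[QUEUE_CAP][MSG_MAX];
    size_t len[QUEUE_CAP];
    size_t head;   /* index of the oldest message */
    size_t count;  /* number of queued messages */
} handoff_queue;

void handoff_init(handoff_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = 0;
    q->count = 0;
}

/* Called by the spawned thread in place of ap_fwrite.
 * Returns 0 on success, -1 if the queue is full. */
int handoff_push(handoff_queue *q, const char *buf, size_t len)
{
    int rc = -1;
    assert(len <= MSG_MAX);
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        size_t tail = (q->head + q->count) % QUEUE_CAP;
        memcpy(q->data[tail], buf, len);
        q->len[tail] = len;
        q->count++;
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Called by the main thread, which would then write the message to
 * the output brigade itself. Returns 1 if a message was copied into
 * out (which must hold MSG_MAX bytes), 0 if the queue was empty. */
int handoff_pop(handoff_queue *q, char *out, size_t *out_len)
{
    int got = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        memcpy(out, q->data[q->head], q->len[q->head]);
        *out_len = q->len[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        got = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return got;
}
```

With this arrangement the lock is held only for the duration of a memcpy, never across the blocking read, and the brigade has a single owner.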
Sorin
