Hi,

yesterday I spent a few hours looking at the peruser source code, trying to get a basic understanding of it. The idea behind peruser is great and I'm still wondering why it (or something similar) is not the default MPM for Apache. Obviously the task isn't easy, otherwise per_child wouldn't have died and peruser would be backed by many more developers. Given that, many thanks to Sean Gabriel for taking the module as far as it is. Now that I've read the source code I have at least a vague idea of how much work this must have been. Good work.
I also looked at the SSL problem, which seems to be the most pressing one now that a workaround for the IP logging is in place (I cannot reproduce that one, BTW). With the Apache log level set to debug and full peruser debugging turned on, the problem is easy to spot: the request is processed twice by mod_ssl. The first time it is decoded in the multiplexer. Then the request is passed on to a worker/processor. As soon as the worker accepts the request, mod_ssl jumps in again and tries to decode the already decoded request. Obviously this doesn't work. mod_ssl outputs an error message saying that plain HTTP was spoken on an HTTPS port and exits. This error message is never displayed by the web browser because the browser is still in SSL mode, as set up by the multiplexer.

I see three approaches to fix the problem:

1) We could do all SSL decoding in the multiplexer and then pass the decoded request on to the worker. This is a potential performance bottleneck, but that could be solved later on by using multiple multiplexers. SSL processing has to be disabled in the processors for this approach to work. Luckily mod_ssl offers an optional function (ssl_engine_disable) which peruser can acquire and use at runtime. I tried this approach but was not able to get it working. Although the processing of the document seems to work, the data returned to the client is never encoded. I had to learn that the multiplexer only accepts the data and is not involved in actually returning the response to the client. Still, this approach looks somewhat promising: if we were able to enable only the SSL output filter on the processor, we might get this working. But yes, this would be a bit messy and would require changes to mod_ssl.

2) We could disable SSL processing on the multiplexer. I didn't expect this approach to work at all, but with the code from the first approach in place it was basically a one-character change. My expectation was that the request wouldn't be passed on at all, since pass_request() works on an already parsed request, which obviously isn't possible while the data is still encoded. To my big surprise the data was passed to a processor and mod_ssl started to process it. The problem now is that SSL/TLS is an interactive protocol. Unlike HTTP, where a single request is sent and a single answer is sent back, the TLS handshake involves multiple steps. But the multiplexer simply reads the data from the client, passes it to the processor and closes the connection. So there is no way for the SSL module in the processor to start the two-way protocol with the client. Again, pretty close but still not there. We could try to hack the multiplexer to relay all communication between the client and the processor, but this would again be a bottleneck and might open the way for a simple DoS attack.

3) The most promising approach I see would be to extend peruser to pass the input socket to the processor without touching it. Since SSL doesn't allow name-based virtual hosting, the decision which vhost should handle the request is easy to make: the IP address and the port number are the only values involved, and both can be taken from the socket itself. There is no need to read any data. We only need to read the headers in the non-SSL case, and then only if name-based virtual hosting is activated for the given IP/port. Besides allowing us to support SSL, this will also reduce the workload of the multiplexer for normal HTTP servers and thus make peruser more scalable. I did a very crude hack and indeed it works ;) Currently it's a hack, nothing more. There is a hard-coded check for port 443 and it always selects the first available processor, regardless of which server environment has been configured for a given vhost. It might also have some memory leaks. I don't know the APR framework in detail, so this might need some work, too.
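To make the idea in 3) a bit more concrete, here is a minimal sketch of the hand-off. It is not peruser code: peruser is written in C on top of APR, while this uses Python 3.9+ on a Unix system only because it is short and easy to run. All function names (pick_processor, hand_off, take_over, demo) and the b"client hello" payload are made up for illustration; the payload merely stands in for the first bytes a TLS client would send. The "multiplexer" picks a processor from the local IP/port of the accepted socket and passes the descriptor on without reading a single byte, so the "processor" sees the unread stream and can answer the client directly.

```python
import socket

def pick_processor(conn, table):
    """Choose a processor channel from the local IP/port only (no payload read)."""
    ip, port = conn.getsockname()[:2]
    return table[(ip, port)]

def hand_off(conn, channel):
    """Multiplexer side: send the untouched client descriptor over a Unix socket."""
    socket.send_fds(channel, [b"x"], [conn.fileno()])
    conn.close()  # the receiver holds its own duplicate of the descriptor

def take_over(channel):
    """Processor side: receive the descriptor and wrap it in a socket object."""
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 1, 1)
    return socket.socket(fileno=fds[0])

def read_exactly(sock, n):
    """Read exactly n bytes, or fewer if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            break
        buf += chunk
    return buf

def demo():
    payload = b"client hello"            # stand-in for the start of a handshake
    answer = b"reply from processor"

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # ephemeral port instead of 443
    listener.listen(1)
    addr = listener.getsockname()

    # One Unix socket pair per processor; here there is only one processor.
    mux_end, proc_end = socket.socketpair()
    table = {addr: mux_end}              # (ip, port) -> channel to that processor

    client = socket.create_connection(addr)
    client.sendall(payload)

    conn, _ = listener.accept()
    hand_off(conn, pick_processor(conn, table))   # nothing read from conn

    worker_conn = take_over(proc_end)
    data = read_exactly(worker_conn, len(payload))
    worker_conn.sendall(answer)          # processor talks to the client directly
    reply = read_exactly(client, len(answer))

    for s in (worker_conn, client, mux_end, proc_end, listener):
        s.close()
    return data, reply
```

Because the processor owns the client socket after the hand-off, a multi-step exchange like the TLS handshake would no longer have to be relayed through the multiplexer. The sketch runs everything in one process for brevity; with real processors the Unix socket pair would be shared with separate child processes.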
I'll prepare a clean patch and send it around later on.

Stefan
