Grisha wrote ..
> > As an example of an Apache module that uses output filters to do
> > stuff, there is mod_cache. Luckily in that case, a HEAD request is one
> > of various cases where mod_cache decides it will not use the output.
> > This does not mean though that some other output filter that someone is
> > using might expect content to be there for HEAD.
>
> I am not sure I agree with this explanation (while I do agree that the
> behaviour should be reverted, but for different reason, see below). The
> RFC is pretty clear on HEAD: "The HEAD method is identical to GET except
> that the server MUST NOT return a message-body in the response.", so for
> a filter (or anything) to expect the content with HEAD is wrong.

That is ignoring what filters can do, including the fact that they can
add additional output headers. If a filter can add additional headers,
the fact that it doesn't receive content can be a problem, because the
headers it adds may in some way depend on the content. In effect this is
partly what your quote from the http-dev list is about.

  * All handlers should always send content down even if r->header_only
    is set. If not, it means that the HEAD requests don't generate the
    same headers as a GET which is wrong.

I'll grant that I may be partly talking about theoretical cases here,
but the inbuilt Apache CONTENT_LENGTH output filter can be used to add
the content length to a response, provided the output filter is able to
consume the whole content the first time it is called. This is a case
where, if mod_python.publisher didn't return the content for a HEAD
request, the content length calculated by the CONTENT_LENGTH output
filter would be different for the HEAD request compared to the same
request as a GET.

Now if I can just work out why, when I add CONTENT_LENGTH as an output
filter in conjunction with mod_python.publisher, I don't get a content
length at all, I might have a leg to stand on. :-)

I note that if I have a Python based output filter, a single req.write()
generates two calls into the output filter. The first ends because of
read returning an empty string, with the second reading None straight
away.

[Thu Jan 05 08:49:00 2006] [error] handler_1
[Thu Jan 05 08:49:00 2006] [error] uppercase
[Thu Jan 05 08:49:00 2006] [error] is_input 0
[Thu Jan 05 08:49:00 2006] [error] read 'CONTENT'
[Thu Jan 05 08:49:00 2006] [error] write 'CONTENT'
[Thu Jan 05 08:49:00 2006] [error] read ''
[Thu Jan 05 08:49:00 2006] [error] uppercase
[Thu Jan 05 08:49:00 2006] [error] is_input 0
[Thu Jan 05 08:49:00 2006] [error] read None
[Thu Jan 05 08:49:00 2006] [error] close

For CONTENT_LENGTH to work, I would imagine it should be getting the
equivalent of the None read at the end of the first call into the output
filter. I wonder if this is somehow specifically to do with how
mod_python manages the bucket brigade. It would be unfortunate if
mod_python is doing something a bit strange which would prevent the
CONTENT_LENGTH output filter being used.

Wait, I worked it out. Two calls into the output filter are being seen
because req.write() flushes the output unless it is told not to. Thus,
if req.write() is called as:

  req.write(result,0)

it all works and the content length is added by the CONTENT_LENGTH
output filter.

Thus with that change, one can see the problem with not outputting
content when HEAD is used. Namely, no content length header is
generated.

~ [505]$ telnet localhost 8080
Trying ::1...
Connected to localhost.
Escape character is '^]'.
GET /~grahamd/content_length/example.py HTTP/1.0

HTTP/1.1 200 OK
Date: Wed, 04 Jan 2006 21:59:39 GMT
Server: Apache/2.0.55 (Unix) mod_python/3.2.6-dev-20051229 Python/2.3
Content-Length: 7
Connection: close
Content-Type: text/plain

CONTENTConnection closed by foreign host.

~ [506]$ telnet localhost 8080
Trying ::1...
Connected to localhost.
Escape character is '^]'.
HEAD /~grahamd/content_length/example.py HTTP/1.0

HTTP/1.1 200 OK
Date: Wed, 04 Jan 2006 21:59:53 GMT
Server: Apache/2.0.55 (Unix) mod_python/3.2.6-dev-20051229 Python/2.3
Connection: close
Content-Type: text/plain

Connection closed by foreign host.

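
To make the two effects above concrete, here is a rough toy model in plain Python. It is only a sketch under stated assumptions: handler_calls, content_length_filter and respond are made-up names, not Apache or mod_python APIs, and the filter is simplified to "add a length only if the first call into it carries the whole body", which is my reading of the behaviour observed above rather than a description of the real CONTENT_LENGTH implementation.

```python
# Toy model only: these names are hypothetical, not Apache or mod_python APIs.

BODY = b"CONTENT"


def handler_calls(method, send_body_on_head=True, flush=False):
    """Return one (data, end_of_stream) pair per call into the output filter."""
    if method == "HEAD" and not send_body_on_head:
        return []  # handler wrote nothing, so the filter never sees content
    if flush:
        # Flushed write: data arrives first, end of stream in a second call.
        return [(BODY, False), (b"", True)]
    # Unflushed write: data and end of stream arrive in the same call.
    return [(BODY, True)]


def content_length_filter(headers, calls):
    """Add Content-Length only if the first call carries the whole body."""
    headers = dict(headers)
    if calls and calls[0][1]:
        headers["Content-Length"] = str(len(calls[0][0]))
    return headers


def respond(method, send_body_on_head=True, flush=False):
    """Return (headers, body) as the client would see them."""
    calls = handler_calls(method, send_body_on_head, flush)
    headers = content_length_filter({"Content-Type": "text/plain"}, calls)
    body = b"".join(data for data, _ in calls)
    if method == "HEAD":
        body = b""  # the server never sends a message body for HEAD
    return headers, body


if __name__ == "__main__":
    print(respond("GET"))
    print(respond("HEAD"))
    print(respond("HEAD", send_body_on_head=False))
    print(respond("GET", flush=True))
```

In this model a HEAD request only gets the same headers as the GET when the handler still sends the content down, and a flushed write loses the length for both methods, which mirrors the two observations above.
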
Either way, we agree that mod_python.publisher should still output
content for HEAD.

I would also propose as a change that the req.write() call not cause
output to be flushed to allow an output filter like CONTENT_LENGTH to be
used. I'll add a new JIRA issue for that.

Graham