Grisha wrote ..
> > As an example of an Apache module that uses output filters to do
> > stuff, there is mod_cache. Luckily in that case, a HEAD request is one
> > of various cases where mod_cache decides it will not use the output.
> > This does not mean though that some other output filter that someone
> > is using might expect content to be there for HEAD.
> 
> I am not sure I agree with this explanation (while I do agree that the
> behaviour should be reverted, but for a different reason, see below). The
> RFC is pretty clear on HEAD: "The HEAD method is identical to GET except
> that the server MUST NOT return a message-body in the response.", so for
> a filter (or anything) to expect the content with HEAD is wrong.

That ignores what filters can do, including the fact that they can add
output headers. If a filter adds headers, not receiving the content can
be a problem, because the headers it adds may in some way depend on
the content.

In effect this is partly what your quote from the http-dev list is about.

* All handlers should always send content down even if r->header_only
   is set.  If not, it means that the HEAD requests don't generate the
   same headers as a GET which is wrong.

I'll grant that I may be talking about partly theoretical cases here, but
the inbuilt Apache CONTENT_LENGTH output filter can be used to add the
content length to a response, provided the output filter is able to
consume the whole content the first time it is called.

This is a case where, if mod_python.publisher didn't return the
content for a HEAD request, the content length calculated by the
CONTENT_LENGTH output filter would differ between the HEAD
request and the same request made as a GET.

Now if I can just work out why I don't get a content length at all
when I add CONTENT_LENGTH as an output filter in conjunction with
mod_python.publisher, I might have a leg to stand on. :-)

I note that with a Python based output filter, a single req.write()
generates two calls into the output filter. The first ends when read
returns an empty string; the second reads None straight away.

[Thu Jan 05 08:49:00 2006] [error] handler_1

[Thu Jan 05 08:49:00 2006] [error] uppercase
[Thu Jan 05 08:49:00 2006] [error] is_input 0
[Thu Jan 05 08:49:00 2006] [error] read 'CONTENT'
[Thu Jan 05 08:49:00 2006] [error] write 'CONTENT'
[Thu Jan 05 08:49:00 2006] [error] read ''

[Thu Jan 05 08:49:00 2006] [error] uppercase
[Thu Jan 05 08:49:00 2006] [error] is_input 0
[Thu Jan 05 08:49:00 2006] [error] read None
[Thu Jan 05 08:49:00 2006] [error] close

For CONTENT_LENGTH to work, I would imagine it should be getting the
equivalent of the None read at the end of the first call into the output
filter.
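
To make the difference concrete, here is a minimal simulation of the
trace above. StubFilter is a hypothetical stand-in, not mod_python's
real filter object; it only assumes that read() returns '' when the
current invocation has no more data and None once end-of-stream has
been seen. The uppercase function follows the read/write/close steps
visible in the log.

```python
class StubFilter(object):
    """Hypothetical stand-in for mod_python's filter object (an assumption)."""

    def __init__(self, chunks, eos, log):
        self.chunks = list(chunks)  # data delivered with this invocation
        self.eos = eos              # True if end-of-stream arrives with it
        self.is_input = 0           # 0 marks an output filter, as in the log
        self.log = log

    def read(self):
        if self.chunks:
            data = self.chunks.pop(0)
        elif self.eos:
            data = None             # end of stream
        else:
            data = ''               # nothing more in this invocation
        self.log.append('read %r' % (data,))
        return data

    def write(self, data):
        self.log.append('write %r' % (data,))

    def close(self):
        self.log.append('close')


def uppercase(filter):
    # Same sequence of steps as in the error log: read, write, close.
    filter.log.append('uppercase')
    filter.log.append('is_input %d' % filter.is_input)
    s = filter.read()
    while s:
        filter.write(s.upper())
        s = filter.read()
    if s is None:
        filter.close()


def run(invocations):
    """Call the filter once per (chunks, eos) pair and collect the log."""
    log = []
    for chunks, eos in invocations:
        uppercase(StubFilter(chunks, eos, log))
    return log


# Data and end-of-stream delivered separately, as in the log above.
two_calls = run([(['CONTENT'], False), ([], True)])

# Data and end-of-stream delivered together, which is what a filter
# that needs the whole content in its first call would want to see.
one_call = run([(['CONTENT'], True)])
```

In the first case the stub reproduces the two logged invocations; in
the second, the None read arrives at the end of the only call.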

I wonder if this is somehow specifically to do with how mod_python
manages the bucket brigade. It would be unfortunate if mod_python
is doing something a bit strange which would prevent the
CONTENT_LENGTH output filter being used.

Wait, I worked it out. Two calls are being seen into the output filter
because req.write() flushes the output unless told not to. Thus, if
req.write() is called as:

  req.write(result,0)

it all works and content length is added by the CONTENT_LENGTH
output filter.
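
As a rough sketch of what such a handler might look like (this is not
the actual example.py used for the tests, and the content string is
just taken from the log above):

```python
try:
    from mod_python import apache
except ImportError:
    # Outside Apache the real module is unavailable; this stand-in is
    # only here so the sketch can be imported and exercised on its own.
    class apache(object):
        OK = 0


def handler(req):
    req.content_type = 'text/plain'
    # Second argument 0 asks req.write() not to flush, so the content
    # is not pushed down the output filter chain on its own.
    req.write('CONTENT', 0)
    return apache.OK
```

The CONTENT_LENGTH output filter still has to be enabled for the
location in the Apache configuration; that part is not shown here.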

Thus with that change, one can see the problem with not outputting
content when HEAD is used. Namely, no content length header is
generated.

~ [505]$ telnet localhost 8080
Trying ::1...
Connected to localhost.
Escape character is '^]'.
GET /~grahamd/content_length/example.py HTTP/1.0

HTTP/1.1 200 OK
Date: Wed, 04 Jan 2006 21:59:39 GMT
Server: Apache/2.0.55 (Unix) mod_python/3.2.6-dev-20051229 Python/2.3
Content-Length: 7
Connection: close
Content-Type: text/plain

CONTENTConnection closed by foreign host.


~ [506]$ telnet localhost 8080
Trying ::1...
Connected to localhost.
Escape character is '^]'.
HEAD /~grahamd/content_length/example.py HTTP/1.0

HTTP/1.1 200 OK
Date: Wed, 04 Jan 2006 21:59:53 GMT
Server: Apache/2.0.55 (Unix) mod_python/3.2.6-dev-20051229 Python/2.3
Connection: close
Content-Type: text/plain

Connection closed by foreign host.
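
The same comparison can be scripted rather than done by hand with
telnet. The following is only a sketch using Python's standard
http.client module; the function names are made up for illustration.

```python
import http.client


def fetch_headers(method, host, port, path):
    """Return the response headers for one request, keyed by lowercase name."""
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        response.read()  # drain any body before closing the connection
        return {name.lower(): value for name, value in response.getheaders()}
    finally:
        conn.close()


def missing_from_head(get_headers, head_headers):
    """Names of headers that the GET response has but the HEAD response lacks."""
    return sorted(set(get_headers) - set(head_headers))
```

Calling fetch_headers() once with 'GET' and once with 'HEAD' for the
same host, port and path, then passing both results to
missing_from_head(), lists whatever the HEAD response left out. For the
two responses shown above that would be just content-length.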

Either way, we agree that mod_python.publisher should still output
content for HEAD.

I would also propose as a change that the req.write() call not cause
output to be flushed to allow an output filter like CONTENT_LENGTH
to be used. I'll add a new JIRA issue for that.

Graham
