On Sep 15, 2009, at 10:44 PM, Paul Davis wrote:

Regardless of browser support, the first question should always be
whether we can avoid hacks specific to a user agent. Unless you can
show that there's a case where it's absolutely impossible for a
significant user agent to configure itself to work properly I would be
at least a -0 on this.



I've seriously researched the issue, since it is critical in my use of CouchDB, as apparently it was for the initial filer of the bug. IE can be configured to immediately treat content as expired; however, requiring users to change their advanced connection options in IE (which they may by policy be prevented from doing) or to use a different browser is a very good way of having a client decide that CouchDB is not an acceptable platform. I hate having to deploy a branched CouchDB, but that is what I'm doing now.

You can add bogus random query parameters to interfere with caching; however, that comes at the cost of losing all caching, and, at least at the time the bug was originally posted, some of the endpoints did not accept unrecognized parameters.
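For anyone who wants to try the query-parameter workaround, here is a minimal sketch of what I mean. The parameter name `_nocache` and the example URL are placeholders of mine, not anything CouchDB defines, and an endpoint that rejects unrecognized parameters will reject this as well.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url: str, param: str = "_nocache") -> str:
    """Return url with a random throwaway query parameter appended.

    The parameter name is hypothetical; any name the server ignores would do.
    """
    parts = urlsplit(url)
    # Keep the existing query parameters in order, then add the random one.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, secrets.token_hex(8)))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical database and document names.
    print(add_cache_buster("http://localhost:5984/db/doc?rev=1-abc"))
```

Every call produces a distinct URL, so the browser never reuses a cached entry, which is exactly why the approach gives up conditional requests and 304 responses along with the stale reads.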


Also, the spec fairly explicitly states:

The format is an absolute date and time as defined by HTTP-date in
section 3.3.1; it MUST be in RFC 1123 date format

And

To mark a response as "already expired," an origin server sends an
Expires date that is equal to the Date header value.

That pretty much says that neither a random historical date nor a value
of 0 is ever good. The place where it does mention 0:

HTTP/1.1 clients and caches MUST treat other invalid date formats,
especially including the value "0", as in the past (i.e., "already
expired").

Here it seems that the spec is specifically saying that this is A Bad
Thing ™ so much that it went out of its way to specify the error
condition. That's not the same as saying "it's ok to send 0, any other
invalid date, or a date in the past".


The spec appears to be walking a fine line with respecting behavior of some HTTP 1.0 caches that treated expires in the past as equivalent to no-cache. See section 14.9.3, where HTTP 1.1 caches are instructed to assume no-cache if they see an Expires date in the past without a Cache-Control header. CouchDB does send a Cache-Control, so we would not be affected by that.

Within RFC 2616, the description of the treatment of "Expires 0", the note in section 14.9.3, and section 14.18.1 all seem to acknowledge the use of an Expires date long in the past. RFC 2109 contains this passage, which indicates that, at least in 1997, sending a fixed Expires date in the past was a common pattern:

HTTP/1.1 servers must send Expires: old-date (where old-date is a date long in the past) on responses containing Set-Cookie response headers unless they know for certain (by out of band means) that there are no downstream HTTP/1.0 proxies. HTTP/1.1 servers may send other Cache-Control directives that permit caching by HTTP/1.1 proxies in addition to the Expires: old-date directive; the Cache-Control directive will override the Expires: old-date for HTTP/1.1 proxies.


The HTTP spec is pretty clear that any bad date value should be interpreted as in the past (which is what we want). It seems farfetched that any reasonable client would do the wrong thing with an old date.
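To make the two readings concrete, here is a rough Python sketch of the rules quoted above: building an "already expired" response where Expires equals Date in RFC 1123 format, and treating an unparseable Expires value such as "0" as in the past. The function names are mine, and the check deliberately ignores Cache-Control, so it is a simplification and not a full cache implementation.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def already_expired_headers(now=None):
    """Headers for an 'already expired' response: Expires equal to Date."""
    now = now or datetime.now(timezone.utc)
    stamp = format_datetime(now, usegmt=True)  # RFC 1123 format, GMT
    return {"Date": stamp, "Expires": stamp}


def _parse_http_date(value):
    """Parse an HTTP date; assume UTC if the value carries no usable zone."""
    parsed = parsedate_to_datetime(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_expired(expires_value: str, date_value: str) -> bool:
    """True if the response is expired at the moment given by its Date header.

    Simplification: only Expires and Date are considered, not Cache-Control.
    """
    date = _parse_http_date(date_value)
    try:
        expires = _parse_http_date(expires_value)
    except (TypeError, ValueError):
        # Invalid formats, including "0", count as already expired.
        return True
    return expires <= date


if __name__ == "__main__":
    headers = already_expired_headers()
    print(headers)
    print(is_expired("0", headers["Date"]))
    print(is_expired("Thu, 01 Jan 1970 00:00:00 GMT", headers["Date"]))
```

Under these rules Expires == Date, an old fixed date, and "0" all end up in the same state, which is the point I'm making: a client following the spec has no way to do the wrong thing with an old date.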

In my testing (as mentioned in the bug report), max-age=0 was insufficient to prevent stale responses if subsequent reads occurred within the same second. I have not tested Expires == Date, but I think some clients might still leave a window in which you can get stale data. Setting max-age=0 is clearly within the spec and would likely reduce how often users hit conflicts caused by stale data, but it would not be sufficient to get the unit tests to work unless you added explicit waits.
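I can't see inside any browser's cache, so the following is only a guess at how a same-second window could arise, not a description of what IE actually does. RFC 2616 section 13.2.4 defines a response as fresh when freshness_lifetime > current_age, with ages counted in whole seconds. A hypothetical client that used >= instead would treat a max-age=0 response as fresh for the remainder of the second in which it was received.

```python
def fresh_per_rfc2616(max_age: int, age_seconds: int) -> bool:
    """RFC 2616 section 13.2.4: fresh only if the lifetime exceeds the age."""
    return max_age > age_seconds


def fresh_hypothetical_inclusive(max_age: int, age_seconds: int) -> bool:
    """Hypothetical off-by-one client: age equal to lifetime is still fresh."""
    return max_age >= age_seconds


if __name__ == "__main__":
    # A response with max-age=0, re-read 0 and then 1 whole seconds later.
    for age in (0, 1):
        print(age,
              fresh_per_rfc2616(0, age),
              fresh_hypothetical_inclusive(0, age))
```

With the spec's rule, max-age=0 is stale immediately. With the inclusive comparison it is served from cache until the clock ticks over, which would match the symptom I saw and would explain why explicit waits make the tests pass.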



Also I looked around the spec for a bit trying to find a logical
progression for when Expires applies vs ETag but couldn't find
anything. Though also importantly I didn't see a "Clients are free to
use a heuristic in the absence of this header" clause. I'm well aware
that RFCs can be difficult to respect given their ambiguity in
places, but this appears to be another example of just making stuff up
though I could be convinced otherwise if there's a thread on a W3C ML
or something about why this heuristic exists.

I've monitored the traffic and the logs before and after the patch and see exactly what I would expect. Second requests to unchanged documents get a 304 returned from CouchDB and use the previously retrieved value on every browser I've tried. Fire up Fiddler or your favorite network monitoring tool and see for yourself.

Second requests from Safari get a 304 returned without the patch. Feel
free to fire up Wireshark. :)


All clients that I've tested other than IE 6 get a 304 without the patch. With the patch, IE 6 gets a 304 and all the other clients do too. Christopher Lenz's assertion was that if you applied the patch, you would no longer get a 304 on the second request. You can test his assertion by applying my patch to CouchDB and trying the test again. If you no longer get a 304, then Christopher is correct and I've missed something, but I'm thinking that you will still see a 304.
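If you'd rather script the check than watch a network monitor, something along these lines should do. It is only a sketch: the stub handler is a stand-in of mine and not CouchDB's code, and the ETag, the Cache-Control value, and the old Expires date are assumptions chosen for illustration. It shows that a conditional GET with If-None-Match still yields a 304 when the first response carried an Expires date in the past. It exercises the server side only and says nothing about what a particular browser cache decides to do.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

ETAG = '"1-abc"'  # hypothetical revision-based ETag


class StubHandler(BaseHTTPRequestHandler):
    """Stand-in for a document endpoint; NOT CouchDB's actual code."""

    def do_GET(self):
        if self.headers.get("If-None-Match") == ETAG:
            # Document unchanged: answer the conditional GET with 304.
            self.send_response(304)
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        body = b'{"_id": "doc"}'
        self.send_response(200)
        self.send_header("ETag", ETAG)
        self.send_header("Cache-Control", "must-revalidate")  # assumed value
        self.send_header("Expires", "Thu, 01 Jan 1970 00:00:00 GMT")  # assumed
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def get(host, port, path, headers=None):
    """Issue one GET and return (status, ETag header)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path, headers=headers or {})
        resp = conn.getresponse()
        resp.read()
        return resp.status, resp.getheader("ETag")
    finally:
        conn.close()


def two_request_statuses(host, port, path):
    """GET path, then GET it again with If-None-Match set to the ETag."""
    first_status, etag = get(host, port, path)
    second_status, _ = get(host, port, path, {"If-None-Match": etag})
    return first_status, second_status


def demo():
    """Run the two requests against the local stub on an ephemeral port."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return two_request_statuses("127.0.0.1", server.server_port, "/db/doc")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

To run the same check against a real instance, call two_request_statuses("localhost", 5984, "/db/doc") with your own database and document names, once with and once without the patch applied.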


But in all seriousness, the real question is whether we're improving
the situation by fudging this aspect of the HTTP spec or not. The fact
is IE6 (as much as we all hate it) still has a noticeable market
share. Just kicking it to the curb would be expedient but isn't the
right answer either. The answer is that we need to make sure that it
can be made compatible, and if not then and only then should we
consider breaking HTTP as a special case.

As Christopher Lenz mentions, if the concern is a working Futon on IE6
then adding smarts for detecting the browser environment and
configuring itself is a patch away. If it's trying to force CouchDB to
make amends for a specific broken HTTP stack, that's another. Unless
it can be shown that it's impossible for IE6 to fix itself there's no
reason to complicate every other client.

My concern is not Futon on IE 6. My concern is that I have to tell people that any time they get a conflict message in my app, they have to close their browser, reload the site, and redo their work. None of my code is based on Futon or jQuery.



The unconditional Expires header is the simplest fix. As far as I can tell, it has no undesirable effects and accomplishes the goal. If that doesn't go in, then I'd prefer to see the header values being configurable instead of
baking in other logic.

Configurable headers are a good idea. "X-Noah: Awesome" is something
that CouchDB should be able to do. Though I don't know what other
logic we'd be baking in unless you mean browser sniffing. Christopher
Lenz might've only been -0 on that, but I'd be -shitton.

I took a shot at it before I submitted my patch, but I must have missed something since I couldn't get the info out of the configuration file. I thought the more minimal patch was likely to be accepted.


I don't want to get in a reopen war, but please reopen the bug.

As far as I'm concerned you haven't done anything to refute
Christopher's logic on why this isn't a bug. Feel free to open a
configurable headers ticket though because I think that'd be generally
useful.

Paul Davis

Christopher made an assertion that my patch broke all caching. The bug was closed on that assertion. I've done a lot of testing that is inconsistent with his assertion. The code and the tools are there for anyone else to check our assertions.

I didn't file the initial bug report, but I've encountered the same issue in my development. It has been dismissed as a broken client stack that should be fixed on the client side; however, there is nothing like an easy client-side fix. I asked in May what the client-side fix was, if this wasn't going to get fixed server-side.
