Hi all,
Had a bit of trouble with etags and my browser cache tonight. In short, the
etag was always the same in a circumstance where I assumed it wouldn't be. Here
are the details.
I'm writing some functional tests that create a couchdb database, a view
document, insert some documents, and then execute various tests against my own
client javascript code. One of my tests would pass once, and then regularly
fail without any change to the client code. The issue was the browser cache.
Here are the details:
Functional test setup:
1. create '/test_db'
2. insert view:
function(doc) {
  var recordType = doc.type;
  // Emit one row per field whose name ends in "_id" (e.g. parent_id)
  for (var key in doc) {
    if (/.+_id$/.test(key)) {
      emit({type: recordType, fKey: doc[key]}, null);
    }
  }
}
3. insert document A (via put)
{
_id:'parent_id', //I wanted a known ID for convenience sake
type:'parent'
}
4. insert document B (via post, because I don't care what the ID is)
{
type:'child',
parent_id:'parent_id' //using my known ID
}
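As a rough illustration of what the view should index after this setup, here is
a small sketch that runs the map function outside CouchDB. runMap and the child
ID 'some-generated-id' are made up for the example, and emit is passed in as an
argument instead of being the global that CouchDB provides.

```javascript
// Hypothetical stand-in for the view server: runs a map function over
// a list of docs and collects whatever it emits.
function runMap(mapFn, docs) {
  var rows = [];
  docs.forEach(function (doc) {
    mapFn(doc, function emit(key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  return rows;
}

// The map function from step 2, with emit passed in explicitly.
function fetchKeysMap(doc, emit) {
  var recordType = doc.type;
  for (var key in doc) {
    if (/.+_id$/.test(key)) {
      emit({ type: recordType, fKey: doc[key] }, null);
    }
  }
}

var viewRows = runMap(fetchKeysMap, [
  { _id: 'parent_id', type: 'parent' },
  { _id: 'some-generated-id', type: 'child', parent_id: 'parent_id' }
]);
console.log(JSON.stringify(viewRows));
```

Only the child produces a row: the regex needs at least one character before
"_id", so the parent's own _id field does not match.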
OK - setup is complete. I've verified several times that the setup works as
expected. The test that caused issues with the browser cache was testing my
code that dynamically builds a query against the view. The anticipated query in
this case looks like this:
http://localhost:5984/test_db/_design/fetch/_view/fetchKeys?key={%22type%22%3A%22child%22%2C%22fKey%22%3A%22parent_id%22}&include_docs=true
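For reference, a query like the one above can be built roughly as follows. This
is only a sketch, not my actual client code; buildViewUrl is a hypothetical
helper name. Note that encodeURIComponent also escapes the braces (%7B / %7D),
which the URL above leaves as-is; both decode to the same JSON key.

```javascript
// Hypothetical helper: builds the fetchKeys view query for a type and foreign key.
function buildViewUrl(dbUrl, type, fKey) {
  // The view key is a JSON object, so it has to be serialized and URL-encoded.
  var key = JSON.stringify({ type: type, fKey: fKey });
  return dbUrl + '/_design/fetch/_view/fetchKeys' +
    '?key=' + encodeURIComponent(key) +
    '&include_docs=true';
}

var queryUrl = buildViewUrl('http://localhost:5984/test_db', 'child', 'parent_id');
console.log(queryUrl);
```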
What's happening is, the etag of the response to this query is always the same.
My problem is that in this case, since I'm including 'include_docs=true', the
results of the query are never the same. To illustrate:
First run through the test, the results of the query are:
Header:
etag: "7BD040ILHVQHCY0L8ER5DW2RG"
date: Mon, 04 Oct 2010 02:02:52 GMT
server: CouchDB/1.0.1 (Erlang OTP/R14B)
content-length: 0
Body:
{"total_rows":1,"offset":0,"rows":[
{"id":"121026820a1c1454e82930085604b15a","key":{"type":"child","fKey":"parent_id"},"value":null,"doc":{"_id":"121026820a1c1454e82930085604b15a","_rev":"1-c2ad8daac622eea701f0c62dfa023ea9","parent_id":"parent_id","type":"child"}}
]}
The teardown deletes the database. I run the same test again, the parent has
the same id, but the child gets a new one. The query is the same, and the index
of the response looks the same, although the document itself is different.
Here are the results of the second run:
Header:
etag: "7BD040ILHVQHCY0L8ER5DW2RG"
date: Mon, 04 Oct 2010 02:07:27 GMT
server: CouchDB/1.0.1 (Erlang OTP/R14B)
content-length: 0
Body:
{"total_rows":1,"offset":0,"rows":[
{"id":"121026820a1c1454e82930085604be87","key":{"type":"child","fKey":"parent_id"},"value":null,"doc":{"_id":"121026820a1c1454e82930085604be87","_rev":"1-c2ad8daac622eea701f0c62dfa023ea9","parent_id":"parent_id","type":"child"}}
]}
My test keeps failing because it's anticipating the ID of the new 'child' row.
The issue is that since the etag is the same, my browser cache takes over and
returns the 'old' row to the javascript code. I've verified this by simply
clearing my browser cache and rerunning my same, unaltered scripts. If I clear
the browser cache between each run, it passes. Otherwise, it passes the first
run after the cache clear, and fails every subsequent run. Each subsequent
failure clearly shows that my code 'received' the child row ID from my first
passing run, even though the insert captured the new ID, so
first run:
passed - expected childID "foo", actual child id "foo"
second run:
failed expected childID "bar", but was "foo"
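To make the failure mode concrete, here is a toy model of what I believe is
going on, assuming the browser revalidates its cached copy with If-None-Match
and the server answers 304 when the etag matches. makeCache and makeServer are
invented for the illustration; this is not real browser or CouchDB code.

```javascript
// Toy browser cache: stores one { etag, body } entry per URL.
function makeCache() {
  var entries = {};
  return {
    get: function (url, server) {
      var cached = entries[url];
      var res = server(url, cached ? cached.etag : null);
      if (res.status === 304) {
        return cached.body; // "not modified": reuse the stored body
      }
      entries[url] = { etag: res.etag, body: res.body };
      return res.body;
    }
  };
}

// Toy server whose etag does not depend on the child ID (as observed above).
function makeServer(etag, childId) {
  return function (url, ifNoneMatch) {
    if (ifNoneMatch === etag) {
      return { status: 304 };
    }
    return { status: 200, etag: etag, body: { id: childId } };
  };
}

var cache = makeCache();
var viewUrl = '/test_db/_design/fetch/_view/fetchKeys';
var etag = '"7BD040ILHVQHCY0L8ER5DW2RG"';
var firstRun = cache.get(viewUrl, makeServer(etag, 'foo'));  // fresh database, child "foo"
var secondRun = cache.get(viewUrl, makeServer(etag, 'bar')); // recreated database, child "bar"
console.log(firstRun.id, secondRun.id); // prints "foo foo"
```

The second run gets the stale "foo" body even though the server now holds "bar",
which matches the test output above.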
This feels to me like a case where the current algorithm for the etag isn't
quite precise enough. Would it be reasonable to assume that if the query itself
included the 'include_docs' param, then the etag algorithm should take that
data into consideration? Actually in my case, simply taking the id field into
consideration would solve my problem.
For the time being, I've modified my test to fetch a UUID for the parent
document. This makes the view row, and subsequent etag, unique for each query.
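The workaround looks roughly like this. It is a sketch only: httpGetJson is an
assumed synchronous helper (URL in, parsed JSON out) that my test harness would
have to provide, and freshParentDoc is a hypothetical name. The /_uuids endpoint
is CouchDB's own and returns {"uuids": [...]}.

```javascript
// Hypothetical helper: builds a parent document with a server-generated ID
// instead of the fixed 'parent_id' used in the original setup.
function freshParentDoc(httpGetJson, serverUrl) {
  var res = httpGetJson(serverUrl + '/_uuids?count=1');
  return { _id: res.uuids[0], type: 'parent' };
}

// Stubbed usage, so the example runs without a CouchDB server.
var stubGet = function (url) {
  return { uuids: ['stub-uuid-for-' + url.split('/_uuids')[0]] };
};
var parentDoc = freshParentDoc(stubGet, 'http://localhost:5984');
console.log(JSON.stringify(parentDoc));
```

Since the child's parent_id then changes on every run, the key in the view
query changes too, so each run requests a different URL.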
Tim