On Fri, Oct 30, 2009 at 08:37:36PM -0400, Adam Kocoloski wrote:
> I like where your head's at on this, Brian. I should mention that it
> *is* possible to retrieve all conflict revisions of a document with one
> request:
>
> GET /db/bob?open_revs=all
Unfortunately, open_revs=all returns more than just the current conflicting
revisions, and the live one is not the first. For example, after branching
to three conflicting revisions and then merging them back into one, I get

[{"ok"=>{"_deleted"=>true, ...}},
 {"ok"=>{"_deleted"=>true, ...}},
 {"ok"=>{...current doc...}}]

This is still true after compaction. The attached program demonstrates
this.

Do we need a version of get_all_leafs which excludes _deleted members?
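
In the meantime the filtering can be done on the client. A minimal sketch,
assuming the response has already been parsed into an array of hashes; the
helper name live_leaves and the rev values in the sample are made up for
illustration and are not part of CouchDB:

```ruby
require 'json'

# Hypothetical helper: keep only the leaf revisions that are not deletion
# stubs. `rows` is the parsed array returned by GET /db/doc?open_revs=all.
# Rows without an "ok" member (e.g. "missing" rows) are dropped by compact.
def live_leaves(rows)
  rows.map { |row| row['ok'] }
      .compact
      .reject { |doc| doc['_deleted'] }
end

# Sample shaped like the output above; the rev values are invented.
sample = JSON.parse(<<~JSON)
  [{"ok": {"_id": "test", "_rev": "3-aaa", "_deleted": true}},
   {"ok": {"_id": "test", "_rev": "3-bbb", "_deleted": true}},
   {"ok": {"_id": "test", "_rev": "3-ccc", "hello": "foo+bar+baz"}}]
JSON

p live_leaves(sample).map { |doc| doc['_rev'] }   # prints ["3-ccc"]
```

This still transfers the deleted stubs over the wire, which is why a
server-side variant would be nicer.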
> The response format is a slightly awkward Array -- I believe the first
> revision is the winning one.
>
> [{"ok":{"_id":"bob","_rev":"1-3453545",...}},{"ok":
> {"_id":"bob","_rev":"1-23042"]
The "ok" tag isn't hard to strip, so this is just a minor annoyance. At
first I couldn't see why it was there at all, but it turns out that
open_revs also lets you list revisions explicitly:

$ curl 'http://127.0.0.1:5984/conflict_test/test?open_revs=%5b%221-2345%22%5d'
[{"missing":"1-2345"}]

A simpler API might be to return a _missing member for the requested doc
itself, similar to _deleted, e.g.

[{"_id":"test","_rev":"1-2345","_missing":true}]
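
For what it's worth, handling the current format on the client looks
roughly like this. It is only a sketch: open_revs_query and split_open_revs
are names I made up, and the only thing assumed about the server is the
"ok"/"missing" row shape shown above.

```ruby
require 'json'
require 'uri'

# Build the query string for an explicit revision list. open_revs takes a
# JSON array, which has to be URL-encoded.
def open_revs_query(revs)
  "open_revs=" + URI.encode_www_form_component(revs.to_json)
end

# Split a parsed open_revs response into the documents that were found
# (with the "ok" wrapper stripped) and the revision ids that were missing.
def split_open_revs(rows)
  found   = rows.select { |r| r.key?('ok') }.map { |r| r['ok'] }
  missing = rows.select { |r| r.key?('missing') }.map { |r| r['missing'] }
  [found, missing]
end

puts open_revs_query(["1-2345"])
found, missing = split_open_revs(JSON.parse('[{"missing":"1-2345"}]'))
p found
p missing
```

With a _missing member on the document itself, the second helper would
reduce to a simple check of each row, in the same way as _deleted.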
> I think I'd be in favor of making the default GET include all conflicts,
> but probably in the _conflicts field so as to minimize the changes to the
> current API. I'm not sure a multi-rev version of PUT is as urgent a
> need.
The only downside of the current _bulk_docs way of resolving a conflict is
that it is asymmetrical. If you have (say) three conflicting revisions of a
document, then you have to update one and delete the other two, rather than
supersede all three. So you have to pick one as somehow "more important"
than the other two, or follow CouchDB's arbitrary choice.
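
To make the asymmetry concrete, here is a sketch of the request body such a
resolution needs. The helper name resolution_payload and the rev values are
invented; the all_or_nothing/docs layout is the one the attached program
posts to _bulk_docs.

```ruby
require 'json'

# Hypothetical helper: build a _bulk_docs body that updates the first
# revision with the merged content and deletes all the others.
def resolution_payload(revisions, merged)
  winner, *losers = revisions
  docs = [merged.merge('_id' => winner['_id'], '_rev' => winner['_rev'])]
  losers.each do |doc|
    docs << { '_id' => doc['_id'], '_rev' => doc['_rev'], '_deleted' => true }
  end
  { 'all_or_nothing' => true, 'docs' => docs }
end

# Three conflicting revisions; the rev values are invented.
revisions = [
  { '_id' => 'test', '_rev' => '2-aaa', 'hello' => 'foo' },
  { '_id' => 'test', '_rev' => '2-bbb', 'hello' => 'bar' },
  { '_id' => 'test', '_rev' => '2-ccc', 'hello' => 'baz' }
]
payload = resolution_payload(revisions, { 'hello' => 'foo+bar+baz' })
puts JSON.pretty_generate(payload)
```

Note that the merged content is attached to 2-aaa only; the other two
branches each just end in a deletion.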
This might be an artefact of CouchDB's revision history mechanism: I would
guess that it does not allow multiple ancestors of a single revision
(unlike, say, git, which lets you have a merge commit with multiple parents).
A quick test suggests this is true. If I set up the following conflict
sequence:

   ,--> r2a --> r3
r1 --> r2b --> (deleted)
   `--> r2c --> (deleted)

and then query r3 with revs_info=yes, I see only the linear sequence
r1-r2a-r3.
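
As a toy model of what I think is going on (this is my guess at the shape
of the data, not CouchDB's actual implementation, and the deletion stubs
are left out): if every revision records exactly one parent, then walking
back from r3 can never reach r2b or r2c.

```ruby
# Toy model only: each revision records a single parent, so the history
# is a tree and not a DAG as in git.
PARENT = {
  'r1'  => nil,
  'r2a' => 'r1',
  'r2b' => 'r1',
  'r2c' => 'r1',
  'r3'  => 'r2a'
}

# Walk from a revision back to the root and return the path, oldest first.
def ancestry(rev, parent = PARENT)
  path = []
  while rev
    path.unshift(rev)
    rev = parent[rev]
  end
  path
end

puts ancestry('r3').join('-')   # prints r1-r2a-r3
```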
Regards,
Brian.
require 'rubygems'
require 'restclient'
require 'json'
require 'pp'
DB="http://127.0.0.1:5984/conflict_test"
# Write multiple documents as all_or_nothing, can introduce conflicts
def writem(docs)
  # Newer CouchDB releases reject _bulk_docs without a JSON Content-Type
  JSON.parse(RestClient.post("#{DB}/_bulk_docs", {
    "all_or_nothing" => true,
    "docs" => docs,
  }.to_json, :content_type => :json))
end
# Write one document, return the rev
def write1(doc, id=nil, rev=nil)
  doc['_id'] = id if id
  doc['_rev'] = rev if rev
  writem([doc]).first['rev']
end
# Read a document, return *all* revs
def read1(id)
  retries = 0
  loop do
    # FIXME: escape id
    res = [JSON.parse(RestClient.get("#{DB}/#{id}?conflicts=true"))]
    if revs = res.first.delete('_conflicts')
      begin
        revs.each do |rev|
          res << JSON.parse(RestClient.get("#{DB}/#{id}?rev=#{rev}"))
        end
      rescue
        # A conflicting rev may vanish between the two requests; start over
        retries += 1
        raise if retries >= 5
        next
      end
    end
    return res
  end
end
# Create DB
RestClient.delete DB rescue nil
RestClient.put DB, {}.to_json
# Write a document
rev1 = write1({"hello"=>"xxx"},"test")
p read1("test")
# Make three conflicting versions
write1({"hello"=>"foo"},"test",rev1)
write1({"hello"=>"bar"},"test",rev1)
write1({"hello"=>"baz"},"test",rev1)
res = read1("test")
p res
puts "open_revs when there are conflicts:"
pp JSON.parse(RestClient.get("#{DB}/test?open_revs=all"))
# Now let's replace these three with one
res.first['hello'] = "foo+bar+baz"
res.each_with_index do |r,i|
  unless i == 0
    r.replace({'_id'=>r['_id'], '_rev'=>r['_rev'], '_deleted'=>true})
  end
end
writem(res)
p read1("test")
puts "open_revs when conflict resolved:"
pp JSON.parse(RestClient.get("#{DB}/test?open_revs=all"))
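# Compaction runs in the background, so the read below may race with it;
# newer CouchDB releases also require a JSON Content-Type on this request
RestClient.post("#{DB}/_compact", {}.to_json, :content_type => :json)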
puts "open_revs after compact:"
pp JSON.parse(RestClient.get("#{DB}/test?open_revs=all"))