[ 
https://issues.apache.org/jira/browse/COUCHDB-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971165#action_12971165
 ] 

Adam Kocoloski commented on COUCHDB-968:
----------------------------------------

Ugh, the deeper I look the more issues I find.  That commit is not the whole 
fix, because siblings can show up in the Place = 0 function clauses too.  I've 
added two commits to my original branch for this ticket:

https://github.com/kocolosk/couchdb/tree/968-duplicate-seq-entries-rebased

In these commits I'm relying on the condition that (length(Ours) =:= 1 or 
length(Insert) =:= 1), which I think is justified because we start with a 
single root in both Ours and Insert, and we only "drill down" into one of the 
trees.

You might recall that Damien's original code for the merge arranged the 
arguments to merge_at so that the 3rd argument was always the tree that did 
not need to be drilled into.  That reduced the number of function clauses in 
merge_at, but it had the fatal flaw that, if the disk tree ended up in this 
position, the committed document body for a particular revision would be 
ignored in favor of saving a new copy of the same document body.  This was the 
original root cause of the dupes.
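
The flaw can be illustrated with a small Python toy model. It is a sketch under stated assumptions, not the actual couch_key_tree code: the node shape `(key, value, children)`, the `"disk:"` / `"new:"` value prefixes, and the functions `merge`, `take_first` and `prefer_committed` are all hypothetical names introduced here.

```python
def merge(ours, theirs, pick):
    """Merge two lists of sibling nodes; nodes sharing a key are combined
    and pick(our_value, their_value) decides which value survives."""
    merged = {key: (value, children) for key, value, children in ours}
    for key, value, children in theirs:
        if key in merged:
            our_value, our_children = merged[key]
            merged[key] = (pick(our_value, value),
                           merge(our_children, children, pick))
        else:
            merged[key] = (value, children)
    return [(key, value, children)
            for key, (value, children) in sorted(merged.items())]


def take_first(first, second):
    # Position-dependent: the outcome changes when the arguments are swapped.
    return first


def prefer_committed(first, second):
    # Position-independent: a body already committed to disk always wins.
    for value in (first, second):
        if value is not None and value.startswith("disk:"):
            return value
    return first if first is not None else second


disk = [("1-a", "disk:ptr1", [])]
incoming = [("1-a", "new:body", [("2-b", "new:body2", [])])]

# With a positional rule, the committed body is dropped whenever the disk
# tree lands in the "wrong" argument slot, so the same body gets saved again.
lost = merge(incoming, disk, take_first)
kept = merge(disk, incoming, take_first)

# With a value-based rule, argument order no longer matters.
safe_a = merge(incoming, disk, prefer_committed)
safe_b = merge(disk, incoming, prefer_committed)
```

In this toy, `lost` carries "new:body" for revision 1-a while `kept`, `safe_a` and `safe_b` all carry "disk:ptr1". The point of the sketch is only that the choice of value has to depend on what the values are, not on which argument slot the tree arrived in.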

Clearly this is some really subtle stuff.  I might see if I can teach myself 
how to use QuickCheck Mini in time to have it hammer on this algorithm and look 
for other bugs.

> Duplicated IDs in _all_docs
> ---------------------------
>
>                 Key: COUCHDB-968
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-968
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.10.1, 0.10.2, 0.11.1, 0.11.2, 1.0, 1.0.1, 1.0.2
>         Environment: any
>            Reporter: Sebastian Cohnen
>            Assignee: Adam Kocoloski
>            Priority: Blocker
>             Fix For: 0.11.3, 1.0.2, 1.1
>
>
> We have a database that is causing serious trouble with compaction and 
> replication (huge memory and CPU usage, often causing CouchDB to crash because 
> all system memory is exhausted). Yesterday we discovered that db/_all_docs is 
> reporting duplicated IDs (see [1]). Until a few minutes ago we thought that 
> there were only a few duplicates, but today I took a closer look and found 10 
> IDs which sum up to a total of 922 duplicates. Some of them have only 1 
> duplicate, others have hundreds.
> Some facts about the database in question:
> * ~13k documents, with 3-5k revs each
> * all duplicated documents are in conflict (with 1 up to 14 conflicts)
> * compaction is run on a daily basis
> * several thousand updates per hour
> * multi-master setup with pull replication from each other
> * delayed_commits=false on all nodes
> * used couchdb versions 1.0.0 and 1.0.x (*)
> Unfortunately the database's contents are confidential and I'm not allowed to 
> publish it.
> [1]: Part of http://localhost:5984/DBNAME/_all_docs
> ...
> {"id":"9997","key":"9997","value":{"rev":"6096-603c68c1fa90ac3f56cf53771337ac9f"}},
> {"id":"9999","key":"9999","value":{"rev":"6097-3c873ccf6875ff3c4e2c6fa264c6a180"}},
> {"id":"9999","key":"9999","value":{"rev":"6097-3c873ccf6875ff3c4e2c6fa264c6a180"}},
> ...
> [*]
> There were two (old) servers (1.0.0) in production (already having the 
> replication and compaction issues). Then two servers (1.0.x) were added and 
> replication was set up to bring them in sync with the old production servers 
> since the two new servers were meant to replace the old ones (to update 
> node.js application code among other things).
