[ https://issues.apache.org/jira/browse/COUCHDB-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965199#action_12965199 ]

Bob Dionne commented on COUCHDB-968:
------------------------------------

I see that there are possibly a few aspects to this, but I think the core issue 
is that CouchDB simply does not handle exceeding revs_limit gracefully. 
Some more data points:

1. In my test I set revs_limit=3, added a single doc, and updated it 10 times. 
The _changes feed [1], run before compaction, shows that the revision numbers 
seem to be out of sync with the sequence numbers. Perhaps that's ok, but with 
only one doc being updated, shouldn't they stay in sync? Also, 3 of the changes 
are missing.

2. Timing is a factor. With a 3-second sleep between updates everything is hunky 
dory, so the problem seems to occur when the replication from db2->db1 finds 
that a new update has occurred on db1. (A rough reproduction sketch follows 
below.)
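
For reference, here is a rough sketch of my reproduction against the HTTP API 
(Python, standard library only; the ports, db names, and doc id are just 
placeholders from my local setup, not anything canonical):

# Rough reproduction sketch: two local CouchDB nodes on ports 5984/5985
# with fresh databases db1/db2 (all of these names/ports are placeholders).
import json
import time
import urllib.request

def req(method, url, body=None):
    data = json.dumps(body).encode() if body is not None else None
    r = urllib.request.Request(url, data=data, method=method,
                               headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(r) as resp:
        return json.loads(resp.read())

DB1 = "http://localhost:5984/db1"
DB2 = "http://localhost:5985/db2"

for db in (DB1, DB2):
    req("PUT", db)                        # create the database
    req("PUT", db + "/_revs_limit", 3)    # revs_limit=3

# continuous pull replication in both directions (multi-master, as in the report)
req("POST", "http://localhost:5984/_replicate",
    {"source": DB2, "target": DB1, "continuous": True})
req("POST", "http://localhost:5985/_replicate",
    {"source": DB1, "target": DB2, "continuous": True})

# one doc on db1, updated 10 times
rev = req("PUT", DB1 + "/doc1", {"value": 0})["rev"]
for i in range(1, 11):
    rev = req("PUT", DB1 + "/doc1", {"value": i, "_rev": rev})["rev"]
    # time.sleep(3)   # <- with this sleep in place, everything is hunky dory

print(json.dumps(req("GET", DB1 + "/_changes"), indent=2))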

Interestingly, one can fix it completely by reversing the direction in the stem 
function, but that breaks almost everything else :) I don't quite grok what the 
stemming is intended to do. In particular, under what conditions will the Tree 
exceed the Limit (which at every call site is revs_limit) and need to be 
stemmed? Since stem is called immediately after merge_rev_trees, which has a 
comment about checking that a previous revision is a leaf node, the real 
culprit might be merge_rev_trees, as you surmised.
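
To check my own reading of it, here is a toy illustration of what I take 
stemming to mean (Python, purely illustrative, not the actual couch_key_tree 
Erlang code): keep only the newest revs_limit revisions on each leaf-to-root 
path and drop the older history.

# Toy model of rev-path stemming; not CouchDB's implementation.
def stem_path(rev_path, limit):
    """rev_path is ordered oldest -> newest; keep only the last `limit` revs."""
    if limit < 1:
        raise ValueError("revs_limit must be >= 1")
    return rev_path[-limit:]

# With revs_limit=3 and ten updates of one doc, only revs 8..10 survive:
history = ["%d-rev%d" % (n, n) for n in range(1, 11)]
print(stem_path(history, 3))   # prints ['8-rev8', '9-rev9', '10-rev10']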

Nice to see you made it safely back to civilization :)


[1] https://gist.github.com/721483


> Duplicated IDs in _all_docs
> ---------------------------
>
>                 Key: COUCHDB-968
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-968
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.10.1, 0.10.2, 0.11.1, 0.11.2, 1.0, 1.0.1, 1.0.2
>         Environment: Ubuntu 10.04.
>            Reporter: Sebastian Cohnen
>            Priority: Blocker
>
> We have a database that is causing serious trouble with compaction and 
> replication (huge memory and CPU usage, often causing CouchDB to crash because 
> all system memory is exhausted). Yesterday we discovered that db/_all_docs is 
> reporting duplicated IDs (see [1]). Until a few minutes ago we thought there 
> were only a few duplicates, but today I took a closer look and found 10 IDs 
> that sum up to a total of 922 duplicates. Some of them have only 1 duplicate, 
> others have hundreds.
> Some facts about the database in question:
> * ~13k documents, with 3-5k revs each
> * all duplicated documents are in conflict (with 1 to 14 conflicts)
> * compaction is run on a daily basis
> * several thousand updates per hour
> * multi-master setup with pull replication from each other
> * delayed_commits=false on all nodes
> * CouchDB versions in use: 1.0.0 and 1.0.x (*)
> Unfortunately the database's contents are confidential, so I'm not allowed to 
> publish them.
> [1]: Part of http://localhost:5984/DBNAME/_all_docs
> ...
> {"id":"9997","key":"9997","value":{"rev":"6096-603c68c1fa90ac3f56cf53771337ac9f"}},
> {"id":"9999","key":"9999","value":{"rev":"6097-3c873ccf6875ff3c4e2c6fa264c6a180"}},
> {"id":"9999","key":"9999","value":{"rev":"6097-3c873ccf6875ff3c4e2c6fa264c6a180"}},
> ...
> [*]
> There were two (old) servers (1.0.0) in production, which already had the 
> replication and compaction issues. Then two servers (1.0.x) were added and 
> replication was set up to bring them in sync with the old production servers, 
> since the two new servers were meant to replace the old ones (to update the 
> node.js application code, among other things).
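
As a side note, a quick way to tally duplicates like the ones above from 
_all_docs (Python; DBNAME is the reporter's placeholder, not a real name):

# Count duplicate ids in an _all_docs listing.
import json
import urllib.request
from collections import Counter

with urllib.request.urlopen("http://localhost:5984/DBNAME/_all_docs") as resp:
    rows = json.loads(resp.read())["rows"]

counts = Counter(row["id"] for row in rows)
dupes = {doc_id: n for doc_id, n in counts.items() if n > 1}
print(len(dupes), "ids with duplicates;",
      sum(n - 1 for n in dupes.values()), "extra rows in total")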
