Stephen, I am probably wrong here (someone hop in and correct me), but I thought compaction would remove the old revisions (and conflicts) of docs.

Alternatively, a question for the Couch devs: if Stephen set _revs_limit to something artificially low, say 1, and restarted couch and ran a compaction, would that force the DB to smash the datastore down to 1 rev per doc and remove the long tail off these docs?

REF: http://wiki.apache.org/couchdb/Compaction
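
If anyone wants to try that, this is the sort of thing I have in mind -- an untested sketch against the _revs_limit and _compact endpoints (Node 18+ for the global fetch; host and db name are placeholders, and I'd try it on a scratch copy first):

// Rough sketch, not tested. Host and db name are placeholders.
const base = 'http://localhost:5984';
const db = 'scratch_copy_of_db'; // try it on a copy first

const json = { 'Content-Type': 'application/json' };
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

async function squashRevs() {
  // 1) drop the revision history depth to 1
  let res = await fetch(`${base}/${db}/_revs_limit`, {
    method: 'PUT', headers: json, body: '1',
  });
  console.log('set _revs_limit:', res.status);

  // 2) kick off compaction; 202 only means "accepted" -- it runs in the background
  res = await fetch(`${base}/${db}/_compact`, { method: 'POST', headers: json });
  console.log('compaction started:', res.status);

  // 3) poll the db info doc until compact_running goes false
  let info;
  do {
    await sleep(1000);
    info = await (await fetch(`${base}/${db}`)).json();
  } while (info.compact_running);
  console.log('done, disk_size is now', info.disk_size);
}

squashRevs().catch(console.error);

One caveat, as far as I understand the wiki: compaction drops the bodies of old (non-leaf) revisions, but conflicts are live leaf revisions, so I'm not sure they'd actually go away. _revs_limit should at least shorten the history each branch drags around.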
On Thu, Mar 14, 2013 at 2:02 AM, Stephen Bartell <[email protected]> wrote:

> Hi all,
>
> tl;dr: I've got a database with just a couple docs. Conflict management
> went unchecked and these docs have thousands of conflicts each.
> Replication fails. Couch consumes all the server's cpu.
>
> First the story, then the questions. Please bear with me!
>
> I wanted to replicate this database to another, new database. So I
> started the replication. beam.smp took 100% of my cpu and the replicator
> status held steady at a constant percent for quite a while. It eventually
> finished.
>
> I thought maybe I should handle the conflicts and then replicate.
> Hopefully it would go faster next time. So I cleared all the conflicts. I
> replicated again, but this time I could not get anything to replicate.
> Again, cpu held steady, topped out. I eventually restarted couch.
>
> I dug through the logs and saw that the POSTs were failing. I figure
> the replicator was timing out when trying to post to couch.
>
> I have a replicator that I've been working on that's written in node.js.
> So I started that one up to do the same thing. I drew inspiration from
> PouchDB's replicator and from Jens Alfke's amazing replication algorithm
> documentation, so my replicator follows more or less the same story:
> 1) consume _changes with style=all_docs. 2) revs_diff on the target
> database. 3) get each revision from the source with revs=true. 4) bulk
> post with new_edits=false.
>
> Same thing. Except now I can kind of make sense of what's going on.
> Sucking the data out of the source is no problem. Diffing the revs
> against the target is no problem. Posting the docs is THE problem. Since
> the target database is clean, thousands of docs are being thrown at couch
> at once to build up the revision trees. Couch just takes forever to
> finish the job. It doesn't matter if I bulk post the docs or post them
> individually; couch sucks 100% of my cpu every time and takes forever to
> finish. (I actually never let it finish.)
>
> So that is the story. Here are my questions.
>
> 1) Has anyone else stepped on this mine? If so, could I get pointed
> towards some workarounds? I don't think it is right to assume that users
> of couchdb will never have databases with huge conflict sausages like
> this, so simply saying "manage your conflicts" won't help.
>
> 2) Let's say I did manage my conflicts. I still have the
> _deleted_conflicts sausage. I know that _deleted and _deleted_conflicts
> must be replicated to maintain consistency across the cluster. If the
> replicator throws up when these huge sausages come through, how is the
> data ever going to replicate? Is there a trade secret I don't know about?
>
> 3) Is there any limit on the resources that CouchDB is allowed to
> consume? I get that we run into cases where there's tons of data to move
> and it's just going to take a hell of a long time. But I don't get why
> it's permissible for CouchDB to eat all my cpu. The whole server should
> never grind to a halt because it's moving lots of data. I feel like it
> should be like the little engine that could: just chug along slow and
> steady until it crests the hill.
>
> I would really like to rely on the erlang replicator, but I can't. At
> least with the replicator I wrote, I have a chance of throttling the
> posts so CouchDB doesn't render my server useless.
>
> Sorry for wrapping more questions into those questions. I'm pretty
> tired, stumped, and have machines in production crumbling.
>
> Best,
> Stephen
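
Re the throttling point at the end: for what it's worth, below is roughly the shape I'd give step 4. An untested sketch, not drop-in code -- SOURCE, TARGET, BATCH, and PAUSE_MS are placeholders I made up, it assumes Node 18+ for the global fetch, and it ignores attachments and error handling:

// The four steps, with a throttled step 4. All names here are placeholders.
const SOURCE = 'http://localhost:5984/conflicted_db';
const TARGET = 'http://localhost:5984/clean_db';
const BATCH = 50;      // docs per _bulk_docs post
const PAUSE_MS = 250;  // breathing room between posts

const json = { 'Content-Type': 'application/json' };
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

async function replicate() {
  // 1) consume _changes with style=all_docs to see every leaf rev
  const changes = await (await fetch(`${SOURCE}/_changes?style=all_docs`)).json();
  const idRevs = {};
  for (const row of changes.results) {
    idRevs[row.id] = row.changes.map((c) => c.rev);
  }

  // 2) ask the target which of those revs it is missing
  const diff = await (await fetch(`${TARGET}/_revs_diff`, {
    method: 'POST', headers: json, body: JSON.stringify(idRevs),
  })).json();

  // 3) fetch each missing rev from the source, with its rev history
  const docs = [];
  for (const [id, { missing }] of Object.entries(diff)) {
    for (const rev of missing) {
      const url = `${SOURCE}/${encodeURIComponent(id)}?rev=${rev}&revs=true`;
      docs.push(await (await fetch(url)).json());
    }
  }

  // 4) bulk post with new_edits=false, in small batches with a pause,
  //    so couch gets the rev trees a slice at a time instead of all at once
  for (let i = 0; i < docs.length; i += BATCH) {
    const batch = docs.slice(i, i + BATCH);
    await fetch(`${TARGET}/_bulk_docs`, {
      method: 'POST', headers: json,
      body: JSON.stringify({ docs: batch, new_edits: false }),
    });
    console.log(`posted ${Math.min(i + BATCH, docs.length)}/${docs.length}`);
    await sleep(PAUSE_MS);
  }
}

replicate().catch(console.error);

Throttling won't make the rev-tree merges any cheaper -- every new_edits=false post still has to stitch its revisions into the doc's tree -- but spreading the posts out should at least stop beam.smp from pinning the cpu in one long burst, which sounds like what you're after.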
