On Mar 14, 2013, at 3:36 PM, Robert Newson <[email protected]> wrote:
> Runaway processes are the very devil, but the problem is not specific
> to CouchDB; there is no CouchDB mechanism for this, just as there's no
> bash/python/ruby/perl method to limit a while(true){} loop.
Totally makes sense.
>
> Highly conflicted documents are painful to update and read. I can't do
> anything about that today.
Thanks for your feedback!
>
> B.
>
> On 14 March 2013 17:23, Stephen Bartell <[email protected]> wrote:
>> Robert, this only works if I don't need to keep those docs around anymore.
>> In my case, I want to keep the docs. I don't want to keep the conflicts of
>> the docs. Most importantly, though, even if I delete all the conflicts on
>> all my docs, I still have the problem of _deleted_conflicts. What I've
>> seen is that only a few docs with a few thousand _deleted_conflicts each
>> will plug up Couch and render it unusable. You can't get rid of it through
>> natural means.
>>
>> This is what Riyad was bringing up and what I've implemented. I have a
>> program which replicates from the troubled database's _changes with the
>> query param style=main_only. This allows me to keep the revision tree of
>> the troubled database, but without the _deleted_conflicts. I can then wipe
>> out the troubled db, recreate it, and replicate the shiny clean data back
>> into it.
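>>
>> Roughly, the program boils down to this (an untested sketch in node.js;
>> the db names and URL are made up):
>>
>>   const src = 'http://localhost:5984/troubled';
>>   const tgt = 'http://localhost:5984/clean';
>>   // main_only (the default style) lists only each doc's winning rev
>>   const { results } = await (await fetch(`${src}/_changes?style=main_only`)).json();
>>   for (const row of results) {
>>     if (row.deleted) continue; // skip tombstones entirely
>>     // revs=true pulls the winning rev plus its ancestry, no conflict branches
>>     const doc = await (await fetch(`${src}/${encodeURIComponent(row.id)}?revs=true`)).json();
>>     // new_edits=false preserves that rev history in the clean db
>>     await fetch(`${tgt}/_bulk_docs`, {
>>       method: 'POST',
>>       headers: { 'Content-Type': 'application/json' },
>>       body: JSON.stringify({ docs: [doc], new_edits: false })
>>     });
>>   }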
>>
>> This is unnatural and requires custom code to make happen. I can live with
>> it until a better solution comes around.
>>
>> What I'm really concerned about is how CouchDB eats all my CPU.
>>
>> Is there any way to ration the resources that CouchDB uses? Like tell it
>> not to use more than 50% or something. I think that Couch eating all the
>> resources on a machine just because it's reading loads of data is a bug.
>> Is this a reasonable conclusion?
>>
>> On Mar 14, 2013, at 2:18 PM, Robert Newson <[email protected]> wrote:
>>
>>> One trick: you can delete the doc and replicate with a filter like
>>> 'return !doc['_deleted'];' that blocks all deletes. The target db will
>>> then not receive any trace of these highly conflicted docs.
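>>>
>>> Concretely, something like this (untested; the names are made up). Save
>>> a filter function on the source:
>>>
>>>   {
>>>     "_id": "_design/repl",
>>>     "filters": {
>>>       "no_deleted": "function(doc, req) { return !doc._deleted; }"
>>>     }
>>>   }
>>>
>>> then reference it when triggering the replication:
>>>
>>>   POST /_replicate
>>>   { "source": "troubled", "target": "clean", "filter": "repl/no_deleted" }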
>>>
>>> On 14 March 2013 14:10, Stephen Bartell <[email protected]> wrote:
>>>>
>>>> On Mar 14, 2013, at 11:44 AM, Robert Newson <[email protected]> wrote:
>>>>
>>>>> Conflicts are *not* removed during compaction; CouchDB has no way of
>>>>> knowing which ones it would be ok to delete.
>>>>
>>>> Yep, they need to be deleted in the context of the person/process
>>>> manipulating the docs.
>>>>
>>>>>
>>>>> CouchDB does struggle to process documents with lots of conflicts;
>>>>> we've encountered this at Cloudant a fair bunch. We resolve the
>>>>> conflicts via HTTP if possible or, if that consistently fails, with a
>>>>> direct Erlang manipulation. It's certainly something we need to
>>>>> improve.
>>>>>
>>>>
>>>> But even deleting them yields the same problem. When replicating, the
>>>> _deleted_conflicts list is carried over. Users could be diligent in
>>>> deleting conflicts, but still end up unable to replicate their docs
>>>> because of the volume of _deleted_conflicts.
>>>>
>>>> Robert, thanks for chiming in. I feel better knowing I'm in good company
>>>> with this problem. When this mine eventually goes off, CouchDB is
>>>> rendered useless because beam.smp takes all the CPU. Is there any way to
>>>> ration the resources CouchDB consumes?
>>>>
>>>>> B.
>>>>>
>>>>> On 14 March 2013 13:09, Riyad Kalla <[email protected]> wrote:
>>>>>> Stephen,
>>>>>> I am probably wrong here (someone hop in and correct me), but I thought
>>>>>> compaction would remove the old revisions (and conflicts) of docs.
>>>>>>
>>>>>> Alternatively, a question for the Couch devs: if Stephen set _revs_limit
>>>>>> to something artificially low, say 1, and restarted Couch and did a
>>>>>> compaction, would that force the DB to smash down the datastore to 1
>>>>>> rev per doc and remove the long tail of these docs?
>>>>>>
>>>>>> REF: http://wiki.apache.org/couchdb/Compaction
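>>>>>>
>>>>>> To be concrete, I mean something like this (untested sketch; the URL
>>>>>> and db name are made up):
>>>>>>
>>>>>>   // cap each doc's stored revision history at 1, then compact
>>>>>>   await fetch('http://localhost:5984/troubled/_revs_limit', {
>>>>>>     method: 'PUT', body: '1'
>>>>>>   });
>>>>>>   await fetch('http://localhost:5984/troubled/_compact', {
>>>>>>     method: 'POST', headers: { 'Content-Type': 'application/json' }
>>>>>>   });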
>>>>>>
>>>>>> On Thu, Mar 14, 2013 at 2:02 AM, Stephen Bartell
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> tl;dr: I've got a database with just a couple docs. Conflict management
>>>>>>> went unchecked and these docs have thousands of conflicts each.
>>>>>>> Replication fails. Couch consumes all the server's CPU.
>>>>>>>
>>>>>>> First the story, then the questions. Please bear with me!
>>>>>>>
>>>>>>> I wanted to replicate this database to another, new database. So I
>>>>>>> started the replication. beam.smp took 100% of my CPU and the
>>>>>>> replicator status held steady at a constant percent for quite a while.
>>>>>>> It eventually finished.
>>>>>>>
>>>>>>> I thought maybe I should handle the conflicts and then replicate.
>>>>>>> Hopefully it would go faster next time. So I cleared all the
>>>>>>> conflicts. I replicated again, but this time I could not get anything
>>>>>>> to replicate. Again, CPU held steady, topped out. I eventually
>>>>>>> restarted Couch.
>>>>>>>
>>>>>>> I dug through the logs and saw that the POSTs were failing. I figured
>>>>>>> that the replicator was timing out when trying to post to Couch.
>>>>>>>
>>>>>>> I have a replicator that I've been working on that's written in
>>>>>>> node.js. So I started that one up to do the same thing. I drew
>>>>>>> inspiration from PouchDB's replicator and from Jens Alfke's amazing
>>>>>>> replication algorithm documentation, so my replicator follows more or
>>>>>>> less the same story: 1) consume _changes with style=all_docs. 2)
>>>>>>> revs_diff on the target database. 3) get each revision from the source
>>>>>>> with revs=true. 4) bulk post with new_edits=false.
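>>>>>>>
>>>>>>> In code, the loop is roughly this (a trimmed, untested sketch of my
>>>>>>> node.js replicator; the URLs are made up, and batching and error
>>>>>>> handling are left out):
>>>>>>>
>>>>>>>   const src = 'http://localhost:5984/source';
>>>>>>>   const tgt = 'http://localhost:5984/target';
>>>>>>>   // 1) style=all_docs lists every leaf rev, conflicts included
>>>>>>>   const { results } = await (await fetch(`${src}/_changes?style=all_docs`)).json();
>>>>>>>   // 2) ask the target which of those revs it is missing
>>>>>>>   const revMap = {};
>>>>>>>   for (const row of results) revMap[row.id] = row.changes.map(c => c.rev);
>>>>>>>   const missing = await (await fetch(`${tgt}/_revs_diff`, {
>>>>>>>     method: 'POST', headers: { 'Content-Type': 'application/json' },
>>>>>>>     body: JSON.stringify(revMap)
>>>>>>>   })).json();
>>>>>>>   // 3) pull each missing rev from the source with its ancestry
>>>>>>>   const docs = [];
>>>>>>>   for (const [id, info] of Object.entries(missing)) {
>>>>>>>     for (const rev of info.missing) {
>>>>>>>       docs.push(await (await fetch(`${src}/${encodeURIComponent(id)}?rev=${rev}&revs=true`)).json());
>>>>>>>     }
>>>>>>>   }
>>>>>>>   // 4) write them all back without generating new revisions
>>>>>>>   await fetch(`${tgt}/_bulk_docs`, {
>>>>>>>     method: 'POST', headers: { 'Content-Type': 'application/json' },
>>>>>>>     body: JSON.stringify({ docs, new_edits: false })
>>>>>>>   });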
>>>>>>>
>>>>>>> Same thing. Except now I can kind of make sense of what's going on.
>>>>>>> Sucking the data out of the source is no problem. Diffing the revs
>>>>>>> against the target is no problem. Posting the docs is THE problem.
>>>>>>> Since the target database is clean, thousands of docs are being thrown
>>>>>>> at Couch at once to build up the revision trees. Couch is just taking
>>>>>>> forever to finish the job. It doesn't matter if I bulk post the docs
>>>>>>> or post them individually, Couch sucks 100% of my CPU every time and
>>>>>>> takes forever to finish. (I actually never let it finish.)
>>>>>>>
>>>>>>> So that is the story. Here are my questions.
>>>>>>>
>>>>>>> 1) Has anyone else stepped on this mine? If so, could I get pointed
>>>>>>> towards some workarounds? I don't think it is right to assume that
>>>>>>> users of CouchDB will never have databases with huge conflict sausages
>>>>>>> like this. So simply saying "manage your conflicts" won't help.
>>>>>>>
>>>>>>> 2) Let's say I did manage my conflicts. I still have the
>>>>>>> _deleted_conflicts sausage. I know that _deleted and _deleted_conflicts
>>>>>>> must be replicated to maintain consistency across the cluster. If the
>>>>>>> replicator throws up when these huge sausages come through, how is the
>>>>>>> data ever going to replicate? Is there a trade secret I don't know
>>>>>>> about?
>>>>>>>
>>>>>>> 3) Is there any limit on the resources that CouchDB is allowed to
>>>>>>> consume? I get that we run into these cases where there's tons of data
>>>>>>> to move and it's just going to take a hell of a long time. But I don't
>>>>>>> get why it's permissible for CouchDB to eat all my CPU. The whole
>>>>>>> server should never grind to a halt because it's moving lots of data.
>>>>>>> I feel like it should be like the little engine that could: just chug
>>>>>>> along slow and steady until it crests the hill.
>>>>>>>
>>>>>>> I would really like to rely on the Erlang replicator, but I can't. At
>>>>>>> least with the replicator I wrote, I have a chance of throttling the
>>>>>>> posts so CouchDB doesn't render my server useless.
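>>>>>>>
>>>>>>> E.g., a crude throttle (a sketch; postBulk is a hypothetical helper
>>>>>>> around _bulk_docs, and the batch size and delay are made up):
>>>>>>>
>>>>>>>   const sleep = ms => new Promise(r => setTimeout(r, ms));
>>>>>>>   for (let i = 0; i < docs.length; i += 100) {
>>>>>>>     await postBulk(docs.slice(i, i + 100)); // hypothetical helper
>>>>>>>     await sleep(500); // give Couch a breather between batches
>>>>>>>   }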
>>>>>>>
>>>>>>> Sorry for wrapping more questions into those questions. I'm pretty
>>>>>>> tired,
>>>>>>> stumped, and have machines in production crumbling.
>>>>>>>
>>>>>>> Best,
>>>>>>> Stephen
>>>>
>>