On Mar 14, 2013, at 3:36 PM, Robert Newson <[email protected]> wrote:
> Runaway processes are the very devil, but the problem is not specific
> to CouchDB; there is no CouchDB mechanism for this, just as there's no
> bash/python/ruby/perl method to limit a while(true){} loop.
Totally makes sense.
>
> Highly conflicted documents are painful to update and read. I can't do
> anything about that today.
Thanks for your feedback!
>
> B.
>
> On 14 March 2013 17:23, Stephen Bartell <[email protected]> wrote:
>> Robert, this only works if I don't need to keep those docs around anymore.
>> In my case, I want to keep the docs. I don't want to keep the conflicts of
>> the docs. Most importantly, though, even if I delete all the conflicts on
>> all my docs, I still have the problem of _deleted_conflicts. What I've
>> seen is that only a few docs with a few thousand _deleted_conflicts each
>> will plug up Couch and render it unusable. You can't get rid of it through
>> natural means.
>>
>> This is what Riyad was bringing up and what I've implemented. I have a
>> program which replicates from the troubled database's _changes with the
>> query param style=main_only. This allows me to keep the revision tree of
>> the troubled database, but without the _deleted_conflicts. I can then wipe
>> out the troubled db, recreate it, and replicate the shiny clean data back
>> into it.
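>>
>> Roughly, the program boils down to this (an untested sketch in node.js;
>> the db names and URL are made up):
>>
>>   const src = 'http://localhost:5984/troubled';
>>   const tgt = 'http://localhost:5984/clean';
>>   // main_only (the default style) lists only each doc's winning rev
>>   const { results } = await (await fetch(`${src}/_changes?style=main_only`)).json();
>>   for (const row of results) {
>>     if (row.deleted) continue; // skip tombstones entirely
>>     // revs=true pulls the winning rev plus its ancestry, no conflict branches
>>     const doc = await (await fetch(`${src}/${encodeURIComponent(row.id)}?revs=true`)).json();
>>     // new_edits=false preserves that rev history in the clean db
>>     await fetch(`${tgt}/_bulk_docs`, {
>>       method: 'POST',
>>       headers: { 'Content-Type': 'application/json' },
>>       body: JSON.stringify({ docs: [doc], new_edits: false })
>>     });
>>   }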
>>
>> This is unnatural and requires custom code to make happen. I can live with
>> it until a better solution comes around.
>>
>> What I'm really concerned about is how CouchDB eats all my CPU.
>>
>> Is there any way to ration the resources that CouchDB uses? Like tell it
>> not to use more than 50% or something. I think that Couch eating all the
>> resources on a machine just because it's reading loads of data is a bug.
>> Is this a reasonable conclusion?
>>
>> On Mar 14, 2013, at 2:18 PM, Robert Newson <[email protected]> wrote:
>>
>>> One trick: you can delete the doc and replicate with a filter like
>>> 'return !doc['_deleted'];' that blocks all deletes. The target db will
>>> then not receive any trace of these highly conflicted docs.
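>>>
>>> Concretely, something like this (untested; the names are made up). Save
>>> a filter function on the source:
>>>
>>>   {
>>>     "_id": "_design/repl",
>>>     "filters": {
>>>       "no_deleted": "function(doc, req) { return !doc._deleted; }"
>>>     }
>>>   }
>>>
>>> then reference it when triggering the replication:
>>>
>>>   POST /_replicate
>>>   { "source": "troubled", "target": "clean", "filter": "repl/no_deleted" }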
>>>
>>> On 14 March 2013 14:10, Stephen Bartell <[email protected]> wrote:
>>>>
>>>> On Mar 14, 2013, at 11:44 AM, Robert Newson <[email protected]> wrote:
>>>>
>>>>> Conflicts are *not* removed during compaction; CouchDB has no way of
>>>>> knowing which ones it would be ok to delete.
>>>>
>>>> Yep, they need to be deleted in the context of the person/process
>>>> manipulating the docs.
>>>>
>>>>>
>>>>> CouchDB does struggle to process documents with lots of conflicts;
>>>>> we've encountered this at Cloudant a fair bunch. We resolve the
>>>>> conflicts via HTTP if possible or, if that consistently fails, with a
>>>>> direct Erlang manipulation. It's certainly something we need to
>>>>> improve.
>>>>>
>>>>
>>>> But even deleting them yields the same problem. When replicating, the
>>>> _deleted_conflicts list is carried over. Users could be diligent in
>>>> deleting conflicts, but still end up unable to replicate their docs
>>>> because of the volume of _deleted_conflicts.
>>>>
>>>> Robert, thanks for chiming in. I feel better knowing I'm in good company
>>>> with this problem. When this mine eventually goes off, CouchDB is
>>>> rendered useless because beam.smp takes all the CPU. Is there any way to
>>>> ration the resources CouchDB consumes?
>>>>
>>>>> B.
>>>>>
>>>>> On 14 March 2013 13:09, Riyad Kalla <[email protected]> wrote:
>>>>>> Stephen,
>>>>>> I am probably wrong here (someone hop in and correct me), but I thought
>>>>>> compaction would remove the old revisions (and conflicts) of docs.
>>>>>>
>>>>>> Alternatively, a question for the Couch devs: if Stephen set _revs_limit
>>>>>> to something artificially low, say 1, and restarted Couch and did a
>>>>>> compaction, would that force the DB to smash down the datastore to 1
>>>>>> rev per doc and remove the long tail of these docs?
>>>>>>
>>>>>> REF: http://wiki.apache.org/couchdb/Compaction
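>>>>>>
>>>>>> To be concrete, I mean something like this (untested sketch; the URL
>>>>>> and db name are made up):
>>>>>>
>>>>>>   // cap each doc's stored revision history at 1, then compact
>>>>>>   await fetch('http://localhost:5984/troubled/_revs_limit', {
>>>>>>     method: 'PUT', body: '1'
>>>>>>   });
>>>>>>   await fetch('http://localhost:5984/troubled/_compact', {
>>>>>>     method: 'POST', headers: { 'Content-Type': 'application/json' }
>>>>>>   });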
>>>>>>
>>>>>> On Thu, Mar 14, 2013 at 2:02 AM, Stephen Bartell
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> tl;dr: I've got a database with just a couple docs. Conflict management
>>>>>>> went unchecked and these docs have thousands of conflicts each.
>>>>>>> Replication fails. Couch consumes all the server's CPU.
>>>>>>>
>>>>>>> First the story, then the questions. Please bear with me!
>>>>>>>
>>>>>>> I wanted to replicate this database to another, new database. So I
>>>>>>> started the replication. beam.smp took 100% of my CPU and the
>>>>>>> replicator status held steady at a constant percent for quite a while.
>>>>>>> It eventually finished.
>>>>>>>
>>>>>>> I thought maybe I should handle the conflicts and then replicate.
>>>>>>> Hopefully it would go faster next time. So I cleared all the
>>>>>>> conflicts. I replicated again, but this time I could not get anything
>>>>>>> to replicate. Again, CPU held steady, topped out. I eventually
>>>>>>> restarted Couch.
>>>>>>>
>>>>>>> I dug through the logs and saw that the POSTs were failing. I figured
>>>>>>> that the replicator was timing out when trying to post to Couch.
>>>>>>>
>>>>>>> I have a replicator that I've been working on that's written in
>>>>>>> node.js. So I started that one up to do the same thing. I drew
>>>>>>> inspiration from PouchDB's replicator and from Jens Alfke's amazing
>>>>>>> replication algorithm documentation, so my replicator follows more or
>>>>>>> less the same story: 1) consume _changes with style=all_docs. 2)
>>>>>>> revs_diff on the target database. 3) get each revision from the source
>>>>>>> with revs=true. 4) bulk post with new_edits=false.
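>>>>>>>
>>>>>>> In code, the loop is roughly this (a trimmed, untested sketch of my
>>>>>>> node.js replicator; the URLs are made up, and batching and error
>>>>>>> handling are left out):
>>>>>>>
>>>>>>>   const src = 'http://localhost:5984/source';
>>>>>>>   const tgt = 'http://localhost:5984/target';
>>>>>>>   // 1) style=all_docs lists every leaf rev, conflicts included
>>>>>>>   const { results } = await (await fetch(`${src}/_changes?style=all_docs`)).json();
>>>>>>>   // 2) ask the target which of those revs it is missing
>>>>>>>   const revMap = {};
>>>>>>>   for (const row of results) revMap[row.id] = row.changes.map(c => c.rev);
>>>>>>>   const missing = await (await fetch(`${tgt}/_revs_diff`, {
>>>>>>>     method: 'POST', headers: { 'Content-Type': 'application/json' },
>>>>>>>     body: JSON.stringify(revMap)
>>>>>>>   })).json();
>>>>>>>   // 3) pull each missing rev from the source with its ancestry
>>>>>>>   const docs = [];
>>>>>>>   for (const [id, info] of Object.entries(missing)) {
>>>>>>>     for (const rev of info.missing) {
>>>>>>>       docs.push(await (await fetch(`${src}/${encodeURIComponent(id)}?rev=${rev}&revs=true`)).json());
>>>>>>>     }
>>>>>>>   }
>>>>>>>   // 4) write them all back without generating new revisions
>>>>>>>   await fetch(`${tgt}/_bulk_docs`, {
>>>>>>>     method: 'POST', headers: { 'Content-Type': 'application/json' },
>>>>>>>     body: JSON.stringify({ docs, new_edits: false })
>>>>>>>   });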
>>>>>>>
>>>>>>> Same thing. Except now I can kind of make sense of what's going on.
>>>>>>> Sucking the data out of the source is no problem. Diffing the revs
>>>>>>> against the target is no problem. Posting the docs is THE problem.
>>>>>>> Since the target database is clean, thousands of docs are being thrown
>>>>>>> at Couch at once to build up the revision trees. Couch is just taking
>>>>>>> forever to finish the job. It doesn't matter if I bulk post the docs
>>>>>>> or post them individually, Couch sucks 100% of my CPU every time and
>>>>>>> takes forever to finish. (I actually never let it finish.)
>>>>>>>
>>>>>>> So that is the story. Here are my questions.
>>>>>>>
>>>>>>> 1) Has anyone else stepped on this mine? If so, could I get pointed
>>>>>>> towards some workarounds? I don't think it is right to assume that
>>>>>>> users of CouchDB will never have databases with huge conflict sausages
>>>>>>> like this. So simply saying "manage your conflicts" won't help.
>>>>>>>
>>>>>>> 2) Let's say I did manage my conflicts. I still have the
>>>>>>> _deleted_conflicts sausage. I know that _deleted and _deleted_conflicts
>>>>>>> must be replicated to maintain consistency across the cluster. If the
>>>>>>> replicator throws up when these huge sausages come through, how is the
>>>>>>> data ever going to replicate? Is there a trade secret I don't know
>>>>>>> about?
>>>>>>>
>>>>>>> 3) Is there any limit on the resources that CouchDB is allowed to
>>>>>>> consume? I get that we run into these cases where there's tons of data
>>>>>>> to move and it's just going to take a hell of a long time. But I don't
>>>>>>> get why it's permissible for CouchDB to eat all my CPU. The whole
>>>>>>> server should never grind to a halt because it's moving lots of data.
>>>>>>> I feel like it should be like the little engine that could: just chug
>>>>>>> along slow and steady until it crests the hill.
>>>>>>>
>>>>>>> I would really like to rely on the Erlang replicator, but I can't. At
>>>>>>> least with the replicator I wrote, I have a chance of throttling the
>>>>>>> posts so CouchDB doesn't render my server useless.
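>>>>>>>
>>>>>>> E.g., a crude throttle (a sketch; postBulk is a hypothetical helper
>>>>>>> around _bulk_docs, and the batch size and delay are made up):
>>>>>>>
>>>>>>>   const sleep = ms => new Promise(r => setTimeout(r, ms));
>>>>>>>   for (let i = 0; i < docs.length; i += 100) {
>>>>>>>     await postBulk(docs.slice(i, i + 100)); // hypothetical helper
>>>>>>>     await sleep(500); // give Couch a breather between batches
>>>>>>>   }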
>>>>>>>
>>>>>>> Sorry for wrapping more questions into those questions. I'm pretty
>>>>>>> tired,
>>>>>>> stumped, and have machines in production crumbling.
>>>>>>>
>>>>>>> Best,
>>>>>>> Stephen
>>>>
>>