Yeah, what I need is a GET that will return the document, with attachments, in that format.
-Mikeal

On Sep 14, 2011, at 12:19 PM, Adam Kocoloski wrote:

> There's a multipart API which allows for a single PUT request containing the
> document body as JSON and all its attachments in their raw form.
> Documentation is pretty thin at the moment, and unfortunately I think it
> doesn't quite allow for a pipe(). Would be really nice if it did, though.
>
> On Wednesday, September 14, 2011 at 1:16 PM, Mikeal Rogers wrote:
>
>> npm is mostly attachments and I haven't seen any issues so far.
>>
>> I wish there was a better way to replicate attachments atomically for a
>> single revision, but if there is, I don't know about it.
>>
>> It's probably a huge JSON operation and it sucks, but I don't have to parse
>> it in node.js, I just pipe() the body right along.
>>
>> -Mikeal
>>
>> On Sep 14, 2011, at 8:42 AM, Adam Kocoloski wrote:
>>
>>> Hi Mikeal, I just took a quick peek at your code. It looks like you handle
>>> attachments by inlining all of them into the JSON representation of the
>>> document. Does that ever cause problems when dealing with the ~100 MB
>>> attachments in the npm repo?
>>>
>>> I've certainly seen my fair share of problems with attachment replication
>>> in CouchDB 1.0.x. I have a sneaking suspicion that there are latent bugs
>>> related to incorrect determinations of Content-Length under various
>>> compression scenarios.
>>>
>>> Adam
>>>
>>> On Tuesday, September 13, 2011 at 5:08 PM, Mikeal Rogers wrote:
>>>
>>>> My replicator is fairly young, so I think calling it "reliable" might be
>>>> a little misleading.
>>>>
>>>> It does less: I don't ever attempt to cache the high watermark (last seq
>>>> written) and start over from there. If the process crashes, it just
>>>> starts over from scratch. This can lead to a delay after restart, but I
>>>> find that it's much simpler and more reliable on failure.
>>>>
>>>> It's also simpler because it doesn't have to contend with being an http
>>>> client and a client of the internal couchdb erlang API. It just proxies
>>>> requests from one couch to another.
>>>>
>>>> While I'm sure there are bugs in it that I haven't found yet, I can say
>>>> that it replicates the npm repository quite well and I'm using it in
>>>> production.
>>>>
>>>> -Mikeal
>>>>
>>>> On Sep 13, 2011, at 11:44 AM, Max Ogden wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> From what I understand, the current state of the replicator (as of 1.1)
>>>>> is that for certain types of collections of documents it can be somewhat
>>>>> fragile. In the case of the node.js package repository, http://npmjs.org,
>>>>> there are many relatively large (~100MB) documents that would sometimes
>>>>> throw errors or time out during replication and crash the replicator, at
>>>>> which point the replicator would restart and attempt to pick up where it
>>>>> left off. I am not an expert in the internals of the replicator, but
>>>>> apparently the cumulative time the replicator spent repeatedly crashing
>>>>> and then relocating itself in the _changes feed made the built-in couch
>>>>> replicator unusable for replicating the node package manager.
>>>>>
>>>>> Two solutions exist that I know of. There is a new replicator in trunk
>>>>> (not to be confused with the _replicator db from 1.1 -- it is still
>>>>> using the old replicator algorithms), and there is also a more reliable
>>>>> replicator written in node.js, https://github.com/mikeal/replicate,
>>>>> that was written specifically to replicate the node package repository
>>>>> between hosting providers.
>>>>>
>>>>> Additionally, it may be useful if you could describe the 'fingerprint'
>>>>> of your documents a bit. How many documents are in the failing
>>>>> databases? Are the documents large or small? Do they have many
>>>>> attachments? How large is your _changes feed?
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Max
>>>>>
>>>>> On Tue, Sep 13, 2011 at 11:22 AM, Chris Stockton
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We now have about 150 dbs that are refusing to replicate with random
>>>>>> crashes, which provide really zero debug information. The error is "db
>>>>>> not found", but I know it's available. Does anyone know how I can
>>>>>> troubleshoot this? Do we just have too many databases replicating for
>>>>>> couchdb to handle? 4000 is a small number for the massive hardware
>>>>>> these are running on.
>>>>>>
>>>>>> -Chris