Hi Carlos,

Here's the reply I got on the parallel dev mailing list (to the question I
originally asked). It probably helps answer your question:

The answer is slightly different depending on whether you're using the
1.x/2.0 releases or current master.

If you're using a 1.x or 2.0 release: Set "max_replication_retry_count =
infinity" so it will always retry failed replications. That setting
controls how the whole replication job restarts if there is any error. Then
"retries_per_request" can be used to handle errors for individual
replicator HTTP requests, basically the case where a quick immediate retry
succeeds. The default value for "retries_per_request" is 10. After the
first failure there is a 0.25 second wait; on the next failure it doubles
to 0.5 seconds, and so on. The maximum wait interval is 5 minutes. But if
you expect to be offline routinely, it may not be worth retrying individual
requests for too long, so reduce "retries_per_request" to 6 or 7. Individual
requests would then retry a few times for about 10-20 seconds before the
whole replication job crashes and retries.
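
For intuition, the per-request backoff can be sketched in a few lines of
Python. This is just an illustrative model of the doubling schedule, not
the actual CouchDB code; the 0.25 second start and 5 minute cap are the
values described above:

```python
def request_retry_waits(retries, first_wait=0.25, max_wait=300.0):
    """Waits (in seconds) between successive attempts of one HTTP request.

    Illustrative model only: the wait starts at first_wait, doubles after
    each failure, and is capped at max_wait (5 minutes).
    """
    waits = []
    wait = first_wait
    for _ in range(retries - 1):  # n attempts have n - 1 waits between them
        waits.append(min(wait, max_wait))
        wait *= 2
    return waits

# With retries_per_request = 7, total in-request retry time stays under 20 s:
print(sum(request_retry_waits(7)))  # 0.25 + 0.5 + 1 + 2 + 4 + 8 = 15.75
```

That is where the "about 10-20 seconds" figure for a setting of 6 or 7
comes from.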

If you're using current master, which has the new scheduling replicator: No
need to set "max_replication_retry_count"; that setting is gone, and all
replication jobs will always retry for as long as the replication document
exists. "retries_per_request" works the same as above. The replication
scheduler also does exponential backoff when replication jobs fail
consecutively. The first backoff is 30 seconds; it then doubles to 1 minute,
2 minutes, and so on. The maximum backoff wait is about 8 hours. But if you
don't want to wait 4 hours on average for the replication to restart when
network connectivity is restored, and would rather it be about 5 minutes or
so, set "max_history = 8" in the "replicator" config section. max_history
controls how much history of past events is retained for each replication
job. If there is less history of consecutive crashes, the backoff wait
interval will also be shorter.
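
Assuming the scheduler's backoff simply doubles per consecutive crash (a
simplification of the behavior described above; the real scheduler code may
differ in details such as randomization), the wait before a restart looks
roughly like this:

```python
MAX_BACKOFF = 8 * 3600.0  # ~8 hour ceiling, per the description above

def scheduler_backoff(consecutive_crashes, first_wait=30.0, max_wait=MAX_BACKOFF):
    """Rough model: 30 s after the first crash, doubling with each
    consecutive crash, capped at about 8 hours. Not the actual scheduler
    code, just the doubling arithmetic.
    """
    return min(first_wait * 2 ** (consecutive_crashes - 1), max_wait)

# Fewer retained crash events means a smaller exponent, hence a shorter
# backoff; that is why lowering max_history shortens the worst-case wait.
print(scheduler_backoff(2))   # 60.0 seconds
print(scheduler_backoff(11))  # 28800.0 seconds: the ~8 hour ceiling
```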

So to summarize, for 1.x/2.0 releases:

[replicator]
max_replication_retry_count = infinity
retries_per_request = 6

For current master:

[replicator]
max_history = 8
retries_per_request = 6

Cheers,
-Nick


> On Jun 21, 2017, at 1:40 AM, Carlos Alonso <[email protected]> wrote:
> 
> Hi guys,
> 
> What happens after the *retries_per_request* retries limit is reached? Is
> the replication stopped? How is it notified? We're experiencing 'frozen'
> replications from time to time and I was wondering if that may be the
> case... But I don't think we're seeing anything in the logs...
> 
> Regards
> 
> On Tue, Jun 20, 2017 at 6:22 PM Vladimir Kuznetsov <[email protected]>
> wrote:
> 
>> Thank you Eric!
>> 
>> Forgot to mention, I'm using 2.0. I just got a reply in a parallel thread,
>> so it looks like this behavior is the same for 1.6 and 2.0.
>> 
>> --Vovan
>> 
>>> On Jun 20, 2017, at 5:30 AM, Eiri <[email protected]> wrote:
>>> 
>>> Hi Vovan,
>>> 
>>> If we are talking Couch 1.6, the attribute *retries_per_request*
>> controls the number of attempts the current replication will make to read
>> the _changes feed before giving up. The attribute *max_replication_retry_count*
>> controls the number of times the whole replication job will be
>> retried by the replication manager. Setting this attribute to “infinity”
>> should make the replication manager never give up.
>>> 
>>> I don’t think the interval between those attempts is configurable. As
>> far as I understand, it starts at 2.5 seconds between retries and then
>> doubles until it reaches 10 minutes, which is a hard upper limit.
>>> 
>>> 
>>> Regards,
>>> Eric
>>> 
>>>> On Jun 20, 2017, at 00:31, Vladimir Kuznetsov <[email protected]>
>> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I'm currently exploring CouchDB replication and trying to figure out
>> the difference between *max_replication_retry_count* and
>> *retries_per_request* configuration options in *[replicator]* section of
>> configuration file.
>>>> 
>>>> Basically I want to configure continuous replication of local couchdb
>> to the remote instance that would never stop replication attempts,
>> considering potentially continuous periods of being offline(days or even
>> weeks). So, I'd like to have infinite replication attempts with maximum
>> retry interval of 5 minutes or so. Can I do this? Do I need to change
>> default configuration to achieve this?
>>>> 
>>>> thanks,
>>>> --Vovan
>>> 
>> 
