Re: How many replications per server?

Mark Richter Tue, 30 Oct 2018 14:33:20 -0700

I would, but it would be much better written by someone who knows what they're 
talking about.  This is all really new to me.


Adam? :-)

________________________________________
From: Joan Touzet <woh...@apache.org>
Sent: Tuesday, October 30, 2018 2:10 PM
To: user@couchdb.apache.org
Subject: Re: How many replications per server?

Hi Mark,

Thanks for the suggestion! Are you volunteering? :) Please remember that 
CouchDB is a volunteer-run project. We're always on the lookout for new 
contributors.

If you know your way around a text editor and GitHub, we'd love a pull request 
here:

https://github.com/apache/couchdb-documentation/

All the best,
Joan

----- Original Message -----
From: "Mark Richter" <mrich...@solarflare.com>
To: user@couchdb.apache.org
Sent: Tuesday, October 30, 2018 3:46:05 PM
Subject: Re: How many replications per server?

It would be better to put examples like that in the official documentation so 
we don't have to search the web to find what needs to be documented in the 
first place.

Just my $0.02.

Mark

________________________________________
From: Adam Kocoloski <kocol...@apache.org>
Sent: Tuesday, October 30, 2018 10:53 AM
To: Andrea Brancatelli
Cc: user@couchdb.apache.org
Subject: Re: How many replications per server?

Precisely :)

I agree the settings can be difficult to grok. I’ll bet a few examples in a 
blog post would go a long way towards illustrating the interplay between them. 
Cheers,

Adam
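
As one possible illustration of that interplay, here is a toy Python model of the loop Andrea sketches in the quoted message below. It is not CouchDB's scheduler code. The function names, the round-robin rotation policy, and the choice to fill free slots without counting them against max_churn are all assumptions made for the sketch. The idea it shows: max_jobs caps how many replications run at once, and once more jobs exist than max_jobs, each interval swaps at most max_churn running jobs for waiting ones.

```python
from collections import deque


def reschedule(running, pending, max_jobs, max_churn):
    """Run one toy scheduling cycle and return (started, stopped) job ids.

    running and pending are deques of job ids; both are modified in place.
    """
    started, stopped = [], []

    # Only jobs that cannot fit into a free slot need somebody else to stop.
    free = max(0, max_jobs - len(running))
    waiting_without_slot = max(0, len(pending) - free)
    swaps = min(max_churn, waiting_without_slot, len(running))

    # Stop the longest-running jobs first (assumed round-robin policy).
    for _ in range(swaps):
        stopped.append(running.popleft())

    # Start waiting jobs until max_jobs replications are running again.
    while pending and len(running) < max_jobs:
        job = pending.popleft()
        running.append(job)
        started.append(job)

    # Stopped jobs go to the back of the queue and get another turn later.
    pending.extend(stopped)
    return started, stopped


def simulate(total_jobs, max_jobs, max_churn, cycles):
    """Return one (running, started, stopped) count tuple per cycle."""
    running, pending = deque(), deque(range(total_jobs))
    history = []
    for _ in range(cycles):
        started, stopped = reschedule(running, pending, max_jobs, max_churn)
        history.append((len(running), len(started), len(stopped)))
    return history


if __name__ == "__main__":
    # 64 replications, 16 allowed to run at once, 4 swapped per interval.
    for cycle, counts in enumerate(simulate(64, 16, 4, 3), start=1):
        print(cycle, counts)
```

In this model, with 64 jobs, max_jobs = 16 and max_churn = 4, the first cycle starts 16 jobs and every later cycle stops 4 and starts 4, so all 64 replications eventually get a turn while no more than 16 run at any moment.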

> On Oct 30, 2018, at 1:03 PM, Andrea Brancatelli <abrancate...@schema31.it> 
> wrote:
>
> Thanks Adam, that's what I thought as well, but believe me I'm having a 
> really hard time understanding the explanation of max_jobs and max_churn 
> from the docs.
>
> I don't exactly get the difference between those two values. My first guess 
> was that max_jobs was a systemwide max value while max_churn would define how 
> many jobs would run at the same time.
>
> I tried it and it wasn't working as expected.
>
> Now I just reread it and I'm guessing that
>
> while true {
>   if (jobs > max_jobs) {
>     for (x = 1 to max_churn) {
>       kill_or_start(something)
>     }
>   }
>   sleep(interval)
> }
>
> Is this correct?
>
> ---
> Andrea Brancatelli
>
> On 2018-10-30 17:17, Adam Kocoloski wrote:
>
>> Hi Andrea, your numbers don't sound crazy for an out-of-the-box setup.
>>
>> Worth noting that in CouchDB 2.1 and above there is a replication scheduler 
>> which can cycle through an ~unlimited number of continuous replications 
>> within a defined resource envelope. The scheduler is documented here:
>>
>> http://docs.couchdb.org/en/stable/replication/replicator.html#replication-scheduler
>>
>> There are a number of configuration properties that govern the behavior of 
>> the scheduler and also the default resources allocated to any particular 
>> replication. These are clustered in the [replicator] configuration block:
>>
>> http://docs.couchdb.org/en/stable/config/replicator.html#replicator
>>
>> The `worker_processes` and `http_connections` in particular can have a 
>> significant impact on the resource consumption of each replication job. If 
>> your goal is to host a large number of lightweight replications you could 
>> reduce those settings, and then configure the scheduler to keep a large 
>> `max_jobs` running. It's also possible to override resource settings on a 
>> per-replication basis.
>>
>> Cheers, Adam
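
The per-replication override Adam mentions could look roughly like the Python sketch below, which builds a replication document carrying its own worker_processes and http_connections values. The helper name, the example database URLs, and the chosen numbers are hypothetical. The sketch only constructs the JSON body and assumes it would then be saved as a document in the _replicator database.

```python
import json


def make_replication_doc(source, target, worker_processes=1,
                         http_connections=2, continuous=True):
    """Build a replication document with per-replication resource settings.

    Hypothetical helper: it only assembles the document body and does not
    talk to a CouchDB server.
    """
    if worker_processes < 1 or http_connections < 1:
        raise ValueError("worker_processes and http_connections must be >= 1")
    return {
        "source": source,
        "target": target,
        "continuous": continuous,
        # Values set here apply to this replication only; the [replicator]
        # config section keeps supplying the defaults for all other jobs.
        "worker_processes": worker_processes,
        "http_connections": http_connections,
    }


if __name__ == "__main__":
    # Example URLs are placeholders, not taken from the thread.
    doc = make_replication_doc(
        "http://localhost:5984/central",
        "http://localhost:5984/userdb-alice",
    )
    print(json.dumps(doc, indent=2))
```

Keeping each job this small is what makes a large max_jobs affordable: the per-job cost drops, so the scheduler can keep more of them running inside the same resource envelope.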
>>
>>
>>> On Oct 30, 2018, at 11:52 AM, Stefan Klein <st.fankl...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> can't comment on the behavior of recent, 2.x, versions of couchdb.
>>>
>>> Long time ago, with couchdb 1.4 or so I ran a similar test.
>>> Our solution was to:
>>> * keep a list of "active" users (by our application specific definition)
>>> * listen to _db_changes
>>> * run one-shot replications for the changed documents to the per-user dbs
>>> of the users who got access to the documents and are "active"
>>> When a user becomes "active" - again determined by application logic - a
>>> one-shot replication is run to bring the per-user db up to date.
>>>
>>> Sadly this logic is deeply integrated in our application code and can't be
>>> easily extracted to a module (we're using nodejs).
>>> It's also basically unchanged since then and we have to adapt to couchdb
>>> 2.x.
>>>
>>> regards,
>>> Stefan
>>>
>>>
>>> On Tue, 30 Oct 2018 at 16:22, Andrea Brancatelli
>>> <abrancate...@schema31.it> wrote:
>>>
>>>> Sorry the attachment got stripped - here it is:
>>>> https://pasteboard.co/HKRwOFy.png
>>>>
>>>> ---
>>>>
>>>> Andrea Brancatelli
>>>>
>>>> On 2018-10-30 15:51, Andrea Brancatelli wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm just curious - I know it's a pretty vague question, but how
>>>>> many continuous replication jobs can one expect to run on a single
>>>>> "common" machine?
>>>>>
>>>>>
>>>>> By "common" I mean a quad/octa core with ~16GB RAM...
>>>>>
>>>>> I don't need an exact number, just the order of it... 1? 10? 100? 1000?
>>>>>
>>>>> I've read a lot about the per-user approach, the filtered replication
>>>>> and all that stuff, but on a test server with 64 replication jobs (1
>>>>> central user and 32 test users) the machine is brought to its knees:
>>>>>
>>>>>
>>>>> root@bigdata-free-rm-01:~/asd # uptime
>>>>> 3:50PM up 5 days, 4:55, 3 users, load averages: 9.28, 9.84, 9.39
>>>>>
>>>>> I'm attaching a screenshot of current htop output (filtered for CouchDB
>>>>> user, but it's the only thing running on the machine)...
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Andrea Brancatelli
>>
>>

The information contained in this message is confidential and is intended for 
the addressee(s) only. If you have received this message in error, please 
notify the sender immediately and delete the message. Unless you are an 
addressee (or authorized to receive for an addressee), you may not use, copy or 
disclose to anyone this message or any information contained in this message. 
The unauthorized use, disclosure, copying or alteration of this message is 
strictly prohibited.
