Hi Ulrich, sharding is indeed per-database. This allows for an important degree 
of flexibility but it does introduce maintenance overhead when you have a lot 
of databases. The system databases you mentioned do have their own sharding 
documents which can be modified if you want to redistribute them across the 
cluster. Note that this is not required as you scale the cluster; nodes can 
still access the information in those databases regardless of the presence of a 
“local” shard. Of course, if you’re planning on removing a node hosting shards 
of those databases, you should move the shards first to preserve the replica 
level.
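
For what it's worth, here is a rough sketch of the edit you'd be repeating for 
each database. It is not an official tool, just an illustration in Python. It 
assumes the sharding document has the usual "by_node", "by_range" and 
"changelog" fields; the function name and the node names are made up for the 
example, and it deliberately leaves out the HTTP calls that fetch and upload 
the document.

```python
import copy


def add_node_to_shard_map(shard_doc, node, ranges=None):
    """Return a copy of a sharding document with `node` added as a replica host.

    Hypothetical helper: `ranges` defaults to every range already listed in
    "by_range". The input document is left untouched.
    """
    doc = copy.deepcopy(shard_doc)
    if ranges is None:
        ranges = sorted(doc["by_range"])
    hosted = doc["by_node"].setdefault(node, [])
    for shard_range in ranges:
        nodes = doc["by_range"].setdefault(shard_range, [])
        if node in nodes:
            continue  # already a replica for this range, nothing to record
        nodes.append(node)
        if shard_range not in hosted:
            hosted.append(shard_range)
        # Record the change so the document's history stays readable.
        doc["changelog"].append(["add", shard_range, node])
    return doc


# Made-up sharding document for a two-range database on a single node.
example = {
    "_id": "userdb-abc",
    "_rev": "1-0123456789abcdef",
    "changelog": [
        ["add", "00000000-7fffffff", "couchdb@node1.example.com"],
        ["add", "80000000-ffffffff", "couchdb@node1.example.com"],
    ],
    "by_node": {
        "couchdb@node1.example.com": ["00000000-7fffffff", "80000000-ffffffff"],
    },
    "by_range": {
        "00000000-7fffffff": ["couchdb@node1.example.com"],
        "80000000-ffffffff": ["couchdb@node1.example.com"],
    },
}

updated = add_node_to_shard_map(example, "couchdb@node2.example.com")
```

The "_rev" field is carried over unchanged, so the modified document can be 
uploaded as an ordinary update of the revision you fetched. With many 
per-user databases you would run the same edit in a loop over the database 
names.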

The sharding document is a normal document and absolutely does have revisions. 
We found the changelog to be a useful asset when resolving any merge conflicts 
introduced in a concurrent rebalancing exercise. Cheers,

Adam

> On Jan 7, 2018, at 6:08 AM, Ulrich Mayring <[email protected]> wrote:
> 
> Hello,
> 
> I haven't quite understood the 2.1.1 documentation for sharding in one 
> aspect: it is described how to get the sharding document for one database, 
> how to edit it by e. g. adding a node to it and how to upload it again. I've 
> tried that and it works fine.
> 
> However, if I have the couch_per_user feature turned on, then there are 
> potentially thousands of databases. Suppose I add a new node to the cluster, 
> do I then need to follow this procedure for all databases in order to balance 
> data? Or is it enough to do it for one database? I suppose an equivalent 
> question would be: are the shards per database or per cluster?
> 
> And, somewhat related: what about the _users, _global_changes and _replicator 
> databases? Do I need to edit their sharding document as well, whenever I add 
> or remove a cluster node?
> 
> I also find it interesting that the sharding document has no revisions and 
> instead relies on changelog entries.
> 
> many thanks in advance for any enlightenment,
> 
> Ulrich
> 
