Hi,

There are no votes, no elections, and no leader nodes.

CouchDB chooses availability over consistency: it will accept reads and 
writes even if only one node hosting the shard ranges being read or written 
is up. 

In a 3-node, 3-replica cluster, where every node hosts a copy of every shard, 
a single surviving node is enough for all reads and writes to succeed.
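
Both q and n are fixed per database at creation time, as query parameters on 
the PUT. A quick sketch with Python's requests library, against a 
hypothetical node at localhost:5984 with made-up admin credentials:

    import requests

    BASE = "http://localhost:5984"   # hypothetical node address
    AUTH = ("admin", "secret")       # hypothetical admin credentials

    # Create a database split into 8 shard ranges (q=8), with each
    # range replicated to 3 nodes (n=3).
    r = requests.put(f"{BASE}/demo", params={"q": 8, "n": 3}, auth=AUTH)
    print(r.status_code, r.json())   # 201 and {"ok": true} on success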

Every node in the cluster can coordinate a read or write. The coordinator 
creates N concurrent, independent read/write requests and sends them to the 
nodes that the shard map indicates for that document id. It then waits, up 
to the request timeout, for a quorum of replies before merging them into the 
HTTP response to the client. For a write, a 201 is returned if the quorum 
was reached, and a 202 if at least one replica accepted the write but the 
quorum was not. Reads return a 200 whether the quorum was reached or not; 
the difference is that you get a faster reply when it is, and otherwise 
you're waiting for the timeout. 
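
For illustration, here is a rough sketch of what that looks like from the 
client side, again with Python's requests library; the node address, 
database name, document id and credentials are all made up:

    import requests

    BASE = "http://localhost:5984"   # hypothetical node address
    AUTH = ("admin", "secret")       # hypothetical admin credentials

    # The coordinator fans this write out to the n nodes that host
    # this document's shard range and waits for w acknowledgements.
    r = requests.put(f"{BASE}/demo/doc-1", json={"value": 42}, auth=AUTH)
    if r.status_code == 201:
        print("quorum reached: w replicas acknowledged the write")
    elif r.status_code == 202:
        print("accepted: written to at least one replica, quorum not met")
    else:
        print("unexpected:", r.status_code, r.text)

    # The shard map can be inspected per document id: this returns the
    # shard range the id hashes to and the nodes hosting a copy of it.
    shard = requests.get(f"{BASE}/demo/_shards/doc-1", auth=AUTH)
    print(shard.json())   # e.g. {"range": "00000000-1fffffff", "nodes": [...]}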

When CouchDB believes nodes to be down, the quorum is implicitly lowered to 
avoid that latency penalty.

In your scenario the two offline nodes obviously do not receive the writes 
while they are down, but once they are back up, internal replication copies 
those writes over from the surviving nodes, restoring the expected N copies 
of each document.
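
You can see the read side of this with the per-request r parameter; a rough 
sketch, with the same made-up names as above:

    import requests

    BASE = "http://localhost:5984"   # hypothetical node address
    AUTH = ("admin", "secret")       # hypothetical admin credentials

    # r can be overridden per request: r=1 answers with the first
    # replica's reply, while the default r=2 waits for a quorum of
    # replies before responding. Either way the status is 200.
    r = requests.get(f"{BASE}/demo/doc-1", params={"r": 1}, auth=AUTH)
    print(r.status_code, r.json())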

B.

> On 14 Jun 2023, at 07:11, Luca Morandini <luca.morandi...@gmail.com> wrote:
> 
> Folks,
> 
> A student (I teach CouchDB as part of a Cloud Computing course) pointed
> out that, on a 4-node, 3-replica cluster, the database should stop
> accepting requests when 2 nodes are down.
> 
> His rationale is: the quorum (assuming its default value of 2) in principle
> can be reached, but since some of the shards are not present on both nodes,
> the quorum of replicas cannot be reached even when there are still 2 nodes
> standing.
> 
> This did not chime with my experience, hence I did a little experiment:
> - set a cluster with 4 nodes and cluster parameters set to
> "q":8,"n":3,"w":2,"r":2;
> - created a database;
> - added a few documents;
> - stopped 2 nodes out of 4;
> - added another 10,000 documents without a hiccup.
> 
> I checked the two surviving nodes, and there were 6 shard files
> representing the 8 shards in each node: 4 shards were replicated, and 4
> were not.
> Therefore, about 5,000 of the write operations must have hit the
> un-replicated shards.
> 
> In other words, who has the vote in a quorum election: all the nodes, or
> only the nodes that host the shard with the sought document?
> 
> Cheers,
> 
> Luca Morandini
