Folks, A student (I teach CouchDB as part of a Cloud Computing course), pointed out that, on a 4-node, 3-replica cluster, the database should stop accepting requests when 2 nodes are down.
His rationale is: the quorum (assuming its default value of 2) in principle can be reached, but since some of the shards are not present on both nodes, the quorum of replicas cannot be reached even when there are still 2 nodes standing. This did not chime with my experience, hence I did a little experiment: - set a cluster with 4 nodes and cluster parameters set to "q":8,"n":3,"w":2,"r":2; - created a database; - added a few documents; - stopped 2 nodes out of 4; - added another 10,000 documents without a hiccup. I checked the two surviving nodes, and there were 6 shard files representing the 8 shards in each node: 4 shards were replicated, and 4 were not. Therefore, about 5,000 of the write operations must have hit the un-replicated shards. In other words, who has the vote in a quorum election: all the nodes, or only the nodes that host the shard with the sought document? Cheers, Luca Morandini