Hi all,

I’ve been doing a bit of poking around the container orchestration space lately and 
looking at how we might best deploy a CouchDB 2.0 cluster in a container 
environment. In general I’ve been pretty impressed with the design point of the 
Kubernetes project, and I wanted to see how hard it would be to put together a 
proof of concept.

As a preamble, I needed to put together a container image for 2.0 that just 
runs a single Erlang VM instead of the container-local “dev cluster”. You can 
find that work here:

https://github.com/klaemo/docker-couchdb/pull/52

So far, so good - now for Kubernetes itself. My goal was to figure out how to 
deploy a collection of “Pods” that could discover one another and self-assemble 
into a cluster. Kubernetes differs from the traditional Docker network model in 
that every Pod gets an IP address that is routable from all other Pods in the 
cluster. As a result there’s no need for some of the port gymnastics that one 
might encounter with other Docker environments - each CouchDB pod can listen on 
5984, 4369 and whatever distribution port you like on its own IP.
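To make that concrete, here’s a rough sketch of such a Pod using the Kubernetes 
Python client (purely illustrative - the image tag and the pinned distribution 
port 9100 are placeholders I made up, not something the PR above sets up):

    # Illustrative only: one CouchDB pod exposing the HTTP API, epmd and a
    # pinned Erlang distribution port directly on its own pod IP.
    from kubernetes import client, config

    config.load_kube_config()

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="couchdb-0", labels={"app": "couchdb"}),
        spec=client.V1PodSpec(containers=[
            client.V1Container(
                name="couchdb",
                image="klaemo/couchdb:2.0",  # hypothetical single-node image tag
                ports=[
                    client.V1ContainerPort(container_port=5984),  # HTTP API
                    client.V1ContainerPort(container_port=4369),  # epmd
                    client.V1ContainerPort(container_port=9100),  # Erlang distribution
                ],
            )
        ]),
    )

    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)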

What you don’t get with Pods is a hostname that’s discoverable from other Pods 
in the cluster. A “Service” (a replicated, load-balanced collection of Pods) 
can optionally have a DNS name, but the Pods themselves do not. This throws a 
wrench in the most common distributed Erlang setup, where each node gets a name 
like “couchdb@FQDN” and the FQDNs are resolvable to IP addresses via DNS.
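The flip side is that peers have to be discovered by IP instead. One way to do 
that (just a sketch, not necessarily how a finished setup should work) is to 
have each pod ask the API server for its peers by label selector:

    # Sketch: discover peer CouchDB pods by label and collect their IPs.
    # Assumes the pod runs under a service account allowed to list pods.
    from kubernetes import client, config

    config.load_incluster_config()  # we are running inside the cluster
    v1 = client.CoreV1Api()

    pods = v1.list_namespaced_pod(namespace="default", label_selector="app=couchdb")
    peer_ips = [p.status.pod_ip for p in pods.items if p.status.pod_ip]
    print(peer_ips)  # e.g. ['10.244.1.7', '10.244.2.4']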

It is certainly possible to specify an Erlang node name like 
“couchdb@12.34.56.78”, but we need to be a bit 
careful here. CouchDB is currently forcing the Erlang node name to do 
“double-duty”; it’s both the way that the nodes in a cluster figure out how to 
route traffic to one another and it’s the identifier for nodes to claim 
ownership over individual replicas of database shards in the shard map. 
Speaking from experience it’s often quite useful operationally to remap a given 
Erlang node name to a new server and have the new server be automatically 
populated with the replicas it’s supposed to own. If we use the Pod IP in 
Kubernetes for the node name we won’t have that luxury.
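For reference, the IP-based assembly itself is simple enough - roughly the 
following sketch, assuming the node-local _nodes database on port 5986 and a 
peer IP discovered as above:

    # Sketch: ask the local CouchDB node to join couchdb@<peer-ip> by
    # creating a document in the node-local _nodes database.
    import requests

    peer_ip = "10.244.1.7"  # placeholder peer pod IP
    resp = requests.put(
        "http://127.0.0.1:5986/_nodes/couchdb@%s" % peer_ip,
        json={},
    )
    print(resp.status_code, resp.text)

The catch is that the document ID there is the Erlang node name itself, so 
there’s nowhere to record a stable identity once a pod is rescheduled onto a 
new IP.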

I think the best path forward here would be to extend the “Node” concept in a 
CouchDB cluster so that it has an identifier which is allowed to be distinct 
from the Erlang node name. The “CouchDB Node” is the one that owns database 
shard replicas, and it can be remapped to different distributed Erlang nodes 
over time via modification of an attribute in the _nodes DB.
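Purely as an illustration of what I have in mind (the attribute name is made 
up; nothing like this exists today), a _nodes document could end up looking 
something like:

    # Hypothetical _nodes document under this proposal: the _id is a stable
    # CouchDB node identifier that owns shard replicas, while the Erlang node
    # it currently maps to is just an attribute that can be edited later.
    node_doc = {
        "_id": "node1",
        "erlang_node": "couchdb@10.244.1.7",
    }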

Hope you all found this useful. I’m quite interested in finding ways to make it 
easier for users to acquire a highly-available cluster configured in the “right 
way”, and I think projects like Kubernetes have a lot of promise in this 
regard. Cheers,

Adam
