Hi all,

Recently we've run into a couple of scaling problems, and we hope some of you can shed some light on them.
First, some background on what we're doing with CouchDB. We run a mobile app (hubapp.com) and use TouchDB/Couchbase Lite to replicate to a remote CouchDB server. Our app lets users create different Hubs, and each Hub corresponds to a separate database in CouchDB. (We would have created just one database for all Hubs, but filtered replication was too slow for mobile clients as the number of Hubs grew.) So, as time has passed, we are now at 16,000+ databases on one server. The server is hosted on AWS with 8 GB of memory and 100 GB of disk space. We average about 6,000 connections to CouchDB throughout the day; 5,000+ of those come from a Node.js process that uses "follow" to listen to the changes feed of each "active" database on the server.

The problem we've been having is that sometimes CouchDB takes up all the memory on the box, to the point that the OS steps in and kills the CouchDB process. A "beam" process then starts again, but it is effectively a zombie and does not accept any connections until we restart CouchDB manually. Does anyone know whether this is caused by the number of connections (5,000+) to the server, or by the number of databases (16,000+) on it? I also have the Erlang crash dump from when CouchDB stopped accepting connections, and would happily share it if that helps.

Regards,
Herman
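P.S. In case it helps, the per-database listener is essentially the following. This is a simplified sketch assuming the "follow" npm package; the server URL, the heartbeat value, and the function names are hypothetical placeholders, not our exact code:

```javascript
// Simplified sketch of the per-database changes listener described above.
// Assumes the "follow" npm package; COUCH_URL-style values and the
// heartbeat interval here are hypothetical placeholders.

// Pure helper: build the options for one database's _changes feed.
function feedOptions(couchUrl, dbName) {
  return {
    db: couchUrl + '/' + dbName, // one feed (and HTTP connection) per database
    since: 'now',                // only report changes from now on
    heartbeat: 30000             // keep the idle connection alive
  };
}

// Open one long-lived _changes connection for a single database.
// The `follow` module is injected so the sketch stays self-contained.
function watchDatabase(follow, couchUrl, dbName, onChange) {
  var feed = new follow.Feed(feedOptions(couchUrl, dbName));
  feed.on('change', onChange);
  feed.on('error', function (err) {
    console.error('feed error on ' + dbName + ':', err);
  });
  feed.follow();
  return feed; // caller can feed.stop() to release the connection
}
```

Each "active" database gets its own watchDatabase() call, which is where the 5,000+ concurrent connections come from.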
