Hi,

I'm in the process of migrating a CouchDB cluster from 2.0 to 2.3 by creating 
a new cluster, replicating the databases over to it, and eventually switching 
over to the new one. Generally this process is going fine, but I'm getting 
errors similar to the following when running my applications against the new 
cluster:

[error] 2019-02-21T07:04:51.213276Z couchdb@10.0.30.239 <0.17397.4590> f346ddb688 
rexi_server: from: couchdb@10.0.30.239 (<0.32026.4592>) mfa: fabric_rpc:map_view/5 
error:{badmatch,{error,all_dbs_active}} 
[{fabric_rpc,map_view,5,[{file,"src/fabric_rpc.erl"},{line,148}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]

I'm using the default max_dbs_open value of 500 (which is preset in the 
default.ini file). As far as I understand it, this should be plenty, and it's 
what I'm successfully using on my current 2.0 cluster with no errors. I may be 
misunderstanding how this setting works though.

I have about 90 databases in the cluster, and all I'm currently running is a 
couple of scripts:


  1.  A "build views" script that runs every hour, that goes through each 
database and queries each of the views (in series).
  2.  A "conflict resolver" script that runs every 15 minutes, that queries all 
databases for conflicts and then performs custom logic to deal with conflicts 
(though there won't be any conflicts on our new server at this time, so it's 
just querying the conflicts view on each database)
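
For context, the view-building pass boils down to a loop roughly like the 
following (a simplified sketch, not the actual script; the host, credentials 
and system-database handling are placeholders):

import requests

COUCH = "http://admin:password@localhost:5984"  # placeholder host/credentials

def build_views():
    """Query every view on every database in series to bring the indexes up to date."""
    for db in requests.get(f"{COUCH}/_all_dbs").json():
        if db.startswith("_"):  # skip system databases such as _users
            continue
        ddocs = requests.get(
            f"{COUCH}/{db}/_all_docs",
            params={"startkey": '"_design/"', "endkey": '"_design0"',
                    "include_docs": "true"},
        ).json()["rows"]
        for row in ddocs:
            ddoc = row["id"].split("/", 1)[1]
            for view in row["doc"].get("views", {}):
                # limit=0 still forces the view index to be updated
                requests.get(f"{COUCH}/{db}/_design/{ddoc}/_view/{view}",
                             params={"limit": 0})

if __name__ == "__main__":
    build_views()

The conflict resolver works much the same way, except it only queries a 
single "conflicts" view per database.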

I also previously had continuous bidirectional replication set up between the 
new cluster and the old one, and the "all_dbs_active" error was happening 
quite often (a couple of times per hour). I've since cancelled all the 
replication jobs, and the error rate has dropped to about one or two 
instances per day.
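
For reference, continuous bidirectional replication in this sense is roughly 
equivalent to creating a pair of _replicator documents per database, along 
these lines (hosts, credentials and document IDs are placeholders, and the 
exact mechanism I used may have differed):

import requests

OLD = "http://admin:password@old-cluster:5984"  # placeholder hosts
NEW = "http://admin:password@new-cluster:5984"

def replicate_continuously(db):
    """Create one continuous replication job in each direction for a database."""
    for source, target, tag in ((OLD, NEW, "old-to-new"), (NEW, OLD, "new-to-old")):
        requests.put(
            f"{NEW}/_replicator/{db}-{tag}",
            json={"source": f"{source}/{db}",
                  "target": f"{target}/{db}",
                  "continuous": True,
                  "create_target": True},
        )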

I haven't yet tried increasing the max_dbs_open value (which seems to be a 
common suggestion for dealing with the "all_dbs_active" error), because the 
live 2.0 cluster is working fine with the default value of 500, and has higher 
load on it than the new 2.3 cluster.
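
If I do end up raising it, my understanding is that it's just an override in 
local.ini on each node, e.g. (the value here is only an example):

[couchdb]
; overrides the max_dbs_open = 500 shipped in default.ini
max_dbs_open = 2000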

Does anyone have suggestions on what I should look at to try to resolve this 
issue?

I'm running the cluster on Ubuntu 18.04 LTS.

Thanks!
Jake Kroon
Software Engineer

D: +61 8 9318 6949
E: jkr...@immersivetechnologies.com




