Hello list,

We recently had an outage on a CouchDB cluster in which a problem on a single node blocked all writes and took down the application.

We are running a 6 node cluster, 3 nodes in dc1 and 3 in dc2, with the following settings:

[cluster]
placement = dc1:2,dc2:2
q=8
r=2
w=2
n=3
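
As a rough sanity check on why this outage was unexpected, here is a minimal Python sketch of the quorum arithmetic. It assumes each shard range has n copies and a write succeeds once w of them acknowledge. It ignores timeouts and placement zones, and the function name is made up for illustration.

```python
def writes_can_succeed(n, w, failed_copies):
    """Return True if at least w of the n copies can still acknowledge a write."""
    # Simplifying assumption: a failed copy never acknowledges, a healthy one always does.
    return n - failed_copies >= w


# With the settings above, losing one copy of a shard should still leave a quorum.
print(writes_can_succeed(n=3, w=2, failed_copies=1))  # True
```

Under these assumptions a single bad node should not be enough to block writes, which is what makes the behaviour below puzzling.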

CouchDB version:

 curl -X GET http://127.0.0.1:5984
{"couchdb":"Welcome","version":"2.3.0","git_sha":"07ea0c7","uuid":"cb381f835e3922fbc183ce3eb05c1da8","features":["pluggable-storage-engines","scheduler"],"vendor":{"name":"The Apache Software Foundation"}}

Log files from 5 of the nodes show this error:

[error] 2020-01-23T17:58:30.640197Z [email protected] <0.10369.1330> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/40000000-5fffffff/account/2b/94/20ca9a5e0d16ddccf18f4fa20e8b.1548248531">>
[error] 2020-01-23T17:58:30.641167Z [email protected] <0.19108.1333> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/40000000-5fffffff/account/2b/94/20ca9a5e0d16ddccf18f4fa20e8b.1548248531">>
[error] 2020-01-23T17:58:33.035259Z [email protected] <0.1810.1334> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/40000000-5fffffff/account/39/47/df16b2d43514a7d41dcf7f4c8d73.1548248538">>
[error] 2020-01-23T17:58:33.037218Z [email protected] <0.14681.1330> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/40000000-5fffffff/account/39/47/df16b2d43514a7d41dcf7f4c8d73.1548248538">>
[error] 2020-01-23T17:58:50.634245Z [email protected] <0.17849.1330> 3a094d698b fabric_worker_timeout open_doc,'[email protected]',<<"shards/20000000-3fffffff/services.1548248909">>
[error] 2020-01-23T17:58:52.388263Z [email protected] <0.19332.1329> 96611add99 fabric_worker_timeout open_doc,'[email protected]',<<"shards/e0000000-ffffffff/accounts.1548248787">>

This points to a problem on d2b3. The log on that node shows:


/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.275556Z [email protected] <0.26325.1238> -------- rexi_server: from: [email protected](<14354.16318.1129>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.275362Z [email protected] <0.18708.1236> -------- rexi_server: from: [email protected](<0.1032.1234>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.275584Z [email protected] <0.31988.1233> -------- rexi_server: from: [email protected](<14353.32154.1338>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.275596Z [email protected] <0.15622.1235> -------- rexi_server: from: [email protected](<14352.30846.1369>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.275741Z [email protected] <0.8781.1235> -------- rexi_server: from: [email protected](<14355.25402.1152>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.276072Z [email protected] <0.797.1235> -------- rexi_server: from: [email protected](<14354.28647.1129>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.276243Z [email protected] <0.12311.1234> -------- rexi_server: from: [email protected](<14353.14446.1337>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.277333Z [email protected] <0.23752.1233> -------- rexi_server: from: [email protected](<14352.13570.1369>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.278143Z [email protected] <0.6221.1234> -------- rexi_server: from: [email protected](<0.5950.1235>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.278168Z [email protected] <0.10161.1234> -------- rexi_server: from: [email protected](<14355.31878.1152>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.278193Z [email protected] <0.11752.1236> 1f52cdfea8 rexi_server: from: [email protected](<14351.5309.1331>) mfa: fabric_rpc:map_view/5 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,map_view,5,[{file,"src/fabric_rpc.erl"},{line,148}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.278016Z [email protected] <0.12329.1234> 1f52cdfea8 rexi_server: from: [email protected](<14351.5309.1331>) mfa: fabric_rpc:map_view/5 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,map_view,5,[{file,"src/fabric_rpc.erl"},{line,148}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.278083Z [email protected] <0.31002.1236> 1f52cdfea8 rexi_server: from: [email protected](<14351.5309.1331>) mfa: fabric_rpc:map_view/5 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,map_view,5,[{file,"src/fabric_rpc.erl"},{line,148}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:29.278238Z [email protected] <0.13296.1235> 1f52cdfea8 rexi_server: from: [email protected](<14351.5309.1331>) mfa: fabric_rpc:map_view/5 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,map_view,5,[{file,"src/fabric_rpc.erl"},{line,148}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:35.180064Z [email protected] <0.963.1234> -------- fabric_worker_timeout open_doc,'[email protected]',<<"shards/40000000-5fffffff/account/2b/94/20ca9a5e0d16ddccf18f4fa20e8b.1548248531">>
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:35.191265Z [email protected] <0.28963.1233> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/40000000-5fffffff/account/2b/94/20ca9a5e0d16ddccf18f4fa20e8b.1548248531">>
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:37.646983Z [email protected] <0.18624.1233> -------- fabric_worker_timeout open_revs,'[email protected]',<<"shards/40000000-5fffffff/account/39/47/df16b2d43514a7d41dcf7f4c8d73.1548248538">>
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.738966Z [email protected] <0.2331.1233> -------- rexi_server: from: [email protected](<14354.29531.1130>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.739545Z [email protected] <0.12476.1232> -------- rexi_server: from: [email protected](<14351.21208.1333>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.739759Z [email protected] <0.22395.1235> -------- rexi_server: from: [email protected](<14352.13905.1370>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.740098Z [email protected] <0.32082.1236> -------- rexi_server: from: [email protected](<14352.28036.1369>) mfa: fabric_rpc:open_doc/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.740428Z [email protected] <0.24543.1234> -------- rexi_server: from: [email protected](<14355.20090.1150>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.741481Z [email protected] <0.19440.1234> -------- rexi_server: from: [email protected](<14353.25323.1335>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.742429Z [email protected] <0.2034.1236> -------- rexi_server: from: [email protected](<14351.16958.1327>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.742587Z [email protected] <0.25660.1235> -------- rexi_server: from: [email protected](<14352.32337.1370>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.742765Z [email protected] <0.20392.1236> -------- rexi_server: from: [email protected](<14355.2799.1152>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.742889Z [email protected] <0.3251.1233> -------- rexi_server: from: [email protected](<14354.12701.1130>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.743525Z [email protected] <0.15372.1236> -------- rexi_server: from: [email protected](<14353.16323.1335>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.778236Z [email protected] <0.18751.1234> -------- rexi_server: from: [email protected](<0.15370.1235>) mfa: fabric_rpc:open_revs/4 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,with_db,3,[{file,"src/fabric_rpc.erl"},{line,331}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
/var/log/couchdb/couchdb.log-20200124.gz:[error] 2020-01-23T17:58:40.780347Z [email protected] <0.10143.1237> -------- rexi_server: from: [email protected](<0.19312.1235>) mfa: fabric_rpc:update_docs/3 error:function_clause [{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,184}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,85}]},{fabric_rpc,read_repair_filter,4,[{file,"src/fabric_rpc.erl"},{line,349}]},{fabric_rpc,update_docs,3,[{file,"src/fabric_rpc.erl"},{line,274}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,140}]}]
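
In case it helps anyone triaging output like the above, here is a minimal Python sketch that tallies the entries by failing call and by timed-out shard. The regular expressions are written against the two message shapes visible in this excerpt only, and the function name `summarize` is made up for illustration, so treat it as a starting point rather than a general CouchDB log parser.

```python
import re
from collections import Counter

# "mfa: module:function/arity error:reason", as seen in the rexi_server entries.
# \s+ tolerates entries that were wrapped across lines when pasted.
REXI_RE = re.compile(r"mfa:\s+(\S+)\s+error:(\S+)")

# fabric_worker_timeout <call>,'<node>',<<"<shard path>">>
TIMEOUT_RE = re.compile(r"fabric_worker_timeout (\w+),'[^']*',<<\"([^\"]+)\">>")


def summarize(text):
    """Count rexi_server failures per (mfa, error) and timeouts per (call, shard)."""
    rexi = Counter(REXI_RE.findall(text))
    timeouts = Counter(TIMEOUT_RE.findall(text))
    return rexi, timeouts


def report(text):
    """Print both tallies, most frequent first."""
    rexi, timeouts = summarize(text)
    for (mfa, error), count in rexi.most_common():
        print(f"{count:4d}  {mfa}  {error}")
    for (call, shard), count in timeouts.most_common():
        print(f"{count:4d}  {call}  {shard}")
```

Calling `report()` on the decompressed log text gives a quick view of whether the failures are confined to particular calls or shards.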

I restarted CouchDB on d2b3 only and service was restored.

Can anyone suggest the cause of this problem or what I can do to debug the issue further?

Many Thanks

Alan

