sergey-safarov opened a new issue, #4868:
URL: https://github.com/apache/couchdb/issues/4868

   
   ## Description
   
   We hit a case where the CouchDB daemon consumes most of the CPU on the server.
   Here is the data from monitoring:
   
![image](https://github.com/apache/couchdb/assets/2562241/7d612c54-e480-41e7-8760-fe005b53f05a)
   
   After inspecting the CouchDB logs, I can see these error messages:
   > [error] 2023-11-22T11:31:14.412591Z [email protected] 
<0.172.3089> 9419196231 rexi_server: from: 
[email protected](<14638.24595.7214>) mfa: fabric_rpc:reduce_view/4 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{mem3_util,get_or_create_db_int,2,[{file,"src/mem3_util.erl"},{line,566}]},{fabric_rpc,reduce_view,5,[{file,"src/fabric_rpc.erl"},{line,160}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:19.431161Z [email protected] 
<0.13590.3088> 60d3f30af5 rexi_server: from: 
[email protected](<14158.26486.7168>) mfa: fabric_rpc:reduce_view/4 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{mem3_util,get_or_create_db_int,2,[{file,"src/mem3_util.erl"},{line,566}]},{fabric_rpc,reduce_view,5,[{file,"src/fabric_rpc.erl"},{line,160}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:19.431262Z [email protected] 
<0.3944.3088> 60d3f30af5 rexi_server: from: 
[email protected](<14158.26486.7168>) mfa: fabric_rpc:reduce_view/4 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{mem3_util,get_or_create_db_int,2,[{file,"src/mem3_util.erl"},{line,566}]},{fabric_rpc,reduce_view,5,[{file,"src/fabric_rpc.erl"},{line,160}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:19.431353Z [email protected] 
<0.17148.3089> 60d3f30af5 rexi_server: from: 
[email protected](<14158.26486.7168>) mfa: fabric_rpc:reduce_view/4 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{mem3_util,get_or_create_db_int,2,[{file,"src/mem3_util.erl"},{line,566}]},{fabric_rpc,reduce_view,5,[{file,"src/fabric_rpc.erl"},{line,160}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:19.431439Z [email protected] 
<0.5374.3088> 60d3f30af5 rexi_server: from: 
[email protected](<14158.26486.7168>) mfa: fabric_rpc:reduce_view/4 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{mem3_util,get_or_create_db_int,2,[{file,"src/mem3_util.erl"},{line,566}]},{fabric_rpc,reduce_view,5,[{file,"src/fabric_rpc.erl"},{line,160}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:19.431523Z [email protected] 
<0.20478.3087> 60d3f30af5 rexi_server: from: 
[email protected](<14158.26486.7168>) mfa: fabric_rpc:reduce_view/4 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{mem3_util,get_or_create_db_int,2,[{file,"src/mem3_util.erl"},{line,566}]},{fabric_rpc,reduce_view,5,[{file,"src/fabric_rpc.erl"},{line,160}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:19.432234Z [email protected] 
<0.2365.3090> 60d3f30af5 rexi_server: from: 
[email protected](<14158.26486.7168>) mfa: fabric_rpc:reduce_view/4 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{mem3_util,get_or_create_db_int,2,[{file,"src/mem3_util.erl"},{line,566}]},{fabric_rpc,reduce_view,5,[{file,"src/fabric_rpc.erl"},{line,160}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:24.557262Z [email protected] 
<0.13361.3092> 573a4e9e0b rexi_server: from: 
[email protected](<0.7835.3087>) mfa: fabric_rpc:open_shard/2 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{couch_db,open,2,[{file,"src/couch_db.erl"},{line,163}]},{mem3_util,get_or_create_db,2,[{file,"src/mem3_util.erl"},{line,549}]},{fabric_rpc,open_shard,2,[{file,"src/fabric_rpc.erl"},{line,307}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:24.557464Z [email protected] 
<0.19582.3089> 573a4e9e0b rexi_server: from: 
[email protected](<0.7835.3087>) mfa: fabric_rpc:open_shard/2 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{couch_db,open,2,[{file,"src/couch_db.erl"},{line,163}]},{mem3_util,get_or_create_db,2,[{file,"src/mem3_util.erl"},{line,549}]},{fabric_rpc,open_shard,2,[{file,"src/fabric_rpc.erl"},{line,307}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:24.557691Z [email protected] 
<0.20989.3090> 573a4e9e0b rexi_server: from: 
[email protected](<0.7835.3087>) mfa: fabric_rpc:open_shard/2 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{couch_db,open,2,[{file,"src/couch_db.erl"},{line,163}]},{mem3_util,get_or_create_db,2,[{file,"src/mem3_util.erl"},{line,549}]},{fabric_rpc,open_shard,2,[{file,"src/fabric_rpc.erl"},{line,307}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:24.557865Z [email protected] 
<0.10185.3092> 573a4e9e0b rexi_server: from: 
[email protected](<0.7835.3087>) mfa: fabric_rpc:open_shard/2 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{couch_db,open,2,[{file,"src/couch_db.erl"},{line,163}]},{mem3_util,get_or_create_db,2,[{file,"src/mem3_util.erl"},{line,549}]},{fabric_rpc,open_shard,2,[{file,"src/fabric_rpc.erl"},{line,307}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   [error] 2023-11-22T11:31:24.558054Z [email protected] 
<0.28077.3090> 573a4e9e0b rexi_server: from: 
[email protected](<0.7835.3087>) mfa: fabric_rpc:open_shard/2 
error:function_clause 
[{couch_db,incref,[undefined],[{file,"src/couch_db.erl"},{line,190}]},{couch_server,open_int,2,[{file,"src/couch_server.erl"},{line,106}]},{couch_server,open,2,[{file,"src/couch_server.erl"},{line,96}]},{couch_db,open,2,[{file,"src/couch_db.erl"},{line,163}]},{mem3_util,get_or_create_db,2,[{file,"src/mem3_util.erl"},{line,549}]},{fabric_rpc,open_shard,2,[{file,"src/fabric_rpc.erl"},{line,307}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,141}]}]
   
   
   We already know that the errors containing `reduce_view` do not lead to high CPU usage.
   From my perspective, the `open_shard` errors are also present here, and they probably lead to the high CPU usage.
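
To check how the two error types are distributed over time, the `rexi_server` entries can be tallied by their `mfa:` field. The following is only a minimal sketch: the function name `count_errors_by_mfa` and the regular expression are assumptions based on the log excerpt above, not part of CouchDB, and the sample text is an abbreviated fragment of those log lines.

```python
import re
from collections import Counter

# Matches the "mfa: module:function/arity" field seen in the rexi_server
# error lines above (pattern is an assumption derived from that excerpt).
MFA_PATTERN = re.compile(r"mfa:\s+([\w.]+:\w+/\d+)")


def count_errors_by_mfa(log_text: str) -> Counter:
    """Count log entries per failing module:function/arity."""
    return Counter(MFA_PATTERN.findall(log_text))


# Abbreviated fragments of the log lines quoted in this issue.
sample = """
rexi_server: from: ... mfa: fabric_rpc:reduce_view/4
error:function_clause
rexi_server: from: ... mfa: fabric_rpc:open_shard/2
error:function_clause
rexi_server: from: ... mfa: fabric_rpc:open_shard/2
error:function_clause
"""

for mfa, count in count_errors_by_mfa(sample).most_common():
    print(f"{count:4d}  {mfa}")
```

Running this over the log for a window with high CPU and a window without it would show whether the `open_shard/2` entries appear only in the former.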
   
   ## Steps to Reproduce
   
   N/A
   
   ## Expected Behaviour
   
   One failed shard should be handled by the cluster: the data should be served by the other nodes, and the failure should not trigger high CPU usage.
   
   ## Your Environment
   A 3-node cluster on an IPv6-only network, running the docker image `apache/couchdb:3.2.1` (aarch64).
   
   ```
   [~]# curl -s http://[::1]:5984| jq
   {
     "couchdb": "Welcome",
     "version": "3.2.1",
     "git_sha": "244d428af",
     "uuid": "12137f5eef7d7e2ab0480a4bf0d890f7",
     "features": [
       "access-ready",
       "partitioned",
       "pluggable-storage-engines",
       "reshard",
       "scheduler"
     ],
     "vendor": {
       "name": "The Apache Software Foundation"
     }
   }
   ```
   
   * CouchDB version used: 3.2.1
   * Browser name and version: not used
   * Operating system and version: CentOS 8 Stream with docker image `apache/couchdb:3.2.1` aarch64
   
   ## Additional Context
   We use AWS T4 instances with the `aarch64` architecture.
   

