cg1972 opened a new issue, #90:
URL: https://github.com/apache/couchdb-helm/issues/90

   
   
   **Describe the bug**
   We used the Helm chart to install a 3-node CouchDB cluster. One of the
nodes in the cluster (the coordinator node) restarts on a regular basis,
usually once a day.
   
   The CouchDB pod reports the following error:
   `Container couchdb failed liveness probe, will be restarted`
   
   The CouchDB logs show the following errors:
   `[notice] 2022-06-23T06:23:43.733589Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30690.22> b938e4d3fa 192.168.230.108:5984 10.1.2.179 undefined GET /_up 200 
ok 29613
   [notice] 2022-06-23T06:23:45.153723Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30707.22> 7bc4d55e3e 192.168.230.108:5984 10.1.2.179 undefined GET /_up 200 
ok 14872
   [notice] 2022-06-23T06:23:45.154154Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30697.22> d48973481d 192.168.230.108:5984 10.1.2.179 undefined GET /_up 200 
ok 13
   [error] 2022-06-23T06:23:45.273373Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30706.22> 45a001ddb1 req_err(2751202856) timeout : The request could not be 
processed in a reasonable amount of time.
       [<<"gen_server:call/2 L238">>,<<"chttpd_misc:handle_up_req/1 
L274">>,<<"chttpd:handle_req_after_auth/2 L327">>,<<"chttpd:process_request/1 
L310">>,<<"chttpd:handle_request_int/1 L249">>,<<"mochiweb_http:headers/6 
L150">>,<<"proc_lib:init_p_do_apply/3 L226">>]
   [error] 2022-06-23T06:23:45.274458Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30691.22> c975e456eb req_err(2751202856) timeout : The request could not be 
processed in a reasonable amount of time.
       [<<"gen_server:call/2 L238">>,<<"chttpd_misc:handle_up_req/1 
L274">>,<<"chttpd:handle_req_after_auth/2 L327">>,<<"chttpd:process_request/1 
L310">>,<<"chttpd:handle_request_int/1 L249">>,<<"mochiweb_http:headers/6 
L150">>,<<"proc_lib:init_p_do_apply/3 L226">>]
   [notice] 2022-06-23T06:23:45.274005Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30706.22> 45a001ddb1 192.168.230.108:5984 10.1.2.179 undefined GET /_up 500 
ok 15095
   [notice] 2022-06-23T06:23:45.274958Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30691.22> c975e456eb 192.168.230.108:5984 10.1.2.179 undefined GET /_up 500 
ok 34630
   [error] 2022-06-23T06:23:45.915833Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.23685.4> 
-------- gen_server couch_prometheus_server terminated with reason: 
{timeout,{gen_server,call,[couch_stats_aggregator,fetch]}} at 
gen_server:call/2(line:238) <= couch_stats_aggregator:fetch/0(line:44) <= 
couch_prometheus_server:get_couchdb_stats/0(line:94) <= 
couch_prometheus_server:refresh_metrics/0(line:87) <= 
couch_prometheus_server:handle_info/2(line:74) <= 
gen_server:try_dispatch/4(line:689) <= gen_server:handle_msg/6(line:765) <= 
proc_lib:init_p_do_apply/3(line:226)
     last msg: redacted
        state: {st,<<"# TYPE couchdb_couch_log_requests_total 
counter\ncouchdb_couch_log_requests_total{level=\"alert\"} 
0\ncouchdb_couch_log_requests_total{level=\"critical\"} 
0\ncouchdb_couch_log_requests_total{level=\"debug\"} 
0\ncouchdb_couch_log_requests_total{level=\"emergency\"} 
0\ncouchdb_couch_log_requests_total{level=\"error\"} 
0\ncouchdb_couch_log_requests_total{level=\"info\"} 
7\ncouchdb_couch_log_requests_total{level=\"notice\"} 
18573\ncouchdb_couch_log_requests_total{level=\"warning\"} 0\n# TYPE 
couchdb_couch_replicator_changes_manager_deaths_total 
counter\ncouchdb_couch_replicator_changes_manager_deaths_total 0\n# TYPE 
couchdb_couch_replicator_changes_queue_deaths_total 
counter\ncouchdb_couch_replicator_changes_queue_deaths_total 0\n# TYPE 
couchdb_couch_replicator_changes_read_failures_total 
counter\ncouchdb_couch_replicator_changes_read_failures_total 0\n# TYPE 
couchdb_couch_replicator_changes_reader_deaths_total 
counter\ncouchdb_couch_replicator_changes_reader_deaths
 _total 0\n# TYPE couchdb_couch_replicator_checkpoints_failure_total 
counter\ncouchdb_couch_replicator_checkpoints_failure_total 0\n# TYPE 
couchdb_couch_replicator_checkpoints_total 
counter\ncouchdb_couch_replicator_checkpoints_total 0\n# TYPE 
couchdb_couch_replicator_cluster_is_stable 
gauge\ncouchdb_couch_replicator_cluster_is_stable 1\n# TYPE 
couchdb_couch_replicator_connection_acquires_total 
counter\ncouchdb_couch_replicator_connection_acquires_total 0\n# TYPE 
couchdb_couch_replicator_connection_closes_total 
counter\ncouchdb_couch_replicator_connection_closes_total 0\n# TYPE 
couchdb_couch_replicator_connection_creates_total 
counter\ncouchdb_couch_replicator_connection_creates_total 0\n# TYPE 
couchdb_couch_replicator_connection_owner_crashes_total 
counter\ncouchdb_couch_replicator_connection_owner_crashes_total 0\n# TYPE 
couchdb_couch_replicator_connection_releases_total 
counter\ncouchdb_couch_replicator_connection_releases_total 0\n# TYPE 
couchdb_couch_replicator_connection_worker
 _crashes_total 
counter\ncouchdb_couch_replicator_connection_worker_crashes_total 0\n# TYPE 
couchdb_couch_replicator_db_scans_total 
counter\ncouchdb_couch_replicator_db_scans_total 1\n# TYPE 
couchdb_couch_replicator_docs_completed_state_updates_total 
counter\ncouchdb_couch_replicator_docs_completed_state_updates_total 0\n# TYPE 
couchdb_couch_replicator_docs_db_changes_total 
counter\ncouchdb_couch_replicator_docs_db_changes_total 0\n# TYPE 
couchdb_couch_replicator_docs_dbs_created_total 
counter\ncouchdb_couch_replicator_docs_dbs_created_total 0\n# TYPE 
couchdb_couch_replicator_docs_dbs_deleted_total 
counter\ncouchdb_couch_replicator_docs_dbs_deleted_total 0\n# TYPE 
couchdb_couch_replicator_docs_dbs_found_total 
counter\ncouchdb_couch_replicator_docs_dbs_found_total 2\n# TYPE 
couchdb_couch_replicator_docs_failed_state_updates_total 
counter\ncouchdb_couch_replicator_docs_failed_state_updates_total 0\n# TYPE 
couchdb_couch_replicator_failed_starts_total 
counter\ncouchdb_couch_replicator_fa
 iled_starts_total 0\n# TYPE couchdb_couch_replicator_jobs_adds_total 
counter\ncouchdb_couch_replicator_jobs_adds_total 0\n# TYPE 
couchdb_couch_replicator_jobs_crashed 
gauge\ncouchdb_couch_replicator_jobs_crashed 0\n# TYPE 
couchdb_couch_replicator_jobs_crashes_total 
counter\ncouchdb_couch_replicator_jobs_crashes_total 0\n# TYPE 
couchdb_couch_replicator_jobs_duplicate_adds_total 
counter\ncouchdb_couch_replicator_jobs_duplicate_adds_total 0\n# TYPE 
couchdb_couch_replicator_jobs_pending 
gauge\ncouchdb_couch_replicator_jobs_pending 0\n# TYPE 
couchdb_couch_replicator_jobs_removes_total 
counter\ncouchdb_couch_replicator_jobs_removes_total 0\n# TYPE 
couchdb_couch_replicator_jobs_running 
gauge\ncouchdb_couch_replicator_jobs_running 0\n# TYPE 
couchdb_couch_replicator_jobs_starts_total 
counter\ncouchdb_couch_replicator_jobs_starts_total 0\n# TYPE 
couchdb_couch_replicator_jobs_stops_total 
counter\ncouchdb_couch_replicator_jobs_stops_total 0\n# TYPE 
couchdb_couch_replicator_jobs_total gauge\ncou
 chdb_couch_replicator_jobs_total 0\n# TYPE 
couchdb_couch_replicator_requests_total 
counter\ncouchdb_couch_replicator_requests_total 0\n# TYPE 
couchdb_couch_replicator_responses_failure_total 
counter\ncouchdb_couch_replicator_responses_failure_total 0\n# TYPE 
couchdb_couch_replicator_responses_total 
counter\ncouchdb_couch_replicator_responses_total 0\n# TYPE 
couchdb_couch_replicator_stream_responses_failure_total 
counter\ncouchdb_couch_replicator_stream_responses_failure_total 0\n# TYPE 
couchdb_couch_replicator_stream_responses_total 
counter\ncouchdb_couch_replicator_stream_responses_total 0\n# TYPE 
couchdb_couch_replicator_worker_deaths_total 
counter\ncouchdb_couch_replicator_worker_deaths_total 0\n# TYPE 
couchdb_couch_replicator_workers_started_total 
counter\ncouchdb_couch_replicator_workers_started_total 0\n# TYPE 
couchdb_auth_cache_requests_total counter\ncouchdb_auth_cache_requests_total 
0\n# TYPE couchdb_auth_cache_misses_total 
counter\ncouchdb_auth_cache_misses_total 0\n# TYPE
  couchdb_collect_results_time_seconds 
summary\ncouchdb_collect_results_time_seconds{quantile=\"0.5\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.75\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.9\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.95\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.99\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.999\"} 
0.0\ncouchdb_collect_results_time_seconds_sum 
0.0\ncouchdb_collect_results_time_seconds_count 0\n# TYPE 
couchdb_couch_server_lru_skip_total 
counter\ncouchdb_couch_server_lru_skip_total 0\n# TYPE 
couchdb_database_purges_total counter\ncouchdb_database_purges_total 0\n# TYPE 
couchdb_database_reads_total counter\ncouchdb_database_reads_total 24\n# TYPE 
couchdb_database_writes_total counter\ncouchdb_database_writes_total 0\n# TYPE 
couchdb_db_open_time_seconds 
summary\ncouchdb_db_open_time_seconds{quantile=\"0.5\"} 
0.0\ncouchdb_db_open_time_seconds{quantile=\"0.75\"} 0.0\ncouchdb_db_open_
 time_seconds{quantile=\"0.9\"} 
0.0\ncouchdb_db_open_time_seconds{quantile=\"0.95\"} 
0.0\ncouchdb_db_open_time_seconds{quantile=\"0.99\"} 
0.0\ncouchdb_db_open_time_seconds{quantile=\"0.999\"} 
0.0\ncouchdb_db_open_time_seconds_sum 0.0\ncouchdb_db_open_time_seconds_count 
0\n# TYPE couchdb_dbinfo_seconds 
summary\ncouchdb_dbinfo_seconds{quantile=\"0.5\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.75\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.9\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.95\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.99\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.999\"} 0.0\ncouchdb_dbinfo_seconds_sum 
0.0\ncouchdb_dbinfo_seconds_count 0\n# TYPE couchdb_document_inserts_total 
counter\ncouchdb_document_inserts_total 7\n# TYPE 
couchdb_document_purges_failure_total 
counter\ncouchdb_document_purges_failure_total 0\n# TYPE 
couchdb_document_purges_success_total 
counter\ncouchdb_document_purges_success_total 0\n# TYPE 
couchdb_document_purges_total_total counter\ncouchdb_document_p
 urges_total_total 0\n# TYPE couchdb_document_writes_total 
counter\ncouchdb_document_writes_total 14\n# TYPE 
couchdb_httpd_aborted_requests_total 
counter\ncouchdb_httpd_aborted_requests_total 0\n# TYPE 
couchdb_httpd_all_docs_timeouts_total 
counter\ncouchdb_httpd_all_docs_timeouts_total 0\n# TYPE 
couchdb_httpd_bulk_docs_seconds 
summary\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.5\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.75\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.9\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.95\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.99\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.999\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds_sum 
0.0\ncouchdb_httpd_bulk_docs_seconds_count 0\n# TYPE 
couchdb_httpd_bulk_requests_total counter\ncouchdb_httpd_bulk_requests_total 
0\n# TYPE couchdb_httpd_clients_requesting_changes_total 
counter\ncouchdb_httpd_clients_requesting_changes_total 0\n...">>,...}
       extra: []
   [error] 2022-06-23T06:23:45.938589Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.23685.4> 
-------- gen_server couch_prometheus_server terminated with reason: 
{timeout,{gen_server,call,[couch_stats_aggregator,fetch]}} at 
gen_server:call/2(line:238) <= couch_stats_aggregator:fetch/0(line:44) <= 
couch_prometheus_server:get_couchdb_stats/0(line:94) <= 
couch_prometheus_server:refresh_metrics/0(line:87) <= 
couch_prometheus_server:handle_info/2(line:74) <= 
gen_server:try_dispatch/4(line:689) <= gen_server:handle_msg/6(line:765) <= 
proc_lib:init_p_do_apply/3(line:226)
     last msg: redacted
        state: {st,<<"# TYPE couchdb_couch_log_requests_total 
counter\ncouchdb_couch_log_requests_total{level=\"alert\"} 
0\ncouchdb_couch_log_requests_total{level=\"critical\"} 
0\ncouchdb_couch_log_requests_total{level=\"debug\"} 
0\ncouchdb_couch_log_requests_total{level=\"emergency\"} 
0\ncouchdb_couch_log_requests_total{level=\"error\"} 
0\ncouchdb_couch_log_requests_total{level=\"info\"} 
7\ncouchdb_couch_log_requests_total{level=\"notice\"} 
18573\ncouchdb_couch_log_requests_total{level=\"warning\"} 0\n# TYPE 
couchdb_couch_replicator_changes_manager_deaths_total 
counter\ncouchdb_couch_replicator_changes_manager_deaths_total 0\n# TYPE 
couchdb_couch_replicator_changes_queue_deaths_total 
counter\ncouchdb_couch_replicator_changes_queue_deaths_total 0\n# TYPE 
couchdb_couch_replicator_changes_read_failures_total 
counter\ncouchdb_couch_replicator_changes_read_failures_total 0\n# TYPE 
couchdb_couch_replicator_changes_reader_deaths_total 
counter\ncouchdb_couch_replicator_changes_reader_deaths
 _total 0\n# TYPE couchdb_couch_replicator_checkpoints_failure_total 
counter\ncouchdb_couch_replicator_checkpoints_failure_total 0\n# TYPE 
couchdb_couch_replicator_checkpoints_total 
counter\ncouchdb_couch_replicator_checkpoints_total 0\n# TYPE 
couchdb_couch_replicator_cluster_is_stable 
gauge\ncouchdb_couch_replicator_cluster_is_stable 1\n# TYPE 
couchdb_couch_replicator_connection_acquires_total 
counter\ncouchdb_couch_replicator_connection_acquires_total 0\n# TYPE 
couchdb_couch_replicator_connection_closes_total 
counter\ncouchdb_couch_replicator_connection_closes_total 0\n# TYPE 
couchdb_couch_replicator_connection_creates_total 
counter\ncouchdb_couch_replicator_connection_creates_total 0\n# TYPE 
couchdb_couch_replicator_connection_owner_crashes_total 
counter\ncouchdb_couch_replicator_connection_owner_crashes_total 0\n# TYPE 
couchdb_couch_replicator_connection_releases_total 
counter\ncouchdb_couch_replicator_connection_releases_total 0\n# TYPE 
couchdb_couch_replicator_connection_worker
 _crashes_total 
counter\ncouchdb_couch_replicator_connection_worker_crashes_total 0\n# TYPE 
couchdb_couch_replicator_db_scans_total 
counter\ncouchdb_couch_replicator_db_scans_total 1\n# TYPE 
couchdb_couch_replicator_docs_completed_state_updates_total 
counter\ncouchdb_couch_replicator_docs_completed_state_updates_total 0\n# TYPE 
couchdb_couch_replicator_docs_db_changes_total 
counter\ncouchdb_couch_replicator_docs_db_changes_total 0\n# TYPE 
couchdb_couch_replicator_docs_dbs_created_total 
counter\ncouchdb_couch_replicator_docs_dbs_created_total 0\n# TYPE 
couchdb_couch_replicator_docs_dbs_deleted_total 
counter\ncouchdb_couch_replicator_docs_dbs_deleted_total 0\n# TYPE 
couchdb_couch_replicator_docs_dbs_found_total 
counter\ncouchdb_couch_replicator_docs_dbs_found_total 2\n# TYPE 
couchdb_couch_replicator_docs_failed_state_updates_total 
counter\ncouchdb_couch_replicator_docs_failed_state_updates_total 0\n# TYPE 
couchdb_couch_replicator_failed_starts_total 
counter\ncouchdb_couch_replicator_fa
 iled_starts_total 0\n# TYPE couchdb_couch_replicator_jobs_adds_total 
counter\ncouchdb_couch_replicator_jobs_adds_total 0\n# TYPE 
couchdb_couch_replicator_jobs_crashed 
gauge\ncouchdb_couch_replicator_jobs_crashed 0\n# TYPE 
couchdb_couch_replicator_jobs_crashes_total 
counter\ncouchdb_couch_replicator_jobs_crashes_total 0\n# TYPE 
couchdb_couch_replicator_jobs_duplicate_adds_total 
counter\ncouchdb_couch_replicator_jobs_duplicate_adds_total 0\n# TYPE 
couchdb_couch_replicator_jobs_pending 
gauge\ncouchdb_couch_replicator_jobs_pending 0\n# TYPE 
couchdb_couch_replicator_jobs_removes_total 
counter\ncouchdb_couch_replicator_jobs_removes_total 0\n# TYPE 
couchdb_couch_replicator_jobs_running 
gauge\ncouchdb_couch_replicator_jobs_running 0\n# TYPE 
couchdb_couch_replicator_jobs_starts_total 
counter\ncouchdb_couch_replicator_jobs_starts_total 0\n# TYPE 
couchdb_couch_replicator_jobs_stops_total 
counter\ncouchdb_couch_replicator_jobs_stops_total 0\n# TYPE 
couchdb_couch_replicator_jobs_total gauge\ncou
 chdb_couch_replicator_jobs_total 0\n# TYPE 
couchdb_couch_replicator_requests_total 
counter\ncouchdb_couch_replicator_requests_total 0\n# TYPE 
couchdb_couch_replicator_responses_failure_total 
counter\ncouchdb_couch_replicator_responses_failure_total 0\n# TYPE 
couchdb_couch_replicator_responses_total 
counter\ncouchdb_couch_replicator_responses_total 0\n# TYPE 
couchdb_couch_replicator_stream_responses_failure_total 
counter\ncouchdb_couch_replicator_stream_responses_failure_total 0\n# TYPE 
couchdb_couch_replicator_stream_responses_total 
counter\ncouchdb_couch_replicator_stream_responses_total 0\n# TYPE 
couchdb_couch_replicator_worker_deaths_total 
counter\ncouchdb_couch_replicator_worker_deaths_total 0\n# TYPE 
couchdb_couch_replicator_workers_started_total 
counter\ncouchdb_couch_replicator_workers_started_total 0\n# TYPE 
couchdb_auth_cache_requests_total counter\ncouchdb_auth_cache_requests_total 
0\n# TYPE couchdb_auth_cache_misses_total 
counter\ncouchdb_auth_cache_misses_total 0\n# TYPE
  couchdb_collect_results_time_seconds 
summary\ncouchdb_collect_results_time_seconds{quantile=\"0.5\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.75\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.9\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.95\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.99\"} 
0.0\ncouchdb_collect_results_time_seconds{quantile=\"0.999\"} 
0.0\ncouchdb_collect_results_time_seconds_sum 
0.0\ncouchdb_collect_results_time_seconds_count 0\n# TYPE 
couchdb_couch_server_lru_skip_total 
counter\ncouchdb_couch_server_lru_skip_total 0\n# TYPE 
couchdb_database_purges_total counter\ncouchdb_database_purges_total 0\n# TYPE 
couchdb_database_reads_total counter\ncouchdb_database_reads_total 24\n# TYPE 
couchdb_database_writes_total counter\ncouchdb_database_writes_total 0\n# TYPE 
couchdb_db_open_time_seconds 
summary\ncouchdb_db_open_time_seconds{quantile=\"0.5\"} 
0.0\ncouchdb_db_open_time_seconds{quantile=\"0.75\"} 0.0\ncouchdb_db_open_
 time_seconds{quantile=\"0.9\"} 
0.0\ncouchdb_db_open_time_seconds{quantile=\"0.95\"} 
0.0\ncouchdb_db_open_time_seconds{quantile=\"0.99\"} 
0.0\ncouchdb_db_open_time_seconds{quantile=\"0.999\"} 
0.0\ncouchdb_db_open_time_seconds_sum 0.0\ncouchdb_db_open_time_seconds_count 
0\n# TYPE couchdb_dbinfo_seconds 
summary\ncouchdb_dbinfo_seconds{quantile=\"0.5\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.75\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.9\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.95\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.99\"} 
0.0\ncouchdb_dbinfo_seconds{quantile=\"0.999\"} 0.0\ncouchdb_dbinfo_seconds_sum 
0.0\ncouchdb_dbinfo_seconds_count 0\n# TYPE couchdb_document_inserts_total 
counter\ncouchdb_document_inserts_total 7\n# TYPE 
couchdb_document_purges_failure_total 
counter\ncouchdb_document_purges_failure_total 0\n# TYPE 
couchdb_document_purges_success_total 
counter\ncouchdb_document_purges_success_total 0\n# TYPE 
couchdb_document_purges_total_total counter\ncouchdb_document_p
 urges_total_total 0\n# TYPE couchdb_document_writes_total 
counter\ncouchdb_document_writes_total 14\n# TYPE 
couchdb_httpd_aborted_requests_total 
counter\ncouchdb_httpd_aborted_requests_total 0\n# TYPE 
couchdb_httpd_all_docs_timeouts_total 
counter\ncouchdb_httpd_all_docs_timeouts_total 0\n# TYPE 
couchdb_httpd_bulk_docs_seconds 
summary\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.5\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.75\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.9\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.95\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.99\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds{quantile=\"0.999\"} 
0.0\ncouchdb_httpd_bulk_docs_seconds_sum 
0.0\ncouchdb_httpd_bulk_docs_seconds_count 0\n# TYPE 
couchdb_httpd_bulk_requests_total counter\ncouchdb_httpd_bulk_requests_total 
0\n# TYPE couchdb_httpd_clients_requesting_changes_total 
counter\ncouchdb_httpd_clients_requesting_changes_total 0\n...">>,...}
       extra: []
   [error] 2022-06-23T06:23:45.953809Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.23685.4> 
-------- CRASH REPORT Process couch_prometheus_server (<0.23685.4>) with 0 
neighbors exited with reason: 
{timeout,{gen_server,call,[couch_stats_aggregator,fetch]}} at 
gen_server:call/2(line:238) <= couch_stats_aggregator:fetch/0(line:44) <= 
couch_prometheus_server:get_couchdb_stats/0(line:94) <= 
couch_prometheus_server:refresh_metrics/0(line:87) <= 
couch_prometheus_server:handle_info/2(line:74) <= 
gen_server:try_dispatch/4(line:689) <= gen_server:handle_msg/6(line:765) <= 
proc_lib:init_p_do_apply/3(line:226); initial_call: 
{couch_prometheus_server,init,['Argument__1']}, ancestors: 
[couch_prometheus_sup,<0.251.0>], message_queue_len: 1, links: [<0.252.0>], 
dictionary: [], trap_exit: false, status: running, heap_size: 46422, 
stack_size: 28, reductions: 5547311068
   [error] 2022-06-23T06:23:45.954202Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.23685.4> 
-------- CRASH REPORT Process couch_prometheus_server (<0.23685.4>) with 0 
neighbors exited with reason: 
{timeout,{gen_server,call,[couch_stats_aggregator,fetch]}} at 
gen_server:call/2(line:238) <= couch_stats_aggregator:fetch/0(line:44) <= 
couch_prometheus_server:get_couchdb_stats/0(line:94) <= 
couch_prometheus_server:refresh_metrics/0(line:87) <= 
couch_prometheus_server:handle_info/2(line:74) <= 
gen_server:try_dispatch/4(line:689) <= gen_server:handle_msg/6(line:765) <= 
proc_lib:init_p_do_apply/3(line:226); initial_call: 
{couch_prometheus_server,init,['Argument__1']}, ancestors: 
[couch_prometheus_sup,<0.251.0>], message_queue_len: 1, links: [<0.252.0>], 
dictionary: [], trap_exit: false, status: running, heap_size: 46422, 
stack_size: 28, reductions: 5547311068
   [error] 2022-06-23T06:23:46.044832Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.252.0> 
-------- Supervisor couch_prometheus_sup had child couch_prometheus_server 
started with couch_prometheus_server:start_link() at <0.23685.4> exit with 
reason {timeout,{gen_server,call,[couch_stats_aggregator,fetch]}} in context 
child_terminated
   [error] 2022-06-23T06:23:46.044957Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.252.0> 
-------- Supervisor couch_prometheus_sup had child couch_prometheus_server 
started with couch_prometheus_server:start_link() at <0.23685.4> exit with 
reason {timeout,{gen_server,call,[couch_stats_aggregator,fetch]}} in context 
child_terminated
   [notice] 2022-06-23T06:23:46.364711Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30722.22> 4688407aa4 192.168.230.108:5984 10.1.2.179 undefined GET /_up 200 
ok 49
   [notice] 2022-06-23T06:24:06.671559Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30883.22> 54b4dc40cb 192.168.230.108:5984 10.1.2.179 undefined GET /_up 200 
ok 150
   [notice] 2022-06-23T06:24:09.260972Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30884.22> e34d093bcd 192.168.230.108:5984 10.1.2.179 undefined GET /_up 200 
ok 1660
   [info] 2022-06-23T06:24:09.602627Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.40.0> 
-------- SIGTERM received - shutting down
   
   [info] 2022-06-23T06:24:09.602724Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.40.0> 
-------- SIGTERM received - shutting down
   
   [notice] 2022-06-23T06:24:14.600448Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local 
<0.30916.22> c6896d1f00 192.168.230.108:5984 10.1.2.179 undefined GET /_up 200 
ok 56
   [error] 2022-06-23T06:24:18.961753Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.811.0> 
-------- gen_server <0.811.0> terminated with reason: killed
     last msg: redacted
        state: 
{state,#Ref<0.3717146181.405405699.170850>,couch_replicator_doc_processor,nil,<<"_replicator">>,#Ref<0.3717146181.405274627.170851>,nil,[],true}
       extra: []
   [error] 2022-06-23T06:24:18.962005Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.811.0> 
-------- gen_server <0.811.0> terminated with reason: killed
     last msg: redacted
        state: 
{state,#Ref<0.3717146181.405405699.170850>,couch_replicator_doc_processor,nil,<<"_replicator">>,#Ref<0.3717146181.405274627.170851>,nil,[],true}
       extra: []
   [error] 2022-06-23T06:24:18.962362Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.811.0> 
-------- CRASH REPORT Process  (<0.811.0>) with 0 neighbors exited with reason: 
killed at gen_server:decode_msg/9(line:475) <= 
proc_lib:init_p_do_apply/3(line:226); initial_call: 
{couch_multidb_changes,init,['Argument__1']}, ancestors: 
[<0.692.0>,couch_replicator_sup,<0.668.0>], message_queue_len: 0, links: [], 
dictionary: [], trap_exit: true, status: running, heap_size: 1598, stack_size: 
28, reductions: 181192
   [error] 2022-06-23T06:24:19.005925Z 
couc...@couchdb-couchdb-0.couchdb-couchdb.couchdb.svc.cluster.local <0.811.0> 
-------- CRASH REPORT Process  (<0.811.0>) with 0 neighbors exited with reason: 
killed at gen_server:decode_msg/9(line:475) <= 
proc_lib:init_p_do_apply/3(line:226); initial_call: 
{couch_multidb_changes,init,['Argument__1']}, ancestors: 
[<0.692.0>,couch_replicator_sup,<0.668.0>], message_queue_len: 0, links: [], 
dictionary: [], trap_exit: true, status: running, heap_size: 1598, stack_size: 
28, reductions: 181192`
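
The `GET /_up` notice entries in the log above end in a number that looks like the request duration. A minimal sketch for pulling the slow ones out of a pasted log follows. It assumes that trailing field is in milliseconds, which the log itself does not confirm. The names `slow_requests` and `threshold_ms` are made up for illustration.

```python
import re

# Tail of an HTTP access-log entry, e.g. "... undefined GET /_up 500 ok 15095".
# Reading the last field as a duration in milliseconds is an assumption.
ACCESS_TAIL = re.compile(
    r"(?P<method>[A-Z]+) (?P<path>/\S*) (?P<status>\d{3}) \S+ (?P<ms>\d+)"
)
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z")
ENTRY_START = re.compile(r"(?=\[(?:notice|error|info|warning|debug)\] )")


def slow_requests(log_text, path="/_up", threshold_ms=1000):
    """Return (timestamp, status, ms) for requests to `path` that took
    at least `threshold_ms`, in the order they appear in the log."""
    # The pasted log is hard-wrapped, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    hits = []
    # Split into entries at each "[level] " marker (needs Python 3.7+).
    for entry in ENTRY_START.split(flat):
        if not entry.startswith("[notice]"):
            continue  # only notice entries carry the access-log tail
        ts = TIMESTAMP.search(entry)
        tail = ACCESS_TAIL.search(entry)
        if ts is None or tail is None or tail["path"] != path:
            continue
        ms = int(tail["ms"])
        if ms >= threshold_ms:
            hits.append((ts.group(), int(tail["status"]), ms))
    return hits
```

Calling `slow_requests(open("couchdb.log").read())` (the file name is a placeholder) lists the `/_up` requests that took longer than one second, which can then be compared against the liveness probe timeout configured for the pod.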
   
   **Version of Helm and Kubernetes**:
   Helm Version: 3.4.0
   Kubernetes Version: 1.18.3
   
   **What happened**:
   The coordinator pod routinely restarts, with the errors shown in the logs
above.
   
   **What you expected to happen**:
   All pods should remain running without restarting
   
   **How to reproduce it** (as minimally and precisely as possible):
   The issue occurs at random; the pod restarts with the errors shown in the logs.
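
Because the failure cannot be triggered on demand, one way to catch it is to poll `/_up` from outside the pod and record every response that is slow or not a 200. The following is only a sketch. The URL, the interval, and the timeout are placeholders rather than the chart's actual probe settings, and `probe` and `watch` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout_s=1.0, opener=urllib.request.urlopen):
    """Send one GET and return (status, elapsed_seconds).

    status is None when the request times out or cannot connect.
    `opener` exists only so the function can be exercised without a network.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code  # non-2xx answers, such as the 500s in the log
    except OSError:
        status = None  # timeout, refused connection, DNS failure, ...
    return status, time.monotonic() - start


def watch(url, interval_s=10.0, timeout_s=1.0, rounds=None):
    """Probe repeatedly and print every result that is not a 200."""
    done = 0
    while rounds is None or done < rounds:
        status, elapsed = probe(url, timeout_s)
        if status != 200:
            stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
            print(f"{stamp} status={status} elapsed={elapsed:.3f}s")
        done += 1
        time.sleep(interval_s)
```

For example, `watch("http://localhost:5984/_up")` run against a port-forwarded pod (the address is an assumption) prints one line per failed or timed-out check. The timestamps can then be matched against the CouchDB log to see what the node was doing at that moment.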
   
   **Anything else we need to know**:
   We have the Helm chart deployed in both a testing and a production
Kubernetes environment, and both environments show the same behaviour. The
database holds only a small amount of data, and the pods do not have any CPU or
memory restrictions. The pods are configured with 16 GB local-path PVs. Average
memory usage is 56 MB and CPU usage is below 0.03.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

