sergey-safarov commented on issue #1119: Could not open shard
URL: https://github.com/apache/couchdb/issues/1119#issuecomment-362504677

Joan (@wohali),

First I measured disk I/O performance with `tmpfs` mounted as a volume inside a container.
```
[root@node0 ~]# docker run --rm --read-only -it --mount type=tmpfs,destination=/opt/couchdb/data fedora:27 bash
[root@c0ed791b0e7d /]# rm -f /workspace/* && time dd if=/dev/zero of=/opt/couchdb/data/test_container1.img bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 0.00277375 s, 185 MB/s
real 0m0.004s
user 0m0.003s
sys 0m0.001s
[root@c0ed791b0e7d /]# rm -f /workspace/* && time dd if=/dev/zero of=/opt/couchdb/data/test_container1.img bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 0.00277794 s, 184 MB/s
real 0m0.004s
user 0m0.001s
sys 0m0.003s
[root@c0ed791b0e7d /]# rm -f /workspace/* && time dd if=/dev/zero of=/opt/couchdb/data/test_container1.img bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 0.00283199 s, 181 MB/s
real 0m0.004s
user 0m0.001s
sys 0m0.003s
[root@c0ed791b0e7d /]# rm -f /workspace/* && time dd if=/dev/zero of=/opt/couchdb/data/test_container1.img bs=512 count=1000 oflag=dsync
1000+0 records in
1000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 0.00279983 s, 183 MB/s
real 0m0.004s
user 0m0.001s
sys 0m0.003s
```
Then I started the CouchDB container with `tmpfs` mounted as a volume.
I added the argument `--mount type=tmpfs,destination=/opt/couchdb/data`:
```
docker run -t --rm=true --log-driver=none --network kazoo --name couchdb1 \
    --hostname couchdb1.kazoo \
    --ip 10.0.9.8 \
    --ulimit nofile=999999 \
    --mount type=tmpfs,destination=/opt/couchdb/data \
    -v /etc/kazoo/couchdb/vm.args.node1:/opt/couchdb/etc/vm.args \
    apache/couchdb:2.1.1
```
`vm.args` is the same as the default, with the `name` option added:
```
[root@node0 ~]# docker exec -it couchdb1 cat /opt/couchdb/etc/vm.args
# Licensed under the Apache License, Version 2.0 (the "License"); you may not
# use this file except in compliance with the License. You may obtain a copy of
# the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under
# the License.
# Ensure that the Erlang VM listens on a known port
-kernel inet_dist_listen_min 9100
-kernel inet_dist_listen_max 9100
# Tell kernel and SASL not to log anything
-kernel error_logger silent
-sasl sasl_error_logger false
# Use kernel poll functionality if supported by emulator
+K true
# Start a pool of asynchronous IO threads
+A 16
# Comment this line out to enable the interactive Erlang shell on startup
+Bd -noinput
-setcookie monster
-name couchdb@couchdb1.kazoo
[root@node0 ~]#
```
The disk layout inside the container is now:
```
[root@node0 ~]# docker exec -it couchdb1 df
Filesystem                  1K-blocks     Used Available Use% Mounted on
overlay                     268304384 76159032 192145352  29% /
tmpfs                           65536        0     65536   0% /dev
tmpfs                        65968072        0  65968072   0% /sys/fs/cgroup
/dev/mapper/vgRAID10-root     2086912   157692   1929220   8% /etc/couchdb
/dev/mapper/vgRAID10-docker 268304384 76159032 192145352  29% /etc/hosts
shm                             65536        0     65536   0% /dev/shm
tmpfs                        65968072       48  65968024   1% /opt/couchdb/data
tmpfs                        65968072        0  65968072   0% /proc/scsi
tmpfs                        65968072        0  65968072   0% /sys/firmware
```
Host total memory: the host has 128 GB of memory and uses about 160 MB of swap.
```
top - 01:41:43 up 17 days, 15:46, 4 users, load average: 1.78, 1.54, 0.87
Tasks: 900 total, 1 running, 283 sleeping, 0 stopped, 486 zombie
%Cpu(s): 0.8 us, 0.8 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 13193614+total, 12358560 free, 2934528 used, 11664305+buff/cache
KiB Swap: 31457280+total, 31441177+free, 161036 used. 12743585+avail Mem
  PID USER PR NI    VIRT    RES  SHR S %CPU %MEM     TIME+ COMMAND
 1332 root 20  0 1469936 452596 4084 S  6.3  0.3 587:28.21 beam.smp
14915 root 20  0 1297128 295484 3928 S  5.6  0.2 510:51.97 beam.smp
20715 root 20  0   54392   5120 3812 R  0.7  0.0   0:00.09 top
```
Then I started replication, and after 5 hours I have the following results.

Disk layout inside the CouchDB container.
CouchDB data size is about 10 GB:
```
[root@node0 ~]# docker exec -it couchdb1 df
Filesystem                  1K-blocks     Used Available Use% Mounted on
overlay                     268304384 76134184 192170200  29% /
tmpfs                           65536        0     65536   0% /dev
tmpfs                        65968072        0  65968072   0% /sys/fs/cgroup
/dev/mapper/vgRAID10-root     2086912   157692   1929220   8% /etc/couchdb
/dev/mapper/vgRAID10-docker 268304384 76134184 192170200  29% /etc/hosts
shm                             65536        0     65536   0% /dev/shm
tmpfs                        65968072 10083456  55884616  16% /opt/couchdb/data
tmpfs                        65968072        0  65968072   0% /proc/scsi
tmpfs                        65968072        0  65968072   0% /sys/firmware
```
Host memory usage: swap usage has not changed, so all CouchDB data is held in memory.
```
top - 06:35:51 up 17 days, 20:40, 6 users, load average: 1.03, 0.95, 0.82
Tasks: 938 total, 1 running, 322 sleeping, 0 stopped, 486 zombie
%Cpu(s): 1.4 us, 1.1 sy, 0.0 ni, 97.2 id, 0.1 wa, 0.1 hi, 0.1 si, 0.0 st
KiB Mem : 13193614+total, 87901344 free, 3644528 used, 40390272 buff/cache
KiB Swap: 31457280+total, 31441177+free, 161036 used. 11659586+avail Mem
  PID USER PR NI    VIRT    RES   SHR S %CPU %MEM     TIME+ COMMAND
 5269 root 20  0   56704   5360  3876 R 11.8  0.0   0:00.03 top
14915 root 20  0 1315680 307456  5852 S 11.8  0.2 550:18.26 beam.smp
 1332 root 20  0 1463408 452628  5444 S  5.9  0.3 615:53.69 beam.smp
23036 root 20  0 1859228  63316 31484 S  5.9  0.0  30:43.66 dockerd
```
After 5 hours of replication to a database located on a RAM disk (`tmpfs`), CouchDB has the following errors in its log.

**Could not open shards** - 2 errors in total
```
2ea5274cd52c5bf0-201802/_local/75d3dc6ddfdfcae76e87a7acdfc5ed89 201 ok 42
Feb 02 03:50:57 node0.docker.rcsnet.ru sh[28220]: [notice] 2018-02-02T03:50:57.866252Z couchdb@couchdb1.kazoo <0.357.0> -------- couch_replicator_scheduler: Job {"75d3dc6ddfdfcae76e87a7acdfc5ed89","+create_target"} completed normally
Feb 02 03:50:57 node0.docker.rcsnet.ru sh[28220]: [notice] 2018-02-02T03:50:57.866613Z couchdb@couchdb1.kazoo <0.22131.14> 8a931c0f9d 10.0.9.8:5984 10.0.9.8 undefined POST /_replicate 200 ok 2805
Feb 02 03:50:57
node0.docker.rcsnet.ru sh[28220]: [notice] 2018-02-02T03:50:57.916222Z couchdb@couchdb1.kazoo <0.357.0> -------- couch_replicator_scheduler: Job {"80f52a6352af0019a08ff5bf307e6192","+create_target"} started as <0.10587.13> Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T03:50:59.599271Z couchdb@couchdb1.kazoo <0.14885.14> 4c046843ba Request to create N=3 DB but only 1 node(s) Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [notice] 2018-02-02T03:50:59.629560Z couchdb@couchdb1.kazoo <0.14885.14> 4c046843ba 10.0.9.8:5984 10.0.9.8 undefined PUT /account%2f94%2f93%2f45d6587d8f195f1620b65b1eb063-201802/ 201 ok 32 Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T03:50:59.655204Z couchdb@couchdb1.kazoo <0.2117.13> -------- Could not open file ./data/shards/60000000-7fffffff/account/94/93/45d6587d8f195f1620b65b1eb063-201802.1517543459.couch: no such file or directory Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T03:50:59.655216Z couchdb@couchdb1.kazoo <0.2121.13> -------- Could not open file ./data/shards/40000000-5fffffff/account/94/93/45d6587d8f195f1620b65b1eb063-201802.1517543459.couch: no such file or directory Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [info] 2018-02-02T03:50:59.666163Z couchdb@couchdb1.kazoo <0.206.0> -------- open_result error {not_found,no_db_file} for shards/60000000-7fffffff/account/94/93/45d6587d8f195f1620b65b1eb063-201802.1517543459 Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [info] 2018-02-02T03:50:59.666210Z couchdb@couchdb1.kazoo <0.206.0> -------- open_result error {not_found,no_db_file} for shards/40000000-5fffffff/account/94/93/45d6587d8f195f1620b65b1eb063-201802.1517543459 Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [warning] 2018-02-02T03:50:59.666204Z couchdb@couchdb1.kazoo <0.2139.13> d9f2f7b03a creating missing database: shards/60000000-7fffffff/account/94/93/45d6587d8f195f1620b65b1eb063-201802.1517543459 Feb 02 03:50:59 node0.docker.rcsnet.ru 
sh[28220]: [warning] 2018-02-02T03:50:59.666235Z couchdb@couchdb1.kazoo <0.2149.13> d9f2f7b03a creating missing database: shards/40000000-5fffffff/account/94/93/45d6587d8f195f1620b65b1eb063-201802.1517543459 Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T03:50:59.666644Z couchdb@couchdb1.kazoo <0.303.0> -------- mem3_shards tried to create shards/40000000-5fffffff/account/94/93/45d6587d8f195f1620b65b1eb063-201802.1517543459, got file_exists Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T03:50:59.932281Z couchdb@couchdb1.kazoo <0.16410.7> -------- rexi_server: from: couchdb@couchdb1.kazoo(<0.16407.7>) mfa: fabric_rpc:all_docs/3 exit:timeout [{rexi,init_stream,1,[{file,"src/rexi.erl"},{line,256}]},{rexi,stream2,3,[{file,"src/rexi.erl"},{line,204}]},{fabric_rpc,view_cb,2,[{file,"src/fabric_rpc.erl"},{line,308}]},{couch_mrview,finish_fold,2,[{file,"src/couch_mrview.erl"},{line,642}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,139}]}] Feb 02 03:50:59 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T03:50:59.932290Z couchdb@couchdb1.kazoo <0.16409.7> -------- rexi_server: from: couchdb@couchdb1.kazoo(<0.16407.7>) mfa: fabric_rpc:all_docs/3 exit:timeout [{rexi,init_stream,1,[{file,"src/rexi.erl"},{line,256}]},{rexi,stream2,3,[{file,"src/rexi.erl"},{line,204}]},{fabric_rpc,view_cb,2,[{file,"src/fabric_rpc.erl"},{line,308}]},{couch_mrview,finish_fold,2,[{file,"src/couch_mrview.erl"},{line,642}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,139}]}] ``` **Replicator error. 
Cannot put document** - about 20 errors ``` Feb 02 04:04:51 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T04:04:51.673222Z couchdb@couchdb1.kazoo <0.8179.40> -------- rexi_server: from: couchdb@couchdb1.kazoo(<0.10464.36>) mfa: fabric_rpc:all_docs/3 exit:timeout [{rexi,init_stream,1,[{file,"src/rexi.erl"},{line,256}]},{rexi,stream2,3,[{file,"src/rexi.erl"},{line,204}]},{fabric_rpc,view_cb,2,[{file,"src/fabric_rpc.erl"},{line,308}]},{couch_mrview,finish_fold,2,[{file,"src/couch_mrview.erl"},{line,642}]},{rexi_server,init_p,3,[{file,"src/rexi_server.erl"},{line,139}]}] Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T04:04:55.012851Z couchdb@couchdb1.kazoo emulator -------- Error in process <0.10651.59> on node 'couchdb@couchdb1.kazoo' with exit value: {{nocatch,{mp_parser_died,noproc}},[{couch_att,'-foldl/4-fun-0-',3,[{file,"src/couch_att.erl"},{line,613}]},{couch_att,fold_streamed_data,4,[{file,"src/couch_att.erl"},{line,664}]},{couch_att,foldl,4,[{file,"src/couch_att.erl"},... 
Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [info] 2018-02-02T04:04:55.013064Z couchdb@couchdb1.kazoo <0.353.0> -------- Replication connection to: "10.0.9.8":5984 died with reason {{nocatch,{mp_parser_died,noproc}},[{couch_att,'-foldl/4-fun-0-',3,[{file,"src/couch_att.erl"},{line,613}]},{couch_att,fold_streamed_data,4,[{file,"src/couch_att.erl"},{line,664}]},{couch_att,foldl,4,[{file,"src/couch_att.erl"},{line,617}]},{couch_httpd_multipart,atts_to_mp,4,[{file,"src/couch_httpd_multipart.erl"},{line,208}]}]} Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T04:04:55.014366Z couchdb@couchdb1.kazoo <0.7570.55> 8967ff8316 req_err(4199105376) badmatch : ok Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [<<"chttpd_db:db_doc_req/3 L782">>,<<"chttpd:process_request/1 L295">>,<<"chttpd:handle_request_int/1 L231">>,<<"mochiweb_http:headers/6 L91">>,<<"proc_lib:init_p_do_apply/3 L237">>] Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [notice] 2018-02-02T04:04:55.014881Z couchdb@couchdb1.kazoo <0.7570.55> 8967ff8316 10.0.9.8:5984 10.0.9.8 undefined PUT /account%2f0e%2fe2%2fb3fccd72d86299cf8f66f6caa6bd-201801/201801-b089783f4734b57244ea8b5613f8375a?new_edits=false 500 ok 2 Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [error] 2018-02-02T04:04:55.015761Z couchdb@couchdb1.kazoo <0.5748.60> -------- Replicator, request PUT to "http://10.0.9.8:5984/account%2f0e%2fe2%2fb3fccd72d86299cf8f66f6caa6bd-201801/201801-b089783f4734b57244ea8b5613f8375a?new_edits=false" failed due to error {error, Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: {'EXIT', Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: {{{nocatch,{mp_parser_died,noproc}}, Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [{couch_att,'-foldl/4-fun-0-',3, Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [{file,"src/couch_att.erl"},{line,613}]}, Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: {couch_att,fold_streamed_data,4, Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: 
[{file,"src/couch_att.erl"},{line,664}]},
Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: {couch_att,foldl,4,[{file,"src/couch_att.erl"},{line,617}]},
Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: {couch_httpd_multipart,atts_to_mp,4,
Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [{file,"src/couch_httpd_multipart.erl"},{line,208}]}]},
Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: {gen_server,call,
Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: [<0.13476.55>,
Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: {send_req,
Feb 02 04:04:55 node0.docker.rcsnet.ru sh[28220]: {{url,
```
For the replication error, note the server IP `10.0.9.8`: this IP is assigned to the destination CouchDB server.

Joan (@wohali), I think the issue is not related to disk I/O speed inside the container. The test was run on a RAM disk (`tmpfs`) with a write speed of about 180 MB/s, which is much greater than on a real hard disk.
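
As a cross-check of the "about 180 MB/s" figure, here is a minimal sketch that recomputes the throughput from the four `dd` summary lines quoted at the top of this comment. The function names and the regular expression are my own helpers for illustration (not part of `dd` or CouchDB), and it assumes the summary line format shown above.

```python
import re

# Summary lines copied from the four dd runs quoted earlier in this comment.
DD_SUMMARIES = [
    "512000 bytes (512 kB, 500 KiB) copied, 0.00277375 s, 185 MB/s",
    "512000 bytes (512 kB, 500 KiB) copied, 0.00277794 s, 184 MB/s",
    "512000 bytes (512 kB, 500 KiB) copied, 0.00283199 s, 181 MB/s",
    "512000 bytes (512 kB, 500 KiB) copied, 0.00279983 s, 183 MB/s",
]

# Assumed format: "<bytes> bytes (...) copied, <seconds> s, <rate>"
SUMMARY_RE = re.compile(r"^(\d+) bytes .* copied, ([0-9.]+) s,")


def throughput_mb_per_s(summary):
    """Return bytes / seconds / 1e6 (decimal MB/s) for one dd summary line."""
    match = SUMMARY_RE.match(summary)
    if match is None:
        raise ValueError(f"not a dd summary line: {summary!r}")
    size_bytes = int(match.group(1))
    seconds = float(match.group(2))
    return size_bytes / seconds / 1e6


def mean_throughput(summaries):
    """Average the per-run throughput over all summary lines."""
    rates = [throughput_mb_per_s(line) for line in summaries]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    for line in DD_SUMMARIES:
        print(f"{throughput_mb_per_s(line):.1f} MB/s")
    print(f"mean: {mean_throughput(DD_SUMMARIES):.1f} MB/s")
```

Each recomputed value rounds to the rate `dd` itself printed, so the mean over the four runs is a fair summary of the measurement.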