Replication renders CouchDB unresponsive.
-----------------------------------------
Key: COUCHDB-197
URL: https://issues.apache.org/jira/browse/COUCHDB-197
Project: CouchDB
Issue Type: Bug
Components: Database Core
Reporter: Maximillian Dornseif
I am quite sure this is not the same issue as in COUCHDB-193.
Im trying to replicte a somewhat big database
{"doc_count":541394,"doc_del_count":265692,"update_seq":2118390,"purge_seq":0,"compact_running":false,"disk_size":16552608803}
to an other machine.
I started replication with this:
send: 'POST /_replicate HTTP/1.1\r\nHost:
couchdb1.local.xxx:5984\r\nAccept-Encoding: identity\r\ncontent-length:
90\r\ncontent-type: application/json\r\naccept: application/json\r\nuser-agent:
couchdb-python 0.5dev-r127\r\n\r\n'
send: '{"source": "hulog_events", "target":
"http://couchdb2.local.xxx:5984/hulog_events"}'
reply: ''
connect: (couchdb1.local.hudora.biz, 5984)
send: 'POST /_replicate HTTP/1.1\r\nHost:
couchdb1.local.xxxx:5984\r\nAccept-Encoding: identity\r\ncontent-length:
90\r\ncontent-type: application/json\r\naccept: application/json\r\nuser-agent:
couchdb-python 0.5dev-r127\r\n\r\n'
send: '{"source": "hulog_events", "target":
"http://couchdb2.local.xxxx:5984/hulog_events"}'
(no reply so far)
On the source server (couchdb1) I see following logentries:
Mon, 05 Jan 2009 19:34:21 GMT] [info] [<0.12745.45>] 192.168.0.30 - - 'POST'
/_replicate 200
[Mon, 05 Jan 2009 19:35:36 GMT] [info] [<0.107.0>] Compaction for db
"hulog_events_test" completed.
[Mon, 05 Jan 2009 19:35:45 GMT] [info] [<0.12746.45>] 127.0.0.1 - - 'GET'
/hulog_events/ 200
[Mon, 05 Jan 2009 19:35:46 GMT] [info] [<0.95.0>] Compaction for db "eap"
completed.
[Mon, 05 Jan 2009 19:42:17 GMT] [error] [<0.12765.45>] ** Generic server
<0.12765.45> terminating
** Last message in was {'EXIT',<0.12762.45>,
{timeout,
{gen_server,call,
[<0.12768.45>,
{write,
<<0,0,1,36,131,104,2,104,1,108,0,0,0,8,104,2,
109,0,0,0,7,112,114,111,100,117,99,116,109,
0,0,0,8,54,53,49,52,48,47,69,75,104,2,109,0,
0,0,11,116,114,97,110,115,97,99,116,105,111,
110,109,0,0,0,8,114,101,116,114,105,101,118,
101,104,2,109,0,0,0,4,116,121,112,101,109,0,
0,0,4,117,110,105,116,104,2,109,0,0,0,11,97,
114,99,104,105,118,101,100,95,97,116,109,0,
0,0,22,50,48,48,56,48,50,50,50,84,49,50,49,
52,48,53,46,53,50,54,51,56,52,104,2,109,0,0,
0,10,99,114,101,97,116,101,100,95,97,116,
109,0,0,0,22,50,48,48,55,49,49,50,56,84,49,
53,52,50,48,54,46,51,52,52,54,49,56,104,2,
109,0,0,0,4,112,114,111,112,104,1,108,0,0,0,
2,104,2,109,0,0,0,8,108,111,99,97,116,105,
111,110,109,0,0,0,6,65,85,83,76,65,71,104,2,
109,0,0,0,6,104,101,105,103,104,116,98,0,0,
7,158,106,104,2,109,0,0,0,3,109,117,105,109,
0,0,0,18,51,52,48,48,53,57,57,56,49,48,48,
48,48,51,49,50,53,50,104,2,109,0,0,0,8,113,
117,97,110,116,105,116,121,97,11,106,106>>}]}}}
** When Server state == {file_descriptor,prim_file,{#Port<0.904761>,24}}
** Reason for termination ==
** {timeout,{gen_server,call,
[<0.12768.45>,
{write,<<0,0,1,36,131,104,2,104,1,108,0,0,0,8,104,
2,109,0,0,0,7,112,114,111,100,117,99,116,
109,0,0,0,8,54,53,49,52,48,47,69,75,104,
2,109,0,0,0,11,116,114,97,110,115,97,99,
116,105,111,110,109,0,0,0,8,114,101,116,
114,105,101,118,101,104,2,109,0,0,0,4,
116,121,112,101,109,0,0,0,4,117,110,105,
116,104,2,109,0,0,0,11,97,114,99,104,105,
118,101,100,95,97,116,109,0,0,0,22,50,48,
48,56,48,50,50,50,84,49,50,49,52,48,53,
46,53,50,54,51,56,52,104,2,109,0,0,0,10,
99,114,101,97,116,101,100,95,97,116,109,
0,0,0,22,50,48,48,55,49,49,50,56,84,49,
53,52,50,48,54,46,51,52,52,54,49,56,104,
2,109,0,0,0,4,112,114,111,112,104,1,108,
0,0,0,2,104,2,109,0,0,0,8,108,111,99,97,
116,105,111,110,109,0,0,0,6,65,85,83,76,
65,71,104,2,109,0,0,0,6,104,101,105,103,
104,116,98,0,0,7,158,106,104,2,109,0,0,0,
3,109,117,105,109,0,0,0,18,51,52,48,48,
53,57,57,56,49,48,48,48,48,51,49,50,53,
50,104,2,109,0,0,0,8,113,117,97,110,116,
105,116,121,97,11,106,106>>}]}}
[Mon, 05 Jan 2009 19:42:57 GMT] [error] [<0.12765.45>] {error_report,<0.22.0>,
{<0.12765.45>,crash_report,
[[{pid,<0.12765.45>},
{registered_name,[]},
{error_info,
{exit,
{timeout,
{gen_server,call,
[<0.12768.45>,
{write,
<<0,0,1,36,131,104,2,104,1,108,0,0,0,8,104,2,
109,0,0,0,7,112,114,111,100,117,99,116,109,0,
0,0,8,54,53,49,52,48,47,69,75,104,2,109,0,0,
0,11,116,114,97,110,115,97,99,116,105,111,
110,109,0,0,0,8,114,101,116,114,105,101,118,
101,104,2,109,0,0,0,4,116,121,112,101,109,0,
0,0,4,117,110,105,116,104,2,109,0,0,0,11,97,
114,99,104,105,118,101,100,95,97,116,109,0,0,
0,22,50,48,48,56,48,50,50,50,84,49,50,49,52,
48,53,46,53,50,54,51,56,52,104,2,109,0,0,0,
10,99,114,101,97,116,101,100,95,97,116,109,0,
0,0,22,50,48,48,55,49,49,50,56,84,49,53,52,
50,48,54,46,51,52,52,54,49,56,104,2,109,0,0,
0,4,112,114,111,112,104,1,108,0,0,0,2,104,2,
109,0,0,0,8,108,111,99,97,116,105,111,110,
109,0,0,0,6,65,85,83,76,65,71,104,2,109,0,0,
0,6,104,101,105,103,104,116,98,0,0,7,158,106,
104,2,109,0,0,0,3,109,117,105,109,0,0,0,18,
51,52,48,48,53,57,57,56,49,48,48,48,48,51,49,
50,53,50,104,2,109,0,0,0,8,113,117,97,110,
116,105,116,121,97,11,106,106>>}]}},
[{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
{initial_call,{couch_file,init,['Argument__1']}},
{ancestors,[<0.12762.45>]},
{messages,[]},
{links,[#Port<0.904761>]},
{dictionary,[]},
{trap_exit,true},
{status,running},
{heap_size,987},
{stack_size,23},
{reductions,836156}],
[]]}}
[Mon, 05 Jan 2009 19:43:02 GMT] [error] [<0.22399.43>] ** Generic server
<0.22399.43> terminating
** Last message in was {'EXIT',<0.10848.41>,
{timeout,
{gen_server,call,
[<0.12768.45>,
{write,
<<0,0,1,36,131,104,2,104,1,108,0,0,0,8,104,2,
109,0,0,0,7,112,114,111,100,117,99,116,109,
0,0,0,8,54,53,49,52,48,47,69,75,104,2,109,0,
0,0,11,116,114,97,110,115,97,99,116,105,111,
110,109,0,0,0,8,114,101,116,114,105,101,118,
101,104,2,109,0,0,0,4,116,121,112,101,109,0,
0,0,4,117,110,105,116,104,2,109,0,0,0,11,97,
114,99,104,105,118,101,100,95,97,116,109,0,
0,0,22,50,48,48,56,48,50,50,50,84,49,50,49,
52,48,53,46,53,50,54,51,56,52,104,2,109,0,0,
0,10,99,114,101,97,116,101,100,95,97,116,
109,0,0,0,22,50,48,48,55,49,49,50,56,84,49,
53,52,50,48,54,46,51,52,52,54,49,56,104,2,
109,0,0,0,4,112,114,111,112,104,1,108,0,0,0,
2,104,2,109,0,0,0,8,108,111,99,97,116,105,
111,110,109,0,0,0,6,65,85,83,76,65,71,104,2,
109,0,0,0,6,104,101,105,103,104,116,98,0,0,
7,158,106,104,2,109,0,0,0,3,109,117,105,109,
0,0,0,18,51,52,48,48,53,57,57,56,49,48,48,
48,48,51,49,50,53,50,104,2,109,0,0,0,8,113,
117,97,110,116,105,116,121,97,11,106,106>>}]}}}
** When Server state == {file_descriptor,prim_file,{#Port<0.904494>,16}}
** Reason for termination ==
** {timeout,{gen_server,call,
[<0.12768.45>,
{write,<<0,0,1,36,131,104,2,104,1,108,0,0,0,8,104,
2,109,0,0,0,7,112,114,111,100,117,99,116,
109,0,0,0,8,54,53,49,52,48,47,69,75,104,
2,109,0,0,0,11,116,114,97,110,115,97,99,
116,105,111,110,109,0,0,0,8,114,101,116,
114,105,101,118,101,104,2,109,0,0,0,4,
116,121,112,101,109,0,0,0,4,117,110,105,
116,104,2,109,0,0,0,11,97,114,99,104,105,
118,101,100,95,97,116,109,0,0,0,22,50,48,
48,56,48,50,50,50,84,49,50,49,52,48,53,
46,53,50,54,51,56,52,104,2,109,0,0,0,10,
99,114,101,97,116,101,100,95,97,116,109,
0,0,0,22,50,48,48,55,49,49,50,56,84,49,
53,52,50,48,54,46,51,52,52,54,49,56,104,
2,109,0,0,0,4,112,114,111,112,104,1,108,
0,0,0,2,104,2,109,0,0,0,8,108,111,99,97,
116,105,111,110,109,0,0,0,6,65,85,83,76,
65,71,104,2,109,0,0,0,6,104,101,105,103,
104,116,98,0,0,7,158,106,104,2,109,0,0,0,
3,109,117,105,109,0,0,0,18,51,52,48,48,
53,57,57,56,49,48,48,48,48,51,49,50,53,
50,104,2,109,0,0,0,8,113,117,97,110,116,
105,116,121,97,11,106,106>>}]}}
[Mon, 05 Jan 2009 19:43:28 GMT] [error] [<0.22399.43>] {error_report,<0.22.0>,
{<0.22399.43>,crash_report,
[[{pid,<0.22399.43>},
{registered_name,[]},
{error_info,
{exit,
{timeout,
{gen_server,call,
[<0.12768.45>,
{write,
<<0,0,1,36,131,104,2,104,1,108,0,0,0,8,104,2,
109,0,0,0,7,112,114,111,100,117,99,116,109,0,
0,0,8,54,53,49,52,48,47,69,75,104,2,109,0,0,
0,11,116,114,97,110,115,97,99,116,105,111,
110,109,0,0,0,8,114,101,116,114,105,101,118,
101,104,2,109,0,0,0,4,116,121,112,101,109,0,
0,0,4,117,110,105,116,104,2,109,0,0,0,11,97,
114,99,104,105,118,101,100,95,97,116,109,0,0,
0,22,50,48,48,56,48,50,50,50,84,49,50,49,52,
48,53,46,53,50,54,51,56,52,104,2,109,0,0,0,
10,99,114,101,97,116,101,100,95,97,116,109,0,
0,0,22,50,48,48,55,49,49,50,56,84,49,53,52,
50,48,54,46,51,52,52,54,49,56,104,2,109,0,0,
0,4,112,114,111,112,104,1,108,0,0,0,2,104,2,
109,0,0,0,8,108,111,99,97,116,105,111,110,
109,0,0,0,6,65,85,83,76,65,71,104,2,109,0,0,
0,6,104,101,105,103,104,116,98,0,0,7,158,106,
104,2,109,0,0,0,3,109,117,105,109,0,0,0,18,
51,52,48,48,53,57,57,56,49,48,48,48,48,51,49,
50,53,50,104,2,109,0,0,0,8,113,117,97,110,
116,105,116,121,97,11,106,106>>}]}},
[{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
{initial_call,{couch_file,init,['Argument__1']}},
{ancestors,
[<0.10848.41>,<0.10847.41>,couch_server,couch_primary_services,
couch_server_sup,<0.1.0>]},
{messages,
[{'DOWN',#Ref<0.0.81.132266>,process,<0.10847.41>,
{timeout,
{gen_server,call,
[<0.12768.45>,
{write,
<<0,0,1,36,131,104,2,104,1,108,0,0,0,8,104,
2,109,0,0,0,7,112,114,111,100,117,99,116,
109,0,0,0,8,54,53,49,52,48,47,69,75,104,
2,109,0,0,0,11,116,114,97,110,115,97,99,
116,105,111,110,109,0,0,0,8,114,101,116,
114,105,101,118,101,104,2,109,0,0,0,4,
116,121,112,101,109,0,0,0,4,117,110,105,
116,104,2,109,0,0,0,11,97,114,99,104,105,
118,101,100,95,97,116,109,0,0,0,22,50,48,
48,56,48,50,50,50,84,49,50,49,52,48,53,
46,53,50,54,51,56,52,104,2,109,0,0,0,10,
99,114,101,97,116,101,100,95,97,116,109,
0,0,0,22,50,48,48,55,49,49,50,56,84,49,
53,52,50,48,54,46,51,52,52,54,49,56,104,
2,109,0,0,0,4,112,114,111,112,104,1,108,
0,0,0,2,104,2,109,0,0,0,8,108,111,99,97,
116,105,111,110,109,0,0,0,6,65,85,83,76,
65,71,104,2,109,0,0,0,6,104,101,105,103,
104,116,98,0,0,7,158,106,104,2,109,0,0,0,
3,109,117,105,109,0,0,0,18,51,52,48,48,
53,57,57,56,49,48,48,48,48,51,49,50,53,
50,104,2,109,0,0,0,8,113,117,97,110,116,
105,116,121,97,11,106,106>>}]}}}]},
{links,[#Port<0.904494>]},
{dictionary,[{<0.10847.41>,{#Ref<0.0.81.132266>,1}}]},
{trap_exit,true},
{status,running},
{heap_size,987},
{stack_size,23},
{reductions,5627554}],
[]]}}
(and nothing further)
I still can access couchdb1 (the source) but every trivial request takes
exactly 5015ms:
balancer:/filespace/couchdb/log# time curl -i http://127.0.0.1:5984/
HTTP/1.1 200 OK
Server: CouchDB/0.9.0a731357-incubating (Erlang OTP/R12B)
Date: Mon, 05 Jan 2009 20:45:46 GMT
Content-Type: text/plain;charset=utf-8
Content-Length: 102
Cache-Control: must-revalidate
{"couchdb":"Welcome","version":"0.9.0a731357-incubating","start_time":"Sun, 04
Jan 2009 21:43:13 GMT"}
real 0m5.015s
user 0m0.008s
sys 0m0.000s
For these accesses no log entries on couchdb1 are created.
Meanwhile on the destination server (couchdb2) I can see lot of activity:
[Mon, 05 Jan 2009 20:47:58 GMT] [info] [<0.19601.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
... 40 lines...
[Mon, 05 Jan 2009 20:47:58 GMT] [info] [<0.19644.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
[Mon, 05 Jan 2009 20:48:13 GMT] [info] [<0.19652.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
... ca 200 lines
[Mon, 05 Jan 2009 20:48:13 GMT] [info] [<0.19744.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
[Mon, 05 Jan 2009 20:48:28 GMT] [info] [<0.19747.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
... ca 200 lines
[Mon, 05 Jan 2009 20:48:28 GMT] [info] [<0.19844.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
[Mon, 05 Jan 2009 20:48:43 GMT] [info] [<0.19944.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
... ca 200 lines
[Mon, 05 Jan 2009 20:48:58 GMT] [info] [<0.19948.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
... ca 200 lines
[Mon, 05 Jan 2009 20:48:58 GMT] [info] [<0.20044.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
[Mon, 05 Jan 2009 20:49:13 GMT] [info] [<0.20045.5>] 172.28.4.107 - - 'POST'
/hulog_events/_missing_revs 200
But the number of documents on the destination servers hasnt been incerasing in
the meantime:
{"db_name":"hulog_events","doc_count":25926,"doc_del_count":10074,"update_seq":36000,"purge_seq":0,"compact_running":false,"disk_size":21927524}
couchdb1 is CouchDB 0.9.0a731357-incubating
couchdb2 is CouchDB 0.9.0a730405-incubating
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.