[
https://issues.apache.org/jira/browse/COUCHDB-2484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gunther Gruber updated COUCHDB-2484:
------------------------------------
Component/s: Database Core
> replication crashes
> -------------------
>
> Key: COUCHDB-2484
> URL: https://issues.apache.org/jira/browse/COUCHDB-2484
> Project: CouchDB
> Issue Type: Bug
> Security Level: public(Regular issues)
> Components: Database Core
> Affects Versions: 1.6.0
> Reporter: Gunther Gruber
> Priority: Minor
>
> We are using CouchDB version 1.6 with 8.3T of data; the biggest database is
> 2.1T. We are currently switching to new hardware with more storage space. We
> copied the files with rsync and started the replication.
> One system is already in sync; the other is still replicating.
> Note that despite the errors in the log, the first system is now in
> sync.
> The log looks like the following:
> Retrying POST request to http://replication:XXXX/database/_revs_diff in 0.5
> seconds due to error req_timedout
> and then
> [Mon, 01 Dec 2014 13:00:28 GMT] [error] [<0.27044.1>] ** Generic server
> <0.27044.1> terminating
> ** Last message in was {'EXIT',<0.26965.1>,killed}
> ** When Server state == {state,<0.26965.1>,<0.27045.1>,40,
> {httpdb,
> "http://replication:[email protected]/sm_chemie/",
> nil,
> [{"Accept","application/json"},
> {"User-Agent","CouchDB/1.2.0"}],
> 30000,
> [{socket_options,
> [{recbuf,262144},
> {sndbuf,262144},
> {nodelay,true},
> {keepalive,true}]}],
> 10,250,<0.26966.1>,40},
> {httpdb,
> "http://replication:XXX@XXX:5984/sm_chemie/",
> nil,
> [{"Accept","application/json"},
> {"User-Agent","CouchDB/1.2.0"}],
> 30000,
> [{socket_options,
> [{recbuf,262144},
> {sndbuf,262144},
> {nodelay,true},
> {keepalive,true}]}],
> 10,250,<0.26968.1>,40},
> [],nil,nil,nil,
> {rep_stats,0,0,0,0,0},
> nil,nil,
> {batch,[],0}}
> ** Reason for termination ==
> ** killed
> [Mon, 01 Dec 2014 13:00:28 GMT] [error] [<0.27042.1>] {error_report,<0.31.0>,
> {<0.27042.1>,crash_report,
> [[{initial_call,
> {couch_replicator_worker,init,['Argument__1']}},
> {pid,<0.27042.1>},
> {registered_name,[]},
> {error_info,
> {exit,killed,
> [{gen_server,terminate,6,
> [{file,"gen_server.erl"},{line,747}]},
> {proc_lib,init_p_do_apply,3,
> [{file,"proc_lib.erl"},{line,227}]}]}},
> {ancestors,
> [<0.26965.1>,couch_rep_sup,couch_primary_services,
> couch_server_sup,<0.32.0>]},
> {messages,[]},
> {links,[<0.27043.1>]},
> {dictionary,
> [{last_stats_report,{1417,438797,704976}}]},
> {trap_exit,true},
> {status,running},
> {heap_size,377},
> {stack_size,24},
> {reductions,372}],
> []]}}
> It looks to me like a timeout occurs and the replication task then exits. I
> have already tried adjusting the configuration settings, without success. I
> can provide more information if needed.
> /etc/couchdb/local.d/001-user_config.ini
> [couchdb]
> file_compression = snappy
> max_dbs_open = 400
> [httpd]
> bind_address = ::
> server_options = [{backlog, 128}, {acceptor_pool_size, 16}]
> socket_options = [{recbuf, 262144}, {sndbuf, 262144}, {nodelay, true},
> {keepalive, true}]
> [couch_httpd_auth]
> secret =
> [log_level_by_module]
> couch_httpd = warning
> couch_replicator = debug
> couch_query_servers = warning
> [daemons]
> httpsd = {couch_httpd, start_link, [https]}
> [ssl]
> cert_file = /etc/couchdb/ssl/certs/couchdb-couch1.prime.adns.de.pem
> key_file = /etc/couchdb/ssl/private/couchdb-couch1.prime.adns.de.pem
> verify_ssl_certificates = false
> [replicator]
> worker_batch_size = 2000
> worker_processes = 40
> http_connections = 40
> socket_options = [{recbuf, 262144}, {sndbuf, 262144}, {nodelay, true},
> {keepalive, true}]
> /etc/default/couchdb
> # Sourced by init script for configuration.
> COUCHDB_USER=couchdb
> COUCHDB_STDOUT_FILE=/dev/null
> COUCHDB_STDERR_FILE=/dev/null
> COUCHDB_RESPAWN_TIMEOUT=5
> COUCHDB_OPTIONS=
> # 32 Threads to handle I/O
> export ERL_FLAGS="+A 32"
> # 8192 open files
> export ERL_MAX_PORTS=8192
> ulimit -n 8192
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)