[
https://issues.apache.org/jira/browse/COUCHDB-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Enda Farrell updated COUCHDB-416:
---------------------------------
Priority: Critical (was: Major)
Description:
I have a set of CouchDB instances, each one acting as a shard for a large set
of data.
Occasionally, we replicate each instance's database into a different CouchDB
instance. We always "pull" replicate (see attached image).
When we do this, we often see errors like these on the target instance:
[Thu, 16 Jul 2009 13:52:32 GMT] [error] [emulator] Error in process <0.29787.102> with exit value:
{function_clause,[{lists,map,[#Fun<couch_rep.6.75683565>,undefined]},{couch_rep,enum_docs_since,4}]}

[Thu, 16 Jul 2009 13:52:32 GMT] [error] [<0.7456.6>] replication enumerator exited with
{function_clause,[{lists,map,[#Fun<couch_rep.6.75683565>,undefined]},
                  {couch_rep,enum_docs_since,4}]} .. respawning
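For context on the error shape: the crash reported above appears to be `lists:map/2` being handed the atom `undefined` where it expects a list, presumably because `couch_rep:enum_docs_since/4` passed an unset value through to the enumerator (that interpretation is an assumption, not confirmed from the code). A minimal Erlang shell sketch of the failure mode, with a placeholder fun standing in for the real `#Fun<couch_rep.6.75683565>`:

```erlang
%% lists:map/2 pattern-matches its second argument against [] or [H|T];
%% the atom 'undefined' matches neither clause, so the call raises a
%% function_clause error - the same error term seen in the logs above.
1> lists:map(fun(X) -> X end, undefined).
** exception error: no function clause matching lists:map(#Fun<...>,undefined)
```

Because the respawn logic retries the enumerator without clearing whatever left the value `undefined`, each respawn hits the same clause mismatch, which would explain the tight error loop described below.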
Once this starts, it is fatal to the CouchDB instance. It logs these messages
at over 1000 per second (log level = severe) and chews up HDD.
No errors (other than an HTTP timeout) are seen.
After a database had gone "respawning", the target node was shut down, the
logs were cleared, and the target node was restarted. The log was tailed - all
was quiet. Once a single replication was run again against this database, it
immediately went back into respawning hell. There were no stacked replications
in this case.
From this it seems that if a database ever goes into "respawning" it cannot
recover (when your environment/setup requires replication to occur always).
was:
I have a set of CouchDB instances, each one acting as a shard for a large set
of data.
Occasionally, we replicate each instance's database into a different CouchDB
instance. We always "pull" replicate (see attached image).
When we do this, we often see errors like these on the target instance:
[Thu, 16 Jul 2009 13:52:32 GMT] [error] [emulator] Error in process <0.29787.102> with exit value:
{function_clause,[{lists,map,[#Fun<couch_rep.6.75683565>,undefined]},{couch_rep,enum_docs_since,4}]}

[Thu, 16 Jul 2009 13:52:32 GMT] [error] [<0.7456.6>] replication enumerator exited with
{function_clause,[{lists,map,[#Fun<couch_rep.6.75683565>,undefined]},
                  {couch_rep,enum_docs_since,4}]} .. respawning
Once this starts, it is fatal to the CouchDB instance. It logs these messages
at over 1000 per second (log level = severe) and chews up HDD.
No errors (other than an HTTP timeout) are seen.
> Replicating shards into a single aggregation node may cause endless respawning
> ------------------------------------------------------------------------------
>
> Key: COUCHDB-416
> URL: https://issues.apache.org/jira/browse/COUCHDB-416
> Project: CouchDB
> Issue Type: Bug
> Components: Database Core
> Affects Versions: 0.9
> Environment: couchdb 0.9.0.r766883 CentOS x86_64
> Reporter: Enda Farrell
> Priority: Critical
> Attachments: Picture 2.png
>
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.