Nick Vatamaniuc created COUCHDB-3167:
----------------------------------------

             Summary: CouchDB replicator will retry forever if it cannot write 
to source db
                 Key: COUCHDB-3167
                 URL: https://issues.apache.org/jira/browse/COUCHDB-3167
             Project: CouchDB
          Issue Type: Bug
            Reporter: Nick Vatamaniuc


If a replication is using checkpoints (which it does by default), and the 
replication document does not have authorization to write to the source db, the 
replication will crash repeatedly.

Crashing is expected and not a problem. However, each time the replication 
crashes, it writes an error state to the replication doc and then the 
replication job exits. Writing the error state generates a new doc update 
change for the _replicator db. The replicator reads the document change, starts 
a new replication job, and writes a "triggered" state to the document. The 
replication starts successfully, then crashes and writes "error" to the document.
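
The cycle described above can be sketched as a toy model. This is only an 
illustration of the feedback loop, not CouchDB code: the function name, the 
max_cycles cutoff, and the pending_changes counter are all hypothetical, and 
only the "error" write is modeled as producing a new change, following the 
description above.

```python
def simulate_feedback_loop(max_cycles):
    """Toy model of the loop: an "error" write re-triggers the replicator.

    max_cycles is an artificial cutoff so the simulation terminates;
    the behavior described in this issue has no such limit.
    """
    states = []           # states written to the replication doc, in order
    pending_changes = 1   # the initial doc creation is the first change
    cycles = 0
    while pending_changes > 0 and cycles < max_cycles:
        pending_changes -= 1          # replicator reads the doc change
        states.append("triggered")    # job starts successfully
        states.append("error")        # checkpoint write fails, job crashes
        pending_changes += 1          # the "error" write is itself a new change
        cycles += 1
    return states, pending_changes


states, pending = simulate_feedback_loop(3)
print(states)
print("changes still pending:", pending)
```

In this model there is always one change left pending after each cycle, so the 
loop only stops because of the artificial cutoff.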

So alternating states of "triggered" and "error" keep being written to the 
document forever. Looking at some examples of this, there was a shard >900GB in 
size, and some others as high as 500GB.

The critical bit above is that the replication starts successfully. There is a 
mechanism to fail and cancel replications which repeatedly fail to start. 
However, if a replication job crashes after it has started, it will be 
restarted an unlimited number of times.
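
To make the distinction concrete, here is a minimal sketch contrasting a 
policy that counts only failed starts with one that also counts crashes after 
a successful start. It is not the replicator's actual logic: RestartPolicy, 
max_failures, count_post_start_crashes, and restarts_allowed are hypothetical 
names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RestartPolicy:
    max_failures: int = 10                  # hypothetical limit
    count_post_start_crashes: bool = False  # False models the behavior described here
    failures: int = 0

    def record(self, started_ok: bool) -> bool:
        """Record one failed job; return True if a restart is still allowed."""
        if not started_ok or self.count_post_start_crashes:
            self.failures += 1
        return self.failures < self.max_failures


def restarts_allowed(policy, attempts, started_ok):
    """Count how many restarts the policy grants over a run of failing jobs."""
    restarts = 0
    for _ in range(attempts):
        if not policy.record(started_ok):
            break
        restarts += 1
    return restarts


# Jobs that start fine and then crash are never counted: every restart is granted.
print(restarts_allowed(RestartPolicy(max_failures=3), 100, started_ok=True))
# Counting post-start crashes as well bounds the number of restarts.
print(restarts_allowed(
    RestartPolicy(max_failures=3, count_post_start_crashes=True), 100, started_ok=True))
```

With the first policy the job is restarted on every attempt, which matches the 
unbounded behavior reported here. The second policy stops granting restarts 
once the failure limit is reached.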




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
