[
https://issues.apache.org/jira/browse/COUCHDB-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843018#comment-13843018
]
Dave Cottlehuber commented on COUCHDB-1946:
-------------------------------------------
[~stelcheck] agreed
[~thor.lange]
There's something about replicating this specific doc that seems to trigger
issues. Here's what I used to identify it (query the source db's _changes feed
with since=<checkpoint - 1>):
http://isaacs.iriscouch.com/registry/_changes?limit=2&since=701251
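If you want to repeat that lookup for a different checkpoint, here is a minimal sketch. The helper name changes_url and the checkpoint value in the usage line are illustrative assumptions; substitute the checkpoint your own replication stalled at.

```shell
# Print the _changes URL that starts one update before a stalled checkpoint.
# $1 = source database URL, $2 = checkpoint sequence number (both supplied by you).
changes_url() {
  printf '%s/_changes?limit=2&since=%d\n' "$1" "$(($2 - 1))"
}

# Usage (needs network access to the source database):
#   curl -s "$(changes_url http://isaacs.iriscouch.com/registry 701252)"
```

The two rows returned should include the document the replicator was working on when it stalled.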
Here are some things you can try:
# option 1
- delete all existing replications
- compact your DB if there's a big difference between data size and on-disk
size. jq (http://stedolan.github.io/jq/) is awesome for checking this:
curl -s http://localhost:5984/registry | jq '(.disk_size | tonumber) - (.data_size | tonumber)'
This is a good spot to copy the registry.couch file if you have space, in case
you need to revert back to it.
- replicate the single failing document by POSTing this to _replicator. This
could take a *while*.
{{code}}
{
  "source": "http://isaacs.iriscouch.com/registry",
  "target": "registry",
  "doc_ids": [
    "as-stream"
  ],
  "owner": "admin"
}
{{code}}
- this is simply replicating the single stuck document. If you do this, I would
love an ngrep or tcpdump of the traffic to see what happens on the wire during
these stuck transfers
- once this is completed, you can then run the normal replication again.
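The compaction and single-document steps above could be scripted roughly as follows. This is a sketch under assumptions: the function names and the COUCH variable are made up for illustration, the server is assumed to be the local node from the curl example, and no credentials are passed (add curl's -u user:password if your server requires an admin login).

```shell
# Base URL of the local CouchDB node; override by exporting COUCH first.
COUCH="${COUCH:-http://localhost:5984}"

# Bytes compaction could reclaim: $1 = disk_size, $2 = data_size.
reclaimable_bytes() {
  echo "$(($1 - $2))"
}

# Ask CouchDB to compact database $1 (the request must be sent as JSON).
compact_db() {
  curl -s -X POST -H 'Content-Type: application/json' "$COUCH/$1/_compact"
}

# Print the replication document for the single stuck doc.
replication_doc() {
  cat <<'EOF'
{
  "source": "http://isaacs.iriscouch.com/registry",
  "target": "registry",
  "doc_ids": ["as-stream"],
  "owner": "admin"
}
EOF
}

# Create the replication by POSTing that document to _replicator.
start_single_doc_replication() {
  replication_doc | curl -s -X POST -H 'Content-Type: application/json' \
    -d @- "$COUCH/_replicator"
}

# Usage:
#   compact_db registry
#   start_single_doc_replication
```

Nothing runs until you call the functions, so you can source the file and take the steps one at a time, copying registry.couch between compaction and replication.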
# option 2
Install an older release of CouchDB and see whether it still gets stuck. Older
binaries are here:
https://archive.apache.org/dist/couchdb/binary/win/1.2.2/
If you *can* please try the R15B03-1 release first, report back, and then the
R14B04 one. It's not yet clear to me if the issue we are seeing is also related
to garbage collection differences in Erlang/OTP between releases, or solely
within CouchDB.
# option 3
Sometime later (hopefully today), I should have a BitTorrent-accessible
version of npm. I need to update & compact first, and that is pretty much IO
limited :-).
> Trying to replicate NPM grinds to a halt after 40GB
> ---------------------------------------------------
>
> Key: COUCHDB-1946
> URL: https://issues.apache.org/jira/browse/COUCHDB-1946
> Project: CouchDB
> Issue Type: Bug
> Components: Database Core
> Reporter: Marc Trudel
> Attachments: couch.log
>
>
> I have been able to replicate the Node.js NPM database until 40G or so, then
> I get this:
> https://gist.github.com/stelcheck/7723362
> In one case I have gotten a flat-out OOM error, but I didn't take a dump of
> the log output at the time.
> CentOS6.4 with CouchDB 1.5 (also tried 1.3.1, but to no avail). Also tried to
> restart replication from scratch - twice - both cases stalling at 40GB.