Repository: couchdb-couch-replicator
Updated Branches:
  refs/heads/master 227cbc647 -> 90d70883e
Fix changes worker timeout cleanup

Previously if we timed out waiting for the next message the changes
reader would end up just exiting with an error. Unfortunately the
ibrowse worker doesn't bother noticing that its streaming target has
died and will wait in perpetuity. If the main replication process
happens to be waiting on this HTTP worker it'll block indefinitely and
never make progress in the replication.

This change just ensures that the ibrowse worker is killed which will
cause the main replication pid to restart. This particular bug has been
observed on the Oculus clusters at a fairly low rate so the cost of
restarting a replication shouldn't be an issue.

BugzId: 47971

Project: http://git-wip-us.apache.org/repos/asf/couchdb-couch-replicator/repo
Commit: http://git-wip-us.apache.org/repos/asf/couchdb-couch-replicator/commit/90d70883
Tree: http://git-wip-us.apache.org/repos/asf/couchdb-couch-replicator/tree/90d70883
Diff: http://git-wip-us.apache.org/repos/asf/couchdb-couch-replicator/diff/90d70883

Branch: refs/heads/master
Commit: 90d70883eb7e53beed09752ae7e41a4d386a2ede
Parents: 227cbc6
Author: Paul J. Davis <[email protected]>
Authored: Tue Jun 9 11:34:57 2015 -0500
Committer: Robert Newson <[email protected]>
Committed: Mon Aug 24 16:39:57 2015 +0100

----------------------------------------------------------------------
 src/couch_replicator_httpc.erl | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/couchdb-couch-replicator/blob/90d70883/src/couch_replicator_httpc.erl
----------------------------------------------------------------------
diff --git a/src/couch_replicator_httpc.erl b/src/couch_replicator_httpc.erl
index 052eb98..9a10bdb 100644
--- a/src/couch_replicator_httpc.erl
+++ b/src/couch_replicator_httpc.erl
@@ -144,7 +144,7 @@ process_stream_response(ReqId, Worker, HttpDb, Params, Callback) ->
             StreamDataFun = fun() ->
                 stream_data_self(HttpDb, Params, Worker, ReqId, Callback)
             end,
-            put(?STREAM_STATUS, streaming),
+            put(?STREAM_STATUS, {streaming, Worker}),
             ibrowse:stream_next(ReqId),
             try
                 Ret = Callback(Ok, Headers, StreamDataFun),
@@ -199,7 +199,7 @@ clean_mailbox(_ReqId, 0) ->
     ok;
 clean_mailbox({ibrowse_req_id, ReqId}, Count) when Count > 0 ->
     case get(?STREAM_STATUS) of
-        streaming ->
+        {streaming, Worker} ->
             ibrowse:stream_next(ReqId),
             receive
                 {ibrowse_async_response, ReqId, _} ->
@@ -208,6 +208,7 @@ clean_mailbox({ibrowse_req_id, ReqId}, Count) when Count > 0 ->
                     put(?STREAM_STATUS, ended),
                     ok
                 after 30000 ->
+                    exit(Worker, {timeout, ibrowse_stream_cleanup}),
                     exit({timeout, ibrowse_stream_cleanup})
             end;
         Status when Status == init; Status == ended ->
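
The effect of the added exit(Worker, ...) call can be reproduced in isolation. The following is a minimal sketch, not the replicator's code: the module name stream_cleanup_sketch, worker_loop/0, reader/2 and demo/0 are hypothetical stand-ins, and the worker here is a plain process rather than a real ibrowse worker. It only assumes standard Erlang behaviour: a process that is not trapping exits dies when it receives an exit signal with a reason other than normal.

```erlang
-module(stream_cleanup_sketch).
-export([demo/0]).

%% Hypothetical stand-in for the ibrowse worker: it waits for its
%% streaming target to request more data and never notices on its own
%% that the target has gone away.
worker_loop() ->
    receive
        stream_next -> worker_loop()
    end.

%% Hypothetical stand-in for clean_mailbox/2: wait for a response and,
%% on timeout, kill the worker before exiting, as the patch does.
reader(Worker, TimeoutMs) ->
    receive
        {response, _} -> ok
    after TimeoutMs ->
        exit(Worker, {timeout, ibrowse_stream_cleanup}),
        exit({timeout, ibrowse_stream_cleanup})
    end.

%% Spawns both processes under monitors and returns
%% {ReaderExitReason, WorkerExitReason}.
demo() ->
    {Worker, WRef} = spawn_monitor(fun worker_loop/0),
    {_Reader, RRef} = spawn_monitor(fun() -> reader(Worker, 100) end),
    ReaderReason = receive {'DOWN', RRef, process, _, R1} -> R1 end,
    WorkerReason = receive {'DOWN', WRef, process, _, R2} -> R2 end,
    {ReaderReason, WorkerReason}.
```

Calling stream_cleanup_sketch:demo() returns the same {timeout, ibrowse_stream_cleanup} reason for both processes. Without the exit(Worker, ...) line in reader/2, the second receive in demo/0 would never complete, which mirrors the indefinite wait described in the commit message.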
