[
https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089755#comment-13089755
]
Damien Katz commented on COUCHDB-1256:
--------------------------------------
I agree with the fix Adam proposes. The code in question is an optimization to
prevent the sending/checking of documents we've already examined, but with
checkpointing it breaks. Removal of the code is the right fix for now.
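
As an illustration only (this is a Python model, not CouchDB code), the
all_docs branch of the optimization can be sketched as a filter that keeps a
revision only when its seq is greater than the since value. The names RevInfo,
filter_revs and FOO_REVS are hypothetical, and the per-revision seq values are
an assumption inferred from the order of the curl commands in the quoted
report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RevInfo:
    rev: str  # revision id
    seq: int  # update sequence at which this revision was written (assumed)


def filter_revs(revs, start_seq):
    """Model of the removed all_docs branch: keep revs with start_seq < seq."""
    return [r for r in revs if start_seq < r.seq]


# Leaf revisions of "foo" after the conflict revision is inserted. The seq
# values are inferred from the report's command order, not read from CouchDB.
FOO_REVS = [
    RevInfo("1-cc609831f0ca66e8cd3d4c1e0d98108a", 3),
    RevInfo("1-0dc33db52a43872b6f3371cef7de0277", 1),
]

for since in (0, 2):
    print(since, [r.rev for r in filter_revs(FOO_REVS, since)])
```

Under these assumptions, since=0 keeps both revisions and since=2 keeps only
the conflict revision, which matches the two _changes responses in the report.
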
In the future, we can add the optimization back if the checkpointing can
distinguish completed replications from checkpointed ones. Checkpoint records
would keep a "high water mark" of the last completed replication, and both the
seq num and that high water mark would be sent to the _changes handler. The
_changes handler would not send docs with a seq below the checkpoint value.
When the replication checkpoints, it saves the current seq and the last
completed high water mark. When replication completes, it sets the last seq
and the high water mark to the same seq, and that gets sent for the next
replication.
Also, continuous replication would need a way to signal when a replication is
"complete", so that the high water mark can be set there as well.
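
One possible reading of the scheme above, sketched as a small Python model
rather than as replicator code. Everything here is hypothetical: the
Checkpoint class, the changes_since function and the ROWS data are
illustrative names, and the choice to resume rows from the checkpointed seq
while filtering revisions against the high water mark is an assumption about
how the two values would be used.

```python
from dataclasses import dataclass

REV_CONFLICT = "1-cc609831f0ca66e8cd3d4c1e0d98108a"
REV_ORIGINAL = "1-0dc33db52a43872b6f3371cef7de0277"


@dataclass
class Checkpoint:
    last_seq: int = 0         # where the next _changes request resumes
    high_water_mark: int = 0  # last seq of the last *completed* replication

    def checkpoint(self, current_seq):
        # Mid-replication: record progress, keep the old high water mark.
        self.last_seq = current_seq

    def complete(self, final_seq):
        # Replication finished: both values move to the same seq.
        self.last_seq = final_seq
        self.high_water_mark = final_seq


def changes_since(rows, since, high_water_mark):
    """Return rows after `since`, dropping only revs at or below the mark."""
    out = []
    for doc_seq, doc_id, revs in rows:
        if doc_seq <= since:
            continue
        kept = [rev for rev, rev_seq in revs if rev_seq > high_water_mark]
        out.append((doc_seq, doc_id, kept))
    return out


# by-seq view of the database from the report; per-revision seqs are assumed.
ROWS = [
    (2, "bar", [(REV_CONFLICT, 2)]),
    (3, "foo", [(REV_CONFLICT, 3), (REV_ORIGINAL, 1)]),
]

cp = Checkpoint()
cp.checkpoint(2)  # checkpointed at seq 2, nothing completed yet
print(changes_since(ROWS, cp.last_seq, cp.high_water_mark))
```

In this model a replication that only checkpointed at seq 2 still receives
both revisions of foo, because the high water mark is still 0. Only after
complete() has advanced the mark are older revisions filtered out.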
> Incremental requests to _changes can skip revisions
> ---------------------------------------------------
>
> Key: COUCHDB-1256
> URL: https://issues.apache.org/jira/browse/COUCHDB-1256
> Project: CouchDB
> Issue Type: Bug
> Components: Replication
> Affects Versions: 0.10, 0.10.1, 0.10.2, 0.11.1, 0.11.2, 1.0, 1.0.1, 1.0.2,
> 1.1, 1.0.3
> Environment: confirmed on Apache CouchDB 1.1.0, bug appears to be
> present in 1.0.3 and trunk
> Reporter: Adam Kocoloski
> Assignee: Adam Kocoloski
> Priority: Blocker
> Fix For: 1.0.4, 1.1.1, 1.2
>
> Attachments: jira-1256-test.diff
>
>
> Requests to _changes with style=all_docs&since=N (requests made by the
> replicator) are liable to suppress revisions of a document. The following
> sequence of curl commands demonstrates the bug:
> curl -X PUT localhost:5985/revseq
> {"ok":true}
> curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/foo -d '{"a":123}'
> {"ok":true,"id":"foo","rev":"1-0dc33db52a43872b6f3371cef7de0277"}
> curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/bar -d '{"a":456}'
> {"ok":true,"id":"bar","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
> % stick a conflict revision in foo
> curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/foo?new_edits=false -d '{"_rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a", "a":123}'
> {"ok":true,"id":"foo","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
> % request without since= gives the expected result
> curl -Hcontent-type:application/json localhost:5985/revseq/_changes?style=all_docs
> {"results":[
> {"seq":2,"id":"bar","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]},
> {"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"},{"rev":"1-0dc33db52a43872b6f3371cef7de0277"}]}
> ],
> "last_seq":3}
> % request starting from since=2 suppresses revision 1-0dc33db52a43872b6f3371cef7de0277 of foo
> curl localhost:5985/revseq/_changes?style=all_docs\&since=2
> {"results":[
> {"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]}
> ],
> "last_seq":3}
> I believe the fix is something like this (though we could refactor further
> because Style is unused):
> diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
> index e8705be..65aeca3 100644
> --- a/src/couchdb/couch_db.erl
> +++ b/src/couchdb/couch_db.erl
> @@ -1029,19 +1029,7 @@ changes_since(Db, Style, StartSeq, Fun, Acc) ->
>      changes_since(Db, Style, StartSeq, Fun, [], Acc).
>  
>  changes_since(Db, Style, StartSeq, Fun, Options, Acc) ->
> -    Wrapper = fun(DocInfo, _Offset, Acc2) ->
> -            #doc_info{revs=Revs} = DocInfo,
> -            DocInfo2 =
> -            case Style of
> -            main_only ->
> -                DocInfo;
> -            all_docs ->
> -                % remove revs before the seq
> -                DocInfo#doc_info{revs=[RevInfo ||
> -                    #rev_info{seq=RevSeq}=RevInfo <- Revs, StartSeq < RevSeq]}
> -            end,
> -            Fun(DocInfo2, Acc2)
> -        end,
> +    Wrapper = fun(DocInfo, _Offset, Acc2) -> Fun(DocInfo, Acc2) end,
>      {ok, _LastReduction, AccOut} = couch_btree:fold(by_seq_btree(Db),
>          Wrapper, Acc, [{start_key, StartSeq + 1}] ++ Options),
>      {ok, AccOut}.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira