+1, sounds like a plan.

On Jan 30, 2014, at 9:01 AM, Jan Lehnardt <[email protected]> wrote:
> I commented with a summary on the ticket again:
> https://github.com/couchbase/sync_gateway/issues/248#issuecomment-33689652
>
> I talked this over with @rnewson. The scenario appears to be this:
>
> ```
> [14:39:41] <+rnewson> but for the replicator to know to fetch
> "_design/foo" the source must have said it in the changes feed.
> [14:40:06] <+rnewson> if the source says "_design/foo" was updated
> *and* 404's when you fetch it, I don't see how it's the replicators problem.
> ```
>
> In CouchDB, the above scenario is still possible, but rather rare. In
> addition, CouchDB will retry this if it fails and unless you look at the
> docs, you are none the wiser that this has happened.
>
> The solution here is two-fold:
> 1. CouchDB should add a clause to handle non-200 responses and print a
> sensible error message about violating replication protocol expectations (we
> really need a proper spec soon).
> 2. sync_gateway should not send a _changes feed line for documents that can
> result in a 404, and/or handle the fact that CouchDB (pre-fix) bails on that
> or (post-fix) returns a sensible error.
>
>
>
> On 30 Jan 2014, at 14:47 , Adam Kocoloski <[email protected]> wrote:
>
>> Correct, the CouchDB replicator fails on a 404 there. There are relatively
>> few occasions where the replicator will skip data and run to completion --
>> explicit validate_doc_update rejections and MD5 mismatches on attachments
>> are two that come to mind.
>>
>> We could / should have the discussion about whether the replicator should
>> make progress in the event of other failure modes like this one. I could
>> make a case for it, but I worry that no one will pay attention to any
>> warning message or metric that gets reported.
>>
>> Adam
>>
>> On Jan 30, 2014, at 8:19 AM, Jan Lehnardt <[email protected]> wrote:
>>
>>> I commented on the issue:
>>>
>>> It looks like we are not handling a 404 in the function below, especially
>>> the fun(200, Headers, StreamDataFun) -> bit (that’s like 171 in
>>> couch_replicator_api_wrap).
>>>
>>> I’m not too familiar with that code, maybe one of Adam, Bob, Filipe, Benoit
>>> could have a look?
>>>
>>> cc dev@
>>>
>>> open_doc_revs(#httpdb{} = HttpDb, Id, Revs, Options, Fun, Acc) ->
>>>     Path = encode_doc_id(Id),
>>>     QArgs = options_to_query_args(
>>>         HttpDb, Path, [revs, {open_revs, Revs} | Options]),
>>>     Self = self(),
>>>     Streamer = spawn_link(fun() ->
>>>             send_req(
>>>                 HttpDb,
>>>                 [{path, Path}, {qs, QArgs},
>>>                     {ibrowse_options, [{stream_to, {self(), once}}]},
>>>                     {headers, [{"Accept", "multipart/mixed"}]}],
>>>                 fun(200, Headers, StreamDataFun) ->
>>>                     remote_open_doc_revs_streamer_start(Self),
>>>                     {<<"--">>, _, _} = couch_httpd:parse_multipart_request(
>>>                         get_value("Content-Type", Headers),
>>>                         StreamDataFun,
>>>                         fun mp_parse_mixed/1)
>>>                 end),
>>>             unlink(Self)
>>>         end),
>>>     receive
>>>     {started_open_doc_revs, Ref} ->
>>>         receive_docs_loop(Streamer, Fun, Id, Revs, Ref, Acc)
>>>     end;
>>>
>>> On 29 Jan 2014, at 16:35 , Jens Alfke <[email protected]> wrote:
>>>
>>>> A developer has reported a CouchDB 1.3 exception/crash replicating with
>>>> the Couchbase Sync Gateway. They've attached the Erlang crash report, but
>>>> those are about as readable to me as ancient Aramaic, or the logos* of
>>>> black-metal bands :(
>>>>
>>>>
>>>> https://github.com/couchbase/sync_gateway/issues/248#issuecomment-33523814
>>>>
>>>> Could someone who knows CouchDB take a look and give me a clue about what
>>>> it might be taking exception [sic] to? In my experience, there are some
>>>> areas where it gets very picky about parsing incoming data, for example
>>>> the number of newlines at the end of a multipart body. If I had some idea
>>>> what type of data it was reading when it barfed, that would help me figure
>>>> this out...
>>>>
>>>> Thanks!
>>>>
>>>> —Jens
>>>>
>>>> * viz.:
>>>> http://www.rottentomatoes.com/quiz/the-most-unreadable-metal-band-logos/
>>>
>>
>
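
A rough Erlang sketch of point 1 in Jan's summary (a clause for non-200 responses that yields a sensible error). It does not touch couch_replicator_api_wrap: the module name open_revs_status_sketch, the function classify/2, and the shape of the error tuples are all made up for illustration, so treat it as one possible direction rather than the actual fix.

```erlang
-module(open_revs_status_sketch).
-export([classify/2]).

%% Hypothetical helper, NOT part of CouchDB. It only illustrates matching
%% on status codes other than 200 instead of having a single 200 clause.

%% 200: the open_revs response can be parsed as before.
classify(200, DocId) ->
    {ok, DocId};
%% 404: the source advertised DocId in _changes but cannot serve it.
classify(404, DocId) ->
    {error, {replication_protocol_violation, DocId,
             "source listed the document in _changes but returned 404"}};
%% Any other status: report the code rather than failing on a missing clause.
classify(Code, DocId) when is_integer(Code) ->
    {error, {unexpected_status, Code, DocId}}.
```

Whether the replicator should then skip the document or stop is the open question Adam raises; this sketch only covers turning the status code into a readable error.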
