Heya Ingo,

I think new streaming interfaces are the way to go. We’ve been talking about 
them in other contexts, like replication, before. We likely won’t get to this 
before 2.0, though.

In the meantime, the best you can do is, as you write, smaller batches.
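
For what it's worth, a minimal sketch of the smaller-batches approach in Python, using only the standard library. The database URL, the batch size of 500 and the function names are assumptions for illustration, not anything CouchDB prescribes; pick a size that keeps each request comfortably under the proxy timeout.

```python
import json
import urllib.request
from typing import Iterable, Iterator, List


def batches(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` documents."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


def post_bulk_docs(db_url: str, docs: List[dict], timeout: float = 120.0) -> list:
    """POST one batch to <db_url>/_bulk_docs and return the per-document results."""
    body = json.dumps({"docs": docs}).encode("utf-8")
    req = urllib.request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def upload(db_url: str, docs: Iterable[dict], size: int = 500) -> list:
    """Upload all documents in batches; db_url and size are hypothetical defaults."""
    results = []
    for batch in batches(docs, size):
        results.extend(post_bulk_docs(db_url, batch))
    return results
```

Each request then finishes in a fraction of the total time, so the proxy sees a response long before its timeout, at the cost of more round trips.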

An alternative approach would be to disable doc validation on the target DB, do 
the _bulk_docs inserts and then replicate the database into a local copy that 
does have doc validation.
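
A rough sketch of the replication step, assuming the unvalidated upload went into a database called "staging" and the validated copy is called "validated" (both names, and the server URL, are made up for illustration). It posts to the `_replicate` endpoint; the validation function on the target runs during replication, and documents it rejects are counted as write failures in the replication result rather than written.

```python
import json
import urllib.request


def replication_request(source: str, target: str) -> dict:
    """Build the JSON body for a one-off replication from source to target."""
    return {"source": source, "target": target}


def replicate(server_url: str, source: str, target: str,
              timeout: float = 3600.0) -> dict:
    """POST to <server_url>/_replicate and return the parsed JSON response."""
    body = json.dumps(replication_request(source, target)).encode("utf-8")
    req = urllib.request.Request(
        server_url.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical server and database names.
    print(replicate("http://localhost:5984", "staging", "validated"))
```

Since this request runs against the local server, it does not pass through the foreign proxy at all.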

Or write the doc validation in Erlang.

Best
Jan
--
Professional Support for Apache CouchDB:
http://www.neighbourhood.ie/couchdb-support/



> On 16 Mar 2015, at 11:50, Ingo Radatz <[email protected]> wrote:
> 
> Hi!
> 
> My use case is to upload around 100k JSON documents (500 MB) to CouchDB 
> via a PUT to the _bulk_docs handler. Everything works well, and because some 
> schema validation is involved, the upload time of an hour was not surprising.
> 
> Unfortunately, it turns out that some proxy servers (e.g. squid) on the 
> sender's side cancel the connection (a default timeout config param is 
> reached) while they should be waiting for the response of the _bulk_docs 
> handler. Because changing the config of foreign proxy servers is possible in 
> theory but not in practice, I'm looking for a solution that lets such proxies 
> know that the connection is healthy and should not be canceled.
> 
> - Maybe a streamed approach for the up- and download phases?
> - Maybe a heartbeat?
> - Any other idea?
> 
> A quick and dirty solution is in place: uploads are now made in smaller 
> batches. That cannot be the final solution, because it depends on so many 
> variables that it remains just a hope "to finish before some timeout hits".
> 
> Here is a short recap of the failed upload process:
> 
> 1. send 10k docs as one JSON payload to _bulk_docs
> 2. wait longer than the default timeout in the proxy (e.g. 4 minutes) 
> for the response
> 3. get disconnected by the proxy without a response from CouchDB
> 
> Best, ingo
> 
