Hi!

My use case is to upload around 100k JSON documents (500 MB) to CouchDB via 
a POST to the _bulk_docs handler. Everything works well, and because some schema 
validation is involved, the upload time of an hour was not surprising. 

Unfortunately, it turns out that some proxy servers (e.g. squid) at the sender's 
site cancel the connection (a default timeout setting is reached) while 
they should be waiting for the response of the _bulk_docs handler. Because changing 
the config of foreign proxy servers is possible in theory but not in practice, 
I'm looking for a solution that lets such proxies know that the 
connection is healthy and should not be canceled. 

- Maybe a streamed approach for the up- and download phases? 
- Maybe a heartbeat?
- Any other idea?

A quick and dirty workaround is in place: uploads are now made in smaller 
batches. That cannot be the final solution because it depends on so many 
variables that it remains just a hope of finishing before some timeout hits.
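
A minimal sketch of such a batched upload in Python, for illustration only. 
The helper names (batches, post_bulk, upload_in_batches), the batch size, the 
timeout and the database URL are assumptions, not the code actually in use. 
It relies only on the documented POST to _bulk_docs with a {"docs": [...]} 
body, which answers with a JSON array of per-document results.

```python
import json
import urllib.request
from typing import Callable, Iterable, Iterator, List


def batches(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` documents."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


def post_bulk(db_url: str, batch: List[dict], timeout: float = 120.0) -> list:
    """POST one batch to <db_url>/_bulk_docs and return the parsed reply."""
    body = json.dumps({"docs": batch}).encode("utf-8")
    req = urllib.request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def upload_in_batches(
    docs: Iterable[dict],
    db_url: str,
    size: int = 1000,  # hypothetical value; must be tuned per setup
    post: Callable[[str, List[dict]], list] = post_bulk,
) -> list:
    """Upload all documents batch by batch and collect the per-doc results."""
    results: list = []
    for batch in batches(docs, size):
        results.extend(post(db_url, batch))
    return results
```

Each request then has to finish within the proxy timeout on its own, which is 
exactly the weakness: the safe batch size depends on document size, validation 
cost and server load.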

Here is a short recap of the failed upload process:

1. send 10k docs as one JSON payload to _bulk_docs
2. wait longer than the default timeout of the proxy (e.g. 4 minutes) 
for the response
3. get disconnected by the proxy without a response from CouchDB

Best, ingo
