[
https://issues.apache.org/jira/browse/COUCHDB-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12835820#action_12835820
]
Filipe Manana commented on COUCHDB-639:
---------------------------------------
Hi Chris,
That is in fact the part I don't like: exposing _bulk_doc_rep. I did it
because, when using the doc multipart streamer, we can't reuse the same HTTP
body to carry other docs (at least not as far as I know). So _bulk_docs would
no longer be _bulk_docs but _bulk_doc (singular).
The alternative I see is to add a case clause in _bulk_docs, like:
case HttpHeaderContentType of
"multipart/related" ->
% do the stuff of _bulk_doc_rep (new_edits is false,
% call update_docs with "replicated_changes")
_Else ->
% ....
end
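A slightly fuller sketch of that clause (the helper names below are illustrative, not the actual couch_httpd_db code — just a rough shape of the dispatch, assuming the usual couch_httpd and couch_db calls):

```erlang
case couch_httpd:header_value(Req, "Content-Type") of
    "multipart/related" ++ _ ->
        %% _bulk_doc_rep behaviour: parse the single doc and its
        %% attachments from the multipart body (hypothetical helper),
        %% then store it as a replicated change, i.e. new_edits = false
        Doc = doc_from_multipart_body(Req),
        couch_db:update_docs(Db, [Doc], [], replicated_changes);
    _Else ->
        %% existing JSON _bulk_docs handling
        handle_json_bulk_docs(Req, Db)
end
```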
This probably looks better?
It should copy both new and old attachments (it doesn't matter whether they're
compressed or not). Hmm, what is suspicious about that?
cheers
> Make replication profit of attachment compression and improve push
> replication for large attachments
> ----------------------------------------------------------------------------------------------------
>
> Key: COUCHDB-639
> URL: https://issues.apache.org/jira/browse/COUCHDB-639
> Project: CouchDB
> Issue Type: Improvement
> Components: Replication
> Affects Versions: 0.11
> Environment: trunk
> Reporter: Filipe Manana
> Attachments: rep-att-comp-and-multipart-trunk.patch
>
>
> At the moment, for compressed attachments, the replication uncompresses and
> then re-compresses the attachments, which is a waste of CPU time.
> Push replication is also not reliable for very large attachments (500 MB+,
> for example). Currently it sends the attachments inlined in the respective
> JSON doc. Not only does this require too much RAM, it also wastes too much
> CPU time on the base64 encoding of the attachment (plus a decompression if
> the attachment is compressed).
> The following patch (rep-att-comp-and-multipart-trunk*.patch) addresses both
> issues. Docs containing attachments are now streamed to the remote target DB
> using the multipart doc streaming feature provided by couch_doc.erl, and
> compressed attachments are no longer uncompressed and re-compressed during
> the replication.
> JavaScript tests included.
> Previously, replicating a DB containing 2 docs with attachments of 100 MB
> and 500 MB caused the Erlang VM to consume nearly 1.2 GB of RAM on my
> system. With the patch applied, it uses about 130 MB of RAM.
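For context, the multipart doc streaming mentioned above puts the JSON doc and the raw attachment bytes into a single multipart/related request body, roughly like this on the wire (boundary, doc id, rev, and sizes are illustrative):

```
PUT /target_db/docid?new_edits=false HTTP/1.1
Content-Type: multipart/related; boundary="abc123"

--abc123
Content-Type: application/json

{"_id":"docid","_rev":"1-abc","_attachments":
  {"photo.jpg":{"follows":true,"content_type":"image/jpeg","length":54321}}}
--abc123

(raw attachment bytes, still compressed if stored compressed)
--abc123--
```

The "follows":true marker tells the receiver that the attachment body arrives as a later MIME part instead of as an inline base64 string, which is what avoids both the base64 CPU cost and holding the whole attachment in RAM.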
--
This message is automatically generated by JIRA.