[
https://issues.apache.org/jira/browse/HBASE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474786#comment-13474786
]
Lars Hofhansl commented on HBASE-5355:
--------------------------------------
Before we commit this or the trunk patch I'd love to see some numbers comparing
this full compression stream approach with just avoiding duplicate data while
serializing from/to the RegionServer. On both sides we'd have to reassemble the
full KVs (unless we finally make a KV interface), but we can do that efficiently
if we keep track of the size of the omitted parts of each KV, preallocate the
space, and copy the data into it. That way we'd have the same amount of memory
copying (ignoring DMA from the network card for the moment) and can save bytes
on the wire.
I raised this on the mailing list a while ago, and Andy commented on that
somewhere as well.
KVs are sorted when traveling over the wire (as a set of Puts/Deletes or in a
Result), so we can simply avoid copying the shared prefix multiple times.
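A minimal sketch of what I mean, not taken from any patch on this issue. The
class name PrefixElision, the wire layout (key count, then per key: shared
length, remaining length, remaining bytes), and the use of plain byte[] keys in
place of real KVs are all assumptions made for illustration:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: omits the prefix that each sorted
// key shares with the key sent just before it.
final class PrefixElision {

  // Number of leading bytes that a and b have in common.
  static int commonPrefix(byte[] a, byte[] b) {
    int max = Math.min(a.length, b.length);
    int i = 0;
    while (i < max && a[i] == b[i]) {
      i++;
    }
    return i;
  }

  // Assumed layout: int count, then per key: int shared, int rest, rest bytes.
  static byte[] encode(List<byte[]> sortedKeys) {
    int n = sortedKeys.size();
    int[] shared = new int[n];
    int size = 4;
    byte[] prev = new byte[0];
    for (int i = 0; i < n; i++) {
      byte[] key = sortedKeys.get(i);
      shared[i] = commonPrefix(prev, key);
      size += 8 + key.length - shared[i];
      prev = key;
    }
    ByteBuffer buf = ByteBuffer.allocate(size);
    buf.putInt(n);
    for (int i = 0; i < n; i++) {
      byte[] key = sortedKeys.get(i);
      int rest = key.length - shared[i];
      buf.putInt(shared[i]);
      buf.putInt(rest);
      buf.put(key, shared[i], rest); // only the non-shared suffix goes on the wire
    }
    return buf.array();
  }

  static List<byte[]> decode(byte[] wire) {
    ByteBuffer buf = ByteBuffer.wrap(wire);
    int n = buf.getInt();
    List<byte[]> keys = new ArrayList<>();
    byte[] prev = new byte[0];
    for (int i = 0; i < n; i++) {
      int shared = buf.getInt();
      int rest = buf.getInt();
      byte[] key = new byte[shared + rest];      // preallocate the full key
      System.arraycopy(prev, 0, key, 0, shared); // restore the omitted prefix
      buf.get(key, shared, rest);                // copy the transmitted suffix
      keys.add(key);
      prev = key;
    }
    return keys;
  }
}
```

Each key is still copied once into a buffer of its final size on the receiving
side, which is the "same amount of memory copying" point above. The fixed
4-byte length fields are only there to keep the sketch short; with small keys
they can outweigh the savings, so a real format would presumably use something
more compact.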
> Compressed RPC's for HBase
> --------------------------
>
> Key: HBASE-5355
> URL: https://issues.apache.org/jira/browse/HBASE-5355
> Project: HBase
> Issue Type: Improvement
> Components: IPC/RPC
> Affects Versions: 0.89.20100924
> Reporter: Karthik Ranganathan
> Assignee: Karthik Ranganathan
> Attachments: HBASE-5355-0.94.patch
>
>
> Some applications need the ability to do large batched writes and reads from
> a remote MR cluster. These eventually get bottlenecked on the network. The
> results are also sometimes quite compressible.
> The aim here is to add the ability to do compressed calls to the server on
> both the send and receive paths.