[
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253995#comment-13253995
]
Lars Hofhansl commented on HBASE-5778:
--------------------------------------
This fundamentally breaks replication.
The problem above is actually that the HLogKey and WALEdit after being read
from a compressed HLog have the compression context set, and hence this will be
used to compress them when sent over the wire to the sink. Of course the sink
does not know how to decompress them.
So I just set the compression context to null in ReplicationSource.
With that hurdle out of the way, I find that seeking to a specific position in
the HLog (the position stored in ZK) does not work, because the dictionary is
not built up (a compressed HLog always needs to be read from the beginning).
Not sure how to fix the 2nd part.
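
To illustrate the second problem, here is a minimal sketch of a dictionary-coded log. It is not the HBase implementation: the class DictionaryWalSketch, its methods, and the "N:"/"R:" entry format are all made up for this example. It only shows why a reader that seeks to a stored position, and therefore starts with an empty dictionary, cannot resolve references written earlier in the file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model of a dictionary-compressed log; not HBase code. */
public class DictionaryWalSketch {

    /** Writes "N:<text>" the first time a value is seen and "R:<index>" afterwards. */
    static List<String> encode(List<String> values) {
        Map<String, Integer> dict = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String v : values) {
            Integer idx = dict.get(v);
            if (idx == null) {
                dict.put(v, dict.size());   // next free dictionary slot
                out.add("N:" + v);
            } else {
                out.add("R:" + idx);        // back-reference into the dictionary
            }
        }
        return out;
    }

    /**
     * Decodes from entry `start` with an empty dictionary, the way a reader
     * that seeks to a saved position would. Only out-of-range references are
     * detected here; an in-range reference could still resolve to the wrong
     * value, because the reader's indexes no longer match the writer's.
     */
    static List<String> decodeFrom(List<String> encoded, int start) {
        List<String> dict = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (int i = start; i < encoded.size(); i++) {
            String entry = encoded.get(i);
            String payload = entry.substring(2);
            if (entry.startsWith("N:")) {
                dict.add(payload);
                out.add(payload);
            } else {
                int idx = Integer.parseInt(payload);
                if (idx >= dict.size()) {
                    throw new IllegalStateException(
                        "dictionary entry " + idx + " was defined before the seek position");
                }
                out.add(dict.get(idx));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> log = encode(Arrays.asList("row1", "cf", "row1", "cf"));
        System.out.println(decodeFrom(log, 0));  // reading from the start works
        try {
            decodeFrom(log, 2);                  // seeking past the definitions fails
        } catch (IllegalStateException e) {
            System.out.println("seek failed: " + e.getMessage());
        }
    }
}
```

In this toy model, a reader could only resume mid-file if the dictionary state at that position were rebuilt or persisted somewhere; whether either is practical for the HLog is the open question.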
> Turn on WAL compression by default
> ----------------------------------
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
> Issue Type: Improvement
> Reporter: Jean-Daniel Cryans
> Assignee: Lars Hofhansl
> Priority: Blocker
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude
> bigger than the keys), the insert time wasn't different and CPU usage was 15%
> higher (150% CPU usage vs. 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement in insert
> run time and CPU usage was 33% higher (600% CPU usage vs. 450%). I'm not sure
> WAL compression accounts for all the additional CPU usage; it might just be
> that we're able to insert faster and we spend more time in the MemStore per
> second (because our MemStores are bad when they contain tens of thousands of
> values).
> Those are two extremes, but it shows that for the price of some CPU we can
> save a lot. My machines have two quad-core CPUs with Hyper-Threading, so I
> still had a lot of idle CPUs.