[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497443#comment-13497443 ]
stack commented on HBASE-5778:
------------------------------
This is better. Here are some comments:
Does CompressionContext class have to be public? Can it stay pkg private?
You'll have to move your new class into wal package but that seems fine to me.
Does the base Reader interface have to know about a compression context? Can
this not be internal to the implementation?
You call it ReplicationHLogReader but is it a replication only class? If so,
it does not belong in regionserver package but over in replication package.
My sense though is that this is a generally useful WAL reader? One that can do
compressed or non-compressed WAL? One that can be used by replication but also
by fellas who want to index hbase, etc.
Missing a license header.
Can it be in the wal package? Then don't have to open up so much of HLog?
It's unfortunate that you can't tell it's a compressed WAL by reading, say,
some magic or metadata at the head of the file. Consulting configuration seems
a bit broken.
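To illustrate what reading magic at the head of the file could look like, here
is a minimal sketch. It is not how HLog files are laid out today; the
WalHeader class, the magic value, and the one-byte flags field are all
hypothetical names and layout choices made up for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical header layout (an assumption, not the real HLog format):
// 4-byte magic followed by a 1-byte flags field.
final class WalHeader {
    static final int MAGIC = 0x57414C31;     // ASCII "WAL1", made up for this sketch
    static final int FLAG_COMPRESSED = 0x1;  // bit 0 of flags marks a compressed log
    static final int LENGTH = 5;             // 4 bytes of magic + 1 byte of flags

    final boolean compressed;

    private WalHeader(boolean compressed) {
        this.compressed = compressed;
    }

    // Serialize the header a writer would put at the start of the file.
    static byte[] encode(boolean compressed) {
        byte flags = (byte) (compressed ? FLAG_COMPRESSED : 0);
        return ByteBuffer.allocate(LENGTH).putInt(MAGIC).put(flags).array();
    }

    // Parse the first bytes of a file; reject anything without the magic.
    static WalHeader parse(byte[] head) {
        if (head.length < LENGTH) {
            throw new IllegalArgumentException("header too short: " + head.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(head);
        int magic = buf.getInt();
        if (magic != MAGIC) {
            throw new IllegalArgumentException(
                "bad magic: 0x" + Integer.toHexString(magic));
        }
        int flags = buf.get() & 0xFF;
        return new WalHeader((flags & FLAG_COMPRESSED) != 0);
    }
}
```

With something like this, a reader could decide per file whether to decompress,
so a log written before a config change would still be read correctly.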
Yeah, why can't an implementation of HLog.Reader manage the compression context
internally? Why does it have to be out here in this ReplicationHLogReader class?
After all, isn't the dictionary reconstructed on read? You don't save it around?
So, a HLog.ReaderFactory that looks at configuration and returns a HLog.Reader
that does either compressed or uncompressed reads?
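Roughly what I have in mind, as a sketch only. WalReader here is a cut-down
stand-in for HLog.Reader, the config key is a made-up name, and the "#N"
back-reference encoding is a toy substitute for the real dictionary scheme.
The point is that the dictionary lives inside the compressed reader, is rebuilt
as entries are read, and callers only ever see the plain interface.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Cut-down stand-in for HLog.Reader (assumption: the real interface is larger).
interface WalReader {
    /** Returns the next decoded entry, or null at the end of the log. */
    String next();
}

// Uncompressed reader: passes raw entries straight through.
final class PlainWalReader implements WalReader {
    private final Iterator<String> raw;

    PlainWalReader(Iterator<String> raw) {
        this.raw = raw;
    }

    @Override
    public String next() {
        return raw.hasNext() ? raw.next() : null;
    }
}

// Compressed reader: owns its dictionary and rebuilds it while reading.
// Toy encoding: a literal is added to the dictionary; "#N" refers to entry N.
final class CompressedWalReader implements WalReader {
    private final Iterator<String> raw;
    private final List<String> dictionary = new ArrayList<>();

    CompressedWalReader(Iterator<String> raw) {
        this.raw = raw;
    }

    @Override
    public String next() {
        if (!raw.hasNext()) {
            return null;
        }
        String token = raw.next();
        if (token.startsWith("#")) {
            // Back-reference: resolve against what has been seen so far.
            return dictionary.get(Integer.parseInt(token.substring(1)));
        }
        dictionary.add(token);
        return token;
    }
}

// Factory: the only place that consults configuration.
final class WalReaderFactory {
    // Hypothetical key name, used for this sketch only.
    static final String COMPRESSION_KEY = "example.wal.compression";

    private WalReaderFactory() {
    }

    static WalReader create(Map<String, String> conf, Iterator<String> raw) {
        boolean compressed =
            Boolean.parseBoolean(conf.getOrDefault(COMPRESSION_KEY, "false"));
        return compressed ? new CompressedWalReader(raw) : new PlainWalReader(raw);
    }
}
```

Replication, or anyone indexing hbase, would call WalReaderFactory.create(...)
and loop on next() without knowing a compression context exists.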
Is this right:
+ if (entry != null) {
+ entry.setCompressionContext(null);
> Turn on WAL compression by default
> ----------------------------------
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
> Issue Type: Improvement
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch,
> HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude
> bigger than the keys), the insert time wasn't different and the CPU usage was
> 15% higher (150% CPU usage vs. 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure
> WAL compression accounts for all the additional CPU usage, it might just be
> that we're able to insert faster and we spend more time in the MemStore per
> second (because our MemStores are bad when they contain tens of thousands of
> values).
> Those are two extremes, but it shows that for the price of some CPU we can
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle
> CPUs.