[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531371#comment-13531371 ]

stack commented on HBASE-5778:
------------------------------

On commit, fix this comment:

"+   * Get a reader for the WAL. If you are reading from a file that's being written to
+   * and need to reopen it multiple times, use {@link HLog.Reader#reset()} instead of this method
+   * then just seek back to the last known good position."

It has too much about the implementation... 

This comment on reset is good... maybe use some of it:

+    // Resetting the reader lets us see newly added data if the file is being written to
+    // We also keep the same compressionContext which was previously populated for this file
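
For anyone reading along, the reset-then-seek pattern those comments describe can be sketched roughly as below. This is only an illustration over a plain local file: TailingReader, next(), and lastGoodPosition are made-up names, not the HLog.Reader API, and the newline-terminated record format is an assumption. The point is just that reset() reopens the underlying stream while the reader keeps its own state (here the last known good position, standing in for things like the compression context).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-in for a WAL reader; NOT the real HLog.Reader API. */
final class TailingReader implements AutoCloseable {
    private final Path path;
    private RandomAccessFile in;
    // State that survives reset(): offset just past the last complete record.
    private long lastGoodPosition = 0;

    TailingReader(Path path) throws IOException {
        this.path = path;
        this.in = new RandomAccessFile(path.toFile(), "r");
    }

    /** Returns the next newline-terminated record, or null if none is complete yet. */
    String next() throws IOException {
        in.seek(lastGoodPosition);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                lastGoodPosition = in.getFilePointer();
                return new String(buf.toByteArray(), StandardCharsets.UTF_8);
            }
            buf.write(b);
        }
        // Hit end of file mid-record: do not advance, so a later call retries it.
        return null;
    }

    /** Reopens the file so newly written data is visible; keeps lastGoodPosition. */
    void reset() throws IOException {
        in.close();
        in = new RandomAccessFile(path.toFile(), "r");
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("wal-sketch", ".log");
        // Writer has flushed one full record and half of the next.
        Files.write(log, "one\ntw".getBytes(StandardCharsets.UTF_8));
        try (TailingReader reader = new TailingReader(log)) {
            System.out.println(reader.next()); // complete record
            System.out.println(reader.next()); // partial record, so null
            // Writer finishes the second record.
            Files.write(log, "o\n".getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.APPEND);
            reader.reset();
            System.out.println(reader.next()); // record is now complete
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```

On a local file the reopen is not strictly needed to see appended bytes; it is kept here only to mirror the shape of the pattern under discussion.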


Or the stuff in openReader is good too... makes sense.

+1 on commit
                
> Turn on WAL compression by default
> ----------------------------------
>
>                 Key: HBASE-5778
>                 URL: https://issues.apache.org/jira/browse/HBASE-5778
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.96.0
>
>         Attachments: 5778.addendum, 5778-addendum.txt, HBASE-5778-0.94.patch, 
> HBASE-5778-0.94-v2.patch, HBASE-5778-0.94-v3.patch, HBASE-5778-0.94-v4.patch, 
> HBASE-5778-0.94-v5.patch, HBASE-5778.patch
>
>
> I ran some tests to verify whether WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage vs. 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage vs. 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage; it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

