[ 
https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237530#comment-13237530
 ] 

Hudson commented on HBASE-5469:
-------------------------------

Integrated in HBase-TRUNK #2694 (See [https://builds.apache.org/job/HBase-TRUNK/2694/])
    [jira] [HBASE-5469] Add baseline compression efficiency to DataBlockEncodingTool

Summary:
DataBlockEncodingTool currently does not report baseline compression
efficiency, i.e. the result of applying a Hadoop compression codec to
unencoded data. For example, if we are using LZO to compress blocks, we would
like to have the following columns in the report (possibly as percentages of
raw data size):

* Baseline K+V in block cache
* Baseline K+V on disk (LZO compressed)
* K+V DataBlockEncoded in block cache
* K+V DataBlockEncoded + LZO compressed (on disk)

Background: we never store compressed blocks in cache, but we always store
encoded data blocks in cache if data block encoding is enabled for the column
family.
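
As a rough illustration of how the four report columns relate, here is a minimal Java sketch. It is not the DataBlockEncodingTool code. It assumes java.util.zip.Deflater as a stand-in for LZO (which is not in the JDK) and a toy prefix encoding as a stand-in for a real DataBlockEncoder. The class name BaselineReportSketch and all of its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.Deflater;

// Hypothetical sketch; not the DataBlockEncodingTool implementation.
class BaselineReportSketch {

    // Assumed raw layout for illustration: 1-byte length, then the key bytes.
    // Keys are assumed to be shorter than 256 bytes.
    static byte[] rawBytes(String[] keys) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String k : keys) {
            byte[] cur = k.getBytes(StandardCharsets.UTF_8);
            out.write(cur.length);
            out.write(cur, 0, cur.length);
        }
        return out.toByteArray();
    }

    // Toy prefix encoding (stand-in for a real data block encoder):
    // 1-byte shared-prefix length, 1-byte suffix length, then the suffix.
    static byte[] toyPrefixEncode(String[] sortedKeys) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] prev = new byte[0];
        for (String k : sortedKeys) {
            byte[] cur = k.getBytes(StandardCharsets.UTF_8);
            int max = Math.min(Math.min(prev.length, cur.length), 255);
            int common = 0;
            while (common < max && prev[common] == cur[common]) {
                common++;
            }
            out.write(common);
            out.write(cur.length - common);
            out.write(cur, common, cur.length - common);
            prev = cur;
        }
        return out.toByteArray();
    }

    // Compressed size in bytes using DEFLATE (assumed stand-in for LZO).
    static int compressedSize(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] buf = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buf);
        }
        deflater.end();
        return total;
    }

    // Formats a size as a percentage of the raw (unencoded, uncompressed) size.
    static String percentOfRaw(long size, long rawSize) {
        return String.format(Locale.ROOT, "%.1f%%", 100.0 * size / rawSize);
    }

    public static void main(String[] args) {
        String[] keys = new String[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = String.format(Locale.ROOT, "row%06d/cf:qualifier", i);
        }
        byte[] raw = rawBytes(keys);
        byte[] encoded = toyPrefixEncode(keys);
        long rawSize = raw.length;

        // In cache: never compressed, but encoded if encoding is enabled.
        System.out.println("Baseline K+V in block cache:       "
            + percentOfRaw(raw.length, rawSize));
        System.out.println("Baseline K+V on disk (compressed): "
            + percentOfRaw(compressedSize(raw), rawSize));
        System.out.println("K+V encoded in block cache:        "
            + percentOfRaw(encoded.length, rawSize));
        System.out.println("K+V encoded + compressed (disk):   "
            + percentOfRaw(compressedSize(encoded), rawSize));
    }
}
```

The first column is 100% of raw size by construction. Comparing the second and fourth columns shows how much the encoding adds on top of the codec alone, which is the baseline this change introduces.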

This patch also includes multiple bug fixes and improvements to
DataBlockEncodingTool: an improved presentation format, a 3x reduction in
memory requirements, and corrected handling of compression.

Test Plan:
* Run unit tests.
* Run DataBlockEncodingTool on a variety of real-world HFiles.

Reviewers: JIRA, dhruba, tedyu, stack, heyongqiang

Reviewed By: tedyu

Differential Revision: https://reviews.facebook.net/D2409 (Revision 1304626)

     Result = SUCCESS
mbautin :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java

                
> Add baseline compression efficiency to DataBlockEncodingTool
> ------------------------------------------------------------
>
>                 Key: HBASE-5469
>                 URL: https://issues.apache.org/jira/browse/HBASE-5469
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Mikhail Bautin
>            Assignee: Mikhail Bautin
>            Priority: Minor
>         Attachments: D2409.1.patch, D2409.2.patch, 
> jira-HBASE-5469-Add-baseline-compression-efficiency--2012-03-23_15_04_41.patch
>
>
> DataBlockEncodingTool currently does not provide baseline compression 
> efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if 
> we are using LZO to compress blocks, we would like to have the following 
> columns in the report (possibly as percentages of raw data size).
> Baseline K+V in block cache | Baseline K+V on disk (LZO compressed) |
> K+V DataBlockEncoded in block cache |
> K+V DataBlockEncoded + LZO compressed (on disk)
> Background: we never store compressed blocks in cache, but we always store 
> encoded data blocks in cache if data block encoding is enabled for the column 
> family.


        
