[jira] [Updated] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool

2012-03-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5469:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk.

 Add baseline compression efficiency to DataBlockEncodingTool
 

 Key: HBASE-5469
 URL: https://issues.apache.org/jira/browse/HBASE-5469
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
 Attachments: D2409.1.patch, D2409.2.patch, 
 jira-HBASE-5469-Add-baseline-compression-efficiency--2012-03-23_15_04_41.patch


 DataBlockEncodingTool currently does not report baseline compression 
 efficiency, i.e. the result of applying a Hadoop compression codec to 
 unencoded data. For example, if we are using LZO to compress blocks, we 
 would like to have the following columns in the report (possibly as 
 percentages of raw data size):
   * Baseline K+V in block cache
   * Baseline K+V on disk (LZO compressed)
   * K+V DataBlockEncoded in block cache
   * K+V DataBlockEncoded + LZO compressed (on disk)
 Background: we never store compressed blocks in cache, but we always store 
 encoded data blocks in cache if data block encoding is enabled for the column 
 family.
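
 The four report columns described above can be illustrated with a small
 standalone sketch. This is not the DataBlockEncodingTool code and uses no
 HBase classes: the class and method names are hypothetical, java.util.zip's
 Deflater stands in for LZO (which is not part of the JDK), and a toy prefix
 encoding stands in for a real data block encoder. It only shows how each
 size could be expressed as a percentage of the raw data size.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical sketch; none of these names come from the HBase code base.
class BaselineReportSketch {

  // Assumed raw layout: one length byte followed by the cell bytes.
  static byte[] rawBytes(String[] cells) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (String cell : cells) {
      byte[] b = cell.getBytes(StandardCharsets.UTF_8);
      out.write(b.length); // assumes every cell is shorter than 256 bytes
      out.write(b, 0, b.length);
    }
    return out.toByteArray();
  }

  // Toy prefix encoding (stand-in for a data block encoder): each cell is
  // stored as (shared prefix length, suffix length, suffix bytes).
  static byte[] prefixEncode(String[] cells) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] prev = new byte[0];
    for (String cell : cells) {
      byte[] cur = cell.getBytes(StandardCharsets.UTF_8);
      int max = Math.min(prev.length, cur.length);
      int common = 0;
      while (common < max && prev[common] == cur[common]) {
        common++;
      }
      out.write(common);
      out.write(cur.length - common);
      out.write(cur, common, cur.length - common);
      prev = cur;
    }
    return out.toByteArray();
  }

  // Compressed size in bytes, using Deflater as a stand-in for LZO.
  static int compressedSize(byte[] data) {
    Deflater deflater = new Deflater();
    deflater.setInput(data);
    deflater.finish();
    byte[] buf = new byte[4096];
    int total = 0;
    while (!deflater.finished()) {
      total += deflater.deflate(buf);
    }
    deflater.end();
    return total;
  }

  static double percentOfRaw(long size, long rawSize) {
    return 100.0 * size / rawSize;
  }

  // Builds one report line with the four columns from the issue description.
  static String report(String[] cells) {
    byte[] raw = rawBytes(cells);
    byte[] encoded = prefixEncode(cells);
    int n = raw.length;
    return String.format(
        "baseline in cache: %.1f%% | baseline on disk: %.1f%% | "
            + "encoded in cache: %.1f%% | encoded on disk: %.1f%%",
        percentOfRaw(n, n),
        percentOfRaw(compressedSize(raw), n),
        percentOfRaw(encoded.length, n),
        percentOfRaw(compressedSize(encoded), n));
  }

  public static void main(String[] args) {
    String[] cells = {"row0001/cf:a=1", "row0001/cf:b=2", "row0002/cf:a=3"};
    System.out.println(report(cells));
  }
}
```

 The "in cache" columns use uncompressed sizes and the "on disk" columns use
 compressed sizes, mirroring the background note that compressed blocks are
 never cached while encoded blocks are.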

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool

2012-03-23 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5469:
--

Attachment: 
jira-HBASE-5469-Add-baseline-compression-efficiency--2012-03-23_15_04_41.patch

The exact patch that was committed.





[jira] [Updated] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool

2012-03-22 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5469:
---

Attachment: D2409.2.patch

mbautin updated the revision [jira] [HBASE-5469] Add baseline compression 
efficiency to DataBlockEncodingTool.
Reviewers: JIRA, dhruba, tedyu, stack, heyongqiang

  Addressing Ted's comments.

REVISION DETAIL
  https://reviews.facebook.net/D2409

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
  src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
  src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java







[jira] [Updated] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool

2012-03-21 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5469:
---

Attachment: D2409.1.patch

mbautin requested code review of [jira] [HBASE-5469] Add baseline compression 
efficiency to DataBlockEncodingTool.
Reviewers: JIRA, dhruba, tedyu, stack

  DataBlockEncodingTool currently does not report baseline compression
  efficiency, i.e. the result of applying a Hadoop compression codec to
  unencoded data. For example, if we are using LZO to compress blocks, we
  would like to have the following columns in the report (possibly as
  percentages of raw data size):

  * Baseline K+V in block cache
  * Baseline K+V on disk (LZO compressed)
  * K+V DataBlockEncoded in block cache
  * K+V DataBlockEncoded + LZO compressed (on disk)

  Background: we never store compressed blocks in cache, but we always store
  encoded data blocks in cache if data block encoding is enabled for the column
  family.

TEST PLAN
  * Run unit tests.
  * Run DataBlockEncodingTool on a variety of real-world HFiles.

REVISION DETAIL
  https://reviews.facebook.net/D2409

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java
  src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
  src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java
  src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java



