[jira] [Updated] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5469: -- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Add baseline compression efficiency to DataBlockEncodingTool Key: HBASE-5469 URL: https://issues.apache.org/jira/browse/HBASE-5469 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor Attachments: D2409.1.patch, D2409.2.patch, jira-HBASE-5469-Add-baseline-compression-efficiency--2012-03-23_15_04_41.patch DataBlockEncodingTool currently does not provide baseline compression efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if we are using LZO to compress blocks, we would like to have the following columns in the report (possibly as percentages of raw data size). Baseline K+V in blockcache | Baseline K + V on disk (LZO compressed) | K + V DataBlockEncoded in block cache | K + V DataBlockEncoded + LZOCompressed (on disk) Background: we never store compressed blocks in cache, but we always store encoded data blocks in cache if data block encoding is enabled for the column family. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Bautin updated HBASE-5469: -- Attachment: jira-HBASE-5469-Add-baseline-compression-efficiency--2012-03-23_15_04_41.patch The exact patch that was committed. Add baseline compression efficiency to DataBlockEncodingTool Key: HBASE-5469 URL: https://issues.apache.org/jira/browse/HBASE-5469 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor Attachments: D2409.1.patch, D2409.2.patch, jira-HBASE-5469-Add-baseline-compression-efficiency--2012-03-23_15_04_41.patch DataBlockEncodingTool currently does not provide baseline compression efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if we are using LZO to compress blocks, we would like to have the following columns in the report (possibly as percentages of raw data size). Baseline K+V in blockcache | Baseline K + V on disk (LZO compressed) | K + V DataBlockEncoded in block cache | K + V DataBlockEncoded + LZOCompressed (on disk) Background: we never store compressed blocks in cache, but we always store encoded data blocks in cache if data block encoding is enabled for the column family. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5469: --- Attachment: D2409.2.patch mbautin updated the revision [jira] [HBASE-5469] Add baseline compression efficiency to DataBlockEncodingTool. Reviewers: JIRA, dhruba, tedyu, stack, heyongqiang Addressing Ted's comments. REVISION DETAIL https://reviews.facebook.net/D2409 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java Add baseline compression efficiency to DataBlockEncodingTool Key: HBASE-5469 URL: https://issues.apache.org/jira/browse/HBASE-5469 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor Attachments: D2409.1.patch, D2409.2.patch DataBlockEncodingTool currently does not provide baseline compression efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if we are using LZO to compress blocks, we would like to have the following columns in the report (possibly as percentages of raw data size). Baseline K+V in blockcache | Baseline K + V on disk (LZO compressed) | K + V DataBlockEncoded in block cache | K + V DataBlockEncoded + LZOCompressed (on disk) Background: we never store compressed blocks in cache, but we always store encoded data blocks in cache if data block encoding is enabled for the column family. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5469) Add baseline compression efficiency to DataBlockEncodingTool
[ https://issues.apache.org/jira/browse/HBASE-5469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HBASE-5469: --- Attachment: D2409.1.patch mbautin requested code review of [jira] [HBASE-5469] Add baseline compression efficiency to DataBlockEncodingTool. Reviewers: JIRA, dhruba, tedyu, stack DataBlockEncodingTool currently does not provide baseline compression efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if we are using LZO to compress blocks, we would like to have the following columns in the report (possibly as percentages of raw data size). Baseline K+V in blockcache | Baseline K + V on disk (LZO compressed) | K + V DataBlockEncoded in block cache | K + V DataBlockEncoded + LZOCompressed (on disk) Background: we never store compressed blocks in cache, but we always store encoded data blocks in cache if data block encoding is enabled for the column family. TEST PLAN * Run unit tests. * Run DataBlockEncodingTool on a variety of real-world HFiles. REVISION DETAIL https://reviews.facebook.net/D2409 AFFECTED FILES src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/DiffKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/EncodedDataBlock.java src/main/java/org/apache/hadoop/hbase/io/encoding/FastDiffDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/encoding/PrefixKeyDeltaEncoder.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java src/main/java/org/apache/hadoop/hbase/io/hfile/HFileDataBlockEncoderImpl.java src/test/java/org/apache/hadoop/hbase/io/encoding/TestDataBlockEncoders.java src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java src/test/java/org/apache/hadoop/hbase/regionserver/DataBlockEncodingTool.java src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java MANAGE HERALD DIFFERENTIAL RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/5403/ Tip: use the X-Herald-Rules header to filter Herald messages in your client. Add baseline compression efficiency to DataBlockEncodingTool Key: HBASE-5469 URL: https://issues.apache.org/jira/browse/HBASE-5469 Project: HBase Issue Type: Improvement Reporter: Mikhail Bautin Assignee: Mikhail Bautin Priority: Minor Attachments: D2409.1.patch DataBlockEncodingTool currently does not provide baseline compression efficiency, e.g. Hadoop compression codec applied to unencoded data. E.g. if we are using LZO to compress blocks, we would like to have the following columns in the report (possibly as percentages of raw data size). Baseline K+V in blockcache | Baseline K + V on disk (LZO compressed) | K + V DataBlockEncoded in block cache | K + V DataBlockEncoded + LZOCompressed (on disk) Background: we never store compressed blocks in cache, but we always store encoded data blocks in cache if data block encoding is enabled for the column family. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira