[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HBASE-6371:
------------------------------

    Attachment: (was: HBase_Tier_Base_Compaction.pdf)

[89-fb] Tier based compaction
-----------------------------

        Key: HBASE-6371
        URL: https://issues.apache.org/jira/browse/HBASE-6371
    Project: HBase
 Issue Type: Improvement
   Reporter: Akashnil
   Assignee: Liyin Tang
     Labels: noob
Attachments: HBASE-6371-089fb-commit.patch, HBase_Tier_Base_Compaction.pdf

Currently, compaction selection is not very flexible and is not sensitive to the hotness of the data. Very old data is likely to be accessed less, and very recent data is likely to be in the block cache. Both considerations make it inefficient to compact these files as aggressively as other files. In some use cases the access pattern is particularly obvious, yet there is no way to tune the compaction algorithm for it.

In the new compaction selection algorithm, we plan to divide the candidate files into different levels according to the age of the data in those files. Each level may have its own parameters, such as the compaction ratio and the minimum number of store files per compaction. The number of levels, the time ranges, and the parameters for each level will be configurable online on a per-column-family basis.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
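To make the proposal in the description more concrete, here is a minimal sketch of what age-based selection with per-level parameters could look like. It is not taken from the attached patch or design doc. The names `StoreFile`, `Tier`, `tier_for` and `select_for_compaction` are hypothetical, the example thresholds are invented, and the ratio test is only loosely modeled on a generic ratio-based selection.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoreFile:
    name: str
    size: int            # bytes
    age_seconds: float   # assumed: age of the data held in the file


@dataclass
class Tier:
    max_age_seconds: float   # files at most this old belong to the tier
    compaction_ratio: float  # higher ratio means more aggressive compaction
    min_files: int           # minimum number of store files per compaction


def tier_for(store_file: StoreFile, tiers: List[Tier]) -> Tier:
    """Return the first tier whose age bound covers the file.

    Assumes `tiers` is sorted by ascending max_age_seconds; files older
    than every bound fall into the last tier.
    """
    for tier in tiers:
        if store_file.age_seconds <= tier.max_age_seconds:
            return tier
    return tiers[-1]


def select_for_compaction(files: List[StoreFile], tiers: List[Tier]) -> List[StoreFile]:
    """Pick a run of files to compact, using each file's tier parameters.

    Walks the files from oldest to newest. A file starts the selection
    when its size is at most ratio * (total size of all newer files) and
    the resulting run has at least min_files entries, where ratio and
    min_files come from that file's tier.
    """
    ordered = sorted(files, key=lambda f: f.age_seconds, reverse=True)
    for start, candidate in enumerate(ordered):
        tier = tier_for(candidate, tiers)
        newer_total = sum(f.size for f in ordered[start + 1:])
        selection = ordered[start:]
        if (candidate.size <= tier.compaction_ratio * newer_total
                and len(selection) >= tier.min_files):
            return selection
    return []


if __name__ == "__main__":
    # Invented example settings: compact very recent and very old data
    # less aggressively (lower ratio) than data in between.
    tiers = [
        Tier(max_age_seconds=3600, compaction_ratio=0.5, min_files=4),
        Tier(max_age_seconds=86400, compaction_ratio=1.2, min_files=3),
        Tier(max_age_seconds=float("inf"), compaction_ratio=0.25, min_files=2),
    ]
    files = [
        StoreFile("old", 1000, 604800),
        StoreFile("mid1", 100, 7200),
        StoreFile("mid2", 90, 5400),
        StoreFile("mid3", 80, 4000),
        StoreFile("new", 10, 60),
    ]
    print([f.name for f in select_for_compaction(files, tiers)])
```

In this sketch the large, week-old file fails the low ratio of the oldest tier and is left alone, while the mid-aged files pass the more permissive ratio of the middle tier. In a per-column-family setup, each family would presumably carry its own list of tiers.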
[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HBASE-6371:
------------------------------

    Attachment: HBase_Tier_Base_Compaction.pdf
[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HBASE-6371:
------------------------------

    Attachment: HBase_Tier_Base_Compaction.pdf

The design doc for HBase Tier-based Compaction from Akashnil.
[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin updated HBASE-6371:
-----------------------------------

    Attachment: HBASE-6371-089fb-commit.patch

I am attaching the 0.89fb commit for reference. The commit hash is 1b3e7bb4df1ed05d7d268cb90ffc23f5955c4398.
[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction
[ https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HBASE-6371:
------------------------------

    Summary: [89-fb] Tier based compaction (was: [89-fb] Level based compaction)