[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction

2012-11-06 Thread Liyin Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-6371:
--

Attachment: (was: HBase_Tier_Base_Compaction.pdf)

 [89-fb] Tier based compaction
 -

 Key: HBASE-6371
 URL: https://issues.apache.org/jira/browse/HBASE-6371
 Project: HBase
  Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
  Labels: noob
 Attachments: HBASE-6371-089fb-commit.patch, 
 HBase_Tier_Base_Compaction.pdf


 Currently, compaction selection is inflexible and is not sensitive to the 
 hotness of the data. Very old data is likely to be accessed less often, and 
 very recent data is likely to be in the block cache. For both reasons, it is 
 inefficient to compact these files as aggressively as other files. In some 
 use cases the access pattern is obvious, yet there is currently no way to 
 tune the compaction algorithm to match it.
 In the new compaction selection algorithm, we plan to divide the candidate 
 files into different levels according to the age of the data in those files. 
 Each level may have its own parameters, such as the compaction ratio and the 
 minimum number of store files per compaction. The number of levels, their 
 time ranges, and the per-level parameters will be configurable online on a 
 per-column-family basis.
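
 The selection idea described above can be illustrated with a minimal Java 
 sketch. It is not the code from the attached patch: the class, field, and 
 method names (Tier, StoreFile, tierOf, selectInTier, select) are hypothetical, 
 the ratio check is a simplified size-ratio rule, and visiting the most recent 
 level first is an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TierSelectionSketch {

    /** Hypothetical per-level settings; names are illustrative, not HBase config keys. */
    static final class Tier {
        final long maxAgeMillis;      // upper age bound of this level
        final double compactionRatio; // ratio used by the size check below
        final int minFiles;           // minimum number of store files per compaction

        Tier(long maxAgeMillis, double compactionRatio, int minFiles) {
            this.maxAgeMillis = maxAgeMillis;
            this.compactionRatio = compactionRatio;
            this.minFiles = minFiles;
        }
    }

    /** Hypothetical stand-in for a store file: a name, a size, and the age of its data. */
    static final class StoreFile {
        final String name;
        final long sizeBytes;
        final long ageMillis;

        StoreFile(String name, long sizeBytes, long ageMillis) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.ageMillis = ageMillis;
        }
    }

    /** Index of the first level whose age bound covers the file (levels sorted by maxAgeMillis). */
    static int tierOf(StoreFile f, List<Tier> tiers) {
        for (int i = 0; i < tiers.size(); i++) {
            if (f.ageMillis <= tiers.get(i).maxAgeMillis) {
                return i;
            }
        }
        return tiers.size() - 1; // older than every bound: use the last level
    }

    /** Simplified ratio rule: skip leading old files that are too large relative to the newer ones. */
    static List<StoreFile> selectInTier(List<StoreFile> bucket, Tier tier) {
        List<StoreFile> sorted = new ArrayList<>(bucket);
        sorted.sort((a, b) -> Long.compare(b.ageMillis, a.ageMillis)); // oldest first
        int start = 0;
        while (start < sorted.size()) {
            long newerTotal = 0;
            for (int j = start + 1; j < sorted.size(); j++) {
                newerTotal += sorted.get(j).sizeBytes;
            }
            if (sorted.get(start).sizeBytes <= tier.compactionRatio * newerTotal) {
                break;
            }
            start++;
        }
        List<StoreFile> picked = new ArrayList<>(sorted.subList(start, sorted.size()));
        return picked.size() >= tier.minFiles ? picked : new ArrayList<>();
    }

    /** Bucket files by level, then return the selection of the first level that yields one. */
    static List<StoreFile> select(List<StoreFile> files, List<Tier> tiers) {
        List<List<StoreFile>> buckets = new ArrayList<>();
        for (int i = 0; i < tiers.size(); i++) {
            buckets.add(new ArrayList<>());
        }
        for (StoreFile f : files) {
            buckets.get(tierOf(f, tiers)).add(f);
        }
        for (int i = 0; i < tiers.size(); i++) {
            List<StoreFile> picked = selectInTier(buckets.get(i), tiers.get(i));
            if (!picked.isEmpty()) {
                return picked;
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        List<Tier> tiers = Arrays.asList(
            new Tier(3_600_000L, 1.2, 3),       // data up to one hour old
            new Tier(Long.MAX_VALUE, 0.5, 4));  // everything older, compacted less eagerly
        List<StoreFile> files = Arrays.asList(
            new StoreFile("a", 100, 600_000L),
            new StoreFile("b", 100, 1_200_000L),
            new StoreFile("e", 90, 300_000L),
            new StoreFile("c", 1000, 172_800_000L),
            new StoreFile("d", 900, 259_200_000L));
        for (StoreFile f : select(files, tiers)) {
            System.out.println(f.name);
        }
    }
}
```

 With these example values, the three recent files pass the ratio check and 
 meet the minimum count of their level, while the two older files are left 
 alone because their level uses a lower ratio and a higher minimum file count.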

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction

2012-11-06 Thread Liyin Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-6371:
--

Attachment: HBase_Tier_Base_Compaction.pdf

 [89-fb] Tier based compaction
 -

 Key: HBASE-6371
 URL: https://issues.apache.org/jira/browse/HBASE-6371
 Project: HBase
  Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
  Labels: noob
 Attachments: HBASE-6371-089fb-commit.patch, 
 HBase_Tier_Base_Compaction.pdf




[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction

2012-11-05 Thread Liyin Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-6371:
--

Attachment: HBase_Tier_Base_Compaction.pdf

The design doc for HBase Tier-based Compaction from Akashnil.

 [89-fb] Tier based compaction
 -

 Key: HBASE-6371
 URL: https://issues.apache.org/jira/browse/HBASE-6371
 Project: HBase
  Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
  Labels: noob
 Attachments: HBASE-6371-089fb-commit.patch, 
 HBase_Tier_Base_Compaction.pdf




[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction

2012-10-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-6371:


Attachment: HBASE-6371-089fb-commit.patch

I am attaching the 0.89fb commit for reference.
The commit hash is 1b3e7bb4df1ed05d7d268cb90ffc23f5955c4398 

 [89-fb] Tier based compaction
 -

 Key: HBASE-6371
 URL: https://issues.apache.org/jira/browse/HBASE-6371
 Project: HBase
  Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
  Labels: noob
 Attachments: HBASE-6371-089fb-commit.patch




[jira] [Updated] (HBASE-6371) [89-fb] Tier based compaction

2012-10-14 Thread Liyin Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HBASE-6371:
--

Summary: [89-fb] Tier based compaction  (was: [89-fb] Level based 
compaction)

 [89-fb] Tier based compaction
 -

 Key: HBASE-6371
 URL: https://issues.apache.org/jira/browse/HBASE-6371
 Project: HBase
  Issue Type: Improvement
Reporter: Akashnil
Assignee: Liyin Tang
  Labels: noob
