[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Hu updated HBASE-15181:
-----------------------------
Release Note:
Date tiered compaction policy is a date-aware store file layout that is beneficial for time-range scans of time-series data.

When it performs well:
  reads for limited time ranges, especially scans of recent data

When it doesn't perform as well:
  random gets without a time range
  frequent deletes and updates
  out-of-order data writes, especially writes with timestamps in the future
  bulk loads of historical data

Recommended configuration:

To turn on Date Tiered Compaction (it is not recommended to turn it on for the whole cluster, because that would also apply it to the meta table, and random gets on the meta table would be impacted):
  hbase.hstore.compaction.compaction.policy: org.apache.hadoop.hbase.regionserver.compactions.DateTieredCompactionPolicy

Parameters for Date Tiered Compaction:
  hbase.hstore.compaction.date.tiered.max.storefile.age.millis: files whose max timestamp is older than this age will no longer be compacted. Default is Long.MAX_VALUE.
  hbase.hstore.compaction.date.tiered.base.window.millis: base window size in milliseconds. Default is 6 hours.
  hbase.hstore.compaction.date.tiered.windows.per.tier: number of windows per tier. Default is 4.
  hbase.hstore.compaction.date.tiered.incoming.window.min: minimum number of files to compact in the incoming window. Set it to the expected number of files in the window to avoid wasteful compaction. Default is 6.
  hbase.hstore.compaction.date.tiered.window.policy.class: the policy used to select store files within the same time window. It does not apply to the incoming window. Default is exploring compaction. This is to avoid wasteful compaction.

With tiered compaction, all servers in the cluster promote windows to a higher tier at the same time, so using a compaction throttle is recommended:
  hbase.regionserver.throughput.controller: org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController

Because there will most likely be more store files around, the configuration needs to be adjusted so that flushes are not blocked and compaction is properly throttled:
  hbase.hstore.blockingStoreFiles: change to 50 if using all default parameters when turning on date tiered compaction. Use 1.5~2 x projected file count if changing the parameters.
  Projected file count = windows per tier x tier count + incoming window min + files older than max age

For more details, please refer to the design spec at
https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit#

was:

> A simple implementation of date based tiered compaction
> -------------------------------------------------------
>
>          Key: HBASE-15181
>          URL: https://issues.apache.org/jira/browse/HBASE-15181
>      Project: HBase
>   Issue Type: New Feature
>   Components: Compaction
>     Reporter: Clara Xiong
>     Assignee: Clara Xiong
>      Fix For: 2.0.0, 1.3.0, 0.98.18
>
>  Attachments: HBASE-15181-0.98-ADD.patch, HBASE-15181-0.98.patch,
>               HBASE-15181-0.98.v4.patch, HBASE-15181-98.patch, HBASE-15181-ADD.patch,
>               HBASE-15181-branch-1.patch, HBASE-15181-master-v1.patch,
>               HBASE-15181-master-v2.patch, HBASE-15181-master-v3.patch,
>               HBASE-15181-master-v4.patch, HBASE-15181-v1.patch, HBASE-15181-v2.patch
>
> This is a simple implementation of date-based tiered compaction similar to
> Cassandra's, with the following benefits:
> 1. Improve date-range-based scans by structuring store files in a date-based
> tiered layout.
> 2. Reduce compaction overhead.
> 3. Improve TTL efficiency.
> Perfect fit for use cases that:
> 1. have mostly date-based data writes and scans, with a focus on the most
> recent data.
> 2. never or rarely delete data.
> Out-of-order writes are handled gracefully. Time range overlapping among
> store files is tolerated and the performance impact is minimized.
> Configuration can be set in hbase-site.xml or overridden at the per-table or
> per-column-family level via the hbase shell.
> Design spec is at
> https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing
> Results from our production are at
> https://docs.google.com/document/d/1GqRtQZMMkTEWOijZc8UCTqhACNmdxBSjtAQSYIWsmGU/edit#

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
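The sizing rule for hbase.hstore.blockingStoreFiles in the release note is plain arithmetic, so a small sketch may help when the parameters are changed. The class and method names below are hypothetical and not part of HBase. The windows-per-tier and incoming-window-min values are the defaults quoted in the release note, while the tier count of 4 and the zero files older than max age are assumed values chosen only for illustration.

```java
// Sketch only: class and method names are hypothetical, not part of HBase.
final class BlockingStoreFilesEstimate {

    /** Projected file count = windows per tier x tier count + incoming window min + files older than max age. */
    static long projectedFileCount(int windowsPerTier, int tierCount,
                                   int incomingWindowMin, int filesOlderThanMaxAge) {
        return (long) windowsPerTier * tierCount + incomingWindowMin + filesOlderThanMaxAge;
    }

    /** Returns {low, high} = {ceil(1.5 x projected), 2 x projected}, the range suggested in the release note. */
    static long[] suggestedBlockingStoreFiles(long projected) {
        long low = (3 * projected + 1) / 2; // ceil(1.5 * projected) for non-negative values
        long high = 2 * projected;
        return new long[] {low, high};
    }

    public static void main(String[] args) {
        // windowsPerTier = 4 and incomingWindowMin = 6 are the release-note defaults;
        // tierCount = 4 and filesOlderThanMaxAge = 0 are assumptions for this example.
        long projected = projectedFileCount(4, 4, 6, 0);
        long[] range = suggestedBlockingStoreFiles(projected);
        System.out.println("projected=" + projected
                + " blockingStoreFiles=" + range[0] + ".." + range[1]);
    }
}
```

The actual tier count depends on the data's time span and the max store file age, so the inputs should come from the table being tuned rather than from this example.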
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Hu updated HBASE-15181:
-----------------------------
Release Note: (was: the Date Tiered Compaction release note, identical to the text in the preceding message)
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
Release Note:
Date tiered compaction policy is a date-aware store file layout that is beneficial for time-range scans of time-series data.

When it performs well:
  reads for limited time ranges, especially scans of recent data

When it doesn't perform as well:
  random gets without a time range
  frequent deletes and updates
  out-of-order data writes, especially writes with timestamps in the future
  bulk loads of historical data

Recommended configuration:

To turn on Date Tiered Compaction (it is not recommended to turn it on for the whole cluster, because that would also apply it to the meta table, and random gets on the meta table would be impacted):
  hbase.hstore.compaction.compaction.policy: org.apache.hadoop.hbase.regionserver.compactions.DateTieredCompactionPolicy

Parameters for Date Tiered Compaction:
  hbase.hstore.compaction.date.tiered.max.storefile.age.millis: files whose max timestamp is older than this age will no longer be compacted. Default is Long.MAX_VALUE.
  hbase.hstore.compaction.date.tiered.base.window.millis: base window size in milliseconds. Default is 6 hours.
  hbase.hstore.compaction.date.tiered.windows.per.tier: number of windows per tier. Default is 4.
  hbase.hstore.compaction.date.tiered.incoming.window.min: minimum number of files to compact in the incoming window. Set it to the expected number of files in the window to avoid wasteful compaction. Default is 6.
  hbase.hstore.compaction.date.tiered.window.policy.class: the policy used to select store files within the same time window. It does not apply to the incoming window. Default is exploring compaction. This is to avoid wasteful compaction.

With tiered compaction, all servers in the cluster promote windows to a higher tier at the same time, so using a compaction throttle is recommended:
  hbase.regionserver.throughput.controller: org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController

Because there will most likely be more store files around, the configuration needs to be adjusted so that flushes are not blocked and compaction is properly throttled:
  hbase.hstore.blockingStoreFiles: change to 50 if using all default parameters when turning on date tiered compaction. Use 1.5~2 x projected file count if changing the parameters.
  Projected file count = windows per tier x tier count + incoming window min + files older than max age

For more details, please refer to the design spec at
https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit#

was:
Date tiered compaction policy is a date-aware store file layout that is beneficial for time-range scans of time-series data.

When it performs well:
  reads for limited time ranges, especially scans of recent data

When it doesn't perform as well:
  random gets without a time range
  frequent deletes and updates
  out-of-order data writes, especially writes with timestamps in the future
  bulk loads of historical data

Recommended configuration:

To turn on Date Tiered Compaction:
  hbase.hstore.compaction.compaction.policy: org.apache.hadoop.hbase.regionserver.compactions.DateTieredCompactionPolicy

Parameters for Date Tiered Compaction:
  hbase.hstore.compaction.date.tiered.max.storefile.age.millis: files whose max timestamp is older than this age will no longer be compacted. Default is Long.MAX_VALUE.
  hbase.hstore.compaction.date.tiered.base.window.millis: base window size in milliseconds. Default is 6 hours.
  hbase.hstore.compaction.date.tiered.windows.per.tier: number of windows per tier. Default is 4.
  hbase.hstore.compaction.date.tiered.incoming.window.min: minimum number of files to compact in the incoming window. Set it to the expected number of files in the window to avoid wasteful compaction. Default is 6.
  hbase.hstore.compaction.date.tiered.window.policy.class: the policy used to select store files within the same time window. It does not apply to the incoming window. Default is exploring compaction. This is to avoid wasteful compaction.

With tiered compaction, all servers in the cluster promote windows to a higher tier at the same time, so using a compaction throttle is recommended:
  hbase.regionserver.throughput.controller: org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController

Because there will most likely be more store files around, the configuration needs to be adjusted so that flushes are not blocked and compaction is properly throttled:
  hbase.hstore.blockingStoreFiles: change to 50 if using all default parameters when turning on date tiered compaction. Use 1.5~2 x projected file count if changing the parameters.
  Projected file count = windows per tier x tier count + incoming window min + files older than max age

For more details, please refer to the design spec at
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-15181:
-----------------------------------
Fix Version/s: (was: 0.98.19)
               0.98.18
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
Description:
This is a simple implementation of date-based tiered compaction similar to Cassandra's, with the following benefits:
1. Improve date-range-based scans by structuring store files in a date-based tiered layout.
2. Reduce compaction overhead.
3. Improve TTL efficiency.

Perfect fit for use cases that:
1. have mostly date-based data writes and scans, with a focus on the most recent data.
2. never or rarely delete data.

Out-of-order writes are handled gracefully. Time range overlapping among store files is tolerated and the performance impact is minimized.

Configuration can be set in hbase-site.xml or overridden at the per-table or per-column-family level via the hbase shell.

Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing

Results from our production are at https://docs.google.com/document/d/1GqRtQZMMkTEWOijZc8UCTqhACNmdxBSjtAQSYIWsmGU/edit#

was:
This is a simple implementation of date-based tiered compaction similar to Cassandra's, with the following benefits:
1. Improve date-range-based scans by structuring store files in a date-based tiered layout.
2. Reduce compaction overhead.
3. Improve TTL efficiency.

Perfect fit for use cases that:
1. have mostly date-based data writes and scans, with a focus on the most recent data.
2. never or rarely delete data.

Out-of-order writes are handled gracefully. Time range overlapping among store files is tolerated and the performance impact is minimized.

Configuration can be set in hbase-site.xml or overridden at the per-table or per-column-family level via the hbase shell.

Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing
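The first benefit listed in the description, faster date-range scans, comes from being able to skip store files whose timestamp range cannot contain any requested cell. The following is a minimal sketch of that idea only, not HBase's implementation: StoreFileRange and filesToRead are hypothetical names, and the file names and timestamps are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: these types are hypothetical stand-ins, not HBase classes.
final class TimeRangePruning {

    /** A store file's name plus the inclusive [minTs, maxTs] range of cell timestamps it holds. */
    static final class StoreFileRange {
        final String name;
        final long minTs;
        final long maxTs;

        StoreFileRange(String name, long minTs, long maxTs) {
            this.name = name;
            this.minTs = minTs;
            this.maxTs = maxTs;
        }
    }

    /** Returns the names of files that overlap the half-open scan range [scanStart, scanEnd). */
    static List<String> filesToRead(List<StoreFileRange> files, long scanStart, long scanEnd) {
        List<String> selected = new ArrayList<>();
        for (StoreFileRange f : files) {
            // A file can be skipped when all of its cells are older or newer than the scan range.
            if (f.maxTs >= scanStart && f.minTs < scanEnd) {
                selected.add(f.name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<StoreFileRange> files = Arrays.asList(
                new StoreFileRange("old", 0L, 99L),
                new StoreFileRange("mid", 100L, 199L),
                new StoreFileRange("recent", 200L, 299L));
        // A scan of recent data only needs the newest file in this layout.
        System.out.println(filesToRead(files, 250L, 300L));
    }
}
```

When files are laid out by date with little overlap, a narrow scan touches few files. With heavily overlapping ranges, such as after out-of-order writes or historical bulk loads, more files pass the overlap check and the benefit shrinks.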
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Latham updated HBASE-15181:
--------------------------------
Fix Version/s: (was: 1.4.0)
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Latham updated HBASE-15181:
--------------------------------
Release Note:
Date tiered compaction policy is a date-aware store file layout that is beneficial for time-range scans of time-series data.

When it performs well:
  reads for limited time ranges, especially scans of recent data

When it doesn't perform as well:
  random gets without a time range
  frequent deletes and updates
  out-of-order data writes, especially writes with timestamps in the future
  bulk loads of historical data

Recommended configuration:

To turn on Date Tiered Compaction:
  hbase.hstore.compaction.compaction.policy: org.apache.hadoop.hbase.regionserver.compactions.DateTieredCompactionPolicy

Parameters for Date Tiered Compaction:
  hbase.hstore.compaction.date.tiered.max.storefile.age.millis: files whose max timestamp is older than this age will no longer be compacted. Default is Long.MAX_VALUE.
  hbase.hstore.compaction.date.tiered.base.window.millis: base window size in milliseconds. Default is 6 hours.
  hbase.hstore.compaction.date.tiered.windows.per.tier: number of windows per tier. Default is 4.
  hbase.hstore.compaction.date.tiered.incoming.window.min: minimum number of files to compact in the incoming window. Set it to the expected number of files in the window to avoid wasteful compaction. Default is 6.
  hbase.hstore.compaction.date.tiered.window.policy.class: the policy used to select store files within the same time window. It does not apply to the incoming window. Default is exploring compaction. This is to avoid wasteful compaction.

With tiered compaction, all servers in the cluster promote windows to a higher tier at the same time, so using a compaction throttle is recommended:
  hbase.regionserver.throughput.controller: org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController

Because there will most likely be more store files around, the configuration needs to be adjusted so that flushes are not blocked and compaction is properly throttled:
  hbase.hstore.blockingStoreFiles: change to 50 if using all default parameters when turning on date tiered compaction. Use 1.5~2 x projected file count if changing the parameters.
  Projected file count = windows per tier x tier count + incoming window min + files older than max age

For more details, please refer to the design spec at
https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit#
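To make the base.window.millis and windows.per.tier parameters more concrete, here is a rough sketch of how tiered window sizes could be derived. It assumes that each tier's window is windows-per-tier times larger than the window of the tier below it, which the release note does not state explicitly, and the class and method names are hypothetical. The design spec linked above remains the reference for the real windowing rules.

```java
// Sketch only: names are hypothetical, and the growth rule is an assumption,
// not something confirmed by the release note.
final class TieredWindowSketch {

    /** Window size for a tier, assuming each tier is windowsPerTier times the previous one. */
    static long windowSizeMillis(long baseWindowMillis, int windowsPerTier, int tier) {
        long size = baseWindowMillis;
        for (int i = 0; i < tier; i++) {
            // multiplyExact throws ArithmeticException instead of silently overflowing.
            size = Math.multiplyExact(size, (long) windowsPerTier);
        }
        return size;
    }

    /** Start of the window of the given size that contains the timestamp. */
    static long windowStart(long timestampMillis, long windowSizeMillis) {
        return Math.floorDiv(timestampMillis, windowSizeMillis) * windowSizeMillis;
    }

    public static void main(String[] args) {
        long base = 6L * 60 * 60 * 1000; // 6 hours, the default base window from the release note
        int windowsPerTier = 4;          // default from the release note
        for (int tier = 0; tier < 3; tier++) {
            long hours = windowSizeMillis(base, windowsPerTier, tier) / (60L * 60 * 1000);
            System.out.println("tier " + tier + " window = " + hours + " hours");
        }
    }
}
```

Under this assumption the defaults give windows of 6 hours, 24 hours, and 96 hours for the first three tiers, which illustrates why older data ends up in fewer, larger files.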
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-15181:
---------------------------
Resolution: Fixed
Fix Version/s: 1.4.0
Status: Resolved (was: Patch Available)

Findbug warnings are gone.
Thanks, Clara.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
Attachment: HBASE-15181-ADD.patch

Addendum for master for findbugs.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-15181:
---------------------------
Status: Patch Available (was: Reopened)
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
Attachment: HBASE-15181-0.98-ADD.patch

Addendum for 0.98.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HBASE-15181:
----------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Resolving this, since it seems to be committed. Thanks, Clara Xiong for the work. We would still be interested in the IO numbers, and getting some documentation for this feature if possible.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Attachment: HBASE-15181-0.98.patch

Fixed checkstyle and other warnings for 0.98
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-15181:
---------------------------
    Attachment: HBASE-15181-0.98.v4.patch

Clara:
The branch name is 0.98

Attaching same patch with correct branch name so that QA bot can test.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Attachment: HBASE-15181-98.patch

Patch for branch 98.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Attachment: HBASE-15181-branch-1.patch

Patch for branch-1
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Attachment: HBASE-15181-master-v4.patch
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Attachment: HBASE-15181-master-v3.patch
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Attachment: HBASE-15181-master-v2.patch

This new patch is to address HadoopQA complaints. I fixed checkstyle, changed annotation to suppress findbugs issue. The failed unit test doesn't seem to be related to the patch and passes locally.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Description:
This is a simple implementation of date-based tiered compaction similar to Cassandra's for the following benefits:
1. Improve date-range-based scan by structuring store files in date-based tiered layout.
2. Reduce compaction overhead.
3. Improve TTL efficiency.

Perfect fit for the use cases that:
1. have mostly date-based data writes and scans and a focus on the most recent data.
2. never or rarely delete data.

Out-of-order writes are handled gracefully. Time range overlapping among store files is tolerated and the performance impact is minimized.

Configuration can be set at hbase-site.xml or overridden at per-table or per-column-family level by hbase shell.

Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing

  was:
This is a simple implementation of date-based tiered compaction similar to Cassandra's for the following benefits:
1. Improve date-range-based scan by structuring store files in date-based tiered layout.
2. Reduce compaction overhead.
3. Improve TTL efficiency.

Perfect fit for the use cases that:
1. have mostly date-based data writes and scans and a focus on the most recent data.
2. never or rarely delete data.

Out-of-order writes are handled gracefully so the data will still get to the right store file for time-range-scan and re-compaction with existing store file in the same time window is handled by ExploringCompactionPolicy.

Time range overlapping among store files is tolerated and the performance impact is minimized.

Configuration can be set at hbase-site or overridden at per-table or per-column-family level by hbase shell.

Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HBASE-15181:
----------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Attachment: HBASE-15181-master-v1.patch

Uploaded the new patch with RB feedback incorporated.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-15181:
-----------------------------------
    Fix Version/s: 0.98.19
                   1.3.0

Updating fix versions. We have an interest in getting this back into 0.98 via branch-1, eventually.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-15181:
---------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Description:
This is a simple implementation of date-based tiered compaction similar to Cassandra's for the following benefits:
1. Improve date-range-based scan by structuring store files in date-based tiered layout.
2. Reduce compaction overhead.
3. Improve TTL efficiency.

Perfect fit for the use cases that:
1. have mostly date-based data writes and scans and a focus on the most recent data.
2. never or rarely delete data.

Out-of-order writes are handled gracefully so the data will still get to the right store file for time-range-scan and re-compaction with existing store file in the same time window is handled by ExploringCompactionPolicy.

Time range overlapping among store files is tolerated and the performance impact is minimized.

Configuration can be set at hbase-site or overridden at per-table or per-column-family level by hbase shell.

Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing

  was:
This is a simple implementation of date-based tiered compaction similar to Cassandra's for the following benefits:
1. Improve date-range-based scan by structuring store files in date-based tiered layout.
2. Reduce compaction overhead.
3. Improve TTL efficiency.

Perfect fit for the use cases that:
1. have mostly date-based data writes and scans and a focus on the most recent data.
2. never or rarely delete data.

Out-of-order writes are handled gracefully so the data will still get to the right store file for time-range-scan and re-compaction with existing store file in the same time window is handled by ExploringCompactionPolicy.

Time range overlapping among store files is tolerated and the performance impact is minimized.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Attachment: HBASE-15181-v2.patch

Fixed problems reported by Hadoop QA.
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Attachment: HBASE-15181-v1.patch
[jira] [Updated] (HBASE-15181) A simple implementation of date based tiered compaction
[ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clara Xiong updated HBASE-15181:
--------------------------------
    Status: Patch Available  (was: Open)

HBASE-15181