[ https://issues.apache.org/jira/browse/HBASE-14477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Rodionov resolved HBASE-14477.
---------------------------------------
Resolution: Duplicate
Duplicate of HBASE-15181
> Compaction improvements: Date tiered compaction policy
> ------------------------------------------------------
>
> Key: HBASE-14477
> URL: https://issues.apache.org/jira/browse/HBASE-14477
> Project: HBase
> Issue Type: New Feature
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
>
> For immutable and mostly immutable data, the current SizeTiered-based
> compaction policy is not efficient:
> # There is no need to compact all files into one, because the data is (mostly)
> immutable and there is no garbage to collect. (Performance reasons will be
> discussed later.)
> # Size-tiered compaction is not suitable for applications where the most recent
> data is the most important, and it prevents efficient caching of that data.
> The idea is pretty similar to DateTieredCompaction in Cassandra:
> http://www.datastax.com/dev/blog/datetieredcompactionstrategy
> http://www.datastax.com/dev/blog/dtcs-notes-from-the-field
> From the DataStax (Cassandra) blog:
> {quote}
> Since DTCS can be used with any table, it is important to know when it is a
> good idea, and when it is not. I’ll try to explain the spectrum and
> trade-offs here:
> 1. Perfect Fit: Time Series Fact Data, Deletes by Default TTL: When you
> ingest fact data that is ordered in time, with no deletes or overwrites. This
> is the standard “time series” use case.
> 2. OK Fit: Time-Ordered, with limited updates across whole data set, or only
> updates to recent data: When you ingest data that is (mostly) ordered in
> time, but revise or delete a very small proportion of the overall data across
> the whole timeline.
> 3. Not a Good Fit: many partial row updates or deletions over time: When you
> need to partially revise or delete fields for rows that you read together.
> Also, when you revise or delete rows within clustered reads.
> {quote}
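
To make the idea in the description concrete, here is a minimal sketch of
date-tiered file selection. It is an illustration only, not the HBase or
Cassandra implementation: the StoreFile class, the groupByWindow and
selectCompactions helpers, and the fixed window size are all hypothetical.
Real date-tiered policies typically use windows that grow with data age;
a fixed window is assumed here to keep the example short. Files are grouped
by the time window of their newest cell, and only files that share a window
are compacted together, so old windows are never merged with recent data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

public class DateTieredSketch {

    /** Hypothetical stand-in for a store file: a name and its newest cell timestamp. */
    public static final class StoreFile {
        final String name;
        final long maxTimestampMs;

        public StoreFile(String name, long maxTimestampMs) {
            this.name = name;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    /** Buckets files into fixed-size time windows, keyed by window start (ms). */
    public static TreeMap<Long, List<StoreFile>> groupByWindow(List<StoreFile> files,
                                                               long windowMs) {
        TreeMap<Long, List<StoreFile>> windows = new TreeMap<>();
        for (StoreFile f : files) {
            // floorDiv keeps the bucketing correct for negative timestamps too.
            long windowStart = Math.floorDiv(f.maxTimestampMs, windowMs) * windowMs;
            windows.computeIfAbsent(windowStart, k -> new ArrayList<>()).add(f);
        }
        return windows;
    }

    /** Returns one compaction candidate per window holding at least minFiles files. */
    public static List<List<StoreFile>> selectCompactions(
            TreeMap<Long, List<StoreFile>> windows, int minFiles) {
        List<List<StoreFile>> candidates = new ArrayList<>();
        for (List<StoreFile> window : windows.values()) {
            if (window.size() >= minFiles) {
                candidates.add(window);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        long hourMs = 3_600_000L; // assumed window size: one hour
        List<StoreFile> files = Arrays.asList(
                new StoreFile("f1", 1_000L),
                new StoreFile("f2", 2_000L),
                new StoreFile("f3", 3_700_000L));
        TreeMap<Long, List<StoreFile>> windows = groupByWindow(files, hourMs);
        for (List<StoreFile> candidate : selectCompactions(windows, 2)) {
            System.out.println("would compact " + candidate.size() + " files together");
        }
    }
}
```

Because files are never merged across windows, a time-range scan over recent
data only has to touch the files in the newest windows, which is the caching
benefit the description argues for.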
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)