[
https://issues.apache.org/jira/browse/HBASE-14468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977632#comment-14977632
]
Enis Soztutar commented on HBASE-14468:
---------------------------------------
This is a good idea. We should add it to the list of compaction policies, with
good documentation. We have use cases with a TTL of a couple of days; a metrics
store holding raw data in a high-ingest scenario is one such example.
As for the patch itself, the first {{if}} is not needed if we are checking for
DisabledRegionSplitPolicy anyway:
{code}
+ if (splitPolicyClassName.equals(IncreasingToUpperBoundRegionSplitPolicy.class.getName())) {
+   throw new RuntimeException("Default split policy for FIFO compaction" +
+     " is not supported, aborting.");
+ } else if (!splitPolicyClassName.equals(DisabledRegionSplitPolicy.class.getName())) {
+   warn.append(":region splits must be disabled:");
+ }
{code}
Can we make it so that if a split happens we still compact the reference files,
but we do not compact otherwise? We can also allow very-slow splits in the case
where the reference files will be cleaned out due to TTL. In this case, a
region can still split every TTL interval.
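To make the question concrete, here is a minimal sketch of the per-file
decision I have in mind. It is plain Java, not HBase code:
{{FifoSplitAwareSketch}}, {{Action}} and {{decide()}} are made-up names, and a
file is assumed to be expired when its newest cell timestamp is older than
now minus TTL.

```java
// Hypothetical illustration only; none of these names exist in HBase.
final class FifoSplitAwareSketch {

  enum Action { DROP_EXPIRED, COMPACT_REFERENCE, KEEP }

  // Assumption: a file is expired once its newest cell is older than the TTL.
  static boolean isExpired(long maxTimestampMs, long nowMs, long ttlMs) {
    return maxTimestampMs < nowMs - ttlMs;
  }

  // Decides what a split-aware FIFO policy would do with a single store file.
  static Action decide(boolean isReferenceFile, long maxTimestampMs,
                       long nowMs, long ttlMs) {
    if (isExpired(maxTimestampMs, nowMs, ttlMs)) {
      // Expired files are dropped whether or not they are references,
      // which is what makes a "slow" split per TTL interval possible.
      return Action.DROP_EXPIRED;
    }
    if (isReferenceFile) {
      // Only files left behind by a split get rewritten.
      return Action.COMPACT_REFERENCE;
    }
    // Regular, live files are never compacted.
    return Action.KEEP;
  }
}
```

With this rule a region that never splits behaves exactly like plain FIFO,
since it has no reference files to rewrite.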
Will the thrown RuntimeExceptions cause region opening to fail, or the RS to
abort? Can we hook the verification code into
{{HMaster.sanityCheckTableDescriptor()}}, so that you cannot create or alter a
table with those settings? That would make for a much better user experience.
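Roughly, the check could collect every problem up front instead of throwing at
region open. The sketch below is standalone Java under stated assumptions, not
the actual master code: {{FifoSanityCheckSketch}} and {{check()}} are
hypothetical, {{FOREVER}} is assumed to mirror the default TTL value, and for
brevity it only accepts DisabledRegionSplitPolicy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the HMaster implementation.
final class FifoSanityCheckSketch {

  // Assumed to mirror the "no TTL" default (HColumnDescriptor.DEFAULT_TTL).
  static final int FOREVER = Integer.MAX_VALUE;

  static final String DISABLED_SPLIT_POLICY =
      "org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy";

  // Returns human-readable problems; an empty list means the settings pass.
  static List<String> check(int ttlSeconds, int minVersions,
                            String splitPolicyClassName) {
    List<String> problems = new ArrayList<>();
    if (ttlSeconds == FOREVER) {
      problems.add("FIFO compaction requires a non-default TTL");
    }
    if (minVersions > 0) {
      problems.add("FIFO compaction requires MIN_VERSIONS = 0");
    }
    // A single check replaces the two-branch if/else from the patch.
    if (!DISABLED_SPLIT_POLICY.equals(splitPolicyClassName)) {
      problems.add("region splits must be disabled");
    }
    return problems;
  }
}
```

The caller would reject the create or alter request when the list is
non-empty, so the user sees all violations in one error message.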
Can we also simplify the configuration for these? Maybe we auto-disable major
compactions, and set the blocking store files if they are not set?
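Something along these lines, as a sketch over a plain map rather than the real
descriptor classes. {{FifoDefaultsSketch}} and {{withFifoDefaults()}} are
made-up names, and the value 1000 for the blocking store files is only an
illustrative placeholder, not a recommendation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; operates on a plain map, not on HBase types.
final class FifoDefaultsSketch {

  static final String MAJOR_COMPACTION_PERIOD = "hbase.hregion.majorcompaction";
  static final String BLOCKING_STORE_FILES = "hbase.hstore.blockingStoreFiles";

  // Placeholder value chosen for the example.
  static final String ASSUMED_BLOCKING_STORE_FILES = "1000";

  // Returns a copy of the user's settings with the FIFO-friendly defaults applied.
  static Map<String, String> withFifoDefaults(Map<String, String> userConf) {
    Map<String, String> conf = new HashMap<>(userConf);
    // A period of 0 turns off periodic major compactions.
    conf.put(MAJOR_COMPACTION_PERIOD, "0");
    // Respect an explicit user setting; only fill in the value when it is absent.
    conf.putIfAbsent(BLOCKING_STORE_FILES, ASSUMED_BLOCKING_STORE_FILES);
    return conf;
  }
}
```

The input map is copied, so the user's original settings stay untouched.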
Can we use {{HStore.removeUnneededFiles()}} or
{{storeEngine.getStoreFileManager()}}, which already implement the is-expired
logic, so that there is no duplication there?
> Compaction improvements: FIFO compaction policy
> -----------------------------------------------
>
> Key: HBASE-14468
> URL: https://issues.apache.org/jira/browse/HBASE-14468
> Project: HBase
> Issue Type: Improvement
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
> Attachments: HBASE-14468-v1.patch, HBASE-14468-v2.patch,
> HBASE-14468-v3.patch, HBASE-14468-v4.patch, HBASE-14468-v5.patch,
> HBASE-14468-v6.patch
>
>
> h2. FIFO Compaction
> h3. Introduction
> FIFO compaction policy selects only files which have all cells expired. The
> column family MUST have non-default TTL.
> Essentially, FIFO compactor does only one job: collects expired store files.
> I see many applications for this policy:
> # Use it for very high-volume raw data which has a low TTL and which is the
> source of other data (after additional processing). Example: raw
> time-series vs. time-based rollup aggregates and compacted time-series. We
> collect raw time-series and store them in a CF with FIFO compaction policy;
> periodically we run a task which creates rollup aggregates and compacts the
> time-series, after which the original raw data can be discarded.
> # Use it for data which can be kept entirely in a block cache (RAM/SSD).
> Say we have a local SSD (1TB) which we can use as a block cache. There is no
> need to compact the raw data at all.
> Because we do not do any real compaction, we do not use CPU and IO (disk and
> network), and we do not evict hot data from the block cache. The result:
> improved throughput and latency for both writes and reads.
> See: https://github.com/facebook/rocksdb/wiki/FIFO-compaction-style
> h3. To enable FIFO compaction policy
> For table:
> {code}
> HTableDescriptor desc = new HTableDescriptor(tableName);
>
> desc.setConfiguration(DefaultStoreEngine.DEFAULT_COMPACTION_POLICY_CLASS_KEY,
> FIFOCompactionPolicy.class.getName());
> {code}
> For CF:
> {code}
> HColumnDescriptor desc = new HColumnDescriptor(family);
>
> desc.setConfiguration(DefaultStoreEngine.DEFAULT_COMPACTION_POLICY_CLASS_KEY,
> FIFOCompactionPolicy.class.getName());
> {code}
> Make sure that the table has region splits disabled (either by explicitly
> setting DisabledRegionSplitPolicy or by setting
> ConstantSizeRegionSplitPolicy with a very large max region size). You will
> also have to raise the store's blocking file number,
> *hbase.hstore.blockingStoreFiles*, to a very large value.
>
> h3. Limitations
> Do not use FIFO compaction if :
> * Table/CF has MIN_VERSIONS > 0
> * Table/CF has TTL = FOREVER (HColumnDescriptor.DEFAULT_TTL)