[ 
https://issues.apache.org/jira/browse/HBASE-14468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087681#comment-15087681
 ] 

Hudson commented on HBASE-14468:
--------------------------------

FAILURE: Integrated in HBase-0.98-matrix #283 (See 
[https://builds.apache.org/job/HBase-0.98-matrix/283/])
HBASE-14468 Compaction improvements: FIFO compaction policy. (Vladimir (larsh: 
rev 912b42786fbb1374f42648aceaa813ab009e3a9b)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/FIFOCompactionPolicy.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/compactions/TestFIFOCompactionPolicy.java
* hbase-common/src/test/java/org/apache/hadoop/hbase/util/TimeOffsetEnvironmentEdge.java


> Compaction improvements: FIFO compaction policy
> -----------------------------------------------
>
>                 Key: HBASE-14468
>                 URL: https://issues.apache.org/jira/browse/HBASE-14468
>             Project: HBase
>          Issue Type: Improvement
>          Components: Compaction, Performance
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.17
>
>         Attachments: 14468-0.98-v2.txt, 14468-0.98.txt, HBASE-14468-v1.patch, 
> HBASE-14468-v10.patch, HBASE-14468-v2.patch, HBASE-14468-v3.patch, 
> HBASE-14468-v4.patch, HBASE-14468-v5.patch, HBASE-14468-v6.patch, 
> HBASE-14468-v7.patch, HBASE-14468-v8.patch, HBASE-14468-v9.patch, 
> HBASE-14468.add.patch
>
>
> h2. FIFO Compaction
> h3. Introduction
> The FIFO compaction policy selects only store files in which all cells have
> expired. The column family MUST have a non-default TTL.
> Essentially, the FIFO compactor does only one job: it collects expired store
> files. These are some applications that could benefit the most:
> # Very high-volume raw data that has a low TTL and is the source of other
> data (after additional processing). Example: raw time-series vs. time-based
> rollup aggregates and compacted time-series. We collect raw time-series and
> store them in a CF with the FIFO compaction policy; periodically we run a
> task that creates rollup aggregates and compacts the time-series, after
> which the original raw data can be discarded.
> # Data that can be kept entirely in a block cache (RAM/SSD). Say we have a
> local SSD (1TB) that we can use as a block cache. There is no need to
> compact the raw data at all.
> Because we do not do any real compaction, we do not use CPU or IO (disk and
> network), and we do not evict hot data from the block cache. The result:
> improved throughput and latency for both writes and reads.
> See: https://github.com/facebook/rocksdb/wiki/FIFO-compaction-style
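The selection rule described above can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: `FifoSelectionSketch`, `StoreFileInfo`, and `selectExpired` are hypothetical names, and the sketch assumes each store file exposes the timestamp of its newest cell. Under that assumption, a file whose newest cell is older than the TTL contains only expired cells. How the exact TTL boundary is treated here is an arbitrary choice of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: these are hypothetical names, not HBase classes.
final class FifoSelectionSketch {

    /** Minimal stand-in for a store file: a name and its newest cell timestamp (ms). */
    static final class StoreFileInfo {
        final String name;
        final long maxTimestampMs;

        StoreFileInfo(String name, long maxTimestampMs) {
            this.name = name;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    /**
     * Returns the files whose newest cell is older than the TTL.
     * If the newest cell has expired, every cell in the file has expired,
     * so the whole file can be collected without rewriting any data.
     */
    static List<StoreFileInfo> selectExpired(List<StoreFileInfo> files, long ttlMs, long nowMs) {
        List<StoreFileInfo> expired = new ArrayList<>();
        for (StoreFileInfo f : files) {
            if (nowMs - f.maxTimestampMs > ttlMs) {
                expired.add(f);
            }
        }
        return expired;
    }
}
```

Files that still hold at least one live cell are simply skipped, which is why this style of policy does no merge work and no rewriting.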
> h3. To enable FIFO compaction policy
> For table:
> {code}
> HTableDescriptor desc = new HTableDescriptor(tableName);
> desc.setConfiguration(DefaultStoreEngine.DEFAULT_COMPACTION_POLICY_CLASS_KEY,
>     FIFOCompactionPolicy.class.getName());
> {code} 
> For CF:
> {code}
> HColumnDescriptor desc = new HColumnDescriptor(family);
> desc.setConfiguration(DefaultStoreEngine.DEFAULT_COMPACTION_POLICY_CLASS_KEY,
>     FIFOCompactionPolicy.class.getName());
> {code}
> From HBase shell:
> {code}
> create 'x',{NAME=>'y', TTL=>'30'}, {CONFIGURATION => 
> {'hbase.hstore.defaultengine.compactionpolicy.class' => 
> 'org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy', 
> 'hbase.hstore.blockingStoreFiles' => 1000}}
> {code}
> Although region splitting is supported, for optimal performance it should be
> disabled, either by explicitly setting DisabledRegionSplitPolicy or by
> setting ConstantSizeRegionSplitPolicy with a very large max region size. You
> will also have to raise the store's blocking file count,
> *hbase.hstore.blockingStoreFiles*, to a very large number (there is a sanity
> check on the table/column family configuration when FIFO compaction is used,
> and the minimum value for the blocking file count is 1000).
>  
> h3. Limitations
> Do not use FIFO compaction if:
> * Table/CF has MIN_VERSIONS > 0
> * Table/CF has TTL = FOREVER (HColumnDescriptor.DEFAULT_TTL)
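
The constraints listed in the description (non-default TTL, no minimum versions, blocking store file count of at least 1000) could be checked along the following lines. This is a sketch under stated assumptions, not the sanity check from the patch: `FifoConfigCheckSketch` and `validate` are hypothetical names, and `FOREVER` is assumed to be `Integer.MAX_VALUE` seconds.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical validator, not the check shipped in HBase.
final class FifoConfigCheckSketch {
    // Assumption: the default "forever" TTL is represented as Integer.MAX_VALUE seconds.
    static final int FOREVER = Integer.MAX_VALUE;
    // Minimum blocking store file count stated in the issue description.
    static final int MIN_BLOCKING_STORE_FILES = 1000;

    /** Returns a list of human-readable problems; an empty list means the settings pass. */
    static List<String> validate(int ttlSeconds, int minVersions, int blockingStoreFiles) {
        List<String> problems = new ArrayList<>();
        if (ttlSeconds == FOREVER) {
            problems.add("TTL must be non-default for FIFO compaction");
        }
        if (minVersions > 0) {
            problems.add("MIN_VERSIONS must be 0 for FIFO compaction");
        }
        if (blockingStoreFiles < MIN_BLOCKING_STORE_FILES) {
            problems.add("hbase.hstore.blockingStoreFiles must be at least "
                + MIN_BLOCKING_STORE_FILES);
        }
        return problems;
    }
}
```

For example, `validate(30, 0, 1000)` corresponds to the shell example in the description (TTL of 30 seconds, blocking file count of 1000) and reports no problems.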



