[
https://issues.apache.org/jira/browse/HBASE-15400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214720#comment-15214720
]
Ted Yu commented on HBASE-15400:
--------------------------------
Do you get the following compilation errors on v12?
{code}
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-compiler-plugin:3.2:compile (default-compile) on
project hbase-server: Compilation failure: Compilation failure:
[ERROR]
/Users/tyu/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java:[273,20]
method cloneForReader() is already defined in class
org.apache.hadoop.hbase.regionserver.StoreFile
[ERROR]
/Users/tyu/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/StripeCompactionPolicy.java:[52,8]
org.apache.hadoop.hbase.regionserver.compactions.StripeCompactionPolicy is not
abstract and does not override abstract method
shouldPerformMajorCompaction(java.util.Collection<org.apache.hadoop.hbase.regionserver.StoreFile>)
in org.apache.hadoop.hbase.regionserver.compactions.CompactionPolicy
[ERROR]
/Users/tyu/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/compactions/StripeCompactionPolicy.java:[168,3]
method does not override or implement a method from a supertype
[ERROR]
/Users/tyu/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/DateTieredStoreEngine.java:[74,5]
method does not override or implement a method from a supertype
{code}
> Use DateTieredCompactor for Date Tiered Compaction
> --------------------------------------------------
>
> Key: HBASE-15400
> URL: https://issues.apache.org/jira/browse/HBASE-15400
> Project: HBase
> Issue Type: Sub-task
> Components: Compaction
> Reporter: Clara Xiong
> Assignee: Clara Xiong
> Fix For: 2.0.0
>
> Attachments: HBASE-15400-15389-v12.patch, HBASE-15400-v1.pa,
> HBASE-15400.patch
>
>
> When we compact, we can output multiple files along the current window
> boundaries. There are two use cases:
> 1. Major compaction: We want to output date tiered store files with data
> older than max age archived in chunks of the window size of the highest
> tier. Once a file is old enough to be out of the range in which we do tiered
> minor compaction, we don't compact it any further. Such files retain the
> timespan they had when they were last compacted, which is the window size of
> the highest tier. Major compaction will touch these files and we want to
> maintain the same layout.
> 2. Bulk-loaded files and old files generated by major compaction before
> upgrading to DTCP.
> Pros:
> 1. Restore locality, process versioning, updates and deletes while
> maintaining the tiered layout.
> 2. The best way to fix a skewed layout.
>
> This work is based on a prototype of DateTieredCompactor from HBASE-15389 and
> focuses on the part that meets the needs of these two use cases while still
> supporting others. I have to call out a few design decisions:
> 1. We want to output files along all windows only for major compaction.
> Data older than max age is output as multiple files, in chunks of the
> maximum tier window size, which is determined by base window size, windows
> per tier and max age.
> 2. For minor compaction, we don't want to output too many files, which will
> remain around because of the current restriction that compaction selects
> files contiguous by seq id. I will output two files only if all the files in
> the window are being combined: one for the data within the window and the
> other for the out-of-window tail. If any file in the window is excluded from
> compaction, only one file will be output. As windows are promoted, the
> out-of-order data situation will gradually improve. For the incoming
> window, we need to accommodate the case of user-specified future data.
> 3. We have to pass the boundaries together with the list of store files as a
> complete time snapshot instead of in two separate calls, because the window
> layout is determined by the time at which the computation is called. So we
> will need a new type of compaction request.
> 4. Since we will assign the same seq id to all output files, we need to sort
> by maxTimestamp as a secondary key. Right now all compaction policies get the
> files sorted by StoreFileManager, which sorts by seq id and other criteria.
> I will use the new order for DTCP only, to avoid impacting other compaction
> policies.
> 5. We need some cleanup of the current design of StoreEngine and
> CompactionPolicy.
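
Design decisions 2 and 4 above can be illustrated with a small, hedged sketch. It is not taken from the patch: the class DtcpSketch, the type FileInfo and both method names are hypothetical stand-ins, not HBase APIs. It only shows the rule "two outputs when every file in the window is compacted, otherwise one" and an ordering by seq id with maxTimestamp as the tie-breaker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; none of these names come from HBase or the patch.
final class DtcpSketch {

  /** Hypothetical stand-in for a store file: just the fields the sketch needs. */
  static final class FileInfo {
    final String name;
    final long seqId;
    final long maxTimestamp;

    FileInfo(String name, long seqId, long maxTimestamp) {
      this.name = name;
      this.seqId = seqId;
      this.maxTimestamp = maxTimestamp;
    }
  }

  /** Decision 4: order by seq id; files sharing a seq id are ordered by maxTimestamp. */
  static final Comparator<FileInfo> SEQ_ID_THEN_MAX_TS =
      Comparator.comparingLong((FileInfo f) -> f.seqId)
          .thenComparingLong(f -> f.maxTimestamp);

  /** Returns the file names in the order given by SEQ_ID_THEN_MAX_TS. */
  static List<String> sortedNames(List<FileInfo> files) {
    List<FileInfo> copy = new ArrayList<>(files); // leave the caller's list untouched
    copy.sort(SEQ_ID_THEN_MAX_TS);
    List<String> names = new ArrayList<>();
    for (FileInfo f : copy) {
      names.add(f.name);
    }
    return names;
  }

  /**
   * Decision 2: a minor compaction writes two files (in-window data plus the
   * out-of-window tail) only when every file in the window is selected;
   * otherwise it writes a single file.
   */
  static int minorCompactionOutputCount(int filesInWindow, int filesSelected) {
    return filesSelected == filesInWindow ? 2 : 1;
  }

  public static void main(String[] args) {
    List<FileInfo> files = Arrays.asList(
        new FileInfo("out1", 9L, 300L),  // two outputs of one compaction share seq id 9
        new FileInfo("old", 5L, 100L),
        new FileInfo("out2", 9L, 200L));
    System.out.println(sortedNames(files));               // [old, out2, out1]
    System.out.println(minorCompactionOutputCount(3, 3)); // 2
    System.out.println(minorCompactionOutputCount(3, 2)); // 1
  }
}
```

The tie-breaker only matters for files that share a seq id, which is exactly the situation created when one compaction emits several output files.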