[
https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236596#comment-15236596
]
Clara Xiong commented on HBASE-15454:
-------------------------------------
To be specific, it seems very inefficient to need one routine that slices the
data along exponential windows for minor/major compaction and another,
concurrent routine that slices the data along calendar windows to archive
it. A user should need only one layout, not both. Either layout satisfies
time-range scan efficiency and archive/TTL efficiency. This is the same idea as
Dave's pluggable window algorithm.
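
A rough sketch of what "pluggable" could look like, in Python only for brevity.
Every name here is hypothetical and not an existing HBase API, and the
exponential tiering is a simplified stand-in rather than the actual
date-tiered compaction algorithm. The point is that one grouping routine
serves either layout; only the window function changes.

```python
from datetime import datetime, timezone

def calendar_year(ts_ms):
    """Calendar window key: the UTC year that contains the timestamp."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).year

def exponential_tier(ts_ms, now_ms, base_ms, factor):
    """Exponential window key: tier 0 covers ages [0, base), tier 1 covers
    the next base*factor of age, tier 2 the next base*factor^2, and so on."""
    age = max(0, now_ms - ts_ms)
    tier, size, upper = 0, base_ms, base_ms
    while age >= upper:
        size *= factor
        upper += size
        tier += 1
    return tier

def group_by_window(files, window_fn):
    """Group (name, max_ts_ms) pairs by whichever window function is plugged in."""
    groups = {}
    for name, max_ts in files:
        groups.setdefault(window_fn(max_ts), []).append(name)
    return groups
```

Compaction and archiving would then call the same group_by_window with the
one window function the user configured, instead of slicing the data twice.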
And please add the EC manager code and make it work with both types of windows.
To answer your question about the order of archiving differing from that of
compaction: it should be the EC logic that scans each store file's time range
to pick the files to archive. It can share the TTL logic.
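
Roughly like the sketch below, again in Python for brevity and with
hypothetical names throughout (StoreFile, files_older_than, and plan are
illustrations, not HBase classes or methods). It assumes each store file
exposes the min/max timestamp of its cells. The same age predicate serves
both TTL and archiving; only the cutoff differs, and the selection order
comes from the time range, not from compaction order.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StoreFile:
    name: str
    min_ts: int  # oldest cell timestamp in the file
    max_ts: int  # newest cell timestamp in the file

def files_older_than(files, cutoff_ts):
    """Files whose newest cell is older than cutoff_ts, oldest first."""
    old = [f for f in files if f.max_ts < cutoff_ts]
    return sorted(old, key=lambda f: f.max_ts)

def plan(files, now, max_age, ttl):
    """Return (to_archive, expired) using the one shared predicate."""
    expired = files_older_than(files, now - ttl)
    to_archive = [f for f in files_older_than(files, now - max_age)
                  if f not in expired]
    return to_archive, expired
```

Files past the TTL are dropped instead of archived, so the EC manager only
sees files that are older than max age but still live.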
> Archive store files older than max age
> --------------------------------------
>
> Key: HBASE-15454
> URL: https://issues.apache.org/jira/browse/HBASE-15454
> Project: HBase
> Issue Type: Sub-task
> Components: Compaction
> Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Fix For: 2.0.0, 1.3.0, 0.98.19, 1.4.0
>
> Attachments: HBASE-15454-v1.patch, HBASE-15454.patch
>
>
> Sometimes old data is rarely touched but we cannot remove it. So archive
> it into several big files (by year or something) and use EC to reduce the
> redundancy.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)