[ 
https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236605#comment-15236605
 ] 

Duo Zhang commented on HBASE-15454:
-----------------------------------

{quote}
it seems very inefficient that we need a routine to slice the data along the 
exponential windows for minor/major compaction and another concurrent routine 
to slice the data along the calendar windows to archive them.
{quote}
For major compaction, windows older than 'max age' become ArchiveWindows. The 
'max age' is the split point: newer windows are tiered and older windows are 
archived.

{quote}
A user should only need either layout, not both.
{quote}
That's not true. For example, suppose you want to archive data by year and 
today is Jan 1. I think most people do not want to archive the data written 
yesterday, right? So there will be a 'max age' config, which means we only 
archive data older than it.
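
To make the split point concrete, here is a minimal sketch in Python. It is 
not HBase code: the function name, the 30-day 'max age' and the yearly archive 
period are only assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def classify_store_file(max_ts, now, max_age):
    """Decide which layout a store file belongs to (hypothetical helper).

    max_ts  -- newest timestamp contained in the file
    now     -- current time
    max_age -- assumed 'max age' config; only data older than this is archived

    Returns ("tiered", None) for files newer than the split point and
    ("archive", year) for older files, which go into a calendar-year window.
    """
    split_point = now - max_age
    if max_ts >= split_point:
        # Still inside 'max age': keep using the tiered (exponential) windows.
        return ("tiered", None)
    # Older than 'max age': archive into the window for its calendar year.
    return ("archive", max_ts.year)


# Today is Jan 1; the 30-day max age is an assumed value, not a default.
now = datetime(2017, 1, 1, tzinfo=timezone.utc)
max_age = timedelta(days=30)

yesterday = now - timedelta(days=1)
last_spring = datetime(2016, 4, 1, tzinfo=timezone.utc)

print(classify_store_file(yesterday, now, max_age))    # ('tiered', None)
print(classify_store_file(last_spring, now, max_age))  # ('archive', 2016)
```

The data written yesterday belongs to last year by the calendar, but it is 
newer than the split point, so it stays in the tiered layout. Only the data 
older than 'max age' goes to the yearly archive window.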

{quote}
And please add the EC manager code and make it work with both types of windows.
{quote}
The EC manager is part of HDFS. As I said, it is currently transparent to 
HBase. You can use it for any file you want, but it does not perform well on 
small files.

Thanks.

> Archive store files older than max age
> --------------------------------------
>
>                 Key: HBASE-15454
>                 URL: https://issues.apache.org/jira/browse/HBASE-15454
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0, 1.3.0, 0.98.19, 1.4.0
>
>         Attachments: HBASE-15454-v1.patch, HBASE-15454.patch
>
>
> Sometimes old data is rarely touched but we cannot remove it. So archive it 
> into several big files (by year or something) and use EC to reduce the 
> redundancy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
