[ 
https://issues.apache.org/jira/browse/HBASE-15454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280418#comment-15280418
 ] 

Dave Latham commented on HBASE-15454:
-------------------------------------

Sorry for the late responses - a minor update on reviewboard as well.

I haven't changed feelings about the above:
{quote}
I don't have good intuition for how such an archiving mechanism would affect 
write amplification in practice, or how it performs under edge cases (e.g. once 
in a while another "old" cell shows up) or if it's likely to output several 
small HFiles when it runs for example. Do you have any analysis, simulation, or 
arguments about how this will behave and perform? It seems that using this 
makes stronger assumptions about the use case and write behavior.
If going in this direction, I wonder if it's better to go all the way, from 
having every minor compaction output perfectly partitioned HFiles
{quote}

I'm nervous about the extra complexity here and whether it's going to be used.  
I wonder if it is better off being done differently or as an extended policy.  
I'm not an HBase committer so am fine going along with the flow.  If I were, I 
guess I would be -0: makes me nervous but I wouldn't try to stop it going in if 
other people think it's a good idea.

> Archive store files older than max age
> --------------------------------------
>
>                 Key: HBASE-15454
>                 URL: https://issues.apache.org/jira/browse/HBASE-15454
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Compaction
>    Affects Versions: 2.0.0, 1.3.0, 0.98.18, 1.4.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 0.98.20
>
>         Attachments: HBASE-15454-v1.patch, HBASE-15454-v2.patch, 
> HBASE-15454-v3.patch, HBASE-15454-v4.patch, HBASE-15454-v5.patch, 
> HBASE-15454-v6.patch, HBASE-15454-v7.patch, HBASE-15454.patch
>
>
> In date tiered compaction, the store files older than max age are never 
> touched by minor compactions. Here we introduce a 'freeze window' operation, 
> which does the following:
> 1. Find all store files that contain cells whose timestamps are in the given 
> window.
> 2. Compact all these files and output one file for each window that these 
> files cover.
> After the compaction, we will have only one file in the given window, and all 
> cells whose timestamps are in the given window are in that file. If you do 
> not write new cells with an older timestamp in this window, the file will 
> never be changed. This makes it easier to do erasure coding on the frozen 
> file to reduce redundancy. It also makes it possible to check 
> consistency between the master and peer clusters incrementally.
> And why use the word 'freeze'?
> Because there is already an 'HFileArchiver' class. I want to use a different 
> word to avoid confusion.
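
The two steps in the description above can be sketched in plain Java. This is only an illustration of the idea, not the code in the attached patches and not HBase's API: the class FreezeWindowSketch, the StoreFile stand-in, both method names, and the fixed window size are all assumptions made for the example, and a "file" here is just a list of cell timestamps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class FreezeWindowSketch {

    /** Hypothetical stand-in for a store file: a name plus its cells' timestamps. */
    static final class StoreFile {
        final String name;
        final List<Long> cellTimestamps;

        StoreFile(String name, Long... cellTimestamps) {
            this.name = name;
            this.cellTimestamps = Arrays.asList(cellTimestamps);
        }
    }

    /** Step 1: pick every file holding at least one cell in [windowStart, windowEnd). */
    static List<StoreFile> selectFiles(List<StoreFile> files, long windowStart, long windowEnd) {
        List<StoreFile> selected = new ArrayList<>();
        for (StoreFile f : files) {
            for (long ts : f.cellTimestamps) {
                if (ts >= windowStart && ts < windowEnd) {
                    selected.add(f);
                    break;
                }
            }
        }
        return selected;
    }

    /**
     * Step 2: rewrite the selected files, emitting one output per window they cover.
     * Assumes fixed-size windows aligned at multiples of windowSize; the map key is
     * the window start and the value is the sorted cell timestamps of that output.
     */
    static SortedMap<Long, List<Long>> compactByWindow(List<StoreFile> selected, long windowSize) {
        SortedMap<Long, List<Long>> output = new TreeMap<>();
        for (StoreFile f : selected) {
            for (long ts : f.cellTimestamps) {
                long start = Math.floorDiv(ts, windowSize) * windowSize;
                output.computeIfAbsent(start, k -> new ArrayList<>()).add(ts);
            }
        }
        for (List<Long> cells : output.values()) {
            Collections.sort(cells);
        }
        return output;
    }

    public static void main(String[] args) {
        List<StoreFile> files = Arrays.asList(
            new StoreFile("A", 5L, 12L),   // straddles windows [0,10) and [10,20)
            new StoreFile("B", 15L, 18L),  // entirely inside [10,20)
            new StoreFile("C", 25L, 28L)); // outside the window, left untouched

        List<StoreFile> selected = selectFiles(files, 10L, 20L);
        // prints {0=[5], 10=[12, 15, 18]}
        System.out.println(compactByWindow(selected, 10L));
    }
}
```

In this toy run, freezing the window [10, 20) rewrites files A and B into two outputs, one for [0, 10) and one for [10, 20), so file A's older cell is rewritten as well. That is the kind of extra rewriting the write amplification question in the comment is about.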



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
