Collect HLogs and delete them after a period of time
----------------------------------------------------

                 Key: HBASE-2070
                 URL: https://issues.apache.org/jira/browse/HBASE-2070
             Project: Hadoop HBase
          Issue Type: New Feature
            Reporter: Jean-Daniel Cryans
            Assignee: Jean-Daniel Cryans
             Fix For: 0.21.0


For replication we need to be able to service clusters that are a few hours 
behind in edits. For example, after distcp'ing a snapshot of the DB to another 
cluster, we need to make sure we get the edits that came in after the snapshot 
was taken.

I plan the following changes:
- Instead of deleting HLogs during a log roll or after a log split, move them 
to another folder where all logs should be aggregated.
- Add a new configuration for how old a log can be. For a normal cluster I 
am thinking of a default of 2 hours. For replication you may want to set it 
much higher.
- Create a new thread in the master that checks for logs older than the 
configured time and deletes them.
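
A minimal sketch of the changes listed above, for illustration only. It 
assumes a local filesystem accessed through java.nio.file rather than HDFS, 
and uses the file modification time as the age of a log. The names 
OldLogCleanerSketch, archive and deleteExpired are hypothetical and are not 
HBase APIs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not HBase code. */
class OldLogCleanerSketch {

    /** Instead of deleting a log, move it into the shared old-logs directory. */
    static Path archive(Path log, Path oldLogDir) throws IOException {
        Files.createDirectories(oldLogDir);
        return Files.move(log, oldLogDir.resolve(log.getFileName()));
    }

    /**
     * Delete archived logs last modified more than maxAgeMs before nowMs.
     * Returns the paths that were deleted.
     */
    static List<Path> deleteExpired(Path oldLogDir, long maxAgeMs, long nowMs)
            throws IOException {
        // Collect candidates first, then delete once the directory stream is closed.
        List<Path> expired = new ArrayList<>();
        try (DirectoryStream<Path> logs = Files.newDirectoryStream(oldLogDir)) {
            for (Path log : logs) {
                long ageMs = nowMs - Files.getLastModifiedTime(log).toMillis();
                if (ageMs > maxAgeMs) {
                    expired.add(log);
                }
            }
        }
        for (Path log : expired) {
            Files.delete(log);
        }
        return expired;
    }
}
```

In this sketch a log roll or log split would call archive(...) where it 
deletes today, and the master thread would call deleteExpired(...) 
periodically with the configured maximum age (for example 2 hours, expressed 
in milliseconds).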

I would also like the deletion time to be configurable while the cluster is 
running. I'm also thinking of adding a way to tell the cluster to replay edits 
on itself.
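
One possible way to make the deletion time adjustable at runtime, sketched 
under the assumption that the cleaner runs as a periodic task and re-reads 
the maximum age on every pass. AdjustableLogTtl and its methods are 
hypothetical names, not part of HBase.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; not HBase code. */
class AdjustableLogTtl {
    // Maximum log age in milliseconds; any thread may change it at runtime.
    private final AtomicLong maxAgeMs;

    AdjustableLogTtl(long initialMaxAgeMs) {
        this.maxAgeMs = new AtomicLong(requirePositive(initialMaxAgeMs));
    }

    /** Takes effect on the next check, without restarting anything. */
    void setMaxAgeMs(long newMaxAgeMs) {
        maxAgeMs.set(requirePositive(newMaxAgeMs));
    }

    long getMaxAgeMs() {
        return maxAgeMs.get();
    }

    /** True if a log last modified at logModifiedMs is older than the current maximum age. */
    boolean isExpired(long logModifiedMs, long nowMs) {
        return nowMs - logModifiedMs > maxAgeMs.get();
    }

    /**
     * Runs cleanupPass every periodMs on a single background thread.
     * The pass should call isExpired so that it sees the latest maximum age.
     * The caller is responsible for shutting the returned executor down.
     */
    ScheduledExecutorService startCleaner(Runnable cleanupPass, long periodMs) {
        ScheduledExecutorService cleaner = Executors.newSingleThreadScheduledExecutor();
        cleaner.scheduleWithFixedDelay(cleanupPass, periodMs, periodMs, TimeUnit.MILLISECONDS);
        return cleaner;
    }

    private static long requirePositive(long value) {
        if (value <= 0) {
            throw new IllegalArgumentException("max age must be positive: " + value);
        }
        return value;
    }
}
```

Reading the value from an AtomicLong on each pass means a change made while 
the cluster is running is picked up at the next check.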


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.