Collect HLogs and delete them after a period of time
----------------------------------------------------
Key: HBASE-2070
URL: https://issues.apache.org/jira/browse/HBASE-2070
Project: Hadoop HBase
Issue Type: New Feature
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Fix For: 0.21.0

For replication we need to be able to service clusters that are a few hours
behind in edits. For example, after distcp'ing a snapshot of the DB to another
cluster, we need to make sure we get the edits that came in after the snapshot
was taken.

I plan the following changes:
- Instead of deleting HLogs during a log roll or after a log split, move them
to another folder where all logs should be aggregated.
- Add a new configuration setting for how old a log can be before it is
deleted. For a normal cluster I am thinking of a default of 2 hours. For
replication you may want to set it much higher.
- Create a new thread in the master that checks for logs older than the
configured time and deletes them.
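
The three changes above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual patch: the class name
OldLogArchive and its methods are hypothetical, and it uses java.nio.file on a
local filesystem rather than the Hadoop FileSystem API that HBase would
really use.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: archives rolled logs and purges those older than a TTL. */
final class OldLogArchive {
    private final Path archiveDir;   // assumed single folder aggregating old logs
    private final long ttlMillis;    // assumed "how old a log can be" setting

    OldLogArchive(Path archiveDir, long ttlMillis) throws IOException {
        this.archiveDir = Files.createDirectories(archiveDir);
        this.ttlMillis = ttlMillis;
    }

    /** Called where a log would otherwise be deleted (after a roll or a split). */
    Path archive(Path log) throws IOException {
        return Files.move(log, archiveDir.resolve(log.getFileName()));
    }

    /** Deletes archived logs older than the TTL and returns the deleted paths. */
    List<Path> deleteExpired(long nowMillis) throws IOException {
        // Collect first, then delete, so the directory is not modified mid-scan.
        List<Path> expired = new ArrayList<>();
        try (DirectoryStream<Path> logs = Files.newDirectoryStream(archiveDir)) {
            for (Path log : logs) {
                long ageMillis = nowMillis - Files.getLastModifiedTime(log).toMillis();
                if (ageMillis > ttlMillis) {
                    expired.add(log);
                }
            }
        }
        for (Path log : expired) {
            Files.delete(log);
        }
        return expired;
    }
}
```

In this sketch the master's cleaner thread would simply call
deleteExpired(System.currentTimeMillis()) at a fixed interval, for example
from a ScheduledExecutorService.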
I would also like the deletion time to be configurable while the cluster is
running. I'm also thinking of adding a way to tell the cluster to replay edits
on itself.
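
One possible way to make the deletion time adjustable at runtime is to keep it
in an atomic field that the cleaner re-reads on every check. The sketch below
is an assumption about how this could look; AdjustableLogTtl and its methods
are hypothetical names, and how the new value would reach the master (shell
command, config reload, etc.) is left open.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical holder for a log TTL that can change while the cleaner runs. */
final class AdjustableLogTtl {
    private final AtomicLong ttlMillis;

    AdjustableLogTtl(long initialTtlMillis) {
        this.ttlMillis = new AtomicLong(requirePositive(initialTtlMillis));
    }

    /** Takes effect on the cleaner's next check; no restart needed. */
    void set(long newTtlMillis) {
        ttlMillis.set(requirePositive(newTtlMillis));
    }

    /** Reads the current TTL each time, so updates are picked up immediately. */
    boolean isExpired(long logModifiedMillis, long nowMillis) {
        return nowMillis - logModifiedMillis > ttlMillis.get();
    }

    private static long requirePositive(long value) {
        if (value <= 0) {
            throw new IllegalArgumentException("TTL must be positive: " + value);
        }
        return value;
    }
}
```

With this shape, raising the TTL before starting replication would keep logs
that the 2-hour default would otherwise have removed on the next pass.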