Collect HLogs and delete them after a period of time
----------------------------------------------------

                 Key: HBASE-2070
                 URL: https://issues.apache.org/jira/browse/HBASE-2070
             Project: Hadoop HBase
          Issue Type: New Feature
            Reporter: Jean-Daniel Cryans
            Assignee: Jean-Daniel Cryans
             Fix For: 0.21.0


For replication we need to be able to service clusters that are a few hours behind in edits. For example, after distcp'ing a snapshot of the DB to another cluster, we need to make sure we get the edits that came in after the snapshot was taken.

I plan the following changes:

- Instead of deleting HLogs during a log roll or after a log split, move them to another folder where all logs are aggregated.
- Add a new configuration setting for how old a log can be. For a normal cluster I am thinking of a default of 2 hours. For replication you may want to set it much higher.
- Create a new thread in the master that checks for logs older than the configured time and deletes them.

I would also like the deletion time to be configurable while the cluster is running.

I'm also thinking of adding a way to tell the cluster to replay edits on itself.
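
To make the third change concrete, here is a minimal sketch of the age check such a cleaner thread might run. It is not HBase code: the class name OldLogCleaner and its methods are hypothetical, and it uses java.nio.file on a local directory as a stand-in for the Hadoop filesystem the master would actually use. The TTL is held in an AtomicLong on the assumption that it should be changeable while the cleaner is running.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch (not HBase code): deletes archived log files older than a TTL. */
public class OldLogCleaner {
    // Folder where rolled or split logs are assumed to have been moved.
    private final Path archiveDir;
    // AtomicLong so the maximum log age can be changed at runtime by another thread.
    private final AtomicLong ttlMillis;

    public OldLogCleaner(Path archiveDir, long ttlMillis) {
        this.archiveDir = archiveDir;
        this.ttlMillis = new AtomicLong(ttlMillis);
    }

    /** Changes the maximum log age; takes effect on the next pass. */
    public void setTtlMillis(long newTtlMillis) {
        ttlMillis.set(newTtlMillis);
    }

    /**
     * Runs one cleaning pass: deletes every regular file in the archive folder
     * whose modification time is older than (nowMillis - TTL).
     * Returns the number of files deleted.
     */
    public int cleanOnce(long nowMillis) throws IOException {
        long cutoff = nowMillis - ttlMillis.get();
        int deleted = 0;
        try (DirectoryStream<Path> logs = Files.newDirectoryStream(archiveDir)) {
            for (Path log : logs) {
                if (Files.isRegularFile(log)
                        && Files.getLastModifiedTime(log).toMillis() < cutoff) {
                    Files.delete(log);
                    deleted++;
                }
            }
        }
        return deleted;
    }
}
```

A master-side thread could call cleanOnce(System.currentTimeMillis()) on a fixed interval. With the proposed 2 hour default the TTL would be 2 * 60 * 60 * 1000L milliseconds, and a replication setup would pass a larger value or raise it later through setTtlMillis.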