[ https://issues.apache.org/jira/browse/HBASE-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans updated HBASE-2070:
--------------------------------------

    Attachment: HBASE-2070-v2.patch

New patch with a new test and it passes all the other tests. Good for a review.

> Collect HLogs and delete them after a period of time
> ----------------------------------------------------
>
>                 Key: HBASE-2070
>                 URL: https://issues.apache.org/jira/browse/HBASE-2070
>             Project: Hadoop HBase
>          Issue Type: Sub-task
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2070-v2.patch, HBASE-2070.patch
>
>
> For replication we need to be able to service clusters that are a few hours
> behind in edits. For example, after distcp'ing a snapshot of the DB to
> another cluster, we need to make sure we get the edits that came in after the
> snapshot was taken.
> I plan the following changes:
> - Instead of deleting HLogs during a log roll or after a log split, move them
> to another folder where all logs should be aggregated.
> - Add a new configuration for how old a log can be. For a normal cluster I
> think of a default of 2 hours. For replication you may want to set it much
> higher.
> - Create a new thread in the master that checks for logs older than
> configured time and that deletes them.
> I also fancy having the deletion time to be configurable while the cluster is
> running. I'm also thinking of adding a way to tell the cluster to replay
> edits on itself.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
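
For readers following the plan in the issue description, here is a minimal sketch of the archive-then-expire idea. It is not taken from HBASE-2070-v2.patch: the class name OldLogCleanerSketch, its methods, and the constant DEFAULT_TTL_MS are all hypothetical, and java.nio.file on a local filesystem stands in for the Hadoop FileSystem API that HBase would actually use.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the HBASE-2070 patch.
class OldLogCleanerSketch {

    // Assumed default, mirroring the "2 hours" suggested in the description.
    static final long DEFAULT_TTL_MS = 2L * 60 * 60 * 1000;

    // A log counts as expired once its age strictly exceeds the configured TTL.
    static boolean isExpired(long lastModifiedMs, long nowMs, long ttlMs) {
        return nowMs - lastModifiedMs > ttlMs;
    }

    // Instead of deleting a rolled or split log, move it into the archive directory.
    static Path archive(Path log, Path archiveDir) {
        try {
            Files.createDirectories(archiveDir);
            return Files.move(log, archiveDir.resolve(log.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One pass of the cleaner: delete archived logs older than ttlMs
    // and return the paths that were removed.
    static List<Path> cleanOnce(Path archiveDir, long nowMs, long ttlMs) {
        List<Path> deleted = new ArrayList<>();
        try (DirectoryStream<Path> logs = Files.newDirectoryStream(archiveDir)) {
            for (Path log : logs) {
                if (!Files.isRegularFile(log)) {
                    continue; // skip subdirectories and other non-file entries
                }
                long modified = Files.getLastModifiedTime(log).toMillis();
                if (isExpired(modified, nowMs, ttlMs)) {
                    Files.delete(log);
                    deleted.add(log);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted;
    }
}
```

The master-side thread from the description could call cleanOnce on a fixed schedule, for example through a ScheduledExecutorService. Passing ttlMs on every call, rather than fixing it at construction time, is one possible way to let the deletion time change while the cluster is running.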