[ https://issues.apache.org/jira/browse/HBASE-2070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834506#action_12834506 ]
Jean-Daniel Cryans commented on HBASE-2070:
-------------------------------------------
bq. We should make a new issue for this
bq. Agreed too that the logs shouldn't be cleared if replication is down. Can
we put up a gate in zk?
I was planning on doing that in the scope of HBASE-2223.
bq. It must be the dumbest name ever given a file since the epoch began? (We
should do that in another patch... another issue)
Yeah, another issue.
bq. Want to make a regex to verify expected file name rather than:
Will do
bq. Do you have to put a timestamp on it? Doesn't HDFS tell you its
last-modified time? (There may be caveats to this but IIRC, for something this
basic should be fine).
I wanted to avoid two logs created at the same time ending up with the same
name. A collision can still happen, but the chance is very low.
Thanks for the review!
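
To make the naming discussion above concrete, here is a minimal Java sketch of a timestamped log name plus a regex check on it. The actual file name format is not stated in this thread, so the "<prefix>.<epoch millis>" layout, the LogNames class, and all of its method names are assumptions made purely for illustration; this is not the code from the patch.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; the name format is an assumption, not HBase's. */
public class LogNames {
    // Assumed layout: "<prefix>.<epoch millis>", e.g. "log.1266300000000".
    private static final Pattern VALID =
        Pattern.compile("[A-Za-z0-9_-]+\\.\\d+");

    /** Builds a name whose suffix is the creation time in milliseconds. */
    public static String withTimestamp(String prefix, long nowMillis) {
        return prefix + "." + nowMillis;
    }

    /** Returns true only if the whole name matches the assumed layout. */
    public static boolean isValid(String name) {
        return VALID.matcher(name).matches();
    }

    /** Extracts the timestamp suffix; rejects names that fail the check. */
    public static long timestampOf(String name) {
        if (!isValid(name)) {
            throw new IllegalArgumentException("Unexpected log name: " + name);
        }
        return Long.parseLong(name.substring(name.lastIndexOf('.') + 1));
    }
}
```

With a millisecond suffix, two logs only collide if they share a prefix and are created within the same millisecond, which matches the "can still happen, but unlikely" caveat above.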
> Collect HLogs and delete them after a period of time
> ----------------------------------------------------
>
> Key: HBASE-2070
> URL: https://issues.apache.org/jira/browse/HBASE-2070
> Project: Hadoop HBase
> Issue Type: Sub-task
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.21.0
>
> Attachments: HBASE-2070-v2.patch, HBASE-2070-v3.patch,
> HBASE-2070-v4.patch, HBASE-2070.patch
>
>
> For replication we need to be able to service clusters that are a few hours
> behind in edits. For example, after distcp'ing a snapshot of the DB to
> another cluster, we need to make sure we get the edits that came in after the
> snapshot was taken.
> I plan the following changes:
> - Instead of deleting HLogs during a log roll or after a log split, move them
> to another folder where all logs should be aggregated.
> - Add a new configuration for how old a log can be. For a normal cluster I
> am thinking of a default of 2 hours. For replication you may want to set it
> much higher.
> - Create a new thread in the master that checks for logs older than the
> configured time and deletes them.
> I would also like the deletion time to be configurable while the cluster is
> running. I'm also thinking of adding a way to tell the cluster to replay
> edits on itself.
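
The plan in the description above (archive instead of delete, an age limit, a periodic check) could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses java.nio.file on a local directory rather than the HDFS API, and the OldLogCleaner class, its method names, and the directory layout are all hypothetical, not taken from any of the attached patches.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of age-based log cleanup; not HBase code. */
public class OldLogCleaner {
    private final Path oldLogDir;
    // volatile so the limit can be changed while a cleaner thread is running
    private volatile long ttlMillis;

    public OldLogCleaner(Path oldLogDir, long ttlMillis) {
        this.oldLogDir = oldLogDir;
        this.ttlMillis = ttlMillis;
    }

    public void setTtlMillis(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Moves a log into the aggregation folder instead of deleting it. */
    public Path archive(Path log) throws IOException {
        Files.createDirectories(oldLogDir);
        return Files.move(log, oldLogDir.resolve(log.getFileName()));
    }

    /** Deletes archived logs older than the limit; returns what was removed. */
    public List<Path> cleanOnce(long nowMillis) throws IOException {
        List<Path> expired = new ArrayList<>();
        // First pass: collect logs whose age exceeds the configured limit.
        try (DirectoryStream<Path> logs = Files.newDirectoryStream(oldLogDir)) {
            for (Path log : logs) {
                long age = nowMillis - Files.getLastModifiedTime(log).toMillis();
                if (age > ttlMillis) {
                    expired.add(log);
                }
            }
        }
        // Second pass: delete after the directory listing has been closed.
        for (Path log : expired) {
            Files.delete(log);
        }
        return expired;
    }
}
```

A background thread, for example one scheduled with a ScheduledExecutorService, could call cleanOnce(System.currentTimeMillis()) at a fixed interval, and setTtlMillis gives one possible way to change the limit at runtime. The sketch relies on the file's last-modified time for age; a timestamp embedded in the file name would be an alternative source.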