[ https://issues.apache.org/jira/browse/HBASE-5403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208751#comment-13208751 ]
Li Pi commented on HBASE-5403:
------------------------------

Rolling the log would just reset the dictionary, which means performance will be degraded for a bit until the dictionary is built back up again. I'm assuming checkpointing would involve dumping the contents of the dictionary at certain points, but the dictionary can be quite large, up to 32 megabytes or so in extreme cases. This has its own problems.

> Checkpoint the compressed HLog
> ------------------------------
>
>                 Key: HBASE-5403
>                 URL: https://issues.apache.org/jira/browse/HBASE-5403
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>
> Let's assume that HBase replication can be based on replaying the HLog in the
> replica cluster.
> The replica process could crash during the replay. Obviously, the replica
> process needs a way to restart from the latest checkpoint in the HLog, even
> if the HLog is compressed.
> So the proposal is to write a series of checkpoints within the HLog.
> Each checkpoint will have a header with some special sequence of bytes.
> And between consecutive checkpoints, the HLog should compress with a new
> dictionary.
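
To make the trade-off in this thread concrete, here is a minimal sketch in Python. It is not HBase code: the record shapes ("lit", "ref", "ckpt"), the class ToyCompressedLogWriter, and the checkpoint_every parameter are all hypothetical names invented for illustration, and the sketch assumes the simplest scheme, where a checkpoint resets the dictionary rather than dumping its contents. A reader can then begin decoding just after any checkpoint marker with an empty dictionary, at the cost of re-emitting literals after every reset.

```python
from typing import Dict, List, Tuple

# Hypothetical record shapes, not HBase's on-disk format:
#   ("lit", value)  first occurrence; the reader appends value to its dictionary
#   ("ref", index)  repeat occurrence; the reader looks up dictionary[index]
#   ("ckpt",)       checkpoint marker; writer and reader both reset the dictionary
Record = Tuple


class ToyCompressedLogWriter:
    """Dictionary-compresses values and resets the dictionary at checkpoints."""

    def __init__(self, checkpoint_every: int) -> None:
        self.checkpoint_every = checkpoint_every
        self.records: List[Record] = []
        self._index: Dict[str, int] = {}
        self._since_checkpoint = 0

    def append(self, value: str) -> None:
        # Emit a checkpoint before the entry that would exceed the interval.
        if self._since_checkpoint == self.checkpoint_every:
            self.records.append(("ckpt",))
            self._index.clear()
            self._since_checkpoint = 0
        if value in self._index:
            self.records.append(("ref", self._index[value]))
        else:
            self._index[value] = len(self._index)
            self.records.append(("lit", value))
        self._since_checkpoint += 1


def last_checkpoint(records: List[Record]) -> int:
    """Return the position just after the latest checkpoint marker, or 0."""
    for pos in range(len(records) - 1, -1, -1):
        if records[pos][0] == "ckpt":
            return pos + 1
    return 0


def replay(records: List[Record], start: int = 0) -> List[str]:
    """Decode from `start`, which must be 0 or just after a checkpoint."""
    dictionary: List[str] = []
    out: List[str] = []
    for record in records[start:]:
        kind = record[0]
        if kind == "ckpt":
            dictionary = []
        elif kind == "lit":
            dictionary.append(record[1])
            out.append(record[1])
        else:  # "ref"
            out.append(dictionary[record[1]])
    return out


if __name__ == "__main__":
    writer = ToyCompressedLogWriter(checkpoint_every=3)
    for value in ["t1", "t1", "t2", "t1", "t2", "t1", "t1"]:
        writer.append(value)
    resume_at = last_checkpoint(writer.records)
    print(replay(writer.records, resume_at))
```

Resuming at a position that is not a checkpoint fails in this sketch, because a "ref" record may point into a dictionary the reader never saw. Dumping the dictionary at each checkpoint instead of resetting it would avoid the repeated literals, but each checkpoint would then carry the full dictionary, which is the size concern raised in the comment.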