[ https://issues.apache.org/jira/browse/HADOOP-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529255 ]
Hadoop QA commented on HADOOP-1820:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12366305/patch.txt
against trunk revision r577603.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests -1.  The patch failed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/801/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/801/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/801/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/801/console

This message is automatically generated.

> [hbase] regionserver creates hlogs without bound
> ------------------------------------------------
>
> Key: HADOOP-1820
> URL: https://issues.apache.org/jira/browse/HADOOP-1820
> Project: Hadoop
> Issue Type: Bug
> Components: contrib/hbase
> Affects Versions: 0.15.0
> Reporter: stack
> Assignee: Jim Kellerman
> Fix For: 0.15.0
>
> Attachments: excerpt.log, patch.txt, patch.txt, patch.txt
>
>
> Regionserver keeps a log of all edits for all the regions it's carrying.
> It's used for recovering state if a regionserver crashes: edits that have
> not been persisted to an HStoreFile are rerun to populate memcache, which
> in turn is converted to an on-filesystem HStoreFile. Periodically, the log
> is rotated and a new one is opened. While the region server is up, the
> logs grow in number without bound. Only the most recent contain
> unpersisted edits.
> If the region server goes down clean, then its logs are cleaned up. If a
> region server crashes, as part of recovery, the logs of edits are sorted
> and split per region. Recovery would run faster if it did not have to
> plough through reams of stale edits.
>
> Just now, I had a host crash with 112 log files, each holding 30k-plus
> edits.
>
> We could rename the log-rolling thread the log maintainer. As well as
> rolling logs, it could check for edit logs to clean. When rolled, logs
> could be marked with the sequence id of their last contained edit. The
> thread could periodically ask each hosted region for the "lowest highest"
> sequence id of all regions deployed. Once this number had passed the
> sequence id marked on a particular log, that log could be cleaned up
> safely.
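
The cleanup rule proposed in the description can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the attached patch: the class `LogCleanerSketch`, its methods, and the `hlog.00N` file names are hypothetical, and "highest" is read here as the highest sequence id each region has already persisted to an HStoreFile. The "lowest highest" value is then the minimum of those per-region ids, and any rolled log whose last edit is at or below it holds only persisted edits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the HBase code base. */
final class LogCleanerSketch {

    /**
     * Returns the "lowest highest" sequence id: the minimum, over all hosted
     * regions, of the highest sequence id each region has persisted
     * (assumed meaning). Returns -1 when no regions are given, so that
     * nothing is treated as cleanable.
     */
    static long lowestHighest(Collection<Long> highestPersistedPerRegion) {
        if (highestPersistedPerRegion.isEmpty()) {
            return -1L;
        }
        long min = Long.MAX_VALUE;
        for (long id : highestPersistedPerRegion) {
            min = Math.min(min, id);
        }
        return min;
    }

    /**
     * Returns the rolled logs whose last contained sequence id is at or
     * below the threshold, i.e. logs that hold no unpersisted edits.
     */
    static List<String> cleanable(Map<String, Long> lastSeqIdByLog, long threshold) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeqIdByLog.entrySet()) {
            if (e.getValue() <= threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Each rolled log is marked with the sequence id of its last edit.
        Map<String, Long> rolled = new LinkedHashMap<>();
        rolled.put("hlog.001", 100L);
        rolled.put("hlog.002", 200L);
        rolled.put("hlog.003", 300L);

        // Highest persisted sequence id reported by each hosted region.
        List<Long> persisted = Arrays.asList(250L, 410L, 320L);

        long threshold = lowestHighest(persisted); // 250
        System.out.println(cleanable(rolled, threshold)); // [hlog.001, hlog.002]
    }
}
```

In this reading, a region that has not flushed recently holds the threshold down and keeps older logs alive, which is the conservative direction: a log is dropped only once every region has persisted past it.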