[ https://issues.apache.org/jira/browse/HBASE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891424#action_12891424 ]
HBase Review Board commented on HBASE-1364:
-------------------------------------------

Message from: "Alex Newman" <newa...@cloudera.com>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/370/
-----------------------------------------------------------

Review request for hbase.


Summary
-------

This builds on the previous work. The tests are smarter, and splitting is now configurable.


This addresses bug hbase-1364.
    http://issues.apache.org/jira/browse/hbase-1364


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/HConstants.java c77ebf5
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java f251d54
  src/main/java/org/apache/hadoop/hbase/regionserver/LogSplitter.java PRE-CREATION
  src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 5688c03
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 8225178
  src/main/resources/hbase-default.xml e3a9669
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/BaseTestHLogSplit.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestHLog.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestHLogSplit.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestHLogSplitSkipErrors.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/DistributedTestLogRolling.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java ad8f9e5
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java 908633e
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplitSkipErrors.java PRE-CREATION
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogActionsListener.java 776d78c
  src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java 9eae4b4
  src/test/resources/hbase-site.xml 3c0601a

Diff:
http://review.hbase.org/r/370/diff


Testing
-------

Ran on our private Hudson.


Thanks,

Alex


> [performance] Distributed splitting of regionserver commit logs
> ---------------------------------------------------------------
>
>                 Key: HBASE-1364
>                 URL: https://issues.apache.org/jira/browse/HBASE-1364
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Alex Newman
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 1 (3), 1364-v2.patch, 1364.patch
>
>          Time Spent: 8h
>  Remaining Estimate: 0h
>
> HBASE-1008 has some improvements to our log splitting on regionserver crash;
> but it needs to run even faster.
> (Below is from HBASE-1008)
> In bigtable paper, the split is distributed. If we're going to have 1000
> logs, we need to distribute or at least multithread the splitting.
> 1. As is, regions starting up expect to find one reconstruction log only.
> Need to make it so they pick up a bunch of edit logs and it should be fine
> that logs are elsewhere in hdfs in an output directory written by all split
> participants whether multithreaded or a mapreduce-like distributed process
> (Let's write our distributed sort first as a MR so we learn what's involved;
> distributed sort, as much as possible should use MR framework pieces). On
> startup, regions go to this directory and pick up the files written by split
> participants deleting and clearing the dir when all have been read in. Making
> it so regions can take multiple logs for input, can also make the split
> process more robust rather than current tenuous process which loses all edits
> if it doesn't make it to the end without error.
> 2. Each column family rereads the reconstruction log to find its edits. Need
> to fix that. Split can sort the edits by column family so store only reads
> its edits.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
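
The two points in the issue description (split several logs in parallel, and group edits by column family so each store reads only its own) can be sketched roughly as below. This is a minimal, in-memory illustration and not the code in the patch under review: LogSplitSketch, Edit, splitOne and splitAll are hypothetical names, and the "region/family" bucket key is an assumption standing in for per-region output files in HDFS.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from HBase.
class LogSplitSketch {
    /** Stand-in for one commit-log entry (assumed shape, not an HBase class). */
    static final class Edit {
        final String region;
        final String family;
        final long seq;

        Edit(String region, String family, long seq) {
            this.region = region;
            this.family = family;
            this.seq = seq;
        }
    }

    /** Buckets one log's edits under a "region/family" key. */
    static Map<String, List<Edit>> splitOne(List<Edit> log) {
        Map<String, List<Edit>> out = new TreeMap<>();
        for (Edit e : log) {
            out.computeIfAbsent(e.region + "/" + e.family, k -> new ArrayList<>()).add(e);
        }
        return out;
    }

    /** Splits every log on a fixed-size thread pool, then merges the buckets. */
    static Map<String, List<Edit>> splitAll(List<List<Edit>> logs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, List<Edit>>>> parts = new ArrayList<>();
            for (List<Edit> log : logs) {
                Callable<Map<String, List<Edit>>> task = () -> splitOne(log);
                parts.add(pool.submit(task));
            }
            Map<String, List<Edit>> merged = new TreeMap<>();
            for (Future<Map<String, List<Edit>>> part : parts) {
                part.get().forEach((key, edits) ->
                    merged.computeIfAbsent(key, k -> new ArrayList<>()).addAll(edits));
            }
            // Within a bucket, order edits by sequence number for replay.
            for (List<Edit> edits : merged.values()) {
                edits.sort(Comparator.comparingLong((Edit e) -> e.seq));
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("split interrupted", e);
        } catch (ExecutionException e) {
            // Fail the whole split instead of silently dropping edits.
            throw new IllegalStateException("split failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Edit> logA = List.of(new Edit("r1", "cf1", 3), new Edit("r2", "cf1", 1));
        List<Edit> logB = List.of(new Edit("r1", "cf1", 2), new Edit("r1", "cf2", 4));
        Map<String, List<Edit>> buckets = splitAll(List.of(logA, logB), 2);
        buckets.forEach((key, edits) ->
            System.out.println(key + " -> " + edits.size() + " edit(s)"));
    }
}
```

In this sketch a failure in any one log aborts the whole split, which is the simplest behaviour to reason about; a real implementation would need to choose between that and skipping bad logs, and would write each bucket to its own file rather than keep it in memory.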