[ 
https://issues.apache.org/jira/browse/HBASE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891797#action_12891797
 ] 

HBase Review Board commented on HBASE-1364:
-------------------------------------------

Message from: "Alex Newman" <newa...@cloudera.com>


bq.  On 2010-07-23 13:41:09, Jonathan Gray wrote:
bq.  > Hey Alex, this is looking good.  The master rewrite branch has a refactoring of ZooKeeperWrapper and general ZK usage inside HBase that conflicts with this pretty significantly.
bq.  > 
bq.  > Do you think you could pull the new methods and classes nested in ZooKeeperWrapper into a separate class of static methods?  If you need the instantiated instance of ZKW, pass it in as the first argument to the static methods?  That will make my life WAY easier when I have to merge the branch back into trunk.
bq.  > 
bq.  > Also gives an opportunity to have a class comment in the new class explaining the overall usage of zk.
bq.  > 
bq.  > Stuff like the names of the nodes can be left in the instantiated ZKW class since it makes sense to pull those in from the confs on instantiation.
bq.  > 
bq.  > Cool?  Let me know if you want an example.

Sounds good.


- Alex
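
For readers following the thread, the refactoring requested in the quoted review might look roughly like the sketch below. It is only an illustration under stated assumptions: `FakeZkWrapper` is a hypothetical in-memory stand-in for the instantiated ZooKeeperWrapper, and `SplitLogZkUtil`, its method names, the `UNASSIGNED` marker, and the `/hbase/splitlog` node name are invented for this example and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the instantiated ZooKeeperWrapper (ZKW). It keeps
// the node name read from the configuration and stores nodes in memory
// instead of talking to a real ZooKeeper ensemble.
final class FakeZkWrapper {
    private final String splitLogNode;
    private final Map<String, String> nodes = new TreeMap<>();

    FakeZkWrapper(String splitLogNode) {
        this.splitLogNode = splitLogNode;
    }

    // Node names stay on the instance, as suggested in the review.
    String getSplitLogNode() {
        return splitLogNode;
    }

    void createNode(String path, String data) {
        nodes.put(path, data);
    }

    // Returns the names of all nodes stored under the given parent path.
    List<String> listChildren(String parent) {
        String prefix = parent + "/";
        List<String> children = new ArrayList<>();
        for (String path : nodes.keySet()) {
            if (path.startsWith(prefix)) {
                children.add(path.substring(prefix.length()));
            }
        }
        return children;
    }
}

/**
 * Hypothetical class of static helpers for distributed log splitting.
 * Every method takes the wrapper instance as its first argument, so the
 * wrapper class itself does not have to grow new methods.
 */
final class SplitLogZkUtil {
    private SplitLogZkUtil() {
        // static methods only
    }

    // Registers a log file that still has to be split; returns the node path.
    static String registerLogToSplit(FakeZkWrapper zkw, String logName) {
        String path = zkw.getSplitLogNode() + "/" + logName;
        zkw.createNode(path, "UNASSIGNED");
        return path;
    }

    // Lists the log files currently registered for splitting.
    static List<String> listLogsToSplit(FakeZkWrapper zkw) {
        return zkw.listChildren(zkw.getSplitLogNode());
    }
}
```

Keeping the helpers static and passing the wrapper in means a later rewrite of the wrapper only has to preserve the few accessors the helpers call, which is the merge concern raised in the review.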


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.hbase.org/r/370/#review467
-----------------------------------------------------------





> [performance] Distributed splitting of regionserver commit logs
> ---------------------------------------------------------------
>
>                 Key: HBASE-1364
>                 URL: https://issues.apache.org/jira/browse/HBASE-1364
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: Alex Newman
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 1 (3), 1364-v2.patch, 1364.patch
>
>          Time Spent: 8h
>  Remaining Estimate: 0h
>
> HBASE-1008 has some improvements to our log splitting on regionserver crash, 
> but it needs to run even faster.
> (Below is from HBASE-1008)
> In the Bigtable paper, the split is distributed. If we're going to have 1000 
> logs, we need to distribute or at least multithread the splitting.
> 1. As is, regions starting up expect to find one reconstruction log only. 
> We need to make it so they pick up a bunch of edit logs, and it should be fine 
> that the logs are elsewhere in HDFS, in an output directory written by all split 
> participants, whether multithreaded or a MapReduce-like distributed process. 
> (Let's write our distributed sort first as an MR job so we learn what's involved; 
> the distributed sort should use MR framework pieces as much as possible.) On 
> startup, regions go to this directory and pick up the files written by split 
> participants, deleting and clearing the dir when all have been read in. Making 
> it so regions can take multiple logs as input can also make the split process 
> more robust than the current tenuous process, which loses all edits if it 
> doesn't make it to the end without error.
> 2. Each column family rereads the reconstruction log to find its edits. We need 
> to fix that. The split can sort the edits by column family so each store reads 
> only its edits.
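
The two points in the description can be illustrated with a small, single-process sketch: edits from several logs are grouped per region, and within each region per column family, so a store would only have to read its own edits. This is an assumption-laden illustration, not the HBase implementation; `LogEdit`, `LogSplitSketch`, and the field names are hypothetical, and real split output would be written to files in HDFS rather than held in memory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified log entry: which region and column family the
// edit belongs to, plus a sequence id that orders the edits.
final class LogEdit {
    final String region;
    final String family;
    final long seqId;

    LogEdit(String region, String family, long seqId) {
        this.region = region;
        this.family = family;
        this.seqId = seqId;
    }
}

final class LogSplitSketch {
    private LogSplitSketch() {
        // static methods only
    }

    // Takes any number of input logs and returns
    // region -> column family -> edits ordered by sequence id.
    static Map<String, Map<String, List<LogEdit>>> split(List<List<LogEdit>> logs) {
        Map<String, Map<String, List<LogEdit>>> byRegion = new TreeMap<>();
        for (List<LogEdit> log : logs) {
            for (LogEdit edit : log) {
                byRegion
                    .computeIfAbsent(edit.region, r -> new TreeMap<>())
                    .computeIfAbsent(edit.family, f -> new ArrayList<>())
                    .add(edit);
            }
        }
        // Edits for one family may come from several logs, so restore the
        // sequence order after grouping.
        for (Map<String, List<LogEdit>> families : byRegion.values()) {
            for (List<LogEdit> edits : families.values()) {
                edits.sort(Comparator.comparingLong((LogEdit e) -> e.seqId));
            }
        }
        return byRegion;
    }
}
```

In a distributed version each split participant would run this grouping over its share of the logs and write one output per region, which is why regions need to accept more than one reconstruction log on startup.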

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
