[jira] Commented: (HBASE-2066) Perf: parallelize puts

ryan rawson (JIRA) Tue, 09 Feb 2010 14:24:52 -0800

    [ https://issues.apache.org/jira/browse/HBASE-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831708#action_12831708 ]


ryan rawson commented on HBASE-2066:
------------------------------------

Looks like a basic thread concurrency problem here.

As for the performance issues: the current code uses ONE thread pool for 
everyone, statically sized at 10 threads.  The original code used a thread pool 
per HTable, sized to the number of regionservers.  That is impossible to do in 
HCM because of chicken-and-egg bootstrap problems (the call we'd use calls 
HCM.<init> which calls ...).  

Maybe the thread pool should move back into HTable to support parallelism 
better?  With 10 worker threads serving far more than 10 client threads, put 
performance is going to nosedive.
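
To make the sizing argument concrete, here is a rough sketch (not the HBase 
code) comparing a fixed pool of 10 workers with a pool sized to the number of 
regionservers.  Everything in it is an assumption for illustration: the class 
PoolSizingSketch, the fakeRpc and flushAll helpers, the 40 regionservers and 
the 50 ms per-RPC latency are all made up, and the "RPC" is just a sleep.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class PoolSizingSketch {

    // Hypothetical stand-in for one batched put RPC to one regionserver:
    // it only sleeps to simulate network and server time.
    static Callable<Integer> fakeRpc(final int serverId, final long rpcMillis) {
        return () -> {
            Thread.sleep(rpcMillis);
            return serverId;
        };
    }

    // Submits one fake RPC per regionserver, waits for all of them, and
    // returns the elapsed wall-clock time in milliseconds.
    static long flushAll(ExecutorService pool, int regionServers, long rpcMillis)
            throws Exception {
        long start = System.nanoTime();
        List<Future<Integer>> pending = new ArrayList<>();
        for (int s = 0; s < regionServers; s++) {
            pending.add(pool.submit(fakeRpc(s, rpcMillis)));
        }
        for (Future<Integer> f : pending) {
            f.get(); // blocks until that RPC has finished
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        int regionServers = 40; // assumed cluster size
        long rpcMillis = 50;    // assumed per-RPC latency

        // One pool capped at 10 workers versus one sized to the cluster.
        ExecutorService shared = Executors.newFixedThreadPool(10);
        ExecutorService perTable = Executors.newFixedThreadPool(regionServers);
        try {
            System.out.println("shared pool, 10 threads: "
                    + flushAll(shared, regionServers, rpcMillis) + " ms");
            System.out.println("per-table pool, " + regionServers + " threads: "
                    + flushAll(perTable, regionServers, rpcMillis) + " ms");
        } finally {
            shared.shutdown();
            perTable.shutdown();
        }
    }
}
```

With 10 workers the 40 simulated RPCs have to run in four waves, so one flush 
costs roughly four RPC latencies; the larger pool can issue them all at once.  
The sketch only has a single caller, so it probably understates the problem: 
when many client threads flush through the same 10 workers they also queue 
behind each other.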

> Perf: parallelize puts
> ----------------------
>
>                 Key: HBASE-2066
>                 URL: https://issues.apache.org/jira/browse/HBASE-2066
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: ryan rawson
>            Assignee: ryan rawson
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2066-branch.patch, HBASE-2066-v2.patch, 
> TestBatchPut.java
>
>
> Right now with large region count tables, the write buffer is not efficient.  
> This is because we issue potentially N RPCs, where N is the # of regions in 
> the table.  When N gets large (let's say 1200+) things become very slow.
> Instead, if we batch things up using a different RPC and use thread pools, we 
> could see higher performance!
> This requires an RPC change...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
