[jira] Commented: (HBASE-2066) Perf: parallelize puts

ryan rawson (JIRA) Wed, 20 Jan 2010 14:06:20 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803014#action_12803014
 ]


ryan rawson commented on HBASE-2066:
------------------------------------

This is much less ambitious than HBASE-1845 and seeks to optimize the Put case 
only. 

One of the problems with the original HBASE-1845 patch is that it requires a 
new API to take advantage of it, and thus requires porting code.  Furthermore, 
HTable already has handy things like write buffering, write buffer size 
settings, etc.  I started with the 1845 patch and realized we also needed a 
way to parallelize puts in the normal API.  This is much simpler than 1845 
because we don't have to line up return codes (there are no return codes for 
puts, just exceptions due to temporary issues).
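
To illustrate the idea, here is a minimal sketch of parallelizing buffered 
puts with a thread pool. It is not the code from the attached patch and it 
does not use the HBase client API: ParallelPutSketch, BatchSender, 
groupByServer and flush are all hypothetical names, the put-to-server lookup 
is passed in as a plain function, and the "RPC" is just a callback. It only 
shows why puts are easy to parallelize: there is nothing to line up except 
which batches threw.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch, not HBase code: all names here are made up for illustration.
class ParallelPutSketch {

    /** Stand-in for one batched RPC to a single server (assumption, not a real HBase interface). */
    interface BatchSender<P> {
        void send(String server, List<P> batch) throws Exception;
    }

    /** Group buffered puts by the server that hosts them; the lookup function is assumed. */
    static <P> Map<String, List<P>> groupByServer(List<P> puts, Function<P, String> serverOf) {
        Map<String, List<P>> groups = new LinkedHashMap<>();
        for (P put : puts) {
            groups.computeIfAbsent(serverOf.apply(put), k -> new ArrayList<>()).add(put);
        }
        return groups;
    }

    /**
     * Send one batch per server in parallel and wait for all of them.
     * Puts have no return codes, so the only result is the list of puts
     * whose batch threw; the caller can keep those buffered and retry.
     */
    static <P> List<P> flush(List<P> puts, Function<P, String> serverOf,
                             BatchSender<P> sender, ExecutorService pool)
            throws InterruptedException {
        Map<String, List<P>> groups = groupByServer(puts, serverOf);
        Map<String, Future<Void>> pending = new LinkedHashMap<>();
        for (Map.Entry<String, List<P>> e : groups.entrySet()) {
            Callable<Void> task = () -> {
                sender.send(e.getKey(), e.getValue());
                return null;
            };
            pending.put(e.getKey(), pool.submit(task));
        }
        List<P> failed = new ArrayList<>();
        for (Map.Entry<String, Future<Void>> e : pending.entrySet()) {
            try {
                e.getValue().get();
            } catch (ExecutionException ex) {
                // The whole batch for this server failed; hand it back for retry.
                failed.addAll(groups.get(e.getKey()));
            }
        }
        return failed;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Pretend the first character of each put names its server.
        List<String> puts = Arrays.asList("a1", "b1", "a2", "c1");
        List<String> failed = flush(puts, p -> p.substring(0, 1), (server, batch) -> {
            if (server.equals("b")) {
                throw new java.io.IOException("temporary issue on " + server);
            }
        }, pool);
        pool.shutdown();
        System.out.println("puts to retry: " + failed);
    }
}
```

The caller keeps the same buffered-write usage it has today; only the flush 
step fans out. Handing back whole failed batches is a simplification made for 
this sketch.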

In short: this is a drop-in replacement and makes things go fast now. 
HBASE-1845 requires a new API.

> Perf: parallelize puts
> ----------------------
>
>                 Key: HBASE-2066
>                 URL: https://issues.apache.org/jira/browse/HBASE-2066
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: ryan rawson
>            Assignee: ryan rawson
>             Fix For: 0.21.0
>
>         Attachments: HBASE-2066-branch.patch
>
>
> Right now, with large-region-count tables, the write buffer is not efficient.  
> This is because we issue potentially N RPCs, where N is the # of regions in 
> the table.  When N gets large (let's say 1200+) things become sloowwwww.
> Instead, if we batch things up using a different RPC and use thread pools, we 
> could see higher performance!
> This requires an RPC change...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
