[ 
https://issues.apache.org/jira/browse/HBASE-6295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630828#comment-13630828
 ] 

stack commented on HBASE-6295:
------------------------------

[~liochon] Nicolas, does it address the following issue?  It is not clear to me
that it does:

Cluster has N regionservers.  It has M regions well distributed across the 
cluster.  There are X clients.

All clients access region Y once every other second or so for some small bit of 
data.

Something happens such that responses from region Y get REALLY SLOW.

Eventually all clients hang waiting on their responses from region Y (each one 
is taking a REALLY LONG time).  Because the responses take so long to come 
back, every client ends up fully occupied by its outstanding requests against 
region Y.  With the clients stuck on region Y, the application can get no data 
out of the cluster; i.e. 99.99% of the regions are online, but one REALLY SLOW 
region can make it so you can't get at the rest of the data.

An actual scenario is an Apache frontend with a fixed number of workers, say 
100.  This Apache webserver goes to a Thrift server.  The Thrift server is 
hosting the HBase client.  If a region is having a problem, we could back up 
such that all of the fixed number of Apache workers are blocked waiting on a 
reply out of this one slow region.

Will this patch help with the above scenario?
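
For illustration only, the worker-exhaustion effect described above can be 
reproduced with a plain fixed-size thread pool.  This is a minimal sketch, not 
HBase, Thrift, or Apache code: the class name, the method name, and the latch 
standing in for "region Y is stuck" are all assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical simulation: a fixed pool of workers, some of which are
// blocked on calls to one stuck region.
class SlowRegionStarvation {

    /**
     * Returns true if a request to a healthy region completes within
     * timeoutMs while slowCalls requests to the stuck region are outstanding
     * on a pool of the given number of workers.
     */
    static boolean healthyRequestCompletes(int workers, int slowCalls, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        // Stands in for "region Y finally answers"; only released in finally.
        CountDownLatch regionYRecovers = new CountDownLatch(1);
        try {
            for (int i = 0; i < slowCalls; i++) {
                // Each call to the slow region pins one worker until recovery.
                pool.submit(() -> { regionYRecovers.await(); return null; });
            }
            // A request for data that lives on a perfectly healthy region.
            Future<String> healthy = pool.submit(() -> "row from a healthy region");
            healthy.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // no free worker ever picked the request up
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            regionYRecovers.countDown(); // unblock the workers so the pool can exit
            pool.shutdown();
        }
    }
}
```

With 4 workers and 4 outstanding calls to the stuck region, the healthy request 
only sits in the queue and times out; with 3 such calls, one worker is still 
free and it completes.  The region itself being healthy makes no difference 
once every worker is pinned.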
                
> Possible performance improvement in client batch operations: presplit and 
> send in background
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6295
>                 URL: https://issues.apache.org/jira/browse/HBASE-6295
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, Performance
>    Affects Versions: 0.95.2
>            Reporter: Nicolas Liochon
>            Assignee: Nicolas Liochon
>              Labels: noob
>         Attachments: 6295.v1.patch, 6295.v2.patch, 6295.v3.patch
>
>
> today batch algo is:
> {noformat}
> for Operation o: List<Op>{
>   add o to todolist
>   if todolist > maxsize or o last in list
>     split todolist per location
>     send split lists to region servers
>     clear todolist
>     wait
> }
> {noformat}
> We could:
> - create the final object immediately instead of an intermediate array
> - split per location immediately
> - instead of sending when the list as a whole is full, send when there is 
> enough data for a single location
> It would be:
> {noformat}
> for Operation o: List<Op>{
>   get location
>   add o to todo location.todolist
>   if (location.todolist > maxLocationSize)
>     send location.todolist to region server 
>     clear location.todolist
>     // don't wait, continue the loop
> }
> send remaining
> wait
> {noformat}
> It's not trivial to write once you add error management: the retried list 
> must be shared with the operations added to the todolist. But it's doable.
> It's interesting mainly for 'big' writes.
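
The per-location loop proposed in the quoted description could look roughly 
like the sketch below.  It is not taken from the attached patches: the class 
name, the `locator` function (operation to location name) and the `sender` 
callback (assumed to hand the list off without blocking) are hypothetical, a 
list is sent as soon as it reaches `maxLocationSize`, and the error management 
and final wait mentioned in the description are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical sketch of "split per location immediately, send when one
// location has enough data". No retries, no waiting on results.
class PerLocationBatcher<Op> {
    private final int maxLocationSize;
    private final Function<Op, String> locator;         // assumed: op -> location name
    private final BiConsumer<String, List<Op>> sender;  // assumed: non-blocking send
    // One pending list per location; LinkedHashMap keeps flush order predictable.
    private final Map<String, List<Op>> todo = new LinkedHashMap<>();

    PerLocationBatcher(int maxLocationSize,
                       Function<Op, String> locator,
                       BiConsumer<String, List<Op>> sender) {
        this.maxLocationSize = maxLocationSize;
        this.locator = locator;
        this.sender = sender;
    }

    /** Adds one operation; sends that location's list once it is full. */
    void add(Op o) {
        String location = locator.apply(o);
        List<Op> list = todo.computeIfAbsent(location, k -> new ArrayList<>());
        list.add(o);
        if (list.size() >= maxLocationSize) {
            todo.remove(location);        // "clear location.todolist"
            sender.accept(location, list);
            // don't wait, continue with the next operation
        }
    }

    /** "send remaining": pushes out every partially filled list. */
    void flush() {
        for (Map.Entry<String, List<Op>> e : todo.entrySet()) {
            sender.accept(e.getKey(), e.getValue());
        }
        todo.clear();
    }
}
```

A caller would invoke `add` for each operation in the batch, then `flush`, and 
only then wait on whatever the sender returned or recorded.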

