[jira] [Commented] (HBASE-12790) Support fairness across parallelized scans

ramkrishna.s.vasudevan (JIRA) Tue, 03 Nov 2015 10:13:12 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987759#comment-14987759
 ]


ramkrishna.s.vasudevan commented on HBASE-12790:
------------------------------------------------

Thanks for all the comments on the RB. I had an offline discussion with Andy, 
James and Anoop, and would like to summarize it here.
We will extend the groupId concept to all client requests. That includes 
scans, gets, MutateRequest, MultiRequest, Bulkloadrequest etc.
To do this we expose the groupId API at the Operation level. This will 
allow every Put, Delete, Increment, Append, Get and Scan to carry a grouping id. 
At the Rpc layer, each Scan or Get maps one to one to an Rpc request, so the 
groupId set on an individual scan/get can be used directly to do the 
round robin.
A MultiRequest, however, can carry 'n' actions such as Puts, Deletes, 
Gets etc., and all of them are mapped to one MultiRequest. Since we expose 
groupId at the Operation level, different actions can have 
different groupIds set, but at the Rpc layer we take the first groupId as the id 
for the entire MultiRequest. I had a concern with this part: users will 
be allowed to set different groupIds, but internally we will use only one 
of them, and this is completely hidden from the user. I thought that 
might confuse users. Overall, the groupId is not a 
parameter that directly affects the user's results; it only influences how the 
server handles the request. 
I can update the patch based on the above feedback and discussion. Any further 
questions and feedback are welcome!
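
To make the MultiRequest concern concrete, here is a minimal sketch of the 
"first groupId wins" rule. Everything in it is hypothetical and for 
illustration only: the Action class, the effectiveGroupId and 
distinctGroupIds helpers, and the reading of "first groupId" as "the first 
action that has one set" are assumptions, not the API in the patch.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MultiRequestGroupIds {
    /** Hypothetical stand-in for one Put/Delete/Get inside a MultiRequest. */
    static final class Action {
        final String name;
        final String groupId; // null means no groupId was set on this action

        Action(String name, String groupId) {
            this.name = name;
            this.groupId = groupId;
        }
    }

    /** Assumed rule: the first groupId that is set stands for the whole batch. */
    static String effectiveGroupId(List<Action> actions) {
        for (Action a : actions) {
            if (a.groupId != null) {
                return a.groupId;
            }
        }
        return null; // no action in the batch carries a groupId
    }

    /** Distinct groupIds in the batch, in encounter order. */
    static Set<String> distinctGroupIds(List<Action> actions) {
        Set<String> ids = new LinkedHashSet<>();
        for (Action a : actions) {
            if (a.groupId != null) {
                ids.add(a.groupId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Action> batch = Arrays.asList(
            new Action("put", "g1"),
            new Action("delete", "g2"),
            new Action("get", "g1"));
        // The batch is scheduled under g1; g2 is silently ignored.
        System.out.println("effective groupId: " + effectiveGroupId(batch));
        System.out.println("groupIds set by user: " + distinctGroupIds(batch));
    }
}
```

In this sketch, a batch with more than one distinct groupId is exactly the 
case where the user's setting is partly ignored, so a check like 
distinctGroupIds(batch).size() > 1 is one place where a client could log a 
warning.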


> Support fairness across parallelized scans
> ------------------------------------------
>
>                 Key: HBASE-12790
>                 URL: https://issues.apache.org/jira/browse/HBASE-12790
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: ramkrishna.s.vasudevan
>              Labels: Phoenix
>         Attachments: AbstractRoundRobinQueue.java, HBASE-12790.patch, 
> HBASE-12790_1.patch, HBASE-12790_5.patch, HBASE-12790_callwrapper.patch, 
> HBASE-12790_trunk_1.patch, PHOENIX_4.5.3-HBase-0.98-2317-SNAPSHOT.zip
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in 
> getting back results. This can lead to starvation with a loaded cluster and 
> interleaved scans, since the RPC queue will be ordered and processed on a 
> FIFO basis. For example, suppose two clients, A and B, submit largish 
> scans at the same time. Say each scan is broken down into 100 scans by the 
> client (broken down into equal depth chunks along the row key), and the 100 
> scans of client A are queued first, followed immediately by the 100 scans of 
> client B. In this case, client B will be starved out of getting any results 
> back until the scans for client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead 
> of the standard FIFO queue. The queue to be used could be (maybe it already 
> is) configurable based on a new config parameter. Using this queue would 
> require the client to have the same identifier for all of the 100 parallel 
> scans that represent a single logical scan from the client's point of view. 
> With this information, the round robin queue would pick off a task from the 
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent 
> starvation over interleaved parallelized scans.
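
The round robin behaviour described above can be sketched roughly as 
follows. This is not the attached AbstractRoundRobinQueue; the class name 
GroupRoundRobinQueue and its offer/poll methods are hypothetical, and the 
sketch is single-threaded, whereas a real RPC call queue would need to be a 
thread-safe BlockingQueue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not thread-safe. */
class GroupRoundRobinQueue<T> {
    // One FIFO sub-queue per group id (e.g. one per logical scan).
    private final Map<String, Deque<T>> tasksByGroup = new HashMap<>();
    // Group ids that currently have pending tasks, in the order they get served.
    private final Deque<String> rotation = new ArrayDeque<>();

    /** Adds a task to the sub-queue of its group. */
    void offer(String groupId, T task) {
        Deque<T> queue = tasksByGroup.get(groupId);
        if (queue == null) {
            queue = new ArrayDeque<>();
            tasksByGroup.put(groupId, queue);
            rotation.addLast(groupId); // new group joins the end of the rotation
        }
        queue.addLast(task);
    }

    /** Takes one task from the next group in the rotation, or null if empty. */
    T poll() {
        String groupId = rotation.pollFirst();
        if (groupId == null) {
            return null;
        }
        Deque<T> queue = tasksByGroup.get(groupId);
        T task = queue.pollFirst();
        if (queue.isEmpty()) {
            tasksByGroup.remove(groupId); // group is done, drop it
        } else {
            rotation.addLast(groupId);    // group goes to the back of the line
        }
        return task;
    }

    public static void main(String[] args) {
        GroupRoundRobinQueue<String> queue = new GroupRoundRobinQueue<>();
        // Client A queues all of its chunks first, then client B.
        for (int i = 0; i < 3; i++) queue.offer("A", "A-" + i);
        for (int i = 0; i < 3; i++) queue.offer("B", "B-" + i);
        String task;
        while ((task = queue.poll()) != null) {
            System.out.println(task); // chunks of A and B alternate
        }
    }
}
```

With a plain FIFO queue the same input would drain every chunk of A before 
the first chunk of B; here B's first chunk is served second, which is the 
fairness property the issue asks for.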



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
