[ 
https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999067#comment-14999067
 ] 

stack commented on HBASE-12790:
-------------------------------

bq. Glad to see you're working on that over at Cloudera.  Hopefully you're 
testing with Phoenix too.

You did not read it. It is not "cloudera" work. It is apache hbase work. See 
listed JIRAs.  It is a summary of the state of scheduling art in apache hbase 
as of a while ago.

bq. I don't think having an extra optional attribute on an operation adds "a 
bunch of new complexity". That's fine if we disagree.

Andrew's considered response 'On complexity' plainly left no mark and you can't 
have reviewed the attached patch and comments. Only a superficial engagement 
with this issue and what all is involved could result in a characterization of 
what is going on here as just "having an extra optional attribute" (or that the 
cited, pertinent blog post is 'cloudera' work).

bq. Andrew Purtell made the point that if you're round robining on reads you 
should be consistent and do it on writes too - I think this is a fair point. 
Our immediate need is on the read side - I'll share our data when the analysis 
is complete... Our requirement is simple: the latency of point lookups and 
small-ish scans shouldn't be impacted by other workloads on the cluster. 
Whatever implementation you come up with is fine by us.

Your requirement changes every time you comment and you do not know what you 
are asking for.

Let me try and write something up and situate it relative to work already done.

> Support fairness across parallelized scans
> ------------------------------------------
>
>                 Key: HBASE-12790
>                 URL: https://issues.apache.org/jira/browse/HBASE-12790
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: ramkrishna.s.vasudevan
>              Labels: Phoenix
>         Attachments: AbstractRoundRobinQueue.java, HBASE-12790.patch, 
> HBASE-12790_1.patch, HBASE-12790_5.patch, HBASE-12790_callwrapper.patch, 
> HBASE-12790_trunk_1.patch, PHOENIX_4.5.3-HBase-0.98-2317-SNAPSHOT.zip
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in 
> getting back results. This can lead to starvation with a loaded cluster and 
> interleaved scans, since the RPC queue will be ordered and processed on a 
> FIFO basis. For example, suppose two clients, A and B, submit largish 
> scans at the same time. Say each scan is broken down into 100 scans by the 
> client (broken down into equal depth chunks along the row key), and the 100 
> scans of client A are queued first, followed immediately by the 100 scans of 
> client B. In this case, client B will be starved out of getting any results 
> back until the scans for client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead 
> of the standard FIFO queue. The queue to be used could be (maybe it already 
> is) configurable based on a new config parameter. Using this queue would 
> require the client to have the same identifier for all of the 100 parallel 
> scans that represent a single logical scan from the client's point of view. 
> With this information, the round robin queue would pick off a task from the 
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent 
> starvation over interleaved parallelized scans.
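
To make the description above concrete, here is a minimal sketch of round-robin dequeueing across scan groups. It is not the attached AbstractRoundRobinQueue and not any HBase API: the class name RoundRobinSketch, the String group id, and the offer/poll methods are all hypothetical, and the sketch is single-threaded, whereas a real RPC queue would need to be a thread-safe BlockingQueue.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical illustration only; not the attached AbstractRoundRobinQueue. */
class RoundRobinSketch<T> {
    // One FIFO sub-queue per group id; map order is the current rotation order.
    private final LinkedHashMap<String, Queue<T>> groups = new LinkedHashMap<>();

    /** Enqueue a task under the id shared by all chunks of one logical scan. */
    void offer(String groupId, T task) {
        groups.computeIfAbsent(groupId, k -> new ArrayDeque<>()).add(task);
    }

    /**
     * Take one task from the group at the head of the rotation, then move
     * that group to the back if it still has work. Returns null when empty.
     */
    T poll() {
        Iterator<Map.Entry<String, Queue<T>>> it = groups.entrySet().iterator();
        if (!it.hasNext()) {
            return null;
        }
        Map.Entry<String, Queue<T>> head = it.next();
        String groupId = head.getKey();
        Queue<T> tasks = head.getValue();
        it.remove(); // drop the group from the front of the rotation
        T task = tasks.poll();
        if (!tasks.isEmpty()) {
            groups.put(groupId, tasks); // re-append at the back
        }
        return task;
    }
}
```

With client A's chunks queued ahead of client B's, poll() alternates between the two groups rather than draining A first, which is the starvation case the description is about. Within a single group, tasks still come out in FIFO order.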



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
