[
https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900259#comment-15900259
]
Samarth Jain edited comment on HBASE-12790 at 3/7/17 10:07 PM:
---------------------------------------------------------------
Thanks for the pointer [~anoop.hbase]. I see that there is already an executor,
RWQueueRpcExecutor, that gets hold of the ScanRequest in its dispatch method.
{code}
private boolean isScanRequest(final RequestHeader header, final Message param) {
  if (param instanceof ScanRequest) {
    // The first scan request will be executed as a "short read"
    ScanRequest request = (ScanRequest)param;
    return request.hasScannerId();
  }
  return false;
}

@Override
public boolean dispatch(final CallRunner callTask) throws InterruptedException {
  ...
  if (numScanQueues > 0 && isScanRequest(call.getHeader(), call.param)) {
    queueIndex = numWriteQueues + numReadQueues + scanBalancer.getNextQueue();
  }
  ...
{code}
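To make the queue arithmetic in dispatch concrete, here is a minimal, self-contained sketch of the index layout implied by the snippet above: write queues first, then read queues, then scan queues. QueueLayoutSketch, Kind, and the round-robin counters are hypothetical names for illustration only. They are not HBase classes, and the balancer used by RWQueueRpcExecutor may pick queues differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the queue layout; not the HBase class. */
final class QueueLayoutSketch {
    enum Kind { WRITE, READ, SCAN }

    private final int numWriteQueues;
    private final int numReadQueues;
    private final int numScanQueues;
    // Assumption: simple round-robin counters stand in for the real balancers.
    private final AtomicInteger nextWrite = new AtomicInteger();
    private final AtomicInteger nextRead = new AtomicInteger();
    private final AtomicInteger nextScan = new AtomicInteger();

    QueueLayoutSketch(int numWriteQueues, int numReadQueues, int numScanQueues) {
        if (numWriteQueues < 1 || numReadQueues < 1 || numScanQueues < 0) {
            throw new IllegalArgumentException("need at least one write and one read queue");
        }
        this.numWriteQueues = numWriteQueues;
        this.numReadQueues = numReadQueues;
        this.numScanQueues = numScanQueues;
    }

    /** Returns an index into one flat list laid out as [write..., read..., scan...]. */
    int queueIndexFor(Kind kind) {
        if (kind == Kind.WRITE) {
            return next(nextWrite, numWriteQueues);
        }
        if (kind == Kind.SCAN && numScanQueues > 0) {
            return numWriteQueues + numReadQueues + next(nextScan, numScanQueues);
        }
        // Reads, and scans when no scan queues are configured, share the read queues.
        return numWriteQueues + next(nextRead, numReadQueues);
    }

    private static int next(AtomicInteger counter, int size) {
        // floorMod keeps the result non-negative even after the counter overflows.
        return Math.floorMod(counter.getAndIncrement(), size);
    }

    public static void main(String[] args) {
        QueueLayoutSketch layout = new QueueLayoutSketch(2, 3, 2);
        System.out.println("write -> " + layout.queueIndexFor(Kind.WRITE));
        System.out.println("read  -> " + layout.queueIndexFor(Kind.READ));
        System.out.println("scan  -> " + layout.queueIndexFor(Kind.SCAN));
    }
}
```

With 2 write, 3 read, and 2 scan queues, scans land on indexes 5 and 6 only, so they cannot crowd out short reads or writes.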
So yes, there is a way forward: use a scan attribute for this purpose, without
having to add an API to Operation. Having said that, looking at the
isWriteRequest method in the same class, I see that this approach can get
gnarly, brittle, and inefficient, since the server may have to walk the actions
of a MultiRequest just to classify it.
{code}
private boolean isWriteRequest(final RequestHeader header, final Message param) {
  // TODO: Is there a better way to do this?
  if (param instanceof MultiRequest) {
    MultiRequest multi = (MultiRequest)param;
    for (RegionAction regionAction : multi.getRegionActionList()) {
      for (Action action: regionAction.getActionList()) {
        if (action.hasMutation()) {
          return true;
        }
      }
    }
  }
  ...
{code}
So the ideal would be a generic enough API that lets clients mark read/write
requests for whatever they want to do with them on the server side.
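As a rough illustration of what such a marker could look like, the sketch below has the client tag a request once and the server read the tag with a single lookup instead of inspecting the payload. Everything here is hypothetical: Request, RequestKind, KIND_ATTRIBUTE, mark, and kindOf are made-up names for this sketch and are not part of the HBase client or server API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a client-set request marker; not an HBase API. */
final class RequestMarkerSketch {
    enum RequestKind { READ, WRITE, SCAN, UNKNOWN }

    /** Assumed attribute name, chosen only for this example. */
    static final String KIND_ATTRIBUTE = "_request_kind_";

    /** Minimal stand-in for an operation that carries string attributes. */
    static final class Request {
        private final Map<String, String> attributes = new HashMap<>();

        Request setAttribute(String name, String value) {
            attributes.put(name, value);
            return this;
        }

        String getAttribute(String name) {
            return attributes.get(name);
        }
    }

    /** Client side: tag the request once, when it is built. */
    static Request mark(Request request, RequestKind kind) {
        return request.setAttribute(KIND_ATTRIBUTE, kind.name());
    }

    /** Server side: a single lookup, with no walk over the request contents. */
    static RequestKind kindOf(Request request) {
        String value = request.getAttribute(KIND_ATTRIBUTE);
        if (value == null) {
            return RequestKind.UNKNOWN; // unmarked requests need a fallback policy
        }
        try {
            return RequestKind.valueOf(value);
        } catch (IllegalArgumentException e) {
            return RequestKind.UNKNOWN; // unrecognized marker values are not trusted
        }
    }

    public static void main(String[] args) {
        Request scan = mark(new Request(), RequestKind.SCAN);
        System.out.println(kindOf(scan));
        System.out.println(kindOf(new Request()));
    }
}
```

The UNKNOWN fallback matters in this design: older clients will not set the marker, so the server would still need a default path for untagged requests.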
> Support fairness across parallelized scans
> ------------------------------------------
>
> Key: HBASE-12790
> URL: https://issues.apache.org/jira/browse/HBASE-12790
> Project: HBase
> Issue Type: New Feature
> Reporter: James Taylor
> Assignee: ramkrishna.s.vasudevan
> Labels: Phoenix
> Attachments: AbstractRoundRobinQueue.java, HBASE-12790_1.patch,
> HBASE-12790_5.patch, HBASE-12790_callwrapper.patch, HBASE-12790.patch,
> HBASE-12790_trunk_1.patch, PHOENIX_4.5.3-HBase-0.98-2317-SNAPSHOT.zip
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in
> getting back results. This can lead to starvation with a loaded cluster and
> interleaved scans, since the RPC queue will be ordered and processed on a
> FIFO basis. For example, suppose two clients, A and B, submit largish
> scans at the same time. Say each scan is broken down into 100 scans by the
> client (broken down into equal depth chunks along the row key), and the 100
> scans of client A are queued first, followed immediately by the 100 scans of
> client B. In this case, client B will be starved out of getting any results
> back until the scans for client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead
> of the standard FIFO queue. The queue to be used could be (maybe it already
> is) configurable based on a new config parameter. Using this queue would
> require the client to have the same identifier for all of the 100 parallel
> scans that represent a single logical scan from the client's point of view.
> With this information, the round robin queue would pick off a task from the
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent
> starvation over interleaved parallelized scans.
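
The round-robin idea in the description above can be sketched as follows. This is not the attached AbstractRoundRobinQueue; it is a minimal illustration under stated assumptions: RoundRobinQueueSketch is a hypothetical name, the group id is a plain String supplied by the client, and the non-blocking poll() returns null when nothing is queued.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch: serves one task per group in turn instead of strict FIFO. */
final class RoundRobinQueueSketch<T> {
    // Insertion-ordered map from group id to that group's pending tasks.
    // The first entry is always the group whose turn comes next.
    private final LinkedHashMap<String, Queue<T>> groups = new LinkedHashMap<>();

    /** Adds a task under the identifier shared by all chunks of one logical scan. */
    synchronized void offer(String groupId, T task) {
        groups.computeIfAbsent(groupId, id -> new ArrayDeque<>()).add(task);
    }

    /** Takes the next task from the group at the head, then moves that group to the tail. */
    synchronized T poll() {
        Iterator<Map.Entry<String, Queue<T>>> it = groups.entrySet().iterator();
        if (!it.hasNext()) {
            return null; // nothing queued
        }
        Map.Entry<String, Queue<T>> head = it.next();
        it.remove();
        Queue<T> tasks = head.getValue();
        T task = tasks.poll();
        if (!tasks.isEmpty()) {
            // Re-inserting appends the group at the end of the iteration order.
            groups.put(head.getKey(), tasks);
        }
        return task;
    }

    public static void main(String[] args) {
        RoundRobinQueueSketch<String> queue = new RoundRobinQueueSketch<>();
        for (int i = 1; i <= 3; i++) {
            queue.offer("A", "A" + i);
        }
        for (int i = 1; i <= 2; i++) {
            queue.offer("B", "B" + i);
        }
        for (String t = queue.poll(); t != null; t = queue.poll()) {
            System.out.println(t);
        }
    }
}
```

Even though all of client A's chunks were queued first, client B's first chunk is served second rather than after all of A's, which is the starvation fix the description asks for.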