[ https://issues.apache.org/jira/browse/HBASE-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-12841:
-----------------------------------
    Description: 
The ClientBackoffPolicy interface currently has a single method:
{code}
public interface ClientBackoffPolicy {
  public long getBackoffTime(ServerName serverName, byte[] region, 
ServerStatistics stats);
}
{code}

A backoff policy can only specify the amount of delay to inject before 
submitting the request(s) to a given server. 

In the current implementation, we submit runnables to AsyncProcess that sleep 
for the specified delay period before proceeding. This consumes task slots that 
could otherwise be performing useful work. AsyncProcess limits the number of 
outstanding tasks per region to "hbase.client.max.perregion.tasks" (default 1) 
and per server to "hbase.client.max.perserver.tasks" (default 2). Tasks will be 
accepted and queued up to "hbase.client.max.total.tasks" (default 100), after 
which we start globally blocking submissions by waiting on a monitor.
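These limits are ordinary client-side configuration. For reference, setting them explicitly (to the default values stated above) would look like the following hbase-site.xml fragment:

```xml
<!-- Client-side AsyncProcess concurrency limits (values shown are the defaults) -->
<property>
  <name>hbase.client.max.perregion.tasks</name>
  <value>1</value>
</property>
<property>
  <name>hbase.client.max.perserver.tasks</name>
  <value>2</value>
</property>
<property>
  <name>hbase.client.max.total.tasks</name>
  <value>100</value>
</property>
```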

Sophisticated applications could benefit from an alternate strategy that 
immediately rejects new work. Rather than returning a backoff interval, the 
policy could return a value from 0.0 to 1.0, or a percentage from 0 to 100, 
expressing the likelihood of task rejection. Immediately rejected tasks won't 
consume task slots or "stall" by sleeping. Overall the client will be less 
likely to hit the global limit. Applications using APIs like Table#batch or 
Table#batchCallback will get control back faster, can determine which 
operations failed due to pushback, and can deal intelligently with request 
ordering and resubmission/retry concerns. In network queuing this strategy is 
known as Random Early Drop (or Random Early Detection).
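A minimal sketch of what such a policy could look like, assuming a linear RED-style ramp. All names here are hypothetical, and ServerStatistics is replaced by a simplified stand-in type so the sketch is self-contained; this is not the actual HBase client API:

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of a probabilistic backoff policy in the spirit of
// Random Early Drop. The real ClientBackoffPolicy and ServerStatistics
// live in the HBase client; ServerLoad below is a simplified stand-in.
public class RedBackoffPolicySketch {

  // Stand-in for the per-server statistics the policy would consult.
  static class ServerLoad {
    final double loadFraction; // 0.0 (idle) .. 1.0 (saturated)
    ServerLoad(double f) { this.loadFraction = f; }
  }

  // Rather than a backoff time, return a rejection probability in [0.0, 1.0].
  // Below a minimum threshold, never reject; above it, ramp linearly to 1.0.
  // The 0.5 threshold is an arbitrary illustrative tuning, not a real default.
  static double getRejectionProbability(ServerLoad stats) {
    final double minThreshold = 0.5;
    double load = stats.loadFraction;
    if (load <= minThreshold) return 0.0;
    if (load >= 1.0) return 1.0;
    return (load - minThreshold) / (1.0 - minThreshold);
  }

  // The client would draw once per submission and fail the operation
  // immediately, instead of queuing a runnable that sleeps in a task slot.
  static boolean shouldReject(ServerLoad stats) {
    return ThreadLocalRandom.current().nextDouble() < getRejectionProbability(stats);
  }

  public static void main(String[] args) {
    System.out.println(getRejectionProbability(new ServerLoad(0.25))); // idle server: 0.0
    System.out.println(getRejectionProbability(new ServerLoad(1.0)));  // saturated: 1.0
  }
}
```

An immediately rejected operation would surface to Table#batch callers as a failed action, letting the application decide how to reorder or resubmit rather than having the client thread pool absorb the delay.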


> ClientBackoffPolicies should support immediate rejection of submitted ops
> -------------------------------------------------------------------------
>
>                 Key: HBASE-12841
>                 URL: https://issues.apache.org/jira/browse/HBASE-12841
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)