[jira] [Updated] (HBASE-13090) Progress heartbeats for long running scanners

Jonathan Lawlor (JIRA) Fri, 17 Apr 2015 10:32:45 -0700

     [ 
https://issues.apache.org/jira/browse/HBASE-13090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jonathan Lawlor updated HBASE-13090:
------------------------------------
    Release Note: 
Previously, there was no way to enforce a time limit on scan RPC requests. The 
server would receive a scan RPC request and take as much time as it needed to 
accumulate enough results to reach a limit or exhaust the region. The problem 
with this approach was that, in the case of a very selective scan, the 
processing of the scan could take too long and cause client-side timeouts.

With this fix, the server now enforces a time limit on the execution of 
scan RPC requests. When a scan RPC request arrives at the server, the time 
limit is calculated as half of whichever timeout value is more restrictive 
between the configurations "hbase.client.scanner.timeout.period" and 
"hbase.rpc.timeout". When the time limit is reached, the server returns 
whatever results it has accumulated up to that point. The results may be empty.
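
As a rough illustration of the calculation described above (not the actual 
server code), the time limit can be sketched as half of the smaller of the two 
timeouts. The class and method names are hypothetical, and both timeouts are 
assumed to be positive millisecond values.

```java
final class ScanTimeLimitSketch {
    /**
     * Returns half of the more restrictive (smaller) of the two timeouts,
     * in milliseconds. Hypothetical helper, not an HBase API.
     */
    static long timeLimitMillis(long scannerTimeoutMillis, long rpcTimeoutMillis) {
        return Math.min(scannerTimeoutMillis, rpcTimeoutMillis) / 2;
    }

    public static void main(String[] args) {
        // Illustrative values only: scanner timeout 60 s, RPC timeout 30 s.
        System.out.println(timeLimitMillis(60_000L, 30_000L)); // prints 15000
    }
}
```

Halving the timeout presumably leaves room for the response to travel back to 
the client before the client-side timeout fires.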

To ensure that timeout checks do not occur too often (which would hurt the 
performance of scans), the configuration 
"hbase.cells.scanned.per.heartbeat.check" has been introduced. This 
configuration controls how often System.currentTimeMillis() is called to update 
the progress towards the time limit. Currently, the default value of this 
configuration is 10000. Specifying a smaller value will provide a tighter 
bound on the time limit, but may hurt scan performance due to the higher 
frequency of calls to System.currentTimeMillis().
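
The trade-off above can be sketched with a loop that consults the clock only 
once every N cells. This is a simplified model, not the region server's scan 
loop: the class, method, and parameter names are hypothetical, and the clock is 
passed in as a LongSupplier so the behaviour can be observed without real 
waiting.

```java
import java.util.function.LongSupplier;

final class HeartbeatCheckSketch {
    /**
     * Processes up to totalCells cells, reading the clock only after every
     * cellsPerCheck cells (assumed > 0). Returns how many cells were
     * processed before the deadline was observed, or totalCells if it never was.
     */
    static long scan(long totalCells, long cellsPerCheck,
                     long deadlineMillis, LongSupplier clock) {
        long scanned = 0;
        while (scanned < totalCells) {
            scanned++; // stand-in for processing one cell
            boolean timeToCheck = scanned % cellsPerCheck == 0;
            if (timeToCheck && clock.getAsLong() >= deadlineMillis) {
                break; // time limit reached: stop and return what we have
            }
        }
        return scanned;
    }

    public static void main(String[] args) {
        // With a real clock and a 5 ms budget; the result depends on the machine.
        long deadline = System.currentTimeMillis() + 5;
        long done = scan(50_000_000L, 10_000L, deadline, System::currentTimeMillis);
        System.out.println("cells scanned before the limit was noticed: " + done);
    }
}
```

In this model an expired deadline is noticed at most cellsPerCheck cells late, 
so a smaller value tightens the bound while increasing the number of clock 
reads.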

Protobuf models for ScanRequest and ScanResponse have been updated so that 
heartbeat support can be communicated. Support for heartbeat messages is 
specified in the request sent to the server via 
ScanRequest.Builder#setClientHandlesHeartbeats. Only when the server sees that 
ScanRequest#getClientHandlesHeartbeats() is true will it send heartbeat 
messages back to the client. A response is marked as a heartbeat message via 
the boolean flag ScanResponse#getHeartbeatMessage.
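
One way a client might treat the heartbeat flag is sketched below. The Response 
class is a hypothetical stand-in for the protobuf message, not the generated 
HBase class, and the sketch assumes for illustration that an empty response 
which is not a heartbeat means the scan has nothing more to return.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class HeartbeatClientSketch {
    /** Hypothetical stand-in for a scan response; not the HBase protobuf class. */
    static final class Response {
        final List<String> results;
        final boolean heartbeatMessage;

        Response(List<String> results, boolean heartbeatMessage) {
            this.results = results;
            this.heartbeatMessage = heartbeatMessage;
        }
    }

    /**
     * Collects results from successive responses. A heartbeat response may be
     * empty without ending the scan; an empty non-heartbeat response is
     * treated as the end (an assumption made for this sketch).
     */
    static List<String> drain(Iterator<Response> responses) {
        List<String> rows = new ArrayList<>();
        while (responses.hasNext()) {
            Response response = responses.next();
            rows.addAll(response.results);
            if (response.results.isEmpty() && !response.heartbeatMessage) {
                break; // assumed end of scan
            }
        }
        return rows;
    }
}
```

The point of the flag in this sketch is that the client can tell "no results 
yet, still working" apart from "no more results".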

> Progress heartbeats for long running scanners
> ---------------------------------------------
>
>                 Key: HBASE-13090
>                 URL: https://issues.apache.org/jira/browse/HBASE-13090
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>            Assignee: Jonathan Lawlor
>             Fix For: 2.0.0, 1.2.0
>
>         Attachments: HBASE-13090-v1.patch, HBASE-13090-v2.patch, 
> HBASE-13090-v3.patch, HBASE-13090-v3.patch, HBASE-13090-v4.patch, 
> HBASE-13090-v6.patch, HBASE-13090-v7.patch
>
>
> It can be necessary to set very long timeouts for clients that issue scans 
> over large regions when all data in the region might be filtered out 
> depending on scan criteria. This is a usability concern because it can be 
> hard to identify what worst case timeout to use until scans are 
> occasionally/intermittently failing in production, depending on variable scan 
> criteria. It would be better if the client-server scan protocol can send back 
> periodic progress heartbeats to clients as long as server scanners are alive 
> and making progress.
> This is related but orthogonal to streaming scan (HBASE-13071). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
