[
https://issues.apache.org/jira/browse/HBASE-27048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566876#comment-17566876
]
Bryan Beaudreault commented on HBASE-27048:
-------------------------------------------
I'm still working on this. Changing NettyServerRpcConnection to use
EnvironmentEdge then uncovered other issues – I needed to change CallRunner to
set the Call startTime using EnvironmentEdge as well, otherwise the call is
erroneously considered dropped. Even then I'm still running into other less
obvious issues.
Changing RSRpcServices.getTimeLimit to take "now" from
System.currentTimeMillis() instead of EnvironmentEdge fixes the issue, so I'm
on the right track here. But that seems like a step backwards rather than
forwards, so I'm trying to find the minimal set of EnvironmentEdge changes I
need to make.
Alternatively, since we already exclude these tests (maybe for a similar
reason?), we could leave it as is. But I'm going to give it a bit more time.
> Server side scanner time limit should account for time in queue
> ---------------------------------------------------------------
>
> Key: HBASE-27048
> URL: https://issues.apache.org/jira/browse/HBASE-27048
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-4, 2.4.14
>
>
> When a scan request comes in with a timeout specified and heartbeats/partials
> allowed, we calculate a time limit for running the scan to be half of that
> timeout. The idea is to return before the timeout expires.
> The calculation of that time limit is "now + timeout / 2", where now is the
> point at which the scan starts to run. What's missed here is that the scan
> may have spent upwards of a few seconds in the IPC queue before being
> serviced. In this case, the time limit may extend beyond the timeout of the
> request and the server will not return in time.
> We should calculate the time limit from ServerCall.getReceiveTime instead to
> avoid these timeouts.
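
The arithmetic in the description can be illustrated with a minimal,
standalone sketch. This is not HBase code: the class ScanTimeLimit, both
methods, and the millisecond values are hypothetical and only mirror the
"now + timeout / 2" formula quoted above, once anchored at the time the scan
starts running and once at the time the request was received.

```java
public class ScanTimeLimit {
    // Limit anchored at the moment the scan starts running, as in the
    // "now + timeout / 2" calculation the description calls out.
    static long limitFromStart(long startTimeMs, long timeoutMs) {
        return startTimeMs + timeoutMs / 2;
    }

    // Limit anchored at the moment the request was received, so that time
    // spent waiting in the queue counts against the budget.
    static long limitFromReceive(long receiveTimeMs, long timeoutMs) {
        return receiveTimeMs + timeoutMs / 2;
    }

    public static void main(String[] args) {
        long receiveTimeMs = 0;       // request arrives (hypothetical value)
        long timeoutMs = 10_000;      // client-side timeout (hypothetical value)
        long queuedMs = 6_000;        // time spent in the queue (hypothetical value)
        long startTimeMs = receiveTimeMs + queuedMs;
        long clientDeadlineMs = receiveTimeMs + timeoutMs;

        long fromStart = limitFromStart(startTimeMs, timeoutMs);
        long fromReceive = limitFromReceive(receiveTimeMs, timeoutMs);

        // Anchored at start, the limit lands after the client deadline.
        System.out.println("from start:   " + fromStart
            + " (past deadline: " + (fromStart > clientDeadlineMs) + ")");
        // Anchored at receive, the limit stays inside the client deadline.
        System.out.println("from receive: " + fromReceive
            + " (past deadline: " + (fromReceive > clientDeadlineMs) + ")");
    }
}
```

With these numbers the start-anchored limit falls after the client's deadline,
while the receive-anchored limit stays within it. Both timestamps have to come
from the same clock for the comparison to be meaningful.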