[ https://issues.apache.org/jira/browse/KUDU-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated KUDU-1395:
------------------------------
Target Version/s: 1.0.0 (was: 0.9.0)
> Scanner KeepAlive requests can get starved on an overloaded server
> ------------------------------------------------------------------
>
> Key: KUDU-1395
> URL: https://issues.apache.org/jira/browse/KUDU-1395
> Project: Kudu
> Issue Type: Bug
> Components: impala, rpc, tserver
> Affects Versions: 0.8.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
>
> As of 0.8.0, the RPC system schedules RPCs on an earliest-deadline-first
> basis and, when overloaded, rejects those with later deadlines. This works
> well for RPCs that are retried on SERVER_TOO_BUSY errors, since the retries
> keep the original deadline and thus get higher and higher priority as they
> get closer to timing out.
> We don't, however, retry scanner KeepAlive RPCs. So, if a KeepAlive RPC
> arrives at a heavily overloaded tserver, it will likely be rejected and
> never retried. This means that Impala queries or other long scans that rely
> on KeepAlives will likely fail on overloaded clusters, since the KeepAlive
> never gets through.
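
The starvation described above can be reproduced with a small toy model. The sketch below is not Kudu's RPC code: `EdfQueue`, `Call`, `simulate`, the queue capacity, the tick-based clock, and the timeout values are all hypothetical choices made for illustration. It assumes a bounded queue that serves the earliest deadline first and, when full, rejects whichever call has the latest deadline.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Call:
    deadline: float                    # absolute deadline, in ticks
    name: str = field(compare=False)   # ignored when ordering calls


class EdfQueue:
    """Hypothetical bounded queue: earliest deadline first, latest rejected."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.calls = []  # kept sorted by deadline; small enough to re-sort

    def offer(self, call):
        """Enqueue a call; return the call rejected as 'too busy', or None."""
        self.calls.append(call)
        self.calls.sort()  # stable sort, so ties keep arrival order
        if len(self.calls) > self.capacity:
            return self.calls.pop()  # the latest deadline loses its slot
        return None

    def take(self):
        """Dequeue the call with the earliest deadline, or None if empty."""
        return self.calls.pop(0) if self.calls else None


def simulate(retry_keepalive, ticks=40):
    """Return the tick at which the KeepAlive is served, or None if never."""
    queue = EdfQueue(capacity=2)
    keepalive = Call(deadline=30.0, name="KeepAlive")
    pending = keepalive  # the call the client still wants to (re)send
    for now in range(ticks):
        # Overload: two fresh calls per tick, each with a 5-tick timeout,
        # while the server only manages to serve one call per tick.
        arrivals = [Call(now + 5.0, f"Write-{now}-{i}") for i in range(2)]
        if pending is not None and now < pending.deadline:
            arrivals.append(pending)
        for call in arrivals:
            rejected = queue.offer(call)
            if rejected is keepalive:
                # A retry reuses the same Call, so the deadline is unchanged.
                pending = keepalive if retry_keepalive else None
            elif call is keepalive:
                pending = None  # accepted into the queue
        if queue.take() is keepalive:
            return now
    return None


if __name__ == "__main__":
    print("without retry:", simulate(retry_keepalive=False))
    print("with retry:   ", simulate(retry_keepalive=True))
```

In this model the KeepAlive that is sent only once is rejected on arrival and never served, while the retried one is eventually served once newer calls carry later deadlines than its own. The numbers are arbitrary; only the qualitative difference between the two runs is the point.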