[
https://issues.apache.org/jira/browse/IMPALA-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153130#comment-17153130
]
Sahil Takiar commented on IMPALA-3380:
--------------------------------------
Now that the control plane has been migrated to kRPC, the RPC timeout is
controlled by backend_client_rpc_timeout_ms (5 minutes by default).
catalog_client_rpc_timeout_ms controls catalogd client RPC timeouts, although
it is set to 0 by default. Given that catalogd operations are blocking, it
seems tricky to come up with a good default value. For example, a catalogd
operation that drops a database with hundreds of tables could take hours to
run.
We should add an RPC timeout for statestore operations though, which should be
easier to bound.
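
To illustrate the kind of bounded call being proposed, here is a minimal
sketch in Python. It is not Impala or kRPC code: call_with_timeout and
start_hung_server are hypothetical helpers, and the wire format is just raw
bytes. It only shows a client that gives up on a hung peer and reports the
timeout instead of blocking forever. Note that a socket timeout bounds each
individual connect/send/recv, not the total duration of the call.

```python
import socket
import threading


def call_with_timeout(host, port, payload, timeout_s):
    """Send payload and wait for one reply, bounding each socket operation.

    Hypothetical helper. Returns (True, reply_bytes) on success or
    (False, reason) if the peer hangs or the connection fails.
    """
    try:
        # The timeout passed here covers connect() and stays set on the
        # socket, so the later sendall()/recv() calls are bounded as well.
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            sock.sendall(payload)
            return True, sock.recv(4096)
    except socket.timeout:
        # Handle the timeout explicitly so the caller can retry or fail.
        return False, "timed out"
    except OSError as e:
        return False, f"connection error: {e}"


def start_hung_server():
    """Hypothetical peer that accepts a connection and never replies."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    conns = []  # keep accepted sockets open so the peer looks hung, not closed

    def serve():
        conn, _ = srv.accept()
        conns.append(conn)

    threading.Thread(target=serve, daemon=True).start()
    return srv, conns
```

For an operation that is expected to be short, such as a statestore RPC, the
caller can pick a small timeout_s and treat (False, "timed out") as a
retryable failure. For a long-running blocking operation, such as the
catalogd example above, no single value is obviously safe.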
> Add TCP timeouts to all RPCs that don't block
> ---------------------------------------------
>
> Key: IMPALA-3380
> URL: https://issues.apache.org/jira/browse/IMPALA-3380
> Project: IMPALA
> Issue Type: Sub-task
> Components: Distributed Exec
> Affects Versions: Impala 2.5.0
> Reporter: Henry Robinson
> Assignee: Sahil Takiar
> Priority: Minor
> Labels: observability, supportability
>
> Most RPCs should not take an unbounded amount of time to complete (the
> exception is {{TransmitData()}}, but that may also change). To handle hang
> failures on the remote machine, we should add timeouts to every RPC (so,
> really, every RPC client), and handle the timeout failure.