[ 
https://issues.apache.org/jira/browse/IMPALA-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153130#comment-17153130
 ] 

Sahil Takiar commented on IMPALA-3380:
--------------------------------------

Now that the control plane has been migrated to kRPC, the RPC timeout is 
controlled by backend_client_rpc_timeout_ms (5 minutes by default).

catalog_client_rpc_timeout_ms controls catalogd client RPC timeouts, although 
it is set to 0 by default. Given that catalogd operations are blocking, it 
seems tricky to come up with a good default value. For example, a catalogd 
operation that drops a database with hundreds of tables could take hours to 
run.
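
For illustration, the two defaults mentioned so far would look roughly like 
this if written out as impalad startup flags. The values are derived from this 
comment only (5 minutes = 300000 ms), and reading 0 as "no timeout" is an 
assumption, not something verified against the code:

```
# Sketch only: values inferred from the comment text, not from the source tree.
--backend_client_rpc_timeout_ms=300000
# Assumed to mean "no timeout", so a long-running catalogd operation is not cut off.
--catalog_client_rpc_timeout_ms=0
```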

We should add an RPC timeout for statestore operations though, which should be 
easier to bound.

> Add TCP timeouts to all RPCs that don't block
> ---------------------------------------------
>
>                 Key: IMPALA-3380
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3380
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>    Affects Versions: Impala 2.5.0
>            Reporter: Henry Robinson
>            Assignee: Sahil Takiar
>            Priority: Minor
>              Labels: observability, supportability
>
> Most RPCs should not take an unbounded amount of time to complete (the 
> exception is {{TransmitData()}}, but that may also change). To handle hang 
> failures on the remote machine, we should add timeouts to every RPC (so, 
> really, every RPC client), and handle the timeout failure.



